Installation#
There are two ways of using MatFlow:
The MatFlow command-line interface (CLI)
The MatFlow Python package
Both of these options allow workflows to be designed and executed. The MatFlow CLI is recommended for beginners and strongly recommended if you want to run MatFlow on a cluster. The Python package allows workflows to be designed and explored via the Python API and is recommended for users comfortable working with Python. If you are interested in contributing to the development of MatFlow, the Python package is the place to start.
The CLI and the Python package can be used simultaneously.
Using pip#
The recommended way to install MatFlow is to use pip to install the Python package from PyPI:
pip install matflow-new
This installs the Python package, which also provides the MatFlow CLI.
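To confirm that the package is present in the active Python environment, one option is to query the installed distribution metadata. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the distribution name matflow-new is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "matflow-new") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("matflow-new is not installed in this environment")
    else:
        print(f"matflow-new {found} is installed")
```

Running this with the same interpreter that pip installed into avoids confusion between multiple Python environments.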
Release notes#
Release notes for this version (0.3.0a173) are available on GitHub. Use the version switcher in the top-right corner of the page to download/install other versions.
Alternative installation methods#
Although not currently recommended, advanced users may wish to use one of the alternative installation methods.
Configuration#
MatFlow uses a config file to control details of how it executes workflows. A default config file is created the first time you submit a workflow. It works without modification on a personal machine. If you are using MatFlow on an HPC system, however, you will likely need to modify it to describe the job scheduler, configure settings for multiple cores, and point to your MatFlow environments file.
Some examples are given for the University of Manchester’s CSF.
If there is a suitable config file for your HPC system, you can pull the relevant file using the following syntax (example shown for Manchester’s CSF3):
matflow config import github://hpcflow:matflow-configs@main/manchester-CSF3.yaml
After pulling a config file using the above command, you still need to edit it to set the path to
your MatFlow environments file.
The path to your config file can be found using matflow manage get-config-path, or to open the config file directly, use matflow open config.
Environments#
MatFlow has the concept of environments, similar to Python virtual environments.
These are required so that tasks can run using the specific software they require.
Your MatFlow environments must be defined in your environments (YAML) file before MatFlow
can run workflows, and this environments file must be pointed to in the config file
via the environment_sources key.
Once this has been done, your environments file can be opened using matflow open env-source.
A template environments file is given below.
It is recommended to use this as a starting point, making modifications for your own computer/HPC system,
in particular the setup sections for each environment.
Note that currently MatFlow works with DAMASK version 3.0.0a7.post0 but not the latest versions.
As such, the MatFlow damask_parse environment should use pip install damask==3.0.0a7.post0.
Linux/macOS#
- name: damask_parse_env
  setup: |
    source /full/path/to/.venv/bin/activate
  executables:
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores:
        start: 1
        stop: 32
      parallel_mode: null
- name: formable_env
  setup: |
    source /full/path/to/.venv/bin/activate
  executables:
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores:
        start: 1
        stop: 32
      parallel_mode: null
- name: defdap_env
  setup: |
    source /full/path/to/.venv/bin/activate
  executables:
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores:
        start: 1
        stop: 32
      parallel_mode: null
- name: damask_env
  setup: |
    module load mpi/intel-18.0/openmpi/4.1.0
    IMG_PATH=/full/path/to/DAMASK-docker-images/damask-grid_3.0.0-alpha7.sif
    export HDF5_USE_FILE_LOCKING=FALSE
  executables:
  - label: damask_grid
    instances:
    - command: singularity run $IMG_PATH
      num_cores: 1
      parallel_mode: null
    - command: mpirun singularity run $IMG_PATH
      num_cores:
        start: 2
        stop: 32
      parallel_mode: null
- name: matlab_env
  setup: |
    module load matlab/module/file/version
    MTEX_DIR=/full/path/to/toolboxes/mtex/mtex-6.0.0
  executables:
  - label: run_mtex
    instances:
    - command: |
        for dir in $(find ${MTEX_DIR} -type d | grep -v -e ".git" -e "@" -e "private"); do MATLABPATH="${dir};${MATLABPATH}"; done
        export MATLABPATH=${MATLABPATH}
        matlab -softwareopengl -singleCompThread -batch "addpath('<<script_dir>>'); <<script_name_no_ext>> <<args>>"
      num_cores: 1
      parallel_mode: null
  - label: compile_mtex
    instances:
    - command: |
        for dir in $(find ${MTEX_DIR} -type d | grep -v -e ".git" -e "@" -e "private" -e "data" -e "makeDoc" -e "templates" -e "nfft_openMP" -e "compatibility/")
        do
            MTEX_INCLUDE="-I ${dir} ${MTEX_INCLUDE}"
        done
        export MTEX_INCLUDE="${MTEX_INCLUDE} -a ${MTEX_DIR}/data -a ${MTEX_DIR}/plotting/plotting_tools/colors.mat"
        mcc -R -singleCompThread -R -softwareopengl -m "<<script_path>>" <<args>> -o matlab_exe ${MTEX_INCLUDE}
      num_cores: 1
      parallel_mode: null
  - label: run_compiled_mtex
    instances:
    - command: |
        export MATLAB_RUNTIME=/full/path/to/matlab/runtime-or-installation
        ./run_matlab_exe.sh ${MATLAB_RUNTIME} <<args>>
      num_cores: 1
      parallel_mode: null
- name: python_env
  executables:
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores:
        start: 1
        stop: 32
      parallel_mode: null
- name: dream_3D_env
  executables:
  - label: dream_3D_runner
    instances:
    - command: /full/path/to/dream3d/DREAM3D-6.5.171-Linux-x86_64/bin/PipelineRunner
      num_cores: 1
      parallel_mode: null
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores: 1
      parallel_mode: null
Windows#
- name: matlab_env
  executables:
  - label: run_mtex
    instances:
    - command: |
        & 'C:\path\to\matlab.exe' -batch "addpath('<<script_dir>>'); <<script_name_no_ext>> <<args>>"
      num_cores: 1
      parallel_mode: null
  - label: compile_mtex
    instances:
    - command: |
        $mtex_path = 'C:\path\to\mtex\folder'
        & 'C:\path\to\mcc.bat' -R -singleCompThread -m "<<script_path>>" <<args>> -o matlab_exe -a "$mtex_path/data" -a "$mtex_path/plotting/plotting_tools/colors.mat"
      num_cores: 1
      parallel_mode: null
  - label: run_compiled_mtex
    instances:
    - command: .\matlab_exe.exe <<args>>
      num_cores: 1
      parallel_mode: null
- name: dream_3D_env
  executables:
  - label: dream_3D_runner
    instances:
    - command: "& 'C:\\path\\to\\DREAM3D-directory\\PipelineRunner.exe'"
      num_cores: 1
      parallel_mode: null
  - label: python_script
    instances:
    - command: python "<<script_path>>" <<args>>
      num_cores: 1
      parallel_mode: null
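After editing a template, a quick structural check can catch slips such as a missing command or an inverted core range before a workflow is submitted. The sketch below works on environment definitions that have already been parsed into Python lists and dictionaries (loading the YAML file is left to whichever parser you use). The function name check_environments is hypothetical, and the expected shape is inferred only from the templates above; it is an assumption, not MatFlow's own validation schema.

```python
from typing import Any, Dict, List


def check_environments(envs: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems found in parsed environment definitions.

    The rules mirror the template layout only (name -> executables ->
    instances); they are assumptions, not MatFlow's real schema.
    """
    problems: List[str] = []
    for index, env in enumerate(envs):
        name = env.get("name")
        where = f"environment {name!r}" if name else f"environment #{index}"
        if not name:
            problems.append(f"{where}: missing 'name'")
        executables = env.get("executables") or []
        if not executables:
            problems.append(f"{where}: no 'executables' defined")
        for exe in executables:
            label = exe.get("label")
            if not label:
                problems.append(f"{where}: executable without a 'label'")
            instances = exe.get("instances") or []
            if not instances:
                problems.append(f"{where}: executable {label!r} has no 'instances'")
            for inst in instances:
                if not inst.get("command"):
                    problems.append(f"{where}: instance of {label!r} has no 'command'")
                cores = inst.get("num_cores")
                if isinstance(cores, dict):
                    # Range form, as in the templates: start/stop integers.
                    start, stop = cores.get("start"), cores.get("stop")
                    if not (isinstance(start, int) and isinstance(stop, int)):
                        problems.append(f"{where}: {label!r} num_cores range needs integer start and stop")
                    elif start > stop:
                        problems.append(f"{where}: {label!r} has num_cores start > stop")
                elif not isinstance(cores, int):
                    problems.append(f"{where}: {label!r} needs num_cores as an integer or a start/stop range")
    return problems


# Example data mirroring the python_env entry from the template.
example = [
    {
        "name": "python_env",
        "executables": [
            {
                "label": "python_script",
                "instances": [
                    {
                        "command": 'python "<<script_path>>" <<args>>',
                        "num_cores": {"start": 1, "stop": 32},
                        "parallel_mode": None,
                    }
                ],
            }
        ],
    }
]

if __name__ == "__main__":
    for problem in check_environments(example):
        print(problem)
```

An empty result means no structural problems were found under these assumed rules; it does not guarantee that the commands or paths are valid on your system.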
Tips for SLURM#
hpcFlow (which MatFlow uses) currently does not select a SLURM partition based on the resources requested in your workflow file. You must therefore set the partition manually in your workflow file, for example:
resources:
  any:
    scheduler_args:
      directives:
        --time: 00:30:00
        --partition: serial
Note that many SLURM schedulers also require a time limit, as shown above.
A default time limit and partition can be set in the config file. These defaults are used for tasks that do not set them explicitly in a resources block like the example above.
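As a rough illustration of the resources block shown earlier, the sketch below reports which of the two directives are missing from a workflow that has already been parsed into a Python dictionary. The function name missing_slurm_directives is hypothetical. The check looks only at the workflow-level resources block, so it ignores any defaults set in the config file and any resources set on individual tasks.

```python
from typing import Any, Dict, List

# Directives discussed in this section; adjust for your own scheduler.
REQUIRED_DIRECTIVES = ("--time", "--partition")


def missing_slurm_directives(workflow: Dict[str, Any], scope: str = "any") -> List[str]:
    """List required directives absent from resources -> scope -> scheduler_args -> directives."""
    node: Any = workflow
    for key in ("resources", scope, "scheduler_args", "directives"):
        # Walk down one level at a time, tolerating missing or null entries.
        node = node.get(key) if isinstance(node, dict) else None
    directives = node if isinstance(node, dict) else {}
    return [name for name in REQUIRED_DIRECTIVES if name not in directives]


if __name__ == "__main__":
    workflow = {
        "resources": {
            "any": {
                "scheduler_args": {
                    "directives": {"--time": "00:30:00", "--partition": "serial"}
                }
            }
        }
    }
    print(missing_slurm_directives(workflow))
```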