AlphaFold Parameter Reference
Introduction
This reference describes the parameters that can be passed to AlphaFold. Most of the information comes from the AlphaFold GitHub page and from the execution scripts, and is duplicated here for convenience.
Parameters
fasta_paths
Description
Paths to FASTA files, each containing one prediction target. Targets are folded one after another. If a FASTA file contains multiple sequences, it is folded as a multimer. Separate the paths with commas.
Values
All FASTA paths must have a unique basename, as this basename is used to name the output directories for each prediction.
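The uniqueness requirement can be checked before submitting a job. The following is a minimal sketch, not part of AlphaFold or the submit script. It assumes that "basename" means the file name without its directory and extension; the function name and the example paths are hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import List


def find_duplicate_basenames(fasta_paths: str) -> List[str]:
    """Return basenames that occur more than once in a comma-separated path list."""
    paths = [p.strip() for p in fasta_paths.split(",") if p.strip()]
    # Assumption: the basename is the file name without directory and extension.
    basenames = [Path(p).stem for p in paths]
    counts = Counter(basenames)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical paths: two different directories, same file name.
duplicates = find_duplicate_basenames(
    "/data/a/target1.fasta,/data/b/target1.fasta,/data/b/target2.fasta"
)
print(duplicates)  # ['target1']
```

An empty list means every path would map to its own output directory.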
data_dir
Description
Path to directory of supporting data. This directory is roughly 2.2 TB in size.
Values
The value set in the submit script points to the current version of the datasets available on the UFS HPC. In most cases, users will not need to change this.
If a historical dataset is required, please see the max_template_date parameter first.
output_dir
Description
Path to a directory that will store the results.
Values
This directory will be created and will contain a subdirectory for each FASTA file provided.
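As an illustration of the resulting layout, the sketch below lists the subdirectories one would expect for a given set of inputs. It assumes each subdirectory is named after the FASTA file's basename without its extension, as described under fasta_paths; the function name and paths are hypothetical.

```python
from pathlib import Path
from typing import List


def expected_output_dirs(output_dir: str, fasta_paths: str) -> List[Path]:
    """List the per-target subdirectories expected under output_dir."""
    base = Path(output_dir)
    # Assumption: one subdirectory per FASTA file, named after the file's stem.
    return [base / Path(p.strip()).stem for p in fasta_paths.split(",") if p.strip()]


# Hypothetical example values.
for directory in expected_output_dirs("/scratch/results", "/data/t1.fasta,/data/t2.fasta"):
    print(directory)
```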
max_template_date
Description
Maximum template release date to consider. Any template with a release date after this date will be ignored. This parameter is important when folding historical test sets.
Values
The submit script automatically sets the date to the current date unless explicitly changed.
If another date is required, enter the new date in the format yyyy-mm-dd. For example: 2022-01-25
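The date format can be validated with Python's standard library. This is only a sketch of the behaviour described above (default to today's date, otherwise expect yyyy-mm-dd); the function name is hypothetical and the submit script may implement this differently.

```python
from datetime import date, datetime
from typing import Optional


def resolve_max_template_date(value: Optional[str] = None) -> str:
    """Return a yyyy-mm-dd string, defaulting to today's date when no value is given."""
    if value is None:
        return date.today().isoformat()
    # strptime raises ValueError when the string does not match year-month-day,
    # for example "25-01-2022" or "2022-13-01".
    parsed = datetime.strptime(value, "%Y-%m-%d").date()
    return parsed.isoformat()


print(resolve_max_template_date("2022-01-25"))  # 2022-01-25
```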
db_preset
Description
Preset for MSA database configuration which can be used to optimize for speed and lower hardware requirements.
Values
The two accepted values are:
- full_dbs : Use all genetic databases. This is the default value in the submit script and is the value used at CASP14.
- reduced_dbs : Runs with a reduced version of the BFD. This preset is not recommended for the UFS HPC.
model_preset
Description
Selects the AlphaFold model to run. The available models are:
- The monomer model (monomer)
- The monomer model with extra ensembling (monomer_casp14)
- The monomer with pTM head (monomer_ptm)
- The multimer model (multimer)
Values
The accepted values are:
- monomer
- monomer_casp14 (Recommended for monomer runs)
- monomer_ptm
- multimer
Note that when selecting multimer, a multi-sequence FASTA file must be provided and the is_prokaryote_list parameter needs to be set.
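The multimer requirement can be checked by counting the records in a FASTA file before choosing a preset. The sketch below is illustrative only: the function names and the example sequences are hypothetical, and it enforces only the rule stated in the note above.

```python
ACCEPTED_MODEL_PRESETS = {"monomer", "monomer_casp14", "monomer_ptm", "multimer"}


def count_fasta_sequences(fasta_text: str) -> int:
    """Count records in FASTA text; each record starts with a '>' header line."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def check_model_preset(model_preset: str, fasta_text: str) -> int:
    """Validate the preset against the FASTA content and return the sequence count."""
    if model_preset not in ACCEPTED_MODEL_PRESETS:
        raise ValueError(f"unknown model_preset: {model_preset!r}")
    n_sequences = count_fasta_sequences(fasta_text)
    if model_preset == "multimer" and n_sequences < 2:
        raise ValueError("multimer requires a multi-sequence FASTA file")
    return n_sequences


# Hypothetical two-chain target with made-up sequences.
example = ">chain_A\nMKVLAAGIVG\n>chain_B\nMKTAYIAKQR\n"
print(check_model_preset("multimer", example))  # 2
```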
is_prokaryote_list
Description
Note: This is an option for the multimer model and is not used by the single chain system.
These values determine the pairing method for the MSA.
Values
The two accepted values are:
- true : The target complex is from a prokaryote.
- false : The target complex is not from a prokaryote, or the origin is not known.
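To show how this option relates to model_preset, the sketch below assembles a list of command-line style options from a parameter dictionary. It assumes parameters are passed as --name=value strings; the submit script on the UFS HPC may set them differently, and the function name and example values are hypothetical.

```python
from typing import Dict, List

ACCEPTED_PROKARYOTE_VALUES = {"true", "false"}


def build_flags(params: Dict[str, str]) -> List[str]:
    """Turn a parameter dictionary into --name=value strings after basic checks."""
    if params.get("model_preset") == "multimer":
        value = params.get("is_prokaryote_list")
        if value not in ACCEPTED_PROKARYOTE_VALUES:
            raise ValueError("is_prokaryote_list must be 'true' or 'false' for multimer runs")
    else:
        # The option is not used by the single-chain models, so drop it if present.
        params = {k: v for k, v in params.items() if k != "is_prokaryote_list"}
    # Assumption: every parameter is passed in the form --name=value.
    return [f"--{name}={value}" for name, value in params.items()]


# Hypothetical multimer run of a complex whose origin is not known.
print(build_flags({"model_preset": "multimer", "is_prokaryote_list": "false"}))
```

Dropping the option for monomer presets mirrors the note that it is not used by the single chain system.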