hpc_blast User Manual

Introduction

The hpc_blast script is a tool that allows the user to interface with BLAST databases in the UFS HPC environment. This user manual explains how to use the script.

This tool facilitates version control of BLAST databases on the HPC cluster and can be used to perform the following tasks:

  • List the BLAST databases present in the environment or in a supplied directory
  • Query the version (i.e. the creation date) of a BLAST database in the environment or in a supplied directory
  • Clone a specified BLAST database in the environment to a destination directory in the user's home directory

The first two tasks are self-explanatory. Cloning a BLAST database to the user's home directory effectively freezes the version of that database. Note that because some databases are very large, the cloning operation must only be performed when ABSOLUTELY necessary.

Getting access to hpc_blast

Please consult the usage page for the steps involved in gaining access to hpc_blast.

Available Commands

The hpc_blast script can be invoked with the following commands:

  • list_db
  • ver_db
  • clone_db

list_db

List the BLAST databases present in the environment or in a user-supplied directory

Usage:

$ hpc_blast list_db [--dest <destination>]


--dest : destination directory for BLAST databases [Optional]

Examples:

  • List BLAST databases in the environment (i.e. loaded via module load)

    $ hpc_blast list_db
    
  • List BLAST databases present in a user-supplied directory: ~/some/directory/

    $ hpc_blast list_db --dest ~/some/directory/
    

ver_db

List the version (creation date) of a BLAST database present in the environment or in a user-supplied directory

Usage:

$ hpc_blast ver_db (--db_name <database name>) [--dest <destination>]


--db_name : Name of the BLAST database to be queried (Required)
--dest : destination directory for BLAST databases [Optional]

Examples:

  • Retrieve the version of the BLAST database swissprot in the environment (i.e. loaded via module load)

    $ hpc_blast ver_db --db_name swissprot
    
  • Retrieve the version of the BLAST database swissprot present in a user-supplied directory: ~/some/directory/

    $ hpc_blast ver_db --db_name swissprot --dest ~/some/directory/
    

    NOTE: This is not the same database as the one loaded via module load
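
The two ver_db calls above can be combined to check whether a cloned copy still matches the database in the environment. The following is only a minimal sketch: the function name clone_is_current is hypothetical, and it assumes that ver_db prints identical output for identical versions regardless of where the database lives, which this manual does not confirm.

```shell
# Hypothetical helper: succeeds (exit status 0) when the cloned copy reports
# the same version as the copy loaded in the environment.
# ASSUMPTION: ver_db output is directly comparable between the two locations.
clone_is_current() {
    db_name="$1"   # e.g. swissprot
    dest="$2"      # directory holding the cloned database
    env_ver="$(hpc_blast ver_db --db_name "$db_name")"
    clone_ver="$(hpc_blast ver_db --db_name "$db_name" --dest "$dest")"
    [ "$env_ver" = "$clone_ver" ]
}

# Example call (commented out; requires hpc_blast to be available):
# clone_is_current swissprot ~/some/directory/ || echo "clone differs from environment"
```

If the output of ver_db turns out to include the directory path, the comparison would need to be narrowed to the date portion only.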

clone_db

Clone a BLAST database in the environment to a destination directory in the user's home directory

Usage:

$ hpc_blast clone_db (--db_name <database name>) (--dest <destination>)


--db_name : Name of the BLAST database to be cloned (Required)
--dest : destination directory for BLAST databases (Required)

NOTES:

  • The NCBI BLAST databases are extremely large and thus cloning these databases to the user's home directory should only be done in cases where it is absolutely necessary.
  • The cloning operation IS NOT ALLOWED ON THE LOGIN NODE.
  • Cloning very large databases takes time. For example, cloning the nr database took 30-45 minutes.
  • If a previous version of a BLAST database is already in the supplied directory, the current version to be cloned WILL OVERWRITE the previous version.
  • To use the cloned database in BLAST tools such as blastp, pass the destination directory and the database name, joined as a single path, to the -db flag. For example: ~/some/directory/swissprot
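
As an illustration of the last note, the sketch below builds the -db path from a clone destination and a database name and passes it to blastp. The file names query.fasta and results.txt are placeholders and not part of hpc_blast; the blastp call is guarded so that the snippet does nothing harmful where blastp or the query file is missing.

```shell
# Placeholder values: adjust to your own clone destination and database.
dest="$HOME/some/directory/"
db_name="swissprot"

# Strip a trailing slash from the directory (if any), then append the name.
db_path="${dest%/}/${db_name}"
echo "Using BLAST database: ${db_path}"

# Run blastp only if it is installed and the (hypothetical) query file exists.
if command -v blastp >/dev/null 2>&1 && [ -f query.fasta ]; then
    blastp -query query.fasta -db "${db_path}" -out results.txt
fi
```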

Examples:

  • Clone the current version of the BLAST database swissprot in the environment (i.e. loaded via module load) to the directory ~/some/directory/

    $ hpc_blast clone_db --db_name swissprot --dest ~/some/directory/
    
  • Follow the prompt on screen to confirm the operation.
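
Because a clone overwrites any previous version in the same directory, one way to keep several frozen versions side by side is to clone into a date-stamped directory. This is only a sketch: the ~/blast_db layout and directory naming are assumptions made for illustration, it is not confirmed here whether clone_db creates a missing destination itself (hence the mkdir), and the confirmation prompt still has to be answered. Remember that cloning is not allowed on the login node.

```shell
db_name="swissprot"

# Hypothetical layout: one subdirectory per database and clone date.
dest="$HOME/blast_db/${db_name}_$(date +%Y-%m-%d)"
mkdir -p "$dest"
echo "Clone destination: ${dest}"

# Start the clone only where hpc_blast is available.
if command -v hpc_blast >/dev/null 2>&1; then
    hpc_blast clone_db --db_name "$db_name" --dest "$dest"
fi
```

Each dated directory holds a full copy of the database, so older copies should be removed once they are no longer needed.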