BLAST+ Tutorial

Introduction to BLAST+

The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. Most users access this similarity search tool through the web interface hosted by the NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi), which creates the impression that it is a GUI-based program.

However, this is not the case. BLAST is a suite of command-line programs that can be run on high-performance clusters without a GUI. While running BLAST through the NCBI web interface is convenient, some scenarios require running BLAST in-house. These include, but are not limited to:

  • Proprietary sequence data (including custom sequence databases) cannot be used off-site

  • Running applications off-site raises security concerns

  • High-volume searches must be performed and high-performance computing resources are available

  • BLAST is part of a custom pipeline (many other bioinformatics web applications run BLAST as a pipeline step, and therefore on their own servers)

It is important to note that the software tools themselves are separate from the databases used. In other words, given correct formatting of the sequence database, BLAST command-line tools can be used to search any database provided. In fact, BLAST is distributed as such, with the software suite provided separately from the databases.
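As a minimal sketch of this separation between tools and databases, the script below formats a FASTA file as a nucleotide database with makeblastdb and then searches it with blastn. The file names (my_sequences.fasta, query.fasta), the database name (my_custom_db) and the output file (results.tsv) are hypothetical placeholders, not part of the BLAST+ distribution. The commands are executed only if BLAST+ is installed and both input files exist; otherwise they are simply printed.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own data.
SEQS="my_sequences.fasta"   # FASTA file to turn into a database (assumed)
QUERY="query.fasta"         # FASTA file holding the query sequence(s) (assumed)
DB="my_custom_db"           # name chosen for the new database (assumed)

# Step 1: format the FASTA file as a nucleotide BLAST database.
MAKEDB_CMD="makeblastdb -in $SEQS -dbtype nucl -out $DB"
# Step 2: search that database; -outfmt 6 requests tab-separated output.
SEARCH_CMD="blastn -query $QUERY -db $DB -outfmt 6 -out results.tsv"

# Run the commands only when BLAST+ and both input files are present;
# otherwise print what would be run.
if command -v makeblastdb >/dev/null 2>&1 && [ -f "$SEQS" ] && [ -f "$QUERY" ]; then
    $MAKEDB_CMD && $SEARCH_CMD
else
    echo "$MAKEDB_CMD"
    echo "$SEARCH_CMD"
fi
```

For a protein database, -dbtype prot and blastp would take the place of -dbtype nucl and blastn.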

The BLAST software suite is distributed as BLAST+ via an FTP server hosted at the NCBI (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) and is available as source code or as binary executables for Linux, Windows and macOS (about 300 MB). BLAST+ and the NCBI databases are also hosted on the mirror at the University of the Free State (BLAST+ | Databases).

The databases used through the web interface are also distributed via FTP (ftp://ftp.ncbi.nlm.nih.gov/blast/db/); however, they are quite large.
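Before downloading anything, it can help to see what the two FTP locations contain. The sketch below assumes curl is installed and uses its --list-only option to request file names only. The helper list_remote and the RUN switch are illustrative names introduced here; by default the script prints the commands rather than contacting the server.

```shell
#!/bin/sh
# URLs taken from the text above.
BLAST_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/"
DB_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/db/"

# list_remote (hypothetical helper): list the file names in a remote directory.
# With RUN=1 the server is contacted; otherwise the command is only printed.
list_remote() {
    if [ "${RUN:-0}" = "1" ]; then
        curl --silent --list-only "$1"
    else
        echo "curl --silent --list-only $1"
    fi
}

list_remote "$BLAST_URL"
list_remote "$DB_URL"
```

Running the script as RUN=1 sh list_blast.sh (the script name is arbitrary) would perform the actual listing. The BLAST+ distribution also ships a helper script, update_blastdb.pl, intended for fetching the preformatted databases.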

The following executables are found in the BLAST+ distribution and are sub-categorized based on their function:

  1. Search tools (they execute BLAST searches):

    • blastn
    • blastp
    • blastx
    • tblastn
    • psiblast
    • rpsblast
    • rpstblastn
  2. Database applications (They create or examine BLAST databases):

    • makeblastdb
    • blastdb_aliastool
    • makeprofiledb
    • blastdbcmd
  3. Filtering tools (these are not separate programs, but can be used with blastn and blastp)
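
After installation, a quick way to confirm that these executables are reachable is to look each one up on the PATH. The sketch below assumes a POSIX shell; the function name check_tools is illustrative. The translated RPS-BLAST tool is spelled rpstblastn, which is the executable name used in the NCBI distribution.

```shell
#!/bin/sh
# Search tools and database applications from the list above.
TOOLS="blastn blastp blastx tblastn psiblast rpsblast rpstblastn makeblastdb blastdb_aliastool makeprofiledb blastdbcmd"

# check_tools (hypothetical helper): report whether each executable is on PATH.
check_tools() {
    for tool in $TOOLS; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found    $tool"
        else
            echo "missing  $tool"
        fi
    done
}

check_tools
```

For any tool reported as found, running it with the -version option (for example, blastn -version) prints the installed BLAST+ version.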

Tutorial Outcomes

This tutorial demonstrates how to perform the following operations using the BLAST+ command-line tools:

  • Creating custom BLAST databases
  • Performing BLAST searches against databases