
Local NCBI BLAST problems? Some solutions for the inexperienced bioinformatician

Is local Blast driving you nuts? Blast is a super powerful tool once you download it onto your own computer, because you can blast many sequences at once and not have to worry about the server dying on you. Having set it up before with the help of a senior bioinformatics miracle worker (2012), I needed to go through the process again with the new Blast+. Here are some problems I encountered, along with fixes that might help you deal with them too.

 

My Blast seems to be incredibly picky about where things are. In the end, I specified where EVERYTHING was with a full, hard-coded path: the blastn program, the database and the input and output files.

/Users/JohnSmith/ncbi-blast-2.2.31+/bin/blastn -query ./yourfile.fasta -db /Users/JohnSmith/ncbi-blast-2.2.31+/db/nt -out ./yourfile.fasta.blast.txt -num_threads 2
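Since Blast is this fussy about locations, it can save a failed run to check the paths before launching anything. The sketch below is only an illustration: check_blast_paths is a made-up helper name, not part of Blast+, and it only tests that the program is executable and that the query file exists and is not empty.

```shell
# Hypothetical helper (not part of Blast+): report which path is wrong
# before Blast gets a chance to complain about it.
check_blast_paths() {
    blastn_bin=$1   # full path to the blastn executable
    query=$2        # query FASTA file
    status=0
    if [ ! -x "$blastn_bin" ]; then
        echo "blastn not found or not executable: $blastn_bin" >&2
        status=1
    fi
    if [ ! -s "$query" ]; then
        echo "query file missing or empty: $query" >&2
        status=1
    fi
    return $status
}

# Example use, with the paths from the command above:
# check_blast_paths /Users/JohnSmith/ncbi-blast-2.2.31+/bin/blastn ./yourfile.fasta
```

If the function returns without printing anything, both paths are fine and you can go ahead with the real blastn command.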

That bit of text on the end there, '-num_threads', may help you, or it might not. It can significantly speed up your Blast run, but it will also majorly slow down everything else on your computer. I'm using a machine with 8 cores, and running two Blast jobs at once, each with '-num_threads 2', slows it to a standstill. Even writing this post is taxing things. It is said to be quicker to split your .fasta file into several smaller files and give each one its own single-threaded Blast run. Or if you are like me, you can just let it run over the weekend.
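If you want to try the splitting approach, here is a rough sketch using awk. The function name split_fasta and the ".partN.fasta" naming are my own inventions for illustration; records are dealt out to the output files in turn, so each file ends up with roughly the same number of sequences.

```shell
# Split a FASTA file into N smaller files, dealing records out in turn.
# Output names (<input>.part0.fasta, <input>.part1.fasta, ...) are arbitrary.
split_fasta() {
    input=$1     # FASTA file to split
    chunks=$2    # number of output files wanted
    awk -v n="$chunks" -v prefix="$input" '
        /^>/ { out = prefix ".part" (rec % n) ".fasta"; rec++ }  # new record: choose its file
        out != "" { print > out }                                 # header and sequence lines go there
    ' "$input"
}

# Example use, one single-threaded Blast run per part (same paths as above):
# split_fasta ./yourfile.fasta 4
# for part in ./yourfile.fasta.part*.fasta; do
#     /Users/JohnSmith/ncbi-blast-2.2.31+/bin/blastn -query "$part" \
#         -db /Users/JohnSmith/ncbi-blast-2.2.31+/db/nt \
#         -out "$part.blast.txt" -num_threads 1 &
# done
# wait
```

How many parts to use is a judgment call. Going by my experience above, leave a couple of cores free if you want to keep using the machine.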

 

Warning: lcl|Query_20167 contig20167: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options.

Make sure that you are using the appropriate type of database for your content. If your .fasta file contains nucleotide sequences and you are running blastn, make sure you are searching a nucleotide database such as nt, not a protein one.

Occasionally, Blast threw up this warning for me because there was stray whitespace (shock, horror) in the sequence .fasta file I presented it with. In my case the culprits were empty lines, which this command strips out (note that it does not touch spaces inside a line)

grep . yourfile.fasta > yourfile.nospaces.fasta

and voila! It was fixed. Maybe not the most professional of work-arounds, but it worked for me.
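One caveat: 'grep .' only drops completely empty lines, so spaces or tabs inside the sequence lines survive it. If that is your problem, something along these lines may help. It is a sketch only; clean_fasta is a made-up name, and it assumes the usual FASTA layout where header lines start with '>' and everything else is sequence.

```shell
# Clean a FASTA file: keep header lines as they are, strip spaces and tabs
# from sequence lines, and drop any lines that end up empty.
clean_fasta() {
    awk '
        { sub(/\r$/, "") }         # remove Windows line endings on every line
        /^>/ { print; next }       # header lines pass through otherwise unchanged
        { gsub(/[ \t]/, "") }      # remove spaces and tabs inside sequence lines
        length($0) > 0 { print }   # skip lines that are now empty
    ' "$1"
}

# Example use:
# clean_fasta yourfile.fasta > yourfile.clean.fasta
```

Write the result to a new file, as in the example, so that your original stays untouched if something goes wrong.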

Finally, if it keeps giving you that message, don't despair! It is a warning, not a fatal error, so you can just ignore it and go check the sequences it names later. Don't kill your Blast run because of it, otherwise you'll have to start all over again. Even if it looks like Blast is doing nothing, trust me, it still is.

 

Are you tempted to copy and paste my instructions? Paste them into a plain-text editor such as Notepad or TextWrangler first, because otherwise you could pick up hidden formatting characters (curly quotes, for instance) that will cause yet more errors for you. Alternatively, type them out by hand, but you'll probably get sick of that quickly. I keep a notepad open with all the commands I am currently using for fast reference.

 

 

Published in Bioinformatics
