
Sequence Database Setup: Common Mistakes

  1. Forgetting the wild card in the database filename
  2. Putting the wild card in the filename extension
  3. Using spaces or special characters in the database path
  4. Using back slashes in the database path
  5. Out of date taxonomy files
  6. Creating a sequence database with inconsistent title syntax

  1. Forgetting the wild card in the database filename

    The wild card is important for two reasons. First, it masks the time-stamp or version number. Second, it allows the database to be updated without interrupting ongoing searches. Even if you don't want to use a time-stamp or version number, you must still include a wild card.


  2. Putting the wild card in the filename extension

    The wild card is there to mask the time-stamp or version number, not the extension. The filename should look like MSDB_*.fasta or MSDB*.fasta. If you specify the filename as MSDB.*, the pattern matches both the Fasta file and the Reference file, so Mascot won't be able to distinguish one from the other, with unpredictable results.
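
    As a rough illustration of why the position of the wild card matters, the following Python sketch applies shell-style pattern matching to a hypothetical directory listing. The file names, including the .ref extension for the Reference file, are assumptions made for the example, and this is not how Mascot itself resolves filenames.

```python
from fnmatch import fnmatchcase

# Hypothetical directory contents; the ".ref" extension is an assumption
# used only to stand in for a Reference file.
FILES = ["MSDB.fasta", "MSDB.ref"]


def select(pattern, names):
    """Return the names that a shell-style wild card pattern selects."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Wild card in the name, fixed extension: only the Fasta file is selected.
good = select("MSDB*.fasta", FILES)

# Wild card in the extension: both files are selected, so they cannot be
# told apart by the pattern alone.
ambiguous = select("MSDB.*", FILES)

print(good)
print(ambiguous)
```

    The first pattern selects a single file, while the second selects both.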


  3. Using spaces or special characters in the database path

    Spaces in paths may be legal in Windows, but they shouldn't be. Besides a wild card in the filename, only alphanumerics and the following characters are permitted in paths: /:_.-$%&()[]
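
    A path can be checked against this rule before it is entered. The sketch below is a minimal Python check, assuming that "alphanumerics" means ASCII letters and digits and that the wild card is only acceptable in the final path component. The function name and the example paths are hypothetical.

```python
import re

# Alphanumerics plus the permitted characters / : _ . - $ % & ( ) [ ]
PERMITTED_IN_DIRECTORY = r"[A-Za-z0-9/:_.\-$%&()\[\]]"
# The filename may additionally contain the wild card "*".
PERMITTED_IN_FILENAME = r"[A-Za-z0-9/:_.\-$%&()\[\]*]"


def invalid_characters(path):
    """Return a sorted list of characters in `path` that are not permitted."""
    # Split at the last forward slash: everything after it is the filename.
    directory, _, filename = path.rpartition("/")
    bad = set(re.sub(PERMITTED_IN_DIRECTORY, "", directory))
    bad |= set(re.sub(PERMITTED_IN_FILENAME, "", filename))
    return sorted(bad)


# Hypothetical paths used only for illustration.
print(invalid_characters("C:/mascot/sequence/MSDB/current/MSDB_*.fasta"))
print(invalid_characters("C:/Program Files/MSDB/MSDB_*.fasta"))
```

    An empty list means that every character in the path is permitted. The second path is reported because of the space.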


  4. Using back slashes in the database path

    Even if you are running Mascot on a Windows platform, all paths must use forward slashes.
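
    If a path has been copied from a Windows tool, it can be converted before use. A minimal Python sketch, with a hypothetical function name and example path:

```python
def to_forward_slashes(path):
    """Replace every backslash in a Windows-style path with a forward slash."""
    return path.replace("\\", "/")


# Hypothetical path used only for illustration.
print(to_forward_slashes(r"C:\mascot\sequence\MSDB\current\MSDB_*.fasta"))
```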


  5. Out of date taxonomy files

    The taxonomy files need to be of a similar date to the Fasta file. These files are used to create a taxonomy index at the time the Fasta file is compressed, so it's no use updating the files afterwards.


  6. Creating a sequence database with inconsistent title syntax

    If you merge two or more Fasta files into a single file, you need to ensure that a unique identifier (accession) can be parsed from all entries with a single parse rule. It may be necessary to use a few lines of Perl to reformat the title lines, rather than just copying or cat'ing the files together.
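
    The reformatting step might look something like the sketch below, written here in Python rather than Perl. The two title formats, the regular expressions, and the function names are all hypothetical; real title lines need to be inspected and the patterns adjusted to match. The idea is simply to rewrite every title line as ">accession description" so that one parse rule fits the whole merged file.

```python
import re

# Hypothetical title syntaxes for two source files. Each pattern must capture
# the accession in group 1 and the rest of the title in group 2.
PIPE_STYLE = re.compile(r"^>\w+\|([^|\s]+)\|?(.*)$")   # e.g. >db|ACC123|description
PLAIN_STYLE = re.compile(r"^>(\S+)\s*(.*)$")           # e.g. >ACC456 description


def normalize_title(line, pattern):
    """Rewrite one title line as '>accession description'."""
    match = pattern.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"title does not fit the expected syntax: {line!r}")
    accession = match.group(1)
    description = match.group(2).strip()
    return f">{accession} {description}".rstrip() + "\n"


def merge_fasta(sources):
    """Merge (lines, pattern) pairs into one list of lines with uniform titles.

    Raises ValueError if the same accession appears more than once.
    """
    merged = []
    seen = set()
    for lines, pattern in sources:
        for line in lines:
            if line.startswith(">"):
                line = normalize_title(line, pattern)
                accession = line[1:].split()[0]
                if accession in seen:
                    raise ValueError(f"duplicate accession: {accession}")
                seen.add(accession)
            merged.append(line)
    return merged


# Tiny in-memory example; with real files, pass open file objects instead.
first = [">db|ACC123|first protein\n", "MKT\n"]
second = [">ACC456 second protein\n", "GHL\n"]
merged = merge_fasta([(first, PIPE_STYLE), (second, PLAIN_STYLE)])
print("".join(merged), end="")
```

    Rejecting duplicate accessions is a design choice for this sketch, made because the merged file needs a unique identifier for every entry.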

 
 
Copyright © 2007 Matrix Science Ltd. All Rights Reserved.