
Finnigan Xcalibur

Interactive Searching

The utilities provided by Thermo Electron for converting Xcalibur binary (RAW) files into peak list (DTA) files, lcq_dta.exe and extract_msn.exe, are Windows console (DOS) applications. Usage information can be displayed by executing either utility without any arguments. Additional information can be found below, in your Xcalibur / BioWorks documentation, and on the Sequest web site at Scripps, under the heading extractms.

Mascot supports the DTA format. However, if the data are from an LC-MS/MS experiment, lcq_dta.exe and extract_msn.exe will generate a separate DTA file for each precursor. Searching individual DTA files is inefficient, and does not allow Mascot to generate a proper results summary. The recommended ways to search Xcalibur data using Mascot are:

  1. Create DTA files and merge them into a single file: If you have a set of DTA files but do not have access to the RAW file, or if you are using Mascot on the public web site, your only option is to concatenate the DTA files using one of these utilities.

    • merge.pl, a Perl script (any platform)
    • merge.bat, a DOS batch file (Windows)
    • merge.sh, a shell script (Unix)

    Download all three utilities for Windows or Unix

    If possible, you should choose the Perl script, because this creates a Mascot Generic Format (MGF) file in which each DTA file name is preserved as a spectrum title. This makes it easier to compare the Mascot search results with the original data, because you can identify the scan range represented by each spectrum. It also enables the origin of each DTA file to be tracked when data from multiple RAW files from a MudPIT experiment are merged together.

    Most Unix systems will already have Perl installed. If your Windows system doesn't have Perl, it can be downloaded free from ActiveState. (Quote from Bugzilla: "Any machine that doesn't have Perl on it is a sad machine indeed.")

  2. If you have a Windows-based Mascot server in-house: You can use the lcq_dta search form to upload and process the RAW file. When this form is submitted, the processing options are passed to lcq_dta.exe or extract_msn.exe, and the RAW file is processed into DTA files which are automatically merged into a single file, pre-loaded into a Mascot search form.

    [Screenshot: lcq_dta_shell]

    When Mascot is first installed, you need to edit the underlying script (lcq_dta_shell.pl) to specify the locations of a workspace directory and the directory containing the lcq_dta.exe or extract_msn.exe executable. These are defined by two variables near the top of the script:

    # local name of temp directory on Mascot server (no trailing slash)
    my $tempDir = "c:\\temp";

    # local path to lcq_dta.exe or extract_msn.exe on Mascot server
    my $lcqExe = "c:\\LCQ\\system\\programs\\lcq_dta.exe";

    Note the use of double backslashes in the path names.

    If you have BioWorks 3.0 or later, you will have extract_msn.exe instead of lcq_dta.exe. Old versions of lcq_dta_shell.pl used the option flag -P to set the temporary path, and this changed to -D in extract_msn.exe. So, if your lcq_dta_shell.pl is earlier than rev 19, the following line needs to be modified, replacing -P with -D:

    push @comLine, "-P\"$tempDir\"";

NB If you are submitting searches to the public web site, remember that the size of the upload file is limited to 300 spectra or 5 Mb. To avoid these limits, license Mascot to run on your in-house server.
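
As an illustration of option 1 above (merging DTA files into a single MGF file with the DTA file name kept as the spectrum title), here is a minimal Python sketch. It is not the merge.pl script. It assumes the usual DTA layout, in which the first line holds the singly protonated precursor mass (MH+) and the charge, and the remaining lines hold fragment m/z and intensity pairs. The function names and the proton mass constant are choices made for this example.

```python
from pathlib import Path

PROTON_MASS = 1.007276  # mass of a proton in Da (assumed constant for this sketch)


def dta_to_mgf_block(title, dta_text):
    """Convert the text of one DTA file into one MGF spectrum block."""
    rows = [ln.split() for ln in dta_text.splitlines() if ln.strip()]
    # First line of a DTA file: precursor MH+ and charge state.
    mh_plus, charge = float(rows[0][0]), int(rows[0][1])
    # MGF PEPMASS is the observed m/z, so convert MH+ back to m/z.
    mz = (mh_plus + (charge - 1) * PROTON_MASS) / charge
    out = ["BEGIN IONS", f"TITLE={title}", f"CHARGE={charge}+", f"PEPMASS={mz:.4f}"]
    # Remaining lines: fragment m/z and intensity, copied through unchanged.
    out += [f"{row[0]} {row[1]}" for row in rows[1:]]
    out.append("END IONS")
    return "\n".join(out)


def merge_dta(named_texts):
    """Merge (file name, DTA text) pairs into a single MGF string."""
    return "\n\n".join(dta_to_mgf_block(n, t) for n, t in named_texts) + "\n"


def merge_dta_dir(directory):
    """Merge every *.dta file in a directory, in file name order."""
    paths = sorted(Path(directory).glob("*.dta"))
    return merge_dta((p.name, p.read_text()) for p in paths)
```

Because each TITLE line carries the original DTA file name, a spectrum in the search results can be traced back to its scan range and, for merged MudPIT data, to its RAW file.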

LCQ_DTA.EXE and EXTRACT_MSN.EXE

Versions and Files

There have been several versions of lcq_dta.exe and, in general, you cannot process files from one release of Xcalibur using lcq_dta.exe from an earlier release. Unfortunately, it isn't always easy to figure out which version you have. All versions depend on two dynamic link libraries: FILEIO.DLL and FREGISTRY.DLL. In some versions, these libraries depend on further libraries (FGLOBAL.DLL, FCONTROL2.DLL, MFC42U.DLL).

For Xcalibur releases up to and including 1.2, lcq_dta.exe and associated files were part of the standard Xcalibur distribution, and could be found in the C:\Xcalibur\system\programs directory. In Xcalibur 1.3 / BioWorks 3.0, a version of lcq_dta.exe may be present in C:\inetpub\etc\Xcalibur. However, a similar utility called extract_msn.exe seems to be more up to date and reliable. This can be found in the C:\Xcalibur\system\programs directory.

If you need to move either utility to a different location, there is an excellent tool for identifying which files need to be moved: Dependency Walker. Drag and drop an executable or a DLL onto this utility, and it will highlight any conflicts or missing dependencies.

Processing Tips

Executing lcq_dta.exe or extract_msn.exe without any arguments provides usage information, and most options are self-explanatory. The following are worth noting:
  • Intermediate scans (-S): Although it looks like it should be OK to set -S to zero, this can sometimes result in no output.
  • Min. Peaks in DTA (-I): The default is 0, but this should always be set to a sensible number, say 10, to remove empty or near empty scans, since these can never give significant matches in Mascot.
  • Precursor Charge (-C): With triple-play data, precursor charge state determination is fairly sophisticated, and the default settings should not be changed. If your data don't include zoom scans, the code attempts to recognise singly charged precursors, while precursors with higher charge states are output twice, with 2+ and 3+ charge states.
  • TIC Threshold (-E): Not described in the usage information.
  • Extract MSn (-P): Not described in the usage information.
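
If DTA files have already been generated with the default minimum peak count of 0, a similar filter can be applied afterwards. The Python sketch below is only an illustration of the idea behind the -I setting, not how lcq_dta.exe or extract_msn.exe implement it. It assumes the usual DTA layout (one precursor line followed by one line per peak), and the function names are hypothetical.

```python
def count_peaks(dta_text):
    """Number of peak lines in a DTA file (all non-blank lines after the first)."""
    lines = [ln for ln in dta_text.splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def keep_spectrum(dta_text, min_peaks=10):
    """True if the spectrum has at least min_peaks peaks and is worth searching."""
    return count_peaks(dta_text) >= min_peaks
```

The threshold of 10 mirrors the value suggested above; empty or near-empty spectra are dropped because they cannot give significant matches.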

Profile vs. Centroid

lcq_dta.exe does not perform centroiding of profile data. If you generate DTA files from a RAW file containing profile data, the DTA files are themselves profile data. Zero intensity values are dropped, and non-zero intensities are output at 0.1 Da intervals. Mascot deals with this as best it can by performing simple peak detection, but this is less than ideal. The other problem with profile data is that the DTA files will be very large, and you may occasionally get a Mascot error message that there are more than 10,000 data points in a single spectrum.
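
To show what simple peak detection on profile data can look like, here is a minimal Python sketch. It is not Mascot's algorithm, which is not described here. It picks local maxima and reports an intensity-weighted centroid over the apex and its immediate neighbours. The function name and the max_gap parameter are assumptions for this example; max_gap is set slightly above the 0.1 Da spacing so that a point across a gap (where zero intensities were dropped) is not treated as a neighbour.

```python
def pick_peaks(points, max_gap=0.15):
    """Reduce profile data to centroids.

    points: list of (mz, intensity) tuples sorted by m/z.
    Returns a list of (centroid_mz, apex_intensity) tuples.
    """
    peaks = []
    n = len(points)
    for i, (mz, inten) in enumerate(points):
        # Neighbours further away than max_gap are treated as missing,
        # which is equivalent to a dropped zero-intensity point.
        left = points[i - 1] if i > 0 and mz - points[i - 1][0] <= max_gap else None
        right = points[i + 1] if i + 1 < n and points[i + 1][0] - mz <= max_gap else None
        left_i = left[1] if left else 0.0
        right_i = right[1] if right else 0.0
        # Local maximum: strictly above the left neighbour, not below the right.
        if inten > left_i and inten >= right_i:
            group = [p for p in (left, (mz, inten), right) if p]
            total = sum(p[1] for p in group)
            centroid = sum(p[0] * p[1] for p in group) / total
            peaks.append((round(centroid, 4), inten))
    return peaks
```

Centroiding at acquisition time, or in the instrument software, remains preferable: it keeps the peak lists small and avoids the data point limit mentioned above.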

Migrating from Sequest to Mascot

If you have previously used Sequest to search Xcalibur data, the main difference to watch for in Mascot is the way in which mass tolerances are specified. Sequest works in integers. Even if you specify a fragment ion tolerance of 0, the effective tolerance is still approximately ± 1 Da. With Mascot, try using a peptide mass tolerance of ± 1.5 Da and a fragment ion tolerance of ± 0.8 Da. The mass error graphs in the Protein View and Peptide View reports will allow you to judge whether these are appropriate settings for your data.

Automation using Mascot Daemon

Mascot Daemon can be used for automated searching of RAW files by choosing "ThermoFinnigan LCQ / DECA RAW file" as the data import filter. Unlike the lcq_dta web browser form, Daemon executes lcq_dta.exe or extract_msn.exe on the Windows client, so this option is available even if your Mascot server is on a Unix platform.

Real-time monitor

By running Mascot Daemon in real-time monitor mode, each RAW file can be searched automatically, as soon as acquisition is complete. First, create a suitable parameter set for the task:

[Screenshot: daemon parameter tab]

(Note that the file format is Mascot Generic, not DTA, because Mascot data import filters always create MGF files.) Second, create a real-time monitor task to monitor the directory where the RAW files are being created. Remember to select the correct parameter file, and choose 'ThermoFinnigan LCQ / DECA RAW file' as the data import filter.

[Screenshot: daemon task tab]

LCQ_DTA processing parameters are specified by choosing the Options button next to the data import filter list box.

[Screenshot: options dialog]

Troubleshooting

  • In real-time monitor mode, it is important that Mascot Daemon waits until acquisition is complete before processing the RAW file into peak lists. To avoid taking a file while it is still being written, Daemon checks the file size at intervals, and waits until it has stopped increasing. The default interval is 60 seconds, which may not be long enough when the file size grows only slowly. If Daemon tries to process a RAW file before acquisition is complete, go to the Timer Settings tab of the Preferences dialog and increase the value of 'Delay after failing to open read-locked file' until the problem disappears.

    [Screenshot: daemon preferences]

  • If there are problems processing very large RAW files, check that you have adequate disk space. When Daemon processes a RAW file, the workspace is in the local user's temp directory, the location of which is system dependent. Under Windows 2000, the path is C:\Documents and Settings\YourName\Local Settings\Temp. You'll know when you've found the right location because it will contain a sub-directory called Mascot_Daemon_workspace.
  • If Mascot Daemon reports "No output from lcq_dta.exe (check parameters)", or the lcq_dta search form returns "Must choose at least one query for repeat search", no DTA files were produced. The most common causes are (i) the lcq_dta parameters are too restrictive, (ii) the data file does not contain MS/MS scans, (iii) the version of lcq_dta.exe is older than the version of Xcalibur used to create the data file. The easiest way to investigate and debug this problem is to execute lcq_dta.exe at a command prompt, using identical processing parameters.
  • If your Mascot server runs under Windows XP, and you get the message "cannot create temporary directory" when you try to use the lcq_dta search form, this may be because the security settings do not allow CGI programs to execute the command processor. A fix is described on the Support page, in the Windows XP section.
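
The file size check described in the first troubleshooting item can be sketched as follows. This is a minimal Python illustration of the general technique, not Mascot Daemon's implementation. The function name and its parameters are hypothetical, and the size and sleep functions are passed in so that the logic can be exercised without a real acquisition.

```python
import os
import time


def wait_until_stable(path, interval=60.0, get_size=os.path.getsize, sleep=time.sleep):
    """Block until the file size is unchanged across one polling interval.

    Returns the final size in bytes.
    """
    last = get_size(path)
    while True:
        sleep(interval)
        current = get_size(path)
        if current == last:
            # No growth during the interval: assume acquisition has finished.
            return current
        last = current
```

The sketch also shows why a short interval is risky: if the file grows more slowly than the polling interval, two consecutive size checks can match while acquisition is still in progress.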

Acknowledgements

Sequest is a registered trademark of the University of Washington. Xcalibur is a registered trademark and BioWorks is a trademark of Thermo Electron Corporation.
 
 
Copyright © 2007 Matrix Science Ltd. All Rights Reserved.