Finnigan Xcalibur
Interactive Searching
The utilities provided by Thermo Electron for converting Xcalibur
binary (RAW) files into peak list (DTA) files, lcq_dta.exe and extract_msn.exe, are Windows console
(DOS) applications. Usage information can be displayed by executing either utility without
any arguments. Additional information can be found below, in your Xcalibur / Bioworks documentation, and on the
Sequest web site at Scripps,
under the heading extractms.
Mascot supports the DTA format. However, if the data are from an LC-MS/MS experiment,
lcq_dta.exe and extract_msn.exe will generate separate DTA files for each precursor mass.
Searching individual DTA files is inefficient, and doesn't allow Mascot to generate a proper results summary. The recommended
ways to search Xcalibur data using Mascot are:
- Create DTA files and merge them into a single file:
If you have a set of DTA files but
do not have access to the RAW file, or if you are using Mascot on the public web site,
your only option is to concatenate the DTA files using one of these utilities.
- merge.pl, a Perl script (any platform)
- merge.bat, a DOS batch file (Windows)
- merge.sh, a shell script (Unix)
Download all three utilities for Windows or Unix.
If possible, you should choose the Perl script, because this creates a Mascot Generic Format (MGF)
file in which each DTA file name is preserved as a spectrum title. This makes it easier to compare the Mascot
search results with the original data, because you can identify the scan range represented by each
spectrum. It also enables the origin of each DTA file to be tracked when data from
multiple RAW files from a MudPIT experiment are merged together.
Most Unix systems will already have Perl installed. If your Windows system doesn't have Perl, it
can be downloaded free from ActiveState.
(Quote from Bugzilla:
"Any machine that doesn't have Perl on it is a sad machine indeed.")
- If you have a Windows-based Mascot server in-house:
You can use the lcq_dta search form
to upload and process the RAW file. When this form is submitted, the processing options are passed
to lcq_dta.exe or extract_msn.exe, and the RAW file is processed into DTA files which are automatically merged into a single file,
pre-loaded into a Mascot search form.
When Mascot is first installed, you need to edit the underlying script (lcq_dta_shell.pl) to
specify the locations of a workspace directory and the
directory containing the lcq_dta.exe or extract_msn.exe executable. These are
defined by two variables near the top of the script:
# local name of temp directory on Mascot server (no trailing slash)
my $tempDir = "c:\\temp";
# local path to lcq_dta.exe or extract_msn.exe on Mascot server
my $lcqExe = "c:\\LCQ\\system\\programs\\lcq_dta.exe";
Note the use of double backslashes in the path names.
If you have Bioworks 3.0 or later, you will have extract_msn.exe instead of lcq_dta.exe.
Note that old versions of
lcq_dta_shell.pl used the option flag -P to set the temporary path, and this changed to -D in
extract_msn.exe. So, if your lcq_dta_shell.pl is earlier than rev 19 and you are using
extract_msn.exe, change -P to -D in the following line:
push @comLine, "-P\"$tempDir\"";
NB If you are submitting searches to the public web site, remember
that the size of the upload file is
limited to 300 spectra or 5 MB. To avoid these limits, license Mascot
to run on your in-house server.
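As a rough illustration of the merge step described above, the following Python sketch concatenates DTA files into a single MGF file and keeps each DTA file name as the spectrum title. It is not the merge.pl script. It assumes the usual DTA layout (the first line holds the singly protonated precursor mass, MH+, and the charge state; the remaining lines are m/z and intensity pairs). The function names and the directory and file names in the usage line are hypothetical.

```python
from pathlib import Path

PROTON_MASS = 1.007276  # mass of a proton in Da


def dta_to_mgf_block(name, dta_text):
    """Convert the text of one DTA file into a single MGF spectrum block."""
    lines = [ln.strip() for ln in dta_text.splitlines() if ln.strip()]
    # Assumed DTA layout: first line is "MH+ charge".
    mh_plus, charge = lines[0].split()[:2]
    charge = int(charge)
    # MGF PEPMASS is the observed precursor m/z, so convert from MH+.
    precursor_mz = (float(mh_plus) + (charge - 1) * PROTON_MASS) / charge
    block = [
        "BEGIN IONS",
        f"TITLE={name}",  # the DTA file name is preserved as the title
        f"CHARGE={charge}+",
        f"PEPMASS={precursor_mz:.4f}",
    ]
    block.extend(lines[1:])  # remaining lines are "m/z intensity" pairs
    block.append("END IONS")
    return "\n".join(block)


def merge_dta_directory(dta_dir, mgf_path):
    """Merge every *.dta file in dta_dir into one MGF file; return the count."""
    dta_files = sorted(Path(dta_dir).glob("*.dta"))
    blocks = [dta_to_mgf_block(p.name, p.read_text()) for p in dta_files]
    Path(mgf_path).write_text("\n\n".join(blocks) + "\n")
    return len(blocks)
```

A call such as merge_dta_directory("dta_files", "merged.mgf") would write one BEGIN IONS / END IONS block per DTA file, so the scan range encoded in each file name remains visible in the search results.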
LCQ_DTA.EXE and EXTRACT_MSN.EXE
Versions and Files
There have been several versions of lcq_dta.exe and, in general, you cannot process files
from one release of Xcalibur using lcq_dta.exe from an earlier release. Unfortunately, it isn't always
easy to figure out which version you have.
All versions depend on two dynamic link libraries: FILEIO.DLL and FREGISTRY.DLL. In some versions,
these libraries depend on further libraries (FGLOBAL.DLL, FCONTROL2.DLL, MFC42U.DLL).
For Xcalibur releases up to and including 1.2, lcq_dta.exe and associated files were part of the
standard Xcalibur distribution, and could be found in
the C:\Xcalibur\system\programs directory.
In Xcalibur 1.3 / Bioworks 3.0, a version of lcq_dta.exe may be present in C:\inetpub\etc\Xcalibur.
However, a similar utility called extract_msn.exe seems to be more up to date and reliable.
This can be found in the C:\Xcalibur\system\programs directory.
If you need to move either utility to a different location,
there is an excellent tool for identifying which files need to be moved:
Dependency Walker. Drag and drop
an executable or a DLL onto this utility, and it will highlight any conflicts or missing dependencies.
Processing Tips
Executing lcq_dta.exe or extract_msn.exe without any arguments provides usage information,
and most options are self-explanatory. The following are worth noting:
- Intermediate scans (-S): Although it looks like it should be OK to set S to zero, this
can sometimes result in no output.
- Min. Peaks in DTA (-I): The default is 0, but this should always be set to a sensible number,
say 10, to remove empty or near empty scans, since these can never give significant matches
in Mascot.
- Precursor Charge (-C): With triple-play data, precursor charge state determination is
fairly sophisticated, and the default settings should not be changed. If your data don't include zoom scans, the code
attempts to recognise singly charged precursors, while precursors with higher charge states are
output twice, with 2+ and 3+ charge states.
- TIC Threshold (-E): Not described in the Usage information.
- Extract MSn (-P): Not described in the Usage information.
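To make the effect of the minimum peak count concrete, here is a small Python sketch that counts the peaks in a DTA file and decides whether it is worth searching. It only illustrates the idea behind the -I option; it is not how lcq_dta.exe or extract_msn.exe implement it. It assumes that every non-blank line after the first line of a DTA file is one peak, and the function names are hypothetical.

```python
def count_peaks(dta_text):
    """Count fragment peaks in a DTA file (non-blank lines after the first)."""
    lines = [ln for ln in dta_text.splitlines() if ln.strip()]
    # The first line is assumed to hold the precursor mass and charge.
    return max(len(lines) - 1, 0)


def has_enough_peaks(dta_text, min_peaks=10):
    """Return True if the spectrum has at least min_peaks peaks."""
    return count_peaks(dta_text) >= min_peaks
```

With the threshold of 10 suggested above, a spectrum containing only a handful of peaks would be rejected before it is ever submitted to a search.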
Profile vs. Centroid
lcq_dta.exe does not perform centroiding of profile data. If you generate DTA files from a
RAW file containing profile data, the DTA files are themselves profile data. Zero intensity values are dropped,
and non-zero intensities are output at 0.1 Da intervals. Mascot deals with this as best it can by performing simple
peak detection, but this is less than ideal. The other problem of working with profile data is
that the DTA files will be very large, and you may
occasionally get a Mascot error message that there are more than 10,000 data points in a single spectrum.
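The following Python sketch shows one very simple form of peak detection on profile data of the kind described above: it keeps only local intensity maxima. This is an illustrative assumption, not the algorithm Mascot uses. Because zero intensities are dropped from the DTA output, a neighbouring point further away than max_gap (a hypothetical parameter, set slightly above the 0.1 Da spacing) is treated as zero intensity.

```python
def pick_peaks(points, max_gap=0.15):
    """Reduce profile points to local maxima.

    points is a list of (mz, intensity) tuples sorted by m/z.
    A neighbour further than max_gap away counts as zero intensity.
    """
    peaks = []
    n = len(points)
    for i, (mz, intensity) in enumerate(points):
        left = 0.0
        if i > 0 and mz - points[i - 1][0] <= max_gap:
            left = points[i - 1][1]
        right = 0.0
        if i + 1 < n and points[i + 1][0] - mz <= max_gap:
            right = points[i + 1][1]
        # Strictly above the left neighbour and not below the right one,
        # so a flat-topped peak is reported once, at its first point.
        if intensity > left and intensity >= right:
            peaks.append((mz, intensity))
    return peaks
```

Reducing a profile spectrum in this way before searching also keeps the number of data points per spectrum well below the limit mentioned above.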
Migrating from Sequest to Mascot
If you have previously used Sequest to search Xcalibur data, the main difference to
watch for in Mascot is the way in which mass tolerances are specified. Sequest works in
integers. Even if you specify a fragment ion tolerance of 0, the effective
tolerance is still approximately ± 1 Da. With Mascot, try using a peptide mass tolerance
of ± 1.5 Da and a fragment ion tolerance of ± 0.8 Da. The mass error graphs in the Protein View
and Peptide View reports will allow you to judge whether these are appropriate settings for your
data.
Automation using Mascot Daemon
Mascot Daemon can be used for automated
searching of RAW files by choosing "ThermoFinnigan LCQ / DECA RAW file" as the data import filter. Unlike the lcq_dta web browser
form, Daemon executes lcq_dta.exe or extract_msn.exe on the Windows client, so this option is
available even if your Mascot server is on a Unix platform.
Real-time monitor
By running Mascot Daemon in real-time monitor mode, each RAW file can be searched automatically,
as soon as acquisition is complete. First, create a suitable parameter set for the task:
(Note that the file format is Mascot Generic, not DTA, because Mascot data import filters
always create MGF files.) Second, create a real-time monitor task to monitor the directory where the RAW files
are being created. Remember to select the correct parameter file, and choose 'ThermoFinnigan LCQ /
DECA RAW file' as the data import filter.
LCQ_DTA processing parameters are specified by choosing the Options button next to the
data import filter list box.
Troubleshooting
- In real-time monitor mode, it is important that Mascot Daemon waits until acquisition is complete before
processing the RAW file into peak lists. To avoid taking a file while it is still being written, Daemon checks the file size
at intervals, and waits until it has stopped increasing. The default interval is 60 seconds, which
may not be long enough when the file size grows only slowly. If Daemon tries to process a RAW file
before acquisition is complete, increase this interval by going to the
Timer Settings tab of the Preferences dialog. Increase the value of 'Delay after failing to open
read-locked file' until the problem disappears.

- If there are problems processing very large RAW files, check that you have adequate disk space. When Daemon
processes a RAW file, the workspace is in the local user's temp
directory, the location of which is system dependent. Under Windows 2000, the path is C:\Documents and Settings\YourName\Local
Settings\Temp. You'll know when you've found the right location because it will contain a sub-directory called
Mascot_Daemon_workspace.
- If Mascot Daemon reports "No output from lcq_dta.exe (check parameters)"
or the lcq_dta search form returns "Must choose at least one query for
repeat search",
this means that no DTA files were produced. The most common causes are (i) the lcq_dta parameters
are too restrictive, (ii) the data file does not contain MS/MS scans, or (iii) the version of lcq_dta.exe
is older than the version of Xcalibur used to create the data file. The easiest way to investigate
and debug this problem is to execute lcq_dta.exe at a command prompt, using identical processing
parameters.
- If your Mascot server runs under Windows XP, and you get the message "cannot create
temporary directory" when you try to use the lcq_dta search form, this may be because
the security settings do not allow CGI programs to execute the command processor. A fix
is described on the Support page, in the Windows XP section.
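The file-size check described in the first troubleshooting item can be sketched in Python as follows. This is only an illustration of the polling idea, not Mascot Daemon's actual code. The function name is hypothetical, and the get_size and sleep parameters exist only so that the logic can be exercised without waiting in real time.

```python
import os
import time


def wait_until_stable(path, interval=60.0,
                      get_size=os.path.getsize, sleep=time.sleep):
    """Block until the size of path is unchanged across one polling interval.

    Returns the final file size in bytes.
    """
    previous = get_size(path)
    while True:
        sleep(interval)
        current = get_size(path)
        if current == previous:
            # No growth during the last interval: assume acquisition is done.
            return current
        previous = current
```

The sketch also shows why a short interval can fail: if the file grows slowly enough that its size is unchanged between two consecutive checks, the file is taken as complete while acquisition is still in progress.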
Acknowledgements
Sequest is a registered trademark of the University of Washington.
Xcalibur is a registered trademark and BioWorks is a trademark of Thermo Electron Corporation.