The inclusion of intensity values, separated from mass values by colons, is optional.
If intensity values are not included, then the colons must also be omitted, as in the
y series example.
Mascot uses the intensity information to iteratively select sub-sets of
the most intense peaks in order to optimise scoring discrimination.
Mass values do not need to be in order, or represent contiguous
sequence ion ladders.
A line may contain several ions information qualifiers, for
example:
1454.4 ions(b-610,707,804,1086) ions(y-2909) ions(2106,2632,2545)
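As an illustration of the syntax described above, the following is a minimal Python sketch that splits a query line into its ions() qualifiers. The function name parse_ions_qualifiers is hypothetical and this is not Mascot's own parser. It assumes that a series prefix is written as a letter code followed by a hyphen (as in b-610), and that an optional intensity follows the mass after a colon.

```python
import re

def parse_ions_qualifiers(query_line):
    """Return one (series, peaks) pair per ions(...) qualifier in the line.

    series is the lower-cased prefix such as 'b' or 'y', or None if absent.
    peaks is a list of (mass, intensity) tuples; intensity is None when omitted.
    """
    results = []
    for body in re.findall(r"ions\(([^)]*)\)", query_line, flags=re.IGNORECASE):
        series = None
        # Assumed prefix form: letter code, then a hyphen, before the first mass.
        prefix = re.match(r"\s*([A-Za-z][A-Za-z0-9+*]*)\s*-\s*", body)
        if prefix:
            series = prefix.group(1).lower()
            body = body[prefix.end():]
        peaks = []
        for item in body.split(","):
            item = item.strip()
            if not item:
                continue
            mass, _, intensity = item.partition(":")
            peaks.append((float(mass), float(intensity) if intensity else None))
        results.append((series, peaks))
    return results

line = "1454.4 ions(b-610,707,804,1086) ions(y-2909) ions(2106,2632,2545)"
parsed = parse_ions_qualifiers(line)
```

Here parsed holds three entries: a b series with four peaks, a y series with one peak, and an unassigned group of three peaks.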
The sequence tag qualifier consists of the observed mass of the first peak of an
identified sequence ladder, a stretch of interpreted amino acid sequence, and the observed mass of the
final peak of the ladder. For example:
1890.2 tag(1004.1, LSADTG, 1548.5)
Use of whitespace (tabs, spaces) inside the parentheses is optional, for readability.
Case is not significant. Other qualifiers, including other sequence
tags, may be included in the same query.
The syntax for the sequence string is similar to a seq() qualifier, but without the prefix.
That is, [IL]SAXTG would be allowed. X means unknown, and is equivalent
to [ACDEFGHIKLMNPQRSTUVWY]. There is little point in specifying X in
a standard tag, but it can be useful in an error tolerant tag.
In a tag, the sequence syntax is extended to describe alternative dimers, trimers, etc.
For example: LSA[DT|M|F]G. The pipe symbol divides alternatives,
so that the defined possibilities in this case are LSADTG, LSAMG, LSAFG.
This provides a convenient way to represent the ambiguities that are found
when trying to interpret a spectrum. A term in square brackets without pipe
symbols defaults to the original sense of a character class. That is, [IL]
is identical to [I|L].
Note that alternatives delimited by pipe symbols are sequences,
not character classes. [DT|M|F] is not the same as [DT|TD|M|F].
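The expansion rules above can be sketched in Python as follows. The function name expand_tag_sequence is hypothetical, and this is only an illustration of the stated rules, not Mascot's implementation. X is deliberately left as a literal X rather than being expanded to every residue.

```python
import re
from itertools import product

def expand_tag_sequence(pattern):
    """List every concrete sequence described by a tag sequence string."""
    # Each token is either a bracketed group or a single residue letter.
    tokens = re.findall(r"\[([^\]]*)\]|([A-Za-z])", pattern)
    choices = []
    for group, letter in tokens:
        if letter:
            choices.append([letter.upper()])
        elif "|" in group:
            # Pipe-delimited alternatives are whole sequences (dimers, trimers, ...).
            choices.append([alt.strip().upper() for alt in group.split("|")])
        else:
            # Without pipes, the group is a character class: one residue from the set.
            choices.append([c.upper() for c in group if c.isalpha()])
    return ["".join(parts) for parts in product(*choices)]

expanded = expand_tag_sequence("LSA[DT|M|F]G")
```

For LSA[DT|M|F]G this gives LSADTG, LSAMG and LSAFG, and [IL] expands to the same set as [I|L].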
A tag may run in either direction, but the mass values
are 'glued' to the ends of the tag. Hence, tag(1004, LSADTG, 1548) is
the same as tag(1548, GTDASL, 1004) but different to tag(1548, LSADTG, 1004).
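One way to see why the first two tags are equivalent is to put each tag into a canonical orientation, with the lower mass first, reversing the sequence whenever the masses are swapped. The sketch below uses a hypothetical helper, canonical_tag, and handles only plain sequences without square brackets.

```python
def canonical_tag(first_mass, sequence, last_mass):
    """Orient a tag so that the lower mass comes first.

    Because the masses are tied to the ends of the sequence, swapping the
    masses also means reversing the sequence string.
    """
    if first_mass > last_mass:
        return (last_mass, sequence[::-1], first_mass)
    return (first_mass, sequence, last_mass)

same = canonical_tag(1004, "LSADTG", 1548) == canonical_tag(1548, "GTDASL", 1004)
different = canonical_tag(1004, "LSADTG", 1548) != canonical_tag(1548, "LSADTG", 1004)
```

Both same and different evaluate to True, matching the statement above.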
The observed fragment ion mass values can belong to any series,
including doubly charged series if permitted by the precursor charge
and instrument type. However, both fragment ion mass values must belong to
the same series. That is, they can both be y or y++ or y-17 but one
cannot be y and the other y-17.
If the tag includes an ambiguous sequence string and there are variable modifications
or a wide peptide mass tolerance
or no enzyme specificity, this may generate a very large number of possibilities.
Such searches can take a long time to complete and are unlikely to give a high score.
It is not possible to mix ions() qualifiers and sequence
tags in the same query.
A sequence tag can match to a peptide despite there being an unsuspected
modification or point mutation by allowing the mass values to 'float'.
For example, take the peptide GVQVETISPGDGR, MH+ = 1314.7 and
the (b ion) sequence tag:
1314.7 tag(614.3,TISP,911.5)
If there was an unsuspected modification on the N-terminal side of the tag,
which increased the mass by 100, this would affect both the fragment ion mass
values in tandem. The tag interpreted from the spectrum would become:
1414.7 tag(714.3,TISP,1011.5)
On the other hand, if the unsuspected modification was on the C-terminal
side of the tag, or if the fragment ions were y series ions, the fragment
ion mass values would be unchanged, and the interpreted tag would be:
1414.7 tag(614.3,TISP,911.5)
By entering a sequence tag as an error tolerant sequence tag, using the keyword etag, you can have
Mascot search for these possibilities automatically. When searching an etag,
the peptide molecular weight constraint is relaxed and the fragment ion mass
values must fit one of two possibilities. Either both values
are unchanged or both values are shifted by the same amount as the
peptide mass.
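The two permitted cases can be written as a small check. This Python sketch is an illustration only: the function name, the argument layout and the tolerance value are assumptions, not Mascot's internal logic. Here expected holds the fragment masses calculated for the candidate peptide, observed holds the masses entered in the etag, and peptide_delta is the difference between the observed and calculated peptide mass.

```python
def etag_shift_is_consistent(expected, observed, peptide_delta, tol=0.5):
    """Check the two allowed etag cases for a pair of fragment ion masses.

    Case 1: both observed masses are unchanged.
    Case 2: both observed masses are shifted by the peptide mass difference.
    tol is an illustrative fragment tolerance in Da.
    """
    shift_first = observed[0] - expected[0]
    shift_last = observed[1] - expected[1]
    unchanged = abs(shift_first) <= tol and abs(shift_last) <= tol
    shifted = (abs(shift_first - peptide_delta) <= tol
               and abs(shift_last - peptide_delta) <= tol)
    return unchanged or shifted

# GVQVETISPGDGR example: b ion tag masses 614.3 and 911.5, peptide 100 Da heavier.
both_shifted = etag_shift_is_consistent((614.3, 911.5), (714.3, 1011.5), 100.0)
both_unchanged = etag_shift_is_consistent((614.3, 911.5), (614.3, 911.5), 100.0)
only_one_shifted = etag_shift_is_consistent((614.3, 911.5), (714.3, 911.5), 100.0)
```

The first two cases are accepted; the third, where only one end of the tag has moved, is rejected.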
Because an etag sacrifices most of the specificity of a standard sequence
tag, it is not permitted to combine it with a very wide peptide mass tolerance
(> 1% or > 10 Da) or no enzyme
specificity. Also, because the constraint on the peptide mass is dropped, if
one tag is error tolerant, then any other tags for the same query are also treated
as error tolerant, even if they have been entered as standard tags. Finally, it is
not possible to mix ions() qualifiers and sequence tags.
peptol(tolerance,unit) may be used to specify a mass tolerance
for an individual query, over-riding the search form default. For example, peptol(10,%)
or peptol(2,Da).
If you re-Search a Sequence Query from the results page, you may notice
two additional qualifiers which are used internally by Mascot:
from(mass,charge) is used to track the original
mass and charge state of the peptide after it has been converted to a neutral
Mr value. For example, if the peptide charge state was specified to be 1+,
the query 1234.5 would become 1233.492 from(1234.5,1+)
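The conversion to a neutral Mr value is presumably the usual removal of the charge-carrying protons. A minimal Python sketch, assuming a proton mass of 1.007276 Da (the exact constant and rounding used by Mascot are not stated here, so the last digit may differ from the example above):

```python
PROTON_MASS = 1.007276  # Da; assumed value, not taken from this page

def mz_to_mr(mz, charge):
    """Convert an observed m/z and positive charge state to a neutral Mr."""
    return (mz - PROTON_MASS) * charge

mr_singly = mz_to_mr(1234.5, 1)
mr_doubly = mz_to_mr(877.4, 2)
```

With these assumptions, an observed value of 877.4 at 2+ gives an Mr of about 1752.7854, which agrees with the Mr(expt) value reported for that query in the example search results later in this section.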
title(encoded title text) can be used to associate a text string
with an individual query. If the text contains non-alphanumeric characters, these
must be URL encoded by conversion to %nn, where nn is the hexadecimal ASCII code
for the character. For example, Sample(1) becomes Sample%281%29.
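The encoding rule can be sketched in a few lines of Python. The helper name encode_title is hypothetical. It encodes every non-alphanumeric ASCII character, as described above, and raises an error for non-ASCII input, because the handling of such characters is not covered here.

```python
def encode_title(text):
    """Replace each non-alphanumeric ASCII character with %nn (hexadecimal code)."""
    encoded = []
    for byte in text.encode("ascii"):  # raises UnicodeEncodeError for non-ASCII text
        ch = chr(byte)
        encoded.append(ch if ch.isalnum() else f"%{byte:02X}")
    return "".join(encoded)

title = encode_title("Sample(1)")
```

For Sample(1) this yields Sample%281%29, matching the example above.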
Load a Sequence Query form, paste the following search into the query window, and submit
the search.
TAXONOMY=. . . . . . . . . . lobe-finned fish and tetrapod clade
REPTYPE=Peptide
TOL=0.03
TOLU=%
ITOL=0.5
ITOLU=Da
CHARGE=2+
INSTRUMENT=ESI-TRAP
877.4 tag(376.2, [IL][QK][IL], 730.2)
687.3 etag(782.3, NG[IL], 1066.1)
These two sequence tags are taken from the original paper of
Mann and Wilm. You should
find that both match to Lysozyme:
1. 1A2YC Mass: 14260 Score: 76 Peptides matched: 2
lysozyme (EC 3.2.1.17) mutant (D18A), chain C - chicken
Query  Observed  Mr(expt)   Mr(calc)   Delta     Miss  Score  Expect  Rank  Peptide
1      877.4000  1752.7854  1752.8278  -0.0424   0     35     0.0021  1     NTDGSTDYGILQINSR
2      687.3000  1372.5854  1267.6019  104.9836  0     42     0.31    1     GYSLGNWVCAAK
The error tolerant tag has found a match by adjusting the peptide mass by 105 Da, corresponding
to S-pyridylethylation of the cysteine residue.
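As a rough arithmetic check on this assignment, the mass difference for query 2 can be compared with the monoisotopic mass of a pyridylethyl group, taken here to be C7H7N. The composition and atomic masses below are assumptions brought in for illustration, and comparing against the 0.03% peptide tolerance from the search is only an indicative yardstick, not a statement of how Mascot scores the match.

```python
# Monoisotopic atomic masses in Da (assumed standard values)
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740}

# Assumed composition of the added pyridylethyl group: C7H7N
PYRIDYLETHYL = 7 * MONO["C"] + 7 * MONO["H"] + MONO["N"]

mr_expt = 1372.5854  # query 2, from the results table
mr_calc = 1267.6019

delta = mr_expt - mr_calc               # observed mass shift
residual = delta - PYRIDYLETHYL         # shift left unexplained by the modification
tolerance_da = mr_expt * 0.03 / 100     # TOL=0.03, TOLU=% expressed in Da

within_tolerance = abs(residual) <= tolerance_da
```

Under these assumptions the group mass is about 105.058 Da and the residual is well inside the 0.03% window.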