Modifications tab
The Modifications tab is where the user defines fixed and variable modifications, sets their limits, and configures Wildcard searches.
Figure 1: Modifications options
Like most proteomics search engines, Byonic supports two types of modifications: fixed and variable. A fixed modification is assumed to occur on all the residues of that type, but a variable modification is optional, so that each site for a variable modification is considered with and without the modification.
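As a rough illustration of the difference, the sketch below enumerates the forms of a peptide under one fixed rule and one variable rule. The function `modified_forms` and its argument layout are hypothetical and are not part of Byonic; the delta masses are the carbamidomethyl and oxidation values quoted later in this section.

```python
from itertools import combinations

def modified_forms(peptide, fixed, variable, max_variable):
    """Enumerate per-residue mass shifts for every allowed form of a peptide.

    fixed:    {residue: delta} applied to every matching residue
    variable: {residue: delta} applied to any subset of matching residues
    Illustrative only; this is not how Byonic is implemented.
    """
    # Fixed modifications are always present.
    base = [fixed.get(aa, 0.0) for aa in peptide]
    # Each variable site is considered with and without the modification.
    sites = [i for i, aa in enumerate(peptide) if aa in variable]
    forms = []
    for k in range(min(max_variable, len(sites)) + 1):
        for chosen in combinations(sites, k):
            shifts = list(base)
            for i in chosen:
                shifts[i] += variable[peptide[i]]
            forms.append(shifts)
    return forms

forms = modified_forms("MCAM", fixed={"C": 57.021464},
                       variable={"M": 15.994915}, max_variable=2)
print(len(forms))  # 4 forms: neither M, first M, second M, both M oxidized
```

The cysteine carries its fixed shift in every form, while the two methionines produce four combinations.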
Figure 2: Click within the text box to activate the light blue “…” button
Select Modifications
To add fixed and variable modifications, click Enter/Edit to open the Select Modifications dialog:
Figure 3: Select Modifications dialog
The user can then specify any number of modification rules via a pull-down menu containing all the modifications listed in www.unimod.org. For convenience, frequently used modifications are listed twice, at the top and again in the complete list. The three pull-down menus in each row select modification type, target residues, and fine control.
NOTE: There is a fourth pull-down, which lets the user delete, invert (as in (De)Carbamidomethyl), or add “attributes” to modifications. Attributes allow the user to define protein-specific modifications.
- Modifications displays a list of common modifications and their corresponding delta masses. The list includes all candidates found in www.unimod.org.
Figure 4: Modification Name / Mass Delta
NOTE: For convenience, frequently used modifications are listed twice, once at the top and again in the complete list.
- Targets displays a list of possible target locations associated with the selected modification name. The Targets field allows the 20 one-letter amino acid abbreviations, as well as four special locations: NTerm, CTerm, Protein NTerm, and Protein CTerm. NTerm, CTerm, Protein NTerm, and Protein CTerm can also be used as modifiers of amino acid residues. Targets form a comma-separated list.
Figure 5: Modification Targets
- Fine control marks the modification as “Fixed” or “Variable”, sets whether a variable modification is considered “Rare” or “Common”, and sets the maximum count of that modification in the peptide.
Figure 6: Modification Fine Control
The button in the last column exposes additional actions that can be performed on modifications:
Figure 7: Additional Modification actions
- Invert modification defines a variable modification that is the removal of a fixed modification (for example, under-alkylation). The modification is labeled with the prefix “(De)”.
Figure 8: Invert modification
- Add attribute allows the user to define protein-specific modifications. For example, adding the attribute “ProteinLabel{collagen}” to Oxidation on P allows hydroxyproline only on proteins with “collagen” in their protein names:
Figure 9: Add modification attribute
- Delete removes that modification record.
To add a new modification, scroll to the bottom of the modification records and add entries for Modifications, Targets, and Fine Control.
Total Common Max and Total Rare Max
Byonic offers a feature not found in other search engines: the user designates each variable modification as either “common” or “rare”, with the names suggesting their use. The user can define separate limits on the number of occurrences of each variable modification, so that “common2” means at most two occurrences per peptide. Separate limits can also be set for the total number of common and rare modifications per peptide. A typical search allows a total of at most two common modifications and a total of at most one rare modification per peptide. To search for, say, three phosphoserines per peptide, the user can change Total Common Max to 3 or split phosphorylated serine between two rules: common2 and rare1. Depending upon the other modification rules, the latter approach may give a faster search.
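The interaction between per-rule limits and the two totals can be sketched as a simple check. This is a minimal illustration, not Byonic's implementation: the function `within_limits`, the rule labels, and the data layout are hypothetical, and the sketch treats both totals as plain upper bounds (it does not model the zero-means-no-maximum convention).

```python
from collections import Counter

def within_limits(assigned, rules, total_common_max, total_rare_max):
    """Check a peptide's variable modifications against per-rule and total limits.

    assigned: list of rule names, one entry per modified site on the peptide
    rules:    {rule name: (kind, max_per_peptide)} with kind "common" or "rare"
    """
    counts = Counter(assigned)
    totals = {"common": 0, "rare": 0}
    for name, n in counts.items():
        kind, per_rule_max = rules[name]
        if n > per_rule_max:          # per-rule limit, e.g. common2
            return False
        totals[kind] += n
    return (totals["common"] <= total_common_max
            and totals["rare"] <= total_rare_max)

# Phosphoserine split across two rules, as described above: common2 plus rare1.
rules = {"Phospho@S (common2)": ("common", 2), "Phospho@S (rare1)": ("rare", 1)}
triple = ["Phospho@S (common2)"] * 2 + ["Phospho@S (rare1)"]
print(within_limits(triple, rules, total_common_max=2, total_rare_max=1))  # True
print(within_limits(["Phospho@S (common2)"] * 3, rules, 2, 1))             # False
```

With the split rules, three phosphoserines fit within a Total Common Max of 2 and a Total Rare Max of 1, whereas three uses of the common2 rule alone do not.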
NOTE: The single most important factor in search time is Total Common Max. Roughly speaking, the search time grows as C*T where C is the number of common modifications enabled and T is Total Common Max.
Conceptually, the search engine has one modification “slot” for each residue, along with slots for the peptide’s N- and C-termini. A variable modification such as +0.984016 @ N uses up the residue slot; a nonspecific terminal modification such as +57.021464 @ NTerm uses up the terminal slot; but residue-specific N-terminal modifications, such as -17.026549 @ NTerm Q, use up both the residue and the N-terminal slots.
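The slot model can be sketched as a small conflict check. The slot labels and the functions `slots_used` and `compatible` are hypothetical names for illustration; the sketch also assumes that a “Protein” prefix uses the same terminal slot as the peptide terminus, which the text does not state.

```python
def slots_used(target, position):
    """Return the slots one target occupies when placed at a 0-based position.

    target examples: "N", "NTerm", "NTerm Q" (the notation used in this section).
    """
    used = set()
    for token in target.split():
        if token == "Protein":
            continue  # assumption: protein termini share the peptide terminal slots
        if token == "NTerm":
            used.add("nterm")
        elif token == "CTerm":
            used.add("cterm")
        else:
            used.add(("residue", position))
    return used

def compatible(placements):
    """True if no slot is claimed by more than one placed modification."""
    seen = set()
    for target, position in placements:
        slots = slots_used(target, position)
        if seen & slots:
            return False
        seen |= slots
    return True

# Peptide QNGK: pyro-Glu on the N-terminal Q plus deamidation on N at index 1.
print(compatible([("NTerm Q", 0), ("N", 1)]))      # True
# NTerm Q takes the N-terminal slot, so a second N-terminal modification clashes.
print(compatible([("NTerm Q", 0), ("NTerm", 0)]))  # False
```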
The big open box (shown in the figure above) is a space for the user to type in custom modifications not listed in Unimod. The manual fine control format has the form:
Modification_Name / Mass_Delta @ Targets | Fine_Control
Modification_Name / is optional. The Targets field allows the 20 one-letter amino acid abbreviations, as well as four special locations: NTerm, CTerm, Protein NTerm, and Protein CTerm. These four locations can also be used as modifiers of amino acid residues. Targets form a comma-separated list.
Here is an example of a real modification not (yet) in Unimod:
DehydroFormyl / +9.98435 @ NTerm S, NTerm T | rare1
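For readers who keep rule lists in scripts, the format above can be parsed with a few string splits. This is a minimal sketch based only on the examples in this section; `parse_rule` and its output keys are hypothetical, and Byonic's own parser may accept variants not handled here.

```python
import re

def parse_rule(line):
    """Parse 'Name / Mass_Delta @ Targets | Fine_Control' into a dict.

    The name is optional. Fine control is 'fixed', or 'common'/'rare'
    followed by a per-peptide count (for example 'rare1').
    """
    left, fine = (part.strip() for part in line.rsplit("|", 1))
    head, targets = (part.strip() for part in left.rsplit("@", 1))
    if "/" in head:
        name, mass = (part.strip() for part in head.rsplit("/", 1))
    else:
        name, mass = None, head
    match = re.fullmatch(r"fixed|(common|rare)\s*(\d+)", fine)
    if match is None:
        raise ValueError(f"unrecognized fine control: {fine!r}")
    if match.group(1) is None:
        kind, max_count = "fixed", None
    else:
        kind, max_count = match.group(1), int(match.group(2))
    return {
        "name": name,
        "mass_delta": float(mass),
        "targets": [t.strip() for t in targets.split(",")],
        "kind": kind,
        "max_count": max_count,
    }

rule = parse_rule("DehydroFormyl / +9.98435 @ NTerm S, NTerm T | rare1")
print(rule["targets"])  # ['NTerm S', 'NTerm T']
```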
To set maximum allowed modifications, set values for Total common max and Total rare max. To set no maximum, leave the value at zero.
In Figure , the user specified Carbamidomethyl / 57.021464 @ C | fixed, meaning carbamidomethylated cysteine (camC). The user also specified Oxidation / +15.994915 @ M | common2, directing the program to consider each methionine residue with and without this modification, up to a limit of 2 such modifications per peptide. In addition, the user specified Ammonia-loss / -17.026549 @ NTerm C | rare1, indicating that the program also considers this modification for any N-terminal cysteines as a rare variable modification. Variable modifications are added on top of fixed modifications, so the total mass added to these N-terminal C’s will be 57.021464 - 17.026549 = 39.994915, which corresponds to the Unimod entry Pyro-carbamidomethyl.
One way to represent incomplete carbamidomethylation is with these two rules: Carbamidomethyl / +57.021464 @ C | fixed and (De)Carbamidomethyl / -57.021464 @ C | common2.
The rule Carbamidomethyl / +57.021464 @ NTerm | rare1 specifies a common artifact (over-alkylation) on the peptide N-terminus.
The next two rules, +0.984016 @ N | common2 and +0.984016 @ Q | common1, represent deamidation; here the user is allowing up to two deamidated asparagines (the more common deamidation) but only one deamidated glutamine per peptide.
The rule Gln->pyro-Glu / -17.026549 @ NTerm Q | rare1 specifies a modification that occurs only on peptides with N-terminal glutamine.
Non-Standard (Unnatural) Amino Acid
A limited number of nonstandard amino acid residues can be supported by redefining one-letter amino acid abbreviations using fixed modifications. B, Z, U, O, J, and X are accepted within FASTA protein databases, with masses, respectively, of 114.042927 (same as N), 128.058578 (same as Q), 150.95363 (selenocysteine), 237.052645 (pyrrolysine), 100.0, and 110.05 (close to averagine). By placing, for example, a fixed modification of +13.04768 on J, the user can make J in a FASTA database have mass 113.04768, correct for hydroxyproline. However, the amino acid sequence is used to predict peak intensity, so this fixed modification on J will not give the same scores as a +15.9949 variable modification on P.
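The hydroxyproline arithmetic can be checked in a few lines. The masses for the extra FASTA letters are the ones quoted above; the proline residue mass (97.052764) is the standard monoisotopic value and is not taken from this manual.

```python
# Residue masses quoted in the paragraph above for the extra FASTA letters.
EXTRA_RESIDUE_MASS = {
    "B": 114.042927, "Z": 128.058578, "U": 150.95363,
    "O": 237.052645, "J": 100.0, "X": 110.05,
}
PROLINE = 97.052764    # standard monoisotopic residue mass of P (not from this manual)
OXIDATION = 15.994915  # oxidation delta mass quoted earlier in this section

# Fixed modification of +13.04768 on J, as in the example above.
j_as_hydroxyproline = EXTRA_RESIDUE_MASS["J"] + 13.04768
print(round(j_as_hydroxyproline, 5))  # 113.04768
print(round(PROLINE + OXIDATION, 5))  # 113.04768
```

Both routes give the same residue mass to five decimal places; as noted above, they can still score differently because the sequence letter feeds the peak intensity prediction.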
Figure 10: Non-Standard Amino Acid residues table in Byonic
Wildcard search
The Wildcard search section of the Modifications tab lets the user turn on wildcard searches, set the range for the wildcard mass, and restrict the wildcard to certain residues if desired.
Figure 12: Byonic Wildcard search drop down menu
The Restrict to residues box uses the common 20 single-letter residue abbreviations; lower-case n denotes the peptide N-terminus and lower-case c denotes the peptide C-terminus. An invalid entry for Restrict to residues gives an error message listing the valid entries. The tooltip also specifies the entry format:
Figure 13: Byonic Wildcard search Restrict to residue entry format
A wildcard, even one with a mass range of only 50 or 60 Da, greatly increases the size of the search. It is best used with a focused database (see the Advanced tab section below) and used either alone or with only a few other modifications enabled. Most wildcard mass shifts will be recognizable by an expert; hence, a wildcard can be used to discover which known modifications should be enabled in a subsequent search.
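One way to triage wildcard results outside Byonic is to compare each observed mass shift with a table of known delta masses. The sketch below is an assumption-laden illustration: the function `candidates`, the 0.01 Da tolerance, and the short table (built from delta masses quoted in this section) are hypothetical, and a real lookup would use the full Unimod list.

```python
# Delta masses quoted elsewhere in this section; a real table would be much longer.
KNOWN_DELTAS = {
    "Carbamidomethyl": 57.021464,
    "Oxidation": 15.994915,
    "Deamidation": 0.984016,
    "Ammonia-loss": -17.026549,
    "DehydroFormyl": 9.98435,
}

def candidates(observed_delta, tolerance=0.01):
    """Return known modifications within `tolerance` Da of an observed wildcard mass,
    closest match first."""
    hits = [(abs(mass - observed_delta), name)
            for name, mass in KNOWN_DELTAS.items()
            if abs(mass - observed_delta) <= tolerance]
    return [name for _, name in sorted(hits)]

print(candidates(15.996))  # ['Oxidation']
print(candidates(42.0))    # []
```

Shifts that match a known entry suggest which modifications to enable in the follow-up search; unmatched shifts still need expert interpretation.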
By specifying most modifications as rare, it is quite feasible to search for 10 – 20 modification types at once with Byonic. Even larger searches are possible with focused protein databases, for example with therapeutic proteins. Such a focused database easily allows efficient mutation searches with 200+ possible substitutions, or oxidative footprinting searches with 50+ types of oxidations. Glycans and wildcards can easily enlarge the search space by 2 to 3 orders of magnitude, so these options should be used with care, and in conjunction with only the most common variable modifications (such as oxidized methionine or pyro-Glu N-terminus).
The wildcard search has been expanded to amino acid positions with an already present glycan mass modification. N-linked, O-linked, and other glycan motifs are supported.
To enable this, the user must set the following parameters:
• Wildcard search: All peptides
• Minimum and Maximum mass: Set values
• Restrict to residues: Set to "g" to search for wildcard modifications on top of glycan modifications.