Introduction
The discovery, analysis, and monitoring of Host Cell Proteins (HCPs) by mass spectrometry (MS) is a growing trend in the biopharmaceutical industry as instrumentation and sample preparation techniques improve. This Technical Note introduces the use of Byos™ (legacy Byonic™ and Byologic®) to identify and quantify HCPs from MS/MS data. The workflow provides processing parameters and a report template focused on the search for HCPs and other types of protein contaminants.
Configuring the Workflow
Protein database
The user provides a FASTA file containing the proteins of the organism in which the product is expressed. For example, if the sample is expressed in Chinese Hamster Ovary (CHO) cells, one may use a database of proteins from ‘organism:Cricetulus’. We recommend appending expression medium proteins (e.g. bovine) to this database, along with the sequence of the purified protein, the digestion enzyme, and other reagents used during the purification process (e.g. Protein A).
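Assembling such a database amounts to concatenating several FASTA files. The following is a minimal Python sketch of that step, independent of Byos. The function names and the toy sequences are hypothetical, and skipping exact duplicate sequences is an assumption made here for tidiness, not a Byos requirement.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_fasta(sources):
    """Concatenate records from several FASTA texts.

    Assumption: a sequence already seen in an earlier source is skipped.
    """
    seen = set()
    merged = []
    for text in sources:
        for header, seq in read_fasta(text):
            if seq and seq not in seen:
                seen.add(seq)
                merged.append((header, seq))
    return merged


def write_fasta(records, width=60):
    """Serialize (header, sequence) tuples back to FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Toy, made-up entries standing in for the host proteome, the
# purification reagents, and the therapeutic product.
host = ">HOST_1 hypothetical host protein\nMKTAYIAKQR\n"
reagents = ">REAGENT_1 hypothetical reagent\nGSHMLEDPVK\n"
product = ">PRODUCT hypothetical product\nEVQLVESGGG\n"

database = merge_fasta([host, reagents, product])
print(write_fasta(database))
```

In practice the three strings would be read from the downloaded host proteome, a contaminant or reagent file, and the product sequence file, and the merged text would be saved as the FASTA file supplied to the search.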
Replicate Analysis
If the data contain replicates, using the correct syntax for the “Samples Name” (or “MS Alias name”) allows Byos to derive statistics, which can be displayed in tables or bar charts. An example setup is shown below. Here “s(…)” stands for sample, and “r(…)” for replicate.
MS/MS Search (legacy Byonic™)
The first processing node in the Byos HCP workflow is the MS/MS search by Byonic. The goal of the search for HCP analysis is to maximize protein identifications. The default parameters for HCP analysis are as follows:
- Digestion: Tryptic and fully specific. Semi-specific searches identify more peptides but, in our experience, do not increase protein IDs or affect relative protein quantification.
- Fragmentation Type: QTOF / HCD.
- Mass tolerance settings: 15 ppm for precursors and 20 ppm for fragments. The user can lower these values for a high mass accuracy instrument, or relax them for older TOFs and ion traps. The mass tolerances may also be determined empirically by Preview™ from Protein Metrics, Inc.
- Variable Modifications: For the purposes of HCP analysis, the analyst only needs to identify and quantify the most common peptides from each protein. Thus a standard set of modifications is sufficient, depending on the sample preparation. For example:
(For explanations of the ‘common’ and ‘rare’ settings, please refer to the Byonic manual or the “Modification Fine Control” application note.)
- Glycans: The Byos workflow includes the most common N-linked glycans in the search.
- Advanced settings: Although seldom required, the search may need to be further optimized depending on the sample, chromatography, and MS acquisition settings. For example, to analyze DIA data, increase the maximum # of precursors per MS2 to 6-10; for short gradients, increase it to 2. For heavily glycosylated HCPs (e.g. plasma-derived proteins), enable “Show all N-glycopeptides”.
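To give a sense of scale for the ppm tolerances listed above, the short Python sketch below converts a ppm tolerance into an absolute m/z window and checks whether an observed value falls inside it. It is a generic illustration of the ppm definition, not Byonic code; the function names and the example m/z values are hypothetical.

```python
def ppm_to_da(mz, ppm):
    """Absolute half-width (in m/z units) of a ppm tolerance at a given m/z."""
    return mz * ppm / 1e6


def within_tolerance(observed_mz, theoretical_mz, ppm):
    """True if the observed m/z lies within +/- ppm of the theoretical m/z."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, ppm)


# Hypothetical precursor at m/z 1000 searched with a 15 ppm tolerance:
# the allowed deviation is 1000 * 15 / 1e6 = 0.015 m/z units.
print(ppm_to_da(1000.0, 15))
print(within_tolerance(1000.010, 1000.0, 15))  # deviation of about 0.010
print(within_tolerance(1000.020, 1000.0, 15))  # deviation of about 0.020
```

Because the window scales with m/z, the same ppm setting is tighter in absolute terms for low-mass ions than for high-mass ions.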
Peptide and Protein Quantification (legacy Byologic®)
Identified proteins are quantified and compared across samples or fractions by the Quan node (Byologic). Two parameters in the Quan node of the Byos workflow may need to be modified:
- Advanced Proteins: The default setting of [auto decoys=2] means that all proteins that are ranked above the 2nd decoy protein hit will be included in the results. The user can increase this value up to 100 to include low scoring proteins in the results.
- MS extract options -> m/z Window (ppm): This is the width of the integration window around a single isotope (shown as a pink area in the MS1 plot). Typical values are 15-20 ppm for 15k-120k resolution instruments and 25-50 ppm for lower-resolution instruments.
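The [auto decoys=2] cutoff can be pictured as walking down the ranked protein list and stopping at the second decoy hit. The Python sketch below only illustrates that rule and is not Byos code. The function name, the (protein_id, is_decoy) tuple layout, and the choice to keep any earlier decoy in the returned list are assumptions.

```python
def proteins_above_nth_decoy(ranked, n_decoys=2):
    """Return the entries ranked above the n-th decoy hit.

    ranked: list of (protein_id, is_decoy) tuples, sorted best-first.
    Assumption: decoys ranked before the n-th one are kept in the output,
    so the caller can decide whether to display or drop them.
    """
    if n_decoys < 1:
        raise ValueError("n_decoys must be at least 1")
    decoys_seen = 0
    kept = []
    for protein_id, is_decoy in ranked:
        if is_decoy:
            decoys_seen += 1
            if decoys_seen == n_decoys:
                break  # everything from here down is excluded
        kept.append((protein_id, is_decoy))
    return kept


# Hypothetical ranked list: P = target protein, D = decoy.
ranked = [
    ("P1", False), ("P2", False), ("D1", True),
    ("P3", False), ("D2", True), ("P4", False),
]
print(proteins_above_nth_decoy(ranked, n_decoys=2))
```

Raising the threshold moves the cutoff further down the list, which is why a larger value admits lower-scoring proteins.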
Inspecting Results and Creating Reports
When the analysis is completed, Byos opens an inspection view and the corresponding report as tabs within the Byos window. The inspection view includes MS/MS, MS1, and XIC plots, as well as a comprehensive list of identified peptides, proteins, raw files, and other project information. The meaningful output for HCP analysis, including most filters, is in the report view.
The HCP report template is configured to sum the XICs of the top 3 most intense peptides per protein. This behavior may be modified by applying filters to the “Peptide Ranking” column. For example, if the user chooses 1, 2, and 3, the top 3 peptides will be included/displayed in the pivot table.
Proteins with fewer than 3 identified peptides may give inaccurate abundance values. These proteins are removed from the report by default via the filter applied to the “Number of peptides” column. The filter can be removed by checking values 1 and 2.
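The report logic described above (sum the XICs of the 3 most intense peptides per protein, and drop proteins with too few peptides) can be sketched in Python as follows. This is an illustration of the calculation, not the Byos implementation. The function name, the input tuple layout, and the example areas are hypothetical, and each peptide is assumed to contribute a single XIC area per protein.

```python
from collections import defaultdict


def top_n_abundance(peptides, top_n=3, min_peptides=3):
    """Sum the XIC areas of the top_n most intense peptides per protein.

    peptides: iterable of (protein, peptide, xic_area) tuples.
    Proteins with fewer than min_peptides peptides are left out,
    mirroring the default "Number of peptides" filter.
    """
    areas_by_protein = defaultdict(list)
    for protein, _peptide, area in peptides:
        areas_by_protein[protein].append(area)

    abundance = {}
    for protein, areas in areas_by_protein.items():
        if len(areas) < min_peptides:
            continue
        abundance[protein] = sum(sorted(areas, reverse=True)[:top_n])
    return abundance


# Made-up example: protein A has four peptides, protein B only two.
rows = [
    ("A", "pep1", 10.0), ("A", "pep2", 40.0),
    ("A", "pep3", 30.0), ("A", "pep4", 20.0),
    ("B", "pep5", 5.0), ("B", "pep6", 7.0),
]
print(top_n_abundance(rows))                  # B is filtered out
print(top_n_abundance(rows, min_peptides=1))  # filter relaxed, B is kept
```

Relaxing `min_peptides` corresponds to checking values 1 and 2 in the report filter. Proteins admitted this way are summed over fewer than three peptides, so their values are not directly comparable with the others.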
Example HCP data in tabular form:
The same data represented as an error bar chart, excluding the therapeutic product:
If one would like to include spectrum and XIC images in the report, click “Tabs -> Add Plots”. This step can take a few minutes depending on the number of peptides included in the analysis.
Hide configuration fields before exporting.
Click “File->Export->Export to PDF…” to generate a single .pdf file including all of the tabs.
Using a less purified sample as a basis for an HCP project
Very low concentration peptides frequently elude identification, either because the mass spectrometer did not trigger MS/MS during their elution or because the acquired MS/MS spectrum was of insufficient quality. Consequently, one strategy for HCP analysis is to include in the Byos project one or more less purified samples with higher amounts of HCPs. Peptides from these more abundant proteins will generally provide robust identifications. For the most highly purified sample, it is still worthwhile to perform a Byonic analysis. Importantly, XICs can still be produced where identifications are missing by using the Byologic function: “Edit Menu -> In-silico Peptides -> Add Missing Via Existing Peptides”. When inspecting, confirm that the elution characteristics in the low concentration sample are consistent with those in the sample with higher HCP concentration.