Summary
This article outlines the capabilities of Byos' Multi-Protein Quantitation (MPQ) workflow. Key points include:
- Top-N Method Application: The Top-N quantification method, implemented in the MPQ module of Byos, enables efficient protein quantification in proteomic studies, with particular emphasis on reducing and aggregating data for robust analysis.
- Advanced Analysis with GraphPad Prism: Using 2D hierarchical clustering in GraphPad Prism, researchers can visualize complex proteomic data, enabling insights into both sample-level differences and protein interdependencies.
- Case Study Insights: A diet-induced obesity model in rats illustrated the combined utility of Byos for protein identification and GraphPad Prism for advanced data visualization, revealing significant diet-related proteome changes.
Introduction
The Top-N method, known as the Hi3 quantification approach when N = 3, is a widely used strategy in label-free quantitative proteomics for determining protein abundance. In its Hi3 form, the technique uses the intensities of the three most intense tryptic peptides per protein as a robust and accurate proxy for protein concentration.
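The idea can be illustrated with a minimal sketch. It assumes that the N highest peptide intensities of a protein are averaged and that proteins with fewer than N peptides are not quantified; both choices, as well as the function name and the intensity values, are illustrative assumptions and not a description of how MPQ computes its results.

```python
from statistics import mean


def top_n_abundance(peptide_intensities, n=3):
    """Average the n most intense peptide signals of one protein.

    Returns None when fewer than n peptides are available (an assumption
    made for this sketch, not a documented MPQ rule).
    """
    if len(peptide_intensities) < n:
        return None
    top = sorted(peptide_intensities, reverse=True)[:n]
    return mean(top)


# Hypothetical peptide intensities (arbitrary units) for two proteins.
proteins = {
    "protein_A": [4.0e6, 9.0e6, 1.0e6, 5.0e6],
    "protein_B": [2.0e5, 3.0e5],
}

abundances = {name: top_n_abundance(values) for name, values in proteins.items()}
print(abundances)
```

For protein_A the three highest values (9.0e6, 5.0e6, 4.0e6) are averaged, while protein_B is skipped because it has only two peptides.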
Initially developed by Silva et al. (2006) [1] for absolute protein quantification, the Top-N method has been extended to various applications, including relative quantification in comparative proteomics and biomarker discovery. Its simplicity and compatibility with standard LC-MS workflows make it a preferred choice in both academic research and industrial settings. Notably, Top-N is extensively utilized for analyzing host cell proteins in therapeutic protein development.
In 2024, Protein Metrics introduced a tailored software solution for the Top-N method, named MPQ (Multi-Protein Quantitation). This workflow streamlines the application of the Top-N method, enhancing its utility and efficiency. Here, we demonstrate the capability of this workflow using samples from a study investigating heart proteome changes in a diet-induced obesity model. The visualization of the results is performed using GraphPad Prism.
MPQ Module Workflows
The MPQ module consists of three workflows:
Multi-Protein Preview - Rapid first-pass analysis for checking MS data quality and evaluating sample preparation.
This workflow suggests search parameters for the database search, such as error tolerances for MS1 and MS2. After defining the samples and the protein FASTA file, the processing parameters must be specified. For a Top-N experiment with trypsin, most default settings are suitable, although the fragmentation type may need adjustment. For each sample, a report is generated providing useful insights into instrument precision and the degree of cysteine alkylation. The results are produced quickly, significantly reducing the time required to optimize search parameters. For more information, refer to the FAQ about Preview.
Multi-Protein Identification - High-level protein and peptide identification for focused verification.
This workflow is designed to create a focused protein FASTA database. More details about the creation of Focused Protein Databases can be found here. Previously, only one focused database could be generated per file. However, by enabling the "Merge results into a single project" option, it is now possible to create a focused database containing all protein sequences from the search results across all samples.
Multi-Protein Quantitation - Comprehensive label-free, relative quantification of complex samples.
The primary workflow of the MPQ module, Multi-Protein Quantitation, is used to generate results. This workflow will be described in detail in the subsequent sections.
Project Creation
To begin, add the samples into the samples table by either dragging and dropping them into the table or by clicking the "Add sample(s)..." button. Within the samples table, the utilized enzyme can be selected for each sample via a drop-down menu. This setting is important for calculating the number of missed cleavages and can also be modified after project creation. Additionally, a condition and replicate number can be assigned to each sample. These settings are critical for grouping the samples for reporting and filtering, which will be discussed during the processing step (see Figure 1).
Figure 1. Adding samples during project creation. A condition and a replicate number are assigned to each sample, along with the enzyme used to prepare it.
Next, define the protein FASTA file location. Utilizing a focused database can significantly decrease search times. A focused database can be created using the Multi-Protein Identification workflow (see Figure 2).
Figure 2. Adding the protein FASTA file.
Finally, configure the processing nodes (see Figure 3). MPQ Quantitation workflows consist of two steps:
- Database Search: The first step is a Byonic database search of the MS/MS spectra to identify peptides and infer which proteins are present.
- Quantification: The second step is the quantification process. Experienced Byos users will notice similarities to the Byologic node. The primary difference, highlighted in Figure 3 and discussed thoroughly below, lies in the objectives: while protein characterization workflows aim to extract maximum data to characterize a single protein, proteomics workflows focus on reducing and aggregating data to identify and quantify all proteins in the sample effectively. Dynamic filtering is introduced in the MPQ workflow to achieve this goal.
Figure 3. Defining each processing step. The parameters specific to the MPQ workflow are “PSM filtering”, “Match between runs”, and “Reported peptide filtering”.
PSM Filtering
PSM (Peptide-Spectrum Match) filtering applies criteria that reduce the number of peptides based on specific attributes. The filters are applied in the order in which they are listed in the configuration window. These options must be defined prior to project creation and cannot be changed afterward.
Attribute | Description |
Minimum peptide length | Default value 5 |
Maximum peptide length | Default value 32 |
Maximum missed cleavage count | Maximum number of missed cleavage sites a peptide can contain (Default value: 2) |
Minimum peptide score | Default value 80 |
Minimum peptide replicate count | Minimum number of replicates within a condition that identified the same peptide. For example, a value of 2 means each peptide must be identified in at least two replicates. Replicates are defined in the sample table in the samples section |
Keep homologous sequences | Disabling this feature will filter out all peptide sequences that are not unique to a protein |
Minimum peptide matching a protein count | Minimum number of peptides that must be assigned to a protein for the protein to be retained |
Include modifications | Options: None, all fixed modifications, all variable modifications, or all modifications. |
Table 1. PSM Filtering
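Because the filters are applied sequentially, the effect of each one can be tracked step by step. The following sketch illustrates this with four of the attributes from Table 1 and their default values. The `Psm` class, the field names, the example peptides, and the inclusive comparisons are assumptions made for illustration; they do not reflect the internal Byos implementation.

```python
from dataclasses import dataclass


@dataclass
class Psm:
    sequence: str
    score: float
    missed_cleavages: int


# Each filter is a (label, predicate) pair. The list order mirrors the
# order of the configuration window, since filters run one after another.
FILTERS = [
    ("minimum peptide length", lambda p: len(p.sequence) >= 5),
    ("maximum peptide length", lambda p: len(p.sequence) <= 32),
    ("maximum missed cleavage count", lambda p: p.missed_cleavages <= 2),
    ("minimum peptide score", lambda p: p.score >= 80),
]


def apply_filters(psms, filters=FILTERS):
    """Apply the filters in order and record how many PSMs survive each."""
    survivors = list(psms)
    log = []
    for label, keep in filters:
        survivors = [p for p in survivors if keep(p)]
        log.append((label, len(survivors)))
    return survivors, log


# Hypothetical search results.
psms = [
    Psm("LVNELTEFAK", 310.0, 0),
    Psm("AGK", 150.0, 0),             # shorter than 5 residues
    Psm("KVPQVSTPTLVEVSR", 95.0, 1),
    Psm("DLGEEHFK", 42.0, 0),         # score below 80
    Psm("KRKQTALVELLK", 120.0, 3),    # more than 2 missed cleavages
]

kept, log = apply_filters(psms)
for label, remaining in log:
    print(f"{label}: {remaining} PSMs remaining")
```

Keeping the filters in an ordered list makes the per-step bookkeeping explicit, which is useful when deciding which threshold removes the most data.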
Match Between Runs
This option enables the creation of XICs (Extracted Ion Chromatograms) for peptides that have not been detected by the search engine in all samples but were identified in at least one. This functionality is similar to the "Add missing via existing" feature, with the key difference being that the XICs are created during project creation, eliminating the need for an additional manual step.
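The bookkeeping behind this option can be pictured as follows: for every sample, determine which peptides were identified elsewhere in the project but not in that sample. The sketch below covers only this set logic. The sample names and peptide sequences are hypothetical, and the way Byos actually locates and extracts the corresponding XICs is not described in this article.

```python
def missing_identifications(identified):
    """Map each sample to the peptides identified elsewhere but not in it."""
    all_peptides = set().union(*identified.values())
    return {
        sample: sorted(all_peptides - peptides)
        for sample, peptides in identified.items()
    }


# Hypothetical search results: peptides identified per sample.
identified = {
    "control_1": {"LVNELTEFAK", "YLYEIAR"},
    "control_2": {"LVNELTEFAK"},
    "western_1": {"YLYEIAR", "AEFVEVTK"},
}

to_transfer = missing_identifications(identified)
print(to_transfer)
```

Each listed peptide is a candidate for an XIC that the search engine alone would not have produced for that sample.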
Reported peptide filtering
The filters defined in this section can also be changed after the project is created. Table 2 gives a detailed description of each option. The function of both filters is visualized in Figure 4.
Attribute | Description |
Retain only primary MS2 peptide ID | In cases where a peptide has multiple MS2 spectra, only the one with the highest score will be retained and displayed. Enabling this option reduces the number of spectra loaded in the inspection view, which makes it respond faster. |
Use only top charge state for quantitation | Only the charge state with the highest intensity will remain. Disabling this option shows all detected charge states. |
Use the same top N peptides for quantification | Enables the filter to select only the N most intense peptides. |
Top N count | Defines N. For example, setting a value of 3 with the above option enabled will select the top 3 peptides based on intensity. |
Apply RSD limit for replicates | Filters out all peptides with intensity values across replicates having a higher relative standard deviation (RSD) than the threshold. |
Minimum acceptable %RSD for replicates | Defines the %RSD threshold used by the option above; peptides whose replicate intensities exceed this value are filtered out. |
Table 2. Reported peptide filtering
Figure 4. Overview of the dynamic filtering. PSM filtering acts on the search engine results. Reported peptide filtering, on the other hand, filters peptides based on certain attributes after the project has been created and can be reversed. In this example, the applied Top N count is 3.
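Two of the options in Table 2 lend themselves to a short worked example: selecting the top charge state and applying an RSD limit across replicates. The sketch below assumes that the RSD is calculated from the sample standard deviation and that a peptide is kept when its %RSD does not exceed the limit. The function names, the 20% limit, and all intensity values are hypothetical.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * stdev(values) / mean(values)


def passes_rsd_limit(replicate_intensities, limit_percent):
    """Keep a peptide only if its replicate %RSD does not exceed the limit."""
    return percent_rsd(replicate_intensities) <= limit_percent


def top_charge_state(intensity_by_charge):
    """Return the charge state with the highest intensity."""
    return max(intensity_by_charge, key=intensity_by_charge.get)


# Hypothetical replicate intensities for two peptides within one condition.
peptides = {
    "LVNELTEFAK": [100.0, 110.0, 90.0],   # consistent across replicates
    "YLYEIAR": [100.0, 200.0, 300.0],     # highly variable
}
RSD_LIMIT = 20.0  # hypothetical threshold in percent

retained = [
    name for name, reps in peptides.items() if passes_rsd_limit(reps, RSD_LIMIT)
]
best_charge = top_charge_state({2: 8.0e5, 3: 1.2e6, 4: 1.0e5})
print(retained, best_charge)
```

With these values the first peptide has a %RSD of 10 and is kept, whereas the second has a %RSD of 50 and is removed.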
Data inspection
After creating a project, the results can be reviewed in the Inspection View, which is detailed in Figure 5. This view, while largely similar to the one used for PTM projects, features two key differences tailored to the MPQ workflow.
- Search Filter Customization: The search filter, accessible via a dedicated button, brings up a window containing the "Reported Filtering Options" described earlier. For instance, adjusting the "Top N count" setting dynamically changes the number of peptides displayed per protein, offering flexibility in how data is visualized.
- Protein Table Integration: A protein table unique to MPQ projects provides enhanced filtering functionality. By selecting the checkbox next to a specific protein in the table, the view narrows to display only the peptides associated with that protein. Resetting this filter via the table's reset button reverts the view to show all peptides.
In addition to these differences, the Inspection View retains powerful features like peptide validation, simultaneous XIC boundary adjustments, and theoretical isotope envelope visualizations. These tools enable efficient and detailed data interrogation, facilitating in-depth proteomic analysis.
Figure 5. Overview of the MPQ inspection view.
Report Creation
All results can be summarized in a custom report. The reporting function of Byos, which is also available in the MPQ module, generates data tables and plots. Custom solutions are possible as well, such as a linear calibration curve for calculating quantities from a spiked standard. The report template for HCP analysis contains the following pivot table:
Figure 6. Pivot table in the reporting section of the MPQ result file.
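The linear fitting curve mentioned above can be sketched outside of Byos as an ordinary least-squares fit. The example assumes that a standard was spiked at several known amounts and that the Top-N intensity responds linearly to the amount. The calibration points, units, and function names are hypothetical and serve only to show the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_intensity(intensity, slope, intercept):
    """Invert the calibration line to estimate an amount."""
    return (intensity - intercept) / slope


# Hypothetical calibration: spiked amount (ng) versus Top-N intensity.
spiked_ng = [1.0, 2.0, 5.0, 10.0]
intensity = [2.1e5, 4.1e5, 1.01e6, 2.01e6]

slope, intercept = fit_line(spiked_ng, intensity)
estimated_ng = amount_from_intensity(6.1e5, slope, intercept)
print(f"slope={slope:.3g}, intercept={intercept:.3g}, estimate={estimated_ng:.2f} ng")
```

The estimate is only meaningful within the calibrated range, here 1 to 10 ng.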
2D Hierarchical Clustering with GraphPad Prism
The MPQ workflow can also analyze samples of higher complexity than typical HCP samples, such as the heart proteome samples from a study by Vileigas et al. (2019) [2]. In this study, two populations of rats were fed either a control diet or a western diet for 41 weeks. A total of 18 samples were processed with the MPQ workflow, identifying 980 proteins. Only proteins showing a significant fold change between the two groups were exported for further analysis using GraphPad Prism's hierarchical clustering functionality.
The 2D clustering analysis enabled both sample and protein-level grouping based on intensity profiles across the samples (see Figure 7). The plot revealed that the western diet induced significant changes in the heart proteome of the rats. At the sample level, the clustering algorithm successfully separated the two diet groups. At the protein level, two main clusters were observed, corresponding to up-regulated and down-regulated proteins. This grouping highlights potential dependencies between proteins, offering opportunities for deeper investigation into the biological pathways affected by diet.
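The principle of clustering along both axes can be shown with a small, self-contained sketch. It is not the algorithm used by GraphPad Prism, whose distance metric and linkage settings are not stated here. The sketch assumes Euclidean distance, average linkage, and mean-centering of each protein profile; the protein names and log2 intensities are invented for illustration.

```python
from itertools import combinations
from math import dist


def center(values):
    """Subtract the mean so clustering follows the pattern, not the level."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all cross-cluster pairs."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], profiles),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters


# Hypothetical log2 intensities; columns follow the order of sample_names.
sample_names = ["control_1", "control_2", "western_1", "western_2"]
proteins_log2 = {
    "up_1": [20.0, 20.2, 22.0, 22.1],
    "up_2": [15.0, 15.1, 17.0, 16.9],
    "down_1": [18.0, 18.1, 16.0, 15.9],
    "down_2": [19.0, 19.2, 17.1, 17.0],
}

centered = {name: center(values) for name, values in proteins_log2.items()}
by_sample = {
    sample: [centered[protein][k] for protein in centered]
    for k, sample in enumerate(sample_names)
}

protein_clusters = agglomerate(centered, 2)  # rows of the heat map
sample_clusters = agglomerate(by_sample, 2)  # columns of the heat map
print(protein_clusters)
print(sample_clusters)
```

In this toy data set the samples separate by diet and the proteins separate by direction of change, which mirrors the two observations described above.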
Figure 7. 2D hierarchical clustering of the protein intensities with GraphPad Prism.
Conclusion
The study of heart proteome changes in a diet-induced obesity model demonstrates the power of integrating the MPQ workflow with GraphPad Prism to identify and interpret significant proteomic shifts. Using MPQ, proteomic data was efficiently processed and quantified, while GraphPad Prism enabled advanced visualization through 2D hierarchical clustering.
Key findings include:
- Clear identification of diet-induced proteomic changes, with clustering separating the two diet groups.
- Discovery of co-regulated protein clusters, highlighting potential pathway-level dependencies for further investigation.
Byos provides robust, streamlined workflows for efficient data processing, and GraphPad Prism complements these by offering comprehensive visualization and clustering capabilities. Together, they form a powerful toolkit for advanced proteomic studies in both academic and industrial settings.
References
[1] Silva, J. C.; Gorenstein, M. V.; Li, G.-Z.; Vissers, J. P. C.; Geromanos, S. J. Absolute Quantification of Proteins by LCMSE: A Virtue of Parallel MS Acquisition. Molecular & Cellular Proteomics 2006, 5 (1), 144–156. https://doi.org/10.1074/mcp.M500230-MCP200.
[2] Vileigas, D. F.; Harman, V. M.; Freire, P. P.; Marciano, C. L. C.; Sant’Ana, P. G.; de Souza, S. L. B.; Mota, G. A. F.; da Silva, V. L.; Campos, D. H. S.; Padovani, C. R.; Okoshi, K.; Beynon, R. J.; Santos, L. D.; Cicogna, A. C. Landscape of Heart Proteome Changes in a Diet-Induced Obesity Model. Sci Rep 2019, 9 (1), 18050. https://doi.org/10.1038/s41598-019-54522-2.