Date Published: January 4, 2011
Publisher: Hindawi Publishing Corporation
Author(s): Zheng Yuan, Jinhong Shi, Wenjun Lin, Bolin Chen, Fang-Xiang Wu.
For high-resolution tandem mass spectra, the determination of monoisotopic masses of fragment ions plays a key role in the subsequent peptide and protein identification. In this paper, we present a new algorithm for deisotoping bottom-up spectra. Isotopic-cluster graphs are constructed to describe the relationships between all possible isotopic clusters. Based on the relationships in the isotopic-cluster graphs, each possible isotopic cluster is assessed with a score function, which is built by combining non-intensity and intensity features of fragment ions. The non-intensity features are used to prevent fragment ions with low intensity from being removed. Dynamic programming is adopted to find the highest-scoring path, which contains the most reliable isotopic clusters. The experimental results show that the average Mascot scores and F-scores of peptides identified from spectra processed by our deisotoping method are greater than those obtained with the YADA and MS-Deconv software.
Tandem mass spectrometry has become an important tool in protein and peptide analysis, for tasks such as acquiring structural information, identification, and qualitative analysis. Because the fundamental data used for peptide identification in tandem mass spectra (MS/MS) are the m/z values and charge states of fragment ions, how well these ions are detected directly influences the subsequent analysis of mass spectra, including peptide identification and quantification. However, detecting fragment ions faces two difficulties. First, in some cases many real fragment ions have such low intensity that they can be removed by accident as noise peaks, and numerous noisy peaks in tandem mass spectra can cause either false negative or false positive fragment ions. Second, because heavy isotopes exist in nature, more than one isotopic peak is resolved for each fragment ion in high-resolution tandem mass spectra. Although isotopic peaks provide useful information, such as compound composition and charge states, peptide identification becomes computationally expensive if they are not removed. Isotopic peaks can also overlap, which can lead to wrong interpretation of the masses of fragment ions. Thus, to increase the accuracy of peptide identification and reduce the complexity of MS/MS analysis, many deisotoping algorithms [4–19] have been explored to detect the isotopic clusters of fragment ions.
Our deisotoping method is composed of four parts: searching for all possible isotopic clusters, constructing isotopic-cluster graphs, scoring all possible isotopic clusters, and searching paths. The first part finds all possible isotopic clusters. The second part describes the relationships between possible isotopic clusters. The third part assesses each possible isotopic cluster based on the assumed relationships. The fourth part determines the most likely arrangement of isotopic clusters.
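To make the first part concrete, the following is a minimal sketch of how candidate isotopic clusters might be enumerated from a sorted peak list. It is an illustration under stated assumptions, not the procedure used in this paper: the function name find_candidate_clusters, the parameters max_charge, tol and min_peaks, and the greedy peak-chaining rule are all hypothetical. The only physical fact it relies on is that consecutive isotopic peaks of an ion with charge z are separated by roughly 1.00335/z on the m/z axis.

```python
from typing import List, Tuple

# Approximate mass difference between 13C and 12C, in Da.
ISOTOPE_SPACING = 1.00335


def find_candidate_clusters(
    mzs: List[float],
    max_charge: int = 3,   # hypothetical default, not taken from the paper
    tol: float = 0.01,     # hypothetical m/z tolerance
    min_peaks: int = 2,    # hypothetical minimum cluster size
) -> List[Tuple[int, List[int]]]:
    """Enumerate candidate isotopic clusters in a peak list sorted by m/z.

    Each candidate is (charge, peak indices). Every peak is tried as a
    possible monoisotopic peak for every charge state, so candidates may
    share peaks; resolving such conflicts is left to later steps.
    """
    clusters = []
    for start in range(len(mzs)):
        for z in range(1, max_charge + 1):
            step = ISOTOPE_SPACING / z
            members = [start]
            expected = mzs[start] + step
            j = start + 1
            # Walk forward while peaks could still match the expected position.
            while j < len(mzs) and mzs[j] <= expected + tol:
                if abs(mzs[j] - expected) <= tol:
                    members.append(j)
                    expected = mzs[j] + step
                j += 1
            if len(members) >= min_peaks:
                clusters.append((z, members))
    return clusters


if __name__ == "__main__":
    # Toy peak list (made-up values): one doubly charged and one singly charged ion.
    peaks = [500.0, 500.5017, 501.0034, 600.0, 601.0033]
    for charge, idx in find_candidate_clusters(peaks):
        print(charge, [peaks[i] for i in idx])
```

On the toy peak list, peaks 500.0 and 501.0034 are proposed both as a singly charged cluster and as part of a doubly charged one. Ambiguities of this kind are what the graph construction, scoring, and path search in the remaining three parts are meant to resolve.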
This paper has presented a deisotoping algorithm for bottom-up spectra that increases the accuracy of monoisotopic mass determination for fragment ions. The algorithm takes overlapping cases into account by first constructing isotopic-cluster graphs, which describe the relationships between possible isotopic clusters. Based on the assumed relationships in the graphs, all possible isotopic clusters are evaluated by a score function that combines non-intensity and intensity features of fragment ions. This approach can help retain fragment ions with very low intensity in spectra. The experimental results on two data sets indicate that our method performs better in deisotoping than the YADA and MS-Deconv software in three respects: (1) the number of interpreted proteins and peptides from the data processed by our deisotoping method is larger than that from the raw data and from the data processed by YADA and MS-Deconv, (2) the peptide and protein identifications from the data processed by our method are more reliable than those from the other two software tools, and (3) the F-scores of our method are greater than those of the other two software tools. In the future, we will test our method on more mass spectral datasets.