{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T22:28:37Z","timestamp":1761863317155},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2014,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Tandem mass spectrometry-based database searching is currently the main method for protein identification in shotgun proteomics. The explosive growth of protein and peptide databases, which is a result of genome translations, enzymatic digestions, and post-translational modifications (PTMs), is making computational efficiency in database searching a serious challenge. Profile analysis shows that most search engines spend 50%-90% of their total time on the scoring module, and that the spectrum dot product (SDP) based scoring module is the most widely used. As a general purpose and high performance parallel hardware, graphics processing units (GPUs) are promising platforms for speeding up database searches in the protein identification process.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We designed and implemented a parallel SDP-based scoring module on GPUs that exploits the efficient use of GPU registers, constant memory and shared memory. Compared with the CPU-based version, we achieved a 30 to 60 times speedup using a single GPU. 
We also implemented our algorithm on a GPU cluster and achieved an approximately favorable speedup.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Our GPU-based SDP algorithm can significantly improve the speed of the scoring module in mass spectrometry-based protein identification. The algorithm can be easily implemented in many database search engines such as X!Tandem, SEQUEST, and pFind. A software tool implementing this algorithm is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/www.comp.hkbu.edu.hk\/~youli\/ProteinByGPU.html\" ext-link-type=\"uri\">http:\/\/www.comp.hkbu.edu.hk\/~youli\/ProteinByGPU.html<\/jats:ext-link>\n            <\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-15-121","type":"journal-article","created":{"date-parts":[[2014,4,28]],"date-time":"2014-04-28T23:02:10Z","timestamp":1398726130000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Accelerating the scoring module of mass spectrometry-based peptide identification using GPUs"],"prefix":"10.1186","volume":"15","author":[{"given":"You","family":"Li","sequence":"first","affiliation":[]},{"given":"Hao","family":"Chi","sequence":"additional","affiliation":[]},{"given":"Leihao","family":"Xia","sequence":"additional","affiliation":[]},{"given":"Xiaowen","family":"Chu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2014,4,28]]},"reference":[{"key":"6832_CR1","first-page":"1315","volume-title":"The Fifth International Symposium on Advances of High Performance Computing and Networking","author":"Y Li","year":"2012","unstructured":"Li Y, Chu X: Speeding up Scoring Module of Mass Spectrometry Based Protein Identification by GPU. The Fifth International Symposium on Advances of High Performance Computing and Networking. 
2012, High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference, 1315-1320."},{"issue":"3","key":"6832_CR2","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1038\/nbt0303-255","volume":"21","author":"M Mann","year":"2003","unstructured":"Mann M, Jensen ON: Proteomic analysis of post-translational modifications. Nat Biotechnol. 2003, 21 (3): 255-261. 10.1038\/nbt0303-255.","journal-title":"Nat Biotechnol"},{"issue":"4320","key":"6832_CR3","doi-asserted-by":"publisher","first-page":"890","DOI":"10.1126\/science.337487","volume":"198","author":"R Uy","year":"1977","unstructured":"Uy R, Wold F: Posttranslational covalent modification of proteins. Science. 1977, 198 (4320): 890-896. 10.1126\/science.337487.","journal-title":"Science"},{"key":"6832_CR4","volume-title":"Posttranslational Modification Of Proteins: Expanding Nature\u2019s Inventory","author":"C Walsh","year":"2006","unstructured":"Walsh C: Posttranslational Modification Of Proteins: Expanding Nature\u2019s Inventory. 2006, Roberts and Company Publishers, http:\/\/www.amazon.com\/Posttranslational-Modification-Proteins-Expanding-Inventory\/dp\/0974707732,"},{"issue":"3","key":"6832_CR5","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1006\/jmbi.1999.2794","volume":"289","author":"MR Wilkins","year":"1999","unstructured":"Wilkins MR, Gasteiger E, Gooley AA, Herbert BR, Molloy MP, Binz PA, Ou K, Sanchez JC, Bairoch A, Williams KL, Hochstrasser DF: High-throughput mass spectrometric discovery of protein post-translational modifications. J Mol Biol. 1999, 289 (3): 645-657. 
10.1006\/jmbi.1999.2794.","journal-title":"J Mol Biol"},{"issue":"10","key":"6832_CR6","doi-asserted-by":"publisher","first-page":"798","DOI":"10.1038\/nmeth1100","volume":"4","author":"ES Witze","year":"2007","unstructured":"Witze ES, Old WM, Resing KA, Ahn NG: Mapping protein post-translational modifications with mass spectrometry. Nat Methods. 2007, 4 (10): 798-806. 10.1038\/nmeth1100.","journal-title":"Nat Methods"},{"issue":"18","key":"6832_CR7","doi-asserted-by":"publisher","first-page":"3551","DOI":"10.1002\/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2","volume":"20","author":"DN Perkins","year":"1999","unstructured":"Perkins DN, Pappin DJ, Creasy DM, Cottrell JS: Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis. 1999, 20 (18): 3551-3567. 10.1002\/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2.","journal-title":"Electrophoresis"},{"issue":"11","key":"6832_CR8","doi-asserted-by":"publisher","first-page":"976","DOI":"10.1016\/1044-0305(94)80016-2","volume":"5","author":"JK Eng","year":"1994","unstructured":"Eng JK, McCormack AL, Yates Iii JR: An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994, 5 (11): 976-989. 10.1016\/1044-0305(94)80016-2.","journal-title":"J Am Soc Mass Spectrom"},{"issue":"12","key":"6832_CR9","doi-asserted-by":"publisher","first-page":"1948","DOI":"10.1093\/bioinformatics\/bth186","volume":"20","author":"Y Fu","year":"2004","unstructured":"Fu Y, Yang Q, Sun R, Li D, Zeng R, Ling CX, Gao W: Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry. Bioinformatics. 2004, 20 (12): 1948-1954. 
10.1093\/bioinformatics\/bth186.","journal-title":"Bioinformatics"},{"issue":"13","key":"6832_CR10","doi-asserted-by":"publisher","first-page":"3049","DOI":"10.1093\/bioinformatics\/bti439","volume":"21","author":"D Li","year":"2005","unstructured":"Li D, Fu Y, Sun R, Ling CX, Wei Y, Zhou H, Zeng R, Yang Q, He S, Gao W: pFind: a novel database-searching software system for automated peptide and protein identification via tandem mass spectrometry. Bioinformatics. 2005, 21 (13): 3049-3050. 10.1093\/bioinformatics\/bti439.","journal-title":"Bioinformatics"},{"issue":"18","key":"6832_CR11","doi-asserted-by":"publisher","first-page":"2985","DOI":"10.1002\/rcm.3173","volume":"21","author":"LH Wang","year":"2007","unstructured":"Wang LH, Li DQ, Fu Y, Wang HP, Zhang JF, Yuan ZF, Sun RX, Zeng R, He SM, Gao W: pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry. Rapid Commun Mass Spectrom. 2007, 21 (18): 2985-2991. 10.1002\/rcm.3173.","journal-title":"Rapid Commun Mass Spectrom"},{"issue":"9","key":"6832_CR12","doi-asserted-by":"publisher","first-page":"1466","DOI":"10.1093\/bioinformatics\/bth092","volume":"20","author":"R Craig","year":"2004","unstructured":"Craig R, Beavis RC: TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004, 20 (9): 1466-1467. 10.1093\/bioinformatics\/bth092.","journal-title":"Bioinformatics"},{"issue":"5","key":"6832_CR13","doi-asserted-by":"publisher","first-page":"958","DOI":"10.1021\/pr0499491","volume":"3","author":"LY Geer","year":"2004","unstructured":"Geer LY, Markey SP, Kowalak JA, Wagner L, Xu M, Maynard DM, Yang X, Shi W, Bryant SH: Open mass spectrometry search algorithm. J Proteome Res. 2004, 3 (5): 958-964. 
10.1021\/pr0499491.","journal-title":"J Proteome Res"},{"issue":"8","key":"6832_CR14","doi-asserted-by":"publisher","first-page":"1454","DOI":"10.1002\/pmic.200300485","volume":"3","author":"J Colinge","year":"2003","unstructured":"Colinge J, Masselot A, Giron M, Dessingy T, Magnin J: OLAV: towards high-throughput tandem mass spectrometry data identification. Proteomics. 2003, 3 (8): 1454-1463. 10.1002\/pmic.200300485.","journal-title":"Proteomics"},{"issue":"6","key":"6832_CR15","doi-asserted-by":"publisher","first-page":"807","DOI":"10.1002\/rcm.4448","volume":"24","author":"Y Li","year":"2010","unstructured":"Li Y, Chi H, Wang LH, Wang HP, Fu Y, Yuan ZF, Li SJ, Liu YS, Sun RX, Zeng R, He SM: Speeding up tandem mass spectrometry based database searching by peptide and spectrum indexing. Rapid Commun Mass Spectrom. 2010, 24 (6): 807-814. 10.1002\/rcm.4448.","journal-title":"Rapid Commun Mass Spectrom"},{"key":"6832_CR16","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1007\/3-540-45784-4_6","volume-title":"Proceedings of the Second International Workshop on Algorithms in Bioinformatics","author":"N Edwards","year":"2002","unstructured":"Edwards N, Lippert R: Generating Peptide Candidates from Amino-Acid Sequence Databases for Protein Identification via Mass Spectrometry. Proceedings of the Second International Workshop on Algorithms in Bioinformatics. 2002, Rome, Italy: Springer-Verlag, 673261: 68-81. http:\/\/link.springer.com\/chapter\/10.1007%2F3-540-45784-4_6,"},{"issue":"13","key":"6832_CR17","doi-asserted-by":"publisher","first-page":"3931","DOI":"10.1021\/ac0481046","volume":"77","author":"WH Tang","year":"2005","unstructured":"Tang WH, Halpern BR, Shilov IV, Seymour SL, Keating SP, Loboda A, Patel AA, Schaeffer DA, Nuwaysir LM: Discovering known and unanticipated protein modifications using MS\/MS database searching. Anal Chem. 2005, 77 (13): 3931-3946. 
10.1021\/ac0481046.","journal-title":"Anal Chem"},{"issue":"5","key":"6832_CR18","doi-asserted-by":"publisher","first-page":"612","DOI":"10.1093\/bioinformatics\/btl645","volume":"23","author":"D Dutta","year":"2007","unstructured":"Dutta D, Chen T: Speeding up tandem mass spectrometry database search: metric embeddings and fast near neighbor search. Bioinformatics. 2007, 23 (5): 612-618. 10.1093\/bioinformatics\/btl645.","journal-title":"Bioinformatics"},{"issue":"6","key":"6832_CR19","doi-asserted-by":"publisher","first-page":"1307","DOI":"10.1021\/ac026199a","volume":"75","author":"S Sunyaev","year":"2003","unstructured":"Sunyaev S, Liska AJ, Golod A, Shevchenko A, Shevchenko A: MultiTag: multiple error-tolerant sequence tag search for the sequence-similarity identification of proteins by mass spectrometry. Anal Chem. 2003, 75 (6): 1307-1315. 10.1021\/ac026199a.","journal-title":"Anal Chem"},{"issue":"1","key":"6832_CR20","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1021\/pr0701198","volume":"7","author":"RD Bjornson","year":"2008","unstructured":"Bjornson RD, Carriero NJ, Colangelo C, Shifman M, Cheung KH, Miller PL, Williams K: X!!Tandem, an improved method for running X!tandem in parallel on collections of commodity computers. J Proteome Res. 2008, 7 (1): 293-299. 10.1021\/pr0701198.","journal-title":"J Proteome Res"},{"issue":"5","key":"6832_CR21","doi-asserted-by":"publisher","first-page":"1842","DOI":"10.1021\/pr050058i","volume":"4","author":"DT Duncan","year":"2005","unstructured":"Duncan DT, Craig R, Link AJ: Parallel tandem: a program for parallel processing of tandem mass spectra using PVM or MPI and X!Tandem. J Proteome Res. 2005, 4 (5): 1842-1847. 
10.1021\/pr050058i.","journal-title":"J Proteome Res"},{"issue":"12","key":"6832_CR22","doi-asserted-by":"publisher","first-page":"1503","DOI":"10.1016\/j.jpdc.2006.08.003","volume":"66","author":"D Battr","year":"2006","unstructured":"Battr D, Angulo DS: MPI framework for parallel searching in large biological databases. J Parallel Distrib Comput. 2006, 66 (12): 1503-1511. 10.1016\/j.jpdc.2006.08.003.","journal-title":"J Parallel Distrib Comput"},{"issue":"6","key":"6832_CR23","doi-asserted-by":"publisher","first-page":"3148","DOI":"10.1021\/pr800970z","volume":"8","author":"BD Halligan","year":"2009","unstructured":"Halligan BD, Geiger JF, Vallejos AK, Greene AS, Twigger SN: Low cost, scalable proteomics data analysis using Amazon\u2019s cloud computing services and open source search algorithms. J Proteome Res. 2009, 8 (6): 3148-3153. 10.1021\/pr800970z.","journal-title":"J Proteome Res"},{"issue":"6","key":"6832_CR24","doi-asserted-by":"publisher","first-page":"724","DOI":"10.1093\/bioinformatics\/btl656","volume":"23","author":"I Bogdan","year":"2007","unstructured":"Bogdan I, Coca D, Rivers J, Beynon RJ: Hardware acceleration of processing of mass spectrometric data for proteomics. Bioinformatics. 2007, 23 (6): 724-731. 10.1093\/bioinformatics\/btl656.","journal-title":"Bioinformatics"},{"issue":"15","key":"6832_CR25","doi-asserted-by":"publisher","first-page":"1937","DOI":"10.1093\/bioinformatics\/btp294","volume":"25","author":"R Hussong","year":"2009","unstructured":"Hussong R, Gregorius B, Tholey A, Hildebrandt A: Highly accelerated feature detection in proteomics data sets using modern graphics processing units. Bioinformatics. 2009, 25 (15): 1937-1943. 
10.1093\/bioinformatics\/btp294.","journal-title":"Bioinformatics"},{"issue":"6","key":"6832_CR26","doi-asserted-by":"publisher","first-page":"2882","DOI":"10.1021\/pr200074h","volume":"10","author":"LA Baumgardner","year":"2011","unstructured":"Baumgardner LA, Shanmugam AK, Lam H, Eng JK, Martin DB: Fast parallel tandem mass spectral library searching using GPU hardware acceleration. J Proteome Res. 2011, 10 (6): 2882-2888. 10.1021\/pr200074h.","journal-title":"J Proteome Res"},{"issue":"7","key":"6832_CR27","doi-asserted-by":"publisher","first-page":"3581","DOI":"10.1021\/pr300338p","volume":"11","author":"JA Milloy","year":"2012","unstructured":"Milloy JA, Faherty BK, Gerber SA: Tempest: GPU-CPU computing for high-throughput database spectral matching. J Proteome Res. 2012, 11 (7): 3581-3591. 10.1021\/pr300338p.","journal-title":"J Proteome Res"},{"issue":"2","key":"6832_CR28","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1016\/j.jcss.2012.05.004","volume":"79","author":"Y Li","year":"2013","unstructured":"Li Y, Zhao K, Chu X, Liu J: Speeding up k-means algorithm by GPUs. J Comput Syst Sci. 2013, 79 (2): 216-229. 10.1016\/j.jcss.2012.05.004.","journal-title":"J Comput Syst Sci"},{"key":"6832_CR29","first-page":"144","volume-title":"IPCCC","author":"X Chu","year":"2008","unstructured":"Chu X, Zhao K, Wang M: Massively Parallel Network Coding on GPUs. IPCCC. Edited by: Znati T, Zhang Y. 2008, Performance, Computing and Communications Conference: IEEE, 144-151. conf\/ipccc\/ChuZW08"},{"key":"6832_CR30","first-page":"573","volume-title":"Proceedings of the 8th International IFIP-TC 6 Networking Conference","author":"X Chu","year":"2009","unstructured":"Chu X, Zhao K, Wang M: Practical Random Linear Network Coding on GPUs. Proceedings of the 8th International IFIP-TC 6 Networking Conference. 
2009, Aachen, Germany: Springer-Verlag, 1560189: 573-585."},{"key":"6832_CR31","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1109\/CIT.2010.60","volume-title":"Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology","author":"Y Li","year":"2010","unstructured":"Li Y, Zhao K, Chu X, Liu J: Speeding up K-Means Algorithm by GPUs. Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology. 2010, Computer and Information Technology (CIT), 2010 IEEE 10th International Conference: IEEE Computer Society, 1901155: 115-122."},{"issue":"12","key":"6832_CR32","doi-asserted-by":"publisher","first-page":"1791","DOI":"10.1002\/rcm.4578","volume":"24","author":"L Wang","year":"2010","unstructured":"Wang L, Wang W, Chi H, Wu Y, Li Y, Fu Y, Zhou C, Sun R, Wang H, Liu C, Yuan Z, Xiu L, He SM: An efficient parallelization of phosphorylated peptide and protein identification. Rapid Commun Mass Spectrom. 2010, 24 (12): 1791-1798. 10.1002\/rcm.4578.","journal-title":"Rapid Commun Mass Spectrom"},{"key":"6832_CR33","volume-title":"Bioinformatics","author":"Z Kaiyong","year":"2014","unstructured":"Kaiyong Z, Xiaowen C: G-BLASTN: accelerating nucleotide alignment by graphics processors. Bioinformatics. 2014, http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24463183,"},{"issue":"3","key":"6832_CR34","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1038\/nmeth1019","volume":"4","author":"JE Elias","year":"2007","unstructured":"Elias JE, Gygi SP: Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods. 2007, 4 (3): 207-214. 10.1038\/nmeth1019.","journal-title":"Nat Methods"},{"key":"6832_CR35","unstructured":"NVIDIA CUDA Compute Unified Device Architecture: Programming Guide, Version 2.0beta2. 
2008, http:\/\/www.cs.ucla.edu\/~palsberg\/course\/cs239\/papers\/CudaReferenceManual_2.0.pdf,"},{"key":"6832_CR36","first-page":"544","volume-title":"VLDB\u201996, Proceedings of 22th International Conference on Very Large Data Bases","author":"JC Shafer","year":"1996","unstructured":"Shafer JC, Agrawal R, Mehta M: SPRINT: A Scalable Parallel Classifier for Data Mining. VLDB\u201996, Proceedings of 22th International Conference on Very Large Data Bases. 1996, San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 544-555."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-15-121.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,2]],"date-time":"2021-09-02T10:01:41Z","timestamp":1630576901000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-15-121"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,4,28]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,12]]}},"alternative-id":["6832"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-15-121","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,4,28]]},"assertion":[{"value":"31 October 2012","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2014","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 2014","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"121"}}