{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:11:11Z","timestamp":1772165471537,"version":"3.50.1"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"S2","license":[{"start":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T00:00:00Z","timestamp":1729728000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T00:00:00Z","timestamp":1729728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Conventional differential gene expression analysis pipelines for non-model organisms require computationally expensive transcriptome assembly. We recently proposed an alternative strategy of directly aligning RNA-seq reads to a protein database, and demonstrated drastic improvements in speed, memory usage, and accuracy in identifying differentially expressed genes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Result<\/jats:title>\n                    <jats:p>Here we report a further speed-up by replacing DNA-protein alignment by quasi-mapping, making our pipeline &gt; 1000\u00d7 faster than assembly-based approach, and still more accurate. 
We also compare quasi-mapping to other mapping techniques, and show that it is faster but at the cost of sensitivity.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>We provide a quick-and-dirty differential gene expression analysis pipeline for non-model organisms without a reference transcriptome, which directly quasi-maps RNA-seq reads to a reference protein database, avoiding computationally expensive transcriptome assembly.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-024-05924-1","type":"journal-article","created":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T04:03:52Z","timestamp":1729742632000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["DNA-protein quasi-mapping for rapid differential gene expression analysis in non-model organisms"],"prefix":"10.1186","volume":"25","author":[{"given":"Kyle Christian L.","family":"Santiago","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9192-9709","authenticated-orcid":false,"given":"Anish M. S.","family":"Shrestha","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,10,24]]},"reference":[{"issue":"3","key":"5924_CR1","doi-asserted-by":"publisher","first-page":"620","DOI":"10.1111\/mec.12014","volume":"22","author":"N Vijay","year":"2012","unstructured":"Vijay N, Poelstra JW, K\u00fcnstner A, Wolf JBW. Challenges and strategies in transcriptome assembly and differential gene expression quantification a. comprehensive in-silico assessment of RNA-seq experiments. Mol Ecol. 
2012;22(3):620\u201334.","journal-title":"Mol Ecol"},{"issue":"1","key":"5924_CR2","doi-asserted-by":"publisher","first-page":"8304","DOI":"10.1038\/s41598-019-44499-3","volume":"9","author":"P-H Hsieh","year":"2019","unstructured":"Hsieh P-H, Oyang Y-J, Chen C-Y. Effect of de novo transcriptome assembly on transcript quantification. Sci Rep. 2019;9(1):8304.","journal-title":"Sci Rep"},{"issue":"1","key":"5924_CR3","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1186\/s12864-021-08278-7","volume":"23","author":"AMS Shrestha","year":"2022","unstructured":"Shrestha AMS, Guiao JEB, Santiago KCL. Assembly-free rapid differential gene expression analysis in non-model organisms using DNA-protein alignment. BMC Genom. 2022;23(1):97.","journal-title":"BMC Genom"},{"issue":"4","key":"5924_CR4","doi-asserted-by":"publisher","first-page":"713","DOI":"10.1101\/gr.269894.120","volume":"31","author":"P Liu","year":"2021","unstructured":"Liu P, Ewald J, Galvez JH, Head J, Crump D, Bourque G, Basu N, Xia J. Ultrafast functional profiling of RNA-seq data for nonmodel organisms. Genome Res. 2021;31(4):713\u201320.","journal-title":"Genome Res"},{"issue":"4","key":"5924_CR5","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1038\/nmeth.4197","volume":"14","author":"R Patro","year":"2017","unstructured":"Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14(4):417\u20139.","journal-title":"Nat Methods"},{"issue":"12","key":"5924_CR6","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1093\/bioinformatics\/btw277","volume":"32","author":"A Srivastava","year":"2016","unstructured":"Srivastava A, Sarkar H, Gupta N, Patro R. RapMap: a rapid, sensitive and accurate tool for mapping RNA-seq reads to transcriptomes. Bioinformatics. 
2016;32(12):192\u2013200.","journal-title":"Bioinformatics"},{"issue":"5","key":"5924_CR7","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1038\/nbt.3519","volume":"34","author":"NL Bray","year":"2016","unstructured":"Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. 2016;34(5):525\u20137.","journal-title":"Nat Biotechnol"},{"issue":"4","key":"5924_CR8","first-page":"407","volume":"65","author":"S Grabowski","year":"2017","unstructured":"Grabowski S, Raniszewski M. Compact and hash based variants of the suffix array. Bull Pol Acad Sci Tech Sci. 2017;65(4):407\u201318.","journal-title":"Bull Pol Acad Sci Tech Sci"},{"issue":"7","key":"5924_CR9","doi-asserted-by":"publisher","first-page":"621","DOI":"10.1038\/nmeth.1226","volume":"5","author":"A Mortazavi","year":"2008","unstructured":"Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat Methods. 2008;5(7):621\u20138.","journal-title":"Nat Methods"},{"issue":"3","key":"5924_CR10","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1145\/2692956.2663188","volume":"34","author":"ND Matsakis","year":"2014","unstructured":"Matsakis ND, Klock FS. The rust language. ACM SIGAda Ada Lett. 2014;34(3):103\u20134.","journal-title":"ACM SIGAda Ada Lett"},{"issue":"3","key":"5924_CR11","doi-asserted-by":"publisher","first-page":"444","DOI":"10.1093\/bioinformatics\/btv573","volume":"32","author":"J K\u00f6ster","year":"2015","unstructured":"K\u00f6ster J. Rust-bio: a fast and safe bioinformatics library. Bioinformatics. 2015;32(3):444\u20136.","journal-title":"Bioinformatics"},{"issue":"12","key":"5924_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-014-0550-8","volume":"15","author":"MI Love","year":"2014","unstructured":"Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 
2014;15(12):1\u201321.","journal-title":"Genome Biol"},{"key":"5924_CR13","doi-asserted-by":"publisher","DOI":"10.12688\/f1000research.29032.2","author":"F M\u00f6lder","year":"2021","unstructured":"M\u00f6lder F, Jablonski KP, Letcher B, Hall MB, Tomkins-Tinch CH, Sochat V, Lee S, Twardziok SO, Kanitz A, Wilm A, Holtgrewe M, Rahmann S, Nahnsen S, K\u00f6ster J. Sustainable data analysis with snakemake. F1000Res. 2021. https:\/\/doi.org\/10.12688\/f1000research.29032.2.","journal-title":"F1000Res"},{"issue":"17","key":"5924_CR14","doi-asserted-by":"publisher","first-page":"2778","DOI":"10.1093\/bioinformatics\/btv272","volume":"31","author":"AC Frazee","year":"2015","unstructured":"Frazee AC, Jaffe AE, Langmead B, Leek JT. Polyester: simulating RNA-seq datasets with differential transcript expression. Bioinformatics. 2015;31(17):2778\u201384.","journal-title":"Bioinformatics"},{"issue":"12","key":"5924_CR15","doi-asserted-by":"publisher","first-page":"1880","DOI":"10.1101\/gr.7062307","volume":"17","author":"A Bhutkar","year":"2007","unstructured":"Bhutkar A, Russo SM, Smith TF, Gelbart WM. Genome-scale analysis of positionally relocated genes. Genome Res. 2007;17(12):1880\u20137.","journal-title":"Genome Res"},{"issue":"8","key":"5924_CR16","doi-asserted-by":"publisher","first-page":"883","DOI":"10.1534\/g3.112.002527","volume":"2","author":"B Haubold","year":"2012","unstructured":"Haubold B, Pfaffelhuber P. Alignment-free population genomics: an efficient estimator of sequence diversity. G3 Genes|Genomes|Genet. 2012;2(8):883\u20139.","journal-title":"G3 Genes|Genomes|Genet"},{"key":"5924_CR17","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1101\/gr.196101","volume":"12","author":"VN Bolshakov","year":"2002","unstructured":"Bolshakov VN, Topalis P, Blass C, Kokoza E, Torre A, Kafatos FC, Louis C. A comparative genomic analysis of two distant diptera, the fruit fly, drosophila melanogaster, and the malaria mosquito, anopheles gambiae. Genome Res. 
2002;12:57\u201366.","journal-title":"Genome Res"},{"key":"5924_CR18","doi-asserted-by":"crossref","unstructured":"Yao Y, Frith MC. Improved DNA-versus-protein homology search for protein fossils. In: Algorithms for computational biology, Cham: Springer; 2021. pp. 146\u2013158.","DOI":"10.1007\/978-3-030-74432-8_11"},{"issue":"1","key":"5924_CR19","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/nmeth.3176","volume":"12","author":"B Buchfink","year":"2014","unstructured":"Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2014;12(1):59\u201360.","journal-title":"Nat Methods"},{"issue":"1","key":"5924_CR20","doi-asserted-by":"publisher","first-page":"11257","DOI":"10.1038\/ncomms11257","volume":"7","author":"P Menzel","year":"2016","unstructured":"Menzel P, Ng KL, Krogh A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat Commun. 2016;7(1):11257.","journal-title":"Nat Commun"},{"issue":"D1","key":"5924_CR21","doi-asserted-by":"publisher","first-page":"D234","DOI":"10.1093\/nar\/gku1203","volume":"43","author":"ELL Sonnhammer","year":"2014","unstructured":"Sonnhammer ELL, \u00d6stlund G. Inparanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucl Acids Res. 2014;43(D1):D234\u20139.","journal-title":"Nucl Acids Res"},{"key":"5924_CR22","doi-asserted-by":"publisher","first-page":"1494","DOI":"10.1038\/nprot.2013.084","volume":"8","author":"BJ Haas","year":"2013","unstructured":"...Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, MacManes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, LeDuc RD, Friedman N, Regev A. De novo transcript sequence reconstruction from RNA-seq using the trinity platform for reference generation and analysis. Nat Protoc. 
2013;8:1494\u2013512.","journal-title":"Nat Protoc"},{"issue":"R25","key":"5924_CR23","first-page":"1","volume":"10","author":"B Langmead","year":"2009","unstructured":"Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(R25):1\u201310.","journal-title":"Genome Biol"},{"issue":"323","key":"5924_CR24","first-page":"1","volume":"12","author":"B Li","year":"2011","unstructured":"Li B, Dewey CN. Rsem: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinform. 2011;12(323):1\u201316.","journal-title":"BMC Bioinform"},{"key":"5924_CR25","doi-asserted-by":"publisher","DOI":"10.12688\/f1000research.7563.2","author":"C Soneson","year":"2015","unstructured":"Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research. 2015. https:\/\/doi.org\/10.12688\/f1000research.7563.2.","journal-title":"F1000Research"},{"issue":"13","key":"5924_CR26","doi-asserted-by":"publisher","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","volume":"22","author":"W Li","year":"2006","unstructured":"Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658\u20139.","journal-title":"Bioinformatics"},{"issue":"1","key":"5924_CR27","doi-asserted-by":"publisher","first-page":"2542","DOI":"10.1038\/s41467-018-04964-5","volume":"9","author":"M Steinegger","year":"2018","unstructured":"Steinegger M, S\u00f6ding J. Clustering huge protein sequence sets in linear time. Nat Commun. 2018;9(1):2542.","journal-title":"Nat Commun"},{"issue":"1","key":"5924_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-019-1832-y","volume":"20","author":"DM Emms","year":"2019","unstructured":"Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. 
Genome Biol. 2019;20(1):1\u201314.","journal-title":"Genome Biol"},{"issue":"3","key":"5924_CR29","doi-asserted-by":"publisher","first-page":"440","DOI":"10.1093\/bioinformatics\/18.3.440","volume":"18","author":"B Ma","year":"2002","unstructured":"Ma B, Tromp J, Li M. PatternHunter: faster and more sensitive homology search. Bioinformatics. 2002;18(3):440\u20135.","journal-title":"Bioinformatics"},{"key":"5924_CR30","doi-asserted-by":"publisher","first-page":"e10805","DOI":"10.7717\/peerj.10805","volume":"9","author":"R Edgar","year":"2021","unstructured":"Edgar R. Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences. PeerJ. 2021;9:e10805.","journal-title":"PeerJ"},{"issue":"22\u201323","key":"5924_CR31","first-page":"5344","volume":"36","author":"MC Frith","year":"2020","unstructured":"Frith MC, No\u00e9 L, Kucherov G. Minimally-overlapping words for sequence similarity search. Bioinformatics. 2020;36(22\u201323):5344\u201350.","journal-title":"Bioinformatics"},{"key":"5924_CR32","unstructured":"Boden M, Sch\u00f6neich M, Horwege S, Lindner S, Leimeister C-A, Morgenstern B. Alignment-free sequence comparison with spaced k-mers. Germ Conf Bioinformat. 
2013;2013."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05924-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-024-05924-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05924-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T04:04:03Z","timestamp":1729742643000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-024-05924-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,24]]},"references-count":32,"journal-issue":{"issue":"S2","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["5924"],"URL":"https:\/\/doi.org\/10.1186\/s12859-024-05924-1","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.12.15.520671","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,24]]},"assertion":[{"value":"28 October 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 September 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 October 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to 
participate"}},{"value":"Not applicable","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"335"}}