{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T03:39:01Z","timestamp":1752550741685},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"S6","license":[{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2019,12,19]],"date-time":"2019-12-19T00:00:00Z","timestamp":1576713600000},"content-version":"vor","delay-in-days":18,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Sequence alignment is a way of arranging sequences (e.g., DNA, RNA, protein, natural language, financial data, or medical events) to identify the relatedness between two or more sequences and regions of similarity. For Electronic Health Records (EHR) data, sequence alignment helps to identify patients of similar disease trajectory for more relevant and precise prognosis, diagnosis and treatment of patients.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>We tested two cutting-edge global sequence alignment methods, namely dynamic time warping (DTW) and Needleman-Wunsch algorithm (NWA), together with their local modifications, DTW for Local alignment (DTWL) and Smith-Waterman algorithm (SWA), for aligning patient medical records. We also used 4 sets of synthetic patient medical records generated from a large real-world EHR database as gold standard data, to objectively evaluate these sequence alignment algorithms.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>For global sequence alignments, 47 out of 80 DTW alignments and 11 out of 80 NWA alignments had superior similarity scores than reference alignments while the rest 33 DTW alignments and 69 NWA alignments had the same similarity scores as reference alignments. Forty-six out of 80 DTW alignments had better similarity scores than NWA alignments with the rest 34 cases having the equal similarity scores from both algorithms. For local sequence alignments, 70 out of 80 DTWL alignments and 68 out of 80 SWA alignments had larger coverage and higher similarity scores than reference alignments while the rest DTWL alignments and SWA alignments received the same coverage and similarity scores as reference alignments. Six out of 80 DTWL alignments\u2009showed larger coverage and higher similarity scores than SWA alignments. Thirty DTWL alignments had the equal coverage but better similarity scores than SWA. DTWL and SWA received the equal coverage and similarity scores for the rest 44 cases.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>DTW, NWA, DTWL and SWA outperformed the reference alignments. DTW (or DTWL) seems to align better than NWA (or SWA) by inserting new daily events and identifying more similarities between patient medical records. The evaluation results could provide valuable information on the strengths and weakness of these sequence alignment methods for future development of sequence alignment methods and patient similarity-based studies.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12911-019-0965-y","type":"journal-article","created":{"date-parts":[[2019,12,19]],"date-time":"2019-12-19T09:04:20Z","timestamp":1576746260000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Evaluating global and local sequence alignment methods for comparing patient medical records"],"prefix":"10.1186","volume":"19","author":[{"given":"Ming","family":"Huang","sequence":"first","affiliation":[]},{"given":"Nilay D.","family":"Shah","sequence":"additional","affiliation":[]},{"given":"Lixia","family":"Yao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,12,19]]},"reference":[{"issue":"5","key":"965_CR1","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1007\/s10916-015-0237-z","volume":"39","author":"Y Wang","year":"2015","unstructured":"Wang Y, Tian Y, Tian L-L, Qian Y-M, Li J-S. An electronic medical record system with treatment recommendations based on patient similarity. J Med Syst. 2015;39(5):55.","journal-title":"J Med Syst"},{"key":"965_CR2","unstructured":"Wang F, Hu J, Sun J, editors. Medical prognosis based on patient similarity and expert feedback. 2012 21st International Conference on Pattern Recognition (ICPR); 2012: IEEE. ISBN: 4990644107."},{"issue":"5","key":"965_CR3","doi-asserted-by":"publisher","first-page":"e0127428","DOI":"10.1371\/journal.pone.0127428","volume":"10","author":"J Lee","year":"2015","unstructured":"Lee J, Maslove DM, Dubin JA. Personalized mortality prediction driven by electronic medical data and a patient similarity metric. PLoS One. 2015;10(5):e0127428.","journal-title":"PLoS One"},{"issue":"1","key":"965_CR4","doi-asserted-by":"publisher","first-page":"e7","DOI":"10.2196\/medinform.6730","volume":"5","author":"A Sharafoddini","year":"2017","unstructured":"Sharafoddini A, Dubin JA, Lee J. Patient similarity in prediction models based on health data: a scoping review. JMIR Med Inform. 2017;5(1):e7. PMID: 28258046. https:\/\/doi.org\/10.2196\/medinform.6730.","journal-title":"JMIR Med Inform"},{"key":"965_CR5","doi-asserted-by":"publisher","unstructured":"Brown S-A. Patient Similarity: Emerging Concepts in Systems and Precision Medicine. Front Physiol. 2016;7(561). https:\/\/doi.org\/10.3389\/fphys.2016.00561.","DOI":"10.3389\/fphys.2016.00561"},{"key":"965_CR6","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1016\/j.jbi.2018.06.001","volume":"83","author":"E Parimbelli","year":"2018","unstructured":"Parimbelli E, Marini S, Sacchi L, Bellazzi R. Patient similarity for precision medicine: A systematic review. J Biomed Inform. 2018;83:87\u201396. https:\/\/doi.org\/10.1016\/j.jbi.2018.06.001.","journal-title":"J Biomed Inform"},{"key":"965_CR7","doi-asserted-by":"crossref","unstructured":"Huang M, Zolnoori M, Shah ND, Yao L, editors. Temporal sequence alignment in electronic health records for computable patient representation. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): IEEE; 2018. ISBN: 1538654881","DOI":"10.1109\/BIBM.2018.8621428"},{"key":"965_CR8","doi-asserted-by":"crossref","unstructured":"Che C, Xiao C, Liang J, Jin B, Zho J, Wang F, editors. An RNN Architecture with Dynamic Temporal Matching for Personalized Predictions of Parkinson's Disease. Proceedings of the 2017 SIAM International Conference on Data Mining: SIAM; 2017.","DOI":"10.1137\/1.9781611974973.23"},{"issue":"1","key":"965_CR9","doi-asserted-by":"publisher","first-page":"4216","DOI":"10.1038\/s41598-018-22578-1","volume":"8","author":"A Giannoula","year":"2018","unstructured":"Giannoula A, Gutierrez-Sacrist\u00e1n A, Bravo \u00c1, Sanz F, Furlong LI. Identifying temporal patterns in patient disease trajectories using dynamic time warping: A population-based study. Scientific Rep. 2018;8(1):4216. https:\/\/doi.org\/10.1038\/s41598-018-22578-1.","journal-title":"Scientific Rep"},{"issue":"3","key":"965_CR10","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","volume":"48","author":"SB Needleman","year":"1970","unstructured":"Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48(3):443\u201353.","journal-title":"J Mol Biol"},{"key":"965_CR11","doi-asserted-by":"crossref","unstructured":"Sung W-K. Algorithms in bioinformatics: A practical introduction. 1st ed: CRC Press; 2009. ISBN: 1420070347","DOI":"10.1201\/9781420070347"},{"issue":"1","key":"965_CR12","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","volume":"147","author":"TF Smith","year":"1981","unstructured":"Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981;147(1):195\u20137.","journal-title":"J Mol Biol"},{"issue":"3","key":"965_CR13","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1016\/j.bbrc.2018.05.134","volume":"502","author":"J Sun","year":"2018","unstructured":"Sun J, Chen K, Hao Z. Pairwise alignment for very long nucleic acid sequences. Biochem Biophys Res Commun. 2018;502(3):313\u20137.","journal-title":"Biochem Biophys Res Commun"},{"issue":"1","key":"965_CR14","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/nmeth.3176","volume":"12","author":"B Buchfink","year":"2015","unstructured":"Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12(1):59.","journal-title":"Nat Methods"},{"issue":"7615","key":"965_CR15","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1136\/bmj.39279.482963.AD","volume":"335","author":"Claudia Pagliari","year":"2007","unstructured":"Pagliari C, Detmer D, Singleton P. Potential of electronic personal health records. BMJ. 2007;335(7615):330\u20133.","journal-title":"BMJ"},{"key":"965_CR16","doi-asserted-by":"crossref","unstructured":"Li D, Liu P, Huang M, Gu Y, Zhang Y, Li X, et al., editors. Mapping client messages to a unified data model with mixture feature embedding convolutional neural network. 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): IEEE; 2017. ISBN: 1509030506","DOI":"10.1109\/BIBM.2017.8217680"},{"issue":"3","key":"965_CR17","doi-asserted-by":"publisher","first-page":"165","DOI":"10.1109\/TNB.2018.2841053","volume":"17","author":"D Li","year":"2018","unstructured":"Li D, Huang M, Li X, Ruan Y, Yao L. MfeCNN: mixture feature embedding convolutional neural network for data mapping. IEEE Trans Nanobioscience. 2018;17(3):165\u201371.","journal-title":"IEEE Trans Nanobioscience"},{"key":"965_CR18","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1007\/978-3-540-74048-3","volume-title":"Dynamic time warping. Information retrieval for music and motion","author":"M M\u00fcller","year":"2007","unstructured":"M\u00fcller M. Dynamic time warping. Information retrieval for music and motion; 2007. p. 69\u201384."},{"issue":"2","key":"965_CR19","doi-asserted-by":"publisher","first-page":"368-j","DOI":"10.1093\/ije\/dyx268","volume":"47","author":"WA Rocca","year":"2018","unstructured":"Rocca WA, Grossardt BR, Brue SM, Bock-Goodner CM, Chamberlain AM, Wilson PM, et al. Data resource profile: expansion of the Rochester epidemiology project medical records-linkage system (E-REP). Int J Epidemiol. 2018;47(2):368-j.","journal-title":"Int J Epidemiol"},{"issue":"6","key":"965_CR20","doi-asserted-by":"publisher","first-page":"1614","DOI":"10.1093\/ije\/dys195","volume":"41","author":"JL St Sauver","year":"2012","unstructured":"St Sauver JL, Grossardt BR, Yawn BP, Melton LJ III, Pankratz JJ, Brue SM, et al. Data resource profile: the Rochester epidemiology project (REP) medical records-linkage system. Int J Epidemiol. 2012;41(6):1614\u201324.","journal-title":"Int J Epidemiol"},{"issue":"9","key":"965_CR21","doi-asserted-by":"publisher","first-page":"1059","DOI":"10.1093\/aje\/kwq482","volume":"173","author":"JL St. Sauver","year":"2011","unstructured":"St. Sauver JL, Grossardt BR, Yawn BP, Melton LJ III, Rocca WA. Use of a medical records linkage system to enumerate a dynamic population over time: the Rochester epidemiology project. Am J Epidemiol. 2011;173(9):1059\u201368.","journal-title":"Am J Epidemiol"},{"key":"965_CR22","volume-title":"International classification of diseases, ninth revision, clinical modification (ICD-9-CM)","author":"National Center for Health Statistics","year":"2013","unstructured":"National Center for Health Statistics. International classification of diseases, ninth revision, clinical modification (ICD-9-CM). Atlanta: Centers for Disease Control Prevention; 2013. Available from: https:\/\/www.cdc.gov\/nchs\/icd\/icd9cm.htm"},{"issue":"7","key":"965_CR23","doi-asserted-by":"publisher","first-page":"e0175508","DOI":"10.1371\/journal.pone.0175508","volume":"12","author":"W-Q Wei","year":"2017","unstructured":"Wei W-Q, Bastarache LA, Carroll RJ, Marlo JE, Osterman TJ, Gamazon ER, et al. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One. 2017;12(7):e0175508.","journal-title":"PLoS One"},{"issue":"8","key":"965_CR24","doi-asserted-by":"publisher","first-page":"807","DOI":"10.1038\/nbt.3276","volume":"33","author":"L Yao","year":"2015","unstructured":"Yao L, Li Y, Ghosh S, Evans JA, Rzhetsky A. Health ROI as a measure of misalignment of biomedical needs and resources. Nat Biotechnol. 2015;33(8):807\u201311 PMID: 26252133.","journal-title":"Nat Biotechnol"},{"issue":"5","key":"965_CR25","doi-asserted-by":"publisher","first-page":"e10047","DOI":"10.2196\/10047","volume":"20","author":"M Huang","year":"2018","unstructured":"Huang M, ElTayeby O, Zolnoori M, Yao L. Public opinions toward diseases: infodemiological study on News Media Data. J Med Internet Res. 2018;20(5):e10047.","journal-title":"J Med Internet Res"},{"issue":"4","key":"965_CR26","doi-asserted-by":"publisher","first-page":"e13316","DOI":"10.2196\/13316","volume":"21","author":"M Huang","year":"2019","unstructured":"Huang M, Zolnoori M, Balls-Berry JE, Brockman TA, Patten CA, Yao L. Technological innovations in disease management: text mining US patent data from 1995 to 2017. J Med Internet Res. 2019;21(4):e13316.","journal-title":"J Med Internet Res"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-019-0965-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12911-019-0965-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-019-0965-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,12,18]],"date-time":"2020-12-18T00:20:21Z","timestamp":1608250821000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-019-0965-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12]]},"references-count":26,"journal-issue":{"issue":"S6","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["965"],"URL":"https:\/\/doi.org\/10.1186\/s12911-019-0965-y","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12]]},"assertion":[{"value":"19 December 2019","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"No patients were exposed to any intervention. We used the data from the Rochester Epidemiology Project (REP) to generate simulated patient medical records. The REP was approved by the Mayo Clinic Institutional Review Board (1945\u201399).","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"263"}}