{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T01:04:34Z","timestamp":1773363874009,"version":"3.50.1"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,4,3]],"date-time":"2020-04-03T00:00:00Z","timestamp":1585872000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2020,4,3]],"date-time":"2020-04-03T00:00:00Z","timestamp":1585872000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000050","name":"NHLBI","doi-asserted-by":"crossref","award":["R01HL132344"],"award-info":[{"award-number":["R01HL132344"]}],"id":[{"id":"10.13039\/100000050","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Admixed populations arise when two or more previously isolated populations interbreed. A powerful approach to addressing the genetic complexity in admixed populations is to infer ancestry. Ancestry inference including the proportion of an individual\u2019s genome coming from each population and its ancestral origin along the chromosome of an admixed population requires the use of ancestry informative markers (AIMs) from reference ancestral populations. AIMs exhibit substantial differences in allele frequency between ancestral populations. Given the huge amount of human genetic variation data available from diverse populations, a computationally feasible and cost-effective approach is becoming increasingly important to extract or filter AIMs with the maximum information content for ancestry inference, admixture mapping, forensic applications, and detecting genomic regions that have been under recent selection.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>To address this gap, we present MI-MAAP, an easy-to-use web-based bioinformatics tool designed to prioritize informative markers for multi-ancestry admixed populations by utilizing feature selection methods and multiple genomics resources including 1000 Genomes Project and Human Genome Diversity Project. Specifically, this tool implements a novel allele frequency-based feature selection algorithm, Lancaster Estimator of Independence (LEI), as well as other genotype-based methods such as Principal Component Analysis (PCA), Support Vector Machine (SVM), and Random Forest (RF). We demonstrated that MI-MAAP is a useful tool in prioritizing informative markers and accurately classifying ancestral populations. LEI is an efficient feature selection strategy to retrieve ancestry informative variants with different allele frequency\/selection pressure among (or between) ancestries without requiring computationally expensive individual-level genotype data.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>MI-MAAP has a user-friendly interface which provides researchers an easy and fast way to filter and identify AIMs. MI-MAAP can be accessed at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/research.cchmc.org\/mershalab\/MI-MAAP\/login\/\">https:\/\/research.cchmc.org\/mershalab\/MI-MAAP\/login\/<\/jats:ext-link>.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-020-3462-5","type":"journal-article","created":{"date-parts":[[2020,4,3]],"date-time":"2020-04-03T10:07:53Z","timestamp":1585908473000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["MI-MAAP: marker informativeness for multi-ancestry admixed populations"],"prefix":"10.1186","volume":"21","author":[{"given":"Siqi","family":"Chen","sequence":"first","affiliation":[]},{"given":"Sudhir","family":"Ghandikota","sequence":"additional","affiliation":[]},{"given":"Yadu","family":"Gautam","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9189-8447","authenticated-orcid":false,"given":"Tesfaye B.","family":"Mersha","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,4,3]]},"reference":[{"key":"3462_CR1","doi-asserted-by":"publisher","first-page":"292","DOI":"10.3389\/fgene.2015.00292","volume":"6","author":"TB Mersha","year":"2015","unstructured":"Mersha TB. Mapping asthma-associated variants in admixed populations. Front Genet. 2015;6:292.","journal-title":"Front Genet"},{"issue":"6","key":"3462_CR2","doi-asserted-by":"publisher","first-page":"623","DOI":"10.2217\/pme.09.54","volume":"6","author":"TM Baye","year":"2009","unstructured":"Baye TM, Wilke RA, Olivier M. Genomic and geographic distribution of private SNPs and pathways in human populations. Per Med. 2009;6(6):623\u201341.","journal-title":"Per Med"},{"issue":"7063","key":"3462_CR3","doi-asserted-by":"publisher","first-page":"1299","DOI":"10.1038\/nature04226","volume":"437","author":"International HapMap C","year":"2005","unstructured":"International HapMap C. A haplotype map of the human genome. Nature. 2005;437(7063):1299\u2013320.","journal-title":"Nature"},{"issue":"7616","key":"3462_CR4","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1038\/nature19057","volume":"536","author":"M Lek","year":"2016","unstructured":"Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285\u201391.","journal-title":"Nature"},{"issue":"5866","key":"3462_CR5","doi-asserted-by":"publisher","first-page":"1100","DOI":"10.1126\/science.1153717","volume":"319","author":"JZ Li","year":"2008","unstructured":"Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008;319(5866):1100\u20134.","journal-title":"Science"},{"issue":"5","key":"3462_CR6","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1089\/bio.2015.29031.hmm","volume":"13","author":"LJ Carithers","year":"2015","unstructured":"Carithers LJ, Moore HM. The genotype-tissue expression (GTEx) project. Biopreserv Biobank. 2015;13(5):307\u20138.","journal-title":"Biopreserv Biobank"},{"issue":"6","key":"3462_CR7","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1038\/ng.2653","volume":"45","author":"GTEx","year":"2013","unstructured":"GTEx. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013;45(6):580\u20135.","journal-title":"Nat Genet"},{"issue":"6","key":"3462_CR8","doi-asserted-by":"publisher","first-page":"465","DOI":"10.1038\/tpj.2010.71","volume":"10","author":"TM Baye","year":"2010","unstructured":"Baye TM, Wilke RA. Mapping genes that predict treatment outcome in admixed populations. Pharmacogenomics J. 2010;10(6):465\u201377.","journal-title":"Pharmacogenomics J"},{"key":"3462_CR9","first-page":"1157","volume":"3","author":"I Guyon","year":"2003","unstructured":"Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3:1157\u201382.","journal-title":"J Mach Learn Res"},{"issue":"7571","key":"3462_CR10","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1038\/nature15393","volume":"526","author":"GP Consortium","year":"2015","unstructured":"Consortium GP. A global reference for human genetic variation. Nature. 2015;526(7571):68.","journal-title":"Nature"},{"issue":"Suppl 9","key":"3462_CR11","doi-asserted-by":"publisher","first-page":"S8","DOI":"10.1186\/1753-6561-5-S9-S8","volume":"5","author":"TM Baye","year":"2011","unstructured":"Baye TM, He H, Ding L, Kurowski BG, Zhang X, Martin LJ. Population structure analysis using rare and common functional variants. BMC Proc. 2011;5(Suppl 9):S8.","journal-title":"BMC Proc"},{"key":"3462_CR12","doi-asserted-by":"publisher","first-page":"622","DOI":"10.1186\/1471-2164-12-622","volume":"12","author":"L Ding","year":"2011","unstructured":"Ding L, Wiener H, Abebe T, Altaye M, Go RC, Kercsmar C, Grabowski G, Martin LJ, Khurana Hershey GK, Chakorborty R, et al. Comparison of measures of marker informativeness for ancestry and admixture mapping. BMC Genomics. 2011;12:622.","journal-title":"BMC Genomics"},{"issue":"1","key":"3462_CR13","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.ygeno.2012.05.003","volume":"100","author":"S Amirisetty","year":"2012","unstructured":"Amirisetty S, Hershey GK, Baye TM. AncestrySNPminer: a bioinformatics tool to retrieve and develop ancestry informative SNP panels. Genomics. 2012;100(1):57\u201363.","journal-title":"Genomics"},{"issue":"1","key":"3462_CR14","doi-asserted-by":"publisher","first-page":"11103","DOI":"10.1038\/s41598-019-47012-y","volume":"9","author":"MJ Wathen","year":"2019","unstructured":"Wathen MJ, Gautam Y, Ghandikota S, Rao MB, Mersha TB. LEI: a novel allele frequency-based feature selection method for multi-ancestry admixed populations. Sci Rep. 2019;9(1):11103.","journal-title":"Sci Rep"},{"key":"3462_CR15","volume-title":"The Chi-squared Distribution. 1969","author":"HO Lancaster","year":"1969","unstructured":"Lancaster HO. The Chi-squared Distribution. 1969. New York: Wiley; 1969."},{"issue":"6968","key":"3462_CR16","doi-asserted-by":"publisher","first-page":"789","DOI":"10.1038\/nature02168","volume":"426","author":"International HapMap C","year":"2003","unstructured":"International HapMap C. The international HapMap project. Nature. 2003;426(6968):789\u201396.","journal-title":"Nature"},{"issue":"D1","key":"3462_CR17","doi-asserted-by":"publisher","first-page":"D840","DOI":"10.1093\/nar\/gkw971","volume":"45","author":"KJ Karczewski","year":"2017","unstructured":"Karczewski KJ, Weisburd B, Thomas B, Solomonson M, Ruderfer DM, Kavanagh D, Hamamsy T, Lek M, Samocha KE, Cummings BB, et al. The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res. 2017;45(D1):D840\u20135.","journal-title":"Nucleic Acids Res"},{"issue":"18","key":"3462_CR18","doi-asserted-by":"publisher","first-page":"3160","DOI":"10.1093\/bioinformatics\/bty182","volume":"34","author":"S Ghandikota","year":"2018","unstructured":"Ghandikota S, Hershey GKK, Mersha TB. GENEASE: real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization. Bioinformatics. 2018;34(18):3160\u20138.","journal-title":"Bioinformatics"},{"issue":"9","key":"3462_CR19","doi-asserted-by":"publisher","first-page":"1790","DOI":"10.1101\/gr.137323.112","volume":"22","author":"AP Boyle","year":"2012","unstructured":"Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, Karczewski KJ, Park J, Hitz BC, Weng S, et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012;22(9):1790\u20137.","journal-title":"Genome Res"},{"issue":"8","key":"3462_CR20","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1038\/nrg3002","volume":"12","author":"MF Seldin","year":"2011","unstructured":"Seldin MF, Pasaniuc B, Price AL. New approaches to disease mapping in admixed populations. Nat Rev Genet. 2011;12(8):523\u20138.","journal-title":"Nat Rev Genet"},{"issue":"5","key":"3462_CR21","doi-asserted-by":"publisher","first-page":"1001","DOI":"10.1086\/420856","volume":"74","author":"MW Smith","year":"2004","unstructured":"Smith MW, Patterson N, Lautenberger JA, Truelove AL, McDonald GJ, Waliszewska A, Kessing BD, Malasky MJ, Scafe C, Le E, et al. A high-density admixture map for disease gene discovery in african americans. Am J Hum Genet. 2004;74(5):1001\u201313.","journal-title":"Am J Hum Genet"},{"issue":"4","key":"3462_CR22","doi-asserted-by":"publisher","first-page":"640","DOI":"10.1086\/507954","volume":"79","author":"C Tian","year":"2006","unstructured":"Tian C, Hinds DA, Shigeta R, Kittles R, Ballinger DG, Seldin MF. A genomewide single-nucleotide-polymorphism panel with high ancestry information for African American admixture mapping. Am J Hum Genet. 2006;79(4):640\u20139.","journal-title":"Am J Hum Genet"},{"issue":"1","key":"3462_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/2041-2223-2-1","volume":"2","author":"JR Kidd","year":"2011","unstructured":"Kidd JR, Friedlaender FR, Speed WC, Pakstis AJ, De La Vega FM, Kidd KK. Analyses of a set of 128 ancestry informative single-nucleotide polymorphisms in a global set of 119 population samples. Investig Genet. 2011;2(1):1.","journal-title":"Investig Genet"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3462-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-020-3462-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3462-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,2]],"date-time":"2021-04-02T23:07:42Z","timestamp":1617404862000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-3462-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,3]]},"references-count":23,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3462"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-3462-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,3]]},"assertion":[{"value":"22 April 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 March 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 April 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"131"}}