{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T17:10:02Z","timestamp":1772557802962,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2021,10,5]],"date-time":"2021-10-05T00:00:00Z","timestamp":1633392000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"US National Institutes of Health","doi-asserted-by":"crossref","award":["U54-HG007963"],"award-info":[{"award-number":["U54-HG007963"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objective<\/jats:title>\n                    <jats:p>Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data. However, previous linked data inference methods are constrained to certain linkage settings and exhibit low power. Here, we present ATLAS, an automated, flexible, and robust association testing algorithm for probabilistically linked data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>Missing variables are imputed at various thresholds using a weighted average method that propagates uncertainty from probabilistic linkage. Next, estimated effect sizes are obtained using a generalized linear model. ATLAS then conducts the threshold combination test by optimally combining P values obtained from data imputed at varying thresholds using Fisher\u2019s method and perturbation resampling.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In simulations, ATLAS controls for type I error and exhibits high power compared to previous methods. In a real-world genetic association study, meta-analysis of ATLAS-enabled analyses on a linked cohort with analyses using an existing cohort yielded additional significant associations between rheumatoid arthritis genetic risk score and laboratory biomarkers.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>Weighted average imputation weathers false matches and increases contribution of true matches to mitigate linkage error-induced bias. The threshold combination test avoids arbitrarily choosing a threshold to rule a match, thus automating linked data-enabled analyses and preserving power.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>ATLAS promises to enable novel and powerful research studies using linked data to capitalize on all available data sources.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocab187","type":"journal-article","created":{"date-parts":[[2021,9,8]],"date-time":"2021-09-08T07:36:27Z","timestamp":1631086587000},"page":"2582-2592","source":"Crossref","is-referenced-by-count":3,"title":["ATLAS: an automated association test using probabilistically linked health records with application to genetic studies"],"prefix":"10.1093","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6679-1464","authenticated-orcid":false,"given":"Harrison G","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"},{"name":"Division of Rheumatology, Immunology, and Allergy, Brigham and Women\u2019s Hospital, Boston, Massachusetts, USA"},{"name":"Department of Biological Sciences, Columbia University, New York City, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0646-452X","authenticated-orcid":false,"given":"Boris P","family":"Hejblum","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA"},{"name":"Bordeaux Population Health, Universit\u00e9 de Bordeaux, Inserm U1219, Inria SISTM, Bordeaux, France"}]},{"given":"Griffin M","family":"Weber","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Nathan P","family":"Palmer","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Susanne E","family":"Churchill","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Peter","family":"Szolovits","sequence":"additional","affiliation":[{"name":"Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts, USA"}]},{"given":"Shawn N","family":"Murphy","sequence":"additional","affiliation":[{"name":"Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA"},{"name":"Research IS and Computing, Mass General Brigham HealthCare, Charlestown, Massachusetts, USA"}]},{"given":"Katherine P","family":"Liao","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"},{"name":"Division of Rheumatology, Immunology, and Allergy, Brigham and Women\u2019s Hospital, Boston, Massachusetts, USA"}]},{"given":"Isaac S","family":"Kohane","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5379-2502","authenticated-orcid":false,"given":"Tianxi","family":"Cai","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"},{"name":"Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,10,5]]},"reference":[{"issue":"2","key":"2021120106310668300_ocab187-B1","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1136\/amiajnl-2011-000492","article-title":"A translational engine at the national scale: informatics for integrating biology and the bedside","volume":"19","author":"Kohane","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"6","key":"2021120106310668300_ocab187-B2","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1197\/jamia.M2824","article-title":"Translational bioinformatics: coming of age","volume":"15","author":"Butte","year":"2008","journal-title":"J Am Med Inform Assoc"},{"key":"2021120106310668300_ocab187-B3"},{"issue":"501","key":"2021120106310668300_ocab187-B4","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1080\/01621459.2012.726889","article-title":"A Bayesian procedure for file linking to analyze end-of-life medical costs","volume":"108","author":"Gutman","year":"2013","journal-title":"J Am Stat Assoc"},{"key":"2021120106310668300_ocab187-B5","first-page":"1005","article-title":"The effect of mismatching on the measurement of response errors","volume":"60","author":"Neter","year":"1965","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"2021120106310668300_ocab187-B6","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1186\/s12874-018-0632-5","article-title":"Impact of linkage quality on inferences drawn from analyses using data with high rates of linkage errors in rural Tanzania","volume":"18","author":"Rentsch","year":"2018","journal-title":"BMC Med Res Methodol"},{"issue":"7","key":"2021120106310668300_ocab187-B7","doi-asserted-by":"crossref","first-page":"e103690","DOI":"10.1371\/journal.pone.0103690","article-title":"A new method for assessing how sensitivity and specificity of linkage studies affects estimation","volume":"9","author":"Moore","year":"2014","journal-title":"PLoS One"},{"issue":"12","key":"2021120106310668300_ocab187-B8","doi-asserted-by":"crossref","first-page":"e85278","DOI":"10.1371\/journal.pone.0085278","article-title":"Linkage, evaluation and analysis of national electronic healthcare data: application to providing enhanced blood-stream infection surveillance in paediatric intensive care","volume":"8","author":"Harron","year":"2013","journal-title":"PLoS One"},{"key":"2021120106310668300_ocab187-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1472-6947-13-1","article-title":"Impact of unlinked deaths and coding changes on mortality trends in the Swiss national cohort","volume":"13","author":"Schmidlin","year":"2013","journal-title":"BMC Med Inform Decis Mak"},{"issue":"6","key":"2021120106310668300_ocab187-B10","first-page":"2050","article-title":"Reflections on modern methods: linkage error bias","volume":"48","author":"Doidge","year":"2019","journal-title":"Int J Epidemiol"},{"issue":"30","key":"2021120106310668300_ocab187-B11","doi-asserted-by":"crossref","first-page":"4231","DOI":"10.1002\/sim.5498","article-title":"Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables","volume":"31","author":"Hof","year":"2012","journal-title":"Stat Med"},{"issue":"3","key":"2021120106310668300_ocab187-B12","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1111\/stan.12172","article-title":"A weighting approach to making inference with probabilistically linked data","volume":"73","author":"Chipperfield","year":"2019","journal-title":"Stat Neerland"},{"issue":"4","key":"2021120106310668300_ocab187-B13","doi-asserted-by":"crossref","first-page":"728","DOI":"10.1080\/10618600.2018.1458624","article-title":"Regression modeling and file matching using possibly erroneous matching variables","volume":"27","author":"Dalzell","year":"2018","journal-title":"J Comput Graph Stat"},{"issue":"S1","key":"2021120106310668300_ocab187-B14","doi-asserted-by":"crossref","first-page":"S139","DOI":"10.1111\/insr.12295","article-title":"Statistical analysis with linked data","volume":"87","author":"Han","year":"2019","journal-title":"Int Stat Rev"},{"key":"2021120106310668300_ocab187-B15","doi-asserted-by":"crossref","first-page":"180298","DOI":"10.1038\/sdata.2018.298","article-title":"Probabilistic record linkage of de-identified research datasets with discrepancies using diagnosis codes","volume":"6","author":"Hejblum","year":"2019","journal-title":"Sci Data"},{"issue":"2","key":"2021120106310668300_ocab187-B16","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1093\/biomet\/88.2.381","article-title":"A simple resampling method by perturbing the minimand","volume":"88","author":"Jin","year":"2001","journal-title":"Biometrika"},{"issue":"496","key":"2021120106310668300_ocab187-B17","doi-asserted-by":"crossref","first-page":"1371","DOI":"10.1198\/jasa.2011.tm10382","article-title":"A perturbation method for inference on regularized regression estimates","volume":"106","author":"Minnier","year":"2011","journal-title":"J Am Stat Assoc"},{"key":"2021120106310668300_ocab187-B18","first-page":"1597"},{"key":"2021120106310668300_ocab187-B19","first-page":"274","author":"Adly","year":"2009"},{"key":"2021120106310668300_ocab187-B20","first-page":"1","article-title":"Spherical regression under mismatch corruption with application to automated knowledge translation","author":"Shi","year":"2020","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"2021120106310668300_ocab187-B21","doi-asserted-by":"crossref","first-page":"6","DOI":"10.3390\/jpm6010006","article-title":"The information technology infrastructure for the translational genomics core and the partners biobank at partners personalized medicine","volume":"6","author":"Boutin","year":"2016","journal-title":"J Pers Med"},{"issue":"1","key":"2021120106310668300_ocab187-B22","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.ajhg.2010.12.007","article-title":"Genetic basis of autoantibody positive and negative rheumatoid arthritis risk in a multi-ethnic cohort derived from electronic health records","volume":"88","author":"Kurreeman","year":"2011","journal-title":"Am J Hum Genet"},{"key":"2021120106310668300_ocab187-B23","first-page":"1044","article-title":"Calculating the benefits of a research patient data repository","volume":"2006","author":"Nalichowski","year":"2006","journal-title":"AMIA Annu Symp Proc"},{"issue":"8","key":"2021120106310668300_ocab187-B24","doi-asserted-by":"crossref","first-page":"1120","DOI":"10.1002\/acr.20184","article-title":"Electronic medical records for discovery research in rheumatoid arthritis","volume":"62","author":"Liao","year":"2010","journal-title":"Arthritis Care Res (Hoboken)"},{"issue":"12","key":"2021120106310668300_ocab187-B25","doi-asserted-by":"crossref","first-page":"3759","DOI":"10.1093\/rheumatology\/keaa198","article-title":"Impact of ICD10 and secular changes on electronic medical record rheumatoid arthritis algorithms","volume":"59","author":"Huang","year":"2020","journal-title":"Rheumatology (Oxford)"},{"key":"2021120106310668300_ocab187-B26","doi-asserted-by":"crossref","first-page":"2","DOI":"10.3390\/jpm6010002","article-title":"Building the partners healthcare biobank at partners personalized medicine: Informed consent, return of research results, recruitment lessons and operational considerations","volume":"6","author":"Karlson","year":"2016","journal-title":"J Pers Med"},{"key":"2021120106310668300_ocab187-B27","doi-asserted-by":"crossref","first-page":"11","DOI":"10.3390\/jpm6010011","article-title":"The biobank portal for partners personalized medicine: a query tool for working with consented biobank samples, genotypes, and phenotypes using i2b2","volume":"6","author":"Gainer","year":"2016","journal-title":"J Pers Med"},{"issue":"7488","key":"2021120106310668300_ocab187-B28","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature12873","article-title":"Genetics of rheumatoid arthritis contributes to biology and drug discovery","volume":"506","author":"Okada","year":"2014","journal-title":"Nature"},{"issue":"3","key":"2021120106310668300_ocab187-B29","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1038\/ng.1076","article-title":"Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis","volume":"44","author":"Raychaudhuri","year":"2012","journal-title":"Nat Genet"},{"issue":"9","key":"2021120106310668300_ocab187-B30","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene\u2013disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"issue":"7","key":"2021120106310668300_ocab187-B31","doi-asserted-by":"crossref","first-page":"e0175508","DOI":"10.1371\/journal.pone.0175508","article-title":"Evaluating phecodes, clinical classification software, and icd-9-cm codes for phenome-wide association studies in the electronic health record","volume":"12","author":"Wei","year":"2017","journal-title":"PLoS One"},{"key":"2021120106310668300_ocab187-B32","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J R Stat Soc B (Methodol)"},{"issue":"11","key":"2021120106310668300_ocab187-B33","doi-asserted-by":"crossref","first-page":"1472","DOI":"10.1002\/art.24827","article-title":"Anti-citrullinated peptide antibody (ACPA) assays and their role in the diagnosis of rheumatoid arthritis","volume":"61","author":"Aggarwal","year":"2009","journal-title":"Arthritis Rheum"},{"issue":"3","key":"2021120106310668300_ocab187-B34","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1002\/art.37801","article-title":"Associations of autoantibodies, autoimmune risk alleles, and clinical diagnoses from the electronic medical records in rheumatoid arthritis cases and non\u2013rheumatoid arthritis controls","volume":"65","author":"Liao","year":"2013","journal-title":"Arthritis Rheum"},{"issue":"1","key":"2021120106310668300_ocab187-B35","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1111\/j.1541-0420.2011.01666.x","article-title":"Combining multiple imputation and inverse-probability weighting","volume":"68","author":"Seaman","year":"2012","journal-title":"Biometrics"},{"key":"2021120106310668300_ocab187-B36","article-title":"Evaluation of the association between C-reactive protein and anti-citrullinated protein antibody in rheumatoid arthritis: analysis of two clinical practice data sets [abstract]","volume":"68 (suppl 10): 1226","author":"Alemao","year":"2016","journal-title":"Arthritis Rheumatol"},{"issue":"1","key":"2021120106310668300_ocab187-B37","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1016\/j.semarthrit.2020.11.005","article-title":"C-reactive protein and implications in rheumatoid arthritis and associated comorbidities","volume":"51","author":"Pope","year":"2021","journal-title":"Semin Arthritis Rheum"},{"issue":"7","key":"2021120106310668300_ocab187-B38","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1002\/1529-0131(200007)43:7<1473::AID-ANR9>3.0.CO;2-N","article-title":"Relationship between time-integrated C-reactive protein levels and radiologic progression in patients with rheumatoid arthritis","volume":"43","author":"Plant","year":"2000","journal-title":"Arthritis Rheum"},{"issue":"6","key":"2021120106310668300_ocab187-B39","first-page":"1095","article-title":"High sensitivity C-reactive protein as a disease activity marker in rheumatoid arthritis","volume":"31","author":"Dessein","year":"2004","journal-title":"J Rheumatol"},{"issue":"8","key":"2021120106310668300_ocab187-B40","first-page":"1477","article-title":"Comparative usefulness of C-reactive protein and erythrocyte sedimentation rate in patients with rheumatoid arthritis","volume":"24","author":"Wolfe","year":"1997","journal-title":"J Rheumatol"},{"issue":"3","key":"2021120106310668300_ocab187-B41","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1309\/LMZYTSO5RHIHV93T","article-title":"Rheumatoid factor, anti-cyclic citrullinated peptide antibody, C-reactive protein, and erythrocyte sedimentation rate for the clinical diagnosis of rheumatoid arthritis","volume":"46","author":"Shen","year":"2015","journal-title":"Lab Med"},{"issue":"6055","key":"2021120106310668300_ocab187-B42","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1136\/bmj.1.6055.195","article-title":"Rheumatoid arthritis: relation of serum C-reactive protein and erythrocyte sedimentation rates to radiographic changes","volume":"1","author":"Amos","year":"1977","journal-title":"Br Med J"},{"key":"2021120106310668300_ocab187-B43","first-page":"1817","article-title":"The level of inflammation in rheumatoid arthritis is determined early and remains stable over the longterm course of the illness","volume":"28","author":"Wolfe","year":"2001","journal-title":"J Rheumatol"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/28\/12\/2582\/41325387\/ocab187.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/28\/12\/2582\/41325387\/ocab187.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,7]],"date-time":"2024-09-07T21:07:32Z","timestamp":1725743252000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/12\/2582\/6381508"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,5]]},"references-count":43,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,10,5]]},"published-print":{"date-parts":[[2021,11,25]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocab187","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.05.02.21256490","asserted-by":"object"}]},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,12,1]]},"published":{"date-parts":[[2021,10,5]]}}}