{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:39:47Z","timestamp":1772138387922,"version":"3.50.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2025,6,27]],"date-time":"2025-06-27T00:00:00Z","timestamp":1750982400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"name":"Sturm Family Foundation"},{"name":"National Institute of Health","award":["1UL1TR002373"],"award-info":[{"award-number":["1UL1TR002373"]}]},{"name":"National Institute of Health","award":["1R01GM130715"],"award-info":[{"award-number":["1R01GM130715"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objectives<\/jats:title>\n                    <jats:p>Family data are a valuable data source in bioinformatic research. This is because family members often share common genetic and environmental exposures. Collecting this family data is traditionally very labor intensive but advances in electronic health record (EHR) data mining has proven useful when identifying pedigrees linked to longitudinal health histories. These are called e-pedigrees. Unfortunately, e-pedigrees tend to miss the oldest patients who inherently have the longest and richest health histories. A good source of family data from older generations includes obituaries, as they have a formulaic nature making them a good candidate for natural language processing (NLP) that can extract relationships to the decedent. While there have been several studies on obtaining such data from obituaries, we demonstrate for the first time approaches that tie that information to an EHR.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>Natural language processing extraction resulted in 8\u00a0166\u00a0534 family members being abstracted from 567\u00a0279 obituaries published in the state of Wisconsin. After matching decedent and family members to patients in the EHR, we identified 200 033 unique patients that were put in 53 640 pedigrees.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The largest pedigree consisted of 21 individuals. Heritability of adult height was quantified (H2=0.51\u00b10.04, P&amp;lt;1.00e-07) demonstrating these data\u2019s use in genetic research. The heritability data, coupled with overlapping data in a biobank, suggested 80%-90% of familial relationships were accurately defined.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>The totality of these findings demonstrate obituaries with the oldest people in society can be highly informative for bioinformatic research.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Code is available on GitHub at https:\/\/github.com\/jgmayer672\/ObituaryNLP.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocaf102","type":"journal-article","created":{"date-parts":[[2025,6,6]],"date-time":"2025-06-06T07:53:24Z","timestamp":1749196404000},"page":"1407-1414","source":"Crossref","is-referenced-by-count":0,"title":["Identifying family structures from obituaries and matching them to patients in an electronic heath record"],"prefix":"10.1093","volume":"32","author":[{"given":"John","family":"Mayer","sequence":"first","affiliation":[{"name":"Office of Research Computing and Analytics, Marshfield Clinic Research Institute , Marshfield, WI 54449,","place":["United States"]}]},{"given":"Brooke","family":"Delgoffe","sequence":"additional","affiliation":[{"name":"Office of Research Computing and Analytics, Marshfield Clinic Research Institute , Marshfield, WI 54449,","place":["United States"]}]},{"given":"Scott","family":"Hebbring","sequence":"additional","affiliation":[{"name":"Center for Precision Medicine Research, Marshfield Clinic Research Institute , Marshfield, WI 54449,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,6,27]]},"reference":[{"key":"2025081903513713400_ocaf102-B1","doi-asserted-by":"crossref","first-page":"1692","DOI":"10.1016\/j.cell.2018.04.032","article-title":"Disease heritability inferred from familial relationships reported in medical records","volume":"173","author":"Polubriaginof","year":"2018","journal-title":"Cell"},{"key":"2025081903513713400_ocaf102-B2","first-page":"221","article-title":"Automated family histories significantly improve risk prediction in an EHR","author":"Huang","year":"2024","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2025081903513713400_ocaf102-B3","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1038\/ng.3285","article-title":"Meta-analysis of the heritability of human traits based on fifty years of twin studies","volume":"47","author":"Posthuma","year":"2015","journal-title":"Nat Genet"},{"key":"2025081903513713400_ocaf102-B4","doi-asserted-by":"crossref","first-page":"342","DOI":"10.3802\/jgo.2014.25.4.342","article-title":"Completeness of pedigree and family cancer history for ovarian cancer patients","volume":"25","author":"Son","year":"2014","journal-title":"J Gynecol Oncol"},{"key":"2025081903513713400_ocaf102-B5","doi-asserted-by":"publisher","first-page":"1242","DOI":"10.1161\/01","article-title":"Ethical and methodological issues in pedigree stroke research","volume":"32","author":"Worrall","year":"2001","journal-title":"Stroke"},{"key":"2025081903513713400_ocaf102-B6","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1093\/bioinformatics\/btx569","article-title":"Applying family analyses to electronic health records to facilitate genetic research","volume":"34","author":"Huang","year":"2018","journal-title":"Bioinformatics"},{"key":"2025081903513713400_ocaf102-B7","doi-asserted-by":"crossref","first-page":"3966","DOI":"10.1093\/bioinformatics\/btab419","article-title":"E-Pedigrees: a large-scale automatic family pedigree prediction application","volume":"37","author":"Huang","year":"2021","journal-title":"Bioinformatics"},{"key":"2025081903513713400_ocaf102-B8","doi-asserted-by":"publisher","first-page":"e25670","DOI":"10.2196\/25670","article-title":"Construction of genealogical knowledge graphs from obituaries: multitask neural network extraction system","volume":"23","author":"He","year":"2021","journal-title":"J Med Internet Res"},{"key":"2025081903513713400_ocaf102-B9","doi-asserted-by":"crossref","first-page":"1817","DOI":"10.1007\/s10115-022-01687-4","article-title":"CustRE: a rule based system for family relations extraction from English text","volume":"64","author":"Mumtaz","year":"2022","journal-title":"Knowl Inf Syst"},{"key":"2025081903513713400_ocaf102-B10","doi-asserted-by":"publisher","first-page":"e30153","DOI":"10.2196\/24020","article-title":"Extracting family history information from electronic health records: natural language processing analysis","volume":"9","author":"Rybinski","year":"2021","journal-title":"JMIR Med Inform"},{"key":"2025081903513713400_ocaf102-B11","author":"Ziedan","year":"2022"},{"key":"2025081903513713400_ocaf102-B12","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/1472-6947-13-114","article-title":"Evaluating the risk of patient re-identification","volume":"13","author":"Emam","year":"2013","journal-title":"BMC Med Inform Decis Mak"},{"key":"2025081903513713400_ocaf102-B13","author":"NewsBank","year":"2022"},{"key":"2025081903513713400_ocaf102-B14","first-page":"363","author":"Finkel","year":"2005"},{"key":"2025081903513713400_ocaf102-B15","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1517\/17410541.2.1.49","article-title":"Marshfield clinic personalized medicine research project (PMRP): design, methods and recruitment for a large population-based biobank","volume":"2","author":"McCarty","year":"2005","journal-title":"Per Med"},{"key":"2025081903513713400_ocaf102-B16","first-page":"100201","article-title":"Genetic and clinical determinants of telomere length","volume":"4","author":"Allaire","year":"2023","journal-title":"HGG Adv"},{"key":"2025081903513713400_ocaf102-B17","doi-asserted-by":"crossref","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","article-title":"The variant call format and VCFtools","volume":"27","author":"Petr","year":"2011","journal-title":"Bioinformatics"},{"key":"2025081903513713400_ocaf102-B18","first-page":"456","article-title":"A review of the \u2018Statistical Analysis for Genetic Epidemiology\u2019 (SAGE) software package","volume-title":"Hum Genomics","author":"Elston","year":"2024"},{"key":"2025081903513713400_ocaf102-B19","author":"Safer v. Estate of Pack."},{"key":"2025081903513713400_ocaf102-B20","first-page":"278"},{"key":"2025081903513713400_ocaf102-B21","first-page":"444"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/9\/1407\/63606522\/ocaf102.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/9\/1407\/63606522\/ocaf102.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,19]],"date-time":"2025-08-19T07:51:54Z","timestamp":1755589914000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/32\/9\/1407\/8176557"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,27]]},"references-count":21,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,6,27]]},"published-print":{"date-parts":[[2025,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaf102","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.11.26.625445","asserted-by":"object"}]},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,9]]},"published":{"date-parts":[[2025,6,27]]}}}