{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T07:11:34Z","timestamp":1767856294842,"version":"3.49.0"},"reference-count":19,"publisher":"MIT Press","issue":"1","content-domain":{"domain":["www.mitpressjournals.org"],"crossmark-restriction":true},"short-container-title":["Quantitative Science Studies"],"published-print":{"date-parts":[[2020,2]]},"abstract":"<jats:p>Only scarce information is available on doctorate recipients\u2019 career outcomes ( BuWiN, 2013 ). With the current information base, graduate students cannot make an informed decision on whether to start a doctorate or not ( Benderly, 2018 ; Blank et al., 2017 ). However, administrative labor market data, which could provide the necessary information, are incomplete in this respect. In this paper, we describe the record linkage of two data sets to close this information gap: data on doctorate recipients collected in the catalog of the German National Library (DNB), and the German labor market biographies (IEB) from the German Institute of Employment Research. We use a machine learning-based methodology, which (a) improves the record linkage of data sets without unique identifiers, and (b) evaluates the quality of the record linkage. The machine learning algorithms are trained on a synthetic training and evaluation data set. In an exemplary analysis, we compare the evolution of the employment status of female and male doctorate recipients in Germany.<\/jats:p>","DOI":"10.1162\/qss_a_00001","type":"journal-article","created":{"date-parts":[[2019,8,29]],"date-time":"2019-08-29T18:16:35Z","timestamp":1567102595000},"page":"94-116","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":10,"title":["A supervised machine learning approach to trace doctorate recipients\u2019 employment trajectories"],"prefix":"10.1162","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0797-9314","authenticated-orcid":true,"given":"Dominik P.","family":"Heinisch","sequence":"first","affiliation":[{"name":"University of Kassel, Institute of Economics and INCHER-Kassel (Germany)"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9610-1222","authenticated-orcid":true,"given":"Johannes","family":"Koenig","sequence":"additional","affiliation":[{"name":"University of Kassel, Institute of Economics and INCHER-Kassel (Germany)"},{"name":"Institute of Employment Research (IAB) Rhineland-Palatinate-Saarland (Germany)"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3563-3206","authenticated-orcid":true,"given":"Anne","family":"Otto","sequence":"additional","affiliation":[{"name":"Institute of Employment Research (IAB) Rhineland-Palatinate-Saarland (Germany)"}]}],"member":"281","reference":[{"issue":"1","key":"bib1","doi-asserted-by":"crossref","first-page":"141","DOI":"10.3790\/schm.132.1.141","volume":"132","author":"Antoni M.","year":"2012","journal-title":"Schmollers Jahrbuch"},{"key":"bib3","author":"Benderly B. L.","year":"2018","journal-title":"Science"},{"key":"bib4","volume-title":"Pattern recognition and machine learning","author":"Bishop C. M.","year":"2006"},{"issue":"6369","key":"bib5","doi-asserted-by":"crossref","first-page":"1388","DOI":"10.1126\/science.aar4638","volume":"358","author":"Blank R.","year":"2017","journal-title":"Science"},{"issue":"2","key":"bib6","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1515\/jbnst-2014-2-305","volume":"234","author":"Buenstorf G.","year":"2014","journal-title":"Jahrb\u00fccher f\u00fcr National\u00f6konomie und Statistik"},{"key":"bib7","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-31164-2","volume-title":"Data matching: Concepts and techniques for record linkage, entity resolution, and duplicate detection","author":"Christen P.","year":"2012"},{"issue":"9","key":"bib8","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1109\/TKDE.2011.127","volume":"24","author":"Christen P.","year":"2012","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"issue":"2","key":"bib9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v017.i02","volume":"17","author":"Culp M.","year":"2006","journal-title":"Journal of Statistical Software"},{"key":"bib10","unstructured":"Deutsche Nationalbibliothek (DNB). (2018, November 13). The German National Library in brief. Retrieved from http:\/\/www.dnb.de\/EN\/Wir\/ueberblick"},{"issue":"4","key":"bib12","doi-asserted-by":"crossref","first-page":"599","DOI":"10.3790\/schm.130.4.599","volume":"130","author":"Dorner M.","year":"2010","journal-title":"Schmollers Jahrbuch"},{"issue":"1","key":"bib14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","volume":"33","author":"Friedman J.","year":"2010","journal-title":"Journal of Statistical Software"},{"key":"bib15","volume-title":"An Introduction to Statistical Learning","author":"Gareth J.","year":"2013"},{"issue":"1","key":"bib16","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s11192-018-2840-5","volume":"117","author":"Heinisch D. P.","year":"2018","journal-title":"Scientometrics"},{"key":"bib18","doi-asserted-by":"crossref","DOI":"10.3278\/6004283w","volume-title":"Bundesbericht Wissenschaftlicher Nachwuchs 2013","author":"Konsortium Bundesbericht Wissenschaftlicher Nachwuchs (BuWiN)","year":"2013"},{"key":"bib19","doi-asserted-by":"crossref","DOI":"10.3278\/6004603w","volume-title":"Bundesbericht Wissenschaftlicher Nachwuchs 2017","author":"Konsortium Bundesbericht Wissenschaftlicher Nachwuchs (BuWiN)","year":"2017"},{"issue":"3","key":"bib20","first-page":"18","volume":"2","author":"Liaw A.","year":"2002","journal-title":"R News"},{"key":"bib22","volume-title":"Education at a Glance 2018: OECD Indicators","author":"Organisation for Economic Co-operation and Development (OECD)","year":"2018"},{"key":"bib23","volume-title":"R: A Language and Environment for Statistical Computing","author":"R Core Team","year":"2017"},{"issue":"1","key":"bib24","first-page":"125","volume":"33","author":"Schnell R.","year":"2004","journal-title":"Austrian Journal of Statistics"}],"container-title":["Quantitative Science Studies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/qss_a_00001","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T16:54:53Z","timestamp":1695142493000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/qss\/article\/1\/1\/94-116\/15558"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2]]},"references-count":19,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,2]]}},"alternative-id":["10.1162\/qss_a_00001"],"URL":"https:\/\/doi.org\/10.1162\/qss_a_00001","relation":{},"ISSN":["2641-3337"],"issn-type":[{"value":"2641-3337","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,2]]},"assertion":[{"value":"2019-04-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-08-03","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-02-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}