{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,17]],"date-time":"2025-05-17T04:04:33Z","timestamp":1747454673039,"version":"3.40.5"},"reference-count":0,"publisher":"IOS Press","isbn-type":[{"value":"9781643685960","type":"electronic"}],"license":[{"start":{"date-parts":[[2025,5,15]],"date-time":"2025-05-15T00:00:00Z","timestamp":1747267200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,5,15]]},"abstract":"<jats:p>This work introduces a web application for extracting, processing, and visualizing data from sleep studies reports. Using Optical Character Recognition (OCR) and Natural Language Processing (NLP), the pipeline extracts over 75 key data points from four types of sleep reports. The web application offers an intuitive interface to view individual reports\u2019 details and aggregate data from multiple reports. The pipeline demonstrated 100% accuracy in extracting targeted information from a test set of 40 reports, even in cases with missing data or formatting inconsistencies. The developed tool streamlines the analysis of OSA reports, reducing the need for technical expertise and enabling healthcare providers and researchers to utilize sleep study data efficiently. Future work aims to expand the dataset for more complex analyses and imputation techniques.<\/jats:p>","DOI":"10.3233\/shti250498","type":"book-chapter","created":{"date-parts":[[2025,5,16]],"date-time":"2025-05-16T08:58:38Z","timestamp":1747385918000},"source":"Crossref","is-referenced-by-count":0,"title":["Automating Data Extraction from PDF Sleep Reports Using Data Mining Techniques"],"prefix":"10.3233","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-0920-3950","authenticated-orcid":false,"given":"F\u00e1bio","family":"Teixeira","sequence":"first","affiliation":[{"name":"University of Porto, Portugal"},{"name":"INESC TEC, Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-2819-0331","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Costa","sequence":"additional","affiliation":[{"name":"University of Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0408-4881","authenticated-orcid":false,"given":"Pedro","family":"Amorim","sequence":"additional","affiliation":[{"name":"University of Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2854-2891","authenticated-orcid":false,"given":"Nuno","family":"Guimar\u00e3es","sequence":"additional","affiliation":[{"name":"University of Porto, Portugal"},{"name":"INESC TEC, Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0390-9944","authenticated-orcid":false,"given":"Daniela","family":"Ferreira-Santos","sequence":"additional","affiliation":[{"name":"University of Porto, Portugal"},{"name":"INESC TEC, Porto, Portugal"}]}],"member":"7437","container-title":["Studies in Health Technology and Informatics","Intelligent Health Systems \u2013 From Technology to Data and Knowledge"],"original-title":[],"link":[{"URL":"https:\/\/ebooks.iospress.nl\/pdf\/doi\/10.3233\/SHTI250498","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,16]],"date-time":"2025-05-16T08:58:38Z","timestamp":1747385918000},"score":1,"resource":{"primary":{"URL":"https:\/\/ebooks.iospress.nl\/doi\/10.3233\/SHTI250498"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,15]]},"ISBN":["9781643685960"],"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/shti250498","relation":{},"ISSN":["0926-9630","1879-8365"],"issn-type":[{"value":"0926-9630","type":"print"},{"value":"1879-8365","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,15]]}}}