{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T05:24:10Z","timestamp":1740201850634,"version":"3.37.3"},"reference-count":0,"publisher":"IOS Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015]]},"abstract":"<jats:p>Authors evaluated supervised automatic classification algorithms for determination of health related web-page compliance with individual HONcode criteria of conduct using varying length character n-gram vectors to represent healthcare web page documents. The training\/testing collection comprised web page fragments extracted by HONcode experts during the manual certification process. The authors compared automated classification performance of n-gram tokenization to the automated classification performance of document words and Porter-stemmed document words using a Naive Bayes classifier and DF (document frequency) dimensionality reduction metrics. The study attempted to determine whether the automated, language-independent approach might safely replace word-based classification. Using 5-grams as document features, authors also compared the baseline DF reduction function to Chi-square and Z-score dimensionality reductions. 
Overall study results indicate that n-gram tokenization provided a potentially viable alternative to document word stemming.<\/jats:p>","DOI":"10.3233\/978-1-61499-564-7-1064","type":"book-chapter","created":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T09:02:47Z","timestamp":1740128567000},"source":"Crossref","is-referenced-by-count":0,"title":["Automated Detection of Health Websites' HONcode Conformity: Can N-gram Tokenization Replace Stemming?"],"prefix":"10.3233","author":[{"family":"Boyer Célia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Dolamic Ljiljana","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Grabar Natalia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"7437","container-title":["Studies in Health Technology and Informatics","MEDINFO 2015: eHealth-enabled Health"],"original-title":[],"deposited":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T09:28:18Z","timestamp":1740130098000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.medra.org\/servlet\/aliasResolver?alias=iospressISBN&isbn=978-1-61499-563-0&spage=1064&doi=10.3233\/978-1-61499-564-7-1064"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015]]},"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/978-1-61499-564-7-1064","relation":{},"ISSN":["0926-9630"],"issn-type":[{"value":"0926-9630","type":"print"}],"subject":[],"published":{"date-parts":[[2015]]}}}