{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T00:40:01Z","timestamp":1746146401784,"version":"3.40.4"},"reference-count":3,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Info. Know. Mgmt."],"published-print":{"date-parts":[[2014,3]]},"abstract":"<jats:p>The article describes a solution to process large volumes of unstructured health social media data in a scalable fashion using the MapReduce framework. Our work is in the context of health informatics applications involving complex text and language processing as well as large resources such as ontologies, due to which the text processing of a single unit of text takes time. Even with a throughput of an order processing time of one second per unit, it takes over a week to process a million units, which is unacceptable. We present a solution where we take the processing to a MapReduce framework and achieve significant improvement in processing performance by dividing the processing across a cluster of processors. This paper describes the technical details of our work in terms of the design, modeling, and implementation of such an approach. 
We also present experimental results demonstrating the effectiveness of our approach.<\/jats:p>","DOI":"10.1142\/s0219649214500099","type":"journal-article","created":{"date-parts":[[2014,3,6]],"date-time":"2014-03-06T05:56:12Z","timestamp":1394085372000},"page":"1450009","source":"Crossref","is-referenced-by-count":1,"title":["Large Scale, Complex Processing of Health Data with MapReduce"],"prefix":"10.1142","volume":"13","author":[{"given":"Khanh Luan P.","family":"Nguyen","sequence":"first","affiliation":[{"name":"Cognie Inc., 365 San Juan Place, Pasadena, CA 91107, USA"}]},{"given":"Naveen","family":"Ashish","sequence":"additional","affiliation":[{"name":"Cognie Inc., 365 San Juan Place, Pasadena, CA 91107, USA"}]}],"member":"219","published-online":{"date-parts":[[2014,3,5]]},"reference":[{"key":"rf3","first-page":"2533","volume":"11","author":"Bouckaert R. R.","journal-title":"Journal of Machine Learning Research"},{"key":"rf5","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","volume":"51","author":"Dean J.","journal-title":"Communications of the ACM"},{"key":"rf9","first-page":"998","volume":"22","author":"Qiu X.","journal-title":"IEEE Transactions on Parallel and Distributed Systems"}],"container-title":["Journal of Information & Knowledge 
Management"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0219649214500099","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T00:19:12Z","timestamp":1746145152000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0219649214500099"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,3]]},"references-count":3,"journal-issue":{"issue":"01","published-online":{"date-parts":[[2014,3,5]]},"published-print":{"date-parts":[[2014,3]]}},"alternative-id":["10.1142\/S0219649214500099"],"URL":"https:\/\/doi.org\/10.1142\/s0219649214500099","relation":{},"ISSN":["0219-6492","1793-6926"],"issn-type":[{"type":"print","value":"0219-6492"},{"type":"electronic","value":"1793-6926"}],"subject":[],"published":{"date-parts":[[2014,3]]}}}