{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:28:26Z","timestamp":1777854506376,"version":"3.51.4"},"reference-count":30,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2015,7,3]],"date-time":"2015-07-03T00:00:00Z","timestamp":1435881600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2016,4]]},"abstract":"<jats:p>Emails are the most popular and effective way of communicating over the internet. A number of applications are available today for computers and mobile devices for email messaging. Email messaging is constantly getting more popular and, as a result, numbers of sent and received emails are also increasing. It is very difficult for a user to remember emails and relate newer incoming emails to previous communications made on similar topics. Email threads provide a mechanism using which a user can obtain sequences of emails for a particular set of communication in a time frame and provides a number of benefits to users. In this work two email thread identification algorithms based on a nested textual clustering approach are presented. The work is planned in two stages; in the first stage two popular text clustering approaches, latent Dirichlet allocation and non-negative matrix factorization, are applied over the email messages to form the email clusters. Then in the second stage, clustering is again performed over the created email clusters to identify the email threads using threading features. Performance parameters like accuracy, precision, recall and F-measure are evaluated for the presented thread identification algorithms.<\/jats:p>","DOI":"10.1177\/0165551515587854","type":"journal-article","created":{"date-parts":[[2015,7,3]],"date-time":"2015-07-03T22:01:13Z","timestamp":1435960873000},"page":"200-212","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":24,"title":["Email thread identification using latent Dirichlet allocation and non-negative matrix factorization based clustering techniques"],"prefix":"10.1177","volume":"42","author":[{"given":"Aakanksha","family":"Sharaff","sequence":"first","affiliation":[{"name":"Department of Computer Science & Engineering, National Institute of Technology, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Naresh Kumar","family":"Nagwani","sequence":"additional","affiliation":[{"name":"Department of Computer Science & Engineering, National Institute of Technology, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2015,7,3]]},"reference":[{"key":"bibr1-0165551515587854","unstructured":"Moore C. You are what you email @your inbox. Cranfield Healthcare Management Group, Research Briefing, School of Management, Cranfield University, June 2011, pp. 1\u20134."},{"key":"bibr2-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148180"},{"key":"bibr3-0165551515587854","doi-asserted-by":"publisher","DOI":"10.4304\/jcp.3.10.86-93"},{"key":"bibr4-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1561\/1500000006"},{"key":"bibr5-0165551515587854","first-page":"207","volume":"17","author":"Balali A","year":"2013","journal-title":"Computaciony Sistemas"},{"key":"bibr6-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1109\/ASONAM.2012.195"},{"key":"bibr7-0165551515587854","first-page":"1284","volume":"4","author":"Joshi S","year":"2011","journal-title":"Very Large Databases (VLDB)"},{"key":"bibr8-0165551515587854","volume-title":"Recent advances in natural language processing (RANLP)","author":"Ani N","year":"2003"},{"key":"bibr9-0165551515587854","doi-asserted-by":"publisher","DOI":"10.3115\/1613984.1614011"},{"key":"bibr10-0165551515587854","doi-asserted-by":"publisher","DOI":"10.3115\/1220355.1220434"},{"key":"bibr11-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1109\/INFVIS.2003.1249028"},{"key":"bibr12-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1177\/0165551513494638"},{"key":"bibr13-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1109\/ETTandGRS.2008.321"},{"key":"bibr14-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2007.09.007"},{"key":"bibr15-0165551515587854","first-page":"85","volume-title":"Proceedings of the 7th international natural language generation conference (INLG 2012)","author":"Duboue PA","year":"2012"},{"key":"bibr16-0165551515587854","first-page":"1","volume-title":"Proceedings of the third conference on email and anti-spam (CEAS 2006)","author":"Yeh Y","year":"2006"},{"key":"bibr17-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1145\/2441776.2441922"},{"key":"bibr18-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.212"},{"key":"bibr19-0165551515587854","first-page":"388","volume-title":"Proceedings of the 2010 conference on empirical methods in natural language processing","author":"Joty S","year":"2010"},{"key":"bibr20-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.04.048"},{"key":"bibr21-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1155\/2014\/479746"},{"key":"bibr22-0165551515587854","first-page":"1807","volume-title":"Proceedings of the twenty-second international joint conference on artificial intelligence","author":"Joty S","year":"2011"},{"key":"bibr23-0165551515587854","first-page":"993","volume":"3","author":"Blei DM","year":"2003","journal-title":"The Journal of Machine Learning Research"},{"key":"bibr24-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1038\/44565"},{"key":"bibr25-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.11.005"},{"key":"bibr26-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860485"},{"key":"bibr27-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1145\/1216295.1216331"},{"key":"bibr28-0165551515587854","unstructured":"McCallum K. Mallet: Machine learning for language toolkit, http:\/\/mallet.cs.umass.edu\/ (2002, accessed February 2014)."},{"key":"bibr29-0165551515587854","unstructured":"Witten H, Frank E, Trigg LE, Weka: Practical machine learning tools and techniques with Java implementations, 1999."},{"key":"bibr30-0165551515587854","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30115-8_22"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515587854","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551515587854","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515587854","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:12Z","timestamp":1777504152000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551515587854"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,7,3]]},"references-count":30,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,4]]}},"alternative-id":["10.1177\/0165551515587854"],"URL":"https:\/\/doi.org\/10.1177\/0165551515587854","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,7,3]]}}}