{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:21:22Z","timestamp":1750306882277,"version":"3.41.0"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2013,5,1]],"date-time":"2013-05-01T00:00:00Z","timestamp":1367366400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2013,5]]},"abstract":"<jats:p>\n            Named Entity Recognition (NER) is a fundamental task in information extraction from unstructured text. Most previous machine-learning-based NER systems are domain-specific, which implies that they may only perform well on some specific domains (e.g.,\n            <jats:italic>Newswire<\/jats:italic>\n            ) but tend to adapt poorly to other related but different domains (e.g.,\n            <jats:italic>Weblog<\/jats:italic>\n            ). Recently, transfer learning techniques have been proposed to NER. However, most transfer learning approaches to NER are developed for binary classification, while NER is a multiclass classification problem in nature. Therefore, one has to first reduce the NER task to multiple binary classification tasks and solve them independently. In this article, we propose a new transfer learning method, named\n            <jats:italic>Transfer Joint Embedding<\/jats:italic>\n            (TJE), for cross-domain multiclass classification, which can fully exploit the relationships between classes (labels), and reduce domain difference in data distributions for transfer learning. 
More specifically, we aim to embed both labels (outputs) and high-dimensional features (inputs) from different domains (e.g., a source domain and a target domain) into a unified low-dimensional latent space, where 1) each label is represented by a prototype and the intrinsic relationships between labels can be measured by Euclidean distance; 2) the\n            <jats:italic>distance<\/jats:italic>\n            in data distributions between the source and target domains can be reduced; 3) the source domain labeled data are closer to their corresponding label-prototypes than others. After the latent space is learned, classification on the target domain data can be done with the simple nearest neighbor rule in the latent space. Furthermore, in order to scale up TJE, we propose an efficient algorithm based on stochastic gradient descent (SGD). Finally, we apply the proposed TJE method for NER across different domains on the ACE 2005 dataset, which is a benchmark in Natural Language Processing (NLP). 
Experimental results demonstrate the effectiveness of TJE and show that TJE can outperform state-of-the-art transfer learning approaches to NER.\n          <\/jats:p>","DOI":"10.1145\/2457465.2457467","type":"journal-article","created":{"date-parts":[[2013,5,21]],"date-time":"2013-05-21T12:33:56Z","timestamp":1369139636000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Transfer joint embedding for cross-domain named entity recognition"],"prefix":"10.1145","volume":"31","author":[{"given":"Sinno Jialin","family":"Pan","sequence":"first","affiliation":[{"name":"Institute for Infocomm Research, Singapore"}]},{"given":"Zhiqiang","family":"Toh","sequence":"additional","affiliation":[{"name":"Institute for Infocomm Research, Singapore"}]},{"given":"Jian","family":"Su","sequence":"additional","affiliation":[{"name":"Institute for Infocomm Research, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2013,5,17]]},"reference":[{"volume-title":"Proceedings of the 7th Message Understanding Conference.","author":"Aone C.","key":"e_1_2_1_1_1","unstructured":"Aone , C. , Halverson , L. , Hampton , T. , and Ramos-Santacruz , M . 1998. SRA: Description of the IE2 system used for MUC-7 . In Proceedings of the 7th Message Understanding Conference. Aone, C., Halverson, L., Hampton, T., and Ramos-Santacruz, M. 1998. SRA: Description of the IE2 system used for MUC-7. In Proceedings of the 7th Message Understanding Conference."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976603321780317"},{"key":"e_1_2_1_3_1","volume-title":"Advances in Neural Information Processing Systems","volume":"19","author":"Ben-David S.","unstructured":"Ben-David , S. , Blitzer , J. , Crammer , K. , and Pereira , F . 2007. Analysis of representations for domain adaptation . In Advances in Neural Information Processing Systems , vol. 19 , MIT Press, Cambridge, MA, 137--144. 
Ben-David, S., Blitzer, J., Crammer, K., and Pereira, F. 2007. Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems, vol. 19, MIT Press, Cambridge, MA, 137--144."},{"key":"e_1_2_1_4_1","first-page":"163","article-title":"Label embedding trees for large multi-class tasks","volume":"23","author":"Bengio S.","year":"2010","unstructured":"Bengio , S. , Weston , J. , and Grangier , D. 2010 . Label embedding trees for large multi-class tasks . In Advances in Neural Information Processing Systems , vol. 23 , 163 -- 171 . Bengio, S., Weston, J., and Grangier, D. 2010. Label embedding trees for large multi-class tasks. In Advances in Neural Information Processing Systems, vol. 23, 163--171.","journal-title":"Advances in Neural Information Processing Systems"},{"volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 432--439","author":"Blitzer J.","key":"e_1_2_1_5_1","unstructured":"Blitzer , J. , Dredze , M. , and Pereira , F . 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification . In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 432--439 . Blitzer, J., Dredze, M., and Pereira, F. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 432--439."},{"volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language. 120--128","author":"Blitzer J.","key":"e_1_2_1_6_1","unstructured":"Blitzer , J. , McDonald , R. , and Pereira , F . 2006. Domain adaptation with structural correspondence learning . In Proceedings of the Conference on Empirical Methods in Natural Language. 120--128 . Blitzer, J., McDonald, R., and Pereira, F. 2006. Domain adaptation with structural correspondence learning. 
In Proceedings of the Conference on Empirical Methods in Natural Language. 120--128."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622407.1622416"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557045"},{"volume-title":"Proceedings of the NIPS Workshop on Advances in Structured Learning for Text and Speech Processing.","author":"Ciaramita M.","key":"e_1_2_1_9_1","unstructured":"Ciaramita , M. and Altun , Y . 2005. Named-entity recognition in novel domains with external lexical knowledge . In Proceedings of the NIPS Workshop on Advances in Structured Learning for Text and Speech Processing. Ciaramita, M. and Altun, Y. 2005. Named-entity recognition in novel domains with external lexical knowledge. In Proceedings of the NIPS Workshop on Advances in Structured Learning for Text and Speech Processing."},{"key":"e_1_2_1_10_1","unstructured":"Cox T. and Cox M. 1994. Multidimensional Scaling. Chapman & Hall London.  Cox T. and Cox M. 1994. Multidimensional Scaling. Chapman & Hall London."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/944790.944813"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 256--263","author":"Daum\u00e9 III, H","year":"2007","unstructured":"Daum\u00e9 III, H . 2007 . Frustratingly easy domain adaptation . In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 256--263 . Daum\u00e9 III, H. 2007. Frustratingly easy domain adaptation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. 
ACL, 256--263."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/345508.345593"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442794"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219885"},{"volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 141--150","author":"Finkel J. R.","key":"e_1_2_1_16_1","unstructured":"Finkel , J. R. and Manning , C. D . 2009. Nested named entity recognition . In Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 141--150 . Finkel, J. R. and Manning, C. D. 2009. Nested named entity recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 141--150."},{"volume-title":"Proceedings of the 28th International Conference on Machine Learning. 513--520","author":"Glorot X.","key":"e_1_2_1_17_1","unstructured":"Glorot , X. , Bordes , A. , and Bengio , Y . 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach . In Proceedings of the 28th International Conference on Machine Learning. 513--520 . Glorot, X., Bordes, A., and Bengio, Y. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning. 513--520."},{"volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems. MIT Press","author":"Gretton A.","key":"e_1_2_1_18_1","unstructured":"Gretton , A. , Borgwardt , K. M. , Rasch , M. , Sch\u00f6lkopf , B. , and Smola , A . 2007. A kernel method for the two-sample problem . In Proceedings of the Annual Conference on Neural Information Processing Systems. MIT Press , Cambridge, MA, 513--520. Gretton, A., Borgwardt, K. M., Rasch, M., Sch\u00f6lkopf, B., and Smola, A. 2007. A kernel method for the two-sample problem. 
In Proceedings of the Annual Conference on Neural Information Processing Systems. MIT Press, Cambridge, MA, 513--520."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/11564089_7"},{"volume-title":"Proceedings of the 7th Message Understanding Conference.","author":"Humphreys K.","key":"e_1_2_1_20_1","unstructured":"Humphreys , K. , Gaizauskas , R. , Azzam , S. , Huyck , C. , Mitchell , B. , Cunningham , H. , and Wilks , Y . 1998. Description of the University of Sheffield LaSIE-II system as used for MUC-7 . In Proceedings of the 7th Message Understanding Conference. Humphreys, K., Gaizauskas, R., Azzam, S., Huyck, C., Mitchell, B., Cunningham, H., and Wilks, Y. 1998. Description of the University of Sheffield LaSIE-II system as used for MUC-7. In Proceedings of the 7th Message Understanding Conference."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.3115\/1072228.1072282"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220835.1220845"},{"volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 264--271","author":"Jiang J.","key":"e_1_2_1_23_1","unstructured":"Jiang , J. and Zhai , C . 2007. Instance weighting for domain adaptation in NLP . In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 264--271 . Jiang, J. and Zhai, C. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. ACL, 264--271."},{"volume-title":"Proceedings of 7th Message Understanding Conference.","author":"Krupka G. R.","key":"e_1_2_1_24_1","unstructured":"Krupka , G. R. and Hausman , K . 1998. Isoquest inc.: Description of the NetOwlTM extractor system as used for MUC-7 . In Proceedings of 7th Message Understanding Conference. Krupka, G. R. and Hausman, K. 1998. Isoquest inc.: Description of the NetOwlTM extractor system as used for MUC-7. 
In Proceedings of 7th Message Understanding Conference."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/502115.502117"},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Manning C. D. Raghavan P. and Sch\u00fctze H. 2008. Introduction to Information Retrieval. Cambridge University Press New York NY.   Manning C. D. Raghavan P. and Sch\u00fctze H. 2008. Introduction to Information Retrieval. Cambridge University Press New York NY.","DOI":"10.1017\/CBO9780511809071"},{"volume-title":"Proceedings of the 7th Message Understanding Conference.","author":"Mikheev A.","key":"e_1_2_1_27_1","unstructured":"Mikheev , A. , Grover , C. , and Moens , M . 1998. Description of the LTG system used for MUC-7 . In Proceedings of the 7th Message Understanding Conference. Mikheev, A., Grover, C., and Moens, M. 1998. Description of the LTG system used for MUC-7. In Proceedings of the 7th Message Understanding Conference."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.3115\/977035.977037"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1075\/li.30.1.03nad"},{"volume-title":"Proceedings of the 23rd AAAI Conference on Artificial Intelligence. 677--682","author":"Pan S. J.","key":"e_1_2_1_30_1","unstructured":"Pan , S. J. , Kwok , J. T. , and Yang , Q . 2008. Transfer learning via dimensionality reduction . In Proceedings of the 23rd AAAI Conference on Artificial Intelligence. 677--682 . Pan, S. J., Kwok, J. T., and Yang, Q. 2008. Transfer learning via dimensionality reduction. In Proceedings of the 23rd AAAI Conference on Artificial Intelligence. 677--682."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772767"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2010.2091281"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Qui\u00f1onero-Candela J. Sugiyama M. Schwaighofer A. 
and Lawrence N. D. 2009. Dataset Shift in Machine Learning. MIT Press.   Qui\u00f1onero-Candela J. Sugiyama M. Schwaighofer A. and Lawrence N. D. 2009. Dataset Shift in Machine Learning. MIT Press.","DOI":"10.7551\/mitpress\/9780262170055.001.0001"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASSP.1986.1165342"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1462198.1462204"},{"volume-title":"Proceedings of the 3rd International Conference on Language Resources and Evaluation. 1818--1824","author":"Sekine S.","key":"e_1_2_1_37_1","unstructured":"Sekine , S. , Sudo , K. , and Nobata , C . 2002. Extended named entity hierarchy . In Proceedings of the 3rd International Conference on Language Resources and Evaluation. 1818--1824 . Sekine, S., Sudo, K., and Nobata, C. 2002. Extended named entity hierarchy. In Proceedings of the 3rd International Conference on Language Resources and Evaluation. 1818--1824."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-75225-7_5"},{"key":"e_1_2_1_40_1","first-page":"1737","article-title":"Large margin taxonomy embedding for document categorization","volume":"21","author":"Weinberger K. Q.","year":"2009","unstructured":"Weinberger , K. Q. and Chapelle , O. 2009 . Large margin taxonomy embedding for document categorization . In Advances in Neural Information Processing Systems , vol. 21 , 1737 -- 1744 . Weinberger, K. Q. and Chapelle, O. 2009. Large margin taxonomy embedding for document categorization. In Advances in Neural Information Processing Systems, vol. 21, 1737--1744.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458102"},{"volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 1523--1532","author":"Wu D.","key":"e_1_2_1_42_1","unstructured":"Wu , D. , Lee , W. S. , Ye , N. , and Chieu , H. L . 2009. 
Domain adaptive bootstrapping for named entity recognition . In Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 1523--1532 . Wu, D., Lee, W. S., Ye, N., and Chieu, H. L. 2009. Domain adaptive bootstrapping for named entity recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. ACL, 1523--1532."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015332"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073163"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2457465.2457467","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2457465.2457467","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:18:36Z","timestamp":1750234716000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2457465.2457467"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,5]]},"references-count":43,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2013,5]]}},"alternative-id":["10.1145\/2457465.2457467"],"URL":"https:\/\/doi.org\/10.1145\/2457465.2457467","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2013,5]]},"assertion":[{"value":"2011-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2013-05-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}