{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:39:00Z","timestamp":1750307940900,"version":"3.41.0"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2006,12,1]],"date-time":"2006-12-01T00:00:00Z","timestamp":1164931200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Transactions on Asian Language Information Processing"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:p>Discriminative sequential learning models like Conditional Random Fields (CRFs) have achieved significant success in several areas such as natural language processing or information extraction. Their key advantage is the ability to capture various nonindependent and overlapping features of inputs. However, several unexpected pitfalls have a negative influence on the model's performance; these mainly come from a high imbalance among classes, irregular phenomena, and potential ambiguity in the training data. This article presents a data-driven approach that can deal with such difficult data instances by discovering and emphasizing important conjunctions or associations of statistics hidden in the training data. Discovered associations are then incorporated into these models to deal with difficult data instances. Experimental results of phrase-chunking and named entity recognition using CRFs show a significant improvement in accuracy. In addition to the technical perspective, our approach also highlights a potential connection between association mining and statistical learning by offering an alternative strategy to enhance learning performance with interesting and useful patterns discovered from large datasets.<\/jats:p>","DOI":"10.1145\/1236181.1236187","type":"journal-article","created":{"date-parts":[[2007,4,9]],"date-time":"2007-04-09T19:07:12Z","timestamp":1176145632000},"page":"413-438","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Improving discriminative sequential learning by discovering important association of statistics"],"prefix":"10.1145","volume":"5","author":[{"given":"Xuan-Hieu","family":"Phan","sequence":"first","affiliation":[{"name":"Japan Advanced Institute of Science and Technology, Nomi, Ishikawa"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Le-Minh","family":"Nguyen","sequence":"additional","affiliation":[{"name":"Japan Advanced Institute of Science and Technology, Nomi, Ishikawa"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yasushi","family":"Inoguchi","sequence":"additional","affiliation":[{"name":"Japan Advanced Institute of Science and Technology, Nomi, Ishikawa"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tu-Bao","family":"Ho","sequence":"additional","affiliation":[{"name":"Japan Advanced Institute of Science and Technology, Nomi, Ishikawa"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Susumu","family":"Horiguchi","sequence":"additional","affiliation":[{"name":"Tohoku University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2006,12]]},"reference":[{"volume-title":"Proceedings of the 20th International Conference Very Large Data Bases (VLDB). 487--499","author":"Agrawal R.","key":"e_1_2_1_1_1","unstructured":"Agrawal , R. and Srikant , R . 1994. Fast algorithms for mining association rules . In Proceedings of the 20th International Conference Very Large Data Bases (VLDB). 487--499 . Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference Very Large Data Bases (VLDB). 487--499."},{"volume-title":"Proceedings of Neural Information Processing Systems (NIPS).","author":"Altun Y.","key":"e_1_2_1_2_1","unstructured":"Altun , Y. , Hofmann , T. , and Johnson , M . 2002. Discriminative learning for label sequences via boosting . In Proceedings of Neural Information Processing Systems (NIPS). Altun, Y., Hofmann, T., and Johnson, M. 2002. Discriminative learning for label sequences via boosting. In Proceedings of Neural Information Processing Systems (NIPS)."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219841"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/234285.234289"},{"volume-title":"Proceedings of the Recent Advances in Natural Language Processing (RANLP). 205--216","author":"Carreras X.","key":"e_1_2_1_5_1","unstructured":"Carreras , X. and Marquez , L . 2003. Phrase recognition by filtering and ranking with perceptrons . In Proceedings of the Recent Advances in Natural Language Processing (RANLP). 205--216 . Carreras, X. and Marquez, L. 2003. Phrase recognition by filtering and ranking with perceptrons. In Proceedings of the Recent Advances in Natural Language Processing (RANLP). 205--216."},{"key":"e_1_2_1_6_1","volume-title":"Tech. Rep. CMU-CS-99-108. Carnegie Mellon University.","author":"Chen S. F.","year":"1999","unstructured":"Chen , S. F. and Rosenfeld , R . 1999 . A gaussian prior for smoothing maximum entropy models. Tech. Rep. CMU-CS-99-108. Carnegie Mellon University. Chen, S. F. and Rosenfeld, R. 1999. A gaussian prior for smoothing maximum entropy models. Tech. Rep. CMU-CS-99-108. Carnegie Mellon University."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119199"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118693.1118694"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102373"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015428"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119201"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1504"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119204"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1597148.1597216"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073336.1073361"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1150--1157","author":"Kumar S.","key":"e_1_2_1_16_1","unstructured":"Kumar , S. and Hebert , M . 2003. Discriminative random fields: a discriminative framework for contextual interaction in classification . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1150--1157 . Kumar, S. and Hebert, M. 2003. Discriminative random fields: a discriminative framework for contextual interaction in classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1150--1157."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335372"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:DAMI.0000005258.31418.83"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 695--702","author":"He X.","key":"e_1_2_1_19_1","unstructured":"He , X. , Zemel , R. S. , and Carreira-Perpinan , M. A . 2004. Multiscale conditional random fields for image labeling . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 695--702 . He, X., Zemel, R. S., and Carreira-Perpinan, M. A. 2004. Multiscale conditional random fields for image labeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 695--702."},{"volume-title":"Proceedings of the 18th International Conference on Machine Learning (ICML). 282--289","author":"Lafferty J.","key":"e_1_2_1_20_1","unstructured":"Lafferty , J. , McCallum , A. , and Pereira , F . 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data . In Proceedings of the 18th International Conference on Machine Learning (ICML). 282--289 . Lafferty, J., McCallum, A., and Pereira, F. 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML). 282--289."},{"volume-title":"Proceedings of the IEEE International Conference on Data Mining (ICDM). 369--376","author":"Li W.","key":"e_1_2_1_21_1","unstructured":"Li , W. , Han , J. , and Pei , J . 2001. Accurate and efficient classifications based on multiple class-association rules . In Proceedings of the IEEE International Conference on Data Mining (ICDM). 369--376 . Li, W., Han, J., and Pei, J. 2001. Accurate and efficient classifications based on multiple class-association rules. In Proceedings of the IEEE International Conference on Data Mining (ICDM). 369--376."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01589116"},{"volume-title":"Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 80--86","author":"Liu B.","key":"e_1_2_1_23_1","unstructured":"Liu , B. , Hsu , W. , and Ma , Y . 1998. Integrating classification and association rule mining . In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 80--86 . Liu, B., Hsu, W., and Ma, Y. 1998. Integrating classification and association rule mining. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 80--86."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.824588"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118853.1118871"},{"volume-title":"Proceedings of 17th International Conference on Machine Learning (ICML). 591--598","author":"McCallum A.","key":"e_1_2_1_26_1","unstructured":"McCallum , A. , Freitag , D. , and Pereira , F . 2000. Maximum entropy markov models for information extraction and segmentation . In Proceedings of 17th International Conference on Machine Learning (ICML). 591--598 . McCallum, A., Freitag, D., and Pereira, F. 2000. Maximum entropy markov models for information extraction and segmentation. In Proceedings of 17th International Conference on Machine Learning (ICML). 591--598."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI). 403--410","author":"McCallum A.","year":"2003","unstructured":"McCallum , A. 2003 . Efficiently inducing features of conditional random fields . In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI). 403--410 . McCallum, A. 2003. Efficiently inducing features of conditional random fields. In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI). 403--410."},{"volume-title":"Proceedings of International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 94--100","author":"Padmanabhan B.","key":"e_1_2_1_28_1","unstructured":"Padmanabhan , B. and Tuzhilin , A . 1998. A belief-driven method for discovering unexpected patterns . In Proceedings of International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 94--100 . Padmanabhan, B. and Tuzhilin, A. 1998. A belief-driven method for discovering unexpected patterns. In Proceedings of International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 94--100."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220355.1220436"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.588021"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860479"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081906"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).","author":"Ratnaparkhi A.","year":"1996","unstructured":"Ratnaparkhi , A. 1996 . A maximum entropy model for part-of-speech tagging . In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Ratnaparkhi, A. 1996. A maximum entropy model for part-of-speech tagging. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073445.1073473"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.553165"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 259--262","author":"Suzuki E.","year":"1997","unstructured":"Suzuki , E. 1997 . Autonomous discovery of reliable exception rules . In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 259--262 . Suzuki, E. 1997. Autonomous discovery of reliable exception rules. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 259--262."},{"volume-title":"Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 295--298","author":"Suzuki E.","key":"e_1_2_1_38_1","unstructured":"Suzuki , E. and Shimura , M . 1996. Exceptional knowledge discovery in databases based on information theory . In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 295--298 . Suzuki, E. and Shimura, M. 1996. Exceptional knowledge discovery in databases based on information theory. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD). 295--298."},{"volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS).","author":"Torralba A.","key":"e_1_2_1_39_1","unstructured":"Torralba , A. , Murphy , K. P. , and Freeman , W. T . 2004. Contextual models for object detection using boosted random fields . In Proceedings of the Conference on Neural Information Processing Systems (NIPS). Torralba, A., Murphy, K. P., and Freeman, W. T. 2004. Contextual models for object detection using boosted random fields. In Proceedings of the Conference on Neural Information Processing Systems (NIPS)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/640075.640118"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/944790.944820"}],"container-title":["ACM Transactions on Asian Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1236181.1236187","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1236181.1236187","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T14:52:16Z","timestamp":1750258336000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1236181.1236187"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,12]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["10.1145\/1236181.1236187"],"URL":"https:\/\/doi.org\/10.1145\/1236181.1236187","relation":{},"ISSN":["1530-0226","1558-3430"],"issn-type":[{"type":"print","value":"1530-0226"},{"type":"electronic","value":"1558-3430"}],"subject":[],"published":{"date-parts":[[2006,12]]},"assertion":[{"value":"2006-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}