{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T13:23:50Z","timestamp":1771075430543,"version":"3.50.1"},"reference-count":25,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2019,7,10]],"date-time":"2019-07-10T00:00:00Z","timestamp":1562716800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772231"],"award-info":[{"award-number":["61772231"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100007129","name":"Natural Science Foundation of Shandong Province","doi-asserted-by":"publisher","award":["ZR2017MF025"],"award-info":[{"award-number":["ZR2017MF025"]}],"id":[{"id":"10.13039\/501100007129","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Shandong Provincial Key R&D Program of China","award":["2018CXGC0706"],"award-info":[{"award-number":["2018CXGC0706"]}]},{"name":"Science and Technology Program of University of Jinan","award":["XKY1734"],"award-info":[{"award-number":["XKY1734"]}]},{"name":"Science and Technology Program of University of Jinan","award":["XKY1828"],"award-info":[{"award-number":["XKY1828"]}]},{"name":"Project of Shandong Provincial Social Science Program","award":["18CHLJ39"],"award-info":[{"award-number":["18CHLJ39"]}]},{"name":"Project of Independent Cultivated Innovation Team of Jinan City","award":["2018GXRC002"],"award-info":[{"award-number":["2018GXRC002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Social network services for self-media, such as Weibo, Blog, and WeChat Public, constitute a powerful medium that allows users to publish posts every day. 
Due to insufficient information transparency, malicious Internet marketing in self-media posts poses potential harm to society. Therefore, it is necessary to identify news with marketing intentions in everyday life. We follow the idea of text classification to identify marketing intentions. Although some existing methods address intention detection, the challenges are how to make text feature extraction reflect semantic information and how to reduce the time and space complexity of the recognition model. To this end, this paper proposes a machine learning method to identify marketing intentions from large-scale We-Media data. First, the proposed Latent Semantic Analysis (LSI)-Word2vec model can reflect the semantic features. Second, the decision tree model is simplified by pruning to save computing resources and reduce the time complexity. Third, this paper examines the effects of classifier associations and uses the optimal configuration to help people efficiently identify marketing intentions. Finally, a detailed experimental evaluation on several metrics shows that our approaches are effective and efficient. 
The F1 value can be increased by about 5%, and the running time is increased by 20%, which shows that the newly proposed method can effectively improve the accuracy of marketing news recognition.<\/jats:p>","DOI":"10.3390\/fi11070155","type":"journal-article","created":{"date-parts":[[2019,7,10]],"date-time":"2019-07-10T11:56:51Z","timestamp":1562759811000},"page":"155","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Stacking-Based Ensemble Learning of Self-Media Data for Marketing Intention Detection"],"prefix":"10.3390","volume":"11","author":[{"given":"Yufeng","family":"Wang","sequence":"first","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"given":"Shuangrong","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0907-5581","authenticated-orcid":false,"given":"Songqian","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"given":"Jidong","family":"Duan","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"given":"Zhihao","family":"Hou","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"given":"Jia","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0135-5423","authenticated-orcid":false,"given":"Kun","family":"Ma","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, University of Jinan, Jinan 250022, 
China"},{"name":"Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,7,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"7451","DOI":"10.1007\/s00500-018-3391-7","article-title":"Stream-based live public opinion monitoring approach with adaptive probabilistic topic model","volume":"23","author":"Ma","year":"2018","journal-title":"Soft Comput."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zhang, C., Wang, X., Yu, S., and Wang, Y. (2018, January 6\u20138). Research on Keyword Extraction of Word2vec Model in Chinese Corpus. Proceedings of the 2018 IEEE\/ACIS 17th International Conference on Computer and Information Science (ICIS), Singapore.","DOI":"10.1109\/ICIS.2018.8466534"},{"key":"ref_3","unstructured":"Kalra, S., Li, L., and Tizhoosh, H.R. (2019). Automatic Classification of Pathology Reports using TF-IDF Features. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"26996","DOI":"10.1109\/ACCESS.2019.2893980","article-title":"Hot Topic Detection Based on a Refined TF-IDF Algorithm","volume":"7","author":"Zhu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.ins.2018.10.006","article-title":"Multi-co-training for document classification using various document representations: TF\u2013IDF, LDA, and Doc2Vec","volume":"477","author":"Kim","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Altszyler, E., Sigman, M., and Slezak, D.F. (2017). Corpus specificity in LSA and Word2vec: The role of out-of-domain documents. arXiv.","DOI":"10.18653\/v1\/W18-3001"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Anandarajan, M., Hill, C., and Nolan, T. (2019). Semantic Space Representation and Latent Semantic Analysis. 
Practical Text Analytics, Springer.","DOI":"10.1007\/978-3-319-95663-3"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1111\/coin.12158","article-title":"A Naive Bayes approach for URL classification with supervised feature selection and rejection framework","volume":"34","author":"Rajalakshmi","year":"2018","journal-title":"Comput. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1109\/TIM.2018.2861107","article-title":"A novel measurement data classification algorithm based on SVM for tracking closely spaced targets","volume":"68","author":"Zhao","year":"2018","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Khaleel, M.I., Hmeidi, I.I., and Najadat, H.M. (2016, January 15\u201317). An automatic text classification system based on genetic algorithm. Proceedings of the the 3rd Multidisciplinary International Social Networks Conference on SocialInformatics 2016, Data Science 2016, Union, NJ, USA.","DOI":"10.1145\/2955129.2955174"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"40323","DOI":"10.1109\/ACCESS.2019.2904858","article-title":"Learning Multi-Domain Adversarial Neural Networks for Text Classification","volume":"7","author":"Ding","year":"2019","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Narayanan, A., Shi, E., and Rubinstein, B.I. (August, January 31). Link prediction by de-anonymization: How we won the kaggle social network challenge. Proceedings of the 2011 International Joint Conference on Neural Networks, San Jose, CA, USA.","DOI":"10.1109\/IJCNN.2011.6033446"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Pavlyshenko, B. (2018, January 21\u201325). Using Stacking Approaches for Machine Learning Models. 
Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine.","DOI":"10.1109\/DSMP.2018.8478522"},{"key":"ref_14","unstructured":"Zou, H., Xu, K., Li, J., and Zhu, J. (2017). The Youtube-8M kaggle competition: Challenges and methods. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Liu, J., Shang, W., and Lin, W. (2018, January 6\u20138). Improved Stacking Model Fusion Based on Weak Classifier and Word2vec. Proceedings of the 2018 IEEE\/ACIS 17th International Conference on Computer and Information Science (ICIS), Singapore.","DOI":"10.1109\/ICIS.2018.8466463"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/S0377-2217(99)00116-2","article-title":"Deriving decision rules to locate export containers in container yards","volume":"124","author":"Kim","year":"2000","journal-title":"Eur. J. Oper. Res."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gao, X., Luo, H., Wang, Q., Zhao, F., Ye, L., and Zhang, Y. (2019). A Human Activity Recognition Algorithm Based on Stacking Denoising Autoencoder and LightGBM. Sensors, 19.","DOI":"10.3390\/s19040947"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, J., Lou, C., Yu, R., Gao, J., Xu, T., Yu, M., and Di, H. (2018). Research on Hot Micro-blog Forecast Based on XGBOOST and Random Forest. International Conference on Knowledge Science, Engineering and Management, Springer.","DOI":"10.1007\/978-3-319-99247-1_31"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Xi, Y., Zhuang, X., Wang, X., Nie, R., and Zhao, G. (2018). 
A Research and Application Based on Gradient Boosting Decision Tree. International Conference on Web Information Systems and Applications, Springer.","DOI":"10.1007\/978-3-030-02934-0_2"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1000","DOI":"10.1016\/j.asoc.2017.07.027","article-title":"A principle component analysis-based random forest with the potential nearest neighbor method for automobile insurance fraud identification","volume":"70","author":"Li","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_22","unstructured":"Sun, J. (2012). \u2018Jieba\u2019 Chinese Word Segmentation Tool, Gitlab."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Xu, Y., and Wang, J. (2016, January 29\u201331). The Adaptive Spelling Error Checking Algorithm based on Trie Tree. Proceedings of the 2016 2nd International Conference on Advances in Energy, Environment and Chemical Engineering (AEECE 2016), Singapore.","DOI":"10.2991\/aeece-16.2016.62"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1007\/978-981-10-8890-2_19","article-title":"CHAR-HMM: An Improved Continuous Human Activity Recognition Algorithm Based on Hidden Markov Model","volume":"Volume 747","author":"Liu","year":"2018","journal-title":"Mobile Ad-hoc and Sensor Networks: 13th International Conference, MSN 2017, Beijing, China, 17\u201320 December 2017"},{"key":"ref_25","unstructured":"(2019, May 29). Zecheng Zhan SOHU\u2019s Second Content Recognition Algorithm Competition. 
Available online: https:\/\/github.com\/zhanzecheng\/SOHU_competition."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/11\/7\/155\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:04:20Z","timestamp":1760187860000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/11\/7\/155"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,10]]},"references-count":25,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2019,7]]}},"alternative-id":["fi11070155"],"URL":"https:\/\/doi.org\/10.3390\/fi11070155","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,7,10]]}}}