{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T10:28:53Z","timestamp":1770805733108,"version":"3.50.0"},"reference-count":49,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T00:00:00Z","timestamp":1769558400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Telegram, along with WhatsApp and Signal, has become very popular due to its hybrid capabilities, including both instant private and public messaging, making it an effective tool for quickly broadcasting content to a wide audience. This article presents TGEconomicDataset, a new dataset containing more than 2.9 million messages from the most popular Russian-language Telegram channels in the field of economics, as well as synthetically generated labeled mixtures of these channels. These mixtures are specifically designed to model authorship change scenarios for testing various methods for solving the problem of continuous authentication, which is of particular interest due to the need for organizations and companies to rely on data posted on social media. The presented dataset is enriched with quotes of important financial instruments such as gold futures, the USD\/RUB currency pair, BRENT oil, the dollar index (DXY), and bitcoin (BTC), synchronized with the message timestamps. A detailed joint analysis of the collected data is provided. In addition to the presented dataset, we publish the scripts used to collect the data, integrate the financial indicators, and generate the synthetic mixtures for the continuous authentication task, ensuring full reproducibility of the research.<\/jats:p>","DOI":"10.3390\/data11020025","type":"journal-article","created":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T15:56:12Z","timestamp":1769615772000},"page":"25","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["TGEconomicDataset: A Collection of Russian-Language Economic Telegram Channels and a Synthetic Data Generation Framework for Continuous Authentication"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6734-4359","authenticated-orcid":false,"given":"Elena","family":"Luneva","sequence":"first","affiliation":[{"name":"Department of Comprehensive Information Security of Electronic Computer Systems, Tomsk State University of Control Systems and Radioelectronics, 634050 Tomsk, Russia"}]},{"given":"Pavel","family":"Banokin","sequence":"additional","affiliation":[{"name":"Department of Comprehensive Information Security of Electronic Computer Systems, Tomsk State University of Control Systems and Radioelectronics, 634050 Tomsk, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2393-6701","authenticated-orcid":false,"given":"Alexander","family":"Shelupanov","sequence":"additional","affiliation":[{"name":"Department of Comprehensive Information Security of Electronic Computer Systems, Tomsk State University of Control Systems and Radioelectronics, 634050 Tomsk, Russia"}]}],"member":"1968","published-online":{"date-parts":[[2026,1,28]]},"reference":[{"key":"ref_1","unstructured":"(2025, December 02). Telegram Statistics 2025: Worldwide Data. Available online: https:\/\/www.demandsage.com\/telegram-statistics\/."},{"key":"ref_2","first-page":"122","article-title":"Health pandemic and social media: A content analysis of COVID-related posts on a Telegram channel with more than one million subscribers","volume":"279","author":"Mehdipour","year":"2021","journal-title":"Stud. Health Technol. Inform."},{"key":"ref_3","first-page":"934","article-title":"When motivations meet affordances: News consumption on Telegram","volume":"22","author":"Lou","year":"2021","journal-title":"J. Stud."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Griggio, C.F., Nouwens, M., and Klokmose, C.N. (May, January 30). Caught in the network: The impact of WhatsApp\u2019s 2021 privacy policy update on users\u2019 messaging app ecosystems. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI \u201822), New York, NY, USA.","DOI":"10.1145\/3491102.3502032"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1007\/s42001-021-00155-3","article-title":"News loopholing: Telegram news as portable alternative media","volume":"5","year":"2022","journal-title":"J. Comput. Soc. Sci."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Gupta, K., Oladimeji, D., Varol, C., Rasheed, A., and Shahshidhar, N. (2023). A comprehensive survey on artifact recovery from social media platforms: Approaches and future research directions. Information, 14.","DOI":"10.3390\/info14120629"},{"key":"ref_7","unstructured":"(2025, December 02). Telegram APIs. Available online: https:\/\/core.telegram.org\/api."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Herrero-Solana, V., and Castro-Castro, C. (2022). Telegram channels and bots: A ranking of media outlets based in Spain. Societies, 12.","DOI":"10.20944\/preprints202207.0179.v1"},{"key":"ref_9","unstructured":"(2025, December 02). Channels FAQ. Available online: https:\/\/telegram.org\/faq_channels?setln=en."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Xiao, C., Freeman, D.M., and Hwa, T. (2015). Detecting clusters of fake accounts in online social networks. Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, Denver, CO, USA, 16 October 2015, ACM Press.","DOI":"10.1145\/2808769.2808779"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"La Morgia, M., Mei, A., Sassi, F., and Stefa, J. (2020, January 3\u20136). Pump and dumps in the Bitcoin era: Real time detection of cryptocurrency market manipulations. Proceedings of the 2020 IEEE 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA.","DOI":"10.1109\/ICCCN49398.2020.9209660"},{"key":"ref_12","first-page":"1","article-title":"The Doge of Wall Street: Analysis and detection of pump and dump cryptocurrency manipulations","volume":"23","author":"Mei","year":"2023","journal-title":"ACM Trans. Internet Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Rajaei, M.J., and Mahmoud, Q.H. (2023). A survey on pump and dump detection in the cryptocurrency market using machine learning. Future Internet, 15.","DOI":"10.3390\/fi15080267"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Conti, E., Salvi, D., Borrelli, C., Hosler, B., Bestagini, P., Antonacci, F., Sarti, A., Stamm, M.C., and Tubaro, S. (2022, January 22\u201327). Deepfake speech detection through emotion recognition: A semantic approach. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9747186"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"115742","DOI":"10.1016\/j.eswa.2021.115742","article-title":"Review on social spam detection: Challenges, open issues, and future directions","volume":"186","author":"Rao","year":"2021","journal-title":"Expert. Syst. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"82996","DOI":"10.1109\/ACCESS.2024.3411783","article-title":"A comprehensive review on secure biometric-based continuous authentication and user profiling","volume":"12","author":"Ayeswarya","year":"2024","journal-title":"IEEE Access"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1145\/1121949.1121951","article-title":"From Fingerprint to Writeprint","volume":"49","author":"Li","year":"2006","journal-title":"Commun. ACM"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Fedotova, A., Romanov, A., Kurtukova, A., and Shelupanov, A. (2022). Authorship attribution of social media and literary Russian-language texts using machine learning methods and feature selection. Future Internet, 14.","DOI":"10.3390\/fi14010004"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, Y., Zhang, X., and Hu, H. (2023). Continuous user authentication on multiple smart devices. Information, 14.","DOI":"10.3390\/info14050274"},{"key":"ref_20","unstructured":"(2025, December 02). Telegram Channels and Groups Catalog. Available online: https:\/\/tgstat.ru\/en."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"840","DOI":"10.1609\/icwsm.v14i1.7348","article-title":"The Pushshift Telegram Dataset","volume":"14","author":"Baumgartner","year":"2020","journal-title":"Proc. Int. AAAI Conf. Web Soc. Media"},{"key":"ref_22","unstructured":"Hoseini, M., Melo, P., Benevenuto, F., Feldmann, A., and Zannettou, S. (2021). On the globalization of the QAnon conspiracy theory through Telegram. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1007\/s13278-019-0575-9","article-title":"Telegram group quality measurement by user behavior analysis","volume":"9","author":"Hashemi","year":"2019","journal-title":"Soc. Netw. Anal. Min."},{"key":"ref_24","unstructured":"Sosa, J., and Sharoff, S. (2022). Multimodal pipeline for collection of misinformation data from Telegram. Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France, 20\u201325 June 2022, European Language Resources Association."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"La Morgia, M., Mei, A., and Mongardini, A.M. (2025, January 3\u20137). TGDataset: Collecting and exploring the largest Telegram channels dataset. Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1 (KDD \u201825), New York, NY, USA.","DOI":"10.1145\/3690624.3709397"},{"key":"ref_26","unstructured":"Angermaier, M., Pinheiro Neto, J., H\u00f6ldrich, E., and Lasser, J. (2025). Dataset for the \u201cThe Schwurbelarchiv: A German Language Telegram dataset for the study of conspiracy theories\u201d paper [Data set]. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"2510","DOI":"10.1609\/icwsm.v19i1.35952","article-title":"A Telegram dataset of propaganda and its moderation","volume":"19","author":"Kireev","year":"2025","journal-title":"Proc. Int. AAAI Conf. Web Soc. Media"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2423","DOI":"10.1609\/icwsm.v19i1.35945","article-title":"TeleScope: A longitudinal dataset for investigating online discourse and information interaction on Telegram","volume":"19","author":"Gangopadhyay","year":"2025","journal-title":"Proc. Int. AAAI Conf. Web Soc. Media"},{"key":"ref_29","unstructured":"(2025, December 02). Groups and Channels. Available online: https:\/\/telegram.org\/faq#groups-and-channels."},{"key":"ref_30","unstructured":"(2025, December 02). MessageEntity. Available online: https:\/\/core.telegram.org\/type\/MessageEntity."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Bansal, P., and Ouda, A. (2024). Continuous authentication in the digital age: An analysis of reinforcement learning and behavioral biometrics. Computers, 13.","DOI":"10.3390\/computers13040103"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"470","DOI":"10.3390\/jcp1030024","article-title":"Biometric systems de-identification: Current advancements and future directions","volume":"1","author":"Shopon","year":"2021","journal-title":"J. Cybersecur. Priv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Estrela, P.M.A.B., Albuquerque, R.d.O., Amaral, D.M., Giozza, W.F., and J\u00fanior, R.T.d.S. (2021). A framework for continuous authentication based on touch dynamics biometrics for mobile banking applications. Sensors, 21.","DOI":"10.3390\/s21124212"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Choi, M., Lee, S., Jo, M., and Shin, J.S. (2021). Keystroke dynamics-based authentication using unique keypad. Sensors, 21.","DOI":"10.3390\/s21062242"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Ashibani, Y., Kauling, D., and Mahmoud, Q.H. (2019). Design and implementation of a contextual-based continuous authentication framework for smart homes. Appl. Syst. Innov., 2.","DOI":"10.3390\/asi2010004"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Bu\u0142at, R., and Ogiela, M.R. (2023). Personalized context-aware authentication protocols in IoT. Appl. Sci., 13.","DOI":"10.3390\/app13074216"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Baig, A.F., and Eskeland, S. (2021). Security, privacy, and usability in continuous authentication: A survey. Sensors, 21.","DOI":"10.3390\/s21175967"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"39783","DOI":"10.1109\/ACCESS.2024.3377231","article-title":"Semantic clustering and transfer learning in social media texts authorship attribution","volume":"12","author":"Fedotova","year":"2024","journal-title":"IEEE Access"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Cho, S.H., Kim, B., Kwon, H., and Kim, M. (2025). Exploring the potential of large language models for author profiling tasks in digital text. Notebook for PAN at CLEF 2023, Elsevier.","DOI":"10.1016\/j.fsidi.2024.301814"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1429","DOI":"10.1016\/j.jcss.2014.12.019","article-title":"Authorship verification of e-mail and tweet messages applied for continuous authentication","volume":"81","author":"Brocardo","year":"2015","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_41","unstructured":"Li, J., Zhang, Q., and Huang, M. (2023). Author verification of text fragments based on the BERT model. Notebook for PAN at CLEF 2023, Foshan University."},{"key":"ref_42","unstructured":"(2026, January 16). Telemetr\u2014Telegram Channel Search and Analytics Platform. Available online: https:\/\/telemetr.me\/."},{"key":"ref_43","unstructured":"(2025, December 02). BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. Available online: https:\/\/arxiv.org\/abs\/2203.05794."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1016\/0047-259X(82)90077-X","article-title":"The Fr\u00e9chet distance between multivariate normal distributions","volume":"12","author":"Dowson","year":"1982","journal-title":"J. Multivar. Anal."},{"key":"ref_45","unstructured":"(2026, January 16). Cointegrated\/Rubert-Tiny2: Lightweight Russian BERT Model. Available online: https:\/\/huggingface.co\/cointegrated\/rubert-tiny2."},{"key":"ref_46","unstructured":"(2025, December 02). Language Detection Library in Python. Available online: https:\/\/github.com\/fedelopez77\/langdetect."},{"key":"ref_47","unstructured":"(2025, December 02). Memory Inefficient Algorithm and Getting Error While Saving the Model. Available online: https:\/\/github.com\/MaartenGr\/BERTopic\/issues\/173."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Terragni, S., and Fersini, E. (2021). Word embedding-based topic similarity measures. Proceedings of the International Conference on Applications of Natural Language to Information Systems (ANLIS 2021), Saarbr\u00fccken, Germany, 23\u201325 June 2021, Springer.","DOI":"10.1007\/978-3-030-80599-9_4"},{"key":"ref_49","unstructured":"(2025, December 02). MarketWatch. Download Data. Available online: https:\/\/www.marketwatch.com\/investing\/cryptocurrency\/btcusd\/download-data?."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/11\/2\/25\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T05:22:49Z","timestamp":1770787369000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/11\/2\/25"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,28]]},"references-count":49,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2]]}},"alternative-id":["data11020025"],"URL":"https:\/\/doi.org\/10.3390\/data11020025","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,28]]}}}