{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T21:59:41Z","timestamp":1777586381317,"version":"3.51.4"},"reference-count":63,"publisher":"Walter de Gruyter GmbH","issue":"3","license":[{"start":{"date-parts":[[2025,3,18]],"date-time":"2025-03-18T00:00:00Z","timestamp":1742256000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>This work aims to develop a deep model for automatically labeling music tracks in terms of induced emotions. The machine learning architecture consists of two components: one dedicated to lyric processing based on Natural Language Processing (NLP) and another devoted to music processing. These two components are combined at the decision-making level. To achieve this, a range of neural networks are explored for the task of emotion extraction from both lyrics and music. For lyric classification, three architectures are compared, i.e., a 4-layer neural network, FastText, and a transformer-based approach. For music classification, the architectures investigated include InceptionV3, a collection of models from the ResNet family, and a joint architecture combining Inception and ResNet. SVM serves as a baseline in both threads. The study explores three datasets of songs accompanied by lyrics, with MoodyLyrics4Q selected and preprocessed for model training. The bimodal approach, incorporating both lyrics and audio modules, achieves a classification accuracy of 60.7% in identifying emotions evoked by music pieces. The MoodyLyrics4Q dataset used in this study encompasses musical pieces spanning diverse genres, including rock, jazz, electronic, pop, blues, and country. 
The algorithms demonstrate reliable performance across the dataset, highlighting their robustness in handling a wide variety of musical styles.<\/jats:p>","DOI":"10.2478\/jaiscr-2025-0011","type":"journal-article","created":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T01:38:14Z","timestamp":1742348294000},"page":"215-235","source":"Crossref","is-referenced-by-count":2,"title":["A Bimodal Deep Model to Capture Emotions from Music Tracks"],"prefix":"10.2478","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-1934-6217","authenticated-orcid":false,"given":"Jan","family":"Tobolewski","sequence":"first","affiliation":[{"name":"Faculty of Electronics, Telecommunications and Informatics, Gda\u0144sk University of Technology, 11\/12 Narutowicza St., 80-233 Gda\u0144sk, Poland"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-2019-2074","authenticated-orcid":false,"given":"Micha\u0142","family":"Sakowicz","sequence":"additional","affiliation":[{"name":"Faculty of Electronics, Telecommunications and Informatics, Gda\u0144sk University of Technology, 11\/12 Narutowicza St., 80-233 Gda\u0144sk, Poland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7521-1115","authenticated-orcid":false,"given":"Jordi","family":"Turmo","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Universitat Polit\u00e8cnica de Catalunya, Jordi Girona Salgado, 1-3, 08034 Barcelona, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6288-2908","authenticated-orcid":false,"given":"Bo\u017cena","family":"Kostek","sequence":"additional","affiliation":[{"name":"Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications, and Informatics, Gda\u0144sk University of Technology, 11\/12 Narutowicza St., 80-233 Gda\u0144sk, Poland"}]}],"member":"374","published-online":{"date-parts":[[2025,3,18]]},"reference":[{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_001","doi-asserted-by":"crossref","unstructured":"L. Smietanka and T. Maka, \u201cInterpreting convolutional layers in DNN model based on time\u2013frequency representation of emotional speech,\u201d Journal of Artificial Intelligence and Soft Computing Research, vol. 14, no. 1, pp. 5\u201323, Jan. 2024, doi: 10.2478\/jaiscr-2024-0001.","DOI":"10.2478\/jaiscr-2024-0001"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_002","doi-asserted-by":"crossref","unstructured":"S. Sheykhivand, Z. Mousavi, T. Y. Rezaii, and A. Farzamnia, \u201cRecognizing Emotions Evoked by Music Using CNN-LSTM Networks on EEG Signals,\u201d IEEE Access, vol. 8, pp. 139332-139345, 2020, doi: 10.1109\/ACCESS.2020.3011882.","DOI":"10.1109\/ACCESS.2020.3011882"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_003","doi-asserted-by":"crossref","unstructured":"Y. Takahashi, T. Hochin, and H. Nomiya, \u201cRelationship between Mental States with Strong Emotion Aroused by Music Pieces and Their Feature Values,\u201d in Proc. 2014 IIAI 3rd International Conference on Advanced Applied Informatics, 2014, pp. 718-725, doi: 10.1109\/IIAI-AAI.2014.147.","DOI":"10.1109\/IIAI-AAI.2014.147"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_004","doi-asserted-by":"crossref","unstructured":"P. A. Wood and S. K. Semwal, \u201cOn exploring the connection between music classification and evoking emotion,\u201d in Proc. 2015 International Conference on Collaboration Technologies and Systems (CTS), 2015, pp. 
474-476, doi: 10.1109\/CTS.2015.7210471.","DOI":"10.1109\/CTS.2015.7210471"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_005","doi-asserted-by":"crossref","unstructured":"M. Agapaki, E. A. Pinkerton, and E. Papatzikis, \u201cMusic and neuroscience research for mental health, cognition, and development: Ways forward,\u201d Frontiers in Psychology, vol. 13, 2022, doi: 10.3389\/fpsyg.2022.976883.","DOI":"10.3389\/fpsyg.2022.976883"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_006","doi-asserted-by":"crossref","unstructured":"Y. Song, S. Dixon, M. Pearce, and A. Halpern, \u201cPerceived and Induced Emotion Responses to Popular Music: Categorical and Dimensional Models,\u201d Music Perception: An Interdisciplinary Journal, vol. 33, pp. 472-492, Apr. 2016, doi: 10.1525\/mp.2016.33.4.472.","DOI":"10.1525\/mp.2016.33.4.472"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_007","doi-asserted-by":"crossref","unstructured":"Y. Yuan, \u201cEmotion of Music: Extraction and Composing,\u201d Journal of Education, Humanities and Social Sciences, vol. 13, pp. 422-428, May 2023, doi: 10.54097\/ehss.v13i.8207.","DOI":"10.54097\/ehss.v13i.8207"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_008","doi-asserted-by":"crossref","unstructured":"S. A. Sujeesha, J. B. Mala, and R. Rajeev, \u201cAutomatic music mood classification using multi-modal attention framework,\u201d Engineering Applications of Artificial Intelligence, vol. 128, p. 107355, 2024, doi: 10.1016\/j.engappai.2023.107355.","DOI":"10.1016\/j.engappai.2023.107355"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_009","doi-asserted-by":"crossref","unstructured":"M. Schedl, P. Knees, B. McFee, D. Bogdanov, and M. Kaminskas, \u201cMusic recommender systems,\u201d in Recommender systems handbook, Springer, 2015, pp. 453-492.","DOI":"10.1007\/978-1-4899-7637-6_13"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_010","unstructured":"MorphCast Technology. Available: https:\/\/www.morphcast.com. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_011","doi-asserted-by":"crossref","unstructured":"S. Zhao, G. Jia, J. Yang, G. Ding, and K. Keutzer, \u201cEmotion Recognition From Multiple Modalities: Fundamentals and methodologies,\u201d IEEE Signal Processing Magazine, vol. 38, no. 6, pp. 59-73, Nov. 2021, doi: 10.1109\/MSP.2021.3106895.","DOI":"10.1109\/MSP.2021.3106895"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_012","doi-asserted-by":"crossref","unstructured":"T. Li, \u201cMusic emotion recognition using deep convolutional neural networks,\u201d Journal of Computational Methods in Science and Engineering, vol. 24, no. 4-5, pp. 3063-3078, 2024, doi: 10.3233\/JCM-247551.","DOI":"10.3233\/JCM-247551"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_013","doi-asserted-by":"crossref","unstructured":"P. L. Louro, H. Redinho, R. Malheiro, R. P. Paiva, and R. Panda, \u201cA comparison study of deep learning methodologies for music emotion recognition,\u201d Sensors, vol. 24, no. 7, p. 2201, 2024, doi: 10.3390\/s24072201.","DOI":"10.3390\/s24072201"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_014","doi-asserted-by":"crossref","unstructured":"M. Blaszke, G. Korvel, and B. Kostek, \u201cExploring neural networks for musical instrument identification in polyphonic audio,\u201d IEEE Intelligent Systems, pp. 
1-11, 2024, doi: 10.1109\/MIS.2024.3392586.","DOI":"10.1109\/MIS.2024.3392586"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_015","doi-asserted-by":"crossref","unstructured":"M. Barata and P. Coelho, \u201cMusic Streaming Services: Understanding the drivers of customer purchase and intention to recommend,\u201d Heliyon, vol. 7, p. e07783, Aug. 2021, doi: 10.1016\/j.heliyon.2021.e07783.","DOI":"10.1016\/j.heliyon.2021.e07783"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_016","doi-asserted-by":"crossref","unstructured":"J. Webster, \u201cThe promise of personalization: Exploring how music streaming platforms are shaping the performance of class identities and distinction,\u201d New Media & Society, p. 146144482110278, Jul. 2021, doi: 10.1177\/14614448211027863.","DOI":"10.1177\/14614448211027863"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_017","doi-asserted-by":"crossref","unstructured":"E. Schmidt, D. Turnbull, and Y. Kim, \u201cFeature selection for content-based, time-varying musical emotion regression,\u201d in Proc. ACM SIGMM International Conference on Multimedia Information Retrieval, Mar. 2010, pp. 267-274, doi: 10.1145\/1743384.1743431.","DOI":"10.1145\/1743384.1743431"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_018","doi-asserted-by":"crossref","unstructured":"Y.-H. Yang, Y.-C. Lin, H.-T. Cheng, I.-B. Liao, Y.-C. Ho, and H. H. Chen, \u201cToward Multimodal Music Emotion Classification,\u201d in Advances in Multimedia Information Processing - PCM 2008, 2008, pp. 70-79.","DOI":"10.1007\/978-3-540-89796-5_8"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_019","doi-asserted-by":"crossref","unstructured":"T. Ciborowski, S. Reginis, D. Weber, A. Kurowski, and B. Kostek, \u201cClassifying Emotions in Film Music\u2014A Deep Learning Approach,\u201d Electronics, vol. 10, no. 23, p. 2955, Nov. 2021, doi: 10.3390\/electronics10232955.","DOI":"10.3390\/electronics10232955"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_020","doi-asserted-by":"crossref","unstructured":"X. Han, F. Chen, and J. Ban, \u201cMusic Emotion Recognition Based on a Neural Network with an Inception-GRU Residual Structure,\u201d Electronics, vol. 12, no. 4, p. 978, Feb. 2023, doi: 10.3390\/electronics12040978.","DOI":"10.3390\/electronics12040978"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_021","doi-asserted-by":"crossref","unstructured":"Y. J. Liao, W. C. Wang, S.-J. Ruan, Y. H. Lee, and S. C. Chen, \u201cA Music Playback Algorithm Based on Residual-Inception Blocks for Music Emotion Classification and Physiological Information,\u201d Sensors, vol. 22, no. 3, p. 777, Jan. 2022, doi: 10.3390\/s22030777.","DOI":"10.3390\/s22030777"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_022","doi-asserted-by":"crossref","unstructured":"R. Sarkar, S. Choudhury, S. Dutta, A. Roy, and S. K. Saha, \u201cRecognition of emotion in music based on deep convolutional neural network,\u201d Multimedia Tools and Applications, vol. 79, pp. 765-783, 2019, [Online]. Available: https:\/\/api.semanticscholar.org\/CorpusID:254866914.","DOI":"10.1007\/s11042-019-08192-x"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_023","unstructured":"S. Giammusso, M. Guerriero, P. Lisena, E. Palumbo, and R. Troncy, \u201cPredicting the emotion of playlists using track lyrics,\u201d International Society for Music Information Retrieval (ISMIR), Late Breaking Session, 2017."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_024","doi-asserted-by":"crossref","unstructured":"Y. Agrawal, R. Shanker, and V. 
Alluri, \u201cTransformer-based approach towards music emotion recognition from lyrics,\u201d Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol. 12657, Springer, 2021, doi: 10.1007\/978-3-030-72240-1_12.","DOI":"10.1007\/978-3-030-72240-1_12"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_025","doi-asserted-by":"crossref","unstructured":"D. Han, Y. Kong, J. Han, and G. Wang, \u201cA survey of music emotion recognition,\u201d Frontiers of Computer Science, vol. 16, Dec. 2022, doi: 10.1007\/s11704-021-0569-4.","DOI":"10.1007\/s11704-021-0569-4"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_026","doi-asserted-by":"crossref","unstructured":"T. Baltru\u0161aitis, C. Ahuja, and L.-P. Morency, \u201cMultimodal Machine Learning: A Survey and Taxonomy,\u201d in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 2, pp. 423-443, 1 Feb. 2019, doi: 10.1109\/TPAMI.2018.2798607.","DOI":"10.1109\/TPAMI.2018.2798607"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_027","unstructured":"R. Delbouys, R. Hennequin, F. Piccoli, J. Royo-Letelier, and M. Moussallam, \u201cMusic Mood Detection Based On Audio And Lyrics With Deep Neural Net,\u201d ISMIR 2018, doi: 10.48550\/arXiv.1809.07276"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_028","doi-asserted-by":"crossref","unstructured":"I. A. P. Santana et al., \u201cMusic4All: A new music database and its applications,\u201d in Proc. 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), 2020, pp. 399-404, doi: 10.1109\/IWSSIP48289.2020.9145170.","DOI":"10.1109\/IWSSIP48289.2020.9145170"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_029","doi-asserted-by":"crossref","unstructured":"E. \u00c7ano and M. Morisio, \u201cMoodyLyrics: A sentiment annotated lyrics dataset,\u201d in Proc. 2017 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, 2017, pp. 118-124, doi: 10.1145\/3059336.3059340.","DOI":"10.1145\/3059336.3059340"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_030","doi-asserted-by":"crossref","unstructured":"E. \u00c7ano and M. Morisio, \u201cMusic mood dataset creation based on Last.fm tags,\u201d in Proc. 2017 International Conference on Artificial Intelligence and Applications, Vienna, Austria, 2017, pp. 15-26, doi: 10.5121\/csit.2017.70603.","DOI":"10.5121\/csit.2017.70603"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_031","doi-asserted-by":"crossref","unstructured":"R. E. Thayer, The Biopsychology of Mood and Arousal, Oxford University Press, 1989.","DOI":"10.1093\/oso\/9780195068276.001.0001"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_032","doi-asserted-by":"crossref","unstructured":"J. Russell, \u201cA Circumplex Model of Affect,\u201d Journal of Personality and Social Psychology, vol. 39, pp. 1161-1178, Dec. 1980, doi: 10.1037\/h0077714.","DOI":"10.1037\/h0077714"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_033","unstructured":"Social music service - Last.fm. Available: https:\/\/www.last.fm\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_034","unstructured":"Genius - Song Lyrics & Knowledge. Available: https:\/\/genius.com\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_035","unstructured":"YouTube. Available: https:\/\/www.youtube.com. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_036","unstructured":"M. Sakowicz and J. 
Tobolewski, \u201cDevelopment and study of an algorithm for the automatic labeling of musical pieces in the context of emotion evoked,\u201d M.Sc. thesis, Gda\u0144sk University of Technology and Universitat Polit\u00e8cnica de Catalunya (co-supervised by B. Kostek and J. Turmo), 2023."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_037","unstructured":"Genius and Spotify partnering. Available: https:\/\/genius.com\/a\/genius-and-spotify-together. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_038","unstructured":"Pafy library. Available: https:\/\/pypi.org\/project\/pafy\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_039","unstructured":"Moviepy library. Available: https:\/\/pypi.org\/project\/moviepy\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_040","unstructured":"M. Honnibal and I. Montani, \u201cspaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing,\u201d 2017. Available: https:\/\/github.com\/explosion\/spaCy. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_041","doi-asserted-by":"crossref","unstructured":"P. N. Johnson-Laird and K. Oatley, \u201cEmotions, Simulation, and Abstract Art,\u201d Art & Perception, vol. 9, no. 3, pp. 260-292, 2021, doi: 10.1163\/22134913-bja10029.","DOI":"10.1163\/22134913-bja10029"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_042","doi-asserted-by":"crossref","unstructured":"P. N. Johnson-Laird and K. Oatley, \u201cHow poetry evokes emotions,\u201d Acta Psychologica, vol. 224, p. 103506, 2022, doi: 10.1016\/j.actpsy.2022.103506.","DOI":"10.1016\/j.actpsy.2022.103506"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_043","doi-asserted-by":"crossref","unstructured":"J. Pennington, R. Socher, and C. Manning, \u201cGloVe: Global Vectors for Word Representation,\u201d in Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct. 2014, pp. 1532-1543, doi: 10.3115\/v1\/D14-1162.","DOI":"10.3115\/v1\/D14-1162"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_044","unstructured":"SpaCy - pre-trained pipeline for English. Available: https:\/\/spacy.io\/models\/en#en_core_web_lg. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_045","unstructured":"S. Loria, \u201cTextblob Documentation,\u201d Release 0.15, vol. 2, 2018. Available: https:\/\/textblob.readthedocs.io\/en\/dev\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_046","unstructured":"F. Pedregosa et al., \u201cScikit-learn: Machine Learning in Python,\u201d Journal of Machine Learning Research, vol. 12, no. 85, pp. 2825-2830, 2011. Available: http:\/\/jmlr.org\/papers\/v12\/pedregosa11a.html. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_047","unstructured":"\u201cParadise City\u201d by Guns N\u2019 Roses. Available: https:\/\/genius.com\/Guns-n-roses-paradise-city-lyrics"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_048","unstructured":"FastText - text classification tutorial. Available: https:\/\/fasttext.cc\/docs\/en\/supervised-tutorial.html. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_049","doi-asserted-by":"crossref","unstructured":"T. Wolf et al., \u201cTransformers: State-of-the-Art Natural Language Processing,\u201d in Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 
38-45, doi: 10.18653\/v1\/2020.emnlp-demos.6.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_050","unstructured":"XLNet (base-sized model). Available: https:\/\/huggingface.co\/xlnet-base-cased. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_051","unstructured":"Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, \u201cXLNet: Generalized autoregressive pretraining for language understanding,\u201d Advances in Neural Information Processing Systems, vol. 32, 2019, doi: 10.48550\/arXiv.1906.08237"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_052","doi-asserted-by":"crossref","unstructured":"C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, \u201cRethinking the Inception Architecture for Computer Vision,\u201d in Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 2818-2826, doi: 10.1109\/CVPR.2016.308.","DOI":"10.1109\/CVPR.2016.308"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_053","doi-asserted-by":"crossref","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun, \u201cDeep Residual Learning for Image Recognition,\u201d in Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770-778, doi: 10.1109\/CVPR.2016.90.","DOI":"10.1109\/CVPR.2016.90"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_054","unstructured":"K. Simonyan and A. Zisserman, \u201cVery deep convolutional networks for large-scale image recognition,\u201d in Proc. 3rd International Conference on Learning Representations (ICLR 2015), 2015, pp. 1-14, doi: 10.48550\/arXiv.1409.1556"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_055","unstructured":"Librosa library. Available: https:\/\/librosa.org\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_056","unstructured":"F. Chollet et al., Keras, 2015. Available: https:\/\/github.com\/fchollet\/keras. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_057","unstructured":"TensorFlow library. Available: https:\/\/www.tensorflow.org\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_058","doi-asserted-by":"crossref","unstructured":"S. C. Huang, A. Pareek, S. Seyyedi, I. Banerjee, and M. Lungren, \u201cFusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines,\u201d npj Digital Medicine, vol. 3, 12, 2020, doi: 10.1038\/s41746-020-00341-z","DOI":"10.1038\/s41746-020-00341-z"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_059","unstructured":"A. Paszke et al., \u201cPyTorch: An Imperative Style, High-Performance Deep Learning Library,\u201d in Advances in Neural Information Processing Systems, vol. 32, Curran Associates, Inc., 2019, pp. 8024-8035."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_060","unstructured":"Combining two deep learning models. Available: https:\/\/control.com\/technical-articles\/combining-two-deep-learning-models\/. Accessed: November 2024."},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_061","doi-asserted-by":"crossref","unstructured":"Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, A. Hanjalic, and N. Oliver, \u201cTFMAP: Optimizing MAP for top-n context-aware recommendation,\u201d in Proc. 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 
155-164, Portland, Oregon, USA, August 2012, doi: 10.1145\/2348283.2348308.","DOI":"10.1145\/2348283.2348308"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_062","doi-asserted-by":"crossref","unstructured":"K. Pyrovolakis, P. K. Tzouveli, and G. Stamou, \u201cMulti-Modal Song Mood Detection with Deep Learning,\u201d Sensors, vol. 22, 2022, doi: 10.3390\/s22031065","DOI":"10.3390\/s22031065"},{"key":"2026042814175839626_j_jaiscr-2025-0011_ref_063","doi-asserted-by":"crossref","unstructured":"E. N. Shaday, V. J. L. Engel, and H. Heryanto, \u201cApplication of the Bidirectional Long Short-Term Memory Method with Comparison of Word2Vec, GloVe, and FastText for Emotion Classification in Song Lyrics,\u201d Procedia Computer Science, vol. 245, pp. 137-146, 2024, doi: 10.1016\/j.procs.2024.10.237","DOI":"10.1016\/j.procs.2024.10.237"}],"container-title":["Journal of Artificial Intelligence and Soft Computing Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/reference-global.com\/pdf\/10.2478\/jaiscr-2025-0011","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T20:05:38Z","timestamp":1777406738000},"score":1,"resource":{"primary":{"URL":"https:\/\/reference-global.com\/article\/10.2478\/jaiscr-2025-0011"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,18]]},"references-count":63,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3,18]]},"published-print":{"date-parts":[[2025,7,1]]}},"alternative-id":["10.2478\/jaiscr-2025-0011"],"URL":"https:\/\/doi.org\/10.2478\/jaiscr-2025-0011","relation":{},"ISSN":["2449-6499"],"issn-type":[{"value":"2449-6499","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,18]]}}}
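Illustrative note on the method described in the record's abstract: the paper combines a lyrics classifier and an audio classifier "at the decision-making level". Below is a minimal sketch of one common form of such decision-level (late) fusion, assuming each modality emits a probability vector over the four MoodyLyrics4Q emotion quadrants. The class labels, example probability values, and fusion weight `w` are illustrative assumptions, not the authors' implementation.

```python
# Sketch of decision-level (late) fusion of a lyrics model and an audio
# model, each producing class probabilities over four emotion quadrants.
# All names and values are illustrative assumptions, not the paper's code.
import numpy as np

# Four emotion quadrants in the style of Russell's circumplex model,
# as used by MoodyLyrics4Q (labels assumed for illustration).
CLASSES = ["happy", "angry", "sad", "relaxed"]

def late_fusion(p_lyrics: np.ndarray, p_audio: np.ndarray, w: float = 0.5) -> str:
    """Weighted average of the two modalities' class probabilities;
    returns the emotion label with the highest fused score."""
    assert p_lyrics.shape == p_audio.shape == (len(CLASSES),)
    p_fused = w * p_lyrics + (1.0 - w) * p_audio
    return CLASSES[int(np.argmax(p_fused))]

# Example: lyrics lean toward "sad", audio toward "relaxed"; the fusion
# step arbitrates between the two modalities.
p_lyrics = np.array([0.10, 0.05, 0.60, 0.25])
p_audio = np.array([0.15, 0.05, 0.30, 0.50])
print(late_fusion(p_lyrics, p_audio))  # -> "sad" with w = 0.5
```

An equal weight (w = 0.5) treats both modalities as equally reliable; in practice the weight would be tuned on validation data, or the averaging replaced by a small learned fusion layer over the concatenated probabilities.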