{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T10:26:14Z","timestamp":1773051974659,"version":"3.50.1"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,5,6]],"date-time":"2024-05-06T00:00:00Z","timestamp":1714953600000},"content-version":"vor","delay-in-days":40,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Key Research and Development Program of Zhejiang","award":["2022C01018"],"award-info":[{"award-number":["2022C01018"]}]},{"name":"Key Research and Development Program of Zhejiang","award":["2024C01025"],"award-info":[{"award-number":["2024C01025"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62103374"],"award-info":[{"award-number":["62103374"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U21B2001"],"award-info":[{"award-number":["U21B2001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U21B2001"],"award-info":[{"award-number":["U21B2001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61973273"],"award-info":[{"award-number":["61973273"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program","doi-asserted-by":"publisher","award":["2022YFC2804104"],"award-info":[{"award-number":["2022YFC2804104"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Molecular property prediction faces the challenge of limited labeled data as it necessitates a series of specialized experiments to annotate target molecules. Data augmentation techniques can effectively address the issue of data scarcity. In recent years, Mixup has achieved significant success in traditional domains such as image processing. However, its application in molecular property prediction is relatively limited due to the irregular, non-Euclidean nature of graphs and the fact that minor variations in molecular structures can lead to alterations in their properties. To address these challenges, we propose a novel data augmentation method called Mix-Key tailored for molecular property prediction. Mix-Key aims to capture crucial features of molecular graphs, focusing separately on the molecular scaffolds and functional groups. By generating isomers that are relatively invariant to the scaffolds or functional groups, we effectively preserve the core information of molecules. Additionally, to capture interactive information between the scaffolds and functional groups while ensuring correlation between the original and augmented graphs, we introduce molecular fingerprint similarity and node similarity. Through these steps, Mix-Key determines the mixup ratio between the original graph and two isomers, thus generating more informative augmented molecular graphs. We extensively validate our approach on molecular datasets of different scales with several Graph Neural Network architectures. The results demonstrate that Mix-Key consistently outperforms other data augmentation methods in enhancing molecular property prediction on several datasets.<\/jats:p>","DOI":"10.1093\/bib\/bbae165","type":"journal-article","created":{"date-parts":[[2024,5,6]],"date-time":"2024-05-06T05:39:41Z","timestamp":1714973981000},"source":"Crossref","is-referenced-by-count":10,"title":["Mix-Key: graph mixup with key structures for molecular property prediction"],"prefix":"10.1093","volume":"25","author":[{"given":"Tianyi","family":"Jiang","sequence":"first","affiliation":[{"name":"Institute of Cyberspace Security , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Zhejiang University of Technology , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology , 310056, Hangzhou , China"}]},{"given":"Zeyu","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Cyberspace Security , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Zhejiang University of Technology , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology , 310056, Hangzhou , China"}]},{"given":"Wenchao","family":"Yu","sequence":"additional","affiliation":[{"name":"the College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Zhejiang University of Technology , 310014, Hangzhou , China"}]},{"given":"Jinhuan","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Cyberspace Security , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Zhejiang University of Technology , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology , 310056, Hangzhou , China"}]},{"given":"Shanqing","family":"Yu","sequence":"additional","affiliation":[{"name":"Institute of Cyberspace Security , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Zhejiang University of Technology , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology , 310056, Hangzhou , China"}]},{"given":"Xiaoze","family":"Bao","sequence":"additional","affiliation":[{"name":"the College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Zhejiang University of Technology , 310014, Hangzhou , China"}]},{"given":"Bin","family":"Wei","sequence":"additional","affiliation":[{"name":"the College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Zhejiang University of Technology , 310014, Hangzhou , China"}]},{"given":"Qi","family":"Xuan","sequence":"additional","affiliation":[{"name":"Institute of Cyberspace Security , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Zhejiang University of Technology , College of Information Engineering, , 310023, Hangzhou , China"},{"name":"Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology , 310056, Hangzhou , China"}]}],"member":"286","published-online":{"date-parts":[[2024,5,5]]},"reference":[{"issue":"1","key":"2024051207255990900_ref1","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1093\/bib\/bbk007","article-title":"Machine learning in bioinformatics","volume":"7","author":"Larranaga","year":"2006","journal-title":"Brief Bioinform"},{"issue":"7715","key":"2024051207255990900_ref2","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1038\/s41586-018-0337-2","article-title":"Machine learning for molecular and materials science","volume":"559","author":"Butler","year":"2018","journal-title":"Nature"},{"key":"2024051207255990900_ref3","doi-asserted-by":"crossref","first-page":"606668","DOI":"10.3389\/fphar.2020.606668","article-title":"Improvement of prediction performance with conjoint molecular fingerprint in deep learning","volume":"11","author":"Xie","year":"2020","journal-title":"Front Pharmacol"},{"key":"2024051207255990900_ref4","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.ddtec.2020.05.001","article-title":"Molecular property prediction: recent trends in the era of artificial intelligence","volume":"32","author":"Shen","year":"2019","journal-title":"Drug Discov Today Technol"},{"key":"2024051207255990900_ref5","article-title":"Multi-modal representation learning for molecular property prediction: sequence, graph, geometry","author":"Wang","year":"2024"},{"key":"2024051207255990900_ref6","first-page":"1263","article-title":"Neural message passing for quantum chemistry","volume-title":"International Conference on Machine Learning","author":"Gilmer","year":"2017"},{"key":"2024051207255990900_ref7","article-title":"Directional message passing for molecular graphs","volume-title":"International Conference on Learning Representations","author":"Gasteiger","year":"2020"},{"key":"2024051207255990900_ref8","first-page":"2831","article-title":"Communicative representation learning on attributed molecular graphs","volume-title":"IJCAI","author":"Song","year":"2020"},{"issue":"3","key":"2024051207255990900_ref9","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1038\/s42256-022-00447-x","article-title":"Molecular contrastive learning of representations via graph neural networks","volume":"4","author":"Wang","year":"2022","journal-title":"Nat Mach Intell"},{"key":"2024051207255990900_ref10","first-page":"518","article-title":"Dropconn: dropout connection based random gnns for molecular property prediction","volume":"36","author":"Zhang","year":"2024","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2024051207255990900_ref11","doi-asserted-by":"crossref","first-page":"1821","DOI":"10.1109\/TNSE.2023.3332499","article-title":"Null model-based data augmentation for graph classification","volume":"11","author":"Wang","year":"2023","journal-title":"IEEE Trans Netw Sci Eng"},{"key":"2024051207255990900_ref12","article-title":"Data augmentation on graphs: a survey","author":"Zhou","year":"2022"},{"key":"2024051207255990900_ref13","article-title":"Graph data augmentation for graph machine learning: a survey","author":"Zhao","year":"2022"},{"issue":"1","key":"2024051207255990900_ref14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on image data augmentation for deep learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"2024051207255990900_ref15","article-title":"Data augmentation for graph data: recent advancements","author":"Marrium","year":"2022"},{"issue":"4","key":"2024051207255990900_ref16","doi-asserted-by":"crossref","first-page":"3478","DOI":"10.1109\/TNSE.2021.3115104","article-title":"Sampling subgraph network with application to graph classification","volume":"8","author":"Wang","year":"2021","journal-title":"IEEE Trans Netw Sci Eng"},{"issue":"6","key":"2024051207255990900_ref17","doi-asserted-by":"crossref","first-page":"2776","DOI":"10.1109\/TKDE.2019.2957755","article-title":"Subgraph networks with application to structural feature space expansion","volume":"33","author":"Xuan","year":"2019","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2024051207255990900_ref18","article-title":"Dropedge: towards deep graph convolutional networks on node classification","volume-title":"International Conference on Learning Representations","author":"Rong","year":"2019"},{"key":"2024051207255990900_ref19","first-page":"22092","article-title":"Graph random neural networks for semi-supervised learning on graphs","volume":"33","author":"Feng","year":"2020","journal-title":"Adv Neural Inf Process Syst"},{"issue":"1","key":"2024051207255990900_ref20","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1109\/TNSE.2020.3032950","article-title":"M-evolve: structural-mapping-based data augmentation for graph classification","volume":"8","author":"Zhou","year":"2020","journal-title":"IEEE Trans Netw Sci Eng"},{"key":"2024051207255990900_ref21","article-title":"Graphcrop: subgraph cropping for graph classification","author":"Wang","year":"2020"},{"key":"2024051207255990900_ref22","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbad296","article-title":"Self-supervised learning with chemistry-aware fragmentation for effective molecular property prediction","volume":"24","author":"Xie","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024051207255990900_ref23","first-page":"10824","article-title":"Contrastive self-supervised learning for graph classification","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Zeng","year":"2021"},{"key":"2024051207255990900_ref24","article-title":"Subgraph networks based contrastive learning","author":"Wang","year":"2023"},{"key":"2024051207255990900_ref25","article-title":"Mixup: beyond empirical risk minimization","volume-title":"International Conference on Learning Representations","author":"Zhang","year":"2018"},{"key":"2024051207255990900_ref26","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2021.findings-acl.84","article-title":"A survey of data augmentation approaches for nlp","author":"Feng","year":"2021"},{"key":"2024051207255990900_ref27","first-page":"3663","article-title":"Mixup for node and graph classification","volume-title":"Proceedings of the Web Conference","author":"Wang","year":"2021"},{"key":"2024051207255990900_ref28","first-page":"8230","article-title":"G-mixup: Graph data augmentation for graph classification","volume-title":"International Conference on Machine Learning","author":"Han","year":"2022"},{"key":"2024051207255990900_ref29","doi-asserted-by":"crossref","first-page":"1281","DOI":"10.1145\/3485447.3512175","article-title":"Model-agnostic augmentation for accurate graph classification","volume-title":"Proceedings of the ACM Web Conference 2022","author":"Yoo","year":"2022"},{"key":"2024051207255990900_ref30","first-page":"7966","article-title":"Graph transplant: node saliency-guided graph mixup with local structure preservation","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Park","year":"2022"},{"issue":"15","key":"2024051207255990900_ref31","doi-asserted-by":"crossref","first-page":"2524","DOI":"10.1002\/asia.201900282","article-title":"Recent advances in the z\/e isomers of tetraphenylethene derivatives: stereoselective synthesis, aie mechanism, photophysical properties, and application as chemical probes","volume":"14","author":"Xie","year":"2019","journal-title":"Chem. Asian J."},{"issue":"18","key":"2024051207255990900_ref32","doi-asserted-by":"crossref","first-page":"5955","DOI":"10.1021\/jacs.8b01651","article-title":"Controllable self-assembly of macrocycles in water for isolating aromatic hydrocarbon isomers","volume":"140","author":"Guangcheng","year":"2018","journal-title":"J Am Chem Soc"},{"issue":"27","key":"2024051207255990900_ref33","doi-asserted-by":"crossref","first-page":"7586","DOI":"10.1002\/anie.201508818","article-title":"Scaffold diversity synthesis and its application in probe and drug discovery","volume":"55","author":"Garcia-Castro","year":"2016","journal-title":"Angew Chem Int Ed"},{"issue":"36","key":"2024051207255990900_ref34","doi-asserted-by":"crossref","first-page":"9755","DOI":"10.1002\/ange.201302045","article-title":"Discovery of neuritogenic compound classes inspired by natural products","volume":"125","author":"Dakas","year":"2013","journal-title":"Angewandte Chemie"},{"key":"2024051207255990900_ref35","article-title":"AugMix: a simple data processing method to improve robustness and uncertainty","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR)","author":"Hendrycks","year":"2020"},{"issue":"2","key":"2024051207255990900_ref36","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1039\/C7SC02664A","article-title":"Moleculenet: a benchmark for molecular machine learning","volume":"9","author":"Zhenqin","year":"2018","journal-title":"Chem Sci"},{"key":"2024051207255990900_ref37","first-page":"5812","article-title":"Graph contrastive learning with augmentations","volume":"33","author":"You","year":"2020","journal-title":"Adv Neural Inf Process Syst"},{"key":"2024051207255990900_ref38","article-title":"Strategies for pre-training graph neural networks","volume-title":"International Conference on Learning Representations","author":"Hu","year":"2020"},{"key":"2024051207255990900_ref39","first-page":"8892","article-title":"Autogcl: automated graph contrastive learning via learnable view generators","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Yin","year":"2022"},{"key":"2024051207255990900_ref40","article-title":"Pre-training molecular graph representation with 3d geometry","volume-title":"International Conference on Learning Representations","author":"Liu","year":"2022"},{"key":"2024051207255990900_ref41","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1038\/s42256-023-00654-0","article-title":"Knowledge graph-enhanced molecular contrastive learning with functional prompt","volume":"5","author":"Fang","year":"2023","journal-title":"Nat Mach Intell"},{"issue":"3","key":"2024051207255990900_ref42","doi-asserted-by":"crossref","first-page":"1000","DOI":"10.1021\/ci034243x","article-title":"Esol: estimating aqueous solubility directly from molecular structure","volume":"44","author":"Delaney","year":"2004","journal-title":"J Chem Inf Comput Sci"},{"key":"2024051207255990900_ref43","first-page":"1322","article-title":"Alkyl halides & aryl halides","volume":"130","author":"Grignard","year":"1900","journal-title":"Synthesis"},{"issue":"11","key":"2024051207255990900_ref44","doi-asserted-by":"crossref","first-page":"2042","DOI":"10.2118\/9288-PA","article-title":"Applications of water-soluble polymers in the oil field","volume":"33","author":"Chatterji","year":"1981","journal-title":"J Petrol Tech"},{"issue":"11","key":"2024051207255990900_ref45","article-title":"Visualizing data using t-sne","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J Mach Learn Res"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae165\/57527226\/bbae165.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae165\/57527226\/bbae165.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,12]],"date-time":"2024-05-12T07:26:31Z","timestamp":1715498791000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae165\/7665120"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,27]]},"references-count":45,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae165","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5,1]]},"published":{"date-parts":[[2024,3,27]]},"article-number":"bbae165"}}