{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T20:57:52Z","timestamp":1772225872624,"version":"3.50.1"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BioData Mining"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Identification of non-trivial and meaningful patterns in omics data is one of the most important biological tasks. The patterns help to better understand biological systems and interpret experimental outcomes. A well-established method serving to explain such biological data is Gene Set Enrichment Analysis. However, this type of analysis is restricted to a specific type of evaluation. Abstracting from details, the analyst provides a sorted list of genes and ontological annotations of the individual genes; the method outputs a subset of ontological terms enriched in the gene list. Here, in contrary to enrichment analysis, we introduce a new tool\/framework that allows for the induction of more complex patterns of 2-dimensional binary omics data. This extension allows to discover and describe semantically coherent biclusters.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a new rapid method called sem1R that reveals interpretable hidden rules in omics data. 
These rules capture semantic differences between two classes: a target class, given as a collection of positive examples, and a non-target class containing negative examples. The method is inspired by the CN2 rule learner and introduces a new refinement operator that exploits prior knowledge in the form of ontologies. In our work, this knowledge serves to create accurate and interpretable rules. The novel refinement operator uses two reduction procedures, Redundant Generalization and Redundant Non-potential, both of which help to dramatically prune the rule space and consequently speed up the entire process of rule induction in comparison with the traditional refinement operator as presented in CN2.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>The efficiency and effectiveness of the novel refinement operator were tested on three different real gene expression datasets: the Dresden Ovary Dataset, DISC, and m2816. The experiments show that the ontology-based refinement operator speeds up the pattern induction drastically. 
The algorithm is written in C++ and is published as an R package available at<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/github.com\/fmalinka\/sem1r\">http:\/\/github.com\/fmalinka\/sem1r<\/jats:ext-link>.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s13040-020-00219-6","type":"journal-article","created":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T12:03:43Z","timestamp":1598961823000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator"],"prefix":"10.1186","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9681-4716","authenticated-orcid":false,"given":"Franti\u0161ek","family":"Malinka","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Filip","family":"\u017eelezn\u00fd","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ji\u0159\u00ed","family":"Kl\u00e9ma","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,9,1]]},"reference":[{"issue":"4","key":"219_CR1","doi-asserted-by":"publisher","first-page":"398","DOI":"10.1093\/bib\/1.4.398","volume":"1","author":"R Stevens","year":"2000","unstructured":"Stevens R, Goble CA, Bechhofer S. Ontology-based knowledge representation for bioinformatics. Brief Bioinform. 2000; 1(4):398\u2013414.","journal-title":"Brief Bioinform"},{"key":"219_CR2","first-page":"1","volume":"6","author":"T \u00d6sterlund","year":"2017","unstructured":"\u00d6sterlund T, Cvijovic M, Kristiansson E. Integrative analysis of omics data. Syst Biol. 
2017; 6:1.","journal-title":"Syst Biol"},{"key":"219_CR3","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.pbi.2015.12.010","volume":"30","author":"D Rajasundaram","year":"2016","unstructured":"Rajasundaram D, Selbig J. More effort\u2014more results: recent advances in integrative \u2019omics\u2019 data analysis. Curr Opin Plant Biol. 2016; 30:57\u201361.","journal-title":"Curr Opin Plant Biol"},{"issue":"11","key":"219_CR4","doi-asserted-by":"publisher","first-page":"1251","DOI":"10.1038\/nbt1346","volume":"25","author":"B Smith","year":"2007","unstructured":"Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, et al.The obo foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007; 25(11):1251.","journal-title":"Nat Biotechnol"},{"issue":"43","key":"219_CR5","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","volume":"102","author":"A Subramanian","year":"2005","unstructured":"Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al.Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Nat Acad Sci. 2005; 102(43):15545\u201350.","journal-title":"Proc Nat Acad Sci"},{"issue":"1","key":"219_CR6","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.Gene ontology: tool for the unification of biology. Nat Genet. 2000; 25(1):25.","journal-title":"Nat Genet"},{"issue":"D1","key":"219_CR7","first-page":"331","volume":"45","author":"GO Consortium","year":"2016","unstructured":"Consortium GO. Expansion of the gene ontology knowledgebase and resources. Nucleic Acids Res. 
2016; 45(D1):331\u20138.","journal-title":"Nucleic Acids Res"},{"key":"219_CR8","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-75197-7","volume-title":"Foundations of Rule Learning","author":"J. Fuerkranz","year":"2012","unstructured":"Fuerkranz J., Gamberger D., Lavrac N.Foundations of Rule Learning. Heidelberg: Springer; 2012. isbn = 978-3-540-75197-7."},{"key":"219_CR9","first-page":"3","volume":"160","author":"SB Kotsiantis","year":"2007","unstructured":"Kotsiantis SB, Zaharakis I, Pintelas P. Supervised machine learning: A review of classification techniques. Emerg Artif Intell Appl Comput Eng. 2007; 160:3\u201324.","journal-title":"Emerg Artif Intell Appl Comput Eng"},{"issue":"9","key":"219_CR10","doi-asserted-by":"publisher","first-page":"1116","DOI":"10.1093\/bioinformatics\/btg047","volume":"19","author":"TR Hvidsten","year":"2003","unstructured":"Hvidsten TR, L\u00e6greid A, Komorowski J. Learning rule-based models of biological process from gene expression time profiles using gene ontology. Bioinformatics. 2003; 19(9):1116\u201323.","journal-title":"Bioinformatics"},{"key":"219_CR11","volume-title":"Transactions on Computational Systems Biology VI","author":"L Calzone","year":"2006","unstructured":"Calzone L, Chabrier-Rivier N, Fages F, Soliman S. Machine learning biochemical networks from temporal logic properties. In: Transactions on Computational Systems Biology VI. Berlin: Springer: 2006. p. 68\u201394."},{"issue":"2","key":"219_CR12","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1016\/j.ijmedinf.2006.11.006","volume":"77","author":"R Bellazzi","year":"2008","unstructured":"Bellazzi R, Zupan B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform. 
2008; 77(2):81\u201397.","journal-title":"Int J Med Inform"},{"issue":"D1","key":"219_CR13","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1093\/nar\/gkw1092","volume":"45","author":"M Kanehisa","year":"2016","unstructured":"Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. Kegg: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2016; 45(D1):353\u201361.","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"219_CR14","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1093\/nar\/gkv1070","volume":"44","author":"M Kanehisa","year":"2015","unstructured":"Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. Kegg as a reference resource for gene and protein annotation. Nucleic Acids Res. 2015; 44(D1):457\u201362.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"219_CR15","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1093\/nar\/28.1.27","volume":"28","author":"M Kanehisa","year":"2000","unstructured":"Kanehisa M, Goto S. Kegg: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000; 28(1):27\u201330.","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"219_CR16","doi-asserted-by":"publisher","first-page":"940","DOI":"10.1093\/nar\/gkr972","volume":"40","author":"LM Schriml","year":"2011","unstructured":"Schriml LM, Arze C, Nadendla S, Chang Y-WW, Mazaitis M, Felix V, Feng G, Kibbe WA. Disease ontology: a backbone for disease semantic integration. Nucleic Acids Res. 2011; 40(D1):940\u20136.","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"219_CR17","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1093\/nar\/gku1011","volume":"43","author":"WA Kibbe","year":"2014","unstructured":"Kibbe WA, Arze C, Felix V, Mitraka E, Bolton E, Fu G, Mungall CJ, Binder JX, Malone J, Vasant D, et al.Disease ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res. 
2014; 43(D1):1071\u20138.","journal-title":"Nucleic Acids Res"},{"issue":"11","key":"219_CR18","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1145\/219717.219748","volume":"38","author":"GA Miller","year":"1995","unstructured":"Miller GA. Wordnet: a lexical database for english. Commun ACM. 1995; 38(11):39\u201341.","journal-title":"Commun ACM"},{"key":"219_CR19","volume-title":"Proceedings of the 16th International Conference on World Wide Web","author":"FM Suchanek","year":"2007","unstructured":"Suchanek FM, Kasneci G, Weikum G. Yago: a core of semantic knowledge. In: Proceedings of the 16th International Conference on World Wide Web. New York: ACM: 2007. p. 697\u2013706."},{"issue":"4","key":"219_CR20","first-page":"261","volume":"3","author":"P Clark","year":"1989","unstructured":"Clark P, Niblett T. The cn2 induction algorithm. Mach Learn. 1989; 3(4):261\u201383.","journal-title":"Mach Learn"},{"key":"219_CR21","volume-title":"Machine Learning Proceedings 1995","author":"WW Cohen","year":"1995","unstructured":"Cohen WW. Fast effective rule induction. In: Machine Learning Proceedings 1995. San Francisco: Morgan Kaufmann: 1995. p. 115\u201323."},{"issue":"7","key":"219_CR22","first-page":"41","volume":"18","author":"J Kl\u00e9ma","year":"2017","unstructured":"Kl\u00e9ma J, Malinka F, \u017eelezn\u00fd F. Semantic biclustering for finding local, interpretable and predictive expression patterns. BMC Genomics. 2017; 18(7):41. BioMed Central.","journal-title":"BMC Genomics"},{"key":"219_CR23","doi-asserted-by":"publisher","unstructured":"Clark P, Boswell R. Rule induction with cn2: Some recent improvements. In: European Working Session on Learning. Springer: 1991. p. 151\u201363. https:\/\/doi.org\/10.1007\/bfb0017011.","DOI":"10.1007\/bfb0017011"},{"issue":"2","key":"219_CR24","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1023\/A:1008894516817","volume":"9","author":"JH Friedman","year":"1999","unstructured":"Friedman JH, Fisher NI. 
Bump hunting in high-dimensional data. Stat Comput. 1999; 9(2):123\u201343.","journal-title":"Stat Comput"},{"key":"219_CR25","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-01574-8","volume-title":"Statistical Relational Artificial Intelligence: Logic, Probability, and Computation","author":"L De Raedt","year":"2016","unstructured":"De Raedt L. Statistical Relational Artificial Intelligence: Logic, Probability, and Computation. San Rafael: Morgan & Claypool Publishers; 2016."},{"key":"219_CR26","doi-asserted-by":"publisher","unstructured":"\u017e\u00e1kov\u00e1 M, \u017eelezn\u00fd F. Exploiting term, predicate, and feature taxonomies in propositionalization and propositional rule learning. In: Machine Learning: ECML 2007. Springer: 2007. p. 798\u2013805. https:\/\/doi.org\/10.1007\/978-3-540-74958-5_82.","DOI":"10.1007\/978-3-540-74958-5_82"},{"key":"219_CR27","volume-title":"International Conference on Inductive Logic Programming","author":"M Svato\u0161","year":"2017","unstructured":"Svato\u0161 M, \u0160ourek G, \u017eelezny\u0300 F, Schockaert S, Ku\u017eelka O. Pruning hypothesis spaces using learned domain theories. In: International Conference on Inductive Logic Programming. Cham: Springer: 2017. p. 152\u2013168."},{"key":"219_CR28","volume-title":"Artificial Intelligence: A Modern Approach (2nd Edition)","author":"SJ Russell","year":"2002","unstructured":"Russell SJ, Norvig P. Artificial Intelligence: A Modern Approach (2nd Edition). Upper Saddle River: Prentice Hall; 2002."},{"key":"219_CR29","volume-title":"Proceedings of the 5th International Symposium on Information Processing (FCIP-69)","author":"RS Michalski","year":"1969","unstructured":"Michalski RS. On the quasi-minimal solution of the general covering problem. In: Proceedings of the 5th International Symposium on Information Processing (FCIP-69). Bled: Vol. A3 (Switching Circuits): 1969. p. 
125\u201328."},{"key":"219_CR30","volume-title":"Asian Conference on Computer Vision","author":"J Borovec","year":"2016","unstructured":"Borovec J, Kybic J. Binary pattern dictionary learning for gene expression representation in drosophila imaginal discs. In: Asian Conference on Computer Vision. Cham: Springer: 2016. p. 555\u201369."},{"issue":"1","key":"219_CR31","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1186\/2041-1480-4-32","volume":"4","author":"M Costa","year":"2013","unstructured":"Costa M, Reeve S, Grumbling G, Osumi-Sutherland D. The drosophila anatomy ontology. J Biomed Semant. 2013; 4(1):32.","journal-title":"J Biomed Semant"},{"key":"219_CR32","doi-asserted-by":"publisher","unstructured":"Jambor H, Surendranath V, Kalinka AT, Mejstrik P, Saalfeld S, Tomancak P. Systematic imaging reveals features and changing localization of mrnas in drosophila development. Elife. 2015; 4. https:\/\/doi.org\/10.7554\/elife.05003.","DOI":"10.7554\/elife.05003"},{"key":"219_CR33","unstructured":"Dresden Ovary Table. http:\/\/tomancak-srv1.mpi-cbg.de\/DOT\/main. Accessed 15 Feb 2016."},{"issue":"D1","key":"219_CR34","doi-asserted-by":"publisher","first-page":"746","DOI":"10.1093\/nar\/gkv1045","volume":"44","author":"R Petryszak","year":"2015","unstructured":"Petryszak R, Keays M, Tang YA, Fonseca NA, Barrera E, Burdett T, F\u00fcllgrabe A, Fuentes AM-P, Jupp S, Koskinen S, et al.Expression atlas update\u2014an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 2015; 44(D1):746\u201352.","journal-title":"Nucleic Acids Res"},{"issue":"6114","key":"219_CR35","doi-asserted-by":"publisher","first-page":"1593","DOI":"10.1126\/science.1228186","volume":"338","author":"J Merkin","year":"2012","unstructured":"Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science. 
2012; 338(6114):1593\u20139.","journal-title":"Science"},{"issue":"8","key":"219_CR36","doi-asserted-by":"publisher","first-page":"1112","DOI":"10.1093\/bioinformatics\/btq099","volume":"26","author":"J Malone","year":"2010","unstructured":"Malone J, Holloway E, Adamusiak T, Kapushesky M, Zheng J, Kolesnikov N, Zhukova A, Brazma A, Parkinson H. Modeling sample variables with an experimental factor ontology. Bioinformatics. 2010; 26(8):1112\u20138.","journal-title":"Bioinformatics"}],"container-title":["BioData Mining"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-020-00219-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13040-020-00219-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-020-00219-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,15]],"date-time":"2022-11-15T18:03:33Z","timestamp":1668535413000},"score":1,"resource":{"primary":{"URL":"https:\/\/biodatamining.biomedcentral.com\/articles\/10.1186\/s13040-020-00219-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,1]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["219"],"URL":"https:\/\/doi.org\/10.1186\/s13040-020-00219-6","relation":{},"ISSN":["1756-0381"],"issn-type":[{"value":"1756-0381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,1]]},"assertion":[{"value":"6 April 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 July 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"1 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"13"}}