{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,5]],"date-time":"2024-08-05T00:58:04Z","timestamp":1722819484658},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The lack of reliable, comprehensive gold standards complicates the development of many bioinformatics tools, particularly for the analysis of expression data and biological networks. Simulation approaches can provide provisional gold standards, such as regulatory networks, for the assessment of network inference methods. However, this just defers the problem, as it is difficult to assess how closely simulators emulate the properties of real data.<\/jats:p>\n               <jats:p>Results: In analogy to Turing\u2019s test discriminating humans and computers based on responses to questions, we systematically compare real and artificial systems based on their gene expression output. Different expression data analysis techniques such as clustering are applied to both types of datasets. We define and extract distributions of properties from the results, for instance, distributions of cluster quality measures or transcription factor activity patterns. Distributions of properties are represented as histograms to enable the comparison of artificial and real datasets. We examine three frequently used simulators that generate expression data from parameterized regulatory networks. 
We identify features distinguishing real from artificial datasets that suggest how simulators could be adapted to better emulate real datasets and, thus, become more suitable for the evaluation of data analysis tools.<\/jats:p>\n               <jats:p>Availability: See http:\/\/www2.bio.ifi.lmu.de\/~kueffner\/attfad\/ and the supplement for precomputed analyses; other compendia can be analyzed via the CRAN package attfad. The full datasets can be obtained from http:\/\/www2.bio.ifi.lmu.de\/~kueffner\/attfad\/data.tar.gz.<\/jats:p>\n               <jats:p>Contact: \u00a0robert.kueffner@bio.ifi.lmu.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt438","type":"journal-article","created":{"date-parts":[[2013,8,17]],"date-time":"2013-08-17T00:29:16Z","timestamp":1376699356000},"page":"2603-2609","source":"Crossref","is-referenced-by-count":9,"title":["A Turing test for artificial expression data"],"prefix":"10.1093","volume":"29","author":[{"given":"Robert","family":"Maier","sequence":"first","affiliation":[{"name":"Department of Informatics, Ludwig-Maximilians Universit\u00e4t, 80333 M\u00fcnchen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ralf","family":"Zimmer","sequence":"additional","affiliation":[{"name":"Department of Informatics, Ludwig-Maximilians Universit\u00e4t, 80333 M\u00fcnchen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"K\u00fcffner","sequence":"additional","affiliation":[{"name":"Department of Informatics, Ludwig-Maximilians Universit\u00e4t, 80333 M\u00fcnchen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2013,8,16]]},"reference":[{"key":"2023012810465213000_btt438-B1","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1186\/1471-2105-10-47","article-title":"A general 
modular framework for gene set enrichment analysis","volume":"10","author":"Ackermann","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012810465213000_btt438-B2","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1186\/1471-2105-7-205","article-title":"SIMAGE: simulation of DNA-microarray gene expression data","volume":"7","author":"Albers","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012810465213000_btt438-B3","doi-asserted-by":"crossref","first-page":"1165","DOI":"10.1214\/aos\/1013699998","article-title":"The control of the false discovery rate in multiple testing under dependency","volume":"29","author":"Benjamini","year":"2001","journal-title":"Ann. Stat."},{"key":"2023012810465213000_btt438-B4","first-page":"711","article-title":"Unsupervised knowledge discovery in medical databases using relevance networks","author":"Butte","year":"1999","journal-title":"Proc. AMIA Symp."},{"key":"2023012810465213000_btt438-B5","first-page":"98","article-title":"Global functional profiling of gene expression","volume":"81","author":"Draghici","year":"2003","journal-title":"Genomics"},{"key":"2023012810465213000_btt438-B6","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1214\/07-AOAS101","article-title":"On testing the significance of sets of genes","volume":"1","author":"Efron","year":"2007","journal-title":"Ann. Appl. Stat."},{"key":"2023012810465213000_btt438-B7","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012810465213000_btt438-B8","doi-asserted-by":"crossref","first-page":"D866","DOI":"10.1093\/nar\/gkm815","article-title":"Many Microbe Microarrays Database: uniformly normalized Affymetrix compendia with structured experimental metadata","volume":"36","author":"Faith","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012810465213000_btt438-B9","doi-asserted-by":"crossref","first-page":"291","DOI":"10.4137\/BBI.S441","article-title":"Normalization and gene p-value estimation: issues in microarray data processing","volume":"2","author":"Fundel","year":"2008","journal-title":"Bioinform. Biol. Insights"},{"key":"2023012810465213000_btt438-B10","doi-asserted-by":"crossref","first-page":"D98","DOI":"10.1093\/nar\/gkq1110","article-title":"RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units)","volume":"39","author":"Gama-Castro","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012810465213000_btt438-B11","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1093\/bioinformatics\/btp068","article-title":"Benchmarking regulatory network reconstruction with GRENDEL","volume":"25","author":"Haynes","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B12","doi-asserted-by":"crossref","first-page":"035013","DOI":"10.1088\/1478-3975\/8\/3\/035013","article-title":"Analysis and simulation of gene expression profiles in pure and mixed cell populations","volume":"8","author":"Hebenstreit","year":"2011","journal-title":"Phys. 
Biol."},{"key":"2023012810465213000_btt438-B13","doi-asserted-by":"crossref","first-page":"e12807","DOI":"10.1371\/journal.pone.0012807","article-title":"Petri Nets with Fuzzy Logic (PNFL): reverse engineering and parametrization","volume":"5","author":"K\u00fcffner","year":"2010","journal-title":"PLoS One"},{"key":"2023012810465213000_btt438-B14","doi-asserted-by":"crossref","first-page":"1376","DOI":"10.1093\/bioinformatics\/bts143","article-title":"Inferring gene regulatory networks by ANOVA","volume":"28","author":"K\u00fcffner","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B15","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-7-113","article-title":"An improved map of conserved regulatory sites for Saccharomyces cerevisiae","volume":"7","author":"MacIsaac","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012810465213000_btt438-B16","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1214\/aoms\/1177730491","article-title":"On a test of whether one of two random variables is stochastically larger than the other","volume":"18","author":"Mann","year":"1947","journal-title":"Ann. Math. Stat."},{"key":"2023012810465213000_btt438-B17","doi-asserted-by":"crossref","first-page":"796","DOI":"10.1038\/nmeth.2016","article-title":"Wisdom of crowds for robust gene network inference","volume":"9","author":"Marbach","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023012810465213000_btt438-B18","doi-asserted-by":"crossref","first-page":"1480","DOI":"10.1093\/bioinformatics\/bts164","article-title":"Rigorous assessment of gene set enrichment tests","volume":"28","author":"Naeem","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B19","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1093\/bib\/bbn001","article-title":"Gene-set approach for expression pattern analysis","volume":"9","author":"Nam","year":"2008","journal-title":"Brief Bioinform."},{"key":"2023012810465213000_btt438-B20","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/j.ygeno.2010.10.003","article-title":"A comprehensive assessment of methods for de-novo reverse-engineering of genome-scale regulatory networks","volume":"97","author":"Narendra","year":"2011","journal-title":"Genomics"},{"key":"2023012810465213000_btt438-B21","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1038\/nbt0106-51","article-title":"Inference in Bayesian networks","volume":"24","author":"Needham","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023012810465213000_btt438-B22","doi-asserted-by":"crossref","first-page":"1650","DOI":"10.1016\/j.csda.2008.03.023","article-title":"Distribution modeling and simulation of gene expression data","volume":"53","author":"Parrish","year":"2009","journal-title":"Comput. Stat. 
Data Anal."},{"key":"2023012810465213000_btt438-B23","doi-asserted-by":"crossref","first-page":"2459","DOI":"10.1093\/bioinformatics\/btr407","article-title":"Simulating systems genetics data with SysGenSIM","volume":"27","author":"Pinna","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B24","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1093\/bioinformatics\/btp038","article-title":"Papers on normalization, variable selection, classification or clustering of microarray data","volume":"25","author":"Rocke","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B25","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math."},{"key":"2023012810465213000_btt438-B26","doi-asserted-by":"crossref","first-page":"2263","DOI":"10.1093\/bioinformatics\/btr373","article-title":"GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods","volume":"27","author":"Schaffter","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012810465213000_btt438-B27","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1186\/1471-2105-7-43","article-title":"SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms","volume":"7","author":"Van den Bulcke","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012810465213000_btt438-B28","doi-asserted-by":"crossref","first-page":"150","DOI":"10.1093\/bib\/bbr029","article-title":"Learning transcriptional regulation on a genome scale: a theoretical analysis based on gene expression data","volume":"13","author":"Wu","year":"2012","journal-title":"Brief 
Bioinform."},{"key":"2023012810465213000_btt438-B29","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1109\/TITB.2004.824724","article-title":"Cluster analysis of gene expression data based on self-splitting and merging competitive learning","volume":"8","author":"Wu","year":"2004","journal-title":"IEEE Trans. Inf. Technol. Biomed."},{"key":"2023012810465213000_btt438-B30","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1093\/bioinformatics\/17.4.309","article-title":"Validating clustering for gene expression data","volume":"17","author":"Yeung","year":"2001","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/20\/2603\/48890760\/bioinformatics_29_20_2603.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/20\/2603\/48890760\/bioinformatics_29_20_2603.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T12:37:56Z","timestamp":1674909476000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/29\/20\/2603\/277311"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8,16]]},"references-count":30,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2013,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt438","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2013,10,15]]},"published":{"date-parts":[[2013,8,16]]}}}