{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T03:12:03Z","timestamp":1773976323981,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2017,6,6]],"date-time":"2017-06-06T00:00:00Z","timestamp":1496707200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100007349","name":"Israeli National Nanotechnology Initiative","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007349","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2017,12]]},"DOI":"10.1186\/s13321-017-0224-0","type":"journal-article","created":{"date-parts":[[2017,6,6]],"date-time":"2017-06-06T08:43:39Z","timestamp":1496738619000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells"],"prefix":"10.1186","volume":"9","author":[{"given":"Omer","family":"Kaspi","sequence":"first","affiliation":[]},{"given":"Abraham","family":"Yosipof","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3176-8982","authenticated-orcid":false,"given":"Hanoch","family":"Senderowitz","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2017,6,6]]},"reference":[{"key":"224_CR1","doi-asserted-by":"publisher","first-page":"011002","DOI":"10.1063\/1.4812323","volume":"1","author":"A Jain","year":"2013","unstructured":"Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, Persson KA (2013) Commentary: The materials project: a 
materials genome approach to accelerating materials innovation. APL Mater 1:011002","journal-title":"APL Mater"},{"key":"224_CR2","doi-asserted-by":"publisher","first-page":"10497","DOI":"10.1039\/C6DT01501H","volume":"45","author":"K Takahashi","year":"2016","unstructured":"Takahashi K, Tanaka Y (2016) Materials informatics: a journey towards material design and synthesis. Dalton Trans 45:10497\u201310499","journal-title":"Dalton Trans"},{"key":"224_CR3","doi-asserted-by":"publisher","first-page":"205901","DOI":"10.1103\/PhysRevLett.115.205901","volume":"115","author":"A Seko","year":"2015","unstructured":"Seko A, Togo A, Hayashi H, Tsuda K, Chaput L, Tanaka I (2015) Prediction of low-thermal-conductivity compounds with first-principles anharmonic lattice-dynamics calculations and bayesian optimization. Phys Rev Lett 115:205901","journal-title":"Phys Rev Lett"},{"key":"224_CR4","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1016\/S1369-7021(05)71123-8","volume":"8","author":"K Rajan","year":"2005","unstructured":"Rajan K (2005) Materials informatics. Mater Today 8:38\u201345","journal-title":"Mater Today"},{"key":"224_CR5","doi-asserted-by":"publisher","first-page":"735","DOI":"10.1021\/cm503507h","volume":"27","author":"O Isayev","year":"2015","unstructured":"Isayev O, Fourches D, Muratov EN, Oses C, Rasch K, Tropsha A, Curtarolo S (2015) Materials cartography: representing and mining materials space using structural and electronic fingerprints. Chem Mater 27:735\u2013743","journal-title":"Chem Mater"},{"key":"224_CR6","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1016\/j.commatsci.2012.02.002","volume":"58","author":"S Curtarolo","year":"2012","unstructured":"Curtarolo S, Setyawan W, Wang S, Xue J, Yang K, Taylor RH, Nelson LJ, Hart GLW, Sanvito S, Buongiorno-Nardelli M, Mingo N, Levy O (2012) AFLOWLIB.ORG: a distributed materials properties repository from high-throughput ab initio calculations. 
Comput Mater Sci 58:227\u2013235","journal-title":"Comput Mater Sci"},{"key":"224_CR7","doi-asserted-by":"publisher","first-page":"3117","DOI":"10.1111\/j.1151-2916.1998.tb02746.x","volume":"81","author":"T Kosugi","year":"1998","unstructured":"Kosugi T, Kaneko S (1998) Novel spray-pyrolysis deposition of cuprous oxide thin films. J Am Ceram Soc 81:3117\u20133124","journal-title":"J Am Ceram Soc"},{"key":"224_CR8","volume-title":"Pearson\u2019s crystal data\u00ae: crystal structure database for inorganic compounds","author":"P Villars","year":"2007","unstructured":"Villars P (2007) Pearson\u2019s crystal data\u00ae: crystal structure database for inorganic compounds. ASM International, Materials Park"},{"key":"224_CR9","unstructured":"https:\/\/www.matbase.com\/. Accessed 19 April 2017"},{"key":"224_CR10","unstructured":"https:\/\/www.matdat.com\/. Accessed 19 April 2017"},{"key":"224_CR11","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1021\/ci100176x","volume":"50","author":"D Fourches","year":"2010","unstructured":"Fourches D, Muratov E, Tropsha A (2010) Trust, but verify: on the importance of chemical structure curation in cheminformatics and QSAR modeling research. J Chem Inf Model 50:1189\u20131204","journal-title":"J Chem Inf Model"},{"key":"224_CR12","first-page":"760","volume-title":"Chemical biology","author":"M Olah","year":"2008","unstructured":"Olah M, Rad R, Ostopovici L, Bora A, Hadaruga N, Hadaruga D, Moldovan R, Fulias A, Mracec M, Oprea TI (2008) WOMBAT and WOMBAT-PK: bioactivity databases for lead and drug discovery. In: Schreiber SL, Kapoor TM, Wess G (eds) Chemical biology. 
Wiley-VCH Verlag GmbH, New York, pp 760\u2013786"},{"key":"224_CR13","first-page":"223","volume-title":"Chemoinformatics in drug discovery","author":"M Olah","year":"2004","unstructured":"Olah M, Mracec M, Ostopovici L, Rad R, Bora A, Hadaruga N, Olah I, Banda M, Simon Z, Mracec M, Oprea TI (2004) WOMBAT: world of molecular bioactivity. In: Oprea TI (ed) Chemoinformatics in drug discovery. Wiley-VCH, New York, pp 223\u2013239"},{"key":"224_CR14","doi-asserted-by":"publisher","first-page":"1337","DOI":"10.1002\/qsar.200810084","volume":"27","author":"D Young","year":"2008","unstructured":"Young D, Martin T, Venkatapathy R, Harten P (2008) Are the chemical structures in your QSAR correct? QSAR Comb Sci 27:1337\u20131345","journal-title":"QSAR Comb Sci"},{"key":"224_CR15","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1557\/mrs.2016.93","volume":"41","author":"J Hill","year":"2016","unstructured":"Hill J, Mulholland G, Persson K, Seshadri R, Wolverton C, Meredig B (2016) Materials science with large-scale data and informatics: unlocking new opportunities. MRS Bull 41:399\u2013409","journal-title":"MRS Bull"},{"key":"224_CR16","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1186\/s13321-015-0108-0","volume":"7","author":"Y Gilad","year":"2015","unstructured":"Gilad Y, Nadassy K, Senderowitz H (2015) A reliable computational workflow for the selection of optimal screening libraries. J Cheminform 7:61","journal-title":"J Cheminform"},{"key":"224_CR17","volume-title":"Applied multivariate statistical analysis","author":"RA Johnson","year":"1992","unstructured":"Johnson RA (1992) Applied multivariate statistical analysis. 
Prentice Hall International, Incorporated, Upper Saddle River"},{"key":"224_CR18","doi-asserted-by":"publisher","first-page":"054110","DOI":"10.1103\/PhysRevB.95.054110","volume":"95","author":"K Takahashi","year":"2017","unstructured":"Takahashi K, Tanaka Y (2017) Unveiling descriptors for predicting the bulk modulus of amorphous carbon. Phys Rev B 95:054110","journal-title":"Phys Rev B"},{"key":"224_CR19","doi-asserted-by":"publisher","first-page":"014101","DOI":"10.1103\/PhysRevB.95.014101","volume":"95","author":"K Takahashi","year":"2017","unstructured":"Takahashi K, Tanaka Y (2017) Role of descriptors in predicting the dissolution energy of embedded oxides and the bulk modulus of oxide-embedded iron. Phys Rev B 95:014101","journal-title":"Phys Rev B"},{"key":"224_CR20","doi-asserted-by":"publisher","first-page":"476","DOI":"10.1002\/minf.201000061","volume":"29","author":"A Tropsha","year":"2010","unstructured":"Tropsha A (2010) Best practices for QSAR model development, validation, and exploitation. Mol Inform 29:476\u2013488","journal-title":"Mol Inform"},{"key":"224_CR21","doi-asserted-by":"publisher","first-page":"4977","DOI":"10.1021\/jm4004285","volume":"57","author":"A Cherkasov","year":"2014","unstructured":"Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Kuz\u2019min VE, Cramer R, Benigni R, Yang C, Rathman J, Terfloth L, Gasteiger J, Richard A, Tropsha A (2014) QSAR modeling: where have you been? Where are you going to? J Med Chem 57:4977\u20135010","journal-title":"J Med Chem"},{"key":"224_CR22","doi-asserted-by":"publisher","first-page":"5703","DOI":"10.1021\/nn1013484","volume":"4","author":"D Fourches","year":"2010","unstructured":"Fourches D, Pu D, Tassa C, Weissleder R, Shaw SY, Mumper RJ, Tropsha A (2010) Quantitative nanostructure\u2013activity relationship modeling. 
ACS Nano 4:5703\u20135712","journal-title":"ACS Nano"},{"key":"224_CR23","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/j.chemosphere.2005.07.002","volume":"63","author":"E Furusj\u00f6","year":"2006","unstructured":"Furusj\u00f6 E, Svenson A, Rahmberg M, Andersson M (2006) The importance of outlier detection and training set selection for reliable environmental QSAR predictions. Chemosphere 63:99\u2013108","journal-title":"Chemosphere"},{"key":"224_CR24","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1002\/jcc.23803","volume":"36","author":"A Yosipof","year":"2015","unstructured":"Yosipof A, Senderowitz H (2015) k-Nearest neighbors optimization-based outlier removal. J Comput Chem 36:493\u2013506","journal-title":"J Comput Chem"},{"key":"224_CR25","doi-asserted-by":"publisher","first-page":"2507","DOI":"10.1021\/acs.jcim.5b00515","volume":"55","author":"OE Nahum","year":"2015","unstructured":"Nahum OE, Yosipof A, Senderowitz H (2015) A multi-objective genetic algorithm for outlier removal. J Chem Inf Model 55:2507\u20132518","journal-title":"J Chem Inf Model"},{"key":"224_CR26","doi-asserted-by":"crossref","unstructured":"Hautamaki V, Karkkainen I, Franti P (2004) Outlier detection using k-nearest neighbour graph. In: Proceedings of the pattern recognition, 17th international conference (ICPR\u201904) IEEE Computer Society Washington, DC","DOI":"10.1109\/ICPR.2004.1334558"},{"key":"224_CR27","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1145\/335191.335437","volume":"29","author":"S Ramaswamy","year":"2000","unstructured":"Ramaswamy S, Rastogi R, Shim K (2000) Efficient algorithms for mining outliers from large data sets. SIGMOD Rec. 29:427\u2013438","journal-title":"SIGMOD Rec."},{"key":"224_CR28","unstructured":"Knorr E, Ng R (1998) Algorithms for mining distance-based outliers in large datasets. In: Proceedings of the 24th international conference on very large data bases, VLDB. 
Morgan Kaufmann Publishers Inc., New York"},{"key":"224_CR29","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1007\/s10910-009-9585-6","volume":"47","author":"L Tarko","year":"2010","unstructured":"Tarko L (2010) Monte Carlo method for identification of outlier molecules in QSAR studies. J Math Chem 47:174\u2013190","journal-title":"J Math Chem"},{"key":"224_CR30","doi-asserted-by":"publisher","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","volume":"23","author":"Y Saeys","year":"2007","unstructured":"Saeys Y, Inza I, Larra\u00f1aga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23:2507\u20132517","journal-title":"Bioinformatics"},{"key":"224_CR31","doi-asserted-by":"publisher","first-page":"4791","DOI":"10.3390\/molecules17054791","volume":"17","author":"F Sahigara","year":"2012","unstructured":"Sahigara F, Mansouri K, Ballabio D, Mauri A, Consonni V, Todeschini R (2012) Comparison of different approaches to define the applicability domain of QSAR models. Molecules 17:4791","journal-title":"Molecules"},{"key":"224_CR32","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1002\/qsar.200390007","volume":"22","author":"A Tropsha","year":"2003","unstructured":"Tropsha A, Gramatica P, Gombar VK (2003) The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb Sci 22:69\u201377","journal-title":"QSAR Comb Sci"},{"key":"224_CR33","doi-asserted-by":"publisher","first-page":"1361","DOI":"10.1289\/ehp.5758","volume":"111","author":"L Eriksson","year":"2003","unstructured":"Eriksson L, Jaworska J, Worth AP, Cronin MTD, McDowell RM, Gramatica P (2003) Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. 
Environ Health Perspect 111:1361\u20131375","journal-title":"Environ Health Perspect"},{"key":"224_CR34","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1145\/358669.358692","volume":"24","author":"MA Fischler","year":"1981","unstructured":"Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24:381\u2013395","journal-title":"Commun ACM"},{"key":"224_CR35","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1007\/3-540-45053-X_52","volume-title":"Computer vision\u2014ECCV 2000: 6th European conference on computer vision, Dublin, Ireland, June 26\u2013July 1, 2000 proceedings, Part II","author":"PHS Torr","year":"2000","unstructured":"Torr PHS, Davidson C (2000) IMPSAC: synthesis of importance sampling and random sample consensus. In: Vernon D (ed) Computer vision\u2014ECCV 2000: 6th European conference on computer vision, Dublin, Ireland, June 26\u2013July 1, 2000 proceedings, Part II. Springer, Berlin, pp 819\u2013833"},{"key":"224_CR36","doi-asserted-by":"publisher","first-page":"622","DOI":"10.1002\/minf.201600050","volume":"35","author":"A Yosipof","year":"2016","unstructured":"Yosipof A, Kaspi O, Majhi K, Senderowitz H (2016) Visualization based data mining for comparison between two solar cell libraries. Mol Inform 35:622\u2013628","journal-title":"Mol Inform"},{"key":"224_CR37","doi-asserted-by":"publisher","first-page":"3755","DOI":"10.1021\/jz3017039","volume":"3","author":"S R\u00fchle","year":"2012","unstructured":"R\u00fchle S, Anderson AY, Barad H-N, Kupfer B, Bouhadana Y, Rosh-Hodesh E, Zaban A (2012) All-oxide photovoltaics. 
J Phys Chem Lett 3:3755\u20133764","journal-title":"J Phys Chem Lett"},{"key":"224_CR38","doi-asserted-by":"publisher","first-page":"568","DOI":"10.1002\/minf.201600047","volume":"35","author":"A Yosipof","year":"2016","unstructured":"Yosipof A, Shimanovich K, Senderowitz H (2016) Materials informatics: statistical modeling in material science. Mol Inform 35:568\u2013579","journal-title":"Mol Inform"},{"key":"224_CR39","doi-asserted-by":"publisher","first-page":"4849","DOI":"10.1039\/c1ee02056k","volume":"4","author":"R Olivares-Amaya","year":"2011","unstructured":"Olivares-Amaya R, Amador-Bedolla C, Hachmann J, Atahan-Evrenk S, Sanchez-Carrera RS, Vogt L, Aspuru-Guzik A (2011) Accelerated computational discovery of high-performance materials for organic photovoltaics by means of cheminformatics. Energy Environ Sci 4:4849\u20134861","journal-title":"Energy Environ Sci"},{"key":"224_CR40","doi-asserted-by":"publisher","first-page":"23865","DOI":"10.1039\/C5RA01906K","volume":"5","author":"S Tortorella","year":"2015","unstructured":"Tortorella S, Marotta G, Cruciani G, De Angelis F (2015) Quantitative structure-property relationship modeling of ruthenium sensitizers for solar cells applications: novel tools for designing promising candidates. RSC Adv 5:23865\u201323873","journal-title":"RSC Adv"},{"key":"224_CR41","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1002\/minf.201400174","volume":"34","author":"A Yosipof","year":"2015","unstructured":"Yosipof A, Nahum OE, Anderson AY, Barad H-N, Zaban A, Senderowitz H (2015) Data mining and machine learning tools for combinatorial material science of all-oxide photovoltaic cells. 
Mol Inform 34:367\u2013379","journal-title":"Mol Inform"},{"key":"224_CR42","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1021\/co3001583","volume":"16","author":"AY Anderson","year":"2014","unstructured":"Anderson AY, Bouhadana Y, Barad H-N, Kupfer B, Rosh-Hodesh E, Aviv H, Tischler YR, R\u00fchle S, Zaban A (2014) Quantum Efficiency and bandgap analysis for combinatorial photovoltaics: sorting activity of Cu\u2013O compounds in all-oxide device libraries. ACS Comb Sci 16:53\u201365","journal-title":"ACS Comb Sci"},{"key":"224_CR43","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1016\/j.solmat.2014.10.005","volume":"132","author":"M Pavan","year":"2015","unstructured":"Pavan M, R\u00fchle S, Ginsburg A, Keller DA, Barad H-N, Sberna PM, Nunes D, Martins R, Anderson AY, Zaban A, Fortunato E (2015) TiO2\/Cu2O all-oxide heterojunction solar cells produced by spray pyrolysis. Sol Energy Mater Sol Cells 132:549\u2013556","journal-title":"Sol Energy Mater Sol Cells"},{"key":"224_CR44","doi-asserted-by":"publisher","first-page":"1567","DOI":"10.1021\/ci400715n","volume":"54","author":"A Yosipof","year":"2014","unstructured":"Yosipof A, Senderowitz H (2014) Optimization of molecular representativeness. J Chem Inf Model 54:1567\u20131577","journal-title":"J Chem Inf Model"},{"key":"224_CR45","doi-asserted-by":"publisher","unstructured":"Majhi K, Bertoluzzi L, Rietwyk KJ, Ginsburg A, Keller DA, Lopez-Varo P, Anderson AY, Bisquert J, Zaban A (2016) Thin-film photovoltaics: combinatorial investigation and modelling of MoO3 hole-selective contact in TiO2|Co3O4|MoO3 all-oxide solar cells. Adv Mater Interfaces 3. 
doi:10.1002\/admi.201670005","DOI":"10.1002\/admi.201670005"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-017-0224-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s13321-017-0224-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-017-0224-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,6,24]],"date-time":"2019-06-24T09:44:34Z","timestamp":1561369474000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-017-0224-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,6,6]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,12]]}},"alternative-id":["224"],"URL":"https:\/\/doi.org\/10.1186\/s13321-017-0224-0","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,6,6]]},"assertion":[{"value":"11 February 2017","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 May 2017","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2017","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"34"}}