{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T13:52:07Z","timestamp":1766065927572,"version":"3.41.0"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2009,12,1]],"date-time":"2009-12-01T00:00:00Z","timestamp":1259625600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:p>Practitioners and researchers regularly refer to error rates or accuracy percentages of databases. The former is the number of cells in error divided by the total number of cells; the latter is the number of correct cells divided by the total number of cells. However, databases may have similar error rates (or accuracy percentages) but differ drastically in the complexity of their accuracy problems. A simple percent does not provide information as to whether the errors are systematic or randomly distributed throughout the database. We expand the accuracy metric to include a randomness measure and include a probability distribution value. The proposed randomness check is based on the Lempel-Ziv (LZ) complexity measure. Through two simulation studies we show that the LZ complexity measure can clearly differentiate as to whether the errors are random or systematic. This determination is a significant first step and is a major departure from the percentage-alone technique. 
Once it is determined that the errors are random, a probability distribution, Poisson, is used to help address various managerial questions.<\/jats:p>","DOI":"10.1145\/1659225.1659229","type":"journal-article","created":{"date-parts":[[2010,1,12]],"date-time":"2010-01-12T20:23:07Z","timestamp":1263327787000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["An Accuracy Metric"],"prefix":"10.1145","volume":"1","author":[{"given":"Craig W.","family":"Fisher","sequence":"first","affiliation":[{"name":"Marist College"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eitel J. M.","family":"Lauria","sequence":"additional","affiliation":[{"name":"Marist College"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carolyn C.","family":"Matheus","sequence":"additional","affiliation":[{"name":"State University of New York at Albany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2009,12]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2006.883696"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/291469.291471"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.2003.11045769"},{"key":"e_1_2_1_4_1","first-page":"47","article-title":"Randomness and mathematical proof","volume":"232","author":"Chaitin G. J.","year":"1975","journal-title":"Sci. Amer."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Clarke R. D. 1946. An application of the Poisson distribution. J. Inst. Actuaries 72.","DOI":"10.1017\/S0020268100035435"},{"key":"e_1_2_1_6_1","first-page":"43","article-title":"Recordkeeping integrity: Assessing records\u2019 content after Enron","volume":"37","author":"Dietel E. J.","year":"2003","journal-title":"Inform. Manag. 
J."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.104.2.301"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Gelman A. Carlin J. Stern H. and Rubin D. 1995. Bayesian Data Analysis. Chapman and Hall\/CRC.","DOI":"10.1201\/9780429258411"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/15.12.994"},{"key":"e_1_2_1_10_1","unstructured":"Huang K.-T. Lee Y. W. and Wang R. Y. 1999. Quality Information and Knowledge. Prentice Hall Englewood Cliffs NJ."},{"key":"e_1_2_1_11_1","first-page":"405","article-title":"What is random","volume":"71","author":"Kac M.","year":"1983","journal-title":"Amer. Sci."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevA.36.842"},{"key":"e_1_2_1_13_1","first-page":"1","article-title":"Three approaches to the quantitative definition of information","volume":"1","author":"Kolmogorov A. N.","year":"1965","journal-title":"Probl. Inform. Transmis."},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Lee Y. W. Pipino L. L. Funk J. D. and Wang R. Y. 2006. Journey to Data Quality. MIT Press Cambridge MA.","DOI":"10.7551\/mitpress\/4037.001.0001"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1976.1055501"},{"key":"e_1_2_1_16_1","unstructured":"Levin R. I. Rubin D. S. and Stinson J. P. 1986. Quantitative Approaches to Management. McGraw-Hill New York."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Li M. and Vitanyi P. 1997. 
An Introduction to Kolmogorov Complexity and its Applications. Springer Verlag Berlin.","DOI":"10.1007\/978-1-4757-2606-0"},{"volume-title":"Edited by H. B. Woolf","author":"The Merriam-Webster Dictionary","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.2005.11045823"},{"volume-title":"Data Quality: The Accuracy Dimension. Morgan Kaufmann","year":"2003","author":"Olson J. E.","key":"e_1_2_1_20_1"},{"key":"e_1_2_1_21_1","unstructured":"Pfaffenberger R. C. and Patterson J. H. 1987. Statistical Methods. Irwin Homewood IL."},{"key":"e_1_2_1_22_1","first-page":"1","article-title":"Modeling database error rates","volume":"3","author":"Pierce E. M.","year":"1997","journal-title":"Data Qual."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1093\/intqhc\/12.1.47"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.1999.11518259"},{"volume-title":"Measuring data accuracy","author":"Redman T.","key":"e_1_2_1_25_1"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/269012.269025"},{"key":"e_1_2_1_27_1","unstructured":"Rukhin A. L. Soto J. Nechvatal J. Smid M. Levenson M. Banks D. Vangel M. Leigh S. and Vo S. 2000. A statistical test suite for the validation of cryptographical random number generators. 
National Institute of Standards and Technology Gaithersburg MD."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0019-9958(64)90223-2"},{"key":"e_1_2_1_30_1","unstructured":"Summers D. C. S. 2006. Quality. Pearson\/Prentice Hall Upper Saddle River NJ."},{"edition":"8","volume-title":"Elementary Statistics","author":"Triola M. F.","key":"e_1_2_1_31_1"},{"key":"e_1_2_1_32_1","unstructured":"Utts J. M. and Heckard R. F. 2006. Statistical Ideas and Methods. Thomson Belmont CA."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/240455.240479"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.1996.11518099"},{"volume-title":"e and the Poisson Distribution","author":"Winkel B. 
J.","key":"e_1_2_1_35_1"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1977.1055714"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1978.1055934"}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1659225.1659229","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1659225.1659229","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:41:32Z","timestamp":1750250492000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1659225.1659229"}},"subtitle":["Percentages, Randomness, and Probabilities"],"short-title":[],"issued":{"date-parts":[[2009,12]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["10.1145\/1659225.1659229"],"URL":"https:\/\/doi.org\/10.1145\/1659225.1659229","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"type":"print","value":"1936-1955"},{"type":"electronic","value":"1936-1963"}],"subject":[],"published":{"date-parts":[[2009,12]]},"assertion":[{"value":"2007-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}