{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T07:54:05Z","timestamp":1780991645911,"version":"3.54.1"},"reference-count":13,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2015,12,1]],"date-time":"2015-12-01T00:00:00Z","timestamp":1448928000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Big Data &amp; Society"],"published-print":{"date-parts":[[2015,12,1]]},"abstract":"<jats:p>Social scientists and data analysts are increasingly making use of Big Data in their analyses. These data sets are often \u201cfound data\u201d arising from purely observational sources rather than data derived under strict rules of a statistically designed experiment. However, since these large data sets easily meet the sample size requirements of most statistical procedures, they give analysts a false sense of security as they proceed to focus on employing traditional statistical methods. We explain how most analyses performed on Big Data today lead to \u201cprecisely inaccurate\u201d results that hide biases in the data but are easily overlooked due to the enhanced significance of the results created by the data size. Before any analyses are performed on large data sets, we recommend employing a simple data segmentation technique to control for some major components of observational data biases. These segments will help to improve the accuracy of the results.<\/jats:p>","DOI":"10.1177\/2053951715602495","type":"journal-article","created":{"date-parts":[[2015,12,1]],"date-time":"2015-12-01T21:44:25Z","timestamp":1449006265000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":51,"title":["Big Data and the danger of being precisely inaccurate"],"prefix":"10.1177","volume":"2","author":[{"given":"Daniel A","family":"McFarland","sequence":"first","affiliation":[{"name":"Stanford University, Stanford, CA, USA"},{"name":"Hearst Corporation, New York, NY, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"H Richard","family":"McFarland","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, CA, USA"},{"name":"Hearst Corporation, New York, NY, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"179","published-online":{"date-parts":[[2015,12,1]]},"reference":[{"key":"bibr15-2053951715602495","doi-asserted-by":"crossref","unstructured":"Birant D (2011)\n                      Data Mining Using RFM Analysis\n                      . INTECH Open Access Publisher.","DOI":"10.5772\/13683"},{"key":"bibr1-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-72579-6_12"},{"key":"bibr2-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1287\/mksc.14.4.378"},{"key":"bibr3-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1038\/494155a"},{"key":"bibr4-2053951715602495","unstructured":"Derya B (2011) Data mining using RFM analysis. In: Fanastu K (ed.)\n                      Knowledge-Oriented Applications in Data Mining\n                      . InTech. ISBN: 9780953-307-154-1."},{"key":"bibr5-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1086\/227352"},{"key":"bibr6-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1038\/nature07634"},{"key":"bibr7-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1126\/science.1248506"},{"key":"bibr8-2053951715602495","doi-asserted-by":"crossref","unstructured":"Leskovec J, Lang KJ and Mahoney M (2010) Empirical comparison of algorithms for network community detection. In:\n                      Proceedings of the 19th international conference on World wide web\n                      . ACM, 26\u201330 April 2010, Raleigh, North Carolina, USA: ACM, pp. 631\u2013640.","DOI":"10.1145\/1772690.1772755"},{"key":"bibr9-2053951715602495","unstructured":"Lewis RA, Rao JM and Reiley DH (2011) Here, there, and everywhere: correlated online behaviors can lead to overestimates of the effects of advertising. In: WWW 2011, Hyderabad, India, 28 March\u20131 April 2011. ACM 978-1-4503-0632-4\/11\/03."},{"key":"bibr10-2053951715602495","doi-asserted-by":"publisher","DOI":"10.1057\/palgrave.dbm.3240216"},{"issue":"19","key":"bibr11-2053951715602495","first-page":"4199","volume":"4","author":"Wei J-T","year":"2010","journal-title":"African Journal of Business Management"},{"key":"bibr12-2053951715602495","doi-asserted-by":"crossref","unstructured":"Yang J, McAuley J and Leskovec J (2013, December) Community detection in networks with node attributes. In:\n                      IEEE international conference on data mining (ICDM)\n                      , Dallas, Texas. 7\u201310 December 2013, pp. 1151\u20131156. DOI: 10.1109\/ICDM.2013.167.","DOI":"10.1109\/ICDM.2013.167"}],"container-title":["Big Data &amp; Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/2053951715602495","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/2053951715602495","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/2053951715602495","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T13:01:52Z","timestamp":1777381312000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/2053951715602495"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,12,1]]},"references-count":13,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,12,1]]}},"alternative-id":["10.1177\/2053951715602495"],"URL":"https:\/\/doi.org\/10.1177\/2053951715602495","relation":{},"ISSN":["2053-9517","2053-9517"],"issn-type":[{"value":"2053-9517","type":"print"},{"value":"2053-9517","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,12,1]]},"article-number":"2053951715602495"}}