{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T13:21:00Z","timestamp":1762953660226},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>Scientific discoveries are increasingly driven by analyzing large volumes of image data. Many new libraries and specialized database management systems (DBMSs) have emerged to support such tasks. It is unclear how well these systems support real-world image analysis use cases, and how performant the image analytics tasks implemented on top of such systems are. In this paper, we present the first comprehensive evaluation of large-scale image analysis systems using two real-world scientific image data processing use cases. We evaluate five representative systems (SciDB, Myria, Spark, Dask, and TensorFlow) and find that each of them has shortcomings that complicate implementation or hurt performance. Such shortcomings lead to new research opportunities in making large-scale image analysis both efficient and easy to use.<\/jats:p>","DOI":"10.14778\/3137628.3137634","type":"journal-article","created":{"date-parts":[[2017,9,7]],"date-time":"2017-09-07T13:35:53Z","timestamp":1504791353000},"page":"1226-1237","source":"Crossref","is-referenced-by-count":34,"title":["Comparative evaluation of big-data systems on scientific image analytics workloads"],"prefix":"10.14778","volume":"10","author":[{"given":"Parmita","family":"Mehta","sequence":"first","affiliation":[{"name":"University of Washington"}]},{"given":"Sven","family":"Dorkenwald","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Dongfang","family":"Zhao","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Tomer","family":"Kaftan","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Alvin","family":"Cheung","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Magdalena","family":"Balazinska","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Ariel","family":"Rokem","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Andrew","family":"Connolly","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Jacob","family":"Vanderplas","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Yusra","family":"AlSayyad","sequence":"additional","affiliation":[{"name":"University of Washington"}]}],"member":"320","published-online":{"date-parts":[[2017,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"http:\/\/dask.pydata.org.  http:\/\/dask.pydata.org."},{"key":"e_1_2_1_2_1","volume-title":"OSDI","author":"Abadi M.","year":"2016","unstructured":"M. Abadi : Large-scale machine learning on heterogeneous systems . In OSDI , 2016 . M. Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems. In OSDI, 2016."},{"key":"e_1_2_1_3_1","unstructured":"http:\/\/abcdstudy.org\/.  http:\/\/abcdstudy.org\/."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213864"},{"key":"e_1_2_1_5_1","unstructured":"https:\/\/aws.amazon.com\/.  https:\/\/aws.amazon.com\/."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmrb.1994.1037"},{"key":"e_1_2_1_7_1","unstructured":"https:\/\/bazel.build\/.  https:\/\/bazel.build\/."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2015.7363756"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2612185"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807271"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/647060.714153"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2004.30"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2007.906087"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00962238"},{"key":"e_1_2_1_15_1","unstructured":"http:\/\/fits.gsfc.nasa.gov.  http:\/\/fits.gsfc.nasa.gov."},{"key":"e_1_2_1_16_1","unstructured":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/fourth-paradigm-data-intensive-scientific-discovery.  https:\/\/www.microsoft.com\/en-us\/research\/publication\/fourth-paradigm-data-intensive-scientific-discovery."},{"key":"e_1_2_1_17_1","volume-title":"Dipy, a library for the analysis of diffusion MRI data. Front. Neuroinform., 8","author":"Garyfallidis E.","year":"2014","unstructured":"E. Garyfallidis Dipy, a library for the analysis of diffusion MRI data. Front. Neuroinform., 8 , 21 Feb. 2014 . E. Garyfallidis et al. Dipy, a library for the analysis of diffusion MRI data. Front. Neuroinform., 8, 21 Feb. 2014."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2594530"},{"key":"e_1_2_1_19_1","volume-title":"CIDR","author":"Herodotou H.","year":"2011","unstructured":"H. Herodotou : A self-tuning system for big data analytics . In CIDR , 2011 . H. Herodotou et al. Starfish: A self-tuning system for big data analytics. In CIDR, 2011."},{"key":"e_1_2_1_20_1","unstructured":"http:\/\/astroinf.cmm.uchile.cl\/category\/projects\/.  http:\/\/astroinf.cmm.uchile.cl\/category\/projects\/."},{"key":"e_1_2_1_21_1","unstructured":"https:\/\/medium.com\/data-collective\/rapid-growth-in-available-data-c5e2705a2423.  https:\/\/medium.com\/data-collective\/rapid-growth-in-available-data-c5e2705a2423."},{"key":"e_1_2_1_22_1","unstructured":"http:\/\/blog.d8a.com\/post\/9662265140\/the-growth-of-image-data-mobile-and-web.  http:\/\/blog.d8a.com\/post\/9662265140\/the-growth-of-image-data-mobile-and-web."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2015.04.057"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1038\/nn.4358"},{"key":"e_1_2_1_25_1","unstructured":"https:\/\/www.lsst.org\/.  https:\/\/www.lsst.org\/."},{"key":"e_1_2_1_26_1","unstructured":"http:\/\/dm.lsst.org\/.  http:\/\/dm.lsst.org\/."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2016.22"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1038\/nn.4393"},{"key":"e_1_2_1_29_1","unstructured":"https:\/\/nifti.nimh.nih.gov\/nifti-1.  https:\/\/nifti.nimh.nih.gov\/nifti-1."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2007.58"},{"key":"e_1_2_1_31_1","volume-title":"Automatica","author":"Otsu N.","year":"1975","unstructured":"N. Otsu . A threshold selection method from gray-level histograms . Automatica , 1975 . N. Otsu. A threshold selection method from gray-level histograms. Automatica, 1975."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559865"},{"key":"e_1_2_1_33_1","unstructured":"https:\/\/www.postgresql.org\/.  https:\/\/www.postgresql.org\/."},{"key":"e_1_2_1_34_1","unstructured":"http:\/\/www.rasdaman.org\/.  http:\/\/www.rasdaman.org\/."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.25080\/Majora-7b98e3ed-013"},{"key":"e_1_2_1_36_1","unstructured":"http:\/\/www.scidb.org\/.  http:\/\/www.scidb.org\/."},{"key":"e_1_2_1_37_1","unstructured":"http:\/\/forum.paradigm4.com\/t\/persistenting-data-to-remote-nodes\/1408\/8.  http:\/\/forum.paradigm4.com\/t\/persistenting-data-to-remote-nodes\/1408\/8."},{"key":"e_1_2_1_38_1","unstructured":"http:\/\/skyserver.sdss.org\/dr7.  http:\/\/skyserver.sdss.org\/dr7."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831365"},{"key":"e_1_2_1_40_1","unstructured":"http:\/\/spark.apache.org\/.  http:\/\/spark.apache.org\/."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2595633"},{"key":"e_1_2_1_42_1","volume-title":"Oct.","year":"2005","unstructured":"The Dark Energy Survey Collaboration. The Dark Energy Survey. ArXiv Astrophysics e-prints , Oct. 2005 . The Dark Energy Survey Collaboration. The Dark Energy Survey. ArXiv Astrophysics e-prints, Oct. 2005."},{"key":"e_1_2_1_43_1","unstructured":"http:\/\/www.tpc.org.  http:\/\/www.tpc.org."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2013.05.041"},{"key":"e_1_2_1_45_1","author":"Wandell B. A.","year":"2016","unstructured":"B. A. Wandell . Clarifying human white matter. Annu. Rev. Neurosci. , 1 Apr. 2016 . B. A. Wandell. Clarifying human white matter. Annu. Rev. Neurosci., 1 Apr. 2016.","journal-title":"Annu. Rev. Neurosci."},{"key":"e_1_2_1_46_1","volume-title":"CIDR","author":"Wang J.","year":"2017","unstructured":"J. Wang The myria big data management and analytics system and cloud services . In CIDR , 2017 . J. Wang et al. The myria big data management and analytics system and cloud services. In CIDR, 2017."},{"key":"e_1_2_1_47_1","volume-title":"NSDI","author":"Zaharia M.","year":"2012","unstructured":"M. Zaharia Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing . In NSDI , 2012 . M. Zaharia et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, 2012."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3137628.3137634","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:57:43Z","timestamp":1672221463000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3137628.3137634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":47,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2017,8]]}},"alternative-id":["10.14778\/3137628.3137634"],"URL":"https:\/\/doi.org\/10.14778\/3137628.3137634","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2017,8]]}}}