{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T09:11:11Z","timestamp":1768295471747,"version":"3.49.0"},"reference-count":8,"publisher":"SciPy","license":[{"start":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T00:00:00Z","timestamp":1720569600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>The increasing volume of research data in fields such as astronomy, biology,\nand engineering necessitates efficient distributed data management.\nTraditional commercial solutions are often unsuitable for the decentralized\ninfrastructure typical of academic projects. This paper presents the\nLibrarian, a custom framework designed for data transfer in large academic\ncollaborations, built for the Simons Observatory (SO) as a ground-up\nre-architecture of a previous astronomical data management tool called the\n\u2018HERA Librarian\u2019 from which it takes its name. SO is a new-generation\nobservatory designed for observing the Cosmic Microwave Background, and is\nlocated in the Atacama desert in Chile at over 5000 meters of elevation.<\/jats:p><jats:p>Existing tools like Globus Flows, iRODS, Rucio, and Datalad were evaluated\nbut were found to be lacking in automation or simplicity. Librarian\naddresses these gaps by integrating with Globus for efficient data transfer\nand providing a RESTful API for easy interaction. It also supports transfers\nthrough the movement of physical media for environments with intermittent\nconnectivity.<\/jats:p><jats:p>Using technologies like Python, FastAPI, and SQLAlchemy, the Librarian\nensures robust, scalable, and user-friendly data management tailored to the\nneeds of large-scale scientific projects. 
This solution demonstrates an\neffective method for managing the substantial data flows in modern \u2018big\nscience\u2019 endeavors.<\/jats:p>","DOI":"10.25080\/hwga5253","type":"proceedings-article","created":{"date-parts":[[2024,10,4]],"date-time":"2024-10-04T20:42:26Z","timestamp":1728074546000},"page":"236-246","source":"Crossref","is-referenced-by-count":1,"title":["Making Research Data Flow With Python"],"prefix":"10.25080","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1327-1921","authenticated-orcid":false,"given":"Josh","family":"Borrow","sequence":"first","affiliation":[{"name":"Department of Physics and Astronomy, University of Pennsylvania, 209 South 33rd Street, Philadelphia, PA, USA 19104"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4693-0102","authenticated-orcid":false,"given":"Paul La","family":"Plante","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Nevada, Las Vegas, NV 89154"},{"name":"Nevada Center for Astrophysics, University of Nevada, Las Vegas, NV 89154"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4810-666X","authenticated-orcid":false,"given":"James","family":"Aguirre","sequence":"additional","affiliation":[{"name":"Department of Physics and Astronomy, University of Pennsylvania, 209 South 33rd Street, Philadelphia, PA, USA 19104"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3734-3587","authenticated-orcid":false,"given":"Peter K. 
G.","family":"Williams","sequence":"additional","affiliation":[{"name":"Center for Astrophysics | Harvard & Smithsonian, 60 Garden St., Cambridge, MA 02138"}]}],"member":"32550","published-online":{"date-parts":[[2024,7,10]]},"reference":[{"key":"Ade2019","doi-asserted-by":"publisher","DOI":"10.1088\/1475-7516\/2019\/02\/056"},{"key":"Foster2011","doi-asserted-by":"publisher","DOI":"10.1109\/MIC.2011.64"},{"key":"Allen2012","doi-asserted-by":"publisher","DOI":"10.1145\/2076450.2076468"},{"key":"Rucio2019","doi-asserted-by":"publisher","DOI":"10.1007\/s41781-019-0026-3"},{"key":"Halchenko2021","doi-asserted-by":"publisher","DOI":"10.21105\/joss.03262"},{"key":"DeBoer2017","doi-asserted-by":"publisher","DOI":"10.1088\/1538-3873\/129\/974\/045001"},{"key":"LaPlante2020","doi-asserted-by":"publisher","DOI":"10.46620\/20-0041"},{"key":"LaPlante2021","doi-asserted-by":"publisher","DOI":"10.1016\/j.ascom.2021.100489"}],"event":{"name":"Python in Science Conference","location":"Tacoma, Washington","acronym":"SciPy","number":"23rd"},"container-title":["Proceedings of the Python in Science Conference","Proceedings of the 23rd Python in Science Conference"],"original-title":[],"deposited":{"date-parts":[[2024,11,5]],"date-time":"2024-11-05T21:10:13Z","timestamp":1730841013000},"score":1,"resource":{"primary":{"URL":"https:\/\/doi.curvenote.com\/10.25080\/HWGA5253"}},"subtitle":[],"proceedings-subject":"Scientific Computing with Python","short-title":[],"issued":{"date-parts":[[2024,7,10]]},"references-count":8,"URL":"https:\/\/doi.org\/10.25080\/hwga5253","relation":{},"ISSN":["2575-9752"],"issn-type":[{"value":"2575-9752","type":"print"}],"subject":[],"published":{"date-parts":[[2024,7,10]]}}}