{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:49:12Z","timestamp":1760237352705,"version":"build-2065373602"},"reference-count":54,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2020,4,9]],"date-time":"2020-04-09T00:00:00Z","timestamp":1586390400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>A predictable aggregation of dwarf minke whales (Balaenoptera acutorostrata subspecies) occurs annually in the Australian waters of the northern Great Barrier Reef in June\u2013July, which has been the subject of a long-term photo-identification study. Researchers from the Minke Whale Project (MWP) at James Cook University collect large volumes of underwater digital imagery each season (e.g., 1.8TB in 2018), much of which is contributed by citizen scientists. Manual processing and analysis of this quantity of data had become infeasible, and Convolutional Neural Networks (CNNs) offered a potential solution. Our study sought to design and train a CNN that could detect whales from video footage in complex near-surface underwater surroundings and differentiate the whales from people, boats and recreational gear. We modified known classification CNNs to localise whales in video frames and digital still images. The required high classification accuracy was achieved by discovering an effective negative-labelling training technique. This resulted in a less than 1% false-positive classification rate and below 0.1% false-negative rate. The final operation-version CNN-pipeline processed all videos (with the interval of 10 frames) in approximately four days (running on two GPUs) delivering 1.95 million sorted images.<\/jats:p>","DOI":"10.3390\/info11040200","type":"journal-article","created":{"date-parts":[[2020,4,9]],"date-time":"2020-04-09T14:42:03Z","timestamp":1586443323000},"page":"200","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Automatic Sorting of Dwarf Minke Whale Underwater Images"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3320-8938","authenticated-orcid":false,"given":"Dmitry A.","family":"Konovalov","sequence":"first","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"},{"name":"Marine Data Technology Hub, James Cook University, Townsville, QLD 4811, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2054-4836","authenticated-orcid":false,"given":"Natalie","family":"Swinhoe","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3809-9937","authenticated-orcid":false,"given":"Dina B.","family":"Efremova","sequence":"additional","affiliation":[{"name":"Funbox Inc., 119017 Moscow, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4307-693X","authenticated-orcid":false,"given":"R. Alastair","family":"Birtles","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martha","family":"Kusetic","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Suzanne","family":"Hillcoat","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Matthew I.","family":"Curnock","sequence":"additional","affiliation":[{"name":"CSIRO Land and Water, James Cook University, Townsville, QLD 4811, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Genevieve","family":"Williams","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0662-3439","authenticated-orcid":false,"given":"Marcus","family":"Sheaves","sequence":"additional","affiliation":[{"name":"College of Science and Engineering, James Cook University, Townsville, QLD 4181, Australia"},{"name":"Marine Data Technology Hub, James Cook University, Townsville, QLD 4811, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,4,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"247","DOI":"10.3389\/fmars.2019.00247","article-title":"Common and Antarctic minke whales: Conservation status and future research directions","volume":"6","author":"Risch","year":"2019","journal-title":"Front. Mar. Sci."},{"key":"ref_2","first-page":"1","article-title":"External characters of southern minke whales and the existence of a diminutive form","volume":"36","author":"Best","year":"1985","journal-title":"Sci. Rep. Whales Res. Inst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1080\/14724041003690468","article-title":"Attraction of dwarf minke whales Balaenoptera acutorostrata to vessels and swimmers in the Great Barrier Reef world heritage area\u2014The management challenges of an inquisitive whale","volume":"10","author":"Mangott","year":"2011","journal-title":"J. Ecotourism"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1071\/AM02023","article-title":"Commercial swim programs with dwarf minke whales on the northern Great Barrier Reef, Australia: Some characteristics of the encounters with management implications","volume":"24","author":"Birtles","year":"2002","journal-title":"Aust. Mammal."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.3727\/154427313X13659574649867","article-title":"Increased use levels, effort, and spatial distribution of tourists swimming with dwarf minke whales at the Great Barrier Reef","volume":"9","author":"Curnock","year":"2013","journal-title":"Tour. Mar. Environ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3038","DOI":"10.1121\/1.1371763","article-title":"Localization and visual verification of a complex minke whale vocalization","volume":"109","author":"Gedamke","year":"2001","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_7","first-page":"277","article-title":"Colour patterns of the dwarf minke whale Balaenoptera acutorostrata sensu lato: Description, cladistic analysis and taxonomic implications","volume":"51","author":"Arnold","year":"2005","journal-title":"Mem. Qld. Mus."},{"key":"ref_8","unstructured":"Sobtzick, S. (2010). Dwarf Minke Whales in the Northern Great Barrier Reef And Implications for the Sustainable Management of the Swim-With Whales Industry. [Ph.D. Thesis, James Cook University]. Available online: https:\/\/bit.ly\/2DORPRM."},{"key":"ref_9","first-page":"1","article-title":"The occurrence of two forms of minke whales in east Australian waters with description of external characters and skeleton of the diminutive form","volume":"38","author":"Arnold","year":"1987","journal-title":"Sci. Rep. Whales Res. Inst."},{"key":"ref_10","first-page":"25","article-title":"Individual minke whale recognition using deep learning convolutional neural networks","volume":"6","author":"Konovalov","year":"2018","journal-title":"J. Geosci. Environ. Prot."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zuiderveld, K. (1994). Contrast Limited Adaptive Histogram Equalization, Academic Press Professional. Graphic Gems IV.","DOI":"10.1016\/B978-0-12-336156-1.50061-6"},{"key":"ref_12","unstructured":"Chaudhuri, B.B., Kankanhalli, M.S., and Raman, B. (2018). Wild animal detection using deep convolutional neural network. Proceedings of the 2nd International Conference on Computer Vision & Image Processing, Roorkee, India, 9\u201312 September 2017, Springer."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.fishres.2010.10.011","article-title":"Comparison of visual census and high definition video transects for monitoring coral reef fish assemblages","volume":"107","author":"Pelletier","year":"2011","journal-title":"Fish. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recogn. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the Inception architecture for computer vision. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_19","unstructured":"Wilson, R.C., Hancock, E.R., and Smith, W.A.P. (2016, January 19\u201322). Wide residual networks. Proceedings of the British Machine Vision Conference (BMVC), York, UK."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Konovalov, D.A., Jahangard, S., and Schwarzkopf, L. (2018, January 10\u201313). In situ cane toad recognition. Proceedings of the 2018 Digital Image Computing: Techniques and Applications (DICTA), Canberra, Australia.","DOI":"10.1109\/DICTA.2018.8615780"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2058","DOI":"10.1038\/s41598-018-38343-3","article-title":"DeepWeeds: A multiclass weed species image dataset for deep learning","volume":"9","author":"Olsen","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhang, Q., Yang, Y., Ma, H., and Wu, Y.N. (2019, January 16\u201320). Interpreting CNNs via decision trees. Proceedings of the CVPR IEEE, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00642"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Mahendran, A., and Vedaldi, A. (2015, January 7\u201312). Understanding deep image representations by inverting them. Proceedings of the CVPR IEEE, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299155"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Holzinger, A., Kieseberg, P., Tjoa, A.M., and Weippl, E. (2018). Explainable AI: The new 42?. Machine Learning and Knowledge Extraction, Springer International Publishing.","DOI":"10.1007\/978-3-319-99740-7"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards real-time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal."},{"key":"ref_26","unstructured":"Ohara, K., and Bai, Q. (2019). Marine vertebrate predator detection and recognition in underwater videos by region convolutional neural network. Knowledge Management and Acquisition for Intelligent Systems, Springer International Publishing."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the CVPR IEEE, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201312). Fast R-CNN. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., and Guadarrama, S. (2017, January 21\u201326). Speed\/accuracy trade-offs for modern convolutional object detectors. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.351"},{"key":"ref_30","unstructured":"Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., and Duerig, T. (2018). The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale. arXiv."},{"key":"ref_31","unstructured":"Krasin, I., Duerig, T., Alldrin, N., Ferrari, V., Abu-El-Haija, S., Kuznetsova, A., Rom, H., Uijlings, J., Popov, S., and Kamali, S. (2020, April 04). OpenImages: A Public Dataset for Large-Scale Multi-Label and Multi-Class Image Classification. Available online: https:\/\/bit.ly\/34lGYLn."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully convolutional networks for semantic segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Konovalov, D.A., Saleh, A., Bradley, M., Sankupellay, M., Marini, S., and Sheaves, M. (2019). Underwater fish detection with weak multi-domain supervision. IEEE IJCNN, 1\u20138.","DOI":"10.1109\/IJCNN.2019.8851907"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","article-title":"The Pascal visual object classes challenge: A retrospective","volume":"111","author":"Everingham","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1016\/j.neunet.2018.07.011","article-title":"A systematic study of the class imbalance problem in convolutional neural networks","volume":"106","author":"Buda","year":"2018","journal-title":"Neural Netw."},{"key":"ref_37","unstructured":"Birtles, A., Arnold, P., Curnock, M., Salmon, S., Mangott, A., Sobtzick, S., Valentine, P., Caillaud, A., and Rumney, J. (2020, April 04). Code of Practice for Dwarf Minke Whale Interactions in the Great Barrier Reef World Heritage Area. Available online: https:\/\/bit.ly\/36mObKD."},{"key":"ref_38","unstructured":"Simonyan, K., and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Oquab, M., Bottou, L., Laptev, I., and Sivic, J. (2014, January 23\u201328). Learning and transferring mid-level image representations using convolutional neural networks. Proceedings of the CVPR IEEE, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.222"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 16\u201320). ImageNet: A large-scale hierarchical image database. Proceedings of the CVPR IEEE, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_41","unstructured":"Liu, L., Jiang, H., He, P., Chen, W., Liu, X., Gao, J., and Han, J. (2019). On the variance of the adaptive learning rate and beyond. arXiv."},{"key":"ref_42","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_43","first-page":"1139","article-title":"On the importance of initialization and momentum in deep learning","volume":"Volume 28","author":"Dasgupta","year":"2013","journal-title":"Proceedings of the 30th International Conference on Machine Learning"},{"key":"ref_44","first-page":"343","article-title":"No more pesky learning rates","volume":"Volume 28","author":"Dasgupta","year":"2013","journal-title":"Proceedings of the 30th International Conference on Machine Learning"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Leibe, B., Matas, J., Sebe, N., and Welling, M. (2016). Identity mappings in deep residual networks. Computer Vision\u2014ECCV 2016, Springer International Publishing.","DOI":"10.1007\/978-3-319-46454-1"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Cubuk, E.D., Zoph, B., Man\u00e9, D., Vasudevan, V., and Le, Q.V. (2018). AutoAugment: Learning augmentation policies from data. arXiv.","DOI":"10.1109\/CVPR.2019.00020"},{"key":"ref_47","unstructured":"Loshchilov, I., and Hutter, F. (2016). SGDR: Stochastic gradient descent with restarts. arXiv."},{"key":"ref_48","unstructured":"Smith, L.N. (2015). No more pesky learning rate guessing games. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Konovalov, D.A., Saleh, A., Efremova, D.B., Domingos, J.A., and Jerry, D.R. (2019, January 2\u20134). Automatic weight estimation of harvested fish from images. Proceedings of the 2019 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Perth, Australia.","DOI":"10.1109\/DICTA47822.2019.8945971"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Howard, J., and Gugger, S. (2020). Fastai: A layered API for deep learning. Information, 11.","DOI":"10.3390\/info11020108"},{"key":"ref_51","unstructured":"Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017, January 4\u20139). Automatic differentiation in PyTorch. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_52","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv."},{"key":"ref_53","unstructured":"Krizhevsky, A. (2020, April 04). Learning Multiple Layers of Features From Tiny Images. Available online: https:\/\/bit.ly\/2HfABiJ."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Buslaev, A., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., and Kalinin, A.A. (2020). Albumentations: Fast and flexible image augmentations. Information, 11.","DOI":"10.3390\/info11020125"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/4\/200\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:17:01Z","timestamp":1760174221000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/4\/200"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,9]]},"references-count":54,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["info11040200"],"URL":"https:\/\/doi.org\/10.3390\/info11040200","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2020,4,9]]}}}