{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T07:28:53Z","timestamp":1693639733610},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>Mining frequent subtrees in a database of rooted and labeled trees is an important problem in many domains, ranging from phylogenetic analysis to biochemistry and from linguistic parsing to XML data analysis. In this work we revisit this problem and develop an architecture conscious solution targeting emerging multicore systems. Specifically we identify a sequence of memory related optimizations that significantly improve the spatial and temporal locality of a state-of-the-art sequential algorithm -- alleviating the effects of memory latency. Additionally, these optimizations are shown to reduce the pressure on the front-side bus, an important consideration in the context of large-scale multicore architectures. We then demonstrate that these optimizations while necessary are not sufficient for efficient parallelization on multicores, primarily due to parametric and data-driven factors which make load balancing a significant challenge. To address this challenge, we present a methodology that adaptively and automatically modulates the type and granularity of the work being shared among different cores. The resulting algorithm achieves near perfect parallel efficiency on up to 16 processors on challenging real world applications. The optimizations we present have general purpose utility and a key out-come is the development of a general purpose scheduling service for moldable task scheduling on emerging multicore systems.<\/jats:p>","DOI":"10.14778\/1687627.1687706","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"694-705","source":"Crossref","is-referenced-by-count":24,"title":["Mining tree-structured data on multicore systems"],"prefix":"10.14778","volume":"2","author":[{"given":"Shirish","family":"Tatikonda","sequence":"first","affiliation":[{"name":"The Ohio State University, Columbus, OH"}]},{"given":"Srinivasan","family":"Parthasarathy","sequence":"additional","affiliation":[{"name":"The Ohio State University, Columbus, OH"}]}],"member":"320","published-online":{"date-parts":[[2009,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/69558.75700"},{"key":"e_1_2_1_2_1","series-title":"Genome Informatics Series","first-page":"134","volume-title":"Efficient Tree-Matching Methods for Accurate Carbohydrate Database Queries","author":"Aoki K.","year":"2003"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972726.10"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/850947.853341"},{"key":"e_1_2_1_5_1","first-page":"229","volume-title":"Finding patterns in time series: a dynamic programming approach. Advances in knowledge discovery and data mining","author":"Berndt D.","year":"1996"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.15"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150416"},{"key":"e_1_2_1_8_1","first-page":"1031","volume-title":"AAAI","author":"Charniak E.","year":"1996"},{"issue":"1","key":"e_1_2_1_9_1","first-page":"161","volume":"66","author":"Chi Y.","year":"2005","journal-title":"Frequent Subtree Mining-An Overview. Fundamenta Informaticae"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24775-3_9"},{"key":"e_1_2_1_11_1","volume-title":"Exploring the repertoire of RNA secondary motifs using graph theory","author":"Gan H. H.","year":"2003"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335372"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btn293"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/956417.956569"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/5.3.205"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.598280"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-2836(70)90057-4"},{"key":"e_1_2_1_18_1","first-page":"55","volume-title":"MGTS","author":"Nijssen S.","year":"2003"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453924"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8191(95)00017-I"},{"key":"e_1_2_1_21_1","first-page":"288","volume-title":"ICDE","author":"Rao P.","year":"2004"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/967900.968018"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1272996.1273006"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/6.4.309"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/977401.978138"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/846183.846188"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(93)90181-8"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/11731139_52"},{"key":"e_1_2_1_30_1","first-page":"63","volume-title":"VLDB","author":"Tatikonda S.","year":"2007"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183680"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2005.55"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/1032649.1033526"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/321796.321811"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24775-3_54"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/SSDM.2003.1214978"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362674"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/113445.113449"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/951949.952165"},{"key":"e_1_2_1_40_1","first-page":"721","volume-title":"ICDM","author":"Yan X.","year":"2002"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-004-0134-4"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.125"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/4434.806975"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956787"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/846218.847248"},{"key":"e_1_2_1_46_1","first-page":"126","volume-title":"IEEE Joint Symposia on Intelligence and Systems","author":"Zhang K.","year":"1998"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-39429-7_10"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1687627.1687706","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:31:05Z","timestamp":1672227065000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1687627.1687706"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.14778\/1687627.1687706"],"URL":"https:\/\/doi.org\/10.14778\/1687627.1687706","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2009,8]]}}}