{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T04:17:37Z","timestamp":1778300257768,"version":"3.51.4"},"reference-count":6,"publisher":"MIT Press - Journals","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Neural Computation"],"published-print":{"date-parts":[[1999,7,1]]},"abstract":"<jats:p> We investigate a class of hierarchical mixtures-of-experts (HME) models where generalized linear models with nonlinear mean functions of the form \u03c8(\u03b1 + x<jats:sup>T<\/jats:sup>\u03b2) are mixed. Here \u03c8(\u00b7) is the inverse link function. It is shown that mixtures of such mean functions can approximate a class of smooth functions of the form \u03c8(h(x)), where h(\u00b7) \u2208 W<jats:sup>\u221e<\/jats:sup><jats:sub>2;k<\/jats:sub> (a Sobolev class over [0, 1]<jats:sup>s<\/jats:sup>), as the number of experts m in the network increases. An upper bound of the approximation rate is given as O(m<jats:sup>\u22122\/s<\/jats:sup>) in L<jats:sub>p<\/jats:sub> norm. This rate can be achieved within the family of HME structures with no more than s layers, where s is the dimension of the predictor x. 
<\/jats:p>","DOI":"10.1162\/089976699300016403","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T11:55:01Z","timestamp":1027770901000},"page":"1183-1198","source":"Crossref","is-referenced-by-count":35,"title":["On the Approximation Rate of Hierarchical Mixtures-of-Experts for Generalized Linear Models"],"prefix":"10.1162","volume":"11","author":[{"given":"Wenxin","family":"Jiang","sequence":"first","affiliation":[{"name":"Department of Statistics, Northwestern University, Evanston, IL 60208, U.S.A."}]},{"given":"Martin A.","family":"Tanner","sequence":"additional","affiliation":[{"name":"Department of Statistics, Northwestern University, Evanston, IL 60208, U.S.A."}]}],"member":"281","reference":[{"key":"p_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1991.3.1.79"},{"key":"p_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1994.6.2.181"},{"key":"p_3","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(95)00014-3"},{"key":"p_5","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1996.8.1.164"},{"key":"p_6","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1996.10476965"},{"key":"p_7","doi-asserted-by":"publisher","DOI":"10.1109\/18.669150"}],"container-title":["Neural 
Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/089976699300016403","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:23:27Z","timestamp":1615584207000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/11\/5\/1183-1198\/6273"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1999,7,1]]},"references-count":6,"journal-issue":{"issue":"5","published-print":{"date-parts":[[1999,7,1]]}},"alternative-id":["10.1162\/089976699300016403"],"URL":"https:\/\/doi.org\/10.1162\/089976699300016403","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published":{"date-parts":[[1999,7,1]]}}}