{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T19:47:26Z","timestamp":1773085646425,"version":"3.50.1"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2005,6,1]],"date-time":"2005-06-01T00:00:00Z","timestamp":1117584000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2005,6]]},"DOI":"10.1007\/s10994-005-0469-0","type":"journal-article","created":{"date-parts":[[2005,6,10]],"date-time":"2005-06-10T11:48:18Z","timestamp":1118404098000},"page":"297-322","source":"Crossref","is-referenced-by-count":62,"title":["Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers"],"prefix":"10.1007","volume":"59","author":[{"given":"Russell","family":"Greiner","sequence":"first","affiliation":[]},{"given":"Xiaoyuan","family":"Su","sequence":"additional","affiliation":[]},{"given":"Bin","family":"Shen","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Zhou","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"469_CR1","unstructured":"Abe, N., Takeuchi, J., & Warmuth, M. (1991). Polynomial learnability of probablistic concepts with respect to the Kullback-Leibler divergence. In Conference on Learning Theory."},{"key":"469_CR2","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1023\/A:1007421730016","volume":"29","author":"J. Binder","year":"1997","unstructured":"Binder, J., Koller, D., Russell, S., & Kanazawa, K. (1997). Adaptive probabilistic networks with hidden variables. Machine Learning, 29, 213\u2013244.","journal-title":"Machine Learning"},{"key":"469_CR3","unstructured":"Bishop, C. (1998). Neural networks for pattern recognition. Oxford."},{"key":"469_CR4","unstructured":"Blake, C., & Merz, C. (2000). UCI repository of machine learning databases, http:\/\/www.ics.uci.edu\/\u223cmlearn\/MLRepository.html."},{"key":"469_CR5","doi-asserted-by":"crossref","unstructured":"Boyd, S., & Vandenberghe. L. (2004). Convex optimization. Cambridge.","DOI":"10.1017\/CBO9780511804441"},{"key":"469_CR6","doi-asserted-by":"crossref","unstructured":"Buntine, W. (1996). A guide to the literature on learning probabilistic networks from data. IEEE Transactions on Knowledge and Data Engineering.","DOI":"10.1109\/69.494161"},{"key":"469_CR7","unstructured":"Cerquides J., & de M\u00e1ntaras, R.L. (2003). Tractable Bayesian learning of tree augmented na\u00efve Bayes models. In International Conference on Machine Learning (pp. 75\u201382)."},{"key":"469_CR8","unstructured":"Cheng J., & Greiner, R. (1999). Comparing Bayesian network classifiers. In Uncertainty in Artificial Intelligence."},{"key":"469_CR9","doi-asserted-by":"crossref","unstructured":"Cheng, J., Greiner, R., Kelly, J., Bell, D., & Liu, W. (2002). Learning Bayesian networks from data: An information-theory based approach. Artificial Intelligence, 137.","DOI":"10.1016\/S0004-3702(02)00191-1"},{"key":"469_CR10","unstructured":"Chickering, D. M., Geiger, D., & Heckerman, D. (1994). Learning Bayesian networks is NP-hard. Technical Report MSR-TR-94-17, Microsoft Research."},{"key":"469_CR11","doi-asserted-by":"crossref","unstructured":"Chou, W., Juang, B., & Lee, C. (1992). Segmental GPD training of HMM based speech recognizer. In International Conference on Acoustics, Speech and Signal Processing, vol. 1 (pp. 473\u2013476).","DOI":"10.1109\/ICASSP.1992.225869"},{"key":"469_CR12","doi-asserted-by":"crossref","unstructured":"Chow, C., & Liu, C. (1968). Approximating discrete probability distributions with dependence trees. IEEE Tans. on Information Theory (pp. 462\u2013467).","DOI":"10.1109\/TIT.1968.1054142"},{"key":"469_CR13","doi-asserted-by":"crossref","unstructured":"Cooper, G. (1990). The computational complexity of probabilistic inference using Bayesian belief networks. Artificial intelligence, 42.","DOI":"10.1016\/0004-3702(90)90060-D"},{"key":"469_CR14","first-page":"309","volume":"9","author":"G. Cooper","year":"1992","unstructured":"Cooper, G., & Herskovits, E. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9, 309\u2013347.","journal-title":"Machine Learning"},{"key":"469_CR15","volume-title":"Analysis of binary data","author":"D. R. Cox","year":"1989","unstructured":"Cox, D. R., & Snell, E. J. (1989). Analysis of binary data. Chapman & Hall, London,"},{"key":"469_CR16","unstructured":"Darwiche, A. (2000). A differential approach to inference in Bayesian networks. In Uncertainty in Artificial Intelligence."},{"key":"469_CR17","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1023\/A:1007417612269","volume":"29","author":"S. Dasgupta","year":"1997","unstructured":"Dasgupta, S. (1997). The sample complexity of Learning fixed-structure Bayesian networks. Machine Learning, 29, 165\u2013180,","journal-title":"Machine Learning"},{"key":"469_CR18","unstructured":"Dash D., & Cooper, G. (2002). Exact model averaging with na\u00efve Bayesian classifiers. In International Conference on Machine Learning (pp. 91\u201398)."},{"key":"469_CR19","doi-asserted-by":"crossref","first-page":"647","DOI":"10.2307\/2529753","volume":"32","author":"A. P. Dawid","year":"1976","unstructured":"Dawid, A. P. (1976). Properties of diagnostic data distributions. Biometrics, 32, 647\u2013658.","journal-title":"Biometrics"},{"key":"469_CR20","doi-asserted-by":"crossref","unstructured":"Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. (with discussion). J. Royal Statistics Society, Series B, 39.","DOI":"10.1111\/j.2517-6161.1977.tb01600.x"},{"key":"469_CR21","unstructured":"Duda, R., & Hart, P. (1973). Pattern classification and scene analysis. Wiley."},{"key":"469_CR22","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1093\/biomet\/88.4.961","volume":"88","author":"D. Edwards","year":"2001","unstructured":"Edwards, D., & Lauritzen, S. (2001). The TM algorithm for maximising a conditional likelihood function. Biometrika, 88, 961\u2013972.","journal-title":"Biometrika"},{"key":"469_CR23","unstructured":"Fayyad, U., & Irani, K. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In International Joint Conferences on Artificial Intelligence."},{"key":"469_CR24","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1023\/A:1007465528199","volume":"29","author":"N. Friedman","year":"1997","unstructured":"Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning Journal, 29, 131\u2013163.","journal-title":"Machine Learning Journal"},{"key":"469_CR25","unstructured":"Greiner, R., Grove, A., & Schuurmans, D.(1997). Learning Bayesian nets that perform well. In Uncertainty in Artificial Intelligence."},{"key":"469_CR26","unstructured":"Greiner, R., & Zhou, W. (2002). Structural extension to logistic regression: Discriminant parameter learning of belief net classifiers. In American Association of Artificial Intelligence."},{"key":"469_CR27","doi-asserted-by":"crossref","unstructured":"Grossman, D., & Domingos, P. (2004). Learning Bayesian network classifiers by maximizing conditional likelihood. In International Conference on Machine Learning.","DOI":"10.1145\/1015330.1015339"},{"key":"469_CR28","volume-title":"Neural network design","author":"M. Hagan","year":"1996","unstructured":"Hagan, M., Demuth, H., & Beale, M. (1996). Neural network design. Boston, MA, PWS Publishing."},{"key":"469_CR29","doi-asserted-by":"crossref","unstructured":"Heckerman, D. E. (1998). A tutorial on learning with Bayesian networks. In M. I. Jordan (Ed.), Learning in graphical models.","DOI":"10.1007\/978-94-011-5014-9_11"},{"key":"469_CR30","unstructured":"Jaakkola, T., Meila, M., & Jebara, T. (2000). Maximum entropy discrimination. In Neural Information Processing Systems."},{"key":"469_CR31","unstructured":"Jordan, M. (1995). Why the logistic function? A tutorial discussion on probabilities and neural networks."},{"key":"469_CR32","unstructured":"Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In International Joint Conference on Artificial Intelligence."},{"key":"469_CR33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0004-3702(97)00043-X","volume":"97","author":"R. Kohavi","year":"1997","unstructured":"Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial intelligence, 97:1\u20132.","journal-title":"Artificial intelligence"},{"key":"469_CR34","unstructured":"Kontkanen, P., Myllym\u00e4ki, P., Silander, T., & Tirri, H. (1999). On supervised selection of Bayesian networks. In Uncertainty in Artificial Intelligence (pp. 334\u2013342)."},{"key":"469_CR35","unstructured":"Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In International Conference on Machine Learning."},{"key":"469_CR36","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/0167-9473(93)E0056-A","volume":"19","author":"S. Lauritzen","year":"1995","unstructured":"Lauritzen, S. (1995). The EM algorithm for graphical association models with missing data. Computational Statistics and Data Analysis, 19, 191\u2013201.","journal-title":"Computational Statistics and Data Analysis"},{"key":"469_CR37","volume-title":"Statistical analysis with missing data","author":"J. A. Little","year":"1987","unstructured":"Little, J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York, Wiley."},{"key":"469_CR38","doi-asserted-by":"crossref","unstructured":"McCullagh, P., & Nelder, J. (1989). Generalized linear models. Chapman and Hall.","DOI":"10.1007\/978-1-4899-3242-6"},{"key":"469_CR39","unstructured":"Minka. T. (2001). Algorithms for maximum-likelihood logistic regression. Technical report, CMU CALD, http:\/\/www.stat.cmu.edu\/\u223cminka\/papers\/logreg\/minka-logreg.pdf."},{"key":"469_CR40","unstructured":"Mitchell, T. M. (1997). Machine learning. McGraw-Hill."},{"key":"469_CR41","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1023\/A:1024068626366","volume":"52","author":"C. Nadeau","year":"2003","unstructured":"Nadeau, C., & Bengio, Y. (2003). Inference for the generalization error. Machine Learning, 52, 239\u2013281,","journal-title":"Machine Learning"},{"key":"469_CR42","doi-asserted-by":"crossref","unstructured":"Nesterov, Y., & Nemirovskii, A. (1994). Interior-point polynomial methods in convex programming, Society for Industrial and Applied Mathematics.","DOI":"10.1137\/1.9781611970791"},{"key":"469_CR43","unstructured":"Ng, A., & Jordan, M. (2001). On discriminative versus generative classifiers: A comparison of logistic regression and naive Bayes. In Neural Information Processing Systems."},{"key":"469_CR44","volume-title":"Probabilistic reasoning in intelligent systems: Networks of plausible inference","author":"J. Pearl","year":"1988","unstructured":"Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo: Morgan Kaufmann."},{"key":"469_CR45","unstructured":"Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. (2002). Numerical recipes in C. Cambridge, http:\/\/www.nr.com\/."},{"key":"469_CR46","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511812651","volume-title":"Pattern recognition and neural networks","author":"B. Ripley","year":"1996","unstructured":"Ripley, B. (1996). Pattern recognition and neural networks, Cambridge, UK: Cambridge University Press."},{"issue":"3","key":"469_CR47","first-page":"269","volume":"59","author":"T. Roos","year":"2005","unstructured":"Roos, T., Wettig, H., Gr\u00fcnwald, P., & Myllym\u00e4ki, P., Tirri, H. (2005). On discriminative Bayesian network classifiers and logistic regression. Machine Learning, 59:3, 269\u2013298.","journal-title":"Machine Learning"},{"key":"469_CR48","doi-asserted-by":"crossref","unstructured":"Schl\u00fcter, R., Macherey, W., Kanthak, S., Ney, H., & Welling, L. (1997). Comparison of optimization methods for discriminative training criteria. In Proc. of European Conference on Speech Communication and Technology.","DOI":"10.21437\/Eurospeech.1997-10"},{"key":"469_CR49","unstructured":"Shen, B., Su, X., Greiner, R., Musilek, P., & Cheng, C. (2003). Discriminative parameter learning of general Bayesian network classifiers. In International Conference on Tools with Artificial Intelligence."},{"key":"469_CR50","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1093\/biomet\/89.2.478","volume":"89","author":"R. Sundberg","year":"2002","unstructured":"Sundberg, R. (2002). The convergence rate of the TM algorithm of Edwards and Lauritzen. Biometrika, 89, 478\u2013483.","journal-title":"Biometrika"},{"key":"469_CR51","unstructured":"Van Allen, T., & Greiner, R. (2000). Model selection criteria for learning belief nets: An empirical comparison. In International Conference on Machine Learning (pp. 1047\u20131054)."},{"key":"469_CR52","unstructured":"Zhou, W. (2002). Discriminative learning of Bayesian net parameters. Master\u2019s thesis, Dept of Computing Science, University of Alberta."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-005-0469-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10994-005-0469-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-005-0469-0","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,26]],"date-time":"2024-01-26T13:40:19Z","timestamp":1706276419000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10994-005-0469-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,6]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2005,6]]}},"alternative-id":["469"],"URL":"https:\/\/doi.org\/10.1007\/s10994-005-0469-0","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,6]]}}}