Citation Relationships

Amari S, Park H, Ozeki T (2006) Singularities affect dynamics of learning in neuromanifolds. Neural Comput 18:1007-65 [PubMed]

References and models cited by this paper

Akaho S, Kappen HJ (2000) Nonmonotonic generalization bias of Gaussian mixture models. Neural Comput 12:1411-27 [PubMed]
Akaike H (1974) A new look at the statistical model identification IEEE Trans Automat Contr 19:716-723
Amari S (1967) A theory of adaptive pattern classifiers IEEE Trans Electron Comput EC-16:299-307
Amari S (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27:77-87 [PubMed]
Amari S (1987) Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections and divergence Mathematical Systems Theory 20:53-82
Amari S (1998) Natural gradient works efficiently in learning Neural Comput 10:251-276
Amari S (2003) New consideration on criteria of model selection Neural networks and soft computing (Proceedings of the Sixth International Conference on Neural Networks and Soft Computing), Rutkowski L:Kacprzyk J, ed. pp.25
Amari S, Burnashev MV (2003) On some singularities in parameter estimation problems Problems Of Information Transmission 39:352-372
Amari S, Murata N (1993) Statistical theory of learning curves under entropic loss criterion Neural Comput 5:140-154
Amari S, Nagaoka H (2000) Methods of information geometry
Amari S, Nakahara H (2005) Difficulty of singularity in population coding. Neural Comput 17:839-58 [Journal] [PubMed]
Amari S, Ozeki T (2001) Differential and algebraic geometry of multilayer perceptrons IEICE Trans 84:31-38
Amari S, Ozeki T, Park H (2003) Learning and inference in hierarchical models with singularities Systems And Computers In Japan 34:34-42
Amari S, Park H, Fukumizu K (2000) Adaptive method of realizing natural gradient learning for multilayer perceptrons. Neural Comput 12:1399-409 [PubMed]
Amari S, Park H, Ozeki T (2001) Statistical inference in nonidentifiable and singular statistical models J Of The Korean Statistical Society 30:179-192
Amari S, Park H, Ozeki T (2002) Geometrical singularities in the neuromani-fold of multilayer perceptrons Advances in neural information processing systems, Dietterich TG:Becker S:Ghahramani Z, ed. pp.343
Brockett RW (1976) Some geometric questions in the theory of linear systems IEEE Trans On Automatic Control 21:449-455
Chen AM, Lu H, Hecht-Nielsen R (1993) On the geometry of feedforward neural network error surfaces Neural Comput 5:910-927
Dacunha-Castelle D, Gassiat E (1997) Testing in locally conic models, and application to mixture models Probability And Statistics 1:285-317
Fukumizu K (1999) Generalization error of linear neural networks in unidentifiable cases Algorithmic learning theory: Proceedings of the 10th International Conference on Algorithmic Learning Theory (ALT99), Watanabe O:Yokomori T, ed. pp.51
Fukumizu K (2003) Likelihood ratio of unidentifiable models and multilayer neural networks Annals Of Statistics 31:833-851
Fukumizu K, Amari S (2000) Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural Netw 13:317-27 [PubMed]
Hagiwara K (2002) On the problem in model selection of neural network regression in overrealizable scenario. Neural Comput 14:1979-2002 [Journal] [PubMed]
Hagiwara K (2002) Regularization learning, early stopping and biased estimator Neurocomputing 48:937-955
Hagiwara K, Hayasaka T, Toda N, Usui S, Kuno K (2001) Upper bound of the expected training error of neural network regression for a Gaussian noise sequence. Neural Netw 14:1419-29 [PubMed]
Hagiwara K, Toda N, Usui S (1993) On the problem of applying AIC to determine the structure of a layered feed-forward neural network Proceedings Of IJCNN 3:2263-2266
Hartigan JA (1985) A failure of likelihood asymptotics for normal mixtures Proc Berkeley Conf in Honor of J Neyman and J Kiefer 2:807-810
Hotelling H (1939) Tubes and spheres in n-spaces, and a class of statistical problems Amer J Math 61:440-460
Inoue M, Park H, Okada M (2003) On-line learning theory of soft committee machines with correlated hidden units-Steepest gradient descent and natural gradient descent J Phys Soc Jpn 72:805-810
Kang K, Oh JH, Kwon S, Park Y (1993) Generalization in a two-layer neural network Phys Rev E 48:4805-4809
Kitahara M, Hayasaka T, Toda N, Usui S (2000) On the statistical properties of least squares estimators of layered neural networks (in Japanese) IEICE Transactions 86:563-570
Kurkova V, Kainen PC (1994) Functionally equivalent feedforward neural networks Neural Comput 6:543-558
Liu X, Shao Y (2003) Asymptotics for likelihood ratio tests under loss of identifiability Annals Of Statistics 31:807-832
Minsky M, Papert S (1969) Perceptrons
Murata N, Yoshizawa S, Amari S (1994) Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE Trans Neural Netw 5:865-72 [Journal] [PubMed]
Park H, Amari SI, Fukumizu K (2000) Adaptive natural gradient learning algorithms for various stochastic models. Neural Netw 13:755-64 [PubMed]
Park H, Inoue M, Okada M (2003) On-line learning dynamics of multilayer perceptrons with unidentifiable parameters J Phys A Math Gen 36:11753-11764
Rattray M, Saad D (1999) Analysis of natural gradient descent for multilayer neural networks Phys Rev E 59:4523-4532
Rattray M, Saad D, Amari S (1998) Natural gradient descent for on-line learning Phys Rev Lett 81:5461-5464
Riegler P, Biehl M (1995) On-line backpropagation in two-layered neural networks J Phys A Math Gen 28:L507-L513
Rissanen J (1986) Stochastic complexity and modeling Ann Statist 14:1080-1100
Rosenblatt F (1962) Principles Of Neurodynamics
Ruger SM, Ossen A (1997) The metric of weight space Neural Processing Letters 5:63-72
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation Parallel Distributed Processing, Rumelhart DE:McClelland JL, ed. pp.318
Saad D, Solla SA (1995) On-line learning in soft committee machines. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 52:4225-4243 [PubMed]
Schwarz G (1978) Estimating the dimension of a model Ann Stat 6:461-464
Sussmann HJ (1992) Uniqueness of the weights for minimal feedforward nets with a given input-output map Neural Netw 5:589-593
Watanabe S (2001) Algebraic geometrical methods for hierarchical learning machines. Neural Netw 14:1049-60 [PubMed]
Watanabe S (2001) Algebraic information geometry for learning machines with singularities Advances in neural information processing systems, Leen TK:Dietterich TG:Tresp V, ed. pp.329
Watanabe S (2001) Algebraic analysis for nonidentifiable learning machines. Neural Comput 13:899-933 [PubMed]
Watanabe S, Amari S (2003) Learning coefficients of layered models when the true distribution mismatches the singularities Neural Comput 15:1013-1033
Weyl H (1939) On the volume of tubes Amer J Math 61:461-472
Wu S, Amari S, Nakahara H (2002) Population coding and decoding in a neural field: a computational study. Neural Comput 14:999-1026 [Journal] [PubMed]
Wu S, Nakahara H, Amari S (2001) Population coding with correlation and an unfaithful model. Neural Comput 13:775-97 [PubMed]
Yamazaki K, Watanabe S (2002) A probabilistic algorithm to calculate the learning curves of hierarchical learning machines with singularities IEICE Trans 85:363-372
Yamazaki K, Watanabe S (2003) Singularities in mixture models and upper bounds of stochastic complexity. Neural Netw 16:1029-38 [Journal] [PubMed]

References and models that cite this paper

Hiratani N, Fukai T (2018) Redundancy in synaptic connections enables neurons to learn optimally. Proc Natl Acad Sci U S A 115:E6871-E6879 [Journal] [PubMed]
   A model of optimal learning with redundant synaptic connections (Hiratani & Fukai 2018) [Model]
Nakajima S, Watanabe S (2007) Variational Bayes solution of linear neural networks and its generalization performance. Neural Comput 19:1112-53 [Journal] [PubMed]
(58 refs)