Citation Relationships



Amari S, Park H, Ozeki T (2006) Singularities affect dynamics of learning in neuromanifolds. Neural Comput 18:1007-65 [PubMed]

References and models cited by this paper

References and models that cite this paper

Akaho S, Kappen HJ (2000) Nonmonotonic generalization bias of Gaussian mixture models. Neural Comput 12:1411-27 [PubMed]

Akaike H (1974) A new look at the statistical model identification IEEE Trans On Automatic Control 19:716-723

Amari S (1967) Theory of adaptive pattern classifiers IEEE Trans 16:299-307

Amari S (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27:77-87 [PubMed]

Amari S (1987) Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections and divergence Mathematical Systems Theory 20:53-82

Amari S (1998) Natural gradient works efficiently in learning Neural Comput 10:251-276

Amari S (2003) New consideration on criteria of model selection Neural networks and soft computing (Proceedings of the Sixth International Conference on Neural Networks and Soft Computing), Rutkowski L:Kacprzyk J, ed. pp.25

Amari S, Burnashev MV (2003) On some singularities in parameter estimation problems Problems Of Information Transmission 39:352-372

Amari S, Murata N (1993) Statistical theory of learning curves under entropic loss criterion Neural Comput 5:140-154

Amari S, Nagaoka H (2000) Methods of information geometry

Amari S, Nakahara H (2005) Difficulty of singularity in population coding. Neural Comput 17:839-58 [Journal] [PubMed]

Amari S, Ozeki T (2001) Differential and algebraic geometry of multilayer perceptrons IEICE Trans 84:31-38

Amari S, Ozeki T, Park H (2003) Learning and inference in hierarchical models with singularities Systems And Computers In Japan 34:34-42

Amari S, Park H, Fukumizu K (2000) Adaptive method of realizing natural gradient learning for multilayer perceptrons. Neural Comput 12:1399-409 [PubMed]

Amari S, Park H, Ozeki T (2001) Statistical inference in nonidentifiable and singular statistical models J Of The Korean Statistical Society 30:179-192

Amari S, Park H, Ozeki T (2002) Geometrical singularities in the neuromanifold of multilayer perceptrons Advances in neural information processing systems, Dietterich TG:Becker S:Ghahramani Z, ed. pp.343

Brockett RW (1976) Some geometric questions in the theory of linear systems IEEE Trans On Automatic Control 21:449-455

Chen AM, Lu H, Hecht-Nielsen R (1993) On the geometry of feedforward neural network error surfaces Neural Comput 5:910-927

Dacunha-Castelle D, Gassiat E (1997) Testing in locally conic models, and application to mixture models Probability And Statistics 1:285-317

Fukumizu K (1999) Generalization error of linear neural networks in unidentifiable cases Algorithmic learning theory: Proceedings of the 10th International Conference on Algorithmic Learning Theory (ALT99), Watanabe O:Yokomori T, ed. pp.51

Fukumizu K (2003) Likelihood ratio of unidentifiable models and multilayer neural networks Annals Of Statistics 31:833-851

Fukumizu K, Amari S (2000) Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural Netw 13:317-27 [PubMed]

Hagiwara K (2002) On the problem in model selection of neural network regression in overrealizable scenario. Neural Comput 14:1979-2002 [Journal] [PubMed]

Hagiwara K (2002) Regularization learning, early stopping and biased estimator Neurocomputing 48:937-955

Hagiwara K, Hayasaka T, Toda N, Usui S, Kuno K (2001) Upper bound of the expected training error of neural network regression for a Gaussian noise sequence. Neural Netw 14:1419-29 [PubMed]

Hagiwara K, Toda N, Usui S (1993) On the problem of applying AIC to determine the structure of a layered feed-forward neural network Proceedings Of IJCNN 3:2263-2266

Hartigan JA (1985) A failure of likelihood asymptotics for normal mixtures Proc Berkeley Conf in Honor of J Neyman and J Kiefer 2:807-810

Hotelling H (1939) Tubes and spheres in n-spaces, and a class of statistical problems Amer J Math 61:440-460

Inoue M, Park H, Okada M (2003) On-line learning theory of soft committee machines with correlated hidden units-Steepest gradient descent and natural gradient descent J Phys Soc Jpn 72:805-810

Kang K, Oh JH, Kwon S, Park Y (1993) Generalization in a two-layer neural network Phys Rev E 48:4805-4809

Kitahara M, Hayasaka T, Toda N, Usui S (2000) On the statistical properties of least squares estimators of layered neural networks (in Japanese) IEICE Transactions 86:563-570

Kurkova V, Kainen PC (1994) Functionally equivalent feedforward neural networks Neural Comput 6:543-558

Liu X, Shao Y (2003) Asymptotics for likelihood ratio tests under loss of identifiability Annals Of Statistics 31:807-832

Minsky M (1969) Perceptrons

Murata N, Yoshizawa S, Amari S (1994) Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE Trans Neural Netw 5:865-72 [Journal] [PubMed]

Park H, Amari SI, Fukumizu K (2000) Adaptive natural gradient learning algorithms for various stochastic models. Neural Netw 13:755-64 [PubMed]

Park H, Inoue M, Okada M (2003) On-line learning dynamics of multilayer perceptrons with unidentifiable parameters J Phys A Math Gen 36:11753-11764

Rattray M, Saad D (1999) Analysis of natural gradient descent for multilayer neural networks Phys Rev E 59:4523-4532

Rattray M, Saad D, Amari S (1998) Natural gradient descent for on-line learning Phys Rev Lett 81:5461-5464

Riegler P, Biehl M (1995) On-line backpropagation in two-layered neural networks J Phys A Math Gen 28:L507-L513

Rissanen J (1986) Stochastic complexity and modeling Ann Statist 14:1080-1100

Rosenblatt F (1962) Principles Of Neurodynamics

Ruger SM, Ossen A (1997) The metric of weight space Neural Processing Letters 5:63-72

Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation Parallel Distributed Processing, Rumelhart DE:McClelland JL, ed. pp.318

Saad D, Solla SA (1995) On-line learning in soft committee machines. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 52:4225-4243 [PubMed]

Schwarz G (1978) Estimating the dimension of a model Ann Stat 6:461-464

Sussmann HJ (1992) Uniqueness of the weights for minimal feedforward nets with a given input-output map Neural Netw 5:589-593

Watanabe S (2001) Algebraic geometrical methods for hierarchical learning machines. Neural Netw 14:1049-60 [PubMed]

Watanabe S (2001) Algebraic information geometry for learning machines with singularities Advances in neural information processing systems, Leen TK:Dietterich TG:Tresp V, ed. pp.329

Watanabe S (2001) Algebraic analysis for nonidentifiable learning machines. Neural Comput 13:899-933 [PubMed]

Watanabe S, Amari S (2003) Learning coefficients of layered models when the true distribution mismatches the singularities Neural Comput 15:1013-1033

Weyl H (1939) On the volume of tubes Amer J Math 61:461-472

Wu S, Amari S, Nakahara H (2002) Population coding and decoding in a neural field: a computational study. Neural Comput 14:999-1026 [Journal] [PubMed]

Wu S, Nakahara H, Amari S (2001) Population coding with correlation and an unfaithful model. Neural Comput 13:775-97 [PubMed]

Yamazaki K, Watanabe S (2002) A probabilistic algorithm to calculate the learning curves of hierarchical learning machines with singularities Trans on IEICE 85:363-372

Yamazaki K, Watanabe S (2003) Singularities in mixture models and upper bounds of stochastic complexity. Neural Netw 16:1029-38 [Journal] [PubMed]

Hiratani N, Fukai T (2018) Redundancy in synaptic connections enables neurons to learn optimally. Proc Natl Acad Sci U S A [Journal] [PubMed]

   A model of optimal learning with redundant synaptic connections (Hiratani & Fukai 2018) [Model]

Nakajima S, Watanabe S (2007) Variational Bayes solution of linear neural networks and its generalization performance. Neural Comput 19:1112-53 [Journal] [PubMed]

(58 refs)