Citation Relationships

Sugiyama M, Kawanabe M, Müller KR (2004) Trading variance reduction with unbiasedness: the regularized subspace information criterion for robust model selection in kernel regression. Neural Comput 16:1077-104 [PubMed]

References and models that cite this paper

References and models cited by this paper

Akaike H (1974) A new look at the statistical model identification IEEE Trans Autom Control 19:716-723

Akaike H (1980) Likelihood and the Bayes procedure Bayesian statistics, Bernardo JM:DeGroot MH:Lindley DV:Smith AFM, ed. pp.1411

Aronszajn N (1950) Theory of reproducing kernels Transactions Of The American Mathematical Society 68:337-404

Bergman S (1970) The kernel function and conformal mapping

Bishop C (1995) Neural Networks For Pattern Recognition

Bousquet O, Elisseeff A (2002) Stability and generalization J Mach Learn Res 2:499-526

Cherkassky V, Shao X, Mulier FM, Vapnik VN (1999) Model complexity control for regression using VC generalization bounds. IEEE Trans Neural Netw 10:1075-89 [Journal] [PubMed]

Craven P, Wahba G (1979) Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation Numerische Mathematik 31:377-403

Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines

Cucker F, Smale S (2001) On the mathematical foundations of learning Bull Amer Math Soc 39:1-49

Daubechies I (1992) Ten lectures on wavelets

Devroye L, Gyorfi L, Lugosi G (1996) A probabilistic theory of pattern recognition

Donoho DL (1995) De-noising by soft thresholding IEEE Trans Inform Theory 41:613-627

Donoho DL, Johnstone IM (1994) Ideal spatial adaptation via wavelet shrinkage Biometrika 81:425-455

Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791 [Journal] [PubMed]

Geman S, Bienenstock E, Doursat R (1992) Neural networks and the bias-variance dilemma Neural Comput 4:1-58

Girosi F (1998) An Equivalence Between Sparse Approximation and Support Vector Machines. Neural Comput 10:1455-80 [PubMed]

Gu C, Heckman N, Wahba G (1992) A note on generalized cross-validation with replicates Statistics And Probability Letters 14:283-287

Henkel RE (1979) Tests of significance

Heskes T (1998) Bias/Variance Decompositions for Likelihood-Based Estimators. Neural Comput 10:1425-33 [PubMed]

Hoerl AE, Kennard RW (1970) Ridge regression: Biased estimation for nonorthogonal problems Technometrics 12:55-67

Joachims T (1999) Making large-scale SVM learning practical Advances in kernel methods-Support vector learning, Scholkopf B:Burges C:Smola A, ed.

Konishi S, Kitagawa G (1996) Generalized information criteria in model selection Biometrika 83:875-890

Lehmann E, Casella G (1998) Theory Of Point Estimation

Li K (1986) Asymptotic optimality of CL and generalized cross-validation in ridge regression with application to spline smoothing Ann Stat 14:1101-1112

Linhart H (1988) A test whether two AIC's differ significantly South African Statistical Journal 22:153-161

Luntz A, Brailovsky V (1969) On estimation of characters obtained in statistical procedure of recognition Technicheskaya Kibernetica

Mallows CL (1964) Choosing variables in a linear regression Paper presented at the Central Regional Meeting of the Institute of Mathematical Statistics

Mallows CL (1973) Some comments on CP Technometrics 15:661-675

Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based learning algorithms. IEEE Trans Neural Netw 12:181-201 [Journal] [PubMed]

Muller KR, Smola AJ, Ratsch G, Scholkopf B, Kohlmorgen J, Vapnik V (1998) Using support vector machines for time series prediction Advances in kernel methods-Support vector learning, Scholkopf B:Burges CJC:Smola AJ, ed. pp.243

Murata N (1998) Bias of estimators and regularization terms Proceedings of 1998 Workshop on Information-Based Induction Sciences (IBIS98) :87-94

Murata N, Yoshizawa S, Amari S (1994) Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE Trans Neural Netw 5:865-72 [Journal] [PubMed]

Orr MJL (1996) Introduction to radial basis function networks Tech Rep (Available on-line:

Rasmussen CE, Neal RM, Hinton GE, van_Camp D, Revow M, Ghahramani Z, Kustra R, Tibshirani R (1996) The DELVE manual Available on-line:

Saitoh S (1988) Theory of reproducing kernels and its applications

Saitoh S (1997) Integral transforms, reproducing kernels and their applications

Scholkopf B, Smola AJ (2001) Learning with kernels: Support vector machines, regularization, optimization, and beyond

Scholkopf B, Smola AJ, Williamson RC, Bartlett PL (2000) New support vector algorithms Neural Comput 12:1207-45 [PubMed]

Shimodaira H (1997) Assessing the error probability of the model selection test Ann Inst Stat Math 49:395-410

Shimodaira H (1998) An application of multiple comparison techniques to model selection Ann Inst Stat Math 50:1-13

Smola AJ, Schölkopf B, Müller KR (1998) The connection between regularization operators and support vector kernels. Neural Netw 11:637-649 [PubMed]

Stein C (1956) Inadmissibility of the usual estimator for the mean of a multivariate normal distribution Proceedings Of The 3rd Berkeley Symposium On Mathematical Statistics And Probability 1:197-206

Sugiura N (1978) Further analysis of the data by Akaike's information criterion and the finite corrections Communications In Statistics-theory And Methods 7:13-26

Sugiyama M, Kawanabe M, Muller KR (2003) Trading variance reduction with unbiasedness - The regularized subspace information criterion for robust model selection in kernel regression Tech. Rep. No. TR03-0003 (Available on-line:

Sugiyama M, Muller KR (2002) The subspace information criterion for infinite dimensional hypothesis spaces J Mach Learn Res 3:323-359

Sugiyama M, Ogawa H (2001) Subspace information criterion for model selection. Neural Comput 13:1863-89 [Journal] [PubMed]

Sugiyama M, Ogawa H (2002) Optimal design of regularization term and regularization parameter by subspace information criterion. Neural Netw 15:349-61 [PubMed]

Takeuchi K (1976) Distribution of information statistics and validity criteria of models Mathematical Science 153:12-18

Tibshirani R (1996) Regression shrinkage and selection via the LASSO J Roy Stat Soc B 58:267-288

Tsuda K, Sugiyama M, Muller KR (2002) Subspace information criterion for non-quadratic regularizers-Model selection for sparse regressors IEEE Trans Neural Networks 13:70-80

Vapnik V (1995) The Nature of Statistical Learning Theory

Vapnik V (1998) Statistical Learning Theory

Vapnik VN (1982) Estimation of dependencies based on empirical data

Wahba G (1985) A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem Ann Stat 13:1378-1402

Wahba G (1990) Spline models for observational data

Williams CKI (1998) Prediction with Gaussian processes: From linear regression to linear prediction and beyond Learning in graphical models, Jordan MI, ed. pp.599

Williams CKI, Rasmussen CE (1996) Gaussian processes for regression Advances in neural information processing systems, Touretzky DS:Mozer MC:Hasselmo ME, ed. pp.598

(58 refs)