This web page contains
a list of topics to be covered in class, notes,
relevant references, and some useful links.
The book by Devroye, Györfi, and Lugosi (A Probabilistic Theory of Pattern Recognition) is abbreviated as DGL when
referenced below.
Topics and References
-
Week 1
Classification problem in a probabilistic framework, The Bayes decision function, Empirical risk minimization (DGL Ch.2).
[March 24] Introduction
[March 26] Empirical risk minimization
-
Week 2
Risk bounds, Concentration inequalities (Hoeffding's inequality) (DGL Ch.8).
[March 31] Probabilistic error bounds
[April 2] The Glivenko-Cantelli theorem
-
Week 3
The Glivenko-Cantelli theorem, Vapnik-Chervonenkis inequality, Vapnik-Chervonenkis theory, V-C dimension (DGL Ch.12 and 13).
[April 7] Vapnik-Chervonenkis inequality
[April 9] Vapnik-Chervonenkis dimension
-
Week 4
Combinatorial aspects of Vapnik-Chervonenkis theory (DGL Ch.13), Support Vector Machines.
[April 14] V-C dimension and shatter coefficient
[April 16] Linear Support Vector Machines
-
Week 5
Support vector machines, Constrained optimization, Regularization.
[April 21] Support Vector Machines
[April 23] Constrained optimization
C. J. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 2(2), 1998.
Javier M. Moguerza and Alberto Muñoz, Support Vector Machines with Applications, Statistical Science, Volume 21, Number 3 (2006), 322-336.
-
Week 6
Kernel methods, Reproducing Kernel Hilbert Spaces (RKHS), Methods of regularization in RKHS.
[April 28-30] Kernel methods and RKHS | Illustration of SVMs and R code
See Section 1.1 of Grace Wahba, Support Vector Machines, Reproducing Kernel Hilbert Spaces and the Randomized GACV. In "Advances in Kernel Methods - Support Vector Learning", Schölkopf, Burges and Smola (eds.), MIT Press 1999, 69-88.
-
Week 7
Methods of regularization in RKHS, Consistency (DGL Ch.6)
[May 5-7] Consistency
-
Week 8
Consistency, Nearest neighbor rules, Stone's theorem for universal consistency (DGL Ch.5 and 6).
[May 12] Nearest neighbor rules
[May 14] Stone's theorem
Cover, T. and Hart, P. (1967), Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13:21-27.
Stone, C. (1977), Consistent nonparametric regression. Annals of Statistics, 5:595-645.
-
Week 9
Universal consistency of k-nearest neighbor rules and kernel classification rules (DGL Ch.6).
[May 19] Kernel classification rules
Devroye, L. and Wagner, T. (1980), Distribution-free consistency results in nonparametric discrimination and regression function estimation. Annals of Statistics, 8:231-239.
-
Week 10
Boosting.
Robert E. Schapire, A brief introduction to boosting. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 1999.
[May 26] No class
[May 28] Boosting
[Acknowledgement] The instructor thanks Prasenjit Kapat for his laborious work in initially scribing the lecture notes in LaTeX.
