This course uses fundamental concepts of probability theory to introduce the design of adaptive information processing systems. It extends coursework on adaptive signal processing and can also be taken as an introduction to machine learning and data science. Typical application areas include pattern recognition, medical signal analysis, speech and language processing, image processing, bioinformatics and robotics.
In the 2017/18 academic year, this class is taught in semester B (3rd quarter) and starts on 5-Feb-2018.
→ Watch this section for announcements
8-Mar-2018: Added answers to question 15 (on temporal models) in the exercises for part-1.
8-Mar-2018: Class materials for part-2 have been updated.
8-Mar-2018: The lecture booklet for part-1 (PDF format) has been updated to incorporate some minor changes made over the past few weeks. There is no need to print a new version if you already have a previous one.
5-Mar-2018: There was a request to supply an answer to the three-coins problem (in lesson 10 - The EM Algorithm). You can search for "three coins EM algorithm" to find many resources on the internet. The problem and its solution were originally described in Collins (1997), sec. 3.1. I personally like this concise summary.
21-Feb-2018: Per request of some students, I added the corresponding lesson number to the exercises for part 1 (and reshuffled the sequence of exercises to match the sequence order of the lessons).
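For students who want to experiment with the three-coins problem mentioned in the 5-Mar-2018 announcement, here is a minimal Python sketch of EM for it. It is not taken from the lecture notes or from Collins (1997). It assumes the usual formulation: a hidden coin 0 (heads with probability lam) selects coin 1 or coin 2, the selected coin is tossed three times, and only the number of heads per trial is observed. The function names, starting values and example data are illustrative assumptions.

```python
import math


def trial_terms(h, n_tosses, lam, p1, p2):
    """Joint probabilities of h heads with coin 1 and with coin 2 selected."""
    a = lam * p1**h * (1 - p1) ** (n_tosses - h)        # coin 0 = heads -> coin 1
    b = (1 - lam) * p2**h * (1 - p2) ** (n_tosses - h)  # coin 0 = tails -> coin 2
    return a, b


def log_likelihood(head_counts, n_tosses, lam, p1, p2):
    """Log-likelihood of the observed head counts (up to a constant)."""
    return sum(math.log(sum(trial_terms(h, n_tosses, lam, p1, p2)))
               for h in head_counts)


def em_three_coins(head_counts, n_tosses=3, lam=0.3, p1=0.3, p2=0.6, n_iter=50):
    """Run n_iter EM iterations; starting values are arbitrary assumptions."""
    n = len(head_counts)
    for _ in range(n_iter):
        # E-step: posterior probability that coin 1 generated each trial.
        resp = []
        for h in head_counts:
            a, b = trial_terms(h, n_tosses, lam, p1, p2)
            resp.append(a / (a + b))
        # M-step: re-estimate parameters from responsibility-weighted counts.
        r_sum = sum(resp)
        lam = r_sum / n
        p1 = sum(r * h for r, h in zip(resp, head_counts)) / (n_tosses * r_sum)
        p2 = sum((1 - r) * h for r, h in zip(resp, head_counts)) / (n_tosses * (n - r_sum))
    return lam, p1, p2


# Hypothetical data: number of heads in each of eight three-toss trials.
data = [3, 0, 3, 0, 2, 1, 3, 0]
lam_hat, p1_hat, p2_hat = em_three_coins(data)
print(lam_hat, p1_hat, p2_hat)
```

EM only guarantees that the likelihood does not decrease from one iteration to the next, so the estimates depend on the starting values. Starting with p1 equal to p2 leaves the two coins indistinguishable and EM will not separate them.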
- In principle, you can download all needed materials from this site. We strongly recommend that you acquire the following textbook: Pattern Recognition and Machine Learning (Springer, 2006) by Christopher M. Bishop. Try to get the book before classes start.
Part 1: Linear Gaussian Models and the EM Algorithm
Instructor: Prof.dr.ir. Bert de Vries
We present a unified probabilistic modeling approach to a large set of algorithms based on Linear Gaussian Models, including models for regression and classification problems, Gaussian mixture models, Kalman filters, hidden Markov models and various latent component analysis models. Furthermore, we derive the Expectation Maximization (EM) algorithm for maximum likelihood estimation problems and present factor graphs as a unifying framework for efficient realization of probabilistic inference algorithms. In part 1, the emphasis will be on parameter estimation for a given model specification. You can view the lecture notes through the links below:
- 0 - Introduction
- 1 - Machine Learning Overview
- 2 - Probability Theory Review
- 3 - Bayesian Machine Learning
- 4 - Working with Gaussians
- 5 - Density Estimation
- 6 - Linear Regression
- 7 - Generative Classification
- 8 - Discriminative Classification
- 9 - Clustering with Gaussian Mixture Models
- 10 - The EM Algorithm
- 11 - Continuous Latent Variable Models - PCA and FA
- 12 - Factor Graphs and Message Passing Algorithms
- 13 - Dynamic Latent Variable Models
- 14 - EM as a Message Passing Algorithm (this lesson is not part of the exam!)
The source files for these lecture notes are accessible on GitHub. If you catch an error or have a specific update request, please file a GitHub issue.
Here is a PDF bundle of all classes for part-1. The lecture notes may change a bit during the course, e.g., to incorporate comments by students. A final PDF version will be posted after the last lecture.
- Code examples in the lecture notes are in the Julia language, which is syntactically similar to MATLAB. To run the code examples straight in the browser, you will need to open the lecture notes files in a Jupyter notebook. We recommend the cloud-based JuliaBox service for running Jupyter notebooks. Please see these instructions (scroll down to the README) if you want to run the lecture notes in JuliaBox.
Part 2: Model Complexity Control and the MDL Principle
Instructor: Dr.ir. Tjalling J. Tjalkens
In part 2, the discussion on probabilistic modeling extends to model specification itself. Specifically, the notion of Stochastic Complexity will be developed and the Minimum Description Length (MDL) principle will be used to select appropriate models. The lessons are structured as follows:
- Part 2A: The Bayesian Information Criterion
- Part 2B: Bayesian model estimation and Context-tree model selection
- Part 2C: Descriptive complexity
- Click here to view or download the lecture notes for part-2.
- An extended version of the part-2 handouts is in preparation but only half-finished. You can download this UNFINISHED work as well.
- Background on information theory.
- Markov structures and summary of essential content.
- Each year there will be two written exam opportunities. Check the official TU/e course site for exam dates.
- In preparation for the exam, we recommend that you work through the following exercises and old exams:
- Please feel free to consult the following matrix and Gaussian cheat sheets (by Sam Roweis) when working on the exercises.
- Note, however, that you cannot bring notes or books to the exam. All needed formulas are supplied on the exam sheet.
The 2007 class meetings were recorded and can be viewed if you have a valid TU/e account. Note, however, that the current class differs somewhat from the 2007 class. Talk to us before you plan to follow the class only from video.
Prerequisites: Mathematical maturity equivalent to that of an undergraduate engineering program. Some MATLAB programming skills are helpful.
This course is a replacement for the 3-ECTS course 5MB20-Adaptive Information Processing, which was taught between 2005 and 2014. The new course 5SSB0 is a 5-ECTS course and, while the contents are similar to 5MB20, some lessons have been extended with new materials. The slide materials for 5MB20 for the academic year 2014/15 are still available here.
You are advised to bring the lecture notes (as either soft or hard copy) with you to class so that you can add your personal comments.
Some related resources on the net with lots of relevant content
- CS281: Advanced Machine Learning by prof. Ryan Adams, at Harvard University.