Course Homepage for Advanced Computational Statistics (PhD Course 2025; 7.5 HEC)
Summary
Statistics depends heavily on computational methods. Optimisation methods are used in statistics, for example, for maximum likelihood estimation, optimal experimental design, and risk minimisation in decision-theoretic models. In these cases, solutions of optimisation problems usually have no closed form and need to be computed numerically with an algorithm. Computational methods are also in high demand when statistical distributions have to be simulated or integrated and statistics of these distributions have to be determined efficiently.
This course focuses on computational methods for optimisation, simulation, and integration needed in statistics. The optimisation part discusses gradient based, stochastic gradient based, and gradient free methods, as well as constrained optimisation. We will also discuss techniques for simulating efficiently when solving statistical problems.
Implementations will be in the programming language R. Examples from machine learning and optimal design will illustrate the methods.
Most welcome to the course!
Frank Miller, Department of Computer and Information Science, Linköping University
frank.miller at liu.se
Topic 1: Gradient based optimisation
Lectures: March 11; Time 13:30-17:30. Linköping University, Campus Valla. Room: John von Neumann.
Reading:
- Givens GH, Hoeting JA (2013). Computational Statistics, 2nd edition. John Wiley & Sons, Inc., Hoboken, New Jersey. Chapter 2 up to Section 2.2.3 (and Sections 1.1-1.4 if needed).
- Goodfellow I, Bengio Y, Courville A (2016). Deep Learning. MIT Press, http://www.deeplearningbook.org. Chapter 4.3 (and parts of Chapter 2 and Chapter 4.2 if needed).
- Wright SJ, Recht B (2022). Optimization for data analysis. Cambridge University Press. Chapters 2 to 4.
- About analytical optimisation. (Frank Miller, March 2023/2025)
Example code: steepestascentL1.r
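For a first flavour of the method before looking at the course file, a minimal steepest-ascent sketch in R could look as follows; the objective g, its gradient, and the step-halving rule are illustrative assumptions and may differ from steepestascentL1.r:

# Minimal steepest-ascent sketch (illustrative; steepestascentL1.r may differ).
# Maximises g by moving along the gradient and halving the step size
# whenever a step does not improve the objective.
g     <- function(x) -x[1]^2 - 2*x[2]^2 + x[1]*x[2]    # example objective (assumed)
gradg <- function(x) c(-2*x[1] + x[2], x[1] - 4*x[2])  # its gradient
steepestascent <- function(x0, eps = 1e-8, alpha0 = 1, maxit = 1000) {
  x <- x0
  for (i in 1:maxit) {
    alpha <- alpha0
    xnew  <- x + alpha * gradg(x)
    while (g(xnew) < g(x) && alpha > eps) {  # backtracking: halve step until g improves
      alpha <- alpha / 2
      xnew  <- x + alpha * gradg(x)
    }
    if (sum((xnew - x)^2) < eps^2) break     # stop when the step is tiny
    x <- xnew
  }
  x
}
steepestascent(c(1, 1))  # approaches the maximiser c(0, 0)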
Assignment for lecture 1
Topic 2: Stochastic gradient based optimisation
Lectures: March 12; Time 9:00-12:00. Linköping University, Campus Valla. Room: John von Neumann.
Reading:
- Wright SJ, Recht B (2022). Optimization for data analysis. Cambridge University Press. Chapters 5 and 6.
- Goodfellow I, Bengio Y, Courville A (2016). Deep Learning. MIT Press, http://www.deeplearningbook.org. Chapter 5.9 and 8.1 to 8.6 (and other parts of Chapter 5 based on interest).
- Wright SJ (2015). Coordinate descent algorithms. Mathematical Programming 151(1), 3-34.
Further reading:
- Pedregosa F (2018). The stochastic gradient method.
- Kingma DP, Ba J (2015). Adam: A method for stochastic optimization. International Conference on Learning Representations.
- Dwork C, McSherry F, Nissim K, Smith A (2006). Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3, pp. 265-284. Springer Berlin Heidelberg.
- Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016). Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp. 308-318.
Dataset logist.txt
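The format of logist.txt is not described on this page, so the following sketch of stochastic gradient ascent for a logistic regression simulates stand-in data instead; the model, step-size schedule, and parameter values are assumptions for illustration only:

# Stochastic gradient sketch for logistic regression (illustrative only;
# the assignment data logist.txt are not used here).
# Model: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1*x))).
set.seed(1)
n <- 200
x <- rnorm(n)                            # simulated stand-in covariate
y <- rbinom(n, 1, plogis(0.5 + 1.5 * x)) # simulated binary response
sgd_logistic <- function(x, y, gamma = 0.1, epochs = 50) {
  b <- c(0, 0)                           # (intercept, slope)
  for (e in 1:epochs) {
    for (i in sample(length(y))) {       # one observation per update, random order
      p    <- plogis(b[1] + b[2] * x[i])
      grad <- (y[i] - p) * c(1, x[i])    # gradient of one log-likelihood term
      b    <- b + gamma * grad           # ascent step (maximising the likelihood)
    }
    gamma <- gamma * 0.95                # simple decaying step size
  }
  b
}
sgd_logistic(x, y)  # compare with coef(glm(y ~ x, family = binomial))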
Assignment for lecture 2
Topic 3: Gradient free optimisation
Lectures: April 1; Time 10:00-12:00 and 13:00-15:00. Online via Zoom (link will be sent to registered participants by email).
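As a small preview of the topic, R's built-in optim uses the Nelder-Mead simplex method by default, which needs only function values and no gradients; the Rosenbrock test function below is an illustrative choice, not course material:

# Gradient free optimisation with the Nelder-Mead simplex method
# (default method of R's optim); only function values are evaluated.
rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
res <- optim(c(-1.2, 1), rosenbrock, method = "Nelder-Mead")
res$par  # close to the minimiser c(1, 1), found without any gradient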
Prerequisites
Admission to a doctoral program in Sweden in Statistics or a related field (e.g., Mathematical Statistics, Engineering Science, Quantitative Finance, Computer Science). Knowledge of statistical inference (e.g., from the Master's level) and familiarity with a programming language (e.g., R) are required.
Content
The course contains fundamental principles of computational statistics. Focus is on:
- Principles of gradient based and gradient free optimisation including stochastic optimisation and constrained optimisation
- Introduction to convergence analysis for stochastic optimisation algorithms
- Statistical problem-solving using optimisation, including maximum likelihood, regularized least squares, and optimal experimental designs
- Principles of numerical integration
- Principles of statistical simulation
- The bootstrap method
- Statistical problem-solving using simulation techniques, including generation of Monte Carlo estimates, their confidence intervals, and posterior distributions (see the sketch after this list)
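To give an idea of the last point, a minimal sketch of a Monte Carlo estimate with a normal-approximation confidence interval follows; the integrand and the sample size are arbitrary illustrative choices:

# Monte Carlo estimate of integral_0^1 exp(-u^2) du, i.e. E[h(U)] with
# U ~ Uniform(0, 1), together with a 95% normal-approximation CI.
set.seed(1)
m   <- 1e5
h   <- exp(-runif(m)^2)   # integrand evaluated at m uniform draws
est <- mean(h)            # Monte Carlo estimate
se  <- sd(h) / sqrt(m)    # standard error of the estimate
c(estimate = est, lower = est - 1.96 * se, upper = est + 1.96 * se)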
Intended Learning Outcomes
On completion of the course, the student is expected to be able to:
- Demonstrate knowledge of principles of computational statistics
- Explain theoretical and empirical methods to compare different algorithms
- Design and organize algorithms for optimisation, integration, and simulation of distributions
- Solve statistical computing problems using advanced algorithms
- Adapt a given optimisation, integration, or simulation method to a specific problem
- Assess, compare and contrast properties of alternative optimisation, integration, and simulation methods
- Critically judge different methods for optimisation, integration, and simulation
- Choose an adequate method for a given statistical problem
Examination and Grading
The intended learning outcomes will be assessed through several individual home assignments. Grades given: Pass or Fail.
Course Literature
- Givens GH, Hoeting JA (2013). Computational Statistics, 2nd edition. John Wiley & Sons, Inc., Hoboken, New Jersey.
- Goodfellow I, Bengio Y, Courville A (2016). Deep Learning. MIT Press, http://www.deeplearningbook.org. Focus on Chapter 4, 5, and 8.
- Wright SJ, Recht B (2022). Optimization for data analysis. Cambridge University Press.
- Further literature including research articles and other learning material will be provided in the course.
Course Structure and Schedule
Lectures and some problem sessions. The teaching is conducted in English. Course participants will spend most of their study time solving the problem sets for each topic on their own computers, without supervision. The course will be held in March, April, and May 2025.
- Lecture 1: Gradient based optimisation
March 11, 13:30-17:30 (in Linköping)
- Lecture 2: Stochastic gradient based optimisation
March 12, 9:00-12:00 (in Linköping)
- Lecture 3: Gradient free optimisation
April 1, 10:00-12:00, 13:00-15:00 (online, Zoom)
- Lecture 4: Optimisation with constraints
April 15, 10:00-12:00, 13:00-15:00 (online, Zoom)
- Lecture 5: EM algorithm and bootstrap
April 29, 10:00-13:00 (online, Zoom)
- Lecture 6: Simulation of random variables
May 15, 13:30-17:30 (in Linköping)
- Lecture 7: Numerical and Monte Carlo integration; importance sampling
May 16, 9:00-12:00 (in Linköping)
Lectures 1, 2, 6, and 7 will all be in the room John von Neumann, see a map via this link.
Teachers
Lectures 1 and 3-7: Frank Miller
Lecture 2: Sebastian Mair
Registration
The registration deadline has passed. If you are interested in participating, please send an email to Frank Miller (frank.miller at liu.se) to ask whether you can still join the course. State your name, department, and the name of your supervisor. You are also welcome to get in touch with any questions related to the course.