Subject description - B4M36SAN

B4M36SAN Statistical Data Analysis
Roles: PO, P
Extent of teaching: 2P+2C
Department: 13136
Language of teaching: CS
Guarantors: Kléma J.
Completion: Z,ZK
Lecturers: Kléma J.
Credits: 6
Tutors: Barvínek J., Kléma J., Le A., Míkovec Z., Pevný T.
Semester: Z


Annotation:

This course builds on the skills developed in introductory statistics courses. It is practically oriented and gives an introduction to applied statistics. It focuses mainly on multivariate statistical analysis and modelling, i.e., methods that help to understand, interpret, visualize and model potentially high-dimensional data. It can be seen as a purely statistical counterpart to machine learning and data mining courses.

Course outlines:

1. Introduction, motivation, a course map, review of the basic statistical terms and methods.
2. Dimension reduction (PCA and kernel PCA).
3. Dimension reduction (other non-linear methods).
4. Clustering (basic methods, spectral clustering).
5. Clustering (biclustering, semi-supervised clustering).
6. Multivariate confirmatory analysis (ANOVA and MANOVA).
7. Discriminant analysis (categorical dependent variable, LDA, logistic regression).
8. Multivariate regression (continuous dependent variable, linear regression, p-values, overfitting).
9. Multivariate regression (non-linear models, polynomial and local regression).
10. Anomaly detection.
11. Robust statistics.
12. Empirical studies, their design and evaluation.
13. Power analysis.
14. The final review, spare lecture.

Exercises outline:

1. Programming in R, introduction.
2. R libraries, statistical packages, the swirl learning package.
3. Data visualization in R.
4. Dimension reduction - assignment.
5. Clustering - assignment.
6. Multivariate confirmatory analysis - assignment.
7. Discriminant analysis - assignment.
8. Mid-term test.
9. Multivariate linear regression - assignment.
10. Multivariate non-linear regression - assignment.
11. Anomaly detection - assignment.
12. Empirical study design - assignment.
13. Power analysis - assignment.
14. Spare lab, credits.


Literature:

1. Hair, J. F., et al.: Multivariate Data Analysis: A Global Perspective. 7th ed., Prentice Hall, 2009.
2. James, G., et al.: An Introduction to Statistical Learning with Applications in R. Springer, 2013.


Requirements:

General statistical concepts as covered in the course B0B01PST. Knowledge of linear classification, clustering and dimensionality reduction; see B4B33RPZ for details.



Keywords:

multivariate data analysis, multivariate regression, clustering, dimensionality reduction, anomaly detection, power analysis

The subject is included in these academic programs:

Program      Branch                      Role  Recommended semester
MPOI1_2016   Human-Computer Interaction  PO    3
MPOI2_2016   Cyber Security              PO    1
MPOI8_2018   Bioinformatics              PO    1
MPOI9_2016   Data Science                PO    1
MPOI2_2018   Cyber Security              PO    1
MPOI9_2018   Data Science                PO    1
MPOI8_2016   Bioinformatics              PO    1
MPBIO1_2018  Bioinformatics              P     1
MPBIO4_2018  Signal processing           P     1
MPBIO3_2018  Image processing            P     1
MPBIO2_2018  Medical Instrumentation     P     1
MPOI1_2018   Human-Computer Interaction  PO    3

Page updated 14.6.2021 19:52:31, semester: L/2021-2, L/2020-1, Z,L/2022-3, Z/2021-2.