Overview: Probability and statistics form the foundation of data science and its
many applied techniques. This course introduces students to the main concepts
and methods of probability and statistics and helps them develop probabilistic and
statistical thinking. We will discuss the basic notions of probability (sample spaces
and events; axioms of probability; independence and conditioning; discrete and
continuous random variables and their distributions; expectation, variance, and
other characteristics) and of statistics (samples, descriptive statistics, parameter
estimation, hypothesis testing, and regression) that are necessary for understanding
the main techniques of data science.
Course topics:
Part I. Basic notions
Topic 1. Introduction
● Notion of probability
● Classical and geometric probability
● Combinatorial analysis
Topic 2. Axioms of probability
● Sample spaces, events, probability
● Axioms of probability
● Inclusion-exclusion principle
Topic 3. Conditioning
● Conditional probability; independence
● Total probability rule
● Bayes’ formula
Part II. Random variables
Topic 4. Discrete random variables
● Definition and examples
● Probability mass function and cumulative distribution function
● Independence and joint distribution
Topic 5. Continuous random variables
● Density and cumulative distribution functions
● Standard continuous distributions
● Independence, joint distributions, transformed distributions
Topic 6. Basic characteristics of random variables
● Expectation and variance; moments
● Covariance and correlation
● Conditional distributions
Part III. Limit theorems and random processes
Topic 7. Limit theorems: LLN and CLT
● Chebyshev’s inequality and Weak Law of Large Numbers
● Types of convergence; SLLN; Monte Carlo method
● Central Limit Theorem and approximation by normal distribution
Topic 8. Markov chains
● Examples and basic notions
● Stationary distribution
● Absorption probability and time
Topic 9. Some random processes
● Bernoulli process
● Poisson process
● Random walk
Part IV. Parameter estimation
Topic 10. Statistical models
● Statistical models
● Samples and their characteristics; graphical tools
● Parametric families of distributions, statistics, and estimators
Topic 11. Parameter estimation
● Unbiased and consistent estimators
● Method of moments
● Maximum likelihood estimator
Topic 12. Confidence intervals
● Point vs interval estimation
● Confidence intervals for the mean and variance
● Exit polls and confidence intervals
Part V. Hypothesis testing
Topic 13. Hypothesis testing
● Neyman-Pearson framework
● Type I and Type II errors
● Test size, power function, p-value
Topic 14. Tests for normal distribution
● The z-test
● The t-test
● Tests for the variance
Topic 15. Regression
● The linear model
● Parameter estimation
● Hypothesis testing in linear regression
Instructors:
Key facts:
Semester: 3
Credits: 6 ECTS
Degree program: Computer Science, IT and Business Analytics