## Statistics and Econometrics

### Brief summary of the course

The course consists of four main parts. In the first part, we cover the tools and methods of descriptive statistics and probability theory. We focus in particular on characterizing data samples using measures of location, volatility, and correlation. We also discuss random variables, distribution and density functions, and the most important statistical distributions. In the second part, we turn to statistical inference and learn how to draw conclusions about a population from a sample. This includes concepts such as point estimation, maximum likelihood estimation, confidence intervals, and statistical tests. In the third part, we discuss the linear regression model as the key tool of causal data analysis. Here we deal both with theoretical aspects, including its properties and extensions to generalized linear models, and with data-driven issues such as interpretation, model selection, and missing data. Finally, in the fourth part, we consider extensions of linear regression, such as nonparametric regression, regression trees, and logistic regression as a classification method.

### Course topics

Part 1. Descriptive Statistics

• Introduction
• Statistical concepts
• Characteristics of univariate data sets
• Characteristics of bivariate data sets

Part 2. Probability Theory

• Probability of events
• Random variables and distribution functions
• Random vectors

Part 3. Inferential Statistics

• Parameter estimation
• Confidence intervals
• Statistical tests

Part 4. Regression (part I)

• Linear Regression
• OLS
• Testing the coefficients
• Goodness-of-fit
• Missing data
• Multicollinearity
• Heteroscedasticity
• Autocorrelation
• Outliers

Part 5. Regression (part II)

• Model selection
• Omission of relevant regressors
• Inclusion of irrelevant regressors
• Stepwise model selection
• Generalized least squares
• Nonlinear regression

Part 6. Nonparametric regression

• Kernel density estimator
• Univariate nonparametric regression
• Lasso regression

Part 7. Regression trees

• CART
• CHAID

Part 8. Modeling binary, nominal, and count data

• Modeling binary data
• Binary data with CART/CHAID
• Modeling nominal data
• Modeling count data

Part 9. Time series decomposition

• Trend component
• Seasonal component
• Irregular component
• Time series models
• Parameter estimation for ARMA processes

Part 10. Forecasting

• Goodness of forecasts
• Naive forecasts
• Exponential smoothing
• Forecast combinations