## Introduction to Statistical Programming with R

### Course topics

Parts 1-2. Introduction to R programming

• Data structures in R
• Concept of vectorization
• Loops in R
• Functions in R
• Apply-family functions

Part 3. Useful R packages

• Important libraries: dplyr, purrr, tidyr, ggplot2
• Reading, writing, and manipulating data
• Data visualizations in R

Part 4. Descriptive statistics

• Basic descriptive functions
• Characteristics of univariate data sets
• Characteristics of bivariate data sets
• Descriptive visualizations

Part 5. Probability theory in R

• Probability of events
• Random variables and distribution functions
• Visualizing data distributions

Part 6. Inferential statistics

• Parameter estimation
• Confidence intervals
• Statistical tests
• Hypothesis tests

Part 7. Regression (part I)

• Linear regression
• Ordinary least squares (OLS)
• Testing the coefficients
• Goodness-of-fit
• Missing data
• Multicollinearity
• Heteroscedasticity
• Autocorrelation
• Outliers

Part 8. Regression (part II)

• Model selection
• Omission of relevant regressors
• Inclusion of irrelevant regressors
• Stepwise model selection
• Generalized least squares
• Nonlinear regression

Part 9. Nonparametric regression

• Kernel density estimator
• Univariate nonparametric regression
• Lasso regression
• Ridge regression

Part 10. Modeling binary, nominal, and count data

• Regression trees (CART, CHAID)
• Modeling binary data
• Binary data with CART/CHAID
• Modeling nominal data
• Modeling count data

Part 11. Time-series: decomposition

• Time series components: trend, seasonality, irregular component
• Time series models

Part 12. Time-series: forecasting

• The forecast package
• Parameter estimation for ARMA processes
• Goodness of forecasts
• Naive forecasts
• Exponential smoothing
• Forecast combinations

### Prerequisites

• Introduction to Data Science
• Statistics and Econometrics

### Homework

1. (50 points) Hypothesis testing exercises (using a provided dataset)
2. (50 points) Regression modeling exercises