## Introduction to Statistical Programming with R

### Course topics

Parts 1-2. Introduction to R programming

• Data structures in R
• Concept of vectorization
• Loops in R
• Functions in R
• Apply-family functions
• Important libraries: dplyr, ggplot2
• Reading, writing, and manipulating data
• Data visualization in R

Part 3. Descriptive Statistics and Probability Theory in R

• Basic descriptive functions
• Characteristics of univariate data sets
• Characteristics of bivariate data sets
• Probability of events
• Random variables and distribution functions

Part 4. Inferential Statistics

• Parameter estimation
• Confidence intervals
• Statistical tests
• Hypothesis tests

Part 5. Regression (part I)

• Linear regression
• Ordinary least squares (OLS)
• Testing the coefficients
• Goodness-of-fit
• Missing data
• Multicollinearity
• Heteroscedasticity
• Autocorrelation
• Outliers

Part 6. Regression (part II)

• Model selection
• Omission of relevant regressors
• Inclusion of irrelevant regressors
• Stepwise model selection
• Generalized least squares
• Nonlinear regression

Part 7. Nonparametric regression, modeling binary, nominal, and count data

• Kernel density estimator
• Univariate nonparametric regression
• Lasso regression
• Regression trees (CART, CHAID)
• Modeling binary data
• Binary data with CART/CHAID
• Modeling nominal data
• Modeling count data

Part 8. Time-series: decomposition and forecasting

• Time series components: trend, seasonality, irregular component
• Time series models
• The forecast package
• Parameter estimation for ARMA processes
• Goodness of forecasts
• Naive forecasts
• Exponential smoothing
• Forecast combinations

### Prerequisites

• Introduction to Data Science
• Statistics and Econometrics

### Homework

1. (50 points) Hypothesis testing exercises (using a given dataset)
2. (50 points) Regression modeling exercises