Brief summary of the course
This course covers the R programming language, with an emphasis on its functions for statistical analysis and modeling. It will make studying econometrics easier, because R lets you carry out statistical data analysis while writing very little code of your own.
Learning outcomes
By the end of the course, students will have specialized conceptual knowledge that incorporates recent advances in computer science and provides a basis for original thinking and research, as well as for critical analysis of problems within computer science and at its boundaries with other fields. Students will be able to:
● develop algorithms and software for data analysis (including big data);
● design architectural solutions for information and computer systems for various purposes;
● test software;
● identify and fix issues that arise while software is in operation, and formulate tasks for its modification or reengineering.
Course plan
Parts 1-2. Introduction to R programming
● Data structures in R
● Concept of vectorization
● Loops in R
● Functions in R
● Apply-family functions
Part 3. Useful R packages
● Important packages: dplyr, purrr, tidyr, ggplot2
● Reading, writing, and manipulating data
● Data visualization in R
Part 4. Descriptive Statistics
● Basic descriptive functions
● Characteristics of univariate data sets
● Characteristics of bivariate data sets
● Descriptive visualizations
Part 5. Probability Theory in R
● Probability of events
● Random variables and distribution functions
● Visualizing data distributions
Part 6. Inferential Statistics
● Parameter estimation
● Confidence intervals
● Statistical tests
● Hypothesis tests
Part 7. Regression (part I)
● Linear regression
● Ordinary least squares (OLS)
● Testing the coefficients
● Goodness-of-fit
● Missing data
● Multicollinearity
● Heteroscedasticity
● Autocorrelation
● Outliers
Part 8. Regression (part II)
● Model selection
● Omission of relevant regressors
● Inclusion of irrelevant regressors
● Stepwise model selection
● Generalized least squares
● Nonlinear regression
Part 9. Nonparametric regression
● Kernel density estimator
● Univariate nonparametric regression
● Lasso regression
● Ridge regression
Part 10. Modeling binary, nominal, and count data
● Regression trees (CART, CHAID)
● Modeling binary data
● Binary data with CART/CHAID
● Modeling nominal data
● Modeling count data
Part 11. Time-series: decomposition
● Time series components: trend, seasonality, irregular component
● Time series models
Part 12. Time-series: forecasting
● The forecast package
● Parameter estimation for ARMA processes
● Goodness of forecasts
● Naive forecasts
● Exponential smoothing
● Forecast combinations