Empirical research in economics, finance, and management using R
Essentials, real examples, and troubleshooting
This page contains the slides and numerical simulation code for the lectures I delivered in 2023 while working as a post-doctoral researcher at the University of Luxembourg.
Day 1: Introduction to programming (2023-09-18)
Day 2: Getting started with R (2023-09-20)
Day 3: Special object types in R (2023-09-25)
Day 4: Functions, a.k.a. the pith of R (2023-09-27)
Day 5: Graphics and summaries in R (2023-10-02)
Day 6: Numerical optimisation in R (2023-10-03)
Goal and objectives
The aim of this doctoral course is to help researchers learn the basics of the R programming language and to teach them how to apply it to typical and atypical research questions, how to produce useful diagnostics and visualise the results, and how to proceed when problems or errors arise. The material includes advice that is difficult to find in the literature or in online resources. The practical part of the course is based on questions asked by Ph.D. students in DSEFM over several years. The applied topics covered in the course (see ‘Course details’ below, items 8–10) will depend on the interests of the participants. The theoretical knowledge of numerical methods and algorithms gained in this course is applicable to other statistical packages used for research in the field.
Upon successful completion of this course, students will be able to:
- Understand the logic of the R language and use its strongest features to manage and transform data;
- Use numerical optimisation based on deterministic and stochastic algorithms to answer questions ubiquitous in applied research;
- Create their own functions, routines, and simulations for cutting-edge research to increase productivity, even when there is no existing implementation or solution;
- Produce neat publication-ready plots in 2D and 3D and create vibrant animations to better illustrate the research problem;
- Troubleshoot errors, debug functions, and locate the sources of errors and performance bottlenecks;
- Understand how computers store and process data and where accuracy is typically lost.
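As a small illustration of the last outcome, the sketch below shows two common ways in which accuracy is lost in double-precision arithmetic: decimal fractions that have no exact binary representation, and the subtraction of nearly equal numbers. It uses base R only; the function names `naive` and `stable` are made up for this example and are not taken from the course materials.

```r
# 1. Representation error: 0.1, 0.2 and 0.3 are not exactly representable in binary
x <- 0.1 + 0.2
print(x == 0.3)                   # exact comparison of two doubles
print(isTRUE(all.equal(x, 0.3)))  # comparison up to a numerical tolerance
print(.Machine$double.eps)        # spacing between 1 and the next larger double

# 2. Cancellation error: (1 - cos(x)) / x^2 tends to 0.5 as x -> 0,
#    but 1 - cos(x) loses all significant digits for tiny x
naive  <- function(x) (1 - cos(x)) / x^2
stable <- function(x) 2 * (sin(x / 2) / x)^2  # same quantity via 1 - cos(x) = 2 sin^2(x/2)

print(c(naive = naive(1e-8), stable = stable(1e-8)))
```

Both functions compute the same mathematical expression, so any difference between the two printed values is purely a floating-point artefact.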
Course details
- How computers compute, how data are stored. Data formats. Programming concepts. Comparison of statistical packages. Showcasing R as a language.
- Best practices for development. How to get help. R packages. RStudio features. Data types in R, object types, variables, vectors, matrices, functions.
- Special data types. Classes and methods. Logical operations, conditions, loops. Subsetting, data manipulation. Lists, vectorised operations. Data transformation best practices. Text manipulation.
- Functions in R. How to efficiently create a user function. Environments. Debugging. Parallel computing. Speeding up, benchmarking, profiling.
- Graphics in R: scatter plots, line plots, heat maps, histograms, bar plots, box-and-whisker plots. Plot clean-up. Formulae, summarisation, aggregation, describing data. Producing animations.
- Numerical optimisation (convex and non-convex). Derivative-based, derivative-free and stochastic optimisers. Speeding up slow optimisation problems.
- Applied economic analysis in R: OLS, 2SLS, GMM, panel models, non-linear and non-parametric methods, time-series methods, robust estimation. Hypothesis testing and inference.
- Practical session #1: fetching data from the web; merging ‘dirty’ data sets; diagnosing panel models; variable selection via LASSO/Elastic Net.
- Practical session #2: detecting non-linearities in relationships, parametric and semi-parametric specification testing, improving estimation efficiency, principal component analysis and dimensionality reduction.
- Practical session #3: custom conditional-density models; avoiding numerical instabilities; reproducing non-standard plots; demographic calculations; recreational statistics.
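To give a flavour of the numerical optimisation topic in the outline above, here is a minimal sketch that minimises the Rosenbrock test function with a derivative-based, a derivative-free, and a stochastic optimiser. It relies only on `optim()` from base R; the choice of test function, the starting point, and the seed are assumptions made for this illustration and are not taken from the course slides.

```r
# Rosenbrock 'banana' function: non-convex, global minimum 0 at (1, 1)
rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2

# Analytical gradient, used by the derivative-based method
rosenbrock_grad <- function(p) c(
  -2 * (1 - p[1]) - 400 * p[1] * (p[2] - p[1]^2),
  200 * (p[2] - p[1]^2)
)

start <- c(-1.2, 1)  # a conventional starting point for this test function

# Derivative-based: quasi-Newton BFGS with the analytical gradient
fit_bfgs <- optim(start, rosenbrock, gr = rosenbrock_grad, method = "BFGS")

# Derivative-free: Nelder-Mead simplex search
fit_nm <- optim(start, rosenbrock, method = "Nelder-Mead")

# Stochastic: simulated annealing (the seed is fixed for reproducibility)
set.seed(1)
fit_sann <- optim(start, rosenbrock, method = "SANN")

# Compare estimates, attained minima, and the number of function evaluations
print(rbind(BFGS = fit_bfgs$par, NelderMead = fit_nm$par, SANN = fit_sann$par))
print(c(BFGS = fit_bfgs$value, NelderMead = fit_nm$value, SANN = fit_sann$value))
print(c(BFGS = fit_bfgs$counts[["function"]],
        NelderMead = fit_nm$counts[["function"]],
        SANN = fit_sann$counts[["function"]]))
```

Comparing the evaluation counts gives a rough idea of how much each class of optimiser costs on a smooth problem where a gradient is available.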
Miscellaneous
This course might change the manner in which one conceptualises applied statistical analysis on a computer and, in fact, may create certain cognitive biases.