# Lessons

**This Lesson Covers**

- What’s an Open Lab?
- Why R?
- Learning objectives for the semester
- Setup: R, RStudio
- A quick example

**Required Data Files**

listings.csv

**Optional Reading**

R for Data Science Chapters 4, 6, 8

**This Lesson Covers**

- Reproducibility
- Projects in RStudio
- Importing data
- Objects and classes
- Tables for categorical data
- Exploring continuous data
- Missing data
- Saving output
- ggplot2 (time permitting)

**Required Data Files**

Five Thousand Wine Reviews

**Optional Reading**

R for Data Science Chapter 5

**This Lesson Covers**

- Review: Starting a New Project in R, loading the tidyverse and importing data
- Filtering
- Relational and Assignment Operators
- Reordering Data (arrange)
- Selecting Data (select)
- Renaming Columns
- Adding New Variables
- Summarizing Data
- Piping

**Required Data Files**

Boston Airbnb Data

**Optional Reading**

R for Data Science Chapter 7

**This Lesson Covers**

- What is Exploratory Data Analysis?
- What do we have? – dim, str, and summary
- Frequency – Univariate EDA
- Covariation – Two or more variables
- Categorical vs Categorical Variables
- Categorical vs Continuous Variables

**Required Data Files**

New York Business Inspections

**Optional Reading**

R for Data Science Chapters 14 and 15

**This Lesson Covers**

- Getting Started With Strings
- Combining and Subsetting Strings
- Regular Expressions
- Creating Factors
- Altering Factors

**Required Data Files**

Brazilian E-Commerce

**Optional Reading**

R for Data Science Chapters 12 and 13

**This Lesson Covers**

- Merging / Joining Dataframes
- Reshaping with tidyr

**Required Data Files**

US Cheese Consumption

**Optional Reading**

R for Data Science Chapter 27

**This Lesson Covers**

- R Markdown
- Markdown Syntax
- Creating Reproducible Reports

**Required Data Files**

Weather in Austin, TX

**Optional Reading**

R for Data Science Chapter 19

**This Lesson Covers**

- When you should write a function
- Steps to writing a function
- Naming conventions
- Arguments
- Returns
- Conditionals
- Environment

**This Lesson Covers**

- Getting started with loops
- Output
- While loops
- Loops with conditionals and functions
- Error handling

**This Lesson Covers**

- Terminology
- Simple Linear Models with Plots
- Multiple Regression – Formula notation in R
- Modeling
- Simulations
- Reproducible simulations

# Extras

**This Lesson Covers**

Have you ever wanted to change your ggplots with the click of a button? Wouldn’t it be nice to use a drop-down menu to filter your data? R Shiny allows you and others to interact with your code through a graphical web interface.

**This Lesson Covers**

This extra shows you an easy way to split up your loops over multiple cores on your computer to run in “parallel” and speed up large or long-running loops.

**This Lesson Covers**

This extra demonstrates two useful tools for handling missing data in statistical models.

**This Lesson Covers**

The `caret` package provides a consistent framework for fitting hundreds of different types of predictive models, then comparing them to select the most effective models using out-of-sample accuracy.

**This Lesson Covers**

The `stargazer` package makes it easy to create publication-quality regression tables in HTML or LaTeX.

**This Lesson Covers**

The `GGally` package provides a function for creating scatterplot matrices. A scatterplot matrix arranges multiple scatterplots on a grid so that they are easy to compare with one another.