1
Introduction
1.1
Goals
1.2
Data
1.3
Git and GitHub
2
Tidying data: tidyr
3
Wrangling data: dplyr
3.1
Two typical workflows
3.2
Manipulating rows
3.2.1
Extract rows
3.2.2
Arranging rows
3.3
Manipulating columns
3.3.1
Extract and rename columns
3.3.2
Create new columns
3.4
Scoped functions
3.5
Aggregate
3.6
Window functions
3.7
Combining tables
3.8
Database backend
3.8.1
Motivation
3.8.2
Set up
3.8.3
Querying the database
4
Dates and times: lubridate
4.1
What is lubridate?
4.2
Basics
4.3
Import data, clean date-time
4.4
Check and modify data
4.5
Before exploration
4.5.1
Intervals
4.5.2
Groupings
4.6
Exploration - Analysis
4.6.1
Temperature
4.6.2
Rainfall
4.6.3
Humidity
4.7
Wrap up
5
Categorical data: forcats
5.1
Introduction
5.2
General functions
5.2.1
Create
5.2.2
Count values per level
5.2.3
Inspect and set levels
5.2.4
Inspect unique values
5.3
Combine factors
5.3.1
Combine factors with different levels
5.3.2
Standardise levels of various factors
5.4
Order of levels
5.4.1
Manual reordering of levels
5.4.2
Reordering by frequency
5.4.3
Reordering by appearance
5.4.4
Reverse level order
5.4.5
Shift levels
5.4.6
Randomly shuffle levels
5.4.7
Reordering levels by other variables
5.5
Change the value of levels
5.5.1
Renaming the levels
5.5.2
Anonymize levels
5.5.3
Collapse multiple levels into one
5.5.4
Aggregate levels into a lump
5.5.5
Manually lump levels
5.6
Add or drop levels
5.6.1
Add levels
5.6.2
Drop levels
5.6.3
Assign a level to NAs
6
Character data: stringr
6.1
Introduction
6.1.1
Use of stringr in the R environment
6.1.2
Types of data that can be used with stringr
6.2
Data Set for examples
6.3
Regular Expressions
6.4
Functions of stringr explained
6.4.1
Manage lengths
6.4.2
Subset Strings
6.4.3
Detect matches
6.4.4
Mutate Strings
6.4.5
Join and Split
6.5
Literature
7
High performance computing: data.table
7.1
Motivation
7.2
Data exploration
7.2.1
Subsetting rows
7.2.2
Selecting Columns
7.2.3
Grouping results
7.2.4
Editing Data
7.2.5
Side effects
7.3
Runtime comparison
7.4
Further resources
R Data Science Book
2
Tidying data: tidyr