This three-day training will turn attendees into savvy data programmers and visualizers with a solid foundation in data cleaning and visualization. Students will become comfortable with R, an open-source language that is widely used by professional statisticians and analysts. R is designed for analyzing data effectively, creating predictive models, and building polished visualizations. The workshop integrates practice time and discussion time in class to improve retention and provide individualized support throughout the session.

Data science fundamentals:

  • What is data science?
  • A data scientist's approach
  • Commercial applications of data science

Introduction to R programming:

  • Installing R and RStudio
  • Introduction to RStudio
  • Performing basic calculations in R
  • Loading data into R
  • Understanding data types and how and when to use them
  • Reading and writing data

Fundamentals of data cleaning:

  • Evaluating and addressing missing values in data
  • Manipulating data types and structures
  • Transforming and cleaning data using tidyverse’s dplyr package
  • Selecting and subsetting data
  • Summarizing and aggregating data

Basic visualizations:

  • Basic plotting in R
  • Basic plotting with ggplot2
  • Customizing graphs and adjusting formats
  • Telling a story through data and visualizations

Interactive graphs and maps in R:

  • Introduction to interactive visualization
  • Introduction to charts and graphs with ggvis
  • Interactive maps with leaflet
  • Interactive visualizations with Highcharts
  • Publishing your interactive visualizations to the web


Introduction to R, Data Cleaning and Visualization Training Objectives: 

  • Understand how data science can be used effectively in industry
  • Program proficiently in R
  • Build powerful static and interactive data visualizations
  • Communicate findings effectively through visualizations