Class Details

Price: $1,500

Course Includes:

  • Class exercises in addition to training instruction
  • Courseware books, notepads, pens, highlighters and other materials
  • Full breakfast with a variety of bagels, fruits, yogurt, doughnuts, and juice
  • Tea, coffee, and soda available all day
  • Freshly baked cookies every afternoon (only at participating locations)

 




Course Outline

Module 1: Introduction to Clustering

  • Lesson 1: Commercial applications of data mining
  • Lesson 2: Introduction to clustering and the k-means algorithm

Module 2: Implementation of Clustering

  • Lesson 1: k-means clustering on multi-dimensional data
  • Lesson 2: Evaluating the quality of clustering
  • Lesson 3: Determining the right number of clusters to use

Module 3: Clustering Multivariate Data

  • Lesson 1: Working with binary data -- cosine distance
  • Lesson 2: Clustering binary data -- spherical k-means
  • Lesson 3: Assessing quality of spherical k-means clustering
  • Lesson 4: Interpreting clusters of binary data and making recommendations
  • Lesson 5: Pitfalls of clustering
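
As a rough illustration of the cosine distance named in Lesson 1 of this module, the sketch below compares binary vectors in plain Python. The function name cosine_distance and the sample vectors are assumptions made for this example, not course material, and the choice to treat an all-zero vector as maximally distant is only one possible convention.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption for this sketch: an all-zero vector has no direction,
    # so it is treated as maximally distant from everything.
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical binary data: 1 means the item was purchased, 0 means it was not.
alice = [1, 1, 0, 0]
bob = [1, 0, 0, 0]
carol = [0, 0, 1, 1]

print(cosine_distance(alice, bob))    # shares one item, so a smaller distance
print(cosine_distance(alice, carol))  # shares nothing, so the distance is 1.0
```

Cosine distance depends only on the direction of the vectors, not their length, which is why it pairs naturally with spherical k-means.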

Module 4: Introduction to Network Analysis

  • Lesson 1: Commercial applications of network analysis
  • Lesson 2: Geographic networks
  • Lesson 3: Interest networks
  • Lesson 4: Human communities
  • Lesson 5: Graph components: nodes and edges

Module 5: Measuring and Visualizing Networks

  • Lesson 1: Centrality measures: degree, betweenness, eigenvector, and PageRank
  • Lesson 2: Geolocation and visualizing geographic networks
  • Lesson 3: Visualizing non-geographic networks
  • Lesson 4: Identifying network weaknesses
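
To give a feel for the simplest centrality measure named in Lesson 1 of this module, here is a minimal sketch of normalized degree centrality for an undirected graph given as an edge list. The function name degree_centrality and the sample edges are assumptions made for this example; the other measures (betweenness, eigenvector, PageRank) need more machinery and are not shown.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Map each node to degree / (n - 1) for an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    # Dividing by n - 1 scales the score to the range [0, 1].
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}

# Hypothetical star-shaped network: "hub" is connected to every other node.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c")]
scores = degree_centrality(edges)
print(scores["hub"])  # the hub touches all other nodes
print(scores["a"])    # a leaf touches only the hub
```

A node whose score is far above the rest, like the hub here, is also a candidate single point of failure when looking for network weaknesses.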

Module 6: Network Propagation and Message Diffusion

  • Lesson 1: The SIRS model: message diffusion, virus infection and cascading failures
  • Lesson 2: Mining Twitter data using the Twitter API
  • Lesson 3: Interactive network diffusion simulations with an animation package

Module 7: Measuring Trust and Community Detection

  • Lesson 1: Political data and the Sunlight Foundation data
  • Lesson 2: Measuring node similarity -- Jaccard distance
  • Lesson 3: Hierarchical clustering for community detection
  • Lesson 4: Modularity-based methods for community detection
  • Lesson 5: Additional tips and resources
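
As a rough illustration of the Jaccard distance named in Lesson 2 of this module, the sketch below compares two nodes by the overlap of their neighbor sets. The function name jaccard_distance and the sample data are assumptions made for this example, and returning 0.0 for two empty sets is one convention among several.

```python
def jaccard_distance(a, b):
    """Return 1 - |A intersect B| / |A union B| for two collections."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption for this sketch: two empty sets are treated as identical.
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Hypothetical neighbor sets: the accounts each person is connected to.
friends_of_x = {"ann", "ben", "cat"}
friends_of_y = {"ben", "cat", "dan"}

print(jaccard_distance(friends_of_x, friends_of_y))  # 2 shared of 4 total -> 0.5
```

Pairwise distances like this one can serve as the input to hierarchical clustering when grouping nodes into communities.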

 

Objectives

At the conclusion of this course, participants will be able to do the following:

  • Use clustering to mine data for patterns and trends
  • Measure trust among people
  • Predict connections
  • Make recommendations
  • Build simulations of how messages and information spread