Class Details

Price: $2,295

3-Day Course Includes:

  • Class exercises in addition to training instruction
  • Courseware books, notepads, pens, highlighters and other materials
  • Free subscription to Cloudera's practice exam questions
  • Full breakfast with a variety of bagels, fruits, yogurt, doughnuts and juice
  • Tea, coffee, and soda available all day
  • Freshly baked cookies every afternoon (only at participating locations)

For group training options, please call us at (240) 667-7757 or email promo@phoenixts.com.

Course Outline

Data Science

  • Intro to Data Science
  • The Growing Need for Data Science
  • Data Scientist's Role in Business

Evaluating Use Cases

  • Finance, Retail, and Advertising
  • Telecommunications and Utilities
  • Healthcare and Pharmaceuticals
  • Defense and Intelligence

Understanding the Project Lifecycle

  • Project Lifecycle Steps
  • Lab Scenarios

Data Acquisition

  • Sourcing Data
  • Acquisition Methods

Reviewing Input Data

  • Data Quality and Quantity
  • Data Formats

Data Formation

  • File Format Conversion
  • Anonymization
  • Datasets

Analysis and Statistical Techniques

  • Statistics and Probability
  • Descriptive and Inferential Statistics

Machine Learning Fundamentals

  • Three C's of Machine Learning
  • Naive Bayes Classifiers
  • Algorithms and Data

Recommenders

  • Recommender Systems
  • Collaborative Filtering
  • Recommender System Limitations
  • Important Core Concepts

Apache Spark and MLlib

  • Apache Spark
  • MapReduce Comparison
  • Apache Spark Fundamentals
  • The Spark MLlib Package

MLlib for Recommender Implementation

  • Latent Factor Recommenders and the ALS Method
  • ALS Recommenders and Hyperparameters
  • Developing Recommenders in MLlib
  • Adjusting and Tuning Hyperparameters
  • Weighting

Conducting and Evaluating Experiments

  • How to Measure Recommender Success
  • Design for Successful and Effective Experiments
  • UIs for Recommenders

Production Deployment

  • Production Deployment Overview
  • Developing Conclusions and Creating Visual Results
  • Performance Optimization Considerations

Objectives

Attendees should develop skills in, and an understanding of, the Hadoop ecosystem. Skills addressed in this training include:

  • Ability to identify business use cases where data science provides insightful and important results
  • Ability to gather, organize, and combine disparate data sources for developing pictures for analysis
  • Ability to employ statistical methods for data exploration
  • An understanding of when to utilize Apache Spark and Hadoop streaming for data science pipelines
  • An understanding of when to employ machine learning techniques for varying data science projects
  • Ability to implement and manage recommenders with Spark's MLlib for data experiments
  • An understanding of how to analyze and identify pitfalls when deploying analytics projects at scale

Class Exam

Cloudera Certified Professional: Data Scientist (CCP:DS) Certification Exam

To earn this accreditation, individuals must pass the Data Science Essentials (DS-200) exam and complete the Data Science Challenge.

Details:

  • Exam Code - DS-200
  • Number of Questions - 60, plus 6-10 extra beta questions
  • Types of Questions - Multiple choice, reading passages and matching
  • Time Limit - 90 minutes
  • Passing Score - 500 on a scale of 0-700
  • Language - English

Objectives:

  • Data Acquisition
  • Data Evaluation
  • Data Transformation
  • Machine Learning Basics
  • Clustering
  • Classification
  • Collaborative Filtering
  • Model/Feature Selection
  • Probability
  • Visualization
  • Optimization

Phoenix TS is an authorized testing center for Pearson VUE and Prometric exams. To register for exams, contact us or visit the Pearson VUE or Prometric websites.