Introduction to Machine Learning and Deep Learning with Spark and TensorFlow

Explore Spark Essentials, Algorithms, Machine Learning, and Data Mining Concepts, and How TensorFlow Implements Them



5 Days

Course Overview

Apache Spark, a significant component of the Hadoop ecosystem, is a cluster computing engine used for Big Data. Building on Hadoop YARN and HDFS, it offers order-of-magnitude faster processing than MapReduce for many in-memory computing tasks. It can be programmed in Java, Scala, Python, and R, the favorite languages of data scientists, and it also offers SQL-based front ends.

Machine Learning Foundation: Working with Spark and TensorFlow is a comprehensive, hands-on machine learning course intended for data scientists and software engineers with Python experience who are new to these technologies and to machine learning. This course explores popular machine learning algorithms from the ground up. Students will explore Apache Spark essentials, core machine learning concepts, regression, classification, clustering, and more.

The abundance of data and affordable cloud scale have led to an explosion of interest in Deep Learning. Google has released TensorFlow as an open-source library that supports state-of-the-art machine learning at scale, complete with GPU-based acceleration. The second part of the course introduces students to Deep Learning concepts and how TensorFlow implements them. Students will explore these skills in an active, hands-on manner.

Course Objectives

This “skills-centric” course is about 50% hands-on lab and 50% lecture, with extensive practical exercises designed to reinforce the fundamental skills, concepts, and best practices taught throughout the course. Working in a hands-on learning environment guided by our expert instructor, students will:

  • Learn popular machine learning algorithms, their applicability, and limitations
  • Practice the application of these methods in the Spark machine learning environment
  • Learn practical use cases and limitations of algorithms
  • Explore not just the related APIs but also the theory behind them
  • Work with real-world datasets from Uber, Netflix, Walmart, Prosper, etc.

Need different skills or topics? If your team requires different topics or tools, additional skills, or a custom approach, this course may be further adjusted to accommodate. We offer additional big data, analytics, AI, machine learning, programming, Python/R, and other related topics that may be blended with this course for a track that best suits your needs. Our team will collaborate with you to understand your needs and will target the course to focus on your specific learning objectives and goals.

Course Prerequisites

This is an intermediate-level course, geared toward Data Scientists, Data Analysts, and Developers who are new to Machine Learning, Spark, and TensorFlow.

Pre-Requisites: Students should have incoming skills equivalent to those listed below:

  • Solid basic Python skills. Attendees without a Python background may treat the labs as follow-along exercises or team up with others to complete them.
  • A good mathematical foundation in linear algebra and probability
  • Basic Linux skills, including familiarity with command-line commands such as ls, cd, cp, and su

Please see the Related Courses tab for specific Pre-Requisite courses, Related Courses that offer similar skills or topics, and next-step Learning Path recommendations.

Course Agenda

Please note that this list of topics is based on our standard course offering, evolved from typical industry uses and trends. We will work with you to tune this course and level of coverage to target the skills you need most.

Part 1: Introduction to Machine Learning

  1. Machine Learning (ML) Overview
     • Machine Learning landscape
     • Machine Learning applications
     • Understanding ML algorithms & models
  2. ML in Python and Spark
     • Spark ML Overview
     • Introduction to Jupyter notebooks
     • Lab: Working with Jupyter + Python + Spark
     • Lab: Spark ML utilities
  3. Machine Learning Concepts
     • Statistics Primer
     • Covariance, Correlation, Covariance Matrix
     • Errors, Residuals
     • Overfitting / Underfitting
     • Cross-validation, bootstrapping
     • Confusion Matrix
     • ROC curve, Area Under Curve (AUC)
     • Lab: Basic stats
  4. Feature Engineering (FE)
     • Preparing data for ML
     • Extracting features, enhancing data
     • Data cleanup
     • Visualizing Data
     • Lab: data cleanup
     • Lab: visualizing data
  5. Linear regression
     • Simple Linear Regression
     • Multiple Linear Regression
     • Running LR
     • Evaluating LR model performance
     • Lab
     • Use case: House price estimates
  6. Logistic Regression
     • Understanding Logistic Regression
     • Calculating Logistic Regression
     • Evaluating model performance
     • Lab: Use case: credit card application, college admissions
  7. Classification: SVM (Support Vector Machines)
     • SVM concepts and theory
     • SVM with kernel
     • Lab: Use case: Customer churn data
  8. Classification: Decision Trees & Random Forests
     • Theory behind trees
     • Classification and Regression Trees (CART)
     • Random Forest concepts
     • Labs: Use case: predicting loan defaults, estimating election contributions
  9. Classification: Naive Bayes
     • Theory
     • Lab
     • Use case: spam filtering
  10. Clustering (K-Means)
     • Theory behind K-Means
     • Running K-Means algorithm
     • Estimating the performance
     • Lab: Use case: grouping cars data, grouping shopping data
  11. Principal Component Analysis (PCA)
     • Understanding PCA concepts
     • PCA applications
     • Running a PCA algorithm
     • Evaluating results
     • Lab: Use case: analyzing retail shopping data
  12. Recommendations (Collaborative filtering)
     • Recommender systems overview
     • Collaborative Filtering concepts
     • Lab: Use case: movie recommendations, music recommendations
  13. Performance
     • Best practices for scaling and optimizing Apache Spark
     • Memory caching
     • Testing and validation
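
As an informal illustration of two evaluation topics named in the Machine Learning Concepts module above (the confusion matrix and Area Under Curve), the following minimal Python sketch computes both for a toy binary classifier. It is not taken from the course labs: the function names, labels, and scores are hypothetical, and it uses plain Python rather than Spark ML.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]

# Turn scores into hard predictions with a 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("AUC:", round(auc(y_true, scores), 3))
```

The confusion matrix depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.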

Part 2: Introduction to Deep Learning with TensorFlow

  1. Machine Learning Quick Review
     • Understanding Machine Learning
     • Supervised versus Unsupervised Learning
     • Regression
     • Classification
     • Clustering
  2. Introducing TensorFlow
     • TensorFlow intro
     • TensorFlow Features
     • TensorFlow Versions
     • GPU and TPU scalability
     • Lab: Setting up and Running TensorFlow
  3. The Tensor: The Basic Unit of TensorFlow
     • Introducing Tensors
     • TensorFlow Execution Model
     • Lab: Learning about Tensors
  4. Single Layer Linear Perceptron Classifier With TensorFlow
     • Introducing Perceptrons
     • Linear Separability and XOR Problem
     • Activation Functions
     • Softmax output
     • Backpropagation, loss functions, and Gradient Descent
     • Lab: Single-Layer Perceptron in TensorFlow
  5. Hidden Layers: Intro to Deep Learning
     • Hidden Layers as a solution to XOR problem
     • Distributed Training with TensorFlow
     • Vanishing Gradient Problem and ReLU
     • Loss Functions
     • Lab: Feedforward Neural Network Classifier in TensorFlow
  6. High level TensorFlow: tf.learn
     • Using high level TensorFlow
     • Developing a model with tf.learn
     • Lab: Developing a tf.learn model
  7. Convolutional Neural Networks in TensorFlow
     • Introducing CNNs
     • CNNs in TensorFlow
     • Lab: CNN apps
  8. Introducing Keras
     • What is Keras?
     • Using Keras with a TensorFlow Backend
     • Lab: Example with Keras
  9. Recurrent Neural Networks in TensorFlow
     • Introducing RNNs
     • RNNs in TensorFlow
     • Lab: RNN
  10. Long Short-Term Memory (LSTM) in TensorFlow
     • Introducing LSTMs
     • LSTMs in TensorFlow
     • Lab: LSTM
  11. Conclusion
     • Summarize features and advantages of TensorFlow
     • Summarize Deep Learning and how TensorFlow can help
     • Next steps
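
To give a feel for the linear separability and XOR topics listed in the perceptron and hidden-layer modules above, here is a minimal sketch of a single-layer perceptron with a step activation. It is an illustration only, not course lab code: it uses plain Python instead of TensorFlow, and the function names, learning rate, and epoch count are assumptions chosen for the example.

```python
def predict(w, b, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron learning rule on two inputs.
    Stops early once a full pass makes no mistakes."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                # Nudge the decision boundary toward the misclassified point.
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                errors += 1
        if errors == 0:
            break
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = sum(1 for x, target in samples if predict(w, b, x) == target)
    return hits / len(samples)


# Truth tables: AND is linearly separable, XOR is not.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

The perceptron fits AND perfectly, but no single straight line separates the XOR classes, so its accuracy on XOR stays below 100% however long it trains. Adding a hidden layer is the usual way around this limitation.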

Course Materials

Student Materials: Each participant will receive a Student Guide with course notes, code samples, software tutorials, step-by-step written lab instructions, diagrams and related reference materials and resource links. Students will also receive the project files (or code, if applicable) and solutions required for the hands-on work.


Hands-On Setup Made Simple! Our dedicated tech team will work with you to ensure our ‘easy-access’ cloud-based course environment is accessible, fully tested, and verified as ready to go well in advance of the course start date, ensuring a smooth start to class and an effective learning experience for all participants. Please inquire for details and options.


