Apache Spark Primer | Quick Start to Spark Basics, Components, RDDs & More

Get Started with Spark Basics, Leverage Features, Explore & Use RDDs, Spark SQL & Dataframes and More

TTSK7502

Introductory

2 Days

Course Overview

Apache Spark Essentials is a two-day course that gives students a quick introduction to the Spark environment and its benefits, features, common uses, and tools. Working in a hands-on learning environment, students will learn where Spark fits into the Big Data ecosystem and how to use core Spark features for critical data analysis. The course also explores, at a high level, key Spark technologies such as the Spark shell for interactive data analysis, Spark internals, RDDs, Dataframes and Spark SQL.

Course Objectives

This “skills-centric” course is about 50% hands-on lab and 50% lecture, designed to train attendees in core big data / Spark development and usage skills, coupling current, effective techniques with sound industry practices. Throughout the course, students will be led through a series of progressively advanced topics, where each topic consists of lecture, group discussion, comprehensive hands-on lab exercises, and lab review.

This course provides a practical introduction to the umbrella of technologies on the leading edge of data science development, focused on Spark and related tools. Working in a hands-on learning environment, students will explore:

  • Spark Basics
  • Getting Started with Spark
  • RDDs In Depth
  • Spark SQL & Dataframes

Need different skills or topics? If your team requires different topics or tools, additional skills, or a custom approach, this course can be adjusted to accommodate. We offer additional Spark, big data, data science, Hadoop, AI / machine learning / deep learning, programming, analytics, and other related topics that may be blended with this course for a track that best suits your needs. Our team will collaborate with you to understand your needs and will target the course to focus on your specific learning objectives and goals.

Course Prerequisites

This introductory-level course is geared for experienced developers and architects seeking to become proficient in Spark tools and technologies. Attendees should be comfortable with basic Python programming. Students should also be able to navigate the Linux command line and have basic knowledge of Linux editors (such as vi or nano) for editing code.

Take Before: Students should have attended the course(s) below, or should have basic skills in these areas:

  • TTPS4802     Python Basic Primer / Quick Start to Python

Course Agenda

Please note that this list of topics is based on our standard course offering, evolved from typical industry uses and trends. We’ll work with you to tune this course and level of coverage to target the skills you need most.  

Spark Basics

  • Background and history
  • Spark and Hadoop
  • Spark concepts and architecture
  • Spark ecosystem (Core, Spark SQL, MLlib, Streaming)

Getting Started with Spark

  • Spark in local mode
  • Spark web UI
  • Spark shell
  • Analyzing dataset - part 1
  • Inspecting RDDs

RDDs In Depth

  • Partitions
  • RDD Operations / transformations
  • RDD types
  • MapReduce on RDD
  • Caching and persistence
  • Sharing cached RDDs

Spark SQL & Dataframes

  • Dataframes
  • Dataframes DDL
  • Spark SQL
  • Defining tables and importing datasets
  • Queries

Course Materials

Student Materials: Each participant will receive a Student Guide with course notes, code samples, software tutorials, step-by-step written lab instructions, diagrams and related reference materials and resource links. Students will also receive the project files (or code, if applicable) and solutions required for the hands-on work.

Hands-On Setup Made Simple! Our dedicated tech team will work with you to ensure our ‘easy-access’ cloud-based course environment is accessible, fully-tested and verified as ready to go well in advance of the course start date, ensuring a smooth start to class and effective learning experience for all participants. Please inquire for details and options.
