Introduction to Cassandra

Explore Core Skills Required to Create Good Models with Cassandra (C*), Explore CQL, Querying & More

TTDS6776

Introductory and Beyond

3 Days

Course Overview

The Cassandra (C*) database is a massively scalable NoSQL database that provides high availability and fault tolerance, as well as linear scalability when adding new nodes to a cluster. Powerful capabilities such as tunable and eventual consistency allow it to meet the needs of modern applications, but they also introduce a data modeling paradigm that many organizations lack the expertise to apply well.

Introduction to Cassandra is a hands-on course that teaches attendees the basics of creating good data models with Cassandra. This technical course focuses on the practical aspects of working with C* and introduces the essential concepts needed to understand it, including enough coverage of the internal architecture to make good design decisions. Labs provide experience with core functionality. Students also explore CQL (Cassandra Query Language) and some of the “anti-patterns” that lead to non-optimal C* data models, and leave ready to work on production systems involving Cassandra.

Objectives

This “skills-centric” course is about 50% hands-on lab and 50% lecture, coupling the most current techniques with the soundest industry practices. Throughout the course students will be led through a series of progressively advanced topics, where each topic consists of lecture, group discussion, comprehensive hands-on lab exercises, and lab review.

The goal of this course is to enable technical students new to Cassandra to begin working with Cassandra in an optimal manner.  Throughout the course students will learn to:

  • Understand the Big Data needs that C* addresses
  • Be familiar with the operation and structure of C*
  • Be able to install and set up a C* database
  • Use the C* tools, including cqlsh, nodetool, and ccm (Cassandra Cluster Manager)
  • Be familiar with the C* architecture, and how a C* cluster is structured
  • Understand how data is distributed and replicated in a C* cluster
  • Understand core C* data modeling concepts, and use them to create well-structured data models
  • Be familiar with the C* eventual consistency model and use it intelligently
  • Be familiar with consistency mechanisms such as read repair and hinted handoff
  • Understand and use CQL to create tables and query for data
  • Know and use the CQL data types (numerical, textual, uuid, etc.)
  • Be familiar with the various kinds of primary keys available (simple, compound, and composite primary keys)
  • Be familiar with the C* write and read paths
  • Understand C* deletion and compaction

Need different skills or topics? If your team requires different topics or tools, additional skills, or a custom approach, this course can be adjusted to accommodate them. We offer additional big data / data science, Cassandra, Apache tooling, DevOps, database, programming and other related topics that may be blended with this course for a track that best suits your needs. Our team will collaborate with you to understand your needs and will target the course to focus on your specific learning objectives and goals.

Working in a hands-on learning environment, students will learn to:

  • Understand the needs that C* addresses
  • Be familiar with the operation and structure of C*
  • Be able to install and set up a C* database
  • Use the C* tools, including cqlsh, nodetool, and ccm (Cassandra Cluster Manager)
  • Be familiar with the C* architecture, and how a C* cluster is structured
  • Understand how data is distributed and replicated in a C* cluster
  • Understand core C* data modeling concepts, and use them to create well-structured data models
  • Use data replication and eventual consistency intelligently
  • Understand and use CQL to create tables and query for data
  • Know and use the CQL data types (numerical, textual, uuid, etc.)
  • Understand the various kinds of primary keys available (simple, compound, and composite primary keys)
  • Use more advanced capabilities like collections, counters, secondary indexes, CAS (Compare and Set), static columns, and batches
  • Be familiar with the Java client API
  • Use the Java client API to write client programs that work with C*
  • Build and use dynamic queries with QueryBuilder
  • Understand and use asynchronous queries with the Java API

Course Prerequisites

Attendees should have prior experience with SQL. Some familiarity with distributed systems is also helpful.

Course Agenda

Session 1: Cassandra Overview

  • Why We Need Cassandra - Big Data Challenges vs RDBMS
  • High level Cassandra Overview
  • Cassandra Features
  • Optional: Basic Cassandra Installation and Configuration

Session 2: Cassandra Architecture and CQL Overview

  • Cassandra Architecture Overview
  • Cassandra Clusters and Rings
  • Nodes and Virtual Nodes
  • Data Replication in Cassandra
  • Introduction to CQL
  • Defining Tables with a Single Primary Key
  • Using cqlsh for Interactive Querying
  • Selecting and Inserting/Upserting Data with CQL
  • Data Replication and Distribution
  • Basic Data Types (including uuid, timeuuid)

Session 3: Data Modeling and CQL Core Concepts

  • Defining a Compound Primary Key
  • CQL for Compound Primary Keys
  • Partition Keys and Data Distribution
  • Clustering Columns
  • Overview of Internal Data Organization
  • Overview of Other Querying Capabilities
  • ORDER BY, CLUSTERING ORDER BY, UPDATE, DELETE, ALLOW FILTERING
  • Batch Queries
  • Data Modeling Guidelines
  • Denormalization
  • Data Modeling Workflow
  • Data Modeling Principles
  • Primary Key Considerations
  • Composite Partition Keys
  • Defining with CQL
  • Data Distribution with Composite Partition Key
  • Overview of Internal Data Organization
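
As a rough illustration of the "Partition Keys and Data Distribution" topic in this session, the Python sketch below shows the general idea: a partition key is hashed to a token, and the token decides which nodes on the ring hold the row. It is a toy model, not Cassandra code. SHA-256 stands in for Cassandra's default Murmur3 partitioner, each node owns a single token, replicas are simply the next nodes clockwise, and the names `token_for`, `replica_nodes`, and `RING` are hypothetical.

```python
import bisect
import hashlib

TOKEN_SPACE = 2**32  # toy token space; real Cassandra tokens are 64-bit


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (SHA-256 as a stand-in for Murmur3)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # always < TOKEN_SPACE


def replica_nodes(token: int, ring: list, rf: int) -> list:
    """Return the rf nodes that store a token.

    ring is a list of (node_token, node_name) pairs sorted by token.
    The first node whose token is >= the row's token owns the row,
    wrapping around at the end of the ring; the remaining replicas
    are the following nodes clockwise. Assumes rf <= len(ring).
    """
    tokens = [node_token for node_token, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


# Hypothetical four-node ring with evenly spaced tokens.
RING = [
    (0, "node-a"),
    (2**30, "node-b"),
    (2**31, "node-c"),
    (3 * 2**30, "node-d"),
]

if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3"):
        tok = token_for(key)
        print(key, tok, replica_nodes(tok, RING, rf=3))
```

Because placement depends only on the hash of the partition key, all rows that share a partition key land on the same replicas, which is why the choice of partition key drives both data distribution and query design.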

Session 4: Additional CQL Capabilities

  • Indexing
  • Primary/Partition Keys and Pagination with token()
  • Secondary Indexes and Usage Guidelines
  • Cassandra collections
  • Collection Structure and Uses
  • Defining and Querying Collections (set, list, and map)
  • Materialized Views
  • Overview
  • Usage Guidelines

Session 5: Data Consistency In Cassandra

  • Overview of Consistency in Cassandra
  • CAP Theorem
  • Eventual (Tunable) Consistency in C* - ONE, QUORUM, ALL
  • Choosing CL ONE
  • Choosing CL QUORUM
  • Achieving Immediate Consistency
  • Overview of Other Consistency Levels
  • Supportive Consistency Mechanisms
    • Writing / Hinted Handoff
    • Read Repair
    • Nodetool repair

Session 6: Internal Mechanisms

  • Ring Details
    • Partitioners
    • Gossip Protocol
    • Snitches
  • Write Path
    • Overview / Commit Log
    • Memtables and SSTables
    • Write Failure
      • Unavailable Nodes and Node Failure
      • Requirements for Write Operations
  • Read Path Overview
    • Read Mechanism
    • Replication and Caching
  • Deletion/Compaction Overview
    • Delete Mechanism
    • Tombstones and Compaction

Session 7: Working with IntelliJ

  • Configuring JDBC Data Source for Cassandra
  • Reading Schema Information
  • Querying and Editing Tables

Course Materials

Student Materials: Each participant will receive a digital Student Guide and/or Course Notes, code samples, software tutorials, step-by-step written lab instructions (as applicable), diagrams and related reference materials and resource links. Students will also receive the project files (or code, if applicable) and solutions required for the hands-on work.

Hands-On Setup Made Simple! Our dedicated tech team will work with you to ensure our ‘easy-access’ cloud-based course environment, or local installation, is accessible, fully-tested and verified as ready to go well in advance of the course start date, ensuring a smooth start to class and effective learning experience for all participants. In some cases we can also help you install this course locally if preferred. Please inquire for details and options.

Every-Course Extras = High-Value & Long-Term Learning Support! All Public Schedule courses include our unique EveryCourse Extras package (Course Recordings, Live Instructor Follow-on Support, Free *Live* Course Refresh Re-Takes, early access to Special Offers, Free Courses & more). Please inquire for details.

Attend a Course

Please see the current upcoming available open enrollment course dates posted below. Please feel free to Register Online below, or call 844-475-4559 toll free to connect with our Registrar for assistance. If you need additional date options, please contact us for scheduling.

Course Title | Days | Date | Time | Price
Introduction to Cassandra | 3 Days | Nov 16 to Nov 18 | 10:00 AM to 06:00 PM EST | $1,995.00
