infyni

1 Spark Core

Understanding Disk Computing vs In Memory Computing
How Spark works
Understanding the Spark architecture and why it is better than MapReduce
RDD: Unit of Data in Spark
Architecture of Spark
Understanding Spark Context, Worker Nodes, Executors and Tasks
How Spark supports multiple languages
Benefits of the Unified Spark API
Transformations and Actions on RDDs
How DAGs are formed
How Spark Lazy Evaluation works
Broadcast Variables and Accumulators in Spark Core
Persisting and Caching Data in Spark
DAG scheduler
Physical plan, logical plan
Common problems
Shuffles, Spills, small files
Optimizations
Spark Tuning
Memory Tuning
Garbage Collection Tuning
Data Structure Tuning

2 Spark SQL

Why Spark SQL?
Understanding how Spark SQL works under the hood
How Spark RDD still forms the base of Spark SQL
DataFrame: Basic Unit of Data in Spark SQL
Dataset: Type-Safe Unit of Data in Spark SQL for Object-Oriented Languages like Java and Scala
Loading Data from the following formats into Datasets or DataFrames: a. JSON b. CSV c. Parquet d. ORC
Using Joins in Spark
Connecting Spark with Hive
Connecting Hive with HBase
Exporting Data from Spark

3 Introduction To Cassandra

Understanding What Cassandra Is
Learning What Cassandra Is Being Used For

4 Getting Started with The Architecture

Understanding That Cassandra Is a Distributed Database
Learning What Snitch Is For
Learning What Gossip Is For
Learning How Data Gets Distributed
Learning About Replication
Learning About Virtual Nodes

5 Installing Cassandra

Downloading Cassandra
Ensuring Oracle Java Is Installed
Installing Cassandra
Viewing The Main Configuration File
Providing Cassandra with Permission to Directories
Starting Cassandra
Checking Status
Accessing The Cassandra system.log File

6 Communicating With Cassandra

Understanding Ways To Communicate With Cassandra
Using Cqlsh

7 Creating A Database

Understanding A Cassandra Database
Defining A Keyspace
Deleting A Keyspace
Creating A Table
Defining Columns and Data Types
Defining A Primary Key
Recognizing A Partition Key
Specifying A Descending Clustering Order

8 Inserting Data

Understanding Ways To Write Data
Using The INSERT INTO Command
Using The COPY Command
How Data Is Stored In Cassandra
How Data Is Stored On Disk

9 Modeling Data

Understanding Data Modeling In Cassandra
Using A WHERE Clause
Understanding Secondary Indexes
Creating A Secondary Index
Defining A Composite Partition Key

10 Updating And Deleting Data

Updating Data
Understanding How Updating Works
Deleting Data
Understanding Tombstones
Using TTLs
Updating A TTL

11 Creating An Application [Optional]

Understanding Cassandra Drivers
Exploring The DataStax Java Driver
Setting Up A Development Environment
Creating An Application Page
Acquiring The DataStax Java Driver Files
Getting The DataStax Java Driver Files Through Maven
Providing The DataStax Java Driver Files Manually
Connecting To A Cassandra Cluster
Executing A Query
Displaying Query Results

12 Kafka Fundamentals

Brokers and Topics
Topic Replication
Producers and Message Keys
Consumers & Consumer Groups
Consumer Offsets & Delivery Semantics

13 Kafka Broker Discovery

Zookeeper
Kafka Guarantees

14 Starting Kafka

Windows - Download Kafka and PATH Setup
Windows - Start Zookeeper & Kafka
Windows - Summary

15 CLI Introduction

Kafka Topics CLI
Kafka Console Producer CLI
Kafka Console Consumer CLI
Kafka Consumers in Group
Kafka Consumer Groups CLI
Resetting Offsets
CLI Options that are good to know
What about UIs? Conduktor
Conduktor - Demo
KafkaCat as a replacement for Kafka CLI

16 Introduction to Kafka Programming

Installing Java & IntelliJ Community Edition
Creating Kafka Project
Java Producer
Java Producer Callbacks
Java Producer with Keys
Java Consumer
Java Consumer inside Consumer Group
Java Consumer Seek and Assign
Client Bi-Directional Compatibility
Configuring Producers and Consumers
Real World Project Overview

17 Producer and Advanced Configurations Overview

Twitter Setup
Producer Part - Writing Twitter Client
Producer Part - Writing the Kafka Producer
Producer Configurations Introduction
acks & min.insync.replicas
retries, delivery.timeout.ms & max.in.flight.requests.per.connection
Idempotent Producer
Producer Part - Safe Producer
Producer Compression
Producer Batching
Producer Part - High Throughput Producer
Producer Default Partitions and Key Hashing
[Advanced] max.block.ms and buffer.memory
Refactoring the Project

18 Consumer and Advanced Configuration Overview

Setting up Elasticsearch in the Cloud
Elasticsearch
Consumer Part - Setup Project
Consumer Part - Write the Consumer & Send to Elasticsearch
Delivery Semantics for Consumers
Consumer Part - Idempotence
Consumer Poll Behavior
Consumer Offset Commit Strategies
Consumer Part - Manual Commit of Offsets
Consumer Part - Performance Improvement using Batching
Consumer Offsets Reset Behavior
Consumer Part - Replaying Data
Consumer Internal Threads

19 Kafka in the Real World

Kafka Connect Introduction
Kafka Connect Twitter Hands-On
Kafka Streams Introduction
Kafka Streams Hands-On
Kafka Schema Registry Introduction

Zartab Nakhwa

About Instructor

I have 9 years of experience in project consulting and corporate training. I specialize in full-stack development with React and Angular, as well as in the big data domain.

5.0 Ratings
10 Reviews
60 Students
1 Course

(10)

"It was a decent course. The explanations were satisfactory, but I struggled a bit due to my beginner level. Nonetheless, it was a good learning experience."

Muralidhar Kommuri

Dec. 19, 2023, 3:43 p.m.

"I learned a lot from this course. The pace was good for beginners. Would be nice if there were more real-life examples."

Prashanth Guddeti

Dec. 19, 2023, 3:42 p.m.

"This course broadened my understanding of managing data. The explanations were clear, making it easier for beginners like me."

Narasimharao Kota

Dec. 19, 2023, 3:42 p.m.

"Good course! I had little knowledge about these tools, but after completing this, I feel more confident. Would appreciate more exercises though."

Ashok Segu

Dec. 19, 2023, 3:41 p.m.

"I liked the way they taught about Kafka and Cassandra. It was a bit challenging at times, but the instructors were supportive."

Pallavi Chinnala

Dec. 19, 2023, 3:41 p.m.

"The content was comprehensive and informative. It helped me get a good grasp on data handling with these tools. Worth the time."

Dhana Muvvala

Dec. 19, 2023, 3:40 p.m.

"Great course! Despite being new to this field, I could follow along easily. The examples and exercises were really beneficial."

Pratap Kumar Challa

Dec. 19, 2023, 3:39 p.m.

"I found this course incredibly helpful. The way they explained using Spark and Cassandra was superb. Highly recommended."

Samaj Seemakurthy