In Apache Cassandra Lunch #104, Anant CEO Rahul Singh discusses methods and strategies for managing big data in Apache Cassandra after it has already been stored. We’ll cover how to delete data or apply TTLs after the fact, how to operationalize these processes with Apache Airflow and Apache Spark, and how to treat data hygiene as a strategy so that you’re not stuck with bad data later.
Accompanying Blog: Coming Soon!
Accompanying SlideShare: Coming Soon!
Sign Up For Our Newsletter: [ Link ]
Join Cassandra Lunch Weekly at 12 PM EST Every Wednesday: [ Link ]
Cassandra.Link:
[ Link ]
Follow Us and Reach Us At:
Anant:
[ Link ]
Awesome Cassandra:
[ Link ]
Cassandra.Lunch:
[ Link ]
Email:
solutions@anant.us
LinkedIn:
[ Link ]
Twitter:
[ Link ]
Eventbrite:
[ Link ]
Facebook:
[ Link ]
Join The Anant Team:
[ Link ]
TIMESTAMPS
00:00 Apache Cassandra Lunch Introduction
08:32 Cleaning Big Data in Apache Cassandra Introduction
12:16 Big Data Options
14:19 Data Cleanup
17:39 Cleaning Data in SQL
18:20 Cleaning Data in Spark SQL
24:50 Scheduling the Work
25:22 Scylla Spark Migrator
27:04 Airflow DAG to Migrate Cassandra Data
28:35 Airflow DAG to Clean Cassandra Data
31:17 Considerations for Spark/Airflow Solution
36:10 Key Takeaways
38:03 Questions and Discussion
#dataops #cassandra #data