Course topics
Hadoop:
- Hadoop architecture
- Data ingestion into Big Data systems and ETL
- Distributed processing with the MapReduce framework
- Apache Hive
- HBase
- Hadoop Application Testing
Spark:
- Spark Core: processing with RDDs
- Spark SQL
- Spark MLlib: modeling Big Data with Spark
- Stream Processing Frameworks and Spark Streaming
- Improving Spark Performance
- Data Processing with PySpark
Prerequisites
Knowledge of Linux and RDBMS. Some minimal experience with AWS Cloud is a plus.