Big Data with Hadoop & Spark

Course topics

Hadoop:

  • Hadoop architecture
  • Data ingestion into Big Data systems and ETL
  • Distributed processing with the MapReduce framework
  • Apache Hive
  • HBase
  • Hadoop Application Testing

Spark:

  • Spark Core: processing with RDDs
  • Spark SQL
  • Spark MLlib: modeling Big Data with Spark
  • Stream Processing Frameworks and Spark Streaming
  • Improving Spark Performance
  • Data Processing with PySpark

Prerequisites

Linux and RDBMS knowledge. Some minimal experience with AWS Cloud is a plus.