Course topics
Part 1. Intro to Big Data & Parallel Processing
- Historical retrospective on Big Data development
- Current Big-Data ecosystem overview (Landscape).
- Review of the main technologies, such as NiFi, Flink, and Spark
- Parallel programming. Paradigm, concept shift
- Streaming programming. Purpose, challenges, architecture. Intro to Kafka
- Scala programming language
- Programming paradigms
- Scala syntax. Data Structures. OOP in Scala
- Functional Programming in Scala
- Function as a first-class citizen. Higher-order functions
- Referential transparency, pure functions, side-effects.
- Functional Programming patterns. Monads
- Asynchronous and parallel programming in Scala
- Parallel & Concurrent programming on JVM
- Asynchronous programming with Futures. Parallel collections
- Introduction to Akka Actors
- Actor model. Akka. Handling shared state.
- Implementing event-driven systems. Reactive programming
- Introduction to Akka Streams
- Handling streams of data. Source, Sink, Flow, runnable graphs
- Handling backpressure
- Cloud computing. Core cloud services.
- IaaS, PaaS, SaaS. Horizontal and vertical scalability
- Containerization & virtualization. Virtual compute instances
- Core services
- AWS & GCP resources.
- Core AWS and GCP services
- Utilizing serverless architecture
- Hands-on with cloud services
- Intro to Kafka Streams
- Basics. KStreams, KTables: concept, usage, limitations
- Hands-on with KStreams
- KStreams – Advanced practices
- Deploying a simple KStreams app using Confluent Platform
Prerequisites
Strong knowledge and practical experience with Java or Python.
Would be a plus:
- Basic knowledge of JVM
- Basic understanding of data structures
- Experience working with multi-threading
- Functional Programming basics
- Experience developing enterprise-level software
- Git
- JDK 1.8
- Scala 2.12, sbt 1.2.3
- IntelliJ IDEA – Community Edition or higher
- Scala plugin for IntelliJ IDEA
- Latest version of Docker