Distributed Databases and Distributed Systems

Course topics

Module 1. Why Distributed DB and Distributed Systems?
Remote Procedure Call
  • Network socket
  • RPC
  • Sync/Async call
  • Messaging
  • gRPC
RDBMS
  • App architecture
  • Business transaction vs System transaction. Distributed transactions
  • ACID – properties of database transactions
  • Transaction isolation levels
  • Pessimistic vs optimistic locking. Lost update problem
Distributed transactions
  • 2PC protocol
  • 3PC protocol
Module 2. NoSQL
  • RDBMS problems. ORM (Object-relational mapping)
  • SQL vs NoSQL
  • NoSQL properties (schemaless, aggregate orientation, transactions, …)
  • Types of NoSQL databases: KV, Document, Column-family, Graph
Distribution Models
  • Consistency problem
  • Sharding
  • Replication
  • Consistency models: eventual consistency, monotonic reads, read-your-writes, strong consistency. Consistency guarantees
  • MongoDB and Cassandra parameters for consistency guarantees
CAP theorem
  • CAP theorem. BASE.
  • CAP theorem with SQL and NoSQL DBs
  • Polyglot Persistence
Module 3. MapReduce
  • Data locality. Phases
  • Standard algorithms: Word count, Inverted index, Top N
  • Requirements for Map/Reduce/Combine functions
  • MapReduce alternatives
  • RDBMS vs NoSQL vs MapReduce
Module 4. Distributed systems
  • Consensus problem. Split-brain problem. Byzantine Generals problem.
  • Distributed systems: Communication, Failure Modes, Leader, Consensus, Quorums, Time, Order
  • Vector and Lamport clocks
Consensus protocols
  • Replicated state machine
  • Raft protocol
  • Paxos protocol

Preliminary practical tasks

  • Map/Reduce implementation*
  • 2PC protocol
  • MongoDB basics
  • Neo4j basics
  • Cassandra data model basics
  • MongoDB replication
  • Cassandra replication
  • MongoDB Map/Reduce
  • Raft protocol*

Prerequisites

None
