E. Lozes

Distributed Big Data For AI

This course introduces core concepts of distributed data management, including replicated data consistency, distributed file systems, and MapReduce. It also provides hands-on experience through project-oriented labs.

S2 3 ECTS 24h OPT EN E. Lozes

Tentative syllabus

  • Topic 1: Introduction to parallelism
  • Topic 2: Storing Big Data
  • Topic 3: Processing Big Data

  • Lab 1: Introduction to Apache Spark
  • Lab 2: Machine Learning with Spark
  • Lab 3: Kafka and Spark Streaming