COURSE OUTLINE

Session 1

Working with distributed file systems (HDFS)

Session 2-3

Understanding and working with MapReduce

Session 4-5

SQL over Big Data: Hive

Session 6-7

Spark: in-memory computational model

Session 8-9

Spark DataFrame / SQL / GraphFrame

Session 10

Big Data application examples and Spark optimisation

Session 11-12

Spark ML: classification / regression / clustering

Session 13-14

NoSQL (HBase / Cassandra / …)

Session 15

Building a Big Data Service