Welcome to the course Distributed Data Processing using MapReduce! Please find the schedule of lectures and assignments on Blackboard under “Course Information” (scroll down).
This course keeps up with some very exciting developments in cluster computing and data centers, initiated by Google and followed by many others, such as Yahoo, Amazon, AOL, Baidu, Joost, Mylife, and Facebook. The course is not only about processing terabytes of data on large clusters. In fact, few courses in the Computer Science master's program are as “core computer science” as this one: we will discuss new file systems (GFS and Hadoop FS), new programming paradigms (MapReduce), new programming and query languages (Sawzall, Pig), and of course many of the web search and data mining applications that made Google one of today's leading IT companies.
I hope to see you at our lectures on Fridays, 3rd/4th hour.