Distributed Programming: Apache Spark
- Parallel processing of large-scale data, in batch or streaming mode
- In-memory computation that avoids repeated disk reads, so many queries on large datasets can return in seconds
- Machine learning with MLlib
- Graph data analysis with GraphX
- Near-real-time stream processing
- APIs for Python, Scala, Java, and R
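
The idea behind parallel processing of large-scale data can be sketched in a few lines: split the data into partitions, process each partition independently, then merge the partial results. The example below is only an illustration of that pattern using the Python standard library. It does not use Spark's API, and the function names (`split_into_partitions`, `count_words`, `parallel_word_count`) and the sample lines are hypothetical. In Spark itself, the partitions would be spread across the machines of a cluster rather than across threads in one process.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Map step: count the words within a single partition."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_partitions=4):
    """Count words per partition concurrently, then merge the partial counts."""
    partitions = split_into_partitions(lines, num_partitions)
    # Threads stand in for cluster workers here (an assumption made for brevity).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Reduce step: adding two Counters sums the counts of matching keys.
    return reduce(lambda a, b: a + b, partial_counts, Counter())


# Hypothetical sample input.
lines = [
    "spark processes data in parallel",
    "spark keeps data in memory",
    "data can be batch or streaming",
]
word_counts = parallel_word_count(lines, num_partitions=2)
print(word_counts["data"], word_counts["spark"])
```

Because each partition is counted without reference to the others, the map step can run on as many workers as there are partitions. Only the final merge needs to see all the partial results.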