This class will introduce students to using Apache Spark on the GPC. The Python interface, PySpark, will be used to access the Spark infrastructure. Students are encouraged to bring a laptop to the class so that they can follow along with the exercises and quizzes. Students will be introduced to PySpark syntax and commands, techniques for loading and managing data, and various data analysis strategies.
Category: Data Science
Date: Tue, 1 Dec 2015 - 9:00 am