Spark: PySpark Examples

Example 1: Top 3 Occurrences: In this tutorial we will generate 400,000 lines of data consisting of Name,Country,JobTitle. We then have a scenario where we would like to find the top 3 occurrences from our
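
To give a rough idea of what the scenario involves, here is a minimal plain-Python sketch of the logic. It is a stand-in and not the PySpark job from the tutorial. The sample values, the function names `generate_rows` and `top_occurrences`, and the choice to count by the Country column are all assumptions made for illustration, since the excerpt does not say which field is counted.

```python
import random
from collections import Counter

# Hypothetical sample values; the tutorial's actual data is not shown in this excerpt.
NAMES = ["James", "Sarah", "Peter", "Thandi"]
COUNTRIES = ["South Africa", "Sweden", "Brazil", "Japan", "Canada"]
JOB_TITLES = ["Engineer", "Doctor", "Teacher", "Developer"]


def generate_rows(n, seed=0):
    """Yield n CSV-style lines of the form Name,Country,JobTitle."""
    rng = random.Random(seed)
    for _ in range(n):
        yield ",".join(
            (rng.choice(NAMES), rng.choice(COUNTRIES), rng.choice(JOB_TITLES))
        )


def top_occurrences(lines, column=1, k=3):
    """Return the k most common values in one column as (value, count) pairs."""
    counts = Counter(line.split(",")[column] for line in lines)
    return counts.most_common(k)


# Assumption: count by Country (column index 1) and keep the top 3.
top3 = top_occurrences(generate_rows(400_000))
for value, count in top3:
    print(value, count)
```

In PySpark the same idea would typically be expressed on a DataFrame by grouping on the column, counting, ordering by the count in descending order, and limiting the result to three rows.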

Setup Spark Cluster on Hadoop YARN

This is part 3 of our Big Data Cluster Setup. In our previous post, I went through the steps for getting your Hadoop cluster up and running. In this tutorial, we will set up Apache Spark on top of the

