Spark: PySpark Examples

Example 1: Top 3 Occurrences: In this tutorial we will generate 400,000 lines of data that consists of Name,Country,JobTitle Then we have a scenario where we would like to find out the Top 3 Occurences from our…

Setup PIG on Hadoop YARN Cluster

This is part 4 of our Big Data Cluster Setup. From our Previous Post I was going through the steps on setting up Spark on your Hadoop Cluster. In this tutorial, we will setup Apache Pig, on top of the…

Setup Spark Cluster on Hadoop YARN

This is part 3 of our Big Data Cluster Setup. From our Previous Post I was going through the steps on getting your Hadoop Cluster up and running. In this tutorial, we will setup Apache Spark, on top of the…