In this tutorial we will set up a basic Kibana dashboard for a web server that is running a blog on Nginx.
BigData
A collection of 11 posts
Amazon EMR Performance Comparison dealing with Hadoops SmallFiles Problem
Today I would like to dive into job performance with Hadoop, running on the managed Hadoop framework of Amazon Web Services, Elastic MapReduce (EMR). Hadoop does not deal well with lots of small files, and I
Removing the Hive Metastore Password from hive-site.xml on EMR
Hive's Metastore config has an entry that holds the password used to authenticate against your metastore database. This password is saved in clear text, which looks like this: <property> <name>javax.jdo.option.ConnectionPassword</name>
AWS: Create EMR Cluster with Java SDK Examples
Today I am providing some basic examples of creating an EMR cluster and adding steps to the cluster with the AWS Java SDK. This tutorial will show how to create an EMR cluster in eu-west-1 with 1x m3.xlarge Master Node and
Generating Sensible Transaction Data with Python
The other day, I was facing a scenario where I had to set up bucketing with Hive and needed some sample data. At the same time, I thought it would have been nice to have some random data, but that
Spark: PySpark Examples
Example 1: Top 3 Occurrences: In this tutorial we will generate 400,000 lines of data that consist of Name,Country,JobTitle. Then we have a scenario where we would like to find the Top 3 Occurrences from our
Setup PIG on Hadoop YARN Cluster
This is part 4 of our Big Data Cluster Setup. In our previous post, I went through the steps of setting up Spark on your Hadoop cluster. In this tutorial, we will set up Apache Pig on top of the
Setup Spark Cluster on Hadoop YARN
This is part 3 of our Big Data Cluster Setup. In our previous post, I went through the steps of getting your Hadoop cluster up and running. In this tutorial, we will set up Apache Spark on top of the