I wanted a quick way to analyze nginx access logs from the command line, where I only wanted to see the following:

  • Top 10 Request IP's (from the current Access Log)
  • Top Request Methods (From the Current Access Log)
  • Top 10 Request Pages (From the Current Access Log)
  • Top 10 Request Pages (From Current and Gzipped Logs)
  • Top 10 HTTP 404 Page Resopnses (From Current and Gzipped Logs)

The Bash Script:

The script can be modified so it could be a lot more dynamic, but this was working for what I needed to see.

The script can be found on gist.github.com/ruanbekker

#!/bin/bash

# variables
LOGFILE="/var/log/nginx/access.log"
LOGFILE_GZ="/var/log/nginx/access.log.*"
RESPONSE_CODE="200"

# functions
filters(){
grep $RESPONSE_CODE \
| grep -v "\/rss\/" \
| grep -v robots.txt \
| grep -v "\.css" \
| grep -v "\.jss*" \
| grep -v "\.png" \
| grep -v "\.ico"
}

filters_404(){
grep "404"
}

request_ips(){
awk '{print $1}'
}

request_method(){
awk '{print $6}' \
| cut -d'"' -f2
}

request_pages(){
awk '{print $7}'
}

wordcount(){
sort \
| uniq -c
}

sort_desc(){
sort -rn
}

return_kv(){
awk '{print $1, $2}'
}

request_pages(){
awk '{print $7}'
}

return_top_ten(){
head -10
}

## actions
get_request_ips(){
echo ""
echo "Top 10 Request IP's:"
echo "===================="

cat $LOGFILE \
| filters \
| request_ips \
| wordcount \
| sort_desc \
| return_kv \
| return_top_ten
echo ""
}

get_request_methods(){
echo "Top Request Methods:"
echo "===================="
cat $LOGFILE \
| filters \
| request_method \
| wordcount \
| return_kv
echo ""
}

get_request_pages_404(){
echo "Top 10: 404 Page Responses:"
echo "==========================="
zgrep '-' $LOGFILE $LOGFILE_GZ\
| filters_404 \
| request_pages \
| wordcount \
| sort_desc \
| return_kv \
| return_top_ten
echo ""
}


get_request_pages(){
echo "Top 10 Request Pages:"
echo "====================="
cat $LOGFILE \
| filters \
| request_pages \
| wordcount \
| sort_desc \
| return_kv \
| return_top_ten
echo ""
}

get_request_pages_all(){
echo "Top 10 Request Pages from All Logs:"
echo "==================================="
zgrep '-' --no-filename $LOGFILE $LOGFILE_GZ \
| filters \
| request_pages \
| wordcount \
| sort_desc \
| return_kv \
| return_top_ten
echo ""
}

# executing
get_request_ips
get_request_methods
get_request_pages
get_request_pages_all
get_request_pages_404

Running the Script:

When running the script, you will have more or less the format of the following output:

Top 10 Request IP's:
====================
13 1.2.3.4
8 1.2.3.5
8 1.2.3.6
4 1.2.3.7
4 1.2.3.8
2 1.2.3.9
2 1.2.3.10
2 1.2.3.11
2 1.2.3.12
2 1.2.3.13

Top Request Methods:
====================
106 GET
1 HEAD

Top 10: 404 Page Responses:
===========================
146 /content/images/201x/0x/image-1.png
45 /
38 /wp-login.php/
36 /apple-touch-icon.png/
11 http://1.2.3.4:80/sql/websql/
11 http://1.2.3.4:80/sql/webdb/
11 http://1.2.3.4:80/sql/webadmin/
11 http://1.2.3.4:80/sql/sqlweb/
11 http://1.2.3.4:80/sql/sqladmin/
11 http://1.2.3.4:80/sql/sql-admin/

Top 10 Request Pages:
=====================
23 /traefik-a-modern-http-reverse-proxy-and-load-balancer-for-microservices-such-as-docker/
13 /
10 /traefik-a-modern-http-reverse-proxy-and-load-balancer-for-microservices-such-as-docker/amp/
6 /structured-search-queries-with-elasticsearch/
5 /interfacing-amazon-dynamodb-with-python-using-boto3/
4 /aws-java-sdk-detect-if-s3-object-exists-using-doesobjectexist/
2 /spark-pyspark-examples/
2 /how-to-setup-mongodb-server-on-ubuntu-and-enable-authentication/
2 /getting-started-with-aws-elasticsearch-service/
2 /aws-elasticsearch-register-s3-repository-for-snapshots-using-the-cli/

Top 10 Request Pages from All Logs:
===================================
654 /
505 /traefik-a-modern-http-reverse-proxy-and-load-balancer-for-microservices-such-as-docker/
361 /interfacing-amazon-dynamodb-with-python-using-boto3/
274 /running-java-web-applications-on-docker-with-payara-micro/
228 /aws-java-sdk-detect-if-s3-object-exists-using-doesobjectexist/
165 /setup-a-distributed-storage-volume-with-glusterfs/
128 /how-to-ingest-nginx-access-logs-to-elasticsearch-using-filebeat-and-logstash/
121 /aws-access-kibana-5-behind-elb-via-nginx-reverse-proxy-on-custom-dns/
112 /secure-access-to-kibana-on-aws-elasticsearch-service/
89 /docker-swarm-getting-started-with-a-3-node-docker-swarm-cluster-with-a-scalable-app/

Analyzing your Nginx Logs with Elasticsearch and Logstash:

For pushing all your nginx logs from multiple web servers using filebeat into logstash, then to elasticsearch, have a look at: Ingesting Nginx Access Logs into Elasticsearch using Filebeat and Logstash