Interfacing Amazon DynamoDB with Python using Boto3

Introduction:

In this tutorial I will show you how to use boto3, the Python module for interfacing with Amazon Web Services (AWS), to work with DynamoDB.

Other blog posts that I wrote on DynamoDB can be found at blog.ruanbekker.com|dynamodb and sysadmins.co.za|dynamodb

What is Amazon's DynamoDB?

DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

The awesome thing about DynamoDB is that the service takes care of operating and scaling a distributed database, so you don't have to worry about hardware provisioning, setup / configuration, replication, software patching, or cluster scaling.

Let's get Started:

We will go through a couple of practical examples, such as:

  • Creating a Table
  • Writing to a Table
  • Deleting from a Table
  • Query
  • Scan, etc.

Setup / Configuration:

We will need awscli and boto3 to continue.

Both are installed with pip, the package manager for Python. Once you have pip installed, install the required packages:

$ sudo pip install awscli
$ sudo pip install boto3

Next, configure your authentication credentials so that AWS can verify the requests that we will be making to the DynamoDB endpoint. Please see the information on IAM Users if you have not created your user yet.

Configure the credentials by providing your aws_access_key_id, aws_secret_access_key and region details:

$ aws configure
AWS Access Key ID [****************XYZ]: 
AWS Secret Access Key [****************xyz]: 
Default region name [eu-west-1]: 
Default output format [json]: 

boto3 will then pick up these credentials through its default credential provider chain and use them to authenticate.
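
For reference, aws configure stores these values in INI-style files: the keys go into ~/.aws/credentials and the region into ~/.aws/config. The sketch below parses a sample in the credentials file format with Python's configparser and checks that a profile defines both keys. The sample values and the helper name profile_is_complete are made up for illustration and are not part of boto3.

```python
import configparser

# Sample text in the format of ~/.aws/credentials (placeholder values only).
SAMPLE_CREDENTIALS = """
[default]
aws_access_key_id = AKIAEXAMPLEKEYID
aws_secret_access_key = exampleSecretAccessKey
"""

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def profile_is_complete(text, profile="default"):
    """Return True if the profile defines both credential keys."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        return False
    return all(parser.has_option(profile, key) for key in REQUIRED_KEYS)


print(profile_is_complete(SAMPLE_CREDENTIALS))
```

To check your real file, read ~/.aws/credentials into a string and pass it to the helper instead of the sample.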

Creating a Table:

import boto3

dynamodb = boto3.resource('dynamodb')

table = dynamodb.create_table(
    TableName='staff',
    KeySchema=[
        {
            'AttributeName': 'username', 
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'last_name', 
            'KeyType': 'RANGE'
        }
    ], 
    AttributeDefinitions=[
        {
            'AttributeName': 'username', 
            'AttributeType': 'S'
        }, 
        {
            'AttributeName': 'last_name', 
            'AttributeType': 'S'
        }, 
    ], 
    ProvisionedThroughput={
        'ReadCapacityUnits': 1, 
        'WriteCapacityUnits': 1
    }
)

table.meta.client.get_waiter('table_exists').wait(TableName='staff')
print(table.item_count)
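
Note that only the key attributes (username and last_name) are declared in AttributeDefinitions; non-key attributes such as age need no definition up front. As a rough sketch, the arguments above can be generated by a small helper. The name table_definition and its on_demand flag are my own; the flag switches to BillingMode='PAY_PER_REQUEST', which create_table accepts in place of ProvisionedThroughput. The helper assumes string-typed keys.

```python
def table_definition(name, hash_key, range_key=None, on_demand=False):
    """Build keyword arguments for create_table, assuming string-typed keys."""
    keys = [(hash_key, "HASH")]
    if range_key is not None:
        keys.append((range_key, "RANGE"))
    definition = {
        "TableName": name,
        "KeySchema": [{"AttributeName": k, "KeyType": t} for k, t in keys],
        # Only key attributes are declared; 'S' marks them as strings.
        "AttributeDefinitions": [
            {"AttributeName": k, "AttributeType": "S"} for k, _ in keys
        ],
    }
    if on_demand:
        definition["BillingMode"] = "PAY_PER_REQUEST"
    else:
        definition["ProvisionedThroughput"] = {
            "ReadCapacityUnits": 1,
            "WriteCapacityUnits": 1,
        }
    return definition


staff = table_definition("staff", "username", "last_name")
print(staff["KeySchema"])
```

With a live resource this would be used as dynamodb.create_table(**table_definition('staff', 'username', 'last_name')).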

Writing to DynamoDB (PUT Item):

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

table.put_item(
   Item={
        'username': 'ruanb',
        'first_name': 'ruan',
        'last_name': 'bekker',
        'age': 30,
        'account_type': 'administrator',
    }
)

Response:

{
    'ResponseMetadata': {
        'HTTPHeaders': {
            'content-length': '2',
            'content-type': 'application/x-amz-json-1.0',
            'date': 'Sat, 11 Mar 2017 20:50:10 GMT',
            'x-amz-crc32': '2745614147',
            'x-amzn-requestid': '2TJPEXAMPLEREQUESTID'
        },
        'HTTPStatusCode': 200,
        'RequestId': '2TJPEXAMPLEREQUESTID',
        'RetryAttempts': 0
    }
}
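
One thing to watch when writing numbers: the integer 30 above is fine, but the boto3 resource layer does not accept Python float values and expects decimal.Decimal instead. A minimal sketch of a converter you could run over an item before put_item follows. The helper name floats_to_decimal and the rating attribute are hypothetical.

```python
from decimal import Decimal


def floats_to_decimal(value):
    """Recursively replace floats with Decimal; leave other values as-is."""
    if isinstance(value, float):
        # Going through str() avoids binary floating-point noise.
        return Decimal(str(value))
    if isinstance(value, dict):
        return {k: floats_to_decimal(v) for k, v in value.items()}
    if isinstance(value, list):
        return [floats_to_decimal(v) for v in value]
    return value


# 'rating' is a made-up attribute used only to show a float value.
item = floats_to_decimal({"username": "ruanb", "age": 30, "rating": 4.5})
print(item)
```

The converted item can then be passed as table.put_item(Item=item).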

Reading from DynamoDB (GET Item):

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

response = table.get_item(
   Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    }
)

item = response['Item']
name = item['first_name']

print(item)
print("Hello, {}".format(name))

Response:

{
    'username': 'ruanb', 
    'account_type': 'administrator', 
    'last_name': 'bekker', 
    'first_name': 'ruan', 
    'age': Decimal('30')
}

Hello, ruan
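
Notice that age comes back as Decimal('30') rather than a plain integer. That matters as soon as you want to serialize the item, because json.dumps does not know how to handle Decimal. The sketch below is one possible workaround using the default hook of json.dumps; the function name decimal_default is my own.

```python
import json
from decimal import Decimal


def decimal_default(value):
    """json.dumps hook: turn Decimal into int when whole, else float."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


item = {"username": "ruanb", "first_name": "ruan", "age": Decimal("30")}
print(json.dumps(item, default=decimal_default))
```

Converting to float can lose precision for values with many digits, so this is best kept for display or export rather than for values you write back.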

Because we created our table with both a Hash key and a Range key, a getItem call that specifies only the Hash key or only the Range key fails with this exception:

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the GetItem operation: The provided key element does not match the schema
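
One way to catch this before the request is sent is to compare the key you are about to use against the table's key schema. The helper below is a small sketch of that idea; missing_key_attributes is a hypothetical name, and the schema is written out by hand in the same shape as the KeySchema from the create_table call.

```python
def missing_key_attributes(key_schema, key):
    """Return key attribute names from the schema that are absent in key."""
    required = [element["AttributeName"] for element in key_schema]
    return [name for name in required if name not in key]


# Same shape as the KeySchema used when the staff table was created.
STAFF_KEY_SCHEMA = [
    {"AttributeName": "username", "KeyType": "HASH"},
    {"AttributeName": "last_name", "KeyType": "RANGE"},
]

print(missing_key_attributes(STAFF_KEY_SCHEMA, {"username": "ruanb"}))
```

On a live table the schema does not have to be typed out, since the Table resource exposes it as table.key_schema.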

Updating Items:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

table.update_item(
    Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    },
    UpdateExpression='SET age = :val1',
    ExpressionAttributeValues={
        ':val1': 29
    }
)

Response:

{
    'ResponseMetadata': {
        'RequestId': '8BQLEXAMPLEREQUEST',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '8BQLEXAMPLEREQUESTID',
            'x-amz-crc32': '2745614147',
            'content-type': 'application/x-amz-json-1.0',
            'content-length': '2',
            'date': 'Sun, 12 Mar 2017 18:55:38 GMT'
        },
        'RetryAttempts': 0
    }
}
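
When several attributes change at once, writing the UpdateExpression by hand gets tedious. As a rough sketch, the arguments can be built from a plain dict. The helper name build_set_update and the #n/:v placeholder naming are my own choices. It uses ExpressionAttributeNames so that attribute names which clash with DynamoDB reserved words still work, and ReturnValues='UPDATED_NEW' so that the response includes the new values under 'Attributes'.

```python
def build_set_update(changes):
    """Build update_item keyword arguments that SET each attribute in changes."""
    names, values, clauses = {}, {}, []
    for index, (attribute, value) in enumerate(changes.items()):
        name_token, value_token = f"#n{index}", f":v{index}"
        names[name_token] = attribute  # placeholder sidesteps reserved words
        values[value_token] = value
        clauses.append(f"{name_token} = {value_token}")
    return {
        "UpdateExpression": "SET " + ", ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ReturnValues": "UPDATED_NEW",
    }


kwargs = build_set_update({"age": 29, "account_type": "standard_user"})
print(kwargs["UpdateExpression"])
```

The result would be passed along with the key, for example table.update_item(Key={'username': 'ruanb', 'last_name': 'bekker'}, **kwargs).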

Deleting an Item:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

table.delete_item(
    Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    }
)

Response:

{
    'ResponseMetadata': {
        'RequestId': '8FQLEXAMPLEREQUEST',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '8FQLEXAMPLEREQUESTID',
            'x-amz-crc32': '2745614147',
            'content-type': 'application/x-amz-json-1.0',
            'content-length': '2',
            'date': 'Sun, 12 Mar 2017 18:58:38 GMT'
        },
        'RetryAttempts': 0
    }
}

Batch Write Item:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

with table.batch_writer() as batch:
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'stefanb',
            'first_name': 'stefan',
            'last_name': 'bester',
            'age': 30,
            'address': {
                'road': '1 jamesville street',
                'city': 'kroonstad',
                'province': 'free state',
                'country': 'south africa'
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'administrator',
            'username': 'ruanb',
            'first_name': 'ruan',
            'last_name': 'bekker',
            'age': 30,
            'address': {
                'road': '10 peterville street',
                'city': 'cape town',
                'province': 'western cape',
                'country': 'south africa'
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'samanthas',
            'first_name': 'samantha',
            'last_name': 'smith',
            'age': 28,
            'address': {
                'road': '12 newton street',
                'city': 'port elizabeth',
                'province': 'eastern cape',
                'country': 'south africa'
            }
        }
    )

To Create 50 items from a for loop:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

with table.batch_writer() as batch:
    for i in range(50):
        batch.put_item(
            Item={
                'account_type': 'anonymous',
                'username': 'user' + str(i),
                'first_name': 'unknown',
                'last_name': 'unknown'
            }
        )
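
Behind the scenes, batch_writer buffers the items and sends them in batches for you, resending any unprocessed items as needed. The underlying BatchWriteItem call accepts at most 25 put or delete requests, so the 50 items above need at least two requests. The sketch below only illustrates that grouping in plain Python. It is not boto3's internal implementation, and the helper name chunked is my own.

```python
def chunked(items, size=25):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


users = [{"username": "user" + str(i), "last_name": "unknown"} for i in range(50)]
batches = list(chunked(users))
print(len(batches), [len(batch) for batch in batches])
```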

Query:

A Query requires an equality condition on the Hash key and can optionally add a condition on the Range key; it cannot select on non-key attributes.

import boto3
from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

response = table.query(
    KeyConditionExpression=Key('username').eq('samanthas') & Key('last_name').eq('smith')
)

items = response['Items']
print(items)

Output:

[
  {
      'username': 'samanthas', 
      'address': {
          'country': 'south africa', 
          'province': 'eastern cape', 
          'city': 'port elizabeth', 
          'road': '12 newton street'
          }, 
      'account_type': 'standard_user', 
      'last_name': 'smith', 
      'first_name': 'samantha', 
      'age': Decimal('28')
  }
]
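
A single query call returns at most 1 MB of data. When there is more, the response carries a LastEvaluatedKey that you pass back as ExclusiveStartKey to fetch the next page. The helper below is a minimal sketch of that loop. The name query_all is hypothetical, and FakeTable is a stand-in with two canned pages so the example runs without AWS access.

```python
def query_all(table, **query_kwargs):
    """Collect items from every page of a query on the given table."""
    items = []
    while True:
        response = table.query(**query_kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Resume the next request where the previous page stopped.
        query_kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that serves two canned pages."""

    def query(self, **kwargs):
        if "ExclusiveStartKey" not in kwargs:
            return {
                "Items": [{"username": "a"}],
                "LastEvaluatedKey": {"username": "a"},
            }
        return {"Items": [{"username": "b"}]}


print(query_all(FakeTable()))
```

With a real table you would pass the same arguments as in the query above, for example query_all(table, KeyConditionExpression=Key('username').eq('samanthas')).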

Creating a Global Secondary Index:

Let's say we know a person's username, but want to query on age (older or younger than a given value). As age is a non-key attribute, we need to create a Global Secondary Index (GSI) with username as the Hash key and age as the Range key:

import boto3

ddb = boto3.Session(region_name='eu-west-1').client('dynamodb')
ddb.update_table(
    TableName='staff',
    # Attribute definitions must cover the key attributes of the new index.
    AttributeDefinitions=[{'AttributeName': 'username', 'AttributeType': 'S'}, {'AttributeName': 'last_name', 'AttributeType': 'S'}, {'AttributeName': 'age', 'AttributeType': 'N'}],
    GlobalSecondaryIndexUpdates=[{'Create': {
        'IndexName': 'staffindex',
        'KeySchema': [{'AttributeName': 'username', 'KeyType': 'HASH'}, {'AttributeName': 'age', 'KeyType': 'RANGE'}],
        'Projection': {'ProjectionType': 'ALL'},
        'ProvisionedThroughput': {'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1}
    }}]
)
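
Index creation is not instant: update_table returns right away while DynamoDB builds the index in the background, and the index only becomes usable once its IndexStatus is ACTIVE. A rough sketch of polling for that with describe_table follows. The helper name wait_for_index and its delay and max_attempts defaults are my own, and FakeClient is a stand-in so the example runs without AWS access.

```python
import time


def wait_for_index(client, table_name, index_name, delay=5, max_attempts=60):
    """Poll describe_table until the named index reports ACTIVE."""
    for _ in range(max_attempts):
        description = client.describe_table(TableName=table_name)["Table"]
        for index in description.get("GlobalSecondaryIndexes", []):
            if index["IndexName"] == index_name and index["IndexStatus"] == "ACTIVE":
                return True
        time.sleep(delay)
    return False


class FakeClient:
    """Stand-in for a DynamoDB client: reports CREATING first, then ACTIVE."""

    def __init__(self):
        self.statuses = ["CREATING", "ACTIVE"]

    def describe_table(self, TableName):
        status = self.statuses.pop(0)
        return {"Table": {"GlobalSecondaryIndexes": [
            {"IndexName": "staffindex", "IndexStatus": status}]}}


print(wait_for_index(FakeClient(), "staff", "staffindex", delay=0))
```

With the real client from the snippet above this would be wait_for_index(ddb, 'staff', 'staffindex').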

Now that we have our GSI created, we can query from our Index:

table.query(
    IndexName='staffindex',
    KeyConditionExpression=Key('username').eq('petera') & Key('age').gt(10)
)

Output:

{u'Count': 1, u'Items': [{u'username': u'petera', u'lastname': u'adams', u'age': Decimal('32'), u'account_type': u'administrator', u'firstname': u'peter'}], 
...

Using Query with a FilterExpression, which is applied to the items that matched the key condition before they are returned:

table.query(
    IndexName='staffindex',
    KeyConditionExpression=Key('username').eq('petera') & Key('age').gt(10),
    FilterExpression=Attr('account_type').eq('administrator')
)

Output:

{u'Count': 1, u'Items': [{u'username': u'petera', u'lastname': u'adams', u'age': Decimal('32'), u'account_type': u'administrator', u'firstname': u'peter'}]
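
Because the filter runs after the items have been read, it reduces what you get back but not what DynamoDB had to read. The response exposes both numbers: ScannedCount is the number of items evaluated and Count is the number left after filtering. The sketch below compares the two. The helper name filter_summary and the example response are hypothetical.

```python
def filter_summary(response):
    """Summarize how many evaluated items a FilterExpression discarded."""
    scanned = response["ScannedCount"]
    returned = response["Count"]
    return {"evaluated": scanned, "returned": returned, "discarded": scanned - returned}


# Hypothetical response: 3 items matched the key condition, 1 passed the filter.
example = {"Count": 1, "ScannedCount": 3, "Items": [{"username": "petera"}]}
print(filter_summary(example))
```

A large gap between the two numbers is a hint that the condition would be better served by a key or an index than by a filter.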

Scan:

With Scan you can search the table based on any attributes of the items, for example getting users older than 29. In other words, you can filter on attributes that are not part of the primary key.

import boto3
from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

response = table.scan(
    FilterExpression=Attr('age').gt(29)
)

items = response['Items']
print(items)

Output:

>>> items
[
  {
    'username': 'stefanb', 
    'address': {
      'country': 'south africa', 
      'province': 'free state', 
      'city': 'kroonstad', 
      'road': '1 jamesville street'
      }, 
    'account_type': 'standard_user', 
    'last_name': 'bester', 
    'first_name': 'stefan', 
    'age': Decimal('30')
  }, 
  {
    'username': 'ruanb', 
    'address': {
      'country': 'south africa', 
      'province': 'western cape', 
      'city': 'cape town', 
      'road': '10 peterville street'
      }, 
    'account_type': 'administrator', 
    'last_name': 'bekker', 
    'first_name': 'ruan', 
    'age': Decimal('30')
  }
]

From the above output we can see that we have 2 items. Let's print out the 2 usernames that we retrieved:

>>> len(items)
2

>>> for item in items:
...     print(item['username'])
...
stefanb
ruanb

Let's say we would like to get all items whose first name starts with 'r' and whose account type is 'administrator':

import boto3
from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

response = table.scan(
    FilterExpression=Attr('first_name').begins_with('r') & Attr('account_type').eq('administrator')
)

items = response['Items']
print(items)

Output:

[
  {
    'username': 'ruanb', 
    'address': {
      'country': 'south africa', 
      'province': 'western cape', 
      'city': 'cape town', 
      'road': '10 peterville street'
      }, 
    'account_type': 'administrator', 
    'last_name': 'bekker', 
    'first_name': 'ruan', 
    'age': Decimal('30')
  }
]

Deleting the Table:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')

table.delete()
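
Deleting is asynchronous as well: delete() returns while the table is still in the process of being removed. If a script needs to know that the table is really gone, for instance before recreating it, the Table resource offers wait_until_not_exists(). A minimal sketch follows. The helper name delete_and_wait is my own, and FakeTable only records calls so the example runs without AWS access.

```python
def delete_and_wait(table):
    """Delete the table, then block until DynamoDB reports it gone."""
    table.delete()
    table.wait_until_not_exists()


class FakeTable:
    """Stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def delete(self):
        self.calls.append("delete")

    def wait_until_not_exists(self):
        self.calls.append("wait_until_not_exists")


fake = FakeTable()
delete_and_wait(fake)
print(fake.calls)
```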

I hope this post was useful :D