Interfacing Amazon DynamoDB with Python using Boto3
Introduction:
In this tutorial I will show you how to use boto3, the Python module for interfacing with Amazon Web Services (AWS), to work with DynamoDB.
Other blog posts that I wrote on DynamoDB can be found at blog.ruanbekker.com|dynamodb and sysadmins.co.za|dynamodb
What is Amazon's DynamoDB?
DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
The awesome thing about DynamoDB is that the service takes care of operating and scaling a distributed database, so you don't have to worry about hardware provisioning, setup / configuration, replication, software patching, or cluster scaling.
Let's get Started:
We will go through a couple of practical examples, such as:
- Creating a Table
- Writing to a Table
- Deleting from a Table
- Query
- Scan, etc.
Setup / Configuration:
We will need awscli and boto3 to continue.
Both are installed with pip, the package manager for Python. Once you have pip installed, install the required packages:
$ sudo pip install awscli
$ sudo pip install boto3
Next, configure your authentication credentials so that AWS can verify the requests that we will be making to the DynamoDB endpoint. Please see the information on IAM Users if you have not created your user yet.
Configure the credentials by providing your aws_access_key_id, aws_secret_access_key and region details:
$ aws configure
AWS Access Key ID [****************XYZ]:
AWS Secret Access Key [****************xyz]:
Default region name [eu-west-1]:
Default output format [json]:
boto3 will then use your default credential provider to authenticate.
Creating a Table:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.create_table(
    TableName='staff',
    KeySchema=[
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'last_name',
            'KeyType': 'RANGE'
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'last_name',
            'AttributeType': 'S'
        },
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 1,
        'WriteCapacityUnits': 1
    }
)
table.meta.client.get_waiter('table_exists').wait(TableName='staff')
print(table.item_count)
Writing to DynamoDB (PUT Item):
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
table.put_item(
    Item={
        'username': 'ruanb',
        'first_name': 'ruan',
        'last_name': 'bekker',
        'age': 30,
        'account_type': 'administrator',
    }
)
Response:
{
    'ResponseMetadata': {
        'HTTPHeaders': {
            'content-length': '2',
            'content-type': 'application/x-amz-json-1.0',
            'date': 'Sat, 11 Mar 2017 20:50:10 GMT',
            'x-amz-crc32': '2745614147',
            'x-amzn-requestid': '2TJPEXAMPLEREQUESTID'
        },
        'HTTPStatusCode': 200,
        'RequestId': '2TJPEXAMPLEREQUESTID',
        'RetryAttempts': 0
    }
}
Reading from DynamoDB (GET Item):
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
response = table.get_item(
    Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    }
)
item = response['Item']
name = item['first_name']
print(item)
print("Hello, {}".format(name))
Response:
{
    'username': 'ruanb',
    'account_type': 'administrator',
    'last_name': 'bekker',
    'first_name': 'ruan',
    'age': Decimal('30')
}
Hello, ruan
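The Decimal('30') in the response above is how boto3's resource interface hands back DynamoDB numbers, and json.dumps refuses to serialize a Decimal. Here is a minimal sketch of one way to deal with that. The helper name to_plain is my own (it is not part of boto3), and it assumes you are happy to turn whole numbers into int and everything else into float:

```python
import json
from decimal import Decimal


def to_plain(value):
    """Recursively convert Decimal values to int or float (hypothetical helper)."""
    if isinstance(value, Decimal):
        # Whole numbers become int, anything with a fraction becomes float.
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {key: to_plain(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(val) for val in value]
    return value


# The item as it was returned by get_item above.
item = {
    'username': 'ruanb',
    'account_type': 'administrator',
    'last_name': 'bekker',
    'first_name': 'ruan',
    'age': Decimal('30'),
}
print(json.dumps(to_plain(item), sort_keys=True))
```

Converting to float can lose precision, so for values like money you may prefer to keep the Decimal and serialize it as a string instead.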
Because we created our table with a Hash Key and a Range Key, a getItem that specifies only the Hash key or only the Range key will fail with this exception:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the GetItem operation: The provided key element does not match the schema
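One way to avoid that exception is to wrap get_item in a small function that always asks for both key parts. The sketch below also covers a second case: when no item matches, get_item returns a response without an 'Item' key, so response['Item'] would raise a KeyError. The function name get_staff_member is my own, and the table argument is assumed to be a dynamodb.Table('staff') resource like the one used above:

```python
def get_staff_member(table, username, last_name):
    """Fetch one item by its full primary key, or return None if it does not exist."""
    response = table.get_item(
        Key={
            'username': username,    # hash key
            'last_name': last_name,  # range key
        }
    )
    # get_item leaves out the 'Item' key entirely when nothing matches.
    return response.get('Item')


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   table = boto3.resource('dynamodb').Table('staff')
#   print(get_staff_member(table, 'ruanb', 'bekker'))
```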
Updating Items:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
table.update_item(
    Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    },
    UpdateExpression='SET age = :val1',
    ExpressionAttributeValues={
        ':val1': 29
    }
)
Response:
{
    'ResponseMetadata': {
        'RequestId': '8BQLEXAMPLEREQUEST',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '8BQLEXAMPLEREQUESTID',
            'x-amz-crc32': '2745614147',
            'content-type': 'application/x-amz-json-1.0',
            'content-length': '2',
            'date': 'Sun, 12 Mar 2017 18:55:38 GMT'
        },
        'RetryAttempts': 0
    }
}
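When you need to update several attributes at once, or an attribute whose name clashes with one of DynamoDB's reserved words, writing the expression by hand gets tedious. The following is a rough sketch of a builder that produces the update_item arguments from a plain dict. The name build_set_expression and the #n0 / :v0 placeholder scheme are my own choices, not something boto3 provides:

```python
def build_set_expression(changes):
    """Build update_item keyword arguments that SET every attribute in `changes`."""
    if not changes:
        raise ValueError('changes must contain at least one attribute')
    names, values, assignments = {}, {}, []
    # Sort the attributes so the generated expression is predictable.
    for i, (attr, value) in enumerate(sorted(changes.items())):
        name_ph, value_ph = '#n{}'.format(i), ':v{}'.format(i)
        names[name_ph] = attr      # e.g. '#n0' stands for the real attribute name
        values[value_ph] = value   # e.g. ':v0' stands for the new value
        assignments.append('{} = {}'.format(name_ph, value_ph))
    return {
        'UpdateExpression': 'SET ' + ', '.join(assignments),
        'ExpressionAttributeNames': names,
        'ExpressionAttributeValues': values,
    }


kwargs = build_set_expression({'age': 29, 'account_type': 'standard_user'})
print(kwargs['UpdateExpression'])

# Usage, assuming the 'staff' table from this post:
#   table.update_item(
#       Key={'username': 'ruanb', 'last_name': 'bekker'},
#       ReturnValues='UPDATED_NEW',  # return the attributes that changed
#       **kwargs
#   )
```

Passing ReturnValues='UPDATED_NEW' makes the response include an 'Attributes' entry with the new values, which is handy for confirming what was written.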
Deleting an Item:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
table.delete_item(
    Key={
        'username': 'ruanb',
        'last_name': 'bekker'
    }
)
Response:
{
    'ResponseMetadata': {
        'RequestId': '8FQLEXAMPLEREQUEST',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': '8FQLEXAMPLEREQUESTID',
            'x-amz-crc32': '2745614147',
            'content-type': 'application/x-amz-json-1.0',
            'content-length': '2',
            'date': 'Sun, 12 Mar 2017 18:58:38 GMT'
        },
        'RetryAttempts': 0
    }
}
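Notice that the response looks the same whether or not the item existed, because delete_item succeeds either way. If you want to know whether something was actually removed, one option is to ask for the old item back with ReturnValues='ALL_OLD'. A small sketch, where delete_staff_member is a name I made up and table is assumed to be the 'staff' table resource:

```python
def delete_staff_member(table, username, last_name):
    """Delete an item and return its old attributes, or None if it did not exist."""
    response = table.delete_item(
        Key={'username': username, 'last_name': last_name},
        ReturnValues='ALL_OLD',  # ask DynamoDB to return the deleted item
    )
    # 'Attributes' is only present when an item was actually removed.
    return response.get('Attributes')


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   table = boto3.resource('dynamodb').Table('staff')
#   deleted = delete_staff_member(table, 'ruanb', 'bekker')
#   print('deleted' if deleted else 'nothing to delete')
```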
Batch Write Item:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
with table.batch_writer() as batch:
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'stefanb',
            'first_name': 'stefan',
            'last_name': 'bester',
            'age': 30,
            'address': {
                'road': '1 jamesville street',
                'city': 'kroonstad',
                'province': 'free state',
                'country': 'south africa'
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'administrator',
            'username': 'ruanb',
            'first_name': 'ruan',
            'last_name': 'bekker',
            'age': 30,
            'address': {
                'road': '10 peterville street',
                'city': 'cape town',
                'province': 'western cape',
                'country': 'south africa'
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'samanthas',
            'first_name': 'samantha',
            'last_name': 'smith',
            'age': 28,
            'address': {
                'road': '12 newton street',
                'city': 'port elizabeth',
                'province': 'eastern cape',
                'country': 'south africa'
            }
        }
    )
To create 50 items from a for loop:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
with table.batch_writer() as batch:
    for i in range(50):
        batch.put_item(
            Item={
                'account_type': 'anonymous',
                'username': 'user' + str(i),
                'first_name': 'unknown',
                'last_name': 'unknown'
            }
        )
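The batch_writer takes care of buffering the items, sending them in batches and resending anything DynamoDB reports as unprocessed. What it does not like is the same primary key appearing twice in one batch. As a sketch of how the loop above could be made reusable, the version below passes overwrite_by_pkeys so that a later item with the same key replaces an earlier one in the buffer. The function names anonymous_users and write_items are my own:

```python
def anonymous_users(count):
    """Yield `count` placeholder items like the ones in the loop above."""
    for i in range(count):
        yield {
            'account_type': 'anonymous',
            'username': 'user' + str(i),
            'first_name': 'unknown',
            'last_name': 'unknown',
        }


def write_items(table, items):
    """Queue every item on a batch writer and return how many were queued."""
    queued = 0
    # overwrite_by_pkeys de-duplicates buffered items that share the same key.
    with table.batch_writer(overwrite_by_pkeys=['username', 'last_name']) as batch:
        for item in items:
            batch.put_item(Item=item)
            queued += 1
    return queued


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   table = boto3.resource('dynamodb').Table('staff')
#   write_items(table, anonymous_users(50))
```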
Query:
With Query you can query on the Hash and Range keys, but not on non-key attributes. The Hash key must be matched exactly, while the condition on the Range key is optional.
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
response = table.query(
    KeyConditionExpression=Key('username').eq('samanthas') & Key('last_name').eq('smith')
)
items = response['Items']
print(items)
Output:
[
    {
        'username': 'samanthas',
        'address': {
            'country': 'south africa',
            'province': 'eastern cape',
            'city': 'port elizabeth',
            'road': '12 newton street'
        },
        'account_type': 'standard_user',
        'last_name': 'smith',
        'first_name': 'samantha',
        'age': Decimal('28')
    }
]
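Unlike getItem, a Query does not need the Range key, so you can fetch every item that shares a username. The sketch below shows that with the string form of KeyConditionExpression, which should behave the same as Key('username').eq(username) but keeps the example free of extra imports. The function name query_by_username is my own, and it only returns the first page of results:

```python
def query_by_username(table, username):
    """Return the first page of items whose hash key equals `username`."""
    response = table.query(
        # Only the hash key is given; the range key condition is optional.
        KeyConditionExpression='username = :u',
        ExpressionAttributeValues={':u': username},
    )
    return response['Items']


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   table = boto3.resource('dynamodb').Table('staff')
#   for item in query_by_username(table, 'samanthas'):
#       print(item['last_name'])
```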
Creating a Global Secondary Index:
Let's say we want to look up people by a username that we already know, but also want to query on age (older or younger than some value). As age is a non-key attribute, we need to create a GSI with username as the Hash key and age as the Range key:
import boto3
ddb = boto3.Session(region_name='eu-west-1').client('dynamodb')
ddb.update_table(
    TableName='staff',
    # Describe the table keys plus the new index key (age).
    AttributeDefinitions=[
        {'AttributeName': 'username', 'AttributeType': 'S'},
        {'AttributeName': 'last_name', 'AttributeType': 'S'},
        {'AttributeName': 'age', 'AttributeType': 'N'}
    ],
    GlobalSecondaryIndexUpdates=[
        {
            'Create': {
                'IndexName': 'staffindex',
                'KeySchema': [
                    {'AttributeName': 'username', 'KeyType': 'HASH'},
                    {'AttributeName': 'age', 'KeyType': 'RANGE'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                'ProvisionedThroughput': {'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1}
            }
        }
    ]
)
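Creating the index is not instant: update_table returns straight away while DynamoDB builds the index in the background, and its status moves from CREATING to ACTIVE. A sketch of a simple polling loop that waits for that is shown below. The function name wait_for_index and the delay / max_attempts defaults are my own assumptions, and client is expected to be a boto3 DynamoDB client like ddb above:

```python
import time


def wait_for_index(client, table_name, index_name, delay=5, max_attempts=60):
    """Poll describe_table until the index is ACTIVE. Returns True on success."""
    for _ in range(max_attempts):
        description = client.describe_table(TableName=table_name)['Table']
        indexes = description.get('GlobalSecondaryIndexes', [])
        # Pick out the status of the index we care about, if it is listed at all.
        status = next(
            (ix['IndexStatus'] for ix in indexes if ix['IndexName'] == index_name),
            None,
        )
        if status == 'ACTIVE':
            return True
        time.sleep(delay)
    return False


# Usage, assuming the client and index from this post:
#   if wait_for_index(ddb, 'staff', 'staffindex'):
#       print('index is ready to be queried')
```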
Now that we have our GSI created, we can query from our Index:
table.query(
    IndexName='staffindex',
    KeyConditionExpression=Key('username').eq('petera') & Key('age').gt(10)
)
Output:
{u'Count': 1, u'Items': [{u'username': u'petera', u'lastname': u'adams', u'age': Decimal('32'), u'account_type': u'administrator', u'firstname': u'peter'}],
...
A Query can also take a FilterExpression, which acts like a scan over the data that the query has already matched, before the results are returned:
table.query(
    IndexName='staffindex',
    KeyConditionExpression=Key('username').eq('petera') & Key('age').gt(10),
    FilterExpression=Attr('account_type').eq('administrator')
)
Output:
{u'Count': 1, u'Items': [{u'username': u'petera', u'lastname': u'adams', u'age': Decimal('32'), u'account_type': u'administrator', u'firstname': u'peter'}]
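Because the filter runs after the items have been read, it is worth comparing the 'Count' value with 'ScannedCount', which is also part of a Query or Scan response: ScannedCount is the number of items evaluated before the filter, Count the number left after it. A tiny sketch follows. The helper name filter_summary is my own, and the numbers in sample_response are made up for illustration, not taken from the table in this post:

```python
def filter_summary(response):
    """Summarize how many items a FilterExpression let through or discarded."""
    returned = response['Count']         # items left after the filter
    examined = response['ScannedCount']  # items read before the filter
    return {
        'returned': returned,
        'examined': examined,
        'discarded': examined - returned,
    }


# Hypothetical response: 3 items matched the key condition, 1 passed the filter.
sample_response = {'Count': 1, 'ScannedCount': 3, 'Items': [{'username': 'petera'}]}
print(filter_summary(sample_response))
```

A large 'discarded' number is a hint that the filter attribute might deserve an index of its own.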
Scan:
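With Scan you can search the table based on any attributes of the items, for example getting users older than 29. In other words, you can filter on attributes which are not part of our Primary Key. Keep in mind that a Scan reads every item in the table and only then applies the filter.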
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
response = table.scan(
    FilterExpression=Attr('age').gt(29)
)
items = response['Items']
print(items)
Output:
>>> items
[
    {
        'username': 'stefanb',
        'address': {
            'country': 'south africa',
            'province': 'free state',
            'city': 'kroonstad',
            'road': '1 jamesville street'
        },
        'account_type': 'standard_user',
        'last_name': 'bester',
        'first_name': 'stefan',
        'age': Decimal('30')
    },
    {
        'username': 'ruanb',
        'address': {
            'country': 'south africa',
            'province': 'western cape',
            'city': 'cape town',
            'road': '10 peterville street'
        },
        'account_type': 'administrator',
        'last_name': 'bekker',
        'first_name': 'ruan',
        'age': Decimal('30')
    }
]
From the above output we can see that we have 2 items. Let's print out the 2 usernames that we retrieved:
>>> len(items)
2
>>> for item in items:
...     item['username']
...
'stefanb'
'ruanb'
Let's say we would like to get all items whose first name starts with 'r' and whose account type is 'administrator':
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
response = table.scan(
    FilterExpression=Attr('first_name').begins_with('r') & Attr('account_type').eq('administrator')
)
items = response['Items']
print(items)
Output:
[
    {
        'username': 'ruanb',
        'address': {
            'country': 'south africa',
            'province': 'western cape',
            'city': 'cape town',
            'road': '10 peterville street'
        },
        'account_type': 'administrator',
        'last_name': 'bekker',
        'first_name': 'ruan',
        'age': Decimal('30')
    }
]
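The scans above only look at response['Items'] from a single call, which is fine for our small table. A single Scan call returns at most 1 MB of data though, and when there is more it includes a 'LastEvaluatedKey' that you pass back as ExclusiveStartKey to get the next page. A minimal sketch of that loop is below, where scan_all is a name I picked and table is assumed to be the 'staff' table resource:

```python
def scan_all(table, **scan_kwargs):
    """Run a scan and keep following LastEvaluatedKey until every page is read."""
    items = []
    while True:
        response = table.scan(**scan_kwargs)
        items.extend(response['Items'])
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            # No more pages left.
            return items
        # Continue the scan where the previous page stopped.
        scan_kwargs['ExclusiveStartKey'] = last_key


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   from boto3.dynamodb.conditions import Attr
#   table = boto3.resource('dynamodb').Table('staff')
#   items = scan_all(table, FilterExpression=Attr('age').gt(29))
```

The same pattern applies to table.query, which pages its results in the same way.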
Deleting the Table:
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('staff')
table.delete()
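Just like creating a table, deleting one takes a moment to complete. If the next step in your script depends on the table being gone (for example re-creating it with the same name), you can block until the deletion has finished. A short sketch using the Table resource's wait_until_not_exists waiter, wrapped in a function name of my own choosing:

```python
def delete_table_and_wait(table):
    """Delete the table and block until DynamoDB reports that it no longer exists."""
    table.delete()
    # Polls the table status until the table is gone.
    table.wait_until_not_exists()


# Usage, assuming the 'staff' table from this post:
#   import boto3
#   table = boto3.resource('dynamodb').Table('staff')
#   delete_table_and_wait(table)
```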
I hope this post was useful :D