Inserting query parameters into DynamoDB using Boto3 - amazon-web-services

I am trying out serverless functions and want to get one working.
I am implementing an API PUT method that is integrated with a Lambda function through proxy integration.
My Lambda function is below:
import boto3
def lambda_handler(event, context):
    param = event['queryStringParameters']
    dynamodb = boto3.resource('dynamodb', region_name="us-east-1")
    table = dynamodb.Table('*****')
    response = table.put_item(
        Item = {
        }
    )
I want to insert the param value, which I get from the query string parameters, into the DynamoDB table.
I can do that with:
response = table.put_item(
    Item = param
)
The problem is that if an item with the same partition key already exists, put_item silently overwrites it instead of raising an error.
I know the PUT method is idempotent.
Is there another way to achieve this?

Per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.put_item, you can:
"perform a conditional put operation (add a new item if one with the
specified primary key doesn't exist)"
Note
To prevent a new item from replacing an existing item, use a
conditional expression that contains the attribute_not_exists function
with the name of the attribute being used as the partition key for the
table. Since every record must contain that attribute, the
attribute_not_exists function will only succeed if no matching item
exists.
Also see DynamoDB: updateItem only if it already exists
If you really need to know whether the item exists so you can trigger your exception logic, you can read the item first and skip put_item when it is already there. Note that this check-then-write sequence is not atomic, so another writer can insert the item between the two calls; the conditional put avoids that race. You can also explore whether a combination of ConditionExpression and one of the ReturnValues options (for put_item or update_item) returns enough data for you to know whether an item existed.
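A minimal sketch of the conditional put described above, assuming `table` is a boto3 DynamoDB Table resource. The function name `put_if_absent` and the default partition key name "id" are placeholders for illustration, not part of the original question.

```python
def put_if_absent(table, item, partition_key="id"):
    """Insert `item` only if no item with the same primary key exists.

    Assumptions: `table` is a boto3 DynamoDB Table resource and
    `partition_key` (placeholder default "id") is the name of the
    table's partition key attribute.
    Returns True if the item was written, False if it already existed.
    """
    try:
        table.put_item(
            Item=item,
            # "#pk" is a name placeholder, so reserved words are safe as key names.
            ConditionExpression="attribute_not_exists(#pk)",
            ExpressionAttributeNames={"#pk": partition_key},
        )
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        # The condition failed: an item with this key is already present.
        return False
    return True
```

In the Lambda handler from the question this might be called as `put_if_absent(table, param)`, returning an error response to the caller when the result is False.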

Related

How to set Time to live in dynamodb item

I am trying to add items to DynamoDB in batches. My table has a composite primary key, i.e. a combination of a partition key and a sort key. I have enabled time to live on my table, but the metrics for deletedItemsCount show no change.
My code is below:
import json
import time
import boto3
from botocore.exceptions import ClientError
def generate_item(data):
    item = {
        "pk": data['pk'],
        "ttl": str(int(time.time())),  # current time set for testing
        "data": json.dumps({"data": data}),
        "sk": data['sk']
    }
    return item
def put_input_data(input_data, table_name):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(table_name)
    data_list = input_data["data"]
    try:
        with table.batch_writer() as writer:
            for index, data in enumerate(data_list):
                writer.put_item(Item=generate_item(data))
    except ClientError as exception_message:
        raise
When I query the table I can see the items being added, but the graph for deletedItemsCount shows no change.
Can someone point out where I am going wrong? I would appreciate any hint.
Thanks
It looks like your ttl attribute is a String, but:
The TTL attribute’s value must be a Number data type. For example, if you specify for a table to use the attribute name expdate as the TTL attribute, but the attribute on an item is a String data type, the TTL processes ignore the item.
Source: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/time-to-live-ttl-before-you-start.html#time-to-live-ttl-before-you-start-formatting
Hope that resolves your issue.
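As a sketch of that fix, the item builder from the question could store the TTL as an integer, which boto3 writes as the DynamoDB Number type. The `ttl_seconds` parameter and its default are assumptions added for illustration.

```python
import json
import time


def generate_item(data, ttl_seconds=3600):
    """Build an item whose "ttl" attribute is a Number (epoch seconds).

    `ttl_seconds` is a hypothetical parameter: how long the item should live.
    """
    return {
        "pk": data["pk"],
        "sk": data["sk"],
        # An int is stored as a DynamoDB Number; str(...) would store a String,
        # which the TTL process ignores.
        "ttl": int(time.time()) + ttl_seconds,
        "data": json.dumps({"data": data}),
    }
```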
The implementation of time-to-live (TTL) differs between databases, and you shouldn't assume a specific implementation in DynamoDB.
In some databases an item stops being visible to reads and writes as soon as its TTL passes, even if it has not yet been evicted from the table. DynamoDB does not work that way: an item that has expired but has not yet been deleted still appears in reads, queries, and scans, and can still be updated.
UPDATE: As @Nadav Har'El pointed out in a comment below, it is your responsibility to check the validity of the items using the TTL value (documentation here).
The actual deletion or eviction is done by a sweeper that goes over the table periodically, so an item can remain in the table for some time after it expires. Please also note that a TTL deletion is a system delete, as opposed to a standard delete issued by a client. If you are processing the DynamoDB stream you should be aware of that difference. You can read more about TTL and DynamoDB streams here.
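Since expired items can still be returned until the sweeper removes them, the client has to drop them itself. Below is a small sketch of such a check in plain Python; the function name `drop_expired` and the default attribute name "ttl" are assumptions. The same comparison could probably also be pushed to the server as a filter expression on the query or scan.

```python
import time


def drop_expired(items, ttl_attribute="ttl", now=None):
    """Return only the items whose TTL has not passed yet.

    `items` is a list of item dicts as returned by query/scan;
    `now` (epoch seconds) defaults to the current time.
    """
    if now is None:
        now = int(time.time())
    live = []
    for item in items:
        expires_at = item.get(ttl_attribute)
        # Items without a TTL attribute never expire.
        if expires_at is None or int(expires_at) > now:
            live.append(item)
    return live
```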

How to retrieve all the items from DynamoDB using boto3?

I want to retrieve all the items from my table without specifying any particular key. I can fetch a single item by its key, but I want to get all items. How can I do that?
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Email')
response = table.get_item(
    Key={
        "id": "2"
    }
)
item = response['Item']
print(item)
This works for a single item, but how do I retrieve all items? Is there a method for that?
If you want to retrieve all items, you will need to use the Scan operation.
You can do this by running
response = table.scan()
Be aware that a scan reads the whole table, so it can consume a large number of read capacity units (RCUs). One RCU covers two eventually consistent reads of up to 4 KB each, or one strongly consistent read of up to 4 KB.
The AWS documentation has a page on the considerations for scans vs. queries.
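One detail worth adding: a single scan call returns at most 1 MB of data, so for larger tables the call has to be repeated with the LastEvaluatedKey from the previous response. A minimal sketch, assuming `table` is a boto3 Table resource (the helper name `scan_all` is made up for this example):

```python
def scan_all(table):
    """Return every item in `table`, following pagination until done."""
    items = []
    kwargs = {}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No more pages left.
            return items
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With the table from the question this would be used as `items = scan_all(dynamodb.Table('Email'))`.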

How to change the data type of a column in DynamoDB?

Initially I inserted integer values, so the column was created with type Number; later, string values were also inserted into the same column. Now I am facing issues while fetching values. I need to change the column type from Number to String.
Well, there are no columns in DynamoDB, and even if you think of attributes as columns (which they are not), they don't enforce a specific type, except for primary key attributes. Therefore there is no column type to change.
If you are asking how to change the type of a specific attribute for all items in a table, then you need to run an update on each of those items. DynamoDB unfortunately doesn't support a batch update operation, so you need to fetch the keys of all the items that need updating, loop through that list, and update each item separately.
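The loop described above could look roughly like the sketch below, which converts one attribute to a string with update_item. It assumes `table` is a boto3 Table resource; the function name, `key_names`, and `attribute` are placeholders you would adapt to your own table.

```python
def convert_attribute_to_string(table, key_names, attribute):
    """Rewrite `attribute` as a string on every item where it is not one already.

    Assumptions: `table` is a boto3 Table resource and `key_names` lists the
    names of its primary key attributes (for example ["PK", "SK"]).
    Returns the number of items updated.
    """
    updated = 0
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        for item in page.get("Items", []):
            value = item.get(attribute)
            if value is None or isinstance(value, str):
                continue  # attribute missing or already a string
            table.update_item(
                Key={name: item[name] for name in key_names},
                # "#a" avoids clashes with DynamoDB reserved words.
                UpdateExpression="SET #a = :v",
                ExpressionAttributeNames={"#a": attribute},
                ExpressionAttributeValues={":v": str(value)},
            )
            updated += 1
        if "LastEvaluatedKey" not in page:
            return updated
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Using update_item rather than rewriting the whole item touches only the one attribute, which may matter if other writers modify the same items while the migration runs.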
I recently had to do this. Here is the script I used.
Assume that 'timestamp' is the name of the attribute you need to change from string to number:
import boto3
TABLE_NAME = "your-table-name"  # placeholder: set this to your table's name
db_client = boto3.resource('dynamodb', region_name="eu-west-3")
table_res = db_client.Table(TABLE_NAME)
not_finished = True
ret = table_res.scan()
while not_finished:
    for item in ret['Items']:
        if 'timestamp' in item and isinstance(item['timestamp'], str):
            # Keep the old value for logging, because new_item aliases item.
            old_value = item['timestamp']
            new_item = item
            new_item['timestamp'] = int(float(old_value))
            print("fixing {}, {} -> {}".format(item['SK'], old_value, new_item['timestamp']))
            table_res.put_item(Item = new_item)
    # Continue with the next page of the scan, if there is one.
    if "LastEvaluatedKey" in ret:
        last_key = ret['LastEvaluatedKey']
        ret = table_res.scan(ExclusiveStartKey = last_key)
    else:
        not_finished = False
I do understand you probably don't need this anymore, but I still hope this will help somebody.

How to get all the Sort keys for a given Partition key (HASH) efficiently?

I am new to DynamoDB and I am coming from an RDBMS background. Is there any way to get all the sort keys (RANGE) for a given partition key (HASH)? I am not interested in the data, just the sort keys. What is an efficient way to achieve this?
I don't know if it's possible to do exactly what you asked, but you could add the sort key value as a separate column in the table.
Perhaps it would be simpler to have two separate columns in the table, one for your partition key and one for your range/sort key. Create a secondary index on the partition key to query, and then return values from the new column representing your sort key.
I'm assuming that the hash key and range key were specified when the DynamoDB table was created. You can use DynamoDB's Query API and specify the range key's attribute name in the AttributesToGet field of the request (a legacy parameter; ProjectionExpression is its current equivalent). Please use the pagination support provided by the Query API, or your system will suffer when a large number of values is returned.
You can improve @Chris McLaughlin's solution by adding a ProjectionExpression to the query. ProjectionExpression must be a string that identifies one ("attribute_name") or more ("attribute_name1,attribute_name2") attributes to retrieve from the table.
response = table_object.query(
    KeyConditionExpression = Key(partition_key_name).eq(partition_key_value),
    ProjectionExpression = sort_key_name
)
This will give you all the sort keys for that partition key. It is not necessary to create an additional column to do this, since the sort key is already an attribute of every item in the table.
You can use KeyConditionExpression as part of the DynamoDB Query API.
Here is roughly how you could do it in Python:
import boto3
from boto3.dynamodb.conditions import Key
from botocore.exceptions import ClientError
# Wrapped in a function (the name is illustrative) so the return statements are valid.
def get_sort_keys(table_name, partition_key_name, partition_key_value, sort_key_name):
    session = boto3.session.Session(region_name = 'us-east-1')
    dynamodb = session.resource('dynamodb')
    table_object = dynamodb.Table(table_name)
    return_list = []
    try:
        response = table_object.query(
            KeyConditionExpression = Key(partition_key_name).eq(partition_key_value),
            ProjectionExpression = sort_key_name
        )
    except ClientError:
        return False
    if 'Items' in response:
        for response_result in response['Items']:
            if sort_key_name in response_result:
                return_list.append(response_result.get(sort_key_name))
        return return_list
    else:
        return False
Updated thanks to @Hernan for suggesting ProjectionExpression.
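One of the answers above recommends using the Query API's pagination. A sketch of a paginated variant is shown here, assuming `table` is a boto3 Table resource; the function name `list_sort_keys` is made up, and the expression is written as a string with name placeholders so that key names that happen to be reserved words still work.

```python
def list_sort_keys(table, pk_name, pk_value, sk_name):
    """Return every sort key value stored under one partition key."""
    kwargs = {
        "KeyConditionExpression": "#pk = :pk",
        # Only the sort key attribute is returned for each item.
        "ProjectionExpression": "#sk",
        "ExpressionAttributeNames": {"#pk": pk_name, "#sk": sk_name},
        "ExpressionAttributeValues": {":pk": pk_value},
    }
    sort_keys = []
    while True:
        response = table.query(**kwargs)
        sort_keys.extend(item[sk_name] for item in response.get("Items", []))
        if "LastEvaluatedKey" not in response:
            return sort_keys
        # More results remain: continue after the last key of this page.
        kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]
```

Note that the projection reduces the size of the response, but read capacity is still consumed according to the size of the items that the query reads.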

DynamoDB Primary Key strategy

I'm dabbling with DynamoDB (using boto3) for the first time, and I'm not sure how to define my Partition Key. I'm used to SQL, where you can use AUTO_INCREMENT to ensure that the Key will always increase.
I haven't seen such an option in DynamoDB - instead, when using put_item, the "primary key attributes are required" - I take this to mean that I have to define the value explicitly (and, indeed, if I leave it off, I get botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the PutItem operation: One or more parameter values were invalid: Missing the key id in the item)
If I already have rows with id 1, 2, 3, ...N, I naturally want the next row that I insert to have Primary Key N+1. But I don't know how to generate that - the solutions given here are all imperfect.
Should I be generating the Primary Key values independently, perhaps by hashing the other values of the item? If I do so, isn't there a (small) chance of hash-collision? Then again, since DynamoDB seems to determine partition based on a hash of the Partition Key, is there any reason for me not to simply use a random sufficiently-long string?
DynamoDB does not support generated keys; you have to specify one yourself. You can't reliably generate sequential IDs.
One common approach is to use UUIDs instead.
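For illustration, generating such a key with Python's standard uuid module might look like this. The helper name `with_generated_id` and the key name "id" are assumptions chosen to match the error message in the question.

```python
import uuid


def with_generated_id(attributes, key_name="id"):
    """Return a copy of `attributes` with a random UUID string as its key."""
    item = dict(attributes)
    # uuid4 is random, so no coordination between writers is needed.
    item[key_name] = str(uuid.uuid4())
    return item
```

The result can be passed straight to put_item, for example `table.put_item(Item=with_generated_id({"name": "Ada"}))`.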
I had the same problem while working through the Build a basic Web Application tutorial.
In module 4 of the tutorial, after modifying the lambda function to write to the DynamoDB table, I had to change ID to Id in the line marked THIS LINE (see below) after which the test worked.
def lambda_handler(event, context):
    # extract values from the event object we got from the Lambda service and store in a variable
    name = event['firstName'] +' '+ event['lastName']
    # write name and time to the DynamoDB table using the object we instantiated and save response in a variable
    # (table, now and json come from code that is not shown in this excerpt)
    response = table.put_item(
        Item={
            'ID': name,  # <- THIS LINE
            'LatestGreetingTime':now
        })
    # return a properly formatted JSON object
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda, ' + name)
    }
I also had to edit my test input to include a random uuid as shown:
{
    "Id": "560e2227-c738-41d9-ad5a-bcad6a3bc273",
    "firstName": "Ada",
    "lastName": "Lovelace"
}