Hi Stack Overflow, I'm trying to conditionally put an item into a DynamoDB table. The table has the following attributes.
ticker - Partition Key
price_date - Sort Key
price - Attribute
Every minute I'm calling an API which gives me a minute-by-minute list of dictionaries for all stock prices within the day so far. However, the data I receive from the API can sometimes be behind by a minute or two. I don't particularly want to overwrite all the records within the DynamoDB table every time I get new data. To achieve this I've tried to create a condition expression so that put_item only succeeds when there is a match on ticker but the price_date is new.
I've created a simplification of my code below to better illustrate my problem.
import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('stock-intraday')

data = [
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:30:00.000Z', 'price': 100},
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:31:00.000Z', 'price': 101}
]

for item in data:
    dynamodb_response = table.put_item(
        Item=item,
        ConditionExpression=Attr("ticker").exists() & Attr("price_date").not_exists())
However, when I run this code I get this error...
What is wrong with my conditional expression?
I found an answer to my own problem. DynamoDB was throwing an error because my code WAS working: the ConditionalCheckFailedException is how DynamoDB reports that a put was blocked by the condition, which is what I wanted. Only some minor changes were needed.
There needed to be a try/except block to catch that exception, and since the partition key is already evaluated, only price_date needed to be included within the condition expression.
import boto3
from boto3.dynamodb.conditions import Attr
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('stock-intraday')

data = [
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:30:00.000Z', 'price': 100},
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:31:00.000Z', 'price': 101}]

for item in data:
    try:
        dynamodb_response = table.put_item(
            Item=item,
            ConditionExpression=Attr("price_date").not_exists())
    except ClientError as e:
        # The item for this ticker/price_date already exists, so skip it
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            pass
        else:
            raise
I already have a table and I want to add a new attribute to that table.
I am trying to do that with the update_item functionality of DynamoDB.
Use case: a Bid table holds details on the bid for a product. The user accepts the bid; once the bid is accepted, I have to add a few attributes to that record, like the user information. I'm not sure if this is the right way, or whether I should have a new table for this.
Partition key: Pickup
Sort key: DropOff
Below is a demo example of what I am currently trying; altering the same table gives me the error shown after the code.
import json
import boto3

def lambda_handler(event, context):
    dynamo_client = boto3.resource('dynamodb')
    users = dynamo_client.Table('LoadsandBids')
    item = event['body']
    print("First")

    users.update_item(
        Key={
            'Pickup': event['body']['Pickup'],
            'DropOff': event['body']['DropOff']
        },
        UpdateExpression='SET #attr1 = :val1',
        ExpressionAttributeNames={'#attr1': 'new_field'},
        ExpressionAttributeValues={':val1': event['body']['new']},
        ReturnValues='UPDATED_NEW'
    )

    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
I'm getting this error:
"An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema",
Could anyone help me out with this and also suggest whether my approach is good or not?
My Requirement
I want to create a CloudWatch-Metric from Athena query results.
Example
I want to create a metric like user_count for each day.
In Athena, I would write an SQL query like this:
select date,count(distinct user) as count from users_table group by 1
In the Athena editor I can see the result, but I want to see these results as a metric in CloudWatch.
CloudWatch-Metric-Name ==> user_count
Dimensions ==> Date,count
If I have this CloudWatch metric and these dimensions, I can easily create a monitoring dashboard and send alerts.
Can anyone suggest a way to do this?
You can use CloudWatch custom widgets, see "Run Amazon Athena queries" in Samples.
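If you go that route, the custom widget is backed by a Lambda function whose returned HTML the dashboard renders. Below is a minimal sketch of that shape, illustrative only: the real "Run Amazon Athena queries" sample handles running the query and parsing its results, which is left as a placeholder here.

# Minimal sketch of a CloudWatch custom-widget Lambda (illustrative only)
def lambda_handler(event, context):
    # The dashboard invokes this function and renders the HTML string it returns;
    # 'event' includes a 'widgetContext' dict describing the calling widget.
    user_count = 42  # placeholder: run the Athena query here and parse the result
    return "<h1>user_count</h1><p>{}</p>".format(user_count)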
It's somewhat involved, but you can use a Lambda for this. In a nutshell:
Set up your query in Athena and make sure it works using the Athena console.
Create a Lambda that:
Runs your Athena query
Pulls the query results from S3
Parses the query results
Sends the query results to CloudWatch as a metric
Use EventBridge to run your Lambda on a recurring basis (a scheduling sketch follows the example Lambda below)
Here's an example Lambda function in Python that does step #2. Note that the Lambda function will need IAM permissions to run queries in Athena, read the results from S3, and then put a metric into CloudWatch.
import time
import boto3

query = 'select count(*) from mytable'
DATABASE = 'default'
bucket = 'BUCKET_NAME'
path = 'yourpath'

def lambda_handler(event, context):
    # Run the query in Athena
    client = boto3.client('athena')
    output = "s3://{}/{}".format(bucket, path)

    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )

    # The S3 file name uses the QueryExecutionId,
    # so grab it here so we can pull the S3 file.
    qeid = response["QueryExecutionId"]

    # Occasionally Athena hasn't written the file before the Lambda
    # tries to pull it out of S3, so pause a few seconds.
    # Note: you are charged for the time the Lambda is running.
    # A more elegant but more complicated solution would try to get the
    # file first and then sleep.
    time.sleep(3)

    # Get the query result from S3.
    s3 = boto3.client('s3')
    objectkey = path + "/" + qeid + ".csv"

    # Load the object as a file
    file_content = s3.get_object(
        Bucket=bucket,
        Key=objectkey)["Body"].read()

    # Split the file on line breaks
    lines = file_content.decode().splitlines()

    # Get the second line in the file (the first line is the CSV header)
    count = lines[1]

    # Remove double quotes
    count = count.replace("\"", "")

    # Convert the string to an int since CloudWatch wants a numeric value
    count = int(count)

    # Post the query result as a CloudWatch metric
    cloudwatch = boto3.client('cloudwatch')
    response = cloudwatch.put_metric_data(
        MetricData=[
            {
                'MetricName': 'MyMetric',
                'Dimensions': [
                    {
                        'Name': 'DIM1',
                        'Value': 'dim1'
                    },
                ],
                'Unit': 'None',
                'Value': count
            },
        ],
        Namespace='MyMetricNS'
    )
    return response
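For the last step, scheduling with EventBridge, one option is a scheduled rule that targets the Lambda. Below is a minimal sketch; the rule name, schedule, and function ARN are placeholders, and the function also needs a resource-based permission allowing events.amazonaws.com to invoke it.

import boto3

events = boto3.client('events')

# Run the metric Lambda once an hour (placeholder schedule)
events.put_rule(
    Name='athena-metric-schedule',
    ScheduleExpression='rate(1 hour)',
    State='ENABLED'
)

# Point the rule at the Lambda function shown above
events.put_targets(
    Rule='athena-metric-schedule',
    Targets=[{
        'Id': 'athena-metric-lambda',
        'Arn': 'arn:aws:lambda:REGION:ACCOUNT_ID:function:YOUR_FUNCTION_NAME'
    }]
)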
I would like to get the usage cost report of each instance in my AWS account for a period of time.
I'm able to get linked_account_id and service in the output, but I need instance_id as well. Please help.
import argparse
import boto3
import datetime

cd = boto3.client('ce', 'ap-south-1')

results = []
token = None
while True:
    if token:
        kwargs = {'NextPageToken': token}
    else:
        kwargs = {}

    data = cd.get_cost_and_usage(
        TimePeriod={'Start': '2019-01-01', 'End': '2019-06-30'},
        Granularity='MONTHLY',
        Metrics=['BlendedCost', 'UnblendedCost'],
        GroupBy=[
            {'Type': 'DIMENSION', 'Key': 'LINKED_ACCOUNT'},
            {'Type': 'DIMENSION', 'Key': 'SERVICE'}
        ], **kwargs)

    results += data['ResultsByTime']
    token = data.get('NextPageToken')
    if not token:
        break

print('\t'.join(['Start_date', 'End_date', 'LinkedAccount', 'Service', 'blended_cost', 'unblended_cost', 'Unit', 'Estimated']))
for result_by_time in results:
    for group in result_by_time['Groups']:
        blended_cost = group['Metrics']['BlendedCost']['Amount']
        unblended_cost = group['Metrics']['UnblendedCost']['Amount']
        unit = group['Metrics']['UnblendedCost']['Unit']
        print(result_by_time['TimePeriod']['Start'], '\t',
              result_by_time['TimePeriod']['End'], '\t',
              '\t'.join(group['Keys']), '\t',
              blended_cost, '\t',
              unblended_cost, '\t',
              unit, '\t',
              result_by_time['Estimated'])
As far as I know, Cost Explorer can't report usage per instance. There is a feature, Cost and Usage Reports, which gives a detailed billing report as dump files. In those files you can see the instance id.
It can also be connected to AWS Athena. Once you have done this, you can query the files directly from Athena.
Here is my Presto example.
select
lineitem_resourceid,
sum(lineitem_unblendedcost) as unblended_cost,
sum(lineitem_blendedcost) as blended_cost
from
<table>
where
lineitem_productcode = 'AmazonEC2' and
product_operation like 'RunInstances%'
group by
lineitem_resourceid
The result is
lineitem_resourceid unblended_cost blended_cost
i-***************** 279.424 279.424
i-***************** 139.948 139.948
i-******** 68.198 68.198
i-***************** 3.848 3.848
i-***************** 0.013 0.013
where the resourceid contains the instance id. The cost amounts are summed over all usage in the month. For other types of product_operation, it will contain different resource ids.
You can add an individual tag to all instances (e.g. Id) and then group by that tag:
GroupBy=[
    {
        'Type': 'TAG',
        'Key': 'Id'
    },
],
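Plugged into the get_cost_and_usage call from the question, it might look like the sketch below. This assumes the Id tag has been activated as a cost allocation tag; the group keys then come back in the form Id$<value>.

data = cd.get_cost_and_usage(
    TimePeriod={'Start': '2019-01-01', 'End': '2019-06-30'},
    Granularity='MONTHLY',
    Metrics=['BlendedCost', 'UnblendedCost'],
    GroupBy=[
        {'Type': 'DIMENSION', 'Key': 'SERVICE'},
        {'Type': 'TAG', 'Key': 'Id'}  # Cost Explorer allows at most two GroupBy entries
    ],
    **kwargs)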
I am attempting to load a simple transactions.txt table into an S3 bucket, where a Lambda function reads the file and populates DynamoDB tables for Customers and Transactions. This all works fine. However, I also have a Lambda function that is supposed to read the Transactions table as it is populated, sum up the transaction totals by customer, and insert them into another DynamoDB table, TransactionTotal.
My TotalNotifier Lambda function throws a "KeyError" regarding "NewImage". I believe the code is fine, and I have tried changing the type of Streams from 'New and Old' to just 'New' for the Transactions table, and I still encounter the same error.
from __future__ import print_function
import json, boto3

# Connect to SNS
sns = boto3.client('sns')
alertTopic = 'HighBalanceAlert'
snsTopicArn = [t['TopicArn'] for t in sns.list_topics()['Topics']
               if t['TopicArn'].endswith(':' + alertTopic)][0]

# Connect to DynamoDB
dynamodb = boto3.resource('dynamodb')
transactionTotalTableName = 'TransactionTotal'
transactionsTotalTable = dynamodb.Table(transactionTotalTableName)

# This handler is executed every time the Lambda function is triggered
def lambda_handler(event, context):
    # Show the incoming event in the debug log
    print("Event received by Lambda function: " + json.dumps(event, indent=2))

    # For each transaction added, calculate the new Transactions Total
    for record in event['Records']:
        customerId = record['dynamodb']['NewImage']['CustomerId']['S']
        transactionAmount = int(record['dynamodb']['NewImage']['TransactionAmount']['N'])

        # Update the customer's total in the TransactionTotal DynamoDB table
        response = transactionsTotalTable.update_item(
            Key={
                'CustomerId': customerId
            },
            UpdateExpression="add accountBalance :val",
            ExpressionAttributeValues={
                ':val': transactionAmount
            },
            ReturnValues="UPDATED_NEW"
        )
Here is a sample error from the CloudWatch log:
'NewImage': KeyError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 30, in lambda_handler
customerId = record['dynamodb']['NewImage']['CustomerId']['S']
KeyError: 'NewImage'
To elaborate on Oluwafemi's comment, you're likely experiencing this error when receiving a REMOVE event. Regardless of whether your stream is set to new and old images, or just new, you won't receive a NewImage on a REMOVE event, since there is no new image. Check out the example events in the AWS docs.
A check on the value of record['eventName'] should solve the issue.
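For example, a minimal sketch of that check dropped into the loop from the question (attribute names taken from the question's code):

for record in event['Records']:
    # REMOVE events carry no NewImage, so skip anything that isn't an insert or update
    if record['eventName'] not in ('INSERT', 'MODIFY'):
        continue

    new_image = record['dynamodb']['NewImage']
    customerId = new_image['CustomerId']['S']
    transactionAmount = int(new_image['TransactionAmount']['N'])
    # ... update the TransactionTotal table as before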
Is there a way at all to query on the global secondary index of DynamoDB using boto3? I can't find any online tutorials or resources.
You need to provide an IndexName parameter for the query function.
This is the name of the index, which is usually different from the name of the index attribute (the name of the index has an -index suffix by default, although you can change it during table creation). For example, if your index attribute is called video_id, your index name is probably video_id-index.
import boto3
from boto3.dynamodb.conditions import Key
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('videos')
video_id = 25
response = table.query(
    IndexName='video_id-index',
    KeyConditionExpression=Key('video_id').eq(video_id)
)
To check the index name, go to the Indexes tab of the table on the web interface of AWS. You'll need a value from the Name column.
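If you would rather check from code, a minimal sketch using describe_table also lists the index names (table name taken from the example above):

import boto3

client = boto3.client('dynamodb')

# Each entry in GlobalSecondaryIndexes includes an 'IndexName' key
description = client.describe_table(TableName='videos')
for gsi in description['Table'].get('GlobalSecondaryIndexes', []):
    print(gsi['IndexName'])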
For anyone using the boto3 client, the example below should work:
import boto3

# for production
client = boto3.client('dynamodb')

# for local development if running a local dynamodb server
client = boto3.client(
    'dynamodb',
    region_name='localhost',
    endpoint_url='http://localhost:8000'
)

resp = client.query(
    TableName='UsersTable',
    IndexName='MySecondaryIndexName',
    ExpressionAttributeValues={
        ':v1': {
            'S': 'some#email.com',
        },
    },
    KeyConditionExpression='emailField = :v1',
)

# will always return a list
items = resp.get('Items')
first_item = items[0]
Adding the updated technique:
import boto3
from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource(
    'dynamodb',
    region_name='localhost',
    endpoint_url='http://localhost:8000'
)
table = dynamodb.Table('userTable')

attributes = table.query(
    IndexName='UserName',
    KeyConditionExpression=Key('username').eq('jdoe')
)

if 'Items' in attributes and len(attributes['Items']) == 1:
    attributes = attributes['Items'][0]
There are so many questions like this because calling DynamoDB through boto3 is not intuitive. I use the dynamof library to make things like this a lot more straightforward. Using dynamof, the call looks like this:
from dynamof.operations import query
from dynamof.conditions import attr
query(
    table_name='users',
    conditions=attr('role').equals('admin'),
    index_name='role_lookup_index')
https://github.com/rayepps/dynamof
disclaimer: I wrote dynamof