Get Dimensions for USAGE_TYPE AWS Boto3 CostExplorer Client - amazon-web-services

I'm trying to get costs using the Cost Explorer client in boto3, but I can't find the values to use in a Dimensions filter. The documentation says those values can be retrieved with GetDimensionValues, but how do I use GetDimensionValues?
response = client.get_cost_and_usage(
    TimePeriod={
        'Start': str(start_time).split()[0],
        'End': str(end_time).split()[0]
    },
    Granularity='DAILY',
    Filter={
        'Dimensions': {
            'Key': 'USAGE_TYPE',
            'Values': [
                'DataTransfer-In-Bytes'
            ]
        }
    },
    Metrics=[
        'NetUnblendedCost',
    ],
    GroupBy=[
        {
            'Type': 'DIMENSION',
            'Key': 'SERVICE'
        },
    ]
)

The boto3 reference for GetDimensionValues has a lot of details on how to use that call. Here's some sample code you might use to print out possible dimension values:
response = client.get_dimension_values(
    TimePeriod={
        'Start': '2022-01-01',
        'End': '2022-06-01'
    },
    Dimension='USAGE_TYPE',
    Context='COST_AND_USAGE',
)
for dimension_value in response["DimensionValues"]:
    print(dimension_value["Value"])
Output:
APN1-Catalog-Request
APN1-DataTransfer-Out-Bytes
APN1-Requests-Tier1
APN2-Catalog-Request
APN2-DataTransfer-Out-Bytes
APN2-Requests-Tier1
APS1-Catalog-Request
APS1-DataTransfer-Out-Bytes
.....
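The values printed above can be fed back into the Filter of get_cost_and_usage. Here is a minimal sketch of that round trip, not code taken from the boto3 reference: the helper names list_usage_types and build_usage_type_filter are made up for illustration, and the loop assumes the response carries a NextPageToken whenever more pages are available.

```python
def list_usage_types(client, start, end, contains=""):
    """Collect USAGE_TYPE values for a period, following NextPageToken.

    `client` is assumed to be boto3.client("ce"); `contains` is an
    optional substring used to narrow the result on the client side.
    """
    values = []
    kwargs = {
        "TimePeriod": {"Start": start, "End": end},
        "Dimension": "USAGE_TYPE",
        "Context": "COST_AND_USAGE",
    }
    while True:
        response = client.get_dimension_values(**kwargs)
        for item in response["DimensionValues"]:
            if contains in item["Value"]:
                values.append(item["Value"])
        token = response.get("NextPageToken")
        if not token:
            return values
        # Ask for the next page on the following iteration.
        kwargs["NextPageToken"] = token


def build_usage_type_filter(values):
    """Build the Filter argument for get_cost_and_usage."""
    return {"Dimensions": {"Key": "USAGE_TYPE", "Values": list(values)}}


# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("ce")
# usage_types = list_usage_types(client, "2022-01-01", "2022-06-01", "DataTransfer")
# cost_filter = build_usage_type_filter(usage_types)
```

The resulting dict can be passed as Filter=cost_filter in the get_cost_and_usage call from the question.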

Related

IndexError: list index out of range with moto

I am mocking an internal function that returns a DynamoDB query. The query used begins_with, which raised IndexError: list index out of range.
I changed the query and removed begins_with, yet I still get the same error. If I remove the AND condition from KeyConditionExpression, the query works.
Below is the query:
val = 'test#val#testing'
input_query = {
    'TableName': <table_name>,
    'KeyConditionExpression': '#23b62 = :23b62 And #23b63 = :23b63)',
    'FilterExpression': 'contains(#23b64, :23b64)',
    'ProjectionExpression': '#23b60,#23b61',
    'ExpressionAttributeNames': {'#23b60': 'level', '#23b61': 'test_id', '#23b62': 'PK', '#23b63': 'SK', '#23b64': 'used_in'},
    'ExpressionAttributeValues': {':23b62': {'S': 'testing'}, ':23b63': {'S': val}, ':23b64': {'S': 'test'}}
}
New query:
dynamodb_client.query(
    TableName="table",
    KeyConditionExpression="#PK = :PK And #SK = :SK",
    ExpressionAttributeNames={
        "#PK": "PK",
        "#SK": "SK"
    },
    FilterExpression="contains(Used, :used)",
    ExpressionAttributeValues={
        ":PK": {"S": "tests"},
        ":SK": {"S": "test#en#testing"},
        ":used": {"S": "testing"}
    }
)
Test case:
from botocore.exceptions import ClientError
from dynamodb_json import json_util as dynamodb_json
import logging
from contextlib import contextmanager
import pytest
from unittest.mock import patch

logger = logging.getLogger(__name__)


@contextmanager
def ddb_setup(dynamodb_resource):
    table = dynamodb_resource.create_table(
        TableName='table',
        KeySchema=[
            {
                'AttributeName': 'PK',
                'KeyType': 'HASH'
            }, {
                'AttributeName': 'SK',
                'KeyType': 'SORT'
            },
        ],
        AttributeDefinitions=[
            {
                'AttributeName': 'PK',
                'AttributeType': 'S'
            }, {
                'AttributeName': 'SK',
                'AttributeType': 'S'
            },
        ],
        ProvisionedThroughput={
            'ReadCapacityUnits': 1,
            'WriteCapacityUnits': 1,
        }
    )
    yield


class TestDynamoDB:
    def test_create_table(self, dynamodb_resource, dynamodb_client):
        with ddb_setup(dynamodb_resource):
            try:
                response = dynamodb_client.describe_table(
                    TableName='table')
                resp = dynamodb_client.query(
                    TableName="table",
                    KeyConditionExpression="#PK = :PK And #SK = :SK",
                    ExpressionAttributeNames={
                        "#PK": "PK",
                        "#SK": "SK"
                    },
                    FilterExpression="contains(Used, :used)",
                    ExpressionAttributeValues={
                        ":PK": {"S": "tests"},
                        ":SK": {"S": "test#en#testing"},
                        ":used": {"S": "testing"}
                    }
                )
            except ClientError as err:
                logger.error(f"error: {err.response['Error']['Code']}")
                assert err.response['Error']['Code'] == 'ResourceNotFoundException'
Could anyone suggest how I can run this query with the AND condition under moto?
Here is an example of a working test configuration using pytest and moto. I've added code that shows how to use the AND condition using the resource and client API.
import boto3
import boto3.dynamodb.conditions as conditions
import moto
import pytest

TABLE_NAME = "data"


@pytest.fixture
def test_table():
    with moto.mock_dynamodb():
        client = boto3.client("dynamodb")
        client.create_table(
            AttributeDefinitions=[
                {"AttributeName": "PK", "AttributeType": "S"},
                {"AttributeName": "SK", "AttributeType": "S"}
            ],
            TableName=TABLE_NAME,
            KeySchema=[
                {"AttributeName": "PK", "KeyType": "HASH"},
                {"AttributeName": "SK", "KeyType": "RANGE"}
            ],
            BillingMode="PAY_PER_REQUEST"
        )
        table = boto3.resource("dynamodb").Table(TABLE_NAME)
        table.put_item(Item={
            "PK": "pk_value",
            "SK": "sk_value"
        })
        yield TABLE_NAME


def test_query_with_and_using_resource(test_table):
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    response = table.query(
        KeyConditionExpression=conditions.Key("PK").eq("pk_value") & conditions.Key("SK").eq("sk_value")
    )
    assert len(response["Items"]) == 1


def test_query_with_and_using_client(test_table):
    client = boto3.client("dynamodb")
    response = client.query(
        TableName=TABLE_NAME,
        KeyConditionExpression="#PK = :PK AND #SK = :SK",
        ExpressionAttributeNames={
            "#PK": "PK",
            "#SK": "SK"
        },
        ExpressionAttributeValues={
            ":PK": {"S": "pk_value"},
            ":SK": {"S": "sk_value"}
        }
    )
    assert len(response["Items"]) == 1
First, we set up a table with a dummy item, and then there are two tests, the first for the resource and the second for the client API. Maybe this helps you figure out the mistake.
AWS uses the keyword RANGE to indicate that something is a sort-key. (No idea why..)
If you replace:
'KeyType': 'SORT'
with
'KeyType': 'RANGE'
the test passes.
I'm assuming that AWS throws a more obvious error when creating a table with an unknown KeyType. If you want, you can open a feature request on Moto's GitHub asking Moto to replicate that behaviour and throw the same exception.
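A test suite can also guard against this kind of typo itself. The sketch below is a hypothetical helper (validate_key_schema is not part of boto3 or moto) that rejects any KeyType other than HASH or RANGE before the schema is handed to create_table.

```python
VALID_KEY_TYPES = {"HASH", "RANGE"}  # partition key and sort key


def validate_key_schema(key_schema):
    """Raise ValueError if a KeySchema entry uses an unknown KeyType."""
    for element in key_schema:
        key_type = element.get("KeyType")
        if key_type not in VALID_KEY_TYPES:
            raise ValueError(
                f"Invalid KeyType {key_type!r} for attribute "
                f"{element.get('AttributeName')!r}; expected HASH or RANGE"
            )
    # Return the schema unchanged so the call can be used inline.
    return key_schema
```

It could be used inline, for example KeySchema=validate_key_schema([...]), so that the 'SORT' value from the question fails immediately with a readable message instead of surfacing later as an IndexError during the query.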

MissingConfigVariableError while creating DataContext in Great Expectations

I am unable to create a DataContext with the following configuration. I am trying to use a Databricks Spark DataFrame datasource and an in-house database as the store backend defaults.
I get a MissingConfigVariableError exception.
Could someone explain what I am missing?
import great_expectations as ge
import great_expectations.exceptions as ge_exceptions
from great_expectations.data_context.types.base import DataContextConfig, DatasourceConfig, FilesystemStoreBackendDefaults, DatabaseStoreBackendDefaults
from great_expectations.data_context import BaseDataContext

my_spark_datasource_config = DatasourceConfig(
    class_name="Datasource",
    execution_engine={"class_name": "SparkDFExecutionEngine"},
    data_connectors={
        "sample_sparkdf_runtime_data_connector": {
            "module_name": "great_expectations.datasource.data_connector",
            "class_name": "RuntimeDataConnector",
            "batch_identifiers": [
                "some_key_maybe_pipeline_stage",
                "some_other_key_maybe_run_id"
            ]
        }
    }
)
data_context_config = DataContextConfig(
    config_version=2,
    plugins_directory=None,
    config_variables_file_path=None,
    datasources={"my_spark_datasource": my_spark_datasource_config},
    store_backend_defaults=DatabaseStoreBackendDefaults(
        default_credentials={
            "drivername": "PrestoSQL",
            "host": "*****",
            "port": "443",
            "username": "*****",
            "password": "*****",
            "database": "****"
        }
    ),
    anonymous_usage_statistics={"enabled": False}
)
context = BaseDataContext(project_config=data_context_config)

list indices must be integers or slices when appending a dict to a serializer in django

I have an existing serializer response and I am trying to add a new object to it, but I keep getting this error:
list indices must be integers or slices, not str
I am not able to trace where I am going wrong with the creation of the new object. Here is my code below and more explanations.
class ClusterFunctionView(generics.ListAPIView):
    permission_classes = (IsAuthenticated,)
    serializer_class = FunctionListSerializer

    def get_queryset(self):
        # returns serializer
        ...  # body omitted in the question

    def list(self, request, *args, **kwargs):
        response = super().list(request, *args, **kwargs)
        user = self.request.user
        cluster = Cluster.objects.filter(user_id=user.id, id=self.kwargs["cluster_id"]).first()
        schedule = Schedule.objects.filter(clusters__in=[cluster]).values().first()  # I am getting the new object here
        print('schedule', type(schedule))  # I checked the type, it is a dict
        response.data['schedule'] = schedule  # doesn't seem to be appending to the existing serializer.
        return response
The following is an example of the output of the schedule object, I printed using print('schedule', schedule):
schedule {'id': 7, 'user_id': 3, 'creation_time': datetime.datetime(2020, 5, 25, 15, 44, 39, 875485), 'name': 'mandard_1', 'is_active': True, 'comment': 'extract mardard premier batch', 'cron_expression': '#once'}
A sample of the existing serializer on which I should add the above object is:
[
{
"id": 1,
"function": "connections",
"max_concurrency": 1,
"mandatory_params": {},
"public_params": {
"cluster": {
"account": true,
"max_pages": {
"max": 100,
"default": 100
},
"profiles_per_page": {
"max": 25,
"default": 25
}
}
},
"params": {
"max_pages": 100,
"account_function": "user_account",
"alchemy_directory": "connections",
"unique_result_obj_attribute": "connection_id"
}
}
]
I am expecting a serializer with a schedules object, a result like :
[
{
"id": 1,
"function": "connections",
"max_concurrency": 1,
"mandatory_params": {},
"public_params": {
"cluster": {
"account": true,
"max_pages": {
"max": 100,
"default": 100
},
"profiles_per_page": {
"max": 25,
"default": 25
}
}
},
"params": {
"max_pages": 100,
"account_function": "user_account",
"alchemy_directory": "connections",
"unique_result_obj_attribute": "connection_id"
}
},
"schedule": {} # this should be added as a result
]
What could the problem be, and what solution could I undertake? Thanks
I was able to solve this; it turned out to be a small mistake on my part.
In the list method, this line was causing the error:
response.data['schedule'] = schedule
I realised that the serialized result is a list, and I was trying to assign schedule: {} to it under a string key, hence the error. The schedule has to be added to the first object in the list instead, so that object must be accessed by index. Changing the line to the following solved the problem:
response.data[0]['schedule'] = schedule
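The error can be reproduced without Django, because the data returned by the list view is a list. The snippet below is only an illustration with made-up sample values standing in for response.data and the schedule dict.

```python
# Stand-ins for response.data and the schedule dict (sample values only).
data = [{"id": 1, "function": "connections"}]
schedule = {"id": 7, "name": "mandard_1"}

try:
    data["schedule"] = schedule  # a list cannot be indexed with a string
except TypeError as err:
    message = str(err)  # the same TypeError message as in the question

# Attach the schedule to the first serialized object instead.
data[0]["schedule"] = schedule
```

Note that data[0] raises IndexError when the list is empty, so a guard such as `if data:` may be worth adding before the assignment.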

AWS update kinesis firehose configuration programmatically

Currently I am writing a test library to test the configuration settings. I would like to set only a few Firehose parameters, such as SizeInMBs and IntervalInSeconds, and leave all other parameters unchanged. Is there a simple way to do it?
I wrote the following method:
def set_firehose_buffering_hints(self, size_mb, interval_sec):
    response = self._firehose_client.describe_delivery_stream(DeliveryStreamName=self.firehose)
    lambdaarn = (response['DeliveryStreamDescription']
                 ['Destinations'][0]['ExtendedS3DestinationDescription']
                 ['ProcessingConfiguration']['Processors'][0]['Parameters'][0]['ParameterValue'])
    response = self._firehose_client.update_destination(
        DeliveryStreamName=self.firehose,
        CurrentDeliveryStreamVersionId=response['DeliveryStreamDescription']['VersionId'],
        DestinationId=response['DeliveryStreamDescription']['Destinations'][0]['DestinationId'],
        ExtendedS3DestinationUpdate={
            "BufferingHints": {
                "IntervalInSeconds": interval_sec,
                "SizeInMBs": size_mb
            },
            'ProcessingConfiguration': {
                'Processors': [{
                    'Type': 'Lambda',
                    'Parameters': [
                        {
                            'ParameterName': 'LambdaArn',
                            'ParameterValue': lambdaarn
                        },
                        {
                            'ParameterName': 'BufferIntervalInSeconds',
                            'ParameterValue': str(interval_sec)
                        },
                        {
                            'ParameterName': 'BufferSizeInMBs',
                            'ParameterValue': str(size_mb)
                        }]
                }]
            }})

ElasticSearch: Getting old visitor data into an index

I'm learning ElasticSearch in the hopes of dumping my business data into ES and viewing it with Kibana. After a week of various issues I finally have ES and Kibana working (1.7.0 and 4 respectively) on 2 Ubuntu 14.04 desktop machines (clustered).
The issue I'm having now is how best to get the data into ES. The data flow is that I capture the PHP global variables $_REQUEST and $_SERVER for each visit in a text file named with a unique ID. From there, if the visitor fills in a form, I capture that data in a text file also named with that unique ID, in a different directory. Then my customers tell me whether that form fill was any good, with a delay of up to 50 days.
So I'm starting with the visitor data - $_REQUEST and $_SERVER. A lot of it is redundant so I'm really just attempting to capture the timestamp of their arrival, their IP, the IP of the server they visited, the domain they visited, the unique ID, and their User Agent. So I created this mapping:
time_date_mapping = { 'type': 'date_time' }
str_not_analyzed = { 'type': 'string'} # Originally this included 'index': 'not analyzed' as well
visit_mapping = {
    'properties': {
        'uniqID': str_not_analyzed,
        'pages': str_not_analyzed,
        'domain': str_not_analyzed,
        'Srvr IP': str_not_analyzed,
        'Visitor IP': str_not_analyzed,
        'Agent': { 'type': 'string' },
        'Referrer': { 'type': 'string' },
        'Entrance Time': time_date_mapping, # Stored as a Unix timestamp
        'Request Time': time_date_mapping, # Stored as a Unix timestamp
        'Raw': { 'type': 'string', 'index': 'not_analyzed' },
    },
}
I then enter it into ES with:
es.index(
    index=Visit_to_ElasticSearch.INDEX,
    doc_type=Visit_to_ElasticSearch.DOC_TYPE,
    id=self.uniqID,
    timestamp=int(math.floor(self._visit['Entrance Time'])),
    body=visit
)
When I look at the data in the index on ES, only Entrance Time, _id, _type, domain, and uniqID are indexed for searching (according to Kibana). All of the data is present in the document, but most of the fields show "Unindexed fields can not be searched."
Additionally, I was trying to get a pie chart of the Agents, but I couldn't figure out how to visualize it: no matter which boxes I click, the Agent field is never offered as an option for aggregation. I mention it because the fields that are indexed do show up.
I've been attempting to mimic the mapping examples in the elasticsearch.py example that pulls in GitHub data. Can someone correct how I'm using that mapping?
Thanks
------------ Mapping -------------
{
"visits": {
"mappings": {
"visit": {
"properties": {
"Agent": {
"type": "string"
},
"Entrance Time": {
"type": "date",
"format": "dateOptionalTime"
},
"Raw": {
"properties": {
"Entrance Time": {
"type": "double"
},
"domain": {
"type": "string"
},
"uniqID": {
"type": "string"
}
}
},
"Referrer": {
"type": "string"
},
"Request Time": {
"type": "string"
},
"Srvr IP": {
"type": "string"
},
"Visitor IP": {
"type": "string"
},
"domain": {
"type": "string"
},
"uniqID": {
"type": "string"
}
}
}
}
}
}
------------- Update and New Mapping -----------
So I deleted the index and recreated it. The original index had some data in it from before I knew anything about mapping the data to specific field types. This seemed to fix the issue with only a few fields being indexed.
However, parts of my mapping appear to be ignored. Specifically the Agent string mapping:
visit_mapping = {
'properties': {
'uniqID': str_not_analyzed,
'pages': str_not_analyzed,
'domain': str_not_analyzed,
'Srvr IP': str_not_analyzed,
'Visitor IP': str_not_analyzed,
'Agent': { 'type': 'string', 'index': 'not_analyzed' },
'Referrer': { 'type': 'string' },
'Entrance Time': time_date_mapping,
'Request Time': time_date_mapping,
'Raw': { 'type': 'string', 'index': 'not_analyzed' },
},
}
Here's the output of http://localhost:9200/visits_test2/_mapping
{
"visits_test2": {
"mappings": {
"visit": {
"properties": {
"Agent":{"type":"string"},
"Entrance Time": {"type":"date","format":"dateOptionalTime"},
"Raw": {
"properties": {
"Entrance Time":{"type":"double"},
"domain":{"type":"string"},
"uniqID":{"type":"string"}
}
},
"Referrer":{"type":"string"},
"Request Time": {"type":"date","format":"dateOptionalTime"},
"Srvr IP":{"type":"string"},
"Visitor IP":{"type":"string"},
"domain":{"type":"string"},
"uniqID":{"type":"string"}
}
}
}
}
}
Note that I've used an entirely new index, because I wanted to make sure nothing was carrying over from one to the next.
Note that I'm using the Python library elasticsearch.py and following their examples for mapping syntax.
--------- Python Code for Entering Data into ES, per comment request -----------
Below is a file named mapping.py. I have not yet fully commented the code, since this was just code to test whether this method of data entry into ES was viable. If it is not self-explanatory, let me know and I'll add additional comments.
Note, I programmed in PHP for years before picking up Python. In order to get up and running faster with Python I created a couple of files with basic string and file manipulation functions and made them into a package. They are written in Python and meant to mimic the behavior of a built-in PHP function. So when you see a call to php_basic_* it is one of those functions.
# Standard Library Imports
import json, copy, datetime, time, enum, os, sys, numpy, math
from datetime import datetime
from enum import Enum, unique
from elasticsearch import Elasticsearch
# My Library
import basicconfig, mybasics
from mybasics.cBaseClass import BaseClass, BaseClassErrors
from mybasics.cHelpers import HandleErrors, LogLvl
# This imports several constants, a couple of functions, and a helper class
from basicconfig.startup_config import *

# Connect to ElasticSearch
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])

# Create mappings of a visit
time_date_mapping = { 'type': 'date_time' }
str_not_analyzed = { 'type': 'string'} # This originally included 'index': 'not_analyzed' as well
visit_mapping = {
    'properties': {
        'uniqID': str_not_analyzed,
        'pages': str_not_analyzed,
        'domain': str_not_analyzed,
        'Srvr IP': str_not_analyzed,
        'Visitor IP': str_not_analyzed,
        'Agent': { 'type': 'string', 'index': 'not_analyzed' },
        'Referrer': { 'type': 'string' },
        'Entrance Time': time_date_mapping,
        'Request Time': time_date_mapping,
        'Raw': { 'type': 'string', 'index': 'not_analyzed' },
        'Pages': { 'type': 'string', 'index': 'not_analyzed' },
    },
}


class Visit_to_ElasticSearch(object):
    """
    """
    INDEX = 'visits'
    DOC_TYPE = 'visit'

    def __init__(self, fname, index=True):
        """
        """
        self._visit = json.loads(php_basic_files.file_get_contents(fname))
        self._pages = self._visit.pop('pages')
        self.uniqID = self._visit['uniqID']
        self.domain = self._visit['domain']
        self.entrance_time = self._convert_time(self._visit['Entrance Time'])
        # Get a list of the page IDs
        self.pages = self._pages.keys()
        # Extra IPs and such from a single page
        page = self._pages[self.pages[0]]
        srvr = page['SERVER']
        req = page['REQUEST']
        self.visitor_ip = srvr['REMOTE_ADDR']
        self.srvr_ip = srvr['SERVER_ADDR']
        self.request_time = self._convert_time(srvr['REQUEST_TIME'])
        self.agent = srvr['HTTP_USER_AGENT']
        # Now go grab data that might not be there...
        self._extract_optional()
        if index is True:
            self.index_with_elasticsearch()

    def _convert_time(self, ts):
        """
        """
        try:
            dt = datetime.fromtimestamp(ts)
        except TypeError:
            dt = datetime.fromtimestamp(float(ts))
        return dt.strftime('%Y-%m-%dT%H:%M:%S')

    def _extract_optional(self):
        """
        """
        self.referrer = ''

    def index_with_elasticsearch(self):
        """
        """
        visit = {
            'uniqID': self.uniqID,
            'pages': [],
            'domain': self.domain,
            'Srvr IP': self.srvr_ip,
            'Visitor IP': self.visitor_ip,
            'Agent': self.agent,
            'Referrer': self.referrer,
            'Entrance Time': self.entrance_time,
            'Request Time': self.request_time,
            'Raw': self._visit,
            'Pages': php_basic_str.implode(', ', self.pages),
        }
        es.index(
            index=Visit_to_ElasticSearch.INDEX,
            doc_type=Visit_to_ElasticSearch.DOC_TYPE,
            id=self.uniqID,
            timestamp=int(math.floor(self._visit['Entrance Time'])),
            body=visit
        )


es.indices.create(
    index=Visit_to_ElasticSearch.INDEX,
    body={
        'settings': {
            'number_of_shards': 5,
            'number_of_replicas': 1,
        }
    },
    # ignore already existing index
    ignore=400
)
In case it matters this is the simple loop I use to dump the data into ES:
for f in all_files:
    try:
        visit = mapping.Visit_to_ElasticSearch(f)
    except IOError:
        pass
where all_files is a list of all the visit files (full path) I have in my test data set.
Here is a sample visit file from a Google Bot visit:
{u'Entrance Time': 1407551587.7385,
u'domain': u'############',
u'pages': {u'6818555600ccd9880bf7acef228c5d47': {u'REQUEST': [],
u'SERVER': {u'DOCUMENT_ROOT': u'/var/www/####/',
u'Entrance Time': 1407551587.7385,
u'GATEWAY_INTERFACE': u'CGI/1.1',
u'HTTP_ACCEPT': u'*/*',
u'HTTP_ACCEPT_ENCODING': u'gzip,deflate',
u'HTTP_CONNECTION': u'Keep-alive',
u'HTTP_FROM': u'googlebot(at)googlebot.com',
u'HTTP_HOST': u'############',
u'HTTP_IF_MODIFIED_SINCE': u'Fri, 13 Jun 2014 20:26:33 GMT',
u'HTTP_USER_AGENT': u'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
u'PATH': u'/usr/local/bin:/usr/bin:/bin',
u'PHP_SELF': u'/index.php',
u'QUERY_STRING': u'',
u'REDIRECT_SCRIPT_URI': u'http://############/',
u'REDIRECT_SCRIPT_URL': u'############',
u'REDIRECT_STATUS': u'200',
u'REDIRECT_URL': u'############',
u'REMOTE_ADDR': u'############',
u'REMOTE_PORT': u'46271',
u'REQUEST_METHOD': u'GET',
u'REQUEST_TIME': u'1407551587',
u'REQUEST_URI': u'############',
u'SCRIPT_FILENAME': u'/var/www/PIAN/index.php',
u'SCRIPT_NAME': u'/index.php',
u'SCRIPT_URI': u'http://############/',
u'SCRIPT_URL': u'/############/',
u'SERVER_ADDR': u'############',
u'SERVER_ADMIN': u'admin#############',
u'SERVER_NAME': u'############',
u'SERVER_PORT': u'80',
u'SERVER_PROTOCOL': u'HTTP/1.1',
u'SERVER_SIGNATURE': u'<address>Apache/2.2.22 (Ubuntu) Server at ############ Port 80</address>\n',
u'SERVER_SOFTWARE': u'Apache/2.2.22 (Ubuntu)',
u'uniqID': u'bbc398716f4703cfabd761cc8d4101a1'},
u'SESSION': {u'Entrance Time': 1407551587.7385,
u'uniqID': u'bbc398716f4703cfabd761cc8d4101a1'}}},
u'uniqID': u'bbc398716f4703cfabd761cc8d4101a1'}
Now I understand better why the Raw field is an object instead of a simple string since it is assigned self._visit which in turn was initialized with json.loads(php_basic_files.file_get_contents(fname)).
Anyway, based on all the information you've given above, my take is that the mapping was never installed via put_mapping. From there on, there's no way anything else can work the way you like. I suggest you modify your code to install the mapping before you index your first visit document.
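One way to do that with elasticsearch.py is to pass the mapping in the body when the index is created, so it is in place before the first es.index() call. This is a sketch under assumptions, not a drop-in fix: create_index_with_mapping is a made-up helper, the field list is shortened, and the time fields use 'type': 'date' on the assumption that date_time is a date format name rather than a field type in Elasticsearch 1.x.

```python
NOT_ANALYZED = {"type": "string", "index": "not_analyzed"}

# Shortened version of the mapping from the question.
# 'date' instead of 'date_time' is an assumption (see the note above).
VISIT_MAPPING = {
    "properties": {
        "uniqID": NOT_ANALYZED,
        "domain": NOT_ANALYZED,
        "Agent": NOT_ANALYZED,
        "Entrance Time": {"type": "date"},
        "Request Time": {"type": "date"},
    }
}


def create_index_with_mapping(es, index, doc_type, mapping):
    """Create `index` with `mapping` installed for `doc_type`.

    `es` is assumed to be an elasticsearch.Elasticsearch client talking
    to a 1.x cluster. ignore=400 suppresses the error raised when the
    index already exists; in that case the mapping is NOT updated.
    """
    body = {
        "settings": {"number_of_shards": 5, "number_of_replicas": 1},
        "mappings": {doc_type: mapping},
    }
    return es.indices.create(index=index, body=body, ignore=400)


# Usage (run once, before indexing any visit document):
# from elasticsearch import Elasticsearch
# es = Elasticsearch([{"host": "localhost", "port": 9200}])
# create_index_with_mapping(es, "visits", "visit", VISIT_MAPPING)
```

Because fields that already exist cannot simply be switched to not_analyzed, this is best tried on a fresh index; afterwards the output of the _mapping endpoint should show "index": "not_analyzed" for Agent.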