How can I get AWS monthly invoice PDF using AWS API?

How can I programmatically download the monthly PDF invoice that the accounting department asks me for every month?
I can get it from the AWS Console (e.g. https://console.aws.amazon.com/billing/home?region=eu-west-3#/bills?year=2019&month=3),
where there is a link to the invoice.
The moment I click to download the invoice, I can see HTTP requests to the following URL:
https://console.aws.amazon.com/billing/rest/v1.0/bill/invoice/generate?generatenew=true&invoiceGroupId=_SOME_ID_&invoicenumber=_SOME_ID_
Then a final request to the URL that actually serves the PDF file:
https://console.aws.amazon.com/billing/rest/v1.0/bill/invoice/download?invoiceGroupId=_SOME_ID_&invoicenumber=_SOME_ID_
I cannot find documentation on an AWS API for fetching such an invoice document (there is some for billing reports and other data, but none for the "official" document), so I am starting to wonder whether one even exists.
Before resorting to scraping the AWS Console (via Scrapy, Selenium, Puppeteer), I ask the community.
NB: I know AWS can send the invoice PDF via e-mail, but I would rather fetch it directly from AWS than fetch it from an IMAP/POP e-mail server.

You can use the AWS CLI or an AWS SDK to get the data in JSON format, and then convert the JSON into a PDF (not covered in this answer).
AWS CLI
The AWS CLI provides the get-cost-and-usage command. By fiddling with its parameters you can get output that matches what is shown on the billing invoice.
Example usage of this command:
aws ce get-cost-and-usage \
    --time-period Start=2019-03-01,End=2019-04-01 \
    --granularity MONTHLY \
    --metrics "BlendedCost" "UnblendedCost" "UsageQuantity" \
    --group-by Type=DIMENSION,Key=SERVICE
which produces the following output:
{
    "GroupDefinitions": [
        {
            "Type": "DIMENSION",
            "Key": "SERVICE"
        }
    ],
    "ResultsByTime": [
        {
            "TimePeriod": {
                "Start": "2019-03-01",
                "End": "2019-04-01"
            },
            "Total": {},
            "Groups": [
                {
                    "Keys": [
                        "AWS Budgets"
                    ],
                    "Metrics": {
                        "BlendedCost": {
                            "Amount": "3.0392156805",
                            "Unit": "USD"
                        },
                        "UnblendedCost": {
                            "Amount": "3",
                            "Unit": "USD"
                        },
                        "UsageQuantity": {
                            "Amount": "155",
                            "Unit": "N/A"
                        }
                    }
                },
                {
                    "Keys": [
                        "AWS CloudTrail"
                    ],
                    "Metrics": {
                        "BlendedCost": {
                            "Amount": "0",
                            "Unit": "USD"
                        },
                        "UnblendedCost": {
                            "Amount": "0",
                            "Unit": "USD"
                        },
                        "UsageQuantity": {
                            "Amount": "720042",
                            "Unit": "N/A"
                        }
                    }
                },
                ...
AWS SDK
You can also get the same kind of data programmatically. The easiest way is to use an AWS SDK; refer to the documentation of the SDK you want to use. For example, information on this functionality for the Python SDK can be found here.
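For the Python SDK (boto3), the CLI call above maps almost one-to-one onto the ce client's get_cost_and_usage method. A sketch (the service_costs helper is my own name, and the live call needs configured credentials, so it is guarded):

```python
def service_costs(response):
    """Map service name -> unblended-cost string from a GetCostAndUsage response."""
    return {
        g["Keys"][0]: g["Metrics"]["UnblendedCost"]["Amount"]
        for g in response["ResultsByTime"][0]["Groups"]
    }

# Live call: requires boto3 and AWS credentials; the Cost Explorer API
# is served from us-east-1 regardless of where your resources run.
try:
    import boto3
    ce = boto3.client("ce", region_name="us-east-1")
    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": "2019-03-01", "End": "2019-04-01"},
        Granularity="MONTHLY",
        Metrics=["BlendedCost", "UnblendedCost", "UsageQuantity"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    print(service_costs(resp))
except Exception:
    pass  # boto3 missing or no credentials configured
```

The parsing works on exactly the response structure shown in the CLI output above.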

Specific to invoices: unfortunately, to this day there is still no native way to download them, other than fetching them manually or being one of the lucky accounts that gets them (all of them) delivered via e-mail: https://aws.amazon.com/premiumsupport/knowledge-center/download-pdf-invoice/
There is https://github.com/iann0036/aws-bill-export, which does not use a native API but instead scrapes the webpage; it is set up as a Lambda with Node.js and uses Puppeteer among other dependencies.
I just finished writing some Python + Selenium that is far more "monstrous" but gets the job done (for today's UI / Jan 2023, at least)...
I thought I'd share both of those, since you mentioned scraping tools in the OP and no other solutions have come up.
import os
import sys
import time
import argparse
from os.path import expanduser
from datetime import datetime
from dateutil.relativedelta import relativedelta
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
home = expanduser("~")
# Variables grabbed from CLI arguments
parser = argparse.ArgumentParser(
    description='AWS Console Login, programming the unprogrammatically-accessible (via CLI/API, using selenium instead).')
parser.add_argument(
    '-i', '--interactive',
    help="Use False for Headless mode",
    default=False,
    required=False
)
args = parser.parse_args()
# ChromeDriver options
options = webdriver.ChromeOptions()
if args.interactive == False:
    options.add_argument('--headless')
    download_directory = "./aws_invoice_downloads"
    if not os.path.exists(download_directory):
        os.makedirs(download_directory)
else:
    download_directory = home + "/Downloads"
options.add_argument("--window-size=1920x1080")
options.add_argument("--remote-debugging-port=9222")
options.add_argument('--no-sandbox')
options.add_argument("--disable-gpu")
options.add_argument('--disable-dev-shm-usage')
options.add_experimental_option("prefs", {
    "download.default_directory": download_directory,
    "download.prompt_for_download": False
})
# Initiate ChromeDriver
driver = webdriver.Chrome(executable_path='chromedriver', options=options)
# create action chain object
action = ActionChains(driver)
# Set the default selenium timeout
delay = 30 # seconds
# Abort function
def abort_function():
    print("Aborting!")
    driver.close()
    sys.exit(1)
# Wait for download function
def download_wait(path_to_downloads):
    seconds = 0
    dl_wait = True
    while dl_wait and seconds < 30:
        time.sleep(1)
        dl_wait = False
        for fname in os.listdir(path_to_downloads):
            if fname.endswith('.crdownload'):
                dl_wait = True
        seconds += 1
    return seconds
def download_invoices(Id, Network):
    print("Switching to the " + Network + "/" + Id + " org account...")
    # remove_existing_conflicts(Network)
    driver.get("https://signin.aws.amazon.com/switchrole?account=" + Id + "&roleName=YOUR_ROLE_NAME&displayName=" + Network + "%20Org%20Master")
    time.sleep(1)
    elem = WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.XPATH, '//*[@type="submit"]'))
    )
    elem.click()
    time.sleep(3)
    print("Downloading invoices...")
    # Notes
    # Can provide YYYY and MM in the URL to get a specific YYYY/MM billing period
    # https://us-east-1.console.aws.amazon.com/billing/home?region=us-east-1#/bills?year=2023&month=1
    # Get last month's YYYY and MM
    today = datetime.now()
    last_month = today - relativedelta(months=1)
    year = last_month.strftime("%Y")
    month = last_month.strftime("%m")
    driver.get("https://us-east-1.console.aws.amazon.com/billing/home?region=us-east-1#/bills?year=" + year + "&month=" + month)
    WebDriverWait(driver, 13).until(
        EC.presence_of_element_located((By.XPATH, '//*[@data-testid="main-spinner"]'))
    )
    time.sleep(2)
    elem = WebDriverWait(driver, 13).until(
        EC.presence_of_all_elements_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1]'))
    )
    # Count the number of items in the list
    elem_count = len(elem)
    print("Found " + str(elem_count) + " items in the list...")
    # Loop through the list and expand each item
    for i in range(1, elem_count + 1):
        print("Expanding item " + str(i) + " of " + str(elem_count) + "...")
        # (//*[text()[contains(., " Charges")]])[position() < last() - 1][i]
        elem = WebDriverWait(driver, 13).until(
            EC.presence_of_element_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']'))
        )
        desired_y = (elem.size['height'] / 2) + elem.location['y']
        current_y = (driver.execute_script('return window.innerHeight') / 2) + driver.execute_script('return window.pageYOffset')
        scroll_y_by = desired_y - current_y
        driver.execute_script("window.scrollBy(0, arguments[0]);", scroll_y_by)
        time.sleep(2)  # Fixes content shift and ElementClickInterceptedException by waiting, checking the elem, and scrolling again
        elem = WebDriverWait(driver, delay).until(
            EC.visibility_of_element_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']')))
        driver.execute_script("arguments[0].scrollIntoView(true); window.scrollBy(0, -100);", elem)
        action.move_to_element(elem).move_by_offset(0, 0).click().perform()
        # Count the number of invoices with that item
        # (//*[text()[contains(., " Charges")]])[position() < last() - 1][2]/following-sibling::div//*[@title="Download Invoice"]
        elem = WebDriverWait(driver, 13).until(
            EC.presence_of_all_elements_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']/following-sibling::div//*[@title="Download Invoice"]'))
        )
        # Count the number of items in the list
        invoice_count = len(elem)
        # Loop through the list and download each invoice
        for j in range(1, invoice_count + 1):
            print("Downloading invoice " + str(j) + " of " + str(invoice_count) + "...")
            # ((//*[text()[contains(., " Charges")]])[position() < last() - 1][2]/following-sibling::div//*[@title="Download Invoice"])[1]
            elem = WebDriverWait(driver, 13).until(
                EC.presence_of_element_located((By.XPATH, '((//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']/following-sibling::div//*[@title="Download Invoice"])[' + str(j) + ']'))
            )
            desired_y = (elem.size['height'] / 2) + elem.location['y']
            current_y = (driver.execute_script('return window.innerHeight') / 2) + driver.execute_script('return window.pageYOffset')
            scroll_y_by = desired_y - current_y
            driver.execute_script("window.scrollBy(0, arguments[0]);", scroll_y_by)
            time.sleep(2)  # Fixes content shift and ElementClickInterceptedException by waiting, checking the elem, and scrolling again
            elem = WebDriverWait(driver, delay).until(
                EC.visibility_of_element_located((By.XPATH, '((//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']/following-sibling::div//*[@title="Download Invoice"])[' + str(j) + ']')))
            driver.execute_script("arguments[0].scrollIntoView(true); window.scrollBy(0, -100);", elem)
            action.move_to_element(elem).move_by_offset(0, 0).click().perform()
            download_wait(download_directory)
            time.sleep(3)
        # Find the parent again
        elem = WebDriverWait(driver, 13).until(
            EC.presence_of_element_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']'))
        )
        # Collapse the parent
        desired_y = (elem.size['height'] / 2) + elem.location['y']
        current_y = (driver.execute_script('return window.innerHeight') / 2) + driver.execute_script('return window.pageYOffset')
        scroll_y_by = desired_y - current_y
        driver.execute_script("window.scrollBy(0, arguments[0]);", scroll_y_by)
        time.sleep(2)  # Fixes content shift and ElementClickInterceptedException by waiting, checking the elem, and scrolling again
        elem = WebDriverWait(driver, delay).until(
            EC.visibility_of_element_located((By.XPATH, '(//*[text()[contains(., " Charges")]])[position() < last() - 1][' + str(i) + ']')))
        action.move_to_element(elem).move_by_offset(0, 0).click().perform()

Related

Bittrex REST API for Python, I want to create an order using API v3 https://api.bittrex.com/v3/orders

I need help creating orders using the Bittrex v3 REST API. I have the code below and I can't work out what is missing.
I can make other GET calls, but I cannot make this POST request.
I don't know how to handle the passing of parameters.
Official documentation: https://bittrex.github.io/api/v3#tag-Orders.
def NewOrder(market, amount, price):
    #print 'open sell v3', market
    market = 'HEDG-BTC'  # 'BTC-'+market
    uri = 'https://api.bittrex.com/v3/orders?'
    params = {
        'marketSymbol': 'BTC-HEDG',  # 'HEDG-BTC', #market
        'direction': 'BUY',
        'type': 'LIMIT',
        'quantity': amount,
        'limit': price,
        'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
        'useAwards': True
    }
    timestamp = str(int(time.time()*1000))
    Content = ""
    contentHash = hashlib.sha512(Content.encode()).hexdigest()
    Method = 'POST'
    uri2 = buildURI(uri, params)
    #uri2 = 'https://api.bittrex.com/v3/orders?direction=BUY&limit=0.00021&marketSymbol=HEDG-BTC&quantity=1.1&timeInForce=POST_ONLY_GOOD_TIL_CANCELLED&type=LIMIT&useAwards=True'
    #print uri2
    PreSign = timestamp + uri2 + Method + contentHash  # + subaccountId
    #print PreSign
    Signature = hmac.new(apisecret, PreSign.encode(), hashlib.sha512).hexdigest()
    headers = {
        'Api-Key': apikey,
        'Api-Timestamp': timestamp,
        'Api-Content-Hash': contentHash,
        'Api-Signature': Signature
    }
    r = requests.post(uri2, data={}, headers=headers, timeout=11)
    return json.loads(r.content)

NewOrder('HEDG', 1.1, 0.00021)
And my error message:
{u'code': u'BAD_REQUEST', u'data': {u'invalidRequestParameter': u'direction'}, u'detail': u'Refer to the data field for specific field validation failures.'}
It seems from the documentation that the API expects this body as JSON data:
{
    "marketSymbol": "string",
    "direction": "string",
    "type": "string",
    "quantity": "number (double)",
    "ceiling": "number (double)",
    "limit": "number (double)",
    "timeInForce": "string",
    "clientOrderId": "string (uuid)",
    "useAwards": "boolean"
}
You are setting these values as URL params; that's the issue.
You need to do this:
uri = 'https://api.bittrex.com/v3/orders'
# NOTE >>>> please check that you provide all the required fields.
payload = {
    'marketSymbol': 'BTC-HEDG',  # 'HEDG-BTC', #market
    'direction': 'BUY',
    'type': 'LIMIT',
    'quantity': amount,
    'limit': price,
    'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
    'useAwards': True
}
# do the rest of the stuff as you are doing
# post payload as json data with the url given in the doc
r = requests.post(uri, json=payload, headers=headers, timeout=11)
print(r.json())
If you still have issues, let us know. Hope this helps.
I made the following modifications to the code, but it started to give an error on the content hash.
I'm assuming some parameters are optional, so they are commented out.
def NewOrder(market, amount, price):
    market = 'BTC-' + market
    uri = 'https://api.bittrex.com/v3/orders'
    payload = {
        'marketSymbol': market,
        'direction': 'BUY',
        'type': 'LIMIT',
        'quantity': amount,
        # "ceiling": "number (double)",
        'limit': price,
        'timeInForce': 'POST_ONLY_GOOD_TIL_CANCELLED',
        # "clientOrderId": "string (uuid)",
        'useAwards': True
    }
    # ceiling (optional, must be included for ceiling orders and excluded for non-ceiling orders)
    # clientOrderId (optional) client-provided identifier for advanced order tracking
    timestamp = str(int(time.time()*1000))
    Content = '' + json.dumps(payload, separators=(',', ':'))
    print(Content)
    contentHash = hashlib.sha512(Content.encode()).hexdigest()
    Method = 'POST'
    #uri2 = buildURI(uri, payload)  # line not used
    print(uri)
    #PreSign = timestamp + uri2 + Method + contentHash  # + subaccountId
    PreSign = timestamp + uri + Method + contentHash  # + subaccountId
    print(PreSign)
    Signature = hmac.new(apisecret, PreSign.encode(), hashlib.sha512).hexdigest()
    headers = {
        'Api-Key': apikey,
        'Api-Timestamp': timestamp,
        'Api-Content-Hash': contentHash,
        'Api-Signature': Signature
    }
    r = requests.post(uri, json=payload, headers=headers, timeout=11)
    print(r.json())
    return json.loads(r.content)

NewOrder('HEDG', 1.5, 0.00021)
{u'code': u'INVALID_CONTENT_HASH'}
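A likely cause of that INVALID_CONTENT_HASH (my reading; the thread itself leaves it unresolved): the hash is computed over json.dumps(payload, separators=(',', ':')), but requests.post(..., json=payload) re-serializes the payload with requests' default separators, so the bytes on the wire are not the bytes that were hashed. Hashing and sending the very same string avoids the mismatch:

```python
import hashlib
import json

payload = {'marketSymbol': 'BTC-HEDG', 'direction': 'BUY'}

# What was hashed: a compact serialization...
hashed_body = json.dumps(payload, separators=(',', ':'))
# ...but requests' json= keyword serializes with default separators,
# producing different bytes (e.g. '{"a":1}' vs '{"a": 1}'):
sent_body = json.dumps(payload)
print(hashed_body != sent_body)  # True

# Fix: hash the exact string you send, and send it as the raw body.
content_hash = hashlib.sha512(hashed_body.encode()).hexdigest()
# r = requests.post(uri, data=hashed_body,
#                   headers=dict(headers, **{'Content-Type': 'application/json'}),
#                   timeout=11)
```

The commented requests.post call shows the shape of the fix against the code above; uri and headers are the variables from that snippet.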
Bittrex API via requests package PYTHON
import hmac
import hashlib
import time, requests

nonce = str(int(time.time() * 1000))
content_hash = hashlib.sha512(''.encode()).hexdigest()
signature = hmac.new(
    '<SECRET_KEY>'.encode(),
    ''.join([nonce, url, 'GET', content_hash]).encode(),
    hashlib.sha512
).hexdigest()
headers = {
    'Api-Timestamp': nonce,
    'Api-Key': '<API_KEY>',
    'Content-Type': 'application/json',
    'Api-Content-Hash': content_hash,
    'Api-Signature': signature
}
result = requests.get(url=url, headers=headers)

AWS boto3 unable to put tags after creating an AMI

I'm trying to add tags after creating an AMI from an instance using boto3, and I'm getting an error:
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "TagSpecifications", must be one of:
BlockDeviceMappings, Description, DryRun, InstanceId, Name, NoReboot
Here is my code; can you please check what I'm doing wrong?
It works for snapshots but fails for images.
import xlrd
import boto3
import datetime

client = boto3.client('ec2')
# Give the location of the file
loc = ("/Users/user1/Documents/aws-python/aws-tag-test (1).xlsx")
# To open Workbook
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
# For row 0 and column 0
#print (sheet.cell_value(0, 0))
nowtime = datetime.datetime.now()
nowdate = (nowtime.strftime("%Y-%m-%d %H-%M"))
print(nowdate)
#print (nowtime)
server_ids = []
instancename = []
for i in range(1, sheet.nrows):
    server_ids.append(sheet.cell_value(i, 1))
    instancename.append(sheet.cell_value(i, 0))
    #print (sheet.cell_value(i, 1))
# excel closed
for i in range(len(server_ids)):
    print(server_ids[i], instancename[i])
    response = client.create_image(
        Description='ami ' + instancename[i] + ' ' + str(nowdate),
        InstanceId=server_ids[i],
        Name='ami ' + instancename[i] + ' ' + str(nowdate),
        NoReboot=True,
        DryRun=False,
        TagSpecifications=[
            {
                'ResourceType': 'image',
                'Tags': [
                    {
                        'Key': 'Name',
                        'Value': 'ami-' + instancename[i] + '-' + str(nowdate)
                    },
                    {
                        'Key': 'date',
                        'Value': datetime.datetime.now().strftime("%Y-%m-%d")
                    }
                ]
            },
        ]
    )
    print(response)
Really appreciate your help.
Yes, it is now available. Not sure when, but it was definitely added sometime after the original comments.

AWS Lambda is writing wrong output in the CloudWatch metrics

I'm new to DevOps and coding. I'm building a monitoring tool (Grafana) with CloudWatch and Lambda.
I have code that is not working properly. It pings a server: if the server returns 200, the function should push 0 to the metric, and when the site is down it should push 1. But when I tell write_metric to write 1, it writes 100 instead; any value less than 100 just posts as 100, while values greater than 100 post fine.
Here is the code:
import boto3
import urllib2

def write_metric(value, metric):
    d = boto3.client('cloudwatch')
    d.put_metric_data(Namespace='WebsiteStatus',
                      MetricData=[
                          {
                              'MetricName': metric,
                              'Dimensions': [
                                  {
                                      'Name': 'Status',
                                      'Value': 'WebsiteStatusCode',
                                  },
                              ],
                              'Value': value,
                          },
                      ])

def check_site(url, metric):
    STAT = 1
    print("Checking %s " % url)
    request = urllib2.Request("https://" + url)
    try:
        response = urllib2.urlopen(request)
        response.close()
    except urllib2.URLError as e:
        if hasattr(e, 'code'):
            print("[Error:] Connection to %s failed with code: " % url + str(e.code))
            STAT = 100
            write_metric(STAT, metric)
        if hasattr(e, 'reason'):
            print("[Error:] Connection to %s failed with code: " % url + str(e.reason))
            STAT = 100
            write_metric(STAT, metric)
    except urllib2.HTTPError as e:
        if hasattr(e, 'code'):
            print("[Error:] Connection to %s failed with code: " % url + str(e.code))
            STAT = 100
            write_metric(STAT, metric)
        if hasattr(e, 'reason'):
            print("[Error:] Connection to %s failed with code: " % url + str(e.reason))
            STAT = 100
            write_metric(STAT, metric)
        print('HTTPError!!!')
    if STAT != 100:
        STAT = response.getcode()
    return STAT

def lambda_handler(event, context):
    websiteurls = [
        "website.com"
    ]
    metricname = 'SiteAvailability'
    for site in websiteurls:
        r = check_site(site, metricname)
        if r == 200:
            print("Site %s is up" % site)
            write_metric(0, metricname)
        else:
            print("[Error:] Site %s down" % site)
            write_metric(1, metricname)
These lines:
STAT = 100
write_metric(STAT, metric)
will always send 100 as your value: every failure path inside check_site calls write_metric with STAT hard-coded to 100, before lambda_handler ever gets the chance to write the intended 1.
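A minimal sketch of a fix along those lines (names and structure are mine, keeping the 0 = up / 1 = down convention from the question): let check_site only return a status and do the single write_metric call in the handler, so the hard-coded 100 never reaches CloudWatch.

```python
import urllib.request

def availability_value(http_status):
    """Map an HTTP status (or None on failure) to the 0/1 availability metric."""
    return 0 if http_status == 200 else 1

def check_site(url):
    """Return the HTTP status code, or None if the request fails; writes nothing."""
    try:
        with urllib.request.urlopen("https://" + url, timeout=10) as resp:
            return resp.getcode()
    except Exception:
        return None

def lambda_handler(event, context):
    import boto3  # imported here so the pure helpers above need no AWS deps
    cw = boto3.client('cloudwatch')
    for site in ["website.com"]:
        value = availability_value(check_site(site))
        # exactly one datapoint per site per invocation
        cw.put_metric_data(
            Namespace='WebsiteStatus',
            MetricData=[{'MetricName': 'SiteAvailability', 'Value': value}],
        )
```

Because only one put_metric_data call runs per site, the metric can only ever contain 0 or 1.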

Kinesis Firehose with Lambda decorator getting throttled

I am using Firehose with a Lambda decorator to ingest VPC Flow Logs into Redshift
(VPC Flow Logs -> Kinesis Data Stream -> Kinesis Firehose -> Lambda decorator -> Redshift). The volume of traffic is high, which causes the Lambda to error out with "task timed out" when re-ingesting unprocessed records back into Firehose. The Lambda has the maximum timeout and 3 GB of memory.
I believe the issue is related to Lambda's 6 MB payload size limit. Is there a way to batch or reduce the payload to ensure the function doesn't error out? Thanks in advance.
import base64
import json
import gzip
import StringIO
import boto3
import datetime

def transformLogEvent(log_event):
    version = log_event['extractedFields']['version']
    accountid = log_event['extractedFields']['account_id']
    interfaceid = log_event['extractedFields']['interface_id']
    srcaddr = log_event['extractedFields']['srcaddr']
    dstaddr = log_event['extractedFields']['dstaddr']
    srcport = log_event['extractedFields']['srcport']
    dstport = log_event['extractedFields']['dstport']
    protocol = log_event['extractedFields']['protocol']
    packets = log_event['extractedFields']['packets']
    bytes = log_event['extractedFields']['bytes']
    starttime = datetime.datetime.fromtimestamp(int(log_event['extractedFields']['start'])).strftime('%Y-%m-%d %H:%M:%S')
    endtime = datetime.datetime.fromtimestamp(int(log_event['extractedFields']['end'])).strftime('%Y-%m-%d %H:%M:%S')
    action = log_event['extractedFields']['action']
    logstatus = log_event['extractedFields']['log_status']
    row = '"' + str(version) + '"' + "," + '"' + str(accountid) + '"' + "," + '"' + str(interfaceid) + '"' + "," + '"' + str(srcaddr) + '"' + "," + '"' + str(dstaddr) + '"' + "," + '"' + str(srcport) + '"' + "," + '"' + str(dstport) + '"' + "," + '"' + str(protocol) + '"' + "," + '"' + str(packets) + '"' + "," + '"' + str(bytes) + '"' + "," + '"' + str(starttime) + '"' + "," + '"' + str(endtime) + '"' + "," + '"' + str(action) + '"' + "," + '"' + str(logstatus) + '"' + "\n"
    #print(row)
    return row

def processRecords(records):
    for r in records:
        data = base64.b64decode(r['data'])
        striodata = StringIO.StringIO(data)
        try:
            with gzip.GzipFile(fileobj=striodata, mode='r') as f:
                data = json.loads(f.read())
        except IOError:
            # likely the data was re-ingested into firehose
            pass
        recId = r['recordId']
        # re-ingested data into firehose
        if type(data) == str:
            yield {
                'data': data,
                'result': 'Ok',
                'recordId': recId
            }
        elif data['messageType'] != 'DATA_MESSAGE':
            yield {
                'result': 'ProcessingFailed',
                'recordId': recId
            }
        else:
            data = ''.join([transformLogEvent(e) for e in data['logEvents']])
            #print(data)
            data = base64.b64encode(data)
            yield {
                'data': data,
                'result': 'Ok',
                'recordId': recId
            }

def putRecords(streamName, records, client, attemptsMade, maxAttempts):
    failedRecords = []
    codes = []
    errMsg = ''
    try:
        response = client.put_record_batch(DeliveryStreamName=streamName, Records=records)
    except Exception as e:
        failedRecords = records
        errMsg = str(e)
    # if there are no failedRecords (put_record_batch succeeded), iterate over the response to gather results
    if not failedRecords and response['FailedPutCount'] > 0:
        for idx, res in enumerate(response['RequestResponses']):
            if not res['ErrorCode']:
                continue
            codes.append(res['ErrorCode'])
            failedRecords.append(records[idx])
        errMsg = 'Individual error codes: ' + ','.join(codes)
    if len(failedRecords) > 0:
        if attemptsMade + 1 < maxAttempts:
            print('Some records failed while calling PutRecords, retrying. %s' % (errMsg))
            putRecords(streamName, failedRecords, client, attemptsMade + 1, maxAttempts)
        else:
            raise RuntimeError('Could not put records after %s attempts. %s' % (str(maxAttempts), errMsg))

def handler(event, context):
    streamARN = ''
    region = ''
    streamName = ''
    records = list(processRecords(event['records']))
    projectedSize = 0
    recordsToReingest = []
    for idx, rec in enumerate(records):
        if rec['result'] == 'ProcessingFailed':
            continue
        projectedSize += len(rec['data']) + len(rec['recordId'])
        # 4000000 instead of 6291456 to leave ample headroom for the stuff we didn't account for
        if projectedSize > 4000000:
            recordsToReingest.append({
                'Data': rec['data']
            })
            records[idx]['result'] = 'Dropped'
            del(records[idx]['data'])
    if len(recordsToReingest) > 0:
        client = boto3.client('firehose', region_name=region)
        putRecords(streamName, recordsToReingest, client, attemptsMade=0, maxAttempts=20)
        print('Reingested %d records out of %d' % (len(recordsToReingest), len(event['records'])))
    else:
        print('No records to be reingested')
    return {"records": records}
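No answer was posted for this one, but since the question asks about batching: PutRecordBatch accepts at most 500 records and 4 MiB per call, so one common approach is to split the re-ingest list into count- and size-bounded chunks before each put_record_batch call. A sketch (the chunk_records helper is mine, not from the original code):

```python
def chunk_records(records, max_count=500, max_bytes=4 * 1024 * 1024):
    """Split a list of {'Data': ...} dicts into chunks that respect the
    PutRecordBatch limits (max_count records, max_bytes total per call)."""
    batches, current, current_size = [], [], 0
    for rec in records:
        size = len(rec['Data'])
        # start a new batch if adding this record would break either limit
        if current and (len(current) >= max_count or current_size + size > max_bytes):
            batches.append(current)
            current, current_size = [], 0
        current.append(rec)
        current_size += size
    if current:
        batches.append(current)
    return batches

# Each batch could then be passed to the existing retry helper, e.g.:
# for batch in chunk_records(recordsToReingest):
#     putRecords(streamName, batch, client, attemptsMade=0, maxAttempts=20)
```

This keeps every individual Firehose call within the service limits; it does not by itself fix the Lambda timeout, but smaller calls fail and retry at a finer granularity.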

Weird behavior with external class in django

I've got a weird problem here. It hurts my brain thinking about it.
I've got a Django project with multiple apps. Today I added another app.
(views.py)
import logging
from django.http import JsonResponse
from %app_name%.%class_file% import %class_name%

def api(request):
    t = %class_name%()
    data = {}
    data['listOtherDevices'] = t.listOtherDevices
    logger = logging.getLogger(__name__)
    logger.error(len(t.listOtherDevices))
    return JsonResponse(data)
The imported class fills the listOtherDevices array via __init__ perfectly fine when I run it in a console: there are exactly 3 elements in the array (as there are 3 devices in my LAN the class can find). So when I visit the URL (development server -> manage.py runserver) linked to this method, I see a JSON with exactly 3 entries. So far so good, but now comes the weird part: when I refresh the URL or visit it again, there are more than 3 entries. The scheme is like this:
opened url 1 time: 3 entries
opened url 2 times: 9 entries (+ 6)
opened url 3 times: 18 entries (+ 9)
opened url 4 times: 30 entries (+ 12)
opened url 5 times: 45 entries (+ 15)
opened url 6 times: 63 entries (+ 18)
I can see a pattern there but I cannot understand why this happens.
sudo sysdig -c spy_users
tells me that the class gathers information exactly 3 times using
subprocess.check_output
The responded JSON is syntactically OK; it simply looks as if the class had 'found' 9 or 18 devices.
Please help me, because as I said earlier: this makes my brain hurt :)
import json
import subprocess

class tradfri:
    tradfriIPv4 = 'blablabla'
    tradfriUser = 'blablabla'
    tradfriPassword = 'blablabla'
    pathToCoap = '/blablabla/coap-client'
    listDevices = []
    listGroups = []
    listDevicesDetails = []
    listGroupsDetails = []
    listLightbulbs = []
    listOtherDevices = []

    def __init__(self):
        self.getDevices()
        self.getGroups()
        self.getArrayLightbulbs()
        self.getArrayOtherDevices()

    def getDevices(self):
        method = 'get'
        stdoutdata = subprocess.check_output(self.pathToCoap
                                             + ' -m ' + method
                                             + ' -u "' + self.tradfriUser + '"'
                                             + ' -k "' + self.tradfriPassword + '"'
                                             + ' coaps://' + self.tradfriIPv4 + ':5684/15001'
                                             + " | awk 'NR==4'",
                                             shell=True).decode("utf-8")
        self.listDevices = json.loads(stdoutdata)
        for ID in self.listDevices:
            stdoutdata = subprocess.check_output(self.pathToCoap
                                                 + ' -m ' + method
                                                 + ' -u "' + self.tradfriUser + '"'
                                                 + ' -k "' + self.tradfriPassword + '"'
                                                 + ' coaps://' + self.tradfriIPv4 + ':5684/15001/' + str(ID)
                                                 + " | awk 'NR==4'",
                                                 shell=True).decode("utf-8")
            self.listDevicesDetails.append(json.loads(stdoutdata))

    def getGroups(self):
        method = 'get'
        stdoutdata = subprocess.check_output(self.pathToCoap
                                             + ' -m ' + method
                                             + ' -u "' + self.tradfriUser + '"'
                                             + ' -k "' + self.tradfriPassword + '"'
                                             + ' coaps://' + self.tradfriIPv4 + ':5684/15004'
                                             + " | awk 'NR==4'",
                                             shell=True).decode("utf-8")
        self.listGroups = json.loads(stdoutdata)
        for ID in self.listGroups:
            stdoutdata = subprocess.check_output(self.pathToCoap
                                                 + ' -m ' + method
                                                 + ' -u "' + self.tradfriUser + '"'
                                                 + ' -k "' + self.tradfriPassword + '"'
                                                 + ' coaps://' + self.tradfriIPv4 + ':5684/15004/' + str(ID)
                                                 + " | awk 'NR==4'",
                                                 shell=True).decode("utf-8")
            raw = json.loads(stdoutdata)
            tmpMembers = []
            for id in raw['9018']['15002']['9003']:
                tmpMembers.append({'ID': str(id), 'name': self.getDeviceNameByID(id)})
            self.listGroupsDetails.append({'ID': str(raw['9003']),
                                           'name': raw['9001'],
                                           'isGroupOn': False,
                                           'members': tmpMembers})

    def getArrayLightbulbs(self):
        for item in self.listDevicesDetails:
            if item['3']['6'] == 1:  # is lightbulb
                id = item['9003']
                name = item['9001']
                groupID = self.getGroupIDByID(id)
                groupName = self.getGroupNameByID(id)
                manufacturer = item['3']['0']
                description = item['3']['1']
                isReachable = True
                isBulbOn = False
                isDimmable = False
                isWhiteSpectrum = False
                isColorSpectrum = False
                brightnessOfBulb = ''
                currentColor = ''
                #isReachable
                if len(item['3311'][0]) == 1:
                    isReachable = False
                else:
                    #isBulbOn
                    if item['3311'][0]['5850'] == 1:
                        isBulbOn = True
                    #dimmable & brightnessOfBulb
                    if '5851' in item['3311'][0]:
                        brightnessOfBulb = str(item['3311'][0]['5851'])
                        isDimmable = True
                    #currentColor
                    if '5706' in item['3311'][0]:
                        currentColor = item['3311'][0]['5706']
                #isWhiteSpectrum
                if ' WS ' in description:
                    isWhiteSpectrum = True
                #isColorSpectrum
                if ' CWS ' in description:
                    isWhiteSpectrum = True
                    isColorSpectrum = True
                self.listLightbulbs.append({'ID': str(id),
                                            'Name': name,
                                            'groupID': str(groupID),
                                            'groupName': groupName,
                                            'manufacturer': manufacturer,
                                            'description': description,
                                            'isReachable': isReachable,
                                            'isBulbOn': isBulbOn,
                                            'isDimmable': isDimmable,
                                            'isWhiteSpectrum': isWhiteSpectrum,
                                            'isColorSpectrum': isColorSpectrum,
                                            'brightnessOfBulb': brightnessOfBulb,
                                            'currentColor': currentColor})

    def getArrayOtherDevices(self):
        for device in self.listDevicesDetails:
            if device['3']['6'] == 3:
                self.listOtherDevices.append({'ID': str(device['9003']),
                                              'Name': device['9001'],
                                              'groupID': str(self.getGroupIDByID(str(device['9003']))),
                                              'groupName': self.getGroupNameByID(str(device['9003'])),
                                              'manufacturer': device['3']['0'],
                                              'description': device['3']['1']})

    def getDeviceNameByID(self, id):
        name = ''
        for key in self.listDevicesDetails:
            if key['9003'] == id:
                name = key['9001']
        return name

    def getGroupIDByID(self, id):
        groupID = ''
        for group in self.listGroupsDetails:
            for member in group['members']:
                if member['ID'] == id:
                    groupID = group['ID']
        return groupID

    def getGroupNameByID(self, id):
        groupName = ''
        for group in self.listGroupsDetails:
            for member in group['members']:
                if member['ID'] == id:
                    groupName = group['name']
        return groupName
You use class-wide attributes to store the data.
Whenever you create a new instance of tradfri, your methods work on the same class-wide attribute listOtherDevices. Note that the class lives in memory until you restart the server, so the number of values grows with each request as you append to the list. (That also explains the exact numbers: on request n, listDevicesDetails already holds 3(n-1) entries and getDevices appends 3 more, so getArrayOtherDevices appends another 3n entries, giving a running total of 3n(n+1)/2: 3, 9, 18, 30, 45, 63, ...)
You should use attributes that exist per instance. In Python this is achieved by initializing the attributes inside __init__(). It might look like this:
class tradfri:
    def __init__(self):
        self.tradfriIPv4 = 'blablabla'
        self.tradfriUser = 'blablabla'
        self.tradfriPassword = 'blablabla'
        self.pathToCoap = '/blablabla/coap-client'
        self.listDevices = []
        self.listGroups = []
        self.listDevicesDetails = []
        self.listGroupsDetails = []
        self.listLightbulbs = []
        self.listOtherDevices = []
        self.getDevices()
        self.getGroups()
        self.getArrayLightbulbs()
        self.getArrayOtherDevices()
Read python class attribute for more details. The official documentation on that topic is also worth reading.
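The difference can be demonstrated with a minimal self-contained example (class names are mine):

```python
class SharedList:
    items = []              # class attribute: one list shared by ALL instances

class PerInstanceList:
    def __init__(self):
        self.items = []     # instance attribute: a fresh list per object

a = SharedList()
a.items.append('device')
b = SharedList()
print(len(b.items))         # 1 -- b sees what a appended, like the view did

c = PerInstanceList()
c.items.append('device')
d = PerInstanceList()
print(len(d.items))         # 0 -- each instance starts empty
```

This is exactly the behavior in the question: each request created a new instance, but every instance appended to the one shared class-level list.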