HTTP request to GCP BigQuery via Cloud Function with Multiple Tables - google-cloud-platform

I'm looking to create an HTTP request to BigQuery via a Cloud Function within GCP, where the request passes a value, that value is passed to the query, and another value is returned from a joined table. The SQL works in BQ, but I'm having issues getting a value returned when I apply it to the Cloud Function.
Here's the curl I'm using as well:
https://us-central1-something.cloudfunctions.net/something/something2?client_Id=823754783.2
Thanks in advance.
from flask import escape
from google.cloud import bigquery

def cors_enabled_function(request):
    if request.method == 'OPTIONS':
        headers = {
            'Access-Control-Allow-Origin': '*',
            'Access-Control-Allow-Methods': 'GET',
            'Access-Control-Allow-Headers': 'Content-Type',
            'Access-Control-Max-Age': '3600'
        }
        return ('', 204, headers)

client = bigquery.Client()

def something2(request):
    request_json = request.get_json(silent=True)
    request_args = request_args
    table = ['`audience-cookie.ecommerce.traffic` as traffic JOIN `audience-cookie.ecommerce.cardholder` as cardholder ON traffic.customerId = cardholder.customerId']
    QUERY = ('SELECT '+cardholder+' from `'+table+'` WHERE client_Id='+client_Id)
    try:
        query_job = client.query(QUERY)
        rows = query_job.result()
        row_list = []
        for row in rows:
            row_list.append(str(row[cardholder]))
        return("<p>"+"</p><p>".join(row(row_list) + "</p>"))
    except e:
        return(e)

I recommend printing your query. There are several mistakes:
Why are you using [] in the table variable definition?
You use backquotes ` both in table and when you insert table into the query, so you end up with double backquotes.
client_Id isn't defined in your code.
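For illustration, here is a minimal sketch of how the function could look once those issues are fixed. It assumes the cardholder table has a column literally named cardholder to return, that client_Id lives on the traffic table, and that it is a string passed as a query-string parameter; all of these are assumptions, since the original code doesn't say.
from google.cloud import bigquery

client = bigquery.Client()

def something2(request):
    # Read the value from the query string, e.g. ?client_Id=823754783.2
    client_id = request.args.get('client_Id')
    query = """
        SELECT cardholder.cardholder
        FROM `audience-cookie.ecommerce.traffic` AS traffic
        JOIN `audience-cookie.ecommerce.cardholder` AS cardholder
          ON traffic.customerId = cardholder.customerId
        WHERE traffic.client_Id = @client_id
    """
    # Parameterized query: avoids both the quoting problems and SQL injection
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("client_id", "STRING", client_id)]
    )
    try:
        rows = client.query(query, job_config=job_config).result()
        return "".join("<p>{}</p>".format(row[0]) for row in rows)
    except Exception as e:
        return str(e), 500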

Related

How can I save the output of an API call to DynamoDB?

I am calling a REST API from a Lambda function in AWS. I receive the response in the terminal. How can I store these API responses in DynamoDB? I am sorry if this may seem naive to some. I am a beginner and self-learner.
Here is a full working example for a Python-based Lambda function:
# Import AWS (Python version) SDK
import boto3

def lambda_handler(event, context):
    # Construct the client to the DynamoDB service
    client = boto3.client('dynamodb')
    db = boto3.resource('dynamodb')
    table_name = 'my-awesome-table'
    # Table instance
    table = db.Table(table_name)
    # Example data to query an entry from the table
    primary_column_name = 'email'
    primary_key = 'john@gmail.com'
    # We get an object for the retrieved table entry
    table_entry = get_table_item(table, primary_column_name, primary_key)
    # Example data to put into the table
    new_entry = {
        'email': 'jack@outlook.com',
        'name': 'Jack Ma',
        'age': 56
    }
    put_table_item(table, new_entry)

# Function to retrieve an entry from the DynamoDB table
def get_table_item(table, primary_column_name, primary_key):
    response = table.get_item(
        Key={
            primary_column_name: primary_key
        }
    )
    return response['Item']

# Function to put an entry into the DynamoDB table
def put_table_item(table, item):
    response = table.put_item(
        Item=item
    )
    # If there were errors, throw an exception
    if (response['ResponseMetadata']['HTTPStatusCode'] != 200):
        raise Exception('Failed to update table!')
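To connect this back to the original question of saving an API response, here is a minimal sketch that calls a REST API and puts its JSON body into the table. The URL, table name, and key attribute value are placeholders, and the requests library must be bundled with the Lambda deployment package or a layer.
import boto3
import requests  # not in the base Lambda runtime; bundle it with the deployment package or a layer

def lambda_handler(event, context):
    table = boto3.resource('dynamodb').Table('my-awesome-table')   # placeholder table name
    # Call the REST API and parse its JSON body
    api_response = requests.get('https://api.example.com/data')    # placeholder URL
    item = api_response.json()
    # The item must contain the table's primary key attribute;
    # note that float values need to be converted to Decimal before writing to DynamoDB
    item['email'] = 'caller@example.com'                           # placeholder key value
    table.put_item(Item=item)
    return {'statusCode': 200}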

How do I run a SQL query via REST against QuestDB?

I'm using Python for some tests and I would like to be able to run a SQL query via REST. Is there an easy way to use requests to run queries like:
requests.get('http:myserver:9000/exec' query="select * from my_table")
If you need to use REST via Python, this can be done similar to the following example:
import requests
import json
host = 'http://myserver:9000'
sql_query = "select * from my_table limit 100"
query_params = {'query': sql_query}
try:
response = requests.post(host + '/exec', params=query_params)
json_response = json.loads(response.text)
rows = json_response['dataset']
for row in rows:
print(row)
except requests.exceptions.RequestException as e:
print("Error: %s" % (e))
There is additional documentation for this on the QuestDB REST docs page.
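If you prefer the requests.get style from the question, the same /exec endpoint can also be called with a GET and the body parsed with response.json(); a quick sketch, assuming the same host and table as above:
import requests

host = 'http://myserver:9000'
try:
    response = requests.get(host + '/exec', params={'query': 'select * from my_table limit 100'})
    for row in response.json()['dataset']:
        print(row)
except requests.exceptions.RequestException as e:
    print("Error: %s" % e)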

AWS Cloudwatch Logs to Azure Log Analytics

I am aware of the HTTP Data Collector API that can be used to pull data into Azure Log Analytics; my ask here is about getting AWS CloudWatch data to Azure. We have an Azure-hosted application and external AWS-hosted serverless Lambda functions, and we want to import the logs of those 13 serverless functions into Azure. I know from the documentation that there is a Python function that can be used as an AWS Lambda function, and the Python example is in the MSFT documentation. But what I am failing to understand is what JSON format the AWS collector Lambda needs to create so it can send the data to Azure Log Analytics. Any examples of this? Any help on how this can be done? I have come across this blog also, but that is Splunk specific: https://www.splunk.com/blog/2017/02/03/how-to-easily-stream-aws-cloudwatch-logs-to-splunk.html
Never mind, I was able to dig a little deeper and found that in AWS I can stream the logs from one Lambda to another Lambda function through a subscription. Once that was set, all I did was consume that, create the JSON on the fly, and send it to Azure Logs. In case you or anyone else is interested, the code follows:
import json
import datetime
import hashlib
import hmac
import base64
import boto3
import datetime
import gzip
from botocore.vendored import requests
from datetime import datetime

# Update the customer ID to your Log Analytics workspace ID
customer_id = "XXXXXXXYYYYYYYYYYYYZZZZZZZZZZ"
# For the shared key, use either the primary or the secondary Connected Sources client authentication key
shared_key = "XXXXXXXXXXXXXXXXXXXXXXXXXX"
# The log type is the name of the event that is being submitted
log_type = 'AWSLambdafuncLogReal'

json_data = [{
    "slot_ID": 12345,
    "ID": "5cdad72f-c848-4df0-8aaa-ffe033e75d57",
    "availability_Value": 100,
    "performance_Value": 6.954,
    "measurement_Name": "last_one_hour",
    "duration": 3600,
    "warning_Threshold": 0,
    "critical_Threshold": 0,
    "IsActive": "true"
},
{
    "slot_ID": 67890,
    "ID": "b6bee458-fb65-492e-996d-61c4d7fbb942",
    "availability_Value": 100,
    "performance_Value": 3.379,
    "measurement_Name": "last_one_hour",
    "duration": 3600,
    "warning_Threshold": 0,
    "critical_Threshold": 0,
    "IsActive": "false"
}]
#body = json.dumps(json_data)

#####################
######Functions######
#####################

# Build the API signature
def build_signature(customer_id, shared_key, date, content_length, method, content_type, resource):
    x_headers = 'x-ms-date:' + date
    string_to_hash = method + "\n" + str(content_length) + "\n" + content_type + "\n" + x_headers + "\n" + resource
    bytes_to_hash = bytes(string_to_hash, encoding="utf-8")
    decoded_key = base64.b64decode(shared_key)
    encoded_hash = base64.b64encode(
        hmac.new(decoded_key, bytes_to_hash, digestmod=hashlib.sha256).digest()).decode()
    authorization = "SharedKey {}:{}".format(customer_id, encoded_hash)
    return authorization

# Build and send a request to the POST API
def post_data(customer_id, shared_key, body, log_type):
    method = 'POST'
    content_type = 'application/json'
    resource = '/api/logs'
    rfc1123date = datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    print(rfc1123date)
    content_length = len(body)
    signature = build_signature(customer_id, shared_key, rfc1123date, content_length, method, content_type, resource)
    uri = 'https://' + customer_id + '.ods.opinsights.azure.com' + resource + '?api-version=2016-04-01'
    headers = {
        'content-type': content_type,
        'Authorization': signature,
        'Log-Type': log_type,
        'x-ms-date': rfc1123date
    }
    response = requests.post(uri, data=body, headers=headers)
    if (response.status_code >= 200 and response.status_code <= 299):
        print("Accepted")
    else:
        print("Response code: {}".format(response.status_code))
        print(response.text)

def lambda_handler(event, context):
    cloudwatch_event = event["awslogs"]["data"]
    decode_base64 = base64.b64decode(cloudwatch_event)
    decompress_data = gzip.decompress(decode_base64)
    log_data = json.loads(decompress_data)
    print(log_data)
    awslogdata = json.dumps(log_data)
    post_data(customer_id, shared_key, awslogdata, log_type)
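For a quick local test of the Data Collector call outside of the CloudWatch trigger, the sample json_data defined above can be posted directly; this is just a sketch reusing the values from the snippet (it needs botocore installed locally, because the snippet imports requests from botocore.vendored).
# Local test sketch: posts the sample json_data defined above to the workspace.
# Assumes customer_id, shared_key, log_type, json_data and post_data() from the code above.
if __name__ == "__main__":
    body = json.dumps(json_data)
    post_data(customer_id, shared_key, body, log_type)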

How should I set up the integration request for a Lambda function in API Gateway?

I have the below Lambda function, which is invoked by API Gateway:
import sys
import logging
import pymysql
import json

rds_host = "rds.amazonaws.com"
name = "name"
password = "pass"
db_name = "DB"
port = 3306

def save_events(event):
    result = []
    conn = pymysql.connect(rds_host, user=name, passwd=password, db=db_name,
                           connect_timeout=30)
    Bid = pymysql.escape_string("3")
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute("select exid,exercise_name,image from exercise where bid = 3")
        result = cur.fetchall()
        cur.close()
    print("Data from RDS...")
    print(result)
    workout = json.dumps(result)
    workouts = (workout.replace("\"", "'"))

def lambda_handler(event, context):
    save_events(event)
    return workouts
Now, in the Integration Request of the GET method in API Gateway, what should I add in the mapping template to get data in JSON format from the user, and how can I pass that user value into the query (e.g. select exid,exercise_name,image from exercise where bid = "user supplied value")? I am a beginner to AWS and backend development. Thanks in advance.
You can pass a query string argument. For that, you have to define the query string parameter in the Integration Request and, in the mapping template (content type application/json), map the parameter so it is available in the Lambda code.
cur.execute("select exid,exercise_name,image from exercise where bid = 3")
Replace this line with a parameterized version that reads the mapped value from the event:
cur.execute("select exid,exercise_name,image from exercise where bid = %s", (event['params']['querystring']['parameter_name'],))
For more information, check this:
pass parameter to lambda from api gateway
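Putting the pieces together, here is a minimal sketch of the handler side. It assumes the mapping template exposes the query string as event['params']['querystring'] and that the parameter is named bid; both names are assumptions that depend on how you write the template.
import json
import pymysql

rds_host = "rds.amazonaws.com"

def lambda_handler(event, context):
    # 'bid' is a hypothetical query string parameter name mapped by the template
    bid = event['params']['querystring']['bid']
    conn = pymysql.connect(host=rds_host, user="name", passwd="pass", db="DB", connect_timeout=30)
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        # Parameterized query: the driver escapes the user-supplied value
        cur.execute("select exid,exercise_name,image from exercise where bid = %s", (bid,))
        result = cur.fetchall()
    conn.close()
    return json.dumps(result)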

java.sql.SQLExceptionPyRaisable on the second attempt connecting to Athena using Django

I am using the Python module called PyAthenaJDBC in order to query Athena using the provided JDBC driver.
Here is the link: https://pypi.python.org/pypi/PyAthenaJDBC/
I have been facing a persistent issue. I keep getting this Java error whenever I use the Athena connection twice in a row.
As a matter of fact, I was able to connect to Athena, show databases, create new tables, and even query the content. I am building an application using Django and running its server to use Athena.
However, I am obliged to restart the server in order for the Athena connection to work once again.
Here is a glimpse of the class I have built:
import os
import configparser
import pyathenajdbc

#Get aws credentials for the moment
aws_config_file = '~/.aws/config'
Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))
access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
athena_jdbc_driver_path = BASE_DIR + "/lib/static/AthenaJDBC.jar"
log_path = BASE_DIR + "/lib/static/queries.log"

class PyAthenaLoader():
    def __init__(self):
        pyathenajdbc.ATHENA_JAR = athena_jdbc_driver_path

    def connecti(self):
        self.conn = pyathenajdbc.connect(
            s3_staging_dir="s3://aws-athena-query-results--us-west-2",
            access_key=access_key_id,
            secret_key=secret_key_id,
            #profile_name = "default",
            #credential_file = aws_config_file,
            region_name="us-west-2",
            log_path=log_path,
            driver_path=athena_jdbc_driver_path
        )

    def databases(self):
        dbs = self.query("show databases;")
        return dbs

    def tables(self, database):
        tables = self.query("show tables in {0};".format(database))
        return tables

    def create(self):
        self.connecti()
        try:
            with self.conn.cursor() as cursor:
                cursor.execute(
                    """CREATE EXTERNAL TABLE IF NOT EXISTS sales4 (
                    Day_ID date,
                    Product_Id string,
                    Store_Id string,
                    Sales_Units int,
                    Sales_Cost float,
                    Currency string
                    )
                    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
                    WITH SERDEPROPERTIES (
                    'serialization.format' = '|',
                    'field.delim' = '|',
                    'collection.delimm' = 'undefined',
                    'mapkey.delim' = 'undefined'
                    ) LOCATION 's3://athena-internship/';
                    """)
                res = cursor.description
        finally:
            self.conn.close()
        return res

    def query(self, req):
        self.connecti()
        try:
            with self.conn.cursor() as cursor:
                cursor.execute(req)
                print(cursor.description)
                res = cursor.fetchall()
        finally:
            self.conn.close()
        return res

    def info(self):
        res = []
        for i in dir(pyathenajdbc):
            temp = i + ' = ' + str(dic[i])
            #print(temp)
            res.append(temp)
        return res
Example of usage:
def test(request):
    athena = jdbc.PyAthenaLoader()
    res = athena.query('Select * from sales;')
    return render(request, 'test.html', {'data': res})
Works just fine!
However, refreshing the page causes the error shown in the traceback below.
Note that I am using a local .jar file; I thought that would solve the issue, but I was wrong.
Even if I remove the path of the JDBC driver and let the module download it from S3, the error persists:
File "/home/tewfikghariani/.virtualenvs/venv/lib/python3.4/site-packages/pyathenajdbc/connection.py", line 69, in init
ATHENA_CONNECTION_STRING.format(region=self.region_name, schema=schema_name), props)
jpype._jexception.java.sql.SQLExceptionPyRaisable:
java.sql.SQLException: No suitable driver found for
jdbc:awsathena://athena.us-west-2.amazonaws.com:443/hive/default/
Furthermore, when I run the module on its own, it works just fine.
When I set up multiple connections inside my view before rendering the template, that works just fine as well.
I guess the issue is related to the Django view: once one of the views performs a connection to Athena, the next connection is no longer possible and the error is raised unless I restart the server.
Any help? If other details are missing I will provide them immediately.
Update:
After posting the issue on GitHub, the author solved this problem and released a new version that works perfectly.
It was a multi-threading problem with JPype.
Question answered!
Ref: https://github.com/laughingman7743/PyAthenaJDBC/pull/8