AWS Quicksight parseJson function not working with redshift


How to fix parseJson function error in AWS Quicksight?
I have a json column in AWS redshift called discount_codes of type varchar. The data looks like this:
{'code': 'blabla', 'amount': '12.00', 'type': 'percentage'}
I want to have a separate column for 'code' in Quicksight. There is a function for this called parseJson. The formula should look like this:
parseJson({discount_codes}, "$.code")
Unfortunately it is not working and giving me the following error:
[Amazon](500310) Invalid operation: JSON parsing error Details: ----------------------------------------------- error: JSON parsing error code: 8001 context: invalid json object {'code': 'blabla', 'amount': '12.00', 'type': 'percentage'}
Any idea how to fix this?

I fixed it myself. The JSON in the column used single quotation marks, which is not valid JSON. I replaced them with double quotation marks. Now the data looks like this:
{"code": "blabla", "amount": "12.00", "type": "percentage"}
parseJson works now.
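If the rows have to be repaired before loading, the conversion can be sketched in Python. This is a minimal sketch that assumes the stored values are Python-style dict literals like the sample row; `to_valid_json` is a hypothetical helper name, not part of Quicksight or Redshift.

```python
import ast
import json


def to_valid_json(raw: str) -> str:
    """Convert a Python-style dict literal (single quotes) to valid JSON."""
    # ast.literal_eval parses literals only; it does not execute code.
    value = ast.literal_eval(raw)
    # json.dumps emits double-quoted keys and strings, as JSON requires.
    return json.dumps(value)


raw = "{'code': 'blabla', 'amount': '12.00', 'type': 'percentage'}"
fixed = to_valid_json(raw)
print(fixed)
```

Unlike a blind search-and-replace of `'` with `"`, this keeps apostrophes inside values intact.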

Related

Error when calling Lambda UDF from Redshift

When calling a Python Lambda UDF from my Redshift stored procedure, I get the following error. Any idea what could be wrong?
ERROR: Invalid External Function Response Detail:
-----------------------------------------------
error: Invalid External Function Response code:
8001 context: Extra rows in external function response query: 0
location: exfunc_data.cpp:330 process: padbmaster [pid=8842]
-----------------------------------------------
My Python Lambda UDF looks as follows.
import json
def lambda_handler(event, context):
    #...
    result = DoJob()
    #...
    ret = dict()
    ret['results'] = result
    ret_json = json.dumps(ret)
    return ret_json
The above Lambda function is associated with an external function in Redshift named send_email_lambda. The permissions and invocation work without any issues. I am calling the Lambda function as follows.
select send_email_lambda('sender#company.com',
'recipient1#company.com',
'sample body',
'sample subject');
Edit:
As requested, here is the event payload passed from Redshift to Lambda.
{
  "user":"awsuser",
  "cluster":"arn:aws:redshift:us-central-1:dummy:cluster:redshift-test-cluster",
  "database":"sample",
  "external_function":"lambda_send_email",
  "query_id":178044,
  "request_id":"20211b87-26c8-6d6a-a256-1a8568287feb",
  "arguments":[
    [
      "sender#company.com",
      "user1#company.com,user2#company.com",
      "<html><h1>Hello Therer</h1><p>A sample email from redshift. Take care and stay safe</p></html>",
      "Redshift email lambda UDF",
      "None",
      "None",
      "text/html"
    ]
  ],
  "num_records":1
}

It looks like a UDF can be passed multiple rows of data in one invocation, so it could receive a request to send multiple emails. The code needs to loop through each entry of the top-level "arguments" array, then extract the values from the inner array.
It then needs to return a "results" array that is the same length as the input array.
For your code, create an array with one entry per input row and return that inside the dictionary.
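A minimal sketch of that loop, assuming the payload shape shown in the question. `do_job` is a hypothetical stand-in for the real email logic, and the `success` and `num_records` fields reflect my understanding of the Redshift Lambda UDF response format, so check them against the AWS documentation.

```python
import json


def do_job(sender, recipients, body, subject, *extra):
    # Hypothetical stand-in for the real email-sending logic.
    return f"queued mail from {sender} to {recipients}"


def lambda_handler(event, context):
    results = []
    # event["arguments"] holds one inner list per input row.
    for row in event["arguments"]:
        results.append(do_job(*row))
    # Return exactly one result per input row, in the same order.
    return json.dumps({
        "success": True,
        "num_records": len(results),
        "results": results,
    })
```

The key point is that `results` is a list with the same length as `event["arguments"]`, which is what avoids a row-count mismatch such as "Extra rows in external function response".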

Unable to PUT to DynamoDB from a Python Lambda function

I'm using a Lambda function to get and put data to DynamoDB.
I am able to put a new item into my table, but when I try to pass anything other than the key I get the following error:
An error occurred (ValidationException) when calling the UpdateItem operation:
The provided key element does not match the schema
This works (with key only):
"body": "{\"TableName\":\"myExampleTableName\",\"Key\":{\"id\": {\"S\": \"SomeID\"}}}"
This throws an error (with key and some data):
"body": "{\"TableName\":\"myExampleTableName\",\"Key\":{\"id\": {\"S\": \"SomeID\"},\"Data\": {\"S\": \"MyDataExampleData\"}}}"
This seems to be the same syntax as the example in the documentation.
Does anybody see what I am doing wrong?
Here's my body in a more readable form:
{
  "TableName":"myExampleTableName",
  "Key":{
    "id": {"S": "SomeID"},
    "Data": {"S": "MyDataExampleData"}
  }
}

The "Data" field inside "Key" doesn't look right: "Key" may contain only the table's key attributes, which is why the error says the key element does not match the schema. It would be easier to diagnose with the actual code and the schema of the table you're writing to.
Here is a Python example that uses put_item, which takes "Item" rather than "Key" (I'm assuming you're using Python based on the doc you linked):
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.03.html
response = table.put_item(
    Item={
        'year': year,
        'title': title,
        'info': {
            'plot': plot,
            'rating': rating
        }
    }
)
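If you want to keep using UpdateItem instead, the non-key attribute has to move out of "Key" and into an update expression. The sketch below only builds the request parameters; `build_update_item_params` is a hypothetical helper, and it assumes "id" is the table's only key attribute, as in the question.

```python
def build_update_item_params(table_name, item_id, data):
    """Build UpdateItem parameters with only the key attribute inside Key."""
    return {
        "TableName": table_name,
        # Key holds the key attribute(s) only.
        "Key": {"id": {"S": item_id}},
        # Non-key attributes are set through an update expression.
        "UpdateExpression": "SET #d = :d",
        # The placeholder avoids clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#d": "Data"},
        "ExpressionAttributeValues": {":d": {"S": data}},
    }


params = build_update_item_params("myExampleTableName", "SomeID", "MyDataExampleData")
print(params)
```

These parameters could then be passed to the low-level client, for example `boto3.client('dynamodb').update_item(**params)`.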

Boto3 athena query without saving data to s3

I am trying to use boto3 to run a set of queries and don't want to save the data to s3. Instead I just want to get the results and want to work with those results. I am trying to do the following
import boto3
client = boto3.client('athena')
response = client.start_query_execution(
    QueryString='''SELECT * FROM mytable limit 10''',
    QueryExecutionContext={
        'Database': 'my_db'
    },
    ResultConfiguration={
        'OutputLocation': 's3://outputpath',
    }
)
print(response)
However, I don't want to pass ResultConfiguration because I don't want to write the results anywhere. If I remove the ResultConfiguration parameter, I get the following error:
botocore.exceptions.ParamValidationError: Parameter validation failed:
Missing required parameter in input: "ResultConfiguration"
So it seems that providing an S3 output location is mandatory. Is there a way to avoid this and get the results directly in the response?

The StartQueryExecution action does require an S3 output location. The ResultConfiguration parameter is mandatory.
An alternative way to query Athena is through the JDBC or ODBC drivers. You should probably use that method if you don't want to store results in S3.

You will have to specify an S3 temp bucket location whenever running the 'start_query_execution' command. However, you can get a result set (a dict) by running the 'get_query_results' method using the query id.
The response (dict) will look like this:
{
    'UpdateCount': 123,
    'ResultSet': {
        'Rows': [
            {
                'Data': [
                    {
                        'VarCharValue': 'string'
                    },
                ]
            },
        ],
        'ResultSetMetadata': {
            'ColumnInfo': [
                {
                    'CatalogName': 'string',
                    'SchemaName': 'string',
                    'TableName': 'string',
                    'Name': 'string',
                    'Label': 'string',
                    'Type': 'string',
                    'Precision': 123,
                    'Scale': 123,
                    'Nullable': 'NOT_NULL'|'NULLABLE'|'UNKNOWN',
                    'CaseSensitive': True|False
                },
            ]
        }
    },
    'NextToken': 'string'
}
For more information, see boto3 client doc: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html#Athena.Client.get_query_results
You can then delete all files in the S3 temp bucket you've specified.

You still need to provide an S3 location as temporary storage for Athena's results, even if you only want to process the data in Python. However, you can page through the results as tuples using the pagination API.
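The approach described in these answers can be sketched as follows: start the query, poll until it finishes, then page through the results. `run_athena_query` is a hypothetical helper, and the client is passed in so the function does not depend on a particular session setup.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run a query and return its rows as tuples (S3 output is still required)."""
    query_id = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Page through the result set; missing values become None.
    rows = []
    paginator = client.get_paginator("get_query_results")
    for page in paginator.paginate(QueryExecutionId=query_id):
        for row in page["ResultSet"]["Rows"]:
            rows.append(tuple(col.get("VarCharValue") for col in row["Data"]))
    return rows
```

With boto3 installed, a call might look like `run_athena_query(boto3.client('athena'), 'SELECT * FROM mytable limit 10', 'my_db', 's3://outputpath')`. For SELECT queries the first returned row normally holds the column names.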

Getting response from AWS Lambda function to AWS Lex bot is giving error?

I have created an AWS Lex bot and I am invoking a Lambda function from that bot. When testing the Lambda function on its own I get a proper response, but in the bot I get the error below:
An error has occurred: Received invalid response from Lambda: Can not
construct instance of IntentResponse: no String-argument
constructor/factory method to deserialize from String value
('2017-06-22 10:23:55.0') at [Source: "2017-06-22 10:23:55.0"; line:
1, column: 1]
I'm not sure what is wrong or what I am missing. Could anyone assist me, please?

The solution is to make sure the response returned by the Lambda function to the AWS Lex chat bot is in the following format:
{
  "sessionAttributes": {
    "key1": "value1",
    "key2": "value2"
    ...
  },
  "dialogAction": {
    "type": "ElicitIntent, ElicitSlot, ConfirmIntent, Delegate, or Close",
    Full structure based on the type field. See below for details.
  }
}
The chat bot expects a dialogAction and its corresponding elements in order to process the message, i.e. to build the IntentResponse.
Reference: http://docs.aws.amazon.com/lex/latest/dg/lambda-input-response-format.html

no String-argument constructor/factory method to deserialize from String value
You are getting this error because the Lambda function is returning a plain string value. The response has to be a JSON object that follows a predefined blueprint.
Communication between Lex and Lambda is not simple value passing as with ordinary functions. Amazon Lex expects output from Lambda in a particular JSON format, and data is sent to Lambda in a particular JSON format as well. The examples are in the AWS documentation: Lambda Function Input Event and Response Format.
Just copying and pasting the blueprint won't work, because in some fields you have to choose between predefined values and in others you have to enter valid input.
For example, in
"dialogAction": {
  "type": "Close",
  "fulfillmentState": "Fulfilled or Failed",
  "message": {
    "contentType": "PlainText or SSML",
    "content": "Thanks, your pizza has been ordered."
  }
}
you have to assign either "Fulfilled" or "Failed" to the 'fulfillmentState' field. The same goes for 'contentType'.
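As a minimal sketch of a handler that returns this structure instead of a bare string: the `close` helper name and the message text are illustrative assumptions, and the format is the one quoted in the answers above.

```python
def close(message, fulfillment_state="Fulfilled", session_attributes=None):
    """Build a 'Close' response in the format quoted above."""
    if fulfillment_state not in ("Fulfilled", "Failed"):
        raise ValueError("fulfillment_state must be 'Fulfilled' or 'Failed'")
    return {
        "sessionAttributes": session_attributes or {},
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": fulfillment_state,
            "message": {"contentType": "PlainText", "content": message},
        },
    }


def lambda_handler(event, context):
    # Returning only this string is what triggers the deserialization error.
    timestamp = "2017-06-22 10:23:55.0"
    # Wrap the value in the expected structure instead.
    return close(f"The time is {timestamp}", session_attributes=event.get("sessionAttributes"))
```

The handler returns a dictionary, which Lambda serializes to a JSON object, so Lex receives a structure rather than a string.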

Delimiter not found error - AWS Redshift Load from s3 using Kinesis Firehose

I am using Kinesis Firehose to transfer data to Redshift via S3.
I have a very simple CSV file that looks like the samples below. Firehose puts it into S3, but Redshift fails with a "Delimiter not found" error.
I have looked at every post I could find about this error, and I have made sure that the delimiter is included.
File
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:23:56.986397,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:02.061263,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:07.143044,848.78
GOOG,2017-03-16T16:00:01Z,2017-03-17 06:24:12.217930,848.78
OR
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:48:59.993260","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:07.034945","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:12.306484","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:18.020833","852.12"
"GOOG","2017-03-17T16:00:02Z","2017-03-18 05:49:24.203464","852.12"
Redshift Table
CREATE TABLE stockvalue
( symbol VARCHAR(4),
streamdate VARCHAR(20),
writedate VARCHAR(26),
stockprice VARCHAR(6)
);
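Error
[Screenshot of the Redshift load error; image not preserved]
Just in case, here's what my Kinesis stream looks like:
[Screenshot of the Firehose configuration; image not preserved]
Can someone point out what may be wrong with the file?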
I added a comma between the fields.
All columns in destination table are varchar so there should be no reason for datatype error.
Also, the column lengths match exactly between the file and redshift table.
I have tried embedding columns in double quotes and without.

Can you post the full COPY command? It's cut off in the screenshot.
My guess is that you are missing DELIMITER ',' in your COPY command. Try adding that to the COPY command.

I was stuck on this for hours; Shahid's answer helped me solve it.
Text Case for Column Names is Important
Redshift will always treat your table's columns as lower-case, so when mapping JSON keys to columns, make sure the JSON keys are lower-case.
For example, your JSON file will look like:
{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}{"id": "val1", "name": "val2"}
And the COPY statement will look like
COPY latency(id,name) FROM 's3://<bucket-name>/<manifest>' CREDENTIALS 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>' MANIFEST json 'auto';
Settings within Firehose must have the column names specified (again, in lower-case). Also, add the following to Firehose COPY options:
json 'auto' TRUNCATECOLUMNS blanksasnull emptyasnull
How to call put_records from Python:
Below is a snippet showing how to use the put_records function with Kinesis in Python. The 'objects' argument passed into 'put_to_stream' is an array of dictionaries; kinesis_client and kinesis_stream_name are assumed to be defined elsewhere:
import json
def put_to_stream(objects):
    records = []
    for metric in objects:
        # One Kinesis record per dictionary, serialized as JSON.
        record = {
            'Data': json.dumps(metric),
            'PartitionKey': 'swat_report'
        }
        records.append(record)
    print(records)
    put_response = kinesis_client.put_records(StreamName=kinesis_stream_name, Records=records)
    return put_response

1- You need to add FORMAT AS JSON 's3://yourbucketname/aJsonPathFile.txt' to the COPY command. AWS does not mention this. Please note that this only works when your data is in JSON form, like
{"attr1": "val1", "attr2": "val2"} {"attr1": "val1", "attr2": "val2"} {"attr1": "val1", "attr2": "val2"} {"attr1": "val1", "attr2": "val2"}
2- You also need to verify that the column order in Kinesis Firehose matches the column order in the CSV file. Also try adding
TRUNCATECOLUMNS blanksasnull emptyasnull
3- An example
COPY testrbl3 ( eventId,serverTime,pageName,action,ip,userAgent,location,plateform,language,campaign,content,source,medium,productID,colorCode,scrolltoppercentage) FROM 's3://bucketname/' CREDENTIALS 'aws_iam_role=arn:aws:iam:::role/' MANIFEST json 'auto' TRUNCATECOLUMNS blanksasnull emptyasnull;
