Redshift copy json from S3 fails - amazon-web-services

Following this document, I was trying to load JSON data from S3 into Redshift.
I created a JSONPaths file and validated it (on https://jsonpath.curiousconcept.com/# with the expression $.*):
{
"jsonpaths": [
"$['_record_id']",
"$['_title']",
"$['_server_updated_at']",
"$['_project']",
"$['_assigned_to']",
"$['_updated_by']",
"$['_latitude']",
"$['_longitude']",
"$['date']",
"$['date_received']",
"$['inspection_type']"
]
}
and sample data
[{
"_record_id": "cf68c930-b7c8-4c3f-a04c-58b49f383cca",
"_title": "FAIL, 128",
"_server_updated_at": "2021-08-03T15:06:05.000Z",
"_project": null,
"_assigned_to": null,
"_updated_by": "XYZ",
"_geometry": {
"type": "Point",
"coordinates": [-74.5048900706, 40.3395964363]
},
"_latitude": 40.3395964363,
"_longitude": -74.5048900706,
"date": "2021-08-03T00:00:00.000Z",
"date_received": "2021-07-30T00:00:00.000Z",
"inspection_type": "New Product Inspection"
}, {
"_record_id": "9c8af79a-eaaf-405e-8c42-62560fdf15d5",
"_title": "PASS, 52",
"_server_updated_at": "2021-08-03T14:56:23.000Z",
"_project": null,
"_assigned_to": null,
"_updated_by": "XYZ",
"_geometry": null,
"_latitude": null,
"_longitude": null,
"date": "2021-08-03T00:00:00.000Z",
"date_received": "2021-07-30T00:00:00.000Z",
"inspection_type": "New Product Inspection"
}]
When I run this COPY command
copy rab.rab_dbo.shipmentreceivinglog2
from 's3://<bucket>/data_report.json'
iam_role 'arn:aws:iam::1234567890:role/RedshiftFileTransfer'
json 's3://<bucket>g/JSONPaths.json';
I get ERROR: Load into table 'shipmentreceivinglog2' failed. Check 'stl_load_errors' system table for details. When I run select * from stl_load_errors; I see
Invalid JSONPath format: Member is not an object. for s3://<bucket>/data_report.json
What's wrong with my JSONPaths file?

The issue is with your data file, not the JSONPaths file. Redshift expects JSON input to be a sequence of JSON objects placed one after another, with nothing wrapped around them. Your file is a single JSON array of objects, and an array is one value rather than a set of records. Remove the enclosing [] and the commas between elements. Your sample data should look like this:
{
"_record_id": "cf68c930-b7c8-4c3f-a04c-58b49f383cca",
"_title": "FAIL, 128",
"_server_updated_at": "2021-08-03T15:06:05.000Z",
"_project": null,
"_assigned_to": null,
"_updated_by": "XYZ",
"_geometry": {
"type": "Point",
"coordinates": [-74.5048900706, 40.3395964363]
},
"_latitude": 40.3395964363,
"_longitude": -74.5048900706,
"date": "2021-08-03T00:00:00.000Z",
"date_received": "2021-07-30T00:00:00.000Z",
"inspection_type": "New Product Inspection"
}
{
"_record_id": "9c8af79a-eaaf-405e-8c42-62560fdf15d5",
"_title": "PASS, 52",
"_server_updated_at": "2021-08-03T14:56:23.000Z",
"_project": null,
"_assigned_to": null,
"_updated_by": "XYZ",
"_geometry": null,
"_latitude": null,
"_longitude": null,
"date": "2021-08-03T00:00:00.000Z",
"date_received": "2021-07-30T00:00:00.000Z",
"inspection_type": "New Product Inspection"
}
An easy way to get this is to pump the json you have through jq.
jq '.[]' file.json
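If jq is not available, the same transformation can be sketched in Python. This is only a minimal sketch, assuming the whole array fits in memory; the function name array_to_records and the inline sample are hypothetical and not part of the original answer.

```python
import json


def array_to_records(array_text: str) -> str:
    """Turn one JSON array of objects into one JSON object per line."""
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # One compact object per line: no enclosing [] and no commas between records.
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Hypothetical two-record sample in the same shape as the question's data.
sample = '[{"_record_id": "a", "_latitude": null}, {"_record_id": "b", "_latitude": 40.3}]'
print(array_to_records(sample), end="")
```

To use it on a real file, read the file's text, pass it to array_to_records, and write the result to a new object in S3 before running COPY.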

Related

AWS Step Function Error with Input to Map State

I have the following iteration state defined in a Map State:
"WriteRteToDB": {
"Comment": "Write Rte to DB. Also records the risk calculations in the same table.",
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"End": true,
"Parameters": {
"FunctionName": "logger-lambda",
"RtInfo.$": "States.Array($)",
"ExecutionId.$": "$$.Execution.Id",
"InitTime.$": "$$.Execution.StartTime"
}
The parameters defined produce the following input:
{
"FunctionName": "logger-lambda",
"RtInfo": {
"status": 200,
"rte": {
"date": "2022-06-05 00:00:00",
"rt_value": 778129128.6631782,
"lower_80": 0,
"upper_80": 0.5,
"location_id": "WeWork Office Space & Coworking, Town Square, Alpharetta, GA, USA",
"syndrome": "Gastrointestinal"
}
},
"InitTime": "2022-06-05T15:04:57.297Z",
"ExecutionId": "arn:aws:states:us-east-1:1xxxxxxxxxx1:execution:RadaRx-rteForecast:0dbf2743-abb5-e0b6-56d0-2cc82a24e3b4"
}
But the following Error is produced:
{
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'WriteRteToDB' (entered at the event id #28). The Parameters '{\"FunctionName\":\"logger-lambda\",\"RtInfo\":[{\"status\":200,\"rte\":{\"date\":\"2022-12-10 00:00:00\",\"rt_value\":1.3579795204795204,\"lower_80\":0,\"upper_80\":0.5,\"location_id\":\"Atlanta Tech Park, Technology Parkway, Peachtree Corners, GA, USA\",\"syndrome\":\"Influenza Like Illnesses\"}}],\"InitTime\":\"2022-06-05T16:06:10.132Z\",\"ExecutionId\":\"arn:aws:states:us-east-1:1xxxxxxxxxx1:execution:RadaRx-rteForecast:016a37f2-d01c-9bfd-dc3f-1288fb7c1af6\"}' could not be used to start the Task: [The field \"RtInfo\" is not supported by Step Functions]"
}
I have already tried wrapping the RtInfo inside an array of length 1 as you can observe from above, considering that it is a state within the Map State. I have also checked Input size to make sure that it does not cross the Max Input/Output quota of 256KB.
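Your task's Parameters block has incorrect syntax. The lambda:invoke integration accepts only a fixed set of top-level fields, such as FunctionName and Payload, so pass RtInfo and the other user-defined inputs under the Payload key: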
"Parameters": {
"FunctionName": "logger-lambda",
"Payload": {
"RtInfo.$": "States.Array($)",
"ExecutionId.$": "$$.Execution.Id",
"InitTime.$": "$$.Execution.StartTime"
}
}
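With the lambda:invoke integration, the content of Payload is what the function receives as its event. Below is a minimal sketch of the receiving side, assuming logger-lambda is written in Python; the handler body and the returned keys are hypothetical and only illustrate where each field ends up.

```python
def handler(event, context):
    """Hypothetical handler for logger-lambda: reads the fields passed under Payload."""
    # States.Array($) wraps the Map item in a one-element list.
    rt_info = event["RtInfo"][0]
    return {
        "execution_id": event["ExecutionId"],
        "init_time": event["InitTime"],
        "location_id": rt_info["rte"]["location_id"],
    }
```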

Secondary Index not working for Database using @key

I should get the DynamoDB id for Justin. The call doesn't seem to fail. If I console.log(returned) I get [object Object]. When I try to read returned.data.getIdFromUserName.id or returned.data.getIdFromUserName.email (or anything else in the table) I get undefined. What am I missing?
Returned data:
{
"data": {
"getIdFromUserName": {
"items": [
{
"id": "3a5a2ks4-f137-41e2-a604-594e0c52a298",
"userName": "Justin",
"firstname": "null",
"weblink": "#JustinTimberlake",
"email": "iuiubiwewe#hotmail.com",
"mobileNum": "+0123456789",
"profilePicURI": "null",
"listOfVideosSeen": null,
"userDescription": "I wanna rock your body, please stay",
"isBlocked": false,
"GridPairs": null
}
],
"nextToken": null
}
}
}
I'd suggest getting a better idea of what console.log(returned) is printing.
Try console.log(JSON.stringify(returned, null, 2)) to inspect what is being returned.
EDIT: The data you're working with looks like this:
{
"data": {
"getIdFromUserName": {
"items": [
{
"id": "3a5a2ks4-f137-41e2-a604-594e0c52a298",
"userName": "Justin",
"firstname": "null",
"weblink": "#JustinTimberlake",
"email": "iuiubiwewe#hotmail.com",
"mobileNum": "+0123456789",
"profilePicURI": "null",
"listOfVideosSeen": null,
"userDescription": "I wanna rock your body, please stay",
"isBlocked": false,
"GridPairs": null
}
],
"nextToken": null
}
}
}
Pay close attention to the structure of that response. Both data and getIdFromUserName are maps. getIdFromUserName contains an array named items, so data.getIdFromUserName.items is the array holding the results of your query. You'll need to index into or iterate over that array to get the data you are looking for.
For example, data.getIdFromUserName.items[0].id would be 3a5a2ks4-f137-41e2-a604-594e0c52a298
To access the email it would be data.getIdFromUserName.items[0].email.
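The same traversal can be sketched outside the app to check the shape of the response. This is a language-neutral illustration written in Python, assuming a trimmed copy of the returned data; the variable names are hypothetical.

```python
import json

# Trimmed copy of the response shown above.
response_text = """
{"data": {"getIdFromUserName": {"items": [
    {"id": "3a5a2ks4-f137-41e2-a604-594e0c52a298", "userName": "Justin"}
], "nextToken": null}}}
"""

returned = json.loads(response_text)
result = returned["data"]["getIdFromUserName"]

# "id" lives on each element of items, not on getIdFromUserName itself.
ids = [item["id"] for item in result["items"]]
print(ids)
```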

Create a table in AWS athena parsing dynamic keys in nested json

I have JSON files in which each line has the format below, and I would like to parse this data into a table using AWS Athena.
{
"123": {
"abc": {
"id": "test",
"data": "ipsum lorum"
},
"abd": {
"id": "test_new",
"data": "lorum ipsum"
}
}
}
Can a table with this format be created for the above data? In the documentation, it is mentioned that struct can be used for parsing nested JSON, however, there are no sample examples for dynamic keys.
You could cast JSON to map or array and transform it in any way you want. In this case you could use map_values and CROSS JOIN UNNEST to produce rows from JSON objects:
with test AS
(SELECT '{ "123": { "abc": { "id": "test", "data": "ipsum lorum" }, "abd": { "id": "test_new", "data": "lorum ipsum" } } }' AS str),
struct_like AS
(SELECT cast(json_parse(str) AS map<varchar,
map<varchar,
map<varchar,
varchar>>>) AS m
FROM test),
flat AS
(SELECT item
FROM struct_like
CROSS JOIN UNNEST(map_values(m)) AS t(item))
SELECT
key,
value['id'] AS id,
value['data'] AS data
FROM flat
CROSS JOIN unnest(item) AS t(key, value)
The result:
key  id        data
abc  test      ipsum lorum
abd  test_new  lorum ipsum
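To make the two unnesting steps easier to follow, here is the same flattening sketched in Python. It is an illustration of the logic only, not something Athena runs; the function name flatten is hypothetical.

```python
import json


def flatten(line: str):
    """Flatten {outer_key: {key: {"id": ..., "data": ...}}} into (key, id, data) rows."""
    doc = json.loads(line)
    rows = []
    for inner in doc.values():            # plays the role of map_values(m)
        for key, value in inner.items():  # plays the role of UNNEST(item) AS t(key, value)
            rows.append((key, value["id"], value["data"]))
    return rows


line = '{"123": {"abc": {"id": "test", "data": "ipsum lorum"}, "abd": {"id": "test_new", "data": "lorum ipsum"}}}'
for row in flatten(line):
    print(row)
```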

"type mismatch error, expected type LIST" for querying a one-to-many relationship in AppSync

The schema:
type User {
id: ID!
createdCurricula: [Curriculum]
}
type Curriculum {
id: ID!
title: String!
creator: User!
}
The resolver to query all curricula of a given user:
{
"version" : "2017-02-28",
"operation" : "Query",
"query" : {
## Provide a query expression. **
"expression": "userId = :userId",
"expressionValues" : {
":userId" : {
"S" : "${context.source.id}"
}
}
},
"index": "userIdIndex",
"limit": #if(${context.arguments.limit}) ${context.arguments.limit} #else 20 #end,
"nextToken": #if(${context.arguments.nextToken}) "${context.arguments.nextToken}" #else null #end
}
The response map:
{
"items": $util.toJson($context.result.items),
"nextToken": #if(${context.result.nextToken}) "${context.result.nextToken}" #else null #end
}
The query:
query {
getUser(id: "0b6af629-6009-4f4d-a52f-67aef7b42f43") {
id
createdCurricula {
title
}
}
}
The error:
{
"data": {
"getUser": {
"id": "0b6af629-6009-4f4d-a52f-67aef7b42f43",
"createdCurricula": null
}
},
"errors": [
{
"path": [
"getUser",
"createdCurricula"
],
"locations": null,
"message": "Can't resolve value (/getUser/createdCurricula) : type mismatch error, expected type LIST"
}
]
}
The CurriculumTable has a global secondary index titled userIdIndex, which has userId as the partition key.
If I change the response map to this:
$util.toJson($context.result.items)
The output is the following:
{
"data": {
"getUser": {
"id": "0b6af629-6009-4f4d-a52f-67aef7b42f43",
"createdCurricula": null
}
},
"errors": [
{
"path": [
"getUser",
"createdCurricula"
],
"errorType": "MappingTemplate",
"locations": [
{
"line": 4,
"column": 5
}
],
"message": "Unable to convert \n{\n [{\"id\":\"87897987\",\"title\":\"Test Curriculum\",\"userId\":\"0b6af629-6009-4f4d-a52f-67aef7b42f43\"}],\n} to class java.lang.Object."
}
]
}
If I take that string and run it through a console.log in my frontend app, I get:
{
[{"id":"2","userId":"0b6af629-6009-4f4d-a52f-67aef7b42f43"},{"id":"1","userId":"0b6af629-6009-4f4d-a52f-67aef7b42f43"}]
}
That's clearly an object. How do I make it... not an object, so that AppSync properly reads it as a list?
SOLUTION
My response map had a set of curly braces around it. I'm pretty sure that was placed there in the generator by Amazon. Removing them fixed it.
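A quick way to see why the braces matter is to check whether the rendered template is valid JSON at all. The sketch below uses Python's json module on a shortened copy of the output; the helper name parses_as_list is hypothetical, and it only mimics the idea that the resolver result has to parse as a list.

```python
import json


def parses_as_list(text: str) -> bool:
    """Return True only if text is valid JSON whose top-level value is a list."""
    try:
        return isinstance(json.loads(text), list)
    except json.JSONDecodeError:
        return False


# Shortened copies of the rendered response, with and without the outer braces.
wrapped = '{\n [{"id": "2"}, {"id": "1"}]\n}'
bare = '[{"id": "2"}, {"id": "1"}]'

print(parses_as_list(wrapped), parses_as_list(bare))
```

The wrapped form is not a JSON object either, since an object needs key/value pairs, which is consistent with the "Unable to convert" error above.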
I don't think I'm seeing your complete schema. I was expecting something like:
schema {
query: Query
}
where Query is the root query type; you didn't share your Query definition. Assuming the Query definition is right, the main problem is in your response template:
> "items": $util.toJson($context.result.items)
This means that you are passing a collection named "items" to the GraphQL query engine, while the schema refers to this collection as "createdCurricula". The response mapping template is the right place to fix this: replace the line above with the following.
"createdCurricula": $util.toJson($context.result.items),
The main thing to note is that the mapping template is a bridge between your data sources and GraphQL. Feel free to do any computation or name mapping there, but remember that the object names in the response JSON must match the names in the schema/query definition.
Thanks.
Musema
Change the result type to $util.toJson($ctx.result.data.posts).
The exception msg says that it expected a type list.
Looking at:
{
[{"id":"2","userId":"0b6af629-6009-4f4d-a52f-67aef7b42f43"},{"id":"1","userId":"0b6af629-6009-4f4d-a52f-67aef7b42f43"}]
}
I don't see that createdCurricula is a LIST.
What is currently in DDB is:
"id": "0b6af629-6009-4f4d-a52f-67aef7b42f43",
"createdCurricula": null

How to get List<Object> with RestTemplate(SpringBoot)

I want to get a List<User>. I have a GET endpoint that returns users. It returns:
[
{
"id": "d71dcbca-54f3-4b19-aec4-3776bfe34730",
"name": "test",
"surname": "test",
"login": "test",
"password": "-26104458",
"email": "test",
"role": "user"
}
]
I tried getting this list using RestTemplate:
ResponseEntity<User[]> responseEntity = rest.getForEntity(my-endpoint, User[].class);
return Arrays.asList(responseEntity.getBody());
But I get this error:
org.springframework.web.client.RestClientException: Could not extract response: no suitable HttpMessageConverter found for response type
Question: How do I get a List<User> using RestTemplate?
Maybe you want to try this approach:
ResponseEntity<List<User>> responseEntity = rest.exchange(
"your-endpoint",
HttpMethod.GET,
null,
new ParameterizedTypeReference<List<User>>() {
});
See also https://docs.spring.io/spring/docs/4.3.12.RELEASE/spring-framework-reference/htmlsingle/#rest-resttemplate