IoT Core OpenSearch Action Rule "mapper_parsing_exception"

I'm using OpenSearch 1.2 deployed on AWS.
I was following this AWS tutorial, but with my own sensor data. I've created the following IoT Core rule with an OpenSearch action:
OpenSearchTopicRule:
  Type: AWS::IoT::TopicRule
  Properties:
    TopicRulePayload:
      Actions:
        - OpenSearch:
            Endpoint: !Join ['', ['https://', !GetAtt OpenSearchServiceDomain.DomainEndpoint]]
            Id: '${newuuid()}'
            Index: sensors
            RoleArn: !GetAtt IoTOSActionRole.Arn
            Type: sensor_data
      Sql: SELECT *, timestamp() as ts FROM 'Greenhouse/+/Sensor/Status'
The IoTOSActionRole has the proper es:ESHttpPut permission. But when I try to create an index that would match the Type: sensor_data attribute, using the following request sent from Postman:
curl --location --request PUT 'https://search-iot***-avt***i.eu-west-1.es.amazonaws.com/sensors' \
--header 'Content-Type: application/json' \
--data-raw '{
  "mappings": {
    "sensor_data": {
      "properties": {
        "ts": { "type": "long", "copy_to": "datetime" },
        "datetime": { "type": "date", "store": true },
        "deviceID": { "type": "text", "store": true },
        "humidity": { "type": "integer", "store": true },
        "temperature": { "type": "integer", "store": true },
        "lux": { "type": "integer", "store": true },
        "soil": { "type": "integer", "store": true }
      }
    }
  }
}'
I receive an error:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters: [sensor_data : {properties={datetime={store=true, type=date}, temperature={store=true, type=integer}, humidity={store=true, type=integer}, soil={store=true, type=integer}, deviceID={store=true, type=text}, lux={store=true, type=integer}, ts={copy_to=datetime, type=long}}}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters: [...]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters: [...}]"
    }
  },
  "status": 400
}
I've tried removing the 'type' ("sensor_data") attribute, as mentioned here (but that's an Elasticsearch solution), and that allowed me to create an index with that mapping,
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "sensors"
}
and then an index pattern in OpenSearch Dashboards. But what happens then is that the IoT Core rule, even though it gets triggered, does not result in any data being ingested into the OpenSearch domain. So I guess the IoT Core action tries to send the data with type sensor_data, but there is no corresponding type in OpenSearch. Additionally, when I open the Discover tab in the OpenSearch dashboard I get this notice:
"undefined" is not a configured index pattern ID
Showing the default index pattern: "sensors*" (d97775d0-***a2fd725)
Sample data:
{
  "deviceID": "Tomatoes",
  "Greenhouse": 1,
  "date": "05-05",
  "time": "09:35:39",
  "timestamp": 1651743339,
  "humidity": 60,
  "temperature": 33.3,
  "lux": 9133.333,
  "soil": 78
}
What PUT call do I have to make to create a sensor_data type mapping in OpenSearch that would match the type specified in the IoT Core OpenSearch action rule?
UPDATE
I've tried the same API call with Elasticsearch 7.10 and received the same mapper_parsing_exception response. I've also tried removing the "store": true attributes. The only call that is accepted is the one omitting the sensor_data type attribute, sent to the .../sensors/_mappings API.
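That is consistent with mapping types having been removed: OpenSearch 1.x (like Elasticsearch 7+) only accepts typeless mappings, so the "sensor_data" level simply has to go. A minimal sketch of the accepted call using Python requests (the endpoint is a placeholder, and auth is omitted; an AWS-managed domain normally needs SigV4 signing, e.g. via requests-aws4auth, or the master-user basic auth):

import requests

# Placeholder endpoint; substitute your own OpenSearch domain URL.
ENDPOINT = "https://search-iot-example.eu-west-1.es.amazonaws.com"

# Same mapping as above with the "sensor_data" type level dropped;
# the per-field "store": true parameters can stay as they are.
mapping = {
    "mappings": {
        "properties": {
            "ts": {"type": "long", "copy_to": "datetime"},
            "datetime": {"type": "date", "store": True},
            "deviceID": {"type": "text", "store": True},
            "humidity": {"type": "integer", "store": True},
            "temperature": {"type": "integer", "store": True},
            "lux": {"type": "integer", "store": True},
            "soil": {"type": "integer", "store": True},
        }
    }
}

resp = requests.put(f"{ENDPOINT}/sensors", json=mapping)
print(resp.status_code, resp.json())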

Related

API Gateway -> SQS HTTP POST MessageAttributes

I have an API Gateway setup which sends to SQS, which fires a Lambda. I am trying to pass message attributes to SQS, but when I hit the endpoint in Postman I keep getting a 400 Bad Request. What is the right way to send the attributes over a JSON POST body?
Here is the body from Postman (I have tried a few options based on this link):
"message": "Message",
"MessageAttributes": {
"Name": "Name",
"Type": "String",
"Value": "my value"
}
}
Here is how API Gateway is configured
In case someone stumbles on this later, here is what worked from the CDK side:
let integration = new apiGateway.CfnIntegration(this, 'Integration', {
  apiId: props.httpApi.httpApiId,
  payloadFormatVersion: '1.0',
  integrationType: 'AWS_PROXY',
  credentialsArn: apigwRole.roleArn,
  integrationSubtype: 'SQS-SendMessage',
  requestParameters: {
    QueueUrl: sqsqueue.queueUrl,
    MessageBody: '$request.body',
    MessageAttributes: '$request.body.MessageAttributes'
  }
})

new apiGateway.CfnRoute(this, 'Route', {
  apiId: props.httpApi.httpApiId,
  routeKey: apiGateway.HttpRouteKey.with('/url/foo', apiGateway.HttpMethod.POST).key,
  target: `integrations/${integration.ref}`
}).addDependsOn(integration);
and the CloudFormation:
MessageBody: $request.body
MessageAttributes: $request.body.MessageAttributes
Then in Postman, the POST body (content type application/json):
{
  "message": "Message",
  "MessageAttributes": {
    "Attributes": {
      "DataType": "String",
      "StringValue": "my value"
    }
  }
}
The Lambda would log out both, separately for each Record, from the event body:
{
  Records: [
    {
      ....
      body: 'Message',
      attributes: [Object],
      messageAttributes: [Object]
    }
  ]
}
the messageAttributes object from above:
{
  Attributes: {
    stringValue: 'my value',
    stringListValues: [],
    binaryListValues: [],
    dataType: 'String'
  }
}
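For comparison, the shape API Gateway forwards here is the same one SQS itself expects on SendMessage: each attribute is a name mapped to a DataType/StringValue pair. A small boto3 sketch of sending the equivalent message directly to the queue (the queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")

# Placeholder queue URL; the MessageAttributes shape matches the
# Postman body above and what the Lambda receives per Record.
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",
    MessageBody="Message",
    MessageAttributes={
        "Attributes": {"DataType": "String", "StringValue": "my value"}
    },
)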
This is using the AWS API Gateway v2 HTTP API.

Issue inserting data into Firestore using the Cloud Workflows Firestore connector, with a JSON object coming from a previous step (a Cloud Function)

I am trying to build a workflow where, in step 1, I run a Cloud Function that returns a JSON object in the form of a Python dictionary, and I want that same object to be inserted into Firestore using the Firestore connector. But I am getting the error below:
HTTP server responded with error code 400
in step "create_document", routine "main", line: 27
HTTP server responded with error code 400
in step "create_document", routine "main", line: 28
{
"body": {
"error": {
"code": 400,
"details": [
{
"#type": "type.googleapis.com/google.rpc.BadRequest",
"fieldViolations": [
{
"description": "Invalid JSON payload received. Unknown name \"field1\" at 'document.fields[0].value': Cannot find field.",
"field": "document.fields[0].value"
},
{
"description": "Invalid value at 'document.fields[1].value' (type.googleapis.com/google.firestore.v1.Value), 200",
"field": "document.fields[1].value"
},
{
"description": "Invalid JSON payload received. Unknown name \"Alt-Svc\" at 'document.fields[2].value': Cannot find field.",
"field": "document.fields[2].value"
},
{
"description": "Invalid JSON payload received. Unknown name \"Cache-Control\" at 'document.fields[2].value': Cannot find field.",
"field": "document.fields[2].value"
},
{
"description": "Invalid JSON payload received. Unknown name \"Content-Length\" at 'document.fields[2].value': Cannot find field.",
"field": "document.fields[2].value"
},
{
"description": "Invalid JSON payload received. Unknown name \"Content-Type\" at 'document.fields[2].value': Cannot find field.",
"field": "document.fields[2].value"
},
{
"description": "Invalid JSON payload received. Unknown name \"Date\" at 'document.fields[2].value': Cannot find field.",
"field": "document.fields[2].value"
}
This is what my workflow looks like:
main:
  params: [args]
  steps:
    - step1:
        call: http.get
        args:
          url: https://XXXXXXXXXXXXX.cloudfunctions.net/step1-workflow
          query:
            bucket_name: ${args.bucket_name}
            blob_name: ${args.blob_name}
        result: key_val
    - step2:
        assign:
          - project_id: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - collection: "a-dummy-collection"
          - document: "new7-dummy-document"
    - create_document:
        call: googleapis.firestore.v1.projects.databases.documents.createDocument
        args:
          collectionId: ${collection}
          parent: ${"projects/" + project_id + "/databases/(default)/documents"}
          query:
            documentId: ${document}
          body:
            fields: ${key_val}
        result: inserted
If, in place of ${key_val}, I use the simple JSON {"field1": {"stringValue": "str1"},"field2": {"integerValue": 10}}, it works fine and the data gets inserted into Firestore, but if I try to use the object from the variable ${key_val}, which has the same structure as that JSON, it gives the error.
Answer given in the comments: the ${key_val} result from the call to the Cloud Function is actually the whole HTTP response object, not just the body. That's why the error messages mention Content-Type and other headers.
The solution is to pass just the body of that response: ${key_val.body}.
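In other words, the result of an http.get step is a map holding the status code, headers and body, and only the body contains the Firestore-style fields. A rough Python illustration of the assumed shapes (the values here are made up):

# Roughly what the key_val variable holds after the http.get step:
key_val = {
    "code": 200,
    "headers": {"Content-Type": "application/json"},
    "body": {
        "field1": {"stringValue": "str1"},
        "field2": {"integerValue": 10},
    },
}

# Passing key_val as "fields" makes Firestore try to treat "code", "headers",
# etc. as document fields, which produces the 400. What createDocument needs is:
fields = key_val["body"]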

Failed to create API Gateway

I'm trying to create this API Gateway (gist) with an Authorizer and an ANY method.
I run into this error:
The following resource(s) failed to create: [BaseLambdaExecutionPolicy, ApiGatewayDeployment]
I've checked the parameters passed into this template from my other stacks and they're correct. I've checked this template and it's valid.
My template is modified from this template with "Runtime": "nodejs8.10".
This is the same stack (gist) that is created successfully using Swagger 2. I just want to replace the Swagger 2 definition with AWS::ApiGateway::Method.
Update 6 Jun 2019:
I tried to create the whole nested stack using the working version of the API Gateway stack, then create another API Gateway with the template that doesn't work, using the parameters I get from the nested stack. Then I get this:
The REST API doesn't contain any methods (Service: AmazonApiGateway; Status Code: 400; Error Code: BadRequestException; Request ID: ID)
But I did specify the method in my template following AWS docs:
"GatewayMethod": {
"Type" : "AWS::ApiGateway::Method",
"DependsOn": ["LambdaRole", "ApiGateway"],
"Properties" : {
"ApiKeyRequired" : false,
"AuthorizationType" : "Cognito",
"HttpMethod" : "ANY",
"Integration" : {
"IntegrationHttpMethod" : "ANY",
"Type" : "AWS",
"Uri" : {
"Fn::Sub": "arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${LambdaFunction.Arn}/invocations"
}
},
"MethodResponses" : [{
"ResponseModels": {
"application/json": "Empty"
},
"StatusCode": 200
}],
"RequestModels" : {"application/json": "Empty"},
"ResourceId" : {
"Fn::GetAtt": ["ApiGateway", "RootResourceId"]
},
"RestApiId" : {
"Ref": "ApiGateway"
}
}
},
Thanks to @John's suggestion, I tried to create the nested stack with the version that worked and pass in the parameters to the version that doesn't work.
The reason for that error is:
CloudFormation might try to create Deployment before it creates Method
from balaji's answer here.
So this is what I did:
"methodANY": {
"Type": "AWS::ApiGateway::Method",
"Properties": {
"AuthorizationType": "COGNITO_USER_POOLS",
...},
"ApiGatewayDeployment": {
"Type": "AWS::ApiGateway::Deployment",
"DependsOn": "methodANY",
...
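The same ordering constraint exists outside CloudFormation: a deployment can only snapshot methods that already exist. A rough boto3 sketch of the order that the DependsOn enforces (the API, resource and authorizer IDs are placeholders):

import boto3

apigw = boto3.client("apigateway")

rest_api_id = "abc123"  # placeholder REST API id
root_id = next(
    r["id"]
    for r in apigw.get_resources(restApiId=rest_api_id)["items"]
    if r["path"] == "/"
)

# The method has to exist first...
apigw.put_method(
    restApiId=rest_api_id,
    resourceId=root_id,
    httpMethod="ANY",
    authorizationType="COGNITO_USER_POOLS",
    authorizerId="xyz789",  # placeholder authorizer id
)

# ...otherwise creating the deployment fails with
# "The REST API doesn't contain any methods".
apigw.create_deployment(restApiId=rest_api_id, stageName="prod")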
I also found this article on cloudonaut.io by Michael Wittig helpful.

ClientError: train channel is not specified with AWS object_detection_augmented_manifest_training using ground truth images

I have completed a labelling job in AWS Ground Truth and started working on the notebook template for object detection.
I have two manifests with 293 labeled images of birds, split into a train and a validation set, like this:
{"source-ref":"s3://XXXXXXX/Train/Blackbird_1.JPG","Bird-Label-Train":{"workerId":XXXXXXXX,"imageSource":{"s3Uri":"s3://XXXXXXX/Train/Blackbird_1.JPG"},"boxesInfo":{"annotatedResult":{"boundingBoxes":[{"width":1612,"top":841,"label":"Blackbird","left":1276,"height":757}],"inputImageProperties":{"width":3872,"height":2592}}}},"Bird-Label-Train-metadata":{"type":"groundtruth/custom","job-name":"bird-label-train","human-annotated":"yes","creation-date":"2019-01-16T17:28:23+0000"}}
Below are the parameters I am using for the notebook instance:
training_params = \
{
"AlgorithmSpecification": {
"TrainingImage": training_image, # NB. This is one of the named constants defined in the first cell.
"TrainingInputMode": "Pipe"
},
"RoleArn": role,
"OutputDataConfig": {
"S3OutputPath": s3_output_path
},
"ResourceConfig": {
"InstanceCount": 1,
"InstanceType": "ml.p3.2xlarge",
"VolumeSizeInGB": 5
},
"TrainingJobName": job_name,
"HyperParameters": { # NB. These hyperparameters are at the user's discretion and are beyond the scope of this demo.
"base_network": "resnet-50",
"use_pretrained_model": "1",
"num_classes": "1",
"mini_batch_size": "16",
"epochs": "5",
"learning_rate": "0.001",
"lr_scheduler_step": "3,6",
"lr_scheduler_factor": "0.1",
"optimizer": "rmsprop",
"momentum": "0.9",
"weight_decay": "0.0005",
"overlap_threshold": "0.5",
"nms_threshold": "0.45",
"image_shape": "300",
"label_width": "350",
"num_training_samples": str(num_training_samples)
},
"StoppingCondition": {
"MaxRuntimeInSeconds": 86400
},
"InputDataConfig": [
{
"ChannelName": "train",
"DataSource": {
"S3DataSource": {
"S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
"S3Uri": s3_train_data_path,
"S3DataDistributionType": "FullyReplicated",
"AttributeNames": ["source-ref","Bird-Label-Train"] # NB. This must correspond to the JSON field names in your augmented manifest.
}
},
"ContentType": "image/jpeg",
"RecordWrapperType": "None",
"CompressionType": "None"
},
{
"ChannelName": "validation",
"DataSource": {
"S3DataSource": {
"S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
"S3Uri": s3_validation_data_path,
"S3DataDistributionType": "FullyReplicated",
"AttributeNames": ["source-ref","Bird-Label"] # NB. This must correspond to the JSON field names in your augmented manifest.
}
},
"ContentType": "image/jpeg",
"RecordWrapperType": "None",
"CompressionType": "None"
}
]
}
I would end up with this being printed after running my ml.p3.2xlarge instance:
InProgress Starting
InProgress Starting
InProgress Starting
InProgress Training
Failed Failed
Followed by this error message:
'ClientError: train channel is not specified.'
Does anyone have any thoughts on how I can get this running with no errors? Any help is much appreciated!
Successful run: below are the parameters that were used, along with the Augmented Manifest JSON objects, for a successful run.
training_params = \
{
"AlgorithmSpecification": {
"TrainingImage": training_image, # NB. This is one of the named constants defined in the first cell.
"TrainingInputMode": "Pipe"
},
"RoleArn": role,
"OutputDataConfig": {
"S3OutputPath": s3_output_path
},
"ResourceConfig": {
"InstanceCount": 1,
"InstanceType": "ml.p3.2xlarge",
"VolumeSizeInGB": 50
},
"TrainingJobName": job_name,
"HyperParameters": { # NB. These hyperparameters are at the user's discretion and are beyond the scope of this demo.
"base_network": "resnet-50",
"use_pretrained_model": "1",
"num_classes": "3",
"mini_batch_size": "1",
"epochs": "5",
"learning_rate": "0.001",
"lr_scheduler_step": "3,6",
"lr_scheduler_factor": "0.1",
"optimizer": "rmsprop",
"momentum": "0.9",
"weight_decay": "0.0005",
"overlap_threshold": "0.5",
"nms_threshold": "0.45",
"image_shape": "300",
"label_width": "350",
"num_training_samples": str(num_training_samples)
},
"StoppingCondition": {
"MaxRuntimeInSeconds": 86400
},
"InputDataConfig": [
{
"ChannelName": "train",
"DataSource": {
"S3DataSource": {
"S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
"S3Uri": s3_train_data_path,
"S3DataDistributionType": "FullyReplicated",
"AttributeNames": attribute_names # NB. This must correspond to the JSON field names in your **TRAIN** augmented manifest.
}
},
"ContentType": "application/x-recordio",
"RecordWrapperType": "RecordIO",
"CompressionType": "None"
},
{
"ChannelName": "validation",
"DataSource": {
"S3DataSource": {
"S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
"S3Uri": s3_validation_data_path,
"S3DataDistributionType": "FullyReplicated",
"AttributeNames": ["source-ref","ValidateBird"] # NB. This must correspond to the JSON field names in your **VALIDATION** augmented manifest.
}
},
"ContentType": "application/x-recordio",
"RecordWrapperType": "RecordIO",
"CompressionType": "None"
}
]
}
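For reference, a dictionary in this shape is what ends up being handed to the CreateTrainingJob API; a minimal sketch of submitting it with boto3, assuming training_params is the dictionary above and the other variables (job_name, role, paths) are already defined as in the notebook:

import boto3

sm = boto3.client("sagemaker")

# Submit the job and check its status; the "InProgress Starting / Training"
# lines above correspond to TrainingJobStatus plus SecondaryStatus.
sm.create_training_job(**training_params)

desc = sm.describe_training_job(TrainingJobName=training_params["TrainingJobName"])
print(desc["TrainingJobStatus"], desc.get("SecondaryStatus"))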
Training Augmented Manifest File generated during the running of the training job
Line 1
{"source-ref":"s3://XXXXX/Train/Blackbird_1.JPG","TrainBird":{"annotations":[{"class_id":0,"width":1613,"top":840,"height":766,"left":1293}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:21:29.829003","type":"groundtruth/object-detection"}}
Line 2
{"source-ref":"s3://xxxxx/Train/Blackbird_2.JPG","TrainBird":{"annotations":[{"class_id":0,"width":897,"top":665,"height":1601,"left":1598}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:22:34.502274","type":"groundtruth/object-detection"}}
Line 3
{"source-ref":"s3://XXXXX/Train/Blackbird_3.JPG","TrainBird":{"annotations":[{"class_id":0,"width":1040,"top":509,"height":1695,"left":1548}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:20:26.660164","type":"groundtruth/object-detection"}}
I then unzip the model.tar file to get the following files: hyperparams.JSON, model_algo_1-0000.params and model_algo_1-symbol.
hyperparams.JSON looks like this:
{"label_width": "350", "early_stopping_min_epochs": "10", "epochs": "5", "overlap_threshold": "0.5", "lr_scheduler_factor": "0.1", "_num_kv_servers": "auto", "weight_decay": "0.0005", "mini_batch_size": "1", "use_pretrained_model": "1", "freeze_layer_pattern": "", "lr_scheduler_step": "3,6", "early_stopping": "False", "early_stopping_patience": "5", "momentum": "0.9", "num_training_samples": "11", "optimizer": "rmsprop", "_tuning_objective_metric": "", "early_stopping_tolerance": "0.0", "learning_rate": "0.001", "kv_store": "device", "nms_threshold": "0.45", "num_classes": "1", "base_network": "resnet-50", "nms_topk": "400", "_kvstore": "device", "image_shape": "300"}
Unfortunately, pipe mode with AugmentedManifestFile is not supported for the image/jpeg content type. To be able to use this feature, you will need to specify RecordWrapperType as RecordIO and ContentType as application/x-recordio.
The 'AttributeNames' parameter needs to be ['source-ref', 'your label here'] in both your train and validation channels.
Thank you again for your help, all of which was valid in helping me get further. Having received a response on the AWS forum pages, I finally got it working.
I understood that my JSON was slightly different from the augmented manifest training guide. Having gone back to basics, I created another labelling job, but used the 'Bounding Box' type as opposed to the 'Custom - Bounding box template'. My output matched what was expected, and this ran with no errors!
As my purpose was to have multiple labels, I was able to edit the files and mapping of my output manifests, which also worked!
i.e.
{"source-ref":"s3://xxxxx/Blackbird_15.JPG","ValidateBird":{"annotations":[{"class_id":0,"width":2023,"top":665,"height":1421,"left":1312}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"ValidateBird-metadata":{"job-name":"labeling-job/validatebird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:23:51.174131","type":"groundtruth/object-detection"}}
{"source-ref":"s3://xxxx/Pigeon_19.JPG","ValidateBird":{"annotations":[{"class_id":2,"width":784,"top":634,"height":1657,"left":1306}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"ValidateBird-metadata":{"job-name":"labeling-job/validatebird","class-map":{"2":"Pigeon"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:23:51.074809","type":"groundtruth/object-detection"}}
The original mapping was 0:'Bird' for all images through the labelling job.

HIVE_INVALID_METADATA in Amazon Athena

How can I work around the following error in Amazon Athena?
HIVE_INVALID_METADATA: com.facebook.presto.hive.DataCatalogException: Error: : expected at the position 8 of 'struct<x-amz-request-id:string,action:string,label:string,category:string,when:string>' but '-' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null)
When looking at position 8 in the database table connected to Athena generated by AWS Glue, I can see that it has a column named attributes with a corresponding struct data type:
struct <
x-amz-request-id:string,
action:string,
label:string,
category:string,
when:string
>
My guess is that the error occurs because the attributes field is not always populated (c.f. the _session.start event below) and does not always contain all fields (e.g. the DocumentHandling event below does not contain the attributes.x-amz-request-id field). What is the appropriate way to address this problem? Can I make a column optional in Glue? Can (should?) Glue fill the struct with empty strings? Other options?
Background: I have the following backend structure:
Amazon PinPoint Analytics collects metrics from my application.
The PinPoint event stream has been configured to forward the events to an Amazon Kinesis Firehose delivery stream.
Kinesis Firehose writes data to S3
Use AWS Glue to crawl S3
Use Athena to write queries based on the databases and tables generated by AWS Glue
I can see PinPoint events successfully being added to json files in S3, e.g.
First event in a file:
{
"event_type": "_session.start",
"event_timestamp": 1524835188519,
"arrival_timestamp": 1524835192884,
"event_version": "3.1",
"application": {
"app_id": "[an app id]",
"cognito_identity_pool_id": "[a pool id]",
"sdk": {
"name": "Mozilla",
"version": "5.0"
}
},
"client": {
"client_id": "[a client id]",
"cognito_id": "[a cognito id]"
},
"device": {
"locale": {
"code": "en_GB",
"country": "GB",
"language": "en"
},
"make": "generic web browser",
"model": "Unknown",
"platform": {
"name": "macos",
"version": "10.12.6"
}
},
"session": {
"session_id": "[a session id]",
"start_timestamp": 1524835188519
},
"attributes": {},
"client_context": {
"custom": {
"legacy_identifier": "50ebf77917c74f9590c0c0abbe5522d2"
}
},
"awsAccountId": "672057540201"
}
Second event in the same file:
{
"event_type": "DocumentHandling",
"event_timestamp": 1524835194932,
"arrival_timestamp": 1524835200692,
"event_version": "3.1",
"application": {
"app_id": "[an app id]",
"cognito_identity_pool_id": "[a pool id]",
"sdk": {
"name": "Mozilla",
"version": "5.0"
}
},
"client": {
"client_id": "[a client id]",
"cognito_id": "[a cognito id]"
},
"device": {
"locale": {
"code": "en_GB",
"country": "GB",
"language": "en"
},
"make": "generic web browser",
"model": "Unknown",
"platform": {
"name": "macos",
"version": "10.12.6"
}
},
"session": {},
"attributes": {
"action": "Button-click",
"label": "FavoriteStar",
"category": "Navigation"
},
"metrics": {
"details": 40.0
},
"client_context": {
"custom": {
"legacy_identifier": "50ebf77917c74f9590c0c0abbe5522d2"
}
},
"awsAccountId": "[aws account id]"
}
Next, AWS Glue has generated a database and a table. Specifically, I see that there is a column named attributes that has the value of
struct <
x-amz-request-id:string,
action:string,
label:string,
category:string,
when:string
>
However, when I attempt to Preview table from Athena, i.e. execute the query
SELECT * FROM "pinpoint-test"."pinpoint_testfirehose" limit 10;
I get the error message described earlier.
Side note: I have tried to remove the attributes field (by editing the database table in Glue), but that results in an internal error when executing the SQL query from Athena.
This is a known limitation. Athena table and database names allow underscore as the only special character:
Athena table and database names cannot contain special characters, other than underscore (_).
Source: http://docs.aws.amazon.com/athena/latest/ug/known-limitations.html
Use backticks (`) when the table name has a - in it.
Example:
SELECT * FROM `pinpoint-test`.`pinpoint_testfirehose` limit 10;
Make sure you select the "default" database in the left pane.
I believe the problem is your struct element name, x-amz-request-id: specifically the "-" in the name.
I'm currently dealing with a similar issue, since the elements in my struct have "::" in their names.
Sample data:
some_key: {
"system::date": date,
"system::nps_rating": 0
}
Glue derived struct schema (it tried to escape them with backslashes):
struct <
system\:\:date:String
system\:\:nps_rating:Int
>
But that still gives me an error in Athena.
I don't have a good solution for this other than changing Struct to STRING and trying to process the data that way.
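If you do take the struct-to-STRING route, one way to keep the data queryable is Athena's JSON functions. A hedged sketch, assuming the attributes column has been redefined as a string in the Glue table, submitted here with boto3 (the results bucket is a placeholder):

import boto3

athena = boto3.client("athena")

# Assumes "attributes" is now a plain string column containing the raw JSON;
# json_extract_scalar can then pull out individual keys, hyphens included.
query = """
SELECT
  json_extract_scalar(attributes, '$["x-amz-request-id"]') AS request_id,
  json_extract_scalar(attributes, '$.action') AS action,
  json_extract_scalar(attributes, '$.label') AS label
FROM `pinpoint-test`.`pinpoint_testfirehose`
LIMIT 10
"""

resp = athena.start_query_execution(
    QueryString=query,
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder
)
print(resp["QueryExecutionId"])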