Cloud Functions triggered by Cloud Pub/Sub and Logs Viewer event_id issue

Since 2020-04-28, I have noticed that the function's context.event_id no longer equals the execution_id label in the Logs Viewer:
To reproduce the issue, create a Cloud Function triggered by Pub/Sub (here in Python):
import logging

def hello_pubsub(event, context):
    logging.info(context.event_id)
I expected to get an entry like this:
{
"textPayload": "447023927402809",
"insertId": "000000-599a0542-c78a-42e3-b0d0-bb455078dabf",
"resource": {
"type": "cloud_function",
"labels": {
"project_id": "xxxxxxxxx",
"region": "us-central1",
"function_name": "function-1"
}
},
"timestamp": "2020-04-30T20:07:12.125Z",
"severity": "INFO",
"labels": {
"execution_id": "447023927402809"
},
"logName": "projects/xxxxxxxxx/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
"trace": "projects/xxxxxxxxx/traces/cfa595b77b16d6f27a5f77c472ed0e20",
"receiveTimestamp": "2020-04-30T20:07:14.388866116Z"
}
But the entry contains a different execution_id:
{
"textPayload": "447023927402809",
"insertId": "000000-599a0542-c78a-42e3-b0d0-bb455078dabf",
"resource": {
"type": "cloud_function",
"labels": {
"project_id": "xxxxxxxxx",
"region": "us-central1",
"function_name": "function-1"
}
},
"timestamp": "2020-04-30T20:07:12.125Z",
"severity": "INFO",
"labels": {
"execution_id": "k994g1h0pte3"
},
"logName": "projects/xxxxxxxxx/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
"trace": "projects/xxxxxxxxx/traces/cfa595b77b16d6f27a5f77c472ed0e20",
"receiveTimestamp": "2020-04-30T20:07:14.388866116Z"
}
Any ideas about this change? The release notes page doesn't contain any reference to it:
https://cloud.google.com/functions/docs/release-notes
Thanks,
Philippe

Unfortunately it doesn't seem like this is currently possible.
I've filed an issue internally requesting this feature, and will update this answer if I have updates.
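In the meantime, one workaround is to log the Pub/Sub event_id yourself in a structured payload, so you can still search for it in the Logs Viewer even though the execution_id label no longer matches it. A minimal sketch, assuming the runtime parses JSON written to stdout as a structured jsonPayload (the pubsub_event_id field name is just an illustration):
import json

def hello_pubsub(event, context):
    # Emit one structured log line that carries the Pub/Sub event id explicitly,
    # so it can be filtered with e.g. jsonPayload.pubsub_event_id="447023927402809".
    print(json.dumps({
        "severity": "INFO",
        "message": "received pubsub message",
        "pubsub_event_id": context.event_id,  # hypothetical field name
    }))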

Related

Error in using InputPath to select parts of input in a Step Functions workflow

I am creating a Step Functions workflow which has various steps, and I am referring to the topic "InputPath, ResultPath and OutputPath Examples" in the documentation. I am trying to check the identity and address of a person in my workflow, as shown in their document. I'm passing the input for the Verify identity step inside Parameters within the state machine definition. My workflow looks like this.
Note: When I run this, I am getting the error: An error occurred while executing the state 'Verify identity' (entered at the event id #19). Invalid path '$.identity' : Property ['identity'] not found in path $
What am I doing wrong here? Can someone please explain?
Thanks..
{
"StartAt": "Step1",
"States": {
"Step1": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
...something...
},
"Next": "Step2"
},
"Step2": {
"Type": "Choice",
"Choices": [
Do something...
],
"Default": "Step3.1"
},
"Step3.1": {
"Type": "Task",
...something...
}
},
"Next": "Step3.3"
},
...something...,
"Step4": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "Verify identity",
"States": {
"Verify identity": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"InputPath": "$.identity",
"Parameters": {
"Payload": {
"identity": {
"email": "jdoe#example.com",
"ssn": "123-45-6789"
},
"firstName": "Jane",
"lastName": "Doe"
},
"FunctionName": "{Lambda ARN}"
},
"End": true
}
}
},
{
"StartAt": "Verify address",
"States": {
"Verify address": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"Payload": {
"street": "123 Main St",
"city": "Columbus",
"state": "OH",
"zip": "43219"
},
"FunctionName": "{Lambda ARN}"
},
"End": true
}
}
}
],
"Next": "Step5"
},
"Step5": {
"Type": "Task",
"Parameters": {
something...
},
"End": true
}
}
You don't have an explicit transition in your example to call Step4, but assuming the order you have defined (step1 -> step2 -> step3.1 -> step3.3 -> step4), the output from Step3.3 should be something like:
{
  "cat": "meow",
  "dog": "woof",
  "identity": {          // this is what's missing
    "email": "jdoe@example.com",
    "ssn": "123-45-6789"
  }
}
This is what gets passed to each branch of your parallel state (Step4).
However, since you have an InputPath defined for Step4."Verify identity", the effective input to the task becomes:
{
  "email": "jdoe@example.com",
  "ssn": "123-45-6789"
}
The error you're seeing:
An error occurred while executing the state 'Verify identity' (entered at the event id #19). Invalid path '$.identity' : Property ['identity'] not found in path $
means the "identity" key (aka $.identity) isn't getting added to the output of Step3.3 (aka $)

Google Cloud Run randomly spiking requests

I've deployed two Hasura instances via Cloud Run, but one of the containers has been periodically getting random spikes in requests. As far as I can see, this is not being initiated by any of our frontends, and the spikes look irregular. Weirdly enough, this issue is only happening on one of our instances.
Getting the following messages for each request:
#1:
{
"insertId": "x",
"jsonPayload": {
"type": "webhook-log",
"detail": {
"http_error": null,
"response": null,
"message": null,
"method": "GET",
"status_code": 200,
"url": "x/auth"
},
"timestamp": "2021-08-26T22:35:40.857+0000",
"level": "info"
},
"resource": {
"type": "cloud_run_revision",
"labels": {
"service_name": "x",
"configuration_name": "x",
"location": "us-central1",
"project_id": "x",
"revision_name": "x"
}
},
"timestamp": "2021-08-26T22:35:41.839935Z",
"labels": {
"instanceId": "x"
},
"logName": "x",
"receiveTimestamp": "2021-08-26T22:35:42.002274277Z"
}
#2:
{
"insertId": "x",
"jsonPayload": {
"timestamp": "2021-08-26T22:35:40.857+0000",
"detail": {
"user_vars": null,
"event": {
"type": "accepted"
},
"connection_info": {
"msg": null,
"token_expiry": null,
"websocket_id": "x"
}
},
"level": "info",
"type": "websocket-log"
},
"resource": {
"type": "cloud_run_revision",
"labels": {
"project_id": "x",
"revision_name": "x",
"service_name": "x",
"configuration_name": "x",
"location": "us-central1"
}
},
"timestamp": "2021-08-26T22:35:41.839957Z",
"labels": {
"instanceId": "x"
},
"logName": "x",
"receiveTimestamp": "2021-08-26T22:35:42.002274277Z"
}
Drawing a blank right now as to what's going on. Any advice is helpful!!
Got it! It turns out it was a few open WebSocket connections from users keeping their browser tabs open.
Lesson learned!

AWS Stepfunction, ValidationException

I got the error "The provided key element does not match the schema" while getting data from AWS DynamoDB using a Step Function.
Step Function definition:
{
"Comment": "This is your state machine",
"StartAt": "Choice",
"States": {
"Choice": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.data.Type",
"StringEquals": "GET",
"Next": "DynamoDB GetItem"
},
{
"Variable": "$.data.Type",
"StringEquals": "PUT",
"Next": "DynamoDB PutItem"
}
]
},
"DynamoDB GetItem": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:getItem",
"Parameters": {
"TableName": "KeshavDev",
"Key": {
"Email": {
"S": "$.Email"
}
}
},
"End": true
},
"DynamoDB PutItem": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:putItem",
"Parameters": {
"TableName": "KeshavDev",
"Item": {
"City": {
"S.$": "$.City"
},
"Email": {
"S.$": "$.Email"
},
"Address": {
"S.$": "$.Address"
}
}
},
"InputPath": "$.data",
"End": true
}
}
}
Input
{
"data": {
"Type": "GET",
"Email": "demo#gmail.com"
}
}
Error
{ "resourceType": "dynamodb", "resource": "getItem", "error":
"DynamoDB.AmazonDynamoDBException", "cause": "The provided key
element does not match the schema (Service: AmazonDynamoDBv2; Status
Code: 400; Error Code: ValidationException; Request ID:
a78c3d7a-ca3f-4483-b986-1735201d4ef2; Proxy: null)" }
I see some potential issues with the getItem task when compared to the AWS documentation.
I think the Key field needs to use S.$, similar to what you have in your putItem task.
There is no ResultPath attribute to tell the state machine where to put the results.
Your path may not be correct; try $.data.Email:
"DynamoDB GetItem": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:getItem",
"Parameters": {
"TableName": "KeshavDev",
"Key": {
"Email": {
"S.$": "$.data.Email"
}
}
},
"ResultPath": "$.DynamoDB",
"End": true
},
To be honest, I'm not sure if one or all of these are contributing to the validation error; those are some things to experiment with.
On another note, there are some open-source validators for Amazon States Language, but in this case they were not very helpful and said that your code was valid.
It's working after following the steps JD D mentioned above, and also adding both keys in the Step Function definition.
The DynamoDB table has two keys:
a primary partition key
a primary sort key
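For reference, the same rule applies outside Step Functions: when a table is defined with both a partition key and a sort key, GetItem must be given both attributes, otherwise DynamoDB rejects the request with "The provided key element does not match the schema". A minimal boto3 sketch (the sort-key name "City" is only an assumption for illustration):
import boto3

dynamodb = boto3.client("dynamodb")

response = dynamodb.get_item(
    TableName="KeshavDev",
    Key={
        "Email": {"S": "demo@gmail.com"},  # partition key
        "City": {"S": "Columbus"},         # sort key -- only needed because the table defines one (name assumed)
    },
)
print(response.get("Item"))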

Monitoring API in Google gives "By" as response

I am reading monitoring data through the Google TimeSeries API. The API is working correctly, and if I give alignmentPeriod=3600s it gives me the values for that time series between the start and end time for any metric type.
I am calling it through Python like this:
service.projects().timeSeries().list(
    name=api_args["project_name"],
    filter=api_args["metric_filter"],
    aggregation_alignmentPeriod=api_args["aggregation_alignment_period"],
    # aggregation_crossSeriesReducer=api_args["crossSeriesReducer"],
    aggregation_perSeriesAligner=api_args["perSeriesAligner"],
    aggregation_groupByFields=api_args["group_by"],
    interval_endTime=api_args["end_time_str"],
    interval_startTime=api_args["start_time_str"],
    pageSize=config.PAGE_SIZE,
    pageToken=api_args["nextPageToken"]
).execute()
and in Postman:
https://monitoring.googleapis.com/v3/projects/my-project/timeSeries?pageSize=500&interval.startTime=2020-07-04T16%3A39%3A37.230000Z&aggregation.alignmentPeriod=3600s&aggregation.perSeriesAligner=ALIGN_SUM&filter=metric.type%3D%22compute.googleapis.com%2Finstance%2Fnetwork%2Freceived_bytes_count%22+&pageToken=&interval.endTime=2020-07-04T17%3A30%3A01.497Z&alt=json&aggregation.groupByFields=metric.labels.key
I face an issue here:
{
"metric": {
"labels": {
"instance_name": "insta-demo1",
"loadbalanced": "false"
},
"type": "compute.googleapis.com/instance/network/received_bytes_count"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "1234343552",
"zone": "us-central1-f",
"project_id": "my-project"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2020-07-04T16:30:01.497Z",
"endTime": "2020-07-04T17:30:01.497Z"
},
"value": {
"int64Value": "6720271"
}
}
]
},
{
"metric": {
"labels": {
"loadbalanced": "true",
"instance_name": "insta-demo2"
},
"type": "compute.googleapis.com/instance/network/received_bytes_count"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "1234566343",
"project_id": "my-project",
"zone": "us-central1-f"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2020-07-04T16:30:01.497Z",
"endTime": "2020-07-04T17:30:01.497Z"
},
"value": {
"int64Value": "579187"
}
}
]
}
],
"unit": "By". //This "By" is the value which is causing problem,
I am getting a value like "unit": "By" or "unit": "ms" at the end of the response. Also, if I don't find any data for a range, I still get this value; since I am evaluating the response in Python, I get a KeyError because the response then contains nothing but the "unit" key:
logMessage: "Key Error: ' '"
severity: "ERROR"
When the response is empty, I get only the single key "unit". Also, at the end of any response I get "unit": "ms" or "unit": "By". Is there any way to prevent that unit value from coming in the response?
I am new to Google Cloud APIs and Python. What can I try next?
The "unit" field expresses the kind of resource the metric is counting. For bytes, it is "By". Read this. I understand it is always returned, so there is no way of not receiving it; I recommend you to adapt your code to correctly deal with its appearance in the responses.

AWS Data Pipeline stuck on Waiting For Runner

My goal is to copy a table in a PostgreSQL database running on AWS RDS to a .csv file on Amazon S3. For this I use AWS Data Pipeline and found the following tutorial; however, when I follow all the steps, my pipeline is stuck at "WAITING FOR RUNNER" (see screenshot). The AWS documentation states:
ensure that you set a valid value for either the runsOn or workerGroup
fields for those tasks
however the field "runs on" is set. Any idea why this pipeline is stuck?
and my definition file:
{
"objects": [
{
"output": {
"ref": "DataNodeId_Z8iDO"
},
"input": {
"ref": "DataNodeId_hEUzs"
},
"name": "DefaultCopyActivity01",
"runsOn": {
"ref": "ResourceId_oR8hY"
},
"id": "CopyActivityId_8zaDw",
"type": "CopyActivity"
},
{
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"name": "DefaultResource1",
"id": "ResourceId_oR8hY",
"type": "Ec2Resource",
"terminateAfter": "1 Hour"
},
{
"*password": "xxxxxxxxx",
"name": "DefaultDatabase1",
"id": "DatabaseId_BWxRr",
"type": "RdsDatabase",
"region": "eu-central-1",
"rdsInstanceId": "aqueduct30v05.cgpnumwmfcqc.eu-central-1.rds.amazonaws.com",
"username": "xxxx"
},
{
"name": "DefaultDataFormat1",
"id": "DataFormatId_wORsu",
"type": "CSV"
},
{
"database": {
"ref": "DatabaseId_BWxRr"
},
"name": "DefaultDataNode2",
"id": "DataNodeId_hEUzs",
"type": "SqlDataNode",
"table": "y2018m07d12_rh_ws_categorization_label_postgis_v01_v04",
"selectQuery": "SELECT * FROM y2018m07d12_rh_ws_categorization_label_postgis_v01_v04 LIMIT 100"
},
{
"failureAndRerunMode": "CASCADE",
"resourceRole": "DataPipelineDefaultResourceRole",
"role": "DataPipelineDefaultRole",
"pipelineLogUri": "s3://rutgerhofste-data-pipeline/logs",
"scheduleType": "ONDEMAND",
"name": "Default",
"id": "Default"
},
{
"dataFormat": {
"ref": "DataFormatId_wORsu"
},
"filePath": "s3://rutgerhofste-data-pipeline/test",
"name": "DefaultDataNode1",
"id": "DataNodeId_Z8iDO",
"type": "S3DataNode"
}
],
"parameters": []
}
Usually "WAITING FOR RUNNER" state implies that it is waiting for a resource (such as an EMR cluster). You seem to have not set 'workGroup' field. It means that you have specified "What" to do, but have not specified "who" should do it.