Error when deploying google cloud function - run out of memory? - google-cloud-platform

I have used the following deployment for the example code from the Google Cloud Functions tutorial. The function should simply print the statements below when a new item is added to my bucket (which happens every half hour).
Example code (file is also called hello_gs.py):
def hello_gcs(event, context):
    print('Event ID: {}'.format(context.event_id))
    print('Event type: {}'.format(context.event_type))
    print('Bucket: {}'.format(event['bucket']))
    print('File: {}'.format(event['name']))
    print('Metageneration: {}'.format(event['metageneration']))
    print('Created: {}'.format(event['timeCreated']))
    print('Updated: {}'.format(event['updated']))
I deploy it with:
gcloud functions deploy hello_gcs \
--trigger-resource bucket1 \
--trigger-event google.storage.object.finalize
I get the following error in my logs:
insertId: "000000-f7b8ac5b-61f2-4d37-902a-b21ab56372c9"
labels: {1}
logName: "projects/project-name-v2/logs/cloudfunctions.googleapis.com%2Fcloud-functions"
receiveTimestamp: "2021-10-20T11:38:19.093774441Z"
resource: {2}
severity: "ERROR"
textPayload: "Function cannot be initialized. Error: memory limit exceeded.
"
timestamp: "2021-10-20T11:38:18.112056018Z"
And yet the function is so simple and small that I find this hard to understand.
Any ideas what I am doing wrong here? Any help would be appreciated.
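For reference, the memory allocation can be set explicitly at deploy time. A minimal sketch; the runtime and the 512MB value are illustrative, not taken from the original deployment:
# assumes the default 256MB allocation is the limit being hit
gcloud functions deploy hello_gcs \
  --runtime python39 \
  --trigger-resource bucket1 \
  --trigger-event google.storage.object.finalize \
  --memory 512MB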

Related

Prometheus series values for time metrics

I'm defining a data series for testing a Prometheus alert, using the container_last_seen metric from the cadvisor exporter.
How do I enter timestamp series values like those returned by the container_last_seen metric? I'm testing the alerts on an Apple Mac; in production they run on Linux boxes.
Here's one thing I tried:
input_series:
  - series: |
      container_last_seen{container_label_com_docker_swarm_service_name="service1",env="prod",instance="10.0.0.1"}
    values: '1563968832+0x61'
It seems whatever I put in the values for the series is not accepted.
I've also tried durations: '0h+1mx60'
Since time() - container_last_seen{...} is a legal expression, container_last_seen is definitely a timestamp, and I would expect a timestamp to be represented by a Unix epoch number. Executing the query on Prometheus gives Unix epoch times, but putting numbers in a series is rejected with the error below.
promtool is recognising the different types but giving much the same error:
➜ promtool test rules alertrules-service-oriented-test.yml
Unit Testing: alertrules-service-oriented-test.yml
FAILED:
1:1: parse error: unexpected number "0" in series values
If the values are '1h+0mx61', promtool correctly identifies the values as durations:
1:1: parse error: unexpected duration "1h" in series values
Note that when this test is commented out, there is no 1:1: parse error and the tests complete successfully, so the problem is not in some out-of-sight part of the test file.
Thanks for any insights.
Here's the alert:
alertrules.yaml:
- name: containers
  interval: 15s
  rules:
    - alert: prod_container_crashing
      expr: |
        count by (instance, container_label_com_docker_swarm_service_name)
        (
          count_over_time(container_last_seen{container_label_com_docker_swarm_service_name!="",env="prod"}[15m])
        ) - 1 > 2
      for: 5m
      labels:
        service: prod
        type: container
        severity: critical
      annotations:
        summary: "pdce {{ $labels.container_label_com_docker_swarm_service_name }}"
        description: "{{ $labels.container_label_com_docker_swarm_service_name }} in prod cluster on {{ $labels.instance }} is crashing"
and here's the test file:
alertrules_test.yml:
rule_files:
  - alertrules.yml
evaluation_interval: 1m
tests:
  - name: container_tests
    interval: 15s
    input_series:
      - series: |
          container_last_seen{container_label_com_docker_swarm_service_name="service1",env="prod",instance="10.0.0.1"}
        values: '1563968832+0x61'
    alert_rule_test:
      - eval_time: 15m
        alertname: prod_container_crashing
        exp_alerts:
          - exp_labels:
              service: prod
              type: container
              severity: critical
            exp_annotations:
              summary: prod service1
              description: service1 in prod cluster on 10.0.0.1 is crashing
When the series: value is all on one line, without a > or | YAML block scalar indicator, e.g.
- series: container_last_seen{container_label_com_docker_swarm_service_name="service1",env="prod",instance="10.0.0.1"}
  values: '1563968832+0x61'
the error goes away, and I don't know why. So this doesn't appear to be a data-typing issue.
It's a shame for readability reasons; either Prometheus or the Go YAML library may have a squeaky wheel in its implementation.
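For reference, here is a minimal shell reproduction of the single-line form that parses for me; the alert assertions are left out, so this only exercises the series parsing (it assumes promtool is on the PATH and alertrules.yml sits in the same directory):
# write a reduced test file with the selector quoted on one line, then check it
cat > alertrules_test.yml <<'EOF'
rule_files:
  - alertrules.yml
evaluation_interval: 1m
tests:
  - name: container_tests
    interval: 15s
    input_series:
      - series: 'container_last_seen{container_label_com_docker_swarm_service_name="service1",env="prod",instance="10.0.0.1"}'
        values: '1563968832+0x61'
EOF
promtool test rules alertrules_test.yml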

GCP Dataproc - Error: Unknown name "optionalComponents" at 'cluster.config': Cannot find field

I am trying to create a Dataproc cluster using the configuration specified in a YAML file (using import):
The command I have been using successfully:
$ gcloud beta dataproc clusters import $CLUSTER_NAME --region=$REGION \
    --source=cluster_conf_file.yaml
Later on I tried adding the HBASE component, which is one of the available optional components, using the --optional-components attribute:
$ gcloud beta dataproc clusters import $CLUSTER_NAME --optional-components=HBASE --region=$REGION \
    --source=cluster_conf_file.yaml
(Documentation referred to:
https://cloud.google.com/dataproc/docs/concepts/components/hbase#installing_the_component)
This caused the error below:
ERROR: (gcloud.beta.dataproc.clusters.import) unrecognized arguments: --optional-components=HBASE
Then I tried adding the --optional-components attribute as optionalComponents in the YAML file (instead of passing it through the command line), referring to this documentation.
Sample YAML:
config:
  endpointConfig:
    enableHttpPortAccess: BOOLEAN_VALUE
  configBucket: BUCKET_NAME
  gceClusterConfig:
    serviceAccount: SERVICE_ACCOUNT
    subnetworkUri: SUBNETWORK_URI
    tags:
      - Tag1
      - TAG2
  optionalComponents:    # <---- attribute causing the error
    - HBASE
  softwareConfig:
    imageVersion: IMAGE_VERSION
    properties:
      PROPERTY: VALUE
      .
      .
      .
  masterConfig:
    diskConfig:
      bootDiskSizeGb: SIZE
      bootDiskType: TYPE
    machineTypeUri: TYPE_URI
    numInstances: COUNT
This caused the error below:
ERROR: (gcloud.dataproc.clusters.import) INVALID_ARGUMENT: Invalid JSON payload received. Unknown name "optionalComponents" at 'cluster.config': Cannot find field.
- '#type': type.googleapis.com/google.rpc.BadRequest
fieldViolations:
- description: "Invalid JSON payload received. Unknown name \"optionalComponents\"\
\ at 'cluster.config': Cannot find field."
field: cluster.config
Is there a way to fix this?
optionalComponents should be under config.softwareConfig:
config:
  ...
  softwareConfig:
    imageVersion: IMAGE_VERSION
    optionalComponents:
      - ZOOKEEPER
      - HBASE
You can prove it by first creating a cluster with optional components, then exporting it to a YAML file.
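A sketch of that approach; the cluster name, region, and image version are illustrative:
# create a cluster with the components via the create command, which accepts the flag
gcloud beta dataproc clusters create example-cluster \
  --region=us-central1 \
  --optional-components=ZOOKEEPER,HBASE \
  --image-version=2.0

# then export the resulting config to see where the field ends up in the YAML
gcloud beta dataproc clusters export example-cluster \
  --region=us-central1 \
  --destination=cluster_conf_file.yaml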

Getting 400 Bad Request Error while creating S3 Batch Job from Java Code

As per the doc, I am trying to create a batch job from Java code.
I am able to create a job from the console with the same role and Lambda ARN, but from code I am getting a 400 Bad Request. Also, I don't see any error message as per this doc.
Here is my code snippet:
JobOperation jobOperation = new JobOperation().withLambdaInvoke(new LambdaInvokeOperation()
        .withFunctionArn("arn:aws:lambda:eu-west-1:<account_id>:function:s3BatchOperarationsPOCLambda"));

JobManifest manifest = new JobManifest()
        .withSpec(new JobManifestSpec().withFormat(JobManifestFormat.S3InventoryReport_CSV_20161130)
                .withFields(new String[] { "Bucket", "Key" }))
        .withLocation(new JobManifestLocation().withObjectArn("arn:aws:s3:::<bucket_name>/manifest.csv")
                .withETag("e55392fa1ad40a08e40b13b3c000a0aa"));

JobReport jobReport = new JobReport().withBucket(reportBucketName).withPrefix("testreport")
        .withFormat(JobReportFormat.Report_CSV_20180820).withEnabled(true).withReportScope("AllTasks");

AWSS3Control s3ControlClient = AWSS3ControlClientBuilder.standard().withRegion(Regions.US_WEST_1).build();

String roleArn = "arn:aws:iam::<account_id>:role/S3-Batch-Role";
String accountId = "<account_id>";

try {
    s3ControlClient.createJob(new CreateJobRequest().withAccountId(accountId).withOperation(jobOperation)
            .withManifest(manifest).withPriority(12).withRoleArn(roleArn).withReport(jobReport)
            .withClientRequestToken(uuid).withDescription("S3 job").withConfirmationRequired(false));
} catch (AmazonServiceException e) {
    // The call was transmitted successfully, but Amazon S3 couldn't process
    // it and returned an error response.
    e.printStackTrace();
} catch (SdkClientException e) {
    System.out.println("test2" + e.getMessage());
    // Amazon S3 couldn't be contacted for a response, or the client
    // couldn't parse the response from Amazon S3.
    e.printStackTrace();
}
The role has full IAM and S3 Batch Operations permissions, and the Lambda has access permissions for S3.
The trust policy is also defined for batch operations.
Here is my error log -
(Service: AWSS3Control; Status Code: 400; Error Code: 400 Bad Request; Request ID: null; Proxy: null)
com.amazonaws.services.s3control.model.AWSS3ControlException: null (Service: AWSS3Control; Status Code: 400; Error Code: 400 Bad Request; Request ID: null; Proxy: null)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1811)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1395)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1371)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
at com.amazonaws.services.s3control.AWSS3ControlClient.doInvoke(AWSS3ControlClient.java:1532)
at com.amazonaws.services.s3control.AWSS3ControlClient.invoke(AWSS3ControlClient.java:1499)
at com.amazonaws.services.s3control.AWSS3ControlClient.invoke(AWSS3ControlClient.java:1488)
at com.amazonaws.services.s3control.AWSS3ControlClient.executeCreateJob(AWSS3ControlClient.java:265)
at com.amazonaws.services.s3control.AWSS3ControlClient.createJob(AWSS3ControlClient.java:236)
at com.code.platformintegrationsscheduler.handlers.test.createS3Job(test.java:68)
at com.code.platformintegrationsscheduler.handlers.test.main(test.java:27)
I was stuck with the same issue today, and after some debugging and trying out the same operation on the CLI, I found that
new JobReport().withBucket(reportBucketName)
takes a bucket ARN instead of a bucket name.
The actual issue might be different in your case. I suggest you serialize your request from code and try out the same operation in CLI and match both the requests.
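A rough CLI equivalent to diff against might look like this (the account ID, ARNs, and bucket names are placeholders; note that the report Bucket is an ARN):
aws s3control create-job \
  --account-id 111122223333 \
  --region us-west-1 \
  --operation '{"LambdaInvoke":{"FunctionArn":"arn:aws:lambda:eu-west-1:111122223333:function:s3BatchOperarationsPOCLambda"}}' \
  --manifest '{"Spec":{"Format":"S3InventoryReport_CSV_20161130","Fields":["Bucket","Key"]},"Location":{"ObjectArn":"arn:aws:s3:::my-bucket/manifest.csv","ETag":"e55392fa1ad40a08e40b13b3c000a0aa"}}' \
  --report '{"Bucket":"arn:aws:s3:::my-report-bucket","Prefix":"testreport","Format":"Report_CSV_20180820","Enabled":true,"ReportScope":"AllTasks"}' \
  --priority 12 \
  --role-arn arn:aws:iam::111122223333:role/S3-Batch-Role \
  --description "S3 job" \
  --no-confirmation-required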
AWS Error messages are often not very helpful when we actually need them.
I found the issue; it was related to the Gradle versions. We need to make sure all the AWS SDK Gradle dependency versions are the same.
In my case -
compile group: 'com.amazonaws', name: 'aws-java-sdk-dynamodb', version: '1.11.844'
compile group: 'com.amazonaws', name: 'aws-java-sdk-iam', version: '1.11.844'
compile group: 'com.amazonaws', name: 'aws-java-sdk-events', version: '1.11.844'
compile group: 'com.amazonaws', name: 'aws-java-sdk-s3', version: '1.11.844'
compile group: 'com.amazonaws', name: 'aws-java-sdk-batch', version: '1.11.844'
compile group: 'com.amazonaws', name: 'aws-java-sdk-s3control', version:'1.11.844'

Google Cloud Scheduler calling Google Cloud Function still gets 500 status even when the Function has succeeded (returned 200 status)

I have a Scheduler job that calls my Function once an hour. The function is working without a problem and returning an HTTP 200 status every time. However, every 3-5 invocations, the Scheduler job reports an HTTP 500 status. What is the cause of/fix for this problem?
Example: here is what the Function logs are showing:
D 2020-02-03T10:02:10.958520185Z Function execution took 130785 ms, finished with status code: 200
D 2020-02-03T11:01:40.819608573Z Function execution took 99762 ms, finished with status code: 200
D 2020-02-03T12:01:41.049430737Z Function execution took 100126 ms, finished with status code: 200
D 2020-02-03T13:02:07.369401657Z Function execution took 127213 ms, finished with status code: 200
D 2020-02-03T14:04:24.352839424Z Function execution took 263896 ms, finished with status code: 200
D 2020-02-03T15:03:14.664760657Z Function execution took 194125 ms, finished with status code: 200
D 2020-02-03T16:06:23.162542969Z Function execution took 382609 ms, finished with status code: 200
D 2020-02-03T17:03:17.458640891Z Function execution took 196799 ms, finished with status code: 200
D 2020-02-03T18:02:54.614556691Z Function execution took 170119 ms, finished with status code: 200
D 2020-02-03T19:04:43.064083790Z Function execution took 277775 ms, finished with status code: 200
D 2020-02-03T20:02:59.315497864Z Function execution took 178499 ms, finished with status code: 200
And these are examples from the Scheduler logs:
2020-02-03 11:03:00.567 CST
{"jobName":"projects/my-project/locations/us-west2/jobs/my-function-trigger","targetType":"HTTP","url":"https://us-central1-my-function.cloudfunctions.net/analytics-archiver","status":"UNKNOWN","#type":"type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"}
{
  httpRequest: {
  }
  insertId: "redacted"
  jsonPayload: {
    #type: "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"
    jobName: "projects/my-project/locations/us-west2/jobs/my-function-trigger"
    status: "UNKNOWN"
    targetType: "HTTP"
    url: "https://us-central1-my-project.cloudfunctions.net/my-function"
  }
  logName: "projects/my-project/logs/cloudscheduler.googleapis.com%2Fexecutions"
  receiveTimestamp: "2020-02-03T17:03:00.567786781Z"
  resource: {
    labels: {
      job_id: "my-function-trigger"
      location: "us-west2"
      project_id: "my-function"
    }
    type: "cloud_scheduler_job"
  }
  severity: "ERROR"
  timestamp: "2020-02-03T17:03:00.567786781Z"
}
2020-02-03 13:03:00.765 CST
{"jobName":"projects/my-project/locations/us-west2/jobs/my-function-trigger","targetType":"HTTP","url":"https://us-central1-my-project.cloudfunctions.net/my-function","status":"UNKNOWN","#type":"type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"}
{
  httpRequest: {
  }
  insertId: "redacted"
  jsonPayload: {
    #type: "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"
    jobName: "projects/my-project/locations/us-west2/jobs/my-function-trigger"
    status: "UNKNOWN"
    targetType: "HTTP"
    url: "https://us-central1-my-project.cloudfunctions.net/my-function"
  }
  logName: "projects/my-project/logs/cloudscheduler.googleapis.com%2Fexecutions"
  receiveTimestamp: "2020-02-03T19:03:00.765993324Z"
  resource: {
    labels: {
      job_id: "my-function-trigger"
      location: "us-west2"
      project_id: "my-project"
    }
    type: "cloud_scheduler_job"
  }
  severity: "ERROR"
  timestamp: "2020-02-03T19:03:00.765993324Z"
}
Considering the time these requests are taking, it seems this is causing a timeout on Cloud Scheduler, which ends up causing some invocations to be reported as failed.
As per the cron.yaml Reference documentation, you can configure the maximum time that your Scheduler job will wait before timing out. I would recommend taking a look at it, confirming how yours is configured, and trying to keep your invocations within the time set in your cron.yaml file.
Let me know if this information clarified things and helped you!
I had exactly the same problem. Although you cannot see it yet in the console, Google Cloud Scheduler has some flags you can set through the gcloud scheduler jobs create http command. This is an example I am using:
gcloud scheduler jobs create http my-job \
--schedule="0 * * * *" \
--uri=https://europe-west1-${PROJECT_ID}.cloudfunctions.net/my-func/ \
--http-method=POST \
--message-body="my-body" \
--max-retry-attempts 3 \
--max-backoff 10s \
--attempt-deadline 10m
Especially the --attempt-deadline flag seems to be important when running functions that last minutes instead of a few seconds. Setting these flags mitigated some of the problems for me, but not all. For more flags, I refer you to the documentation.
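For an existing job, the deadline can also be raised in place, for example (the job name and value here are illustrative):
gcloud scheduler jobs update http my-function-trigger --attempt-deadline=15m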
Furthermore, if this does not help, it is probably a server-side error somewhere under the hood at Google. This is suggested by the UNKNOWN error status, which you can look up in this table. I myself got an INTERNAL error status, which is also described as a server-side bug. Not particularly helpful of Google.

Error when creating indexes for flexible Cloud Datastore: Unexpected attribute 'indexes' for object of type AppInfoExternal

When I access the Cloud Datastore web management console, there are no indexes listed under the "Indexes" section, and I would like to explicitly define some indexes in order to run advanced queries. I have a YAML file that looks like:
indexes:
- kind: order
  ancestor: no
  properties:
  - name: email
  - name: name
  - name: ownerId
  - name: status
  - name: updated_at
  - name: created_at
    direction: desc
And I run the following command to create the indexes:
gcloud preview datastore create-indexes indexes.yaml
and this is the error message that I'm getting:
"Unexpected attribute 'indexes' for object of type AppInfoExternal"
Has anybody come across the same issue? Any ideas?
Regards,
Jose
Unfortunately, the create-indexes command is a little brittle: it requires that the index file you provide is named index.yaml and not indexes.yaml. Otherwise, it will try to parse it as a different type of configuration.
Try renaming your index file to index.yaml then calling the command again.
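Something like this, reusing the command from the question:
# rename the file, then re-run the same command against index.yaml
mv indexes.yaml index.yaml
gcloud preview datastore create-indexes index.yaml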