I have a simple Cloud Tasks queue and have successfully submitted a task to it. The task is supposed to deliver a JSON payload to my API to perform a basic database update. The task is created at the end of a process in a .NET Core 3.1 app running locally on my desktop (triggered via Postman), and the API is a Go app running in Cloud Run. However, the task never seems to fire and never registers an error.
The "tasks in queue" count is always 0 and "tasks running" is always blank. I have hit the "Run Now" button dozens of times, but it never changes anything, and no log entries or failed attempts are ever registered.
The task is created with an OIDC token whose service account and audience are set to the service account that is authorized to create tokens and invoke the Cloud Run service.
[Screenshot of the Cloud Tasks queue in the Google Cloud Console]
Task creation log entry shows that it was created OK:
{
  "insertId": "efq7sxb14",
  "jsonPayload": {
    "taskCreationLog": {
      "targetAddress": "PUT https://{redacted}",
      "targetType": "HTTP",
      "scheduleTime": "2020-04-25T01:15:48.434808Z",
      "status": "OK"
    },
    "@type": "type.googleapis.com/google.cloud.tasks.logging.v1.TaskActivityLog",
    "task": "projects/{redacted}/locations/us-central1/queues/database-updates/tasks/0998892809207251757"
  },
  "resource": {
    "type": "cloud_tasks_queue",
    "labels": {
      "target_type": "HTTP",
      "project_id": "{redacted}",
      "queue_id": "database-updates"
    }
  },
  "timestamp": "2020-04-25T01:15:48.435878120Z",
  "severity": "INFO",
  "logName": "projects/{redacted}/logs/cloudtasks.googleapis.com%2Ftask_operations_log",
  "receiveTimestamp": "2020-04-25T01:15:49.469544393Z"
}
Any ideas as to why the tasks are not running? This is my first time using Cloud Tasks so don't rule out the idiot between the keyboard and the chair.
Thanks!
You might be using a non-default service. See Configuring Cloud Tasks queues.
Try creating a task from the command line and watch the logs, e.g.:
gcloud tasks create-app-engine-task --queue=default \
--method=POST --relative-uri=/update_counter --routing=service:worker \
--body-content=10
In my own case, I used --routing=service:api and it worked straight away. Then I added AppEngineRouting to the AppEngineHttpRequest.
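In code that looks roughly like the following; this is a minimal sketch using the Python google-cloud-tasks client (my original snippet isn't shown here), with placeholder project, location, and queue names:

from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
parent = client.queue_path("my-project", "us-central1", "default")

task = tasks_v2.Task(
    app_engine_http_request=tasks_v2.AppEngineHttpRequest(
        http_method=tasks_v2.HttpMethod.POST,
        relative_uri="/update_counter",
        # Route the task to a specific (non-default) App Engine service.
        app_engine_routing=tasks_v2.AppEngineRouting(service="worker"),
        body=b"10",
    )
)

response = client.create_task(parent=parent, task=task)
print(response.name)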
Any record logged from a GCP Cloud Function contains a labels.execution_id, e.g.:
{
  "textPayload": "Function execution started",
  "insertId": "12mylqhfm6hy8i",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "function_name": "redacted",
      "region": "europe-west2",
      "project_id": "redacted"
    }
  },
  "timestamp": "2022-09-26T10:57:26.917823762Z",
  "severity": "DEBUG",
  "labels": {
    "execution_id": "1l1qb00ft6kv"
  },
  "logName": "projects/redacted/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/redacted/traces/d2f793cf6e2fb149a8ce8dc6fd0498b4",
  "receiveTimestamp": "2022-09-26T10:57:26.920210899Z"
}
This is very useful for correlating all logs from a single invocation of the cloud function because it can be filtered upon in Logs Explorer:
labels.execution_id="1l1qb00ft6kv"
I see no equivalent for Cloud Run, though. Cloud Run logs do have labels.instance_id, but my understanding is that it pertains to the Cloud Run container instance, so it will be the same for all invocations handled by that instance. Hence it's not the same as Cloud Functions' labels.execution_id.
Does Cloud Run have an equivalent of Cloud Functions' execution_id or would I have to roll my own? If the latter, does anyone have any strategies for doing so?
No, there isn't an execution ID, only the instance ID. To get one, you can use instrumentation tools such as OpenTelemetry, as mentioned by Guillaume in the linked Stack Overflow question; you can also refer to this video. Alternatively, you can customize the app logs with a custom/random execution ID (similar to what OpenTelemetry does).
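If you roll your own, a minimal sketch (assuming a Python/Flask service on Cloud Run) is to generate a random ID per request and emit structured JSON log lines, which Cloud Logging parses into jsonPayload so you can filter on jsonPayload.execution_id; all names here are illustrative:

import json
import logging
import sys
import uuid

from flask import Flask, g

app = Flask(__name__)
logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")

@app.before_request
def assign_execution_id():
    # One random ID per request, reused by every log line in this invocation.
    g.execution_id = uuid.uuid4().hex

def log(message, severity="INFO"):
    # JSON written to stdout becomes jsonPayload in Cloud Logging.
    logging.info(json.dumps({
        "message": message,
        "severity": severity,
        "execution_id": g.execution_id,
    }))

@app.route("/")
def handler():
    log("handler started")
    log("handler finished")
    return "ok"

The Logs Explorer filter then becomes jsonPayload.execution_id="<id>".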
Also, have a look at link1 and link2, which might help.
I configured an AWS EventBridge rule (via the web GUI) for running an AWS Batch job. The rule is triggered, but I am getting the following error after invocation:
shareIdentifier must be specified. (Service: AWSBatch; Status Code: 400; Error Code: ClientException; Request ID: 07da124b-bf1d-4103-892c-2af2af4e5496; Proxy: null)
My job uses a scheduling policy and needs shareIdentifier to be set, but I don't know how to set it. Here is a screenshot of the rule's configuration:
There are no additional settings for further job arguments/parameters; the only thing I can configure is retries. I also checked the AWS CLI command for putting a rule (https://awscli.amazonaws.com/v2/documentation/api/latest/reference/events/put-rule.html), but it doesn't seem to have any additional settings either. Any suggestions on how to solve this, or working examples?
Edit:
I ended up using the Java SDK for AWS Batch: https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-batch. I have a scheduled method that periodically spawns jobs with the following piece of code:
AWSBatch client = AWSBatchClientBuilder.standard().withRegion("eu-central-1").build();

SubmitJobRequest request = new SubmitJobRequest()
        .withJobName("example-test-job-java-sdk")
        .withJobQueue("job-queue")
        // shareIdentifier is required because the queue uses a scheduling policy
        .withShareIdentifier("default")
        .withJobDefinition("job-type");

SubmitJobResult response = client.submitJob(request);
log.info("job spawn response: {}", response);
Have you tried providing additional settings to your target via the input transformer, as referenced in the AWS docs, AWS Batch Jobs as EventBridge Targets?
FWIW I'm running into the same problem.
I had a similar issue: from the CLI and the GUI, I just couldn't find a way to pass ShareIdentifier from an EventBridge rule. In the end I had to use a state machine (Step Function) instead:
"States": {
  "Batch SubmitJob": {
    "Type": "Task",
    "Resource": "arn:aws:states:::batch:submitJob.sync",
    "Parameters": {
      "JobName": <name>,
      "JobDefinition": <Arn>,
      "JobQueue": <QueueName>,
      "ShareIdentifier": <Share>
    },
    ...
As you can see, it handles ShareIdentifier fine.
I have an EMR step that is submitted through a Step Function. During the run I can see that the task is submitted, but the EMR step is not executed and the EMR console doesn't show any information about it.
How can I debug this?
How can I send an SNS notification when a step doesn't start execution within a threshold timeframe? In my case the Step Function shows the EMR task as submitted, but there is no information in the EMR console, and the pipeline keeps running without failing for more than half an hour.
You could start the debugging process with the Step Functions execution log to identify the specific step that has failed, and then move on to the EMR console or the specific service that failed. Usually, when the EMR step doesn't appear in the EMR console, it is due to a runtime error caused by an exception raised when calling the EMR step.
For this scenario, you can use Step Functions' error handling with the Catch and TimeoutSeconds fields; you can find more details in the AWS documentation here.
Basically, you need to add these fields as shown below:
{
  "StartAt": "EmrStep",
  "States": {
    "EmrStep": {
      "Type": "Task",
      "Resource": "arn:aws:emr:execute-X-step",
      "Comment": "This is your EMR step",
      "TimeoutSeconds": 10,
      "Catch": [
        {
          "ErrorEquals": ["States.Timeout"],
          "Next": "ShutdownClusterAndSendSNS"
        }
      ],
      "End": true
    },
    "ShutdownClusterAndSendSNS": {
      "Type": "Pass",
      "Comment": "This step handles the timeout exception raised",
      "Result": "You can shutdown the EMR cluster to avoid increased cost here and later send a sns notification!",
      "End": true
    }
  }
}
Note: to catch the timeout exception you have to catch the error States.Timeout, but you can also define the same Catch field for other types of error.
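To actually shut the cluster down and send the notification, one option (not shown above) is to replace the Pass state with a Task state that invokes a Lambda function. A minimal sketch of such a handler using boto3; the topic ARN and the ClusterId field in the event are placeholders for your own setup:

import boto3

sns = boto3.client("sns")
emr = boto3.client("emr")

# Placeholder topic ARN; replace with your own.
TOPIC_ARN = "arn:aws:sns:eu-central-1:123456789012:emr-step-alerts"

def handler(event, context):
    # The Catch state forwards the error details (and any original input) in the event.
    cluster_id = event.get("ClusterId")
    if cluster_id:
        # Terminate the cluster to avoid paying for a stuck pipeline.
        emr.terminate_job_flows(JobFlowIds=[cluster_id])
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject="EMR step timed out",
        Message=f"EMR step did not complete within the threshold. Event: {event}",
    )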
Based on Stackdriver alerts, I want to send notifications to my Centreon monitoring (behind Nagios) for workflow reasons. Do you have any idea how to do so?
Thank you
Stackdriver alerting allows webhook notifications, so you can run a server to forward the notifications anywhere you need to (including Centreon), and point the Stackdriver alerting notification channel to that server.
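A minimal sketch of such a forwarder, assuming Flask and the requests library; the Stackdriver webhook payload fields and the Centreon URL, token, host, and service names used here are illustrative and need adjusting to your setup (the submit API format follows the Centreon REST API described in the next answer):

import time

import requests
from flask import Flask, request

app = Flask(__name__)

# Illustrative values; replace with your own Centreon URL, token, host, and service.
CENTREON_SUBMIT_URL = "https://centreon.example.com/centreon/api/index.php?action=submit&object=centreon_submit_results"
CENTREON_AUTH_TOKEN = "..."

@app.route("/stackdriver", methods=["POST"])
def forward():
    incident = request.get_json(force=True).get("incident", {})
    # Map an open incident to CRITICAL (2) and a closed one to OK (0).
    status = "2" if incident.get("state") == "open" else "0"
    body = {
        "results": [{
            "updatetime": str(int(time.time())),
            "host": "Centreon-Central",
            "service": "stackdriver-alerts",
            "status": status,
            "output": incident.get("summary", "Stackdriver notification"),
        }]
    }
    requests.post(
        CENTREON_SUBMIT_URL,
        json=body,
        headers={"centreon-auth-token": CENTREON_AUTH_TOKEN},
    )
    return "", 204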
There are two ways to send external information into the Centreon queue without a traditional passive-agent mode.
First, you can use the Centreon DSM (Dynamic Services Management) addon.
It is interesting because you don't have to register a dedicated and already known service in your configuration to match the notification.
With Centreon DSM, Centreon can receive events such as SNMP traps resulting from the detection of a problem and assign the event dynamically to a slot defined in Centreon, like a tray event.
A resource has a set number of "slots" to which alerts are assigned (stored). As long as an event has not been acknowledged by human action, it remains visible in the Centreon web frontend. When the event is acknowledged, the slot becomes available for new events.
The event must be transmitted to the server via an SNMP Trap.
All the configuration is made through Centreon web interface after the module installation.
Complete explanations, screenshots, and tips are described on the online documentation: https://documentation.centreon.com/docs/centreon-dsm/en/latest/user.html
Secondly, Centreon developers added a Centreon REST API you can use to submit information to the monitoring engine.
This feature is easier to use than the SNMP Trap way.
In that case, you have to create both host/service objects before any API utilization.
To send a status, use the following URL with the POST method:
api.domain.tld/centreon/api/index.php?action=submit&object=centreon_submit_results
Headers:
  Content-Type: application/json
  centreon-auth-token: the value of authToken you got on the authentication response
Example of a service submit body: the body is JSON with the parameters described above, formatted as below:
{
  "results": [
    {
      "updatetime": "1528884076",
      "host": "Centreon-Central",
      "service": "Memory",
      "status": "2",
      "output": "The service is in CRITICAL state",
      "perfdata": "perf=20"
    },
    {
      "updatetime": "1528884076",
      "host": "Centreon-Central",
      "service": "fake-service",
      "status": "1",
      "output": "The service is in WARNING state",
      "perfdata": "perf=10"
    }
  ]
}
Example of a response body: the response is JSON with an HTTP return code and a message for each submitted result:
{
  "results": [
    {
      "code": 202,
      "message": "The status send to the engine"
    },
    {
      "code": 404,
      "message": "The service is not present."
    }
  ]
}
More information is available in the online documentation: https://documentation.centreon.com/docs/centreon/en/19.04/api/api_rest/index.html
The Centreon REST API also lets you get real-time status for hosts and services and manage object configuration.
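For completeness, here is a small sketch of obtaining the token and submitting one result with Python's requests library. The authenticate action and its field names follow the legacy Centreon Web API as I understand it, so treat them as assumptions and verify them against your Centreon version:

import requests

BASE = "https://api.domain.tld/centreon/api/index.php"

# Assumed authentication endpoint: returns the token used in the centreon-auth-token header.
auth = requests.post(
    BASE,
    params={"action": "authenticate"},
    data={"username": "admin", "password": "secret"},
).json()
token = auth["authToken"]

# Submit one service result, matching the body format shown above.
body = {"results": [{
    "updatetime": "1528884076",
    "host": "Centreon-Central",
    "service": "Memory",
    "status": "2",
    "output": "The service is in CRITICAL state",
    "perfdata": "perf=20",
}]}

resp = requests.post(
    BASE,
    params={"action": "submit", "object": "centreon_submit_results"},
    headers={"centreon-auth-token": token},
    json=body,
)
print(resp.json())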
I am writing a script that will create an EFS file system with a name from input. I am using the AWS SDK for PHP Version 3.
I am able to create the file system using the createFileSystem command. This new file system is not usable until it has a mount target created. If I run the CreateMountTarget command after the createFileSystem command then I receive an error that the file system's life cycle state is not in the 'available' state.
I have tried using createFileSystemAsync to create a promise and calling the wait function on that promise to force the script to run synchronously. However, the promise is always fulfilled while the file system is still in 'creating' life cycle state.
Is there a way to force the script to wait for the file system to be in the available state using the AWS SDK?
One way is to check the status of the file system using the DescribeFileSystems API. In the response, look at the LifeCycleState; if it is available, fire the CreateMountTarget API. You can keep calling DescribeFileSystems in a loop with a few seconds' delay until the LifeCycleState is available.
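The loop looks roughly like this; it is sketched with Python's boto3 purely to illustrate the pattern (the creation token and subnet ID are placeholders), and the same sequence of describeFileSystems / createMountTarget calls applies with the AWS SDK for PHP:

import time

import boto3

efs = boto3.client("efs")

def wait_until_available(file_system_id, delay=5, max_attempts=60):
    # Poll DescribeFileSystems until the file system reports LifeCycleState "available".
    for _ in range(max_attempts):
        resp = efs.describe_file_systems(FileSystemId=file_system_id)
        if resp["FileSystems"][0]["LifeCycleState"] == "available":
            return
        time.sleep(delay)
    raise TimeoutError(f"File system {file_system_id} never became available")

fs = efs.create_file_system(CreationToken="my-creation-token")
wait_until_available(fs["FileSystemId"])
efs.create_mount_target(FileSystemId=fs["FileSystemId"], SubnetId="subnet-0123456789abcdef0")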
It looks like you want a waiter for FileSystemAvailable, but the elasticfilesystem files don't specify one. I'd file an issue on GitHub asking for one. You'd need to wait for DescribeFileSystems to have a LifeCycleState of available.
In the meantime, you can probably write your own along the lines of the following, per the waiters guide.
{
  "version": 2,
  "FileSystemAvailable": {
    "delay": 15,
    "operation": "DescribeFileSystems",
    "maxAttempts": 40,
    "acceptors": [
      {
        "expected": "available",
        "matcher": "pathAll",
        "state": "success",
        "argument": "FileSystems[].LifeCycleState"
      },
      {
        "expected": "deleted",
        "matcher": "pathAny",
        "state": "failure",
        "argument": "FileSystems[].LifeCycleState"
      },
      {
        "expected": "deleting",
        "matcher": "pathAny",
        "state": "failure",
        "argument": "FileSystems[].LifeCycleState"
      }
    ]
  }
}
Promises in the AWS SDK for PHP are used for making HTTP requests concurrently. They don't help in this case because the API call merely starts an asynchronous task in EFS; it returns as soon as creation has begun, not when the file system is available.