Dataflow logs from Stackdriver - google-cloud-platform

The resource.labels.region field for the dataflow_step logs in Stackdriver points to global even though the specified regional endpoint is europe-west2.
Any idea what exactly it is pointing to?

Once you've supplied the GCP Logs Viewer with the desired filtering options, for instance the simplest query based on your inputs, selecting the dataflow_step resource type:
resource.type="dataflow_step"
resource.labels.region="europe-west2"
You would probably observe query results retrieved from the Cloud Dataflow REST API, consisting of log entries formatted as JSON for all Dataflow jobs residing in your GCP project on the europe-west2 regional endpoint:
{
"insertId": "insertId",
"jsonPayload": {
....
"message": "Message content",
....
},
"resource": {
"type": "dataflow_step",
"labels": {
"job_id": "job_id",
"region": "europe-west2",
"job_name": "job_name",
"project_id": "project_id",
"step_id": "step_id"
}
},
"timestamp": "timestamp",
"severity": "severity_level",
"labels": {
"compute.googleapis.com/resource_id": "resource_id",
"dataflow.googleapis.com/job_id": "job_id",
"compute.googleapis.com/resource_type": "resource_type",
"compute.googleapis.com/resource_name": "resource_name",
"dataflow.googleapis.com/region": "europe-west2",
"dataflow.googleapis.com/job_name": "job_name"
},
"logName": "logName",
"receiveTimestamp": "receiveTimestamp"
According to the GCP logging documentation, each monitored resource type derives particular labels from the underlying service API; dataflow.googleapis.com corresponds to the Dataflow service.
Therefore, if you run a Dataflow job and define the location (region) for the job's metadata, the logging service picks up this regional endpoint from the job description exposed through the dataflow.googleapis.com REST methods.
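For illustration, a minimal sketch of launching a Google-provided template job pinned to that regional endpoint (job name, bucket and parameters are placeholders):
gcloud dataflow jobs run my-wordcount-job \
  --gcs-location gs://dataflow-templates/latest/Word_Count \
  --region europe-west2 \
  --staging-location gs://my-bucket/tmp \
  --parameters inputFile=gs://my-bucket/input.txt,output=gs://my-bucket/output
The region passed at submission time should then be what appears as resource.labels.region on the job's dataflow_step log entries.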

The resource.labels.region field on Dataflow Step logs should refer to the regional endpoint that the job is using. "Global" is not an expected value there.

Related

How to trigger a notification in AWS OpenSearch

I have created a Lambda which is triggered by an EventBridge rule that I created.
The purpose is for the Lambda to send a slack notification when an OpenSearch Service Upgrade is available.
I've tested the Lambda manually with a sample event and it works well, but I want to test it "for real" by getting a real OpenSearch instance to send a notification.
The OpenSearch domain I created is not sending notifications like I would expect it to.
I've created a new OpenSearch domain and used an old version of OpenSearch (1.0).
When I look at the OpenSearch domain I created in the AWS console, it shows the Version is OpenSearch 1.0 and there is an Upgrade Available (to 1.3).
However, this did not trigger a notification.
How do notifications get triggered? Would a notification only get triggered if a new Upgrade becomes available (e.g. 1.4) when my OpenSearch domain is already up and running?
Is there any way to force OpenSearch to trigger the notification?
I want OpenSearch to trigger a notification, which in turn is captured by EventBridge, and triggers my Lambda with an event like:
{
"version": "0",
"id": "01234567-0123-0123-0123-012345678901",
"detail-type": "Amazon OpenSearch Service Software Update Notification",
"source": "aws.es",
"account": "123456789012",
"time": "2016-11-01T13:12:22Z",
"region": "us-east-1",
"resources": [
"arn:aws:es:us-east-1:123456789012:domain/test-domain"
],
"detail": {
"event": "Service Software Update",
"status": "Available",
"severity": "Informational",
"description": "Service software update [R20200330-p1] available."
}
}
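For reference, an EventBridge event pattern that should match this kind of event looks roughly like this (source and detail-type are copied from the sample above; treat it as a sketch rather than the exact rule I have deployed):
{
"source": ["aws.es"],
"detail-type": ["Amazon OpenSearch Service Software Update Notification"]
}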

Find out which IAM account manually triggered a scheduled function

I have a GCP Cloud function which runs on a schedule every morning. The logs show that it has been triggered off-schedule three other times in the last week, which I presume can only happen if someone has gone to the Cloud Scheduler page and clicked 'Run now' on that function. How can I find out who did this? The Logs Explorer doesn't show this information. (Heads will not roll, but IAM permissions may be stripped. Bonus points if it turns out to have been me.)
For scheduled functions, there are two sets of logs - one for the cloud function triggered by the schedule, and one for the Cloud Scheduler itself. In the logs for the Cloud Scheduler, only the daily schedule shows up, not the extra triggers.
Log of the function starting in the logs explorer for the Cloud Function:
{
"textPayload": "Function execution started",
"insertId": "REDACTED",
"resource": {
"type": "cloud_function",
"labels": {
"region": "REDACTED",
"function_name": "REDACTED",
"project_id": "REDACTED"
}
},
"timestamp": "2022-05-04T08:49:37.980952884Z",
"severity": "DEBUG",
"labels": {
"execution_id": "REDACTED"
},
"logName": "projects/REDACTED/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
"trace": "projects/REDACTED/traces/REDACTED",
"receiveTimestamp": "2022-05-04T08:49:37.981500851Z"
}
If the trigger of your Cloud Function is HTTP and requires no authorization, it will be very hard (or nearly impossible) to figure out who called it.
Additionally, it is possible that there were not enough available instances when the run was scheduled, and the Cloud Function then ran later as a retry.
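If the extra runs were issued through the Cloud Scheduler "Force run" button, they may be recorded in Cloud Audit Logs for the scheduler job; a Logs Explorer filter along these lines is worth trying (the method name is an assumption based on the Cloud Scheduler v1 API):
resource.type="cloud_scheduler_job"
protoPayload.methodName="google.cloud.scheduler.v1.CloudScheduler.RunJob"
Any matching entry should carry protoPayload.authenticationInfo.principalEmail, which identifies who triggered the manual run.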

Following migrating Amplify CLI and AppSync Transformer to v2 no longer able to access CloudFormation parameters from a custom resource

We have an AWS Amplify project that I am in the process of migrating the API from Transformer 1 to 2.
As part of this, we have a number of custom resolvers that previously had their own stack JSON template in the stacks/ folder as generated by the Amplify CLI.
As per the migration instructions, I have created new custom resources using amplify add custom, which lets me create either a CDK (Cloud Development Kit) resource or a CloudFormation template. I just want a lift and shift for now, so I've gone with the template option and moved the content from the stack JSON to the new custom resolver JSON template.
This seems like it should work, but the custom templates no longer have access to the parameters shared from the parent stack:
{
"AWSTemplateFormatVersion": "2010-09-09",
"Metadata": {},
"Parameters": {
"AppSyncApiId": {
"Type": "String",
"Description": "The id of the AppSync API associated with this project."
},
"S3DeploymentBucket": {
"Type": "String",
"Description": "The S3 bucket containing all deployment assets for the project."
},
"S3DeploymentRootKey": {
"Type": "String",
"Description": "An S3 key relative to the S3DeploymentBucket that points to the root of the deployment directory."
}
},
...
}
So these are standard parameters that were used previously and my challenge now is in accessing the deployment bucket and root key as these values are generated upon deployment.
The exact use case is for the AppSync function configuration when I attempt to locate the request and response mapping template S3 locations:
"RequestMappingTemplateS3Location": {
"Fn::Sub": [
"s3://${S3DeploymentBucket}/${S3DeploymentRootKey}/resolvers/MyCustomResolver.req.vtl",
{
"S3DeploymentBucket": {
"Ref": "S3DeploymentBucket"
},
"S3DeploymentRootKey": {
"Ref": "S3DeploymentRootKey"
}
}
]
},
The error message I am receiving is:
AWS::CloudFormation::Stack Tue Feb 15 2022 18:20:42 GMT+0000 (Greenwich Mean Time) Parameters: [S3DeploymentBucket, AppSyncApiId, S3DeploymentRootKey] must have values
I feel like I am missing a step to plumb the output values to the parameters in the JSON but I can't find any documentation to suggest how to do this using the updated Amplify CLI options.
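For reference, this is how a plain CloudFormation nested stack would normally receive those values from its parent stack (all names below are placeholders); this generic parameter plumbing is what appears to be missing for the new custom resource:
"MyCustomResourceStack": {
"Type": "AWS::CloudFormation::Stack",
"Properties": {
"TemplateURL": "https://s3.amazonaws.com/example-bucket/example-root-key/MyCustomResolver.json",
"Parameters": {
"AppSyncApiId": { "Ref": "ParentAppSyncApiId" },
"S3DeploymentBucket": { "Ref": "ParentDeploymentBucket" },
"S3DeploymentRootKey": { "Ref": "ParentDeploymentRootKey" }
}
}
}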
Let me know if you need any further information and fingers crossed it is something simple for you Amplify/CloudFormation ninjas out there!
Thank you in advance!

Is there any way to get AWS resource usage by tag?

I'm creating resources in AWS (mainly EC2, EBS disks and S3 space) for our customers as part of our SaaS product. I would like to get the usage of those resources so I can send it to Stripe to charge and invoice my users.
I was thinking that tagging would be a good way to group the resources of a specific customer, so if I apply this tag to all of a customer's resources: "Customer" => "cust_id_4894168127", then I could make an API call like this pseudo-code:
https://www.aws_api_url.com/api/getResourceUsage?Tag=Customer/cust_id_4894168127&From=2020/02/02%To=2020/03/03
And the API would return something like:
[
{
"ResourceID": "8hf8972g8h9",
"ResourceType": "EC2",
"UsageHours": 231
},
{
"ResourceID": "09j05h05hj",
"ResourceType": "EBS disk",
"DiskSpaceUsedGB": 200
},
{
"ResourceID": "h87f3go2f2",
"ResourceType": "S3 space",
"SpaceUsedGB": 500
}
]
I would like to get everything that Amazon is going to charge me so that I can bill the customer for all of those items. If I can't find a way to do it, I'll have to store all the user actions in my database and then calculate how long each EC2 instance was running, etc.
Do you know of a way to do it with the SDK API?
You can use the get-resources operation of the Resource Groups Tagging API with tag filters to get all matching resources.
An example of this is below:
aws resourcegroupstaggingapi get-resources --tag-filters Key=Customer,Values=cust_id_4894168127
This returns the ARNs of all matching resources along with any attached tags; from there you would need to call the API for each service to get the metadata you need about the resource.
For billing you can make use of cost allocation tags to identify the charges for a specific customer.
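Once the Customer tag has been activated as a cost allocation tag, the Cost Explorer API can report the charges per customer; a sketch using the AWS CLI (dates and metric are placeholders):
aws ce get-cost-and-usage \
  --time-period Start=2020-02-02,End=2020-03-03 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter '{"Tags": {"Key": "Customer", "Values": ["cust_id_4894168127"]}}'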

How long is an AssumeRoleWithSAML session valid?

I am trying to figure out the usage of an AD user who accesses AWS via AssumeRoleWithSAML, following this link: https://aws.amazon.com/blogs/security/how-to-easily-identify-your-federated-users-by-using-aws-cloudtrail/.
However, I don't see an AssumeRoleWithSAML event at all in my CloudTrail logs, though I can clearly see activity from this user. I went all the way back to early July in CloudTrail to look up AssumeRoleWithSAML and don't see any event.
Am I missing something? Because this event is not showing up, I am not able to correlate what this user is doing in AWS.
Thanks
Amit
You are right, there should be an event with name AssumeRoleWithSAML in the CloudTrail logs.
You already referenced the correct AWS security blog post which describes how to "identify a SAML federated user". [1]
Let's go into detail.
The IAM docs [2] contain an example of what such an event looks like (the sample below shows the closely related AssumeRoleWithWebIdentity event; an AssumeRoleWithSAML entry has the same overall structure):
{
"eventVersion": "1.05",
"userIdentity": {
"type": "WebIdentityUser",
"principalId": "accounts.google.com:[id-of-application].apps.googleusercontent.com:[id-of-user]",
"userName": "[id of user]",
"identityProvider": "accounts.google.com"
},
"eventTime": "2016-03-23T01:39:51Z",
"eventSource": "sts.amazonaws.com",
"eventName": "AssumeRoleWithWebIdentity",
"awsRegion": "us-east-2",
"sourceIPAddress": "192.0.2.101",
"userAgent": "aws-cli/1.3.23 Python/2.7.6 Linux/2.6.18-164.el5",
"requestParameters": {
"durationSeconds": 3600,
"roleArn": "arn:aws:iam::444455556666:role/FederatedWebIdentityRole",
"roleSessionName": "MyAssignedRoleSessionName"
},
"responseElements": {
"provider": "accounts.google.com",
"subjectFromWebIdentityToken": "[id of user]",
"audience": "[id of application].apps.googleusercontent.com",
"credentials": {
"accessKeyId": "ASIACQRSTUVWRAOEXAMPLE",
"expiration": "Mar 23, 2016 2:39:51 AM",
"sessionToken": "[encoded session token blob]"
},
"assumedRoleUser": {
"assumedRoleId": "AROACQRSTUVWRAOEXAMPLE:MyAssignedRoleSessionName",
"arn": "arn:aws:sts::444455556666:assumed-role/FederatedWebIdentityRole/MyAssignedRoleSessionName"
}
},
"resources": [
{
"ARN": "arn:aws:iam::444455556666:role/FederatedWebIdentityRole",
"accountId": "444455556666",
"type": "AWS::IAM::Role"
}
],
"requestID": "6EXAMPLE-e595-11e5-b2c7-c974fEXAMPLE",
"eventID": "bEXAMPLE-0b30-4246-b28c-e3da3EXAMPLE",
"eventType": "AwsApiCall",
"recipientAccountId": "444455556666"
}
As we can see, the requestParameters contain an element durationSeconds, which is the session duration you are looking for.
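The requested duration is capped by the role's maximum session duration, which defaults to one hour; you can check it for the assumed role, for example (role name taken from the sample event above):
aws iam get-role --role-name FederatedWebIdentityRole --query 'Role.MaxSessionDuration'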
Why is the event missing?
First of all, it is necessary to know whether you are using the AWS CloudTrail console or parsing the CloudTrail files that were delivered to the S3 bucket. If you use the CloudTrail console, you can only view the last 90 days of recorded API activity and events for an AWS Region! [3]
So make sure that you use AWS Athena or another solution if you must go further back in time.
You must look into the trail of the correct region! You do this by inspecting the respective S3 prefix for a multi-region trail, or by selecting the desired region in the top right corner if you use the AWS CloudTrail console. This is important because regional services log to their respective regional trail! AWS mentions this as follows:
If you activate AWS STS endpoints in Regions other than the default global endpoint, then you must also turn on CloudTrail logging in those Regions. This is necessary to record any AWS STS API calls that are made in those Regions. For more information, see Turning On CloudTrail in Additional Regions in the AWS CloudTrail User Guide. [4]
Make sure to look into the correct account! You must inspect the trail of the account whose role was assumed. I mention this explicitly because there are multi-account environments which might use centralized identity accounts etc.
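To quickly check a specific region within that 90-day window from the command line, something like this can help (region and time range are placeholders):
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRoleWithSAML \
  --region eu-west-1 \
  --start-time 2021-07-01 --end-time 2021-07-31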
References
[1] https://aws.amazon.com/de/blogs/security/how-to-easily-identify-your-federated-users-by-using-aws-cloudtrail/
[2] https://docs.aws.amazon.com/IAM/latest/UserGuide/cloudtrail-integration.html
[3] https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events-console.html
[4] https://docs.aws.amazon.com/IAM/latest/UserGuide/cloudtrail-integration.html