Google Cloud Storage function sending already delivered messages when subscriber connects - google-cloud-platform

I have a bucket in Google Cloud Storage. I also have a storage function that gets triggered every time a new file/folder is created in this bucket. The idea of this function is to publish to a Google Pub/Sub topic the files that were created under the "monitoring" folder. So it gets triggered whenever there is a new file, but it only sends the message to Pub/Sub if the file was created under the mentioned folder. Besides that, I have a Java application subscribed to the Pub/Sub topic receiving these messages. It is able to receive messages without any issues at all, but when I shut down the application and launch it again, after some minutes the messages that were delivered previously come again. I checked the logs to see if the storage function was triggered again, but that is not the case, and it seems that no message was sent to Pub/Sub again. All messages were acked and Pub/Sub was empty. Am I missing something related to the storage function or Pub/Sub?
This is my storage function definition:
const {PubSub} = require('@google-cloud/pubsub');

const topicName = 'test-topic-1';
const monitoringFolder = 'monitoring/';

exports.handler = (event, context) => {
  console.log(event);
  if (isMonitoringFolder(event.name)) {
    publishEvent(event);
  }
};

const publishEvent = (event) => {
  const pubSub = new PubSub();
  const payload = {
    bucket: event.bucket,
    filePath: event.name,
    timeCreated: event.timeCreated
  };
  const data = Buffer.from(JSON.stringify(payload));
  pubSub
    .topic(topicName)
    .publish(data)
    .then(id => console.log(`${payload.filePath} was added to pubSub with id: ${id}`))
    .catch(err => console.log(err));
};

const isMonitoringFolder = filePath => filePath.search(monitoringFolder) != -1;
I would really appreciate any advice

Pub/Sub doesn't guarantee a single occurrence of a message
Google Cloud Pub/Sub has an at-least-once delivery policy: each published message is delivered at least once for every subscription, but it can be delivered multiple times.
https://cloud.google.com/pubsub/docs/subscriber
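Because any subscriber can therefore see the same message more than once, the usual approach is to make message handling idempotent. As a rough illustration (not part of the original answer), here is a minimal Node.js de-duplication sketch with a hypothetical subscription name; the asker's subscriber is a Java application, but the same idea applies there:
const {PubSub} = require('@google-cloud/pubsub');

const pubSub = new PubSub();
// 'test-subscription-1' is a hypothetical subscription name.
const subscription = pubSub.subscription('test-subscription-1');

// Track already-processed message IDs (or a business key such as the file path,
// if the de-duplication needs to survive restarts).
const processed = new Set();

subscription.on('message', message => {
  if (processed.has(message.id)) {
    // Duplicate delivery: acknowledge it and skip the work.
    message.ack();
    return;
  }
  processed.add(message.id);
  const payload = JSON.parse(message.data.toString());
  console.log(`Handling ${payload.filePath}`);
  message.ack();
});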

Related

GCP Pub/Sub Push Subscription

I am working with GCP Pub/Sub within a Cloud Run service; as per the documentation, I am required to use a push subscription.
The push endpoint is configured and working. I was testing the error case on the push endpoint to make sure that Pub/Sub would not retry delivery in an infinite loop.
After following the documentation, I have been unable to get the dead letter queue working with the push subscription.
How can I configure a fixed number of retries with a push subscription? Code below.
const pubSubClient = new PubSub();

async function pubsubInit() {
  await pubSubClient.createTopic(config.asyncTasks.topic);
  await pubSubClient.createTopic(config.asyncTasks.deadletter);
  await pubSubClient.createSubscription(config.asyncTasks.topic, config.asyncTasks.sub, {
    deadLetterPolicy: {
      deadLetterTopic: config.asyncTasks.deadletter,
      maxDeliveryAttempts: 5,
    },
    pushEndpoint: config.asyncTasks.endpoint,
  });
  await pubSubClient.createSubscription(config.asyncTasks.deadletter, config.asyncTasks.deadletterSub, {
    pushEndpoint: config.asyncTasks.deadEndpoint,
  });
}

GCP cloud build VIEW RAW logs link

I have written a small Cloud Function in GCP which is subscribed to a Pub/Sub event. Whenever a Cloud Build is triggered, the function posts a message into a Slack channel over a webhook.
In the response we get lots of details like the trigger name, branch name, and variable details, but I am more interested in the build logs URL.
Currently the build logs URL in the response looks like: logUrl: https://console.cloud.google.com/cloud-build/builds/899-08sdf-4412b-e3-bd52872?project=125205252525252
which requires GCP console access to check the logs.
In the console there is an option, View Raw. Is it possible to get that direct URL in the event response, so that I can send it directly to Slack and anyone can access the logs without having GCP console access?
In your Cloud Build event message, you need to extract 2 values from the JSON message:
logsBucket
id
The raw file is stored here:
<logsBucket>/log-<id>.txt
So, you can get it easily in your function with the Cloud Storage client library (preferred solution) or with a simple HTTP GET call to the Storage API.
If you need more guidance, let me know your dev language and I will send you a piece of code.
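For illustration, here is a minimal sketch of that preferred approach (not the answerer's exact code), assuming a Node.js Cloud Function and the @google-cloud/storage client library; the helper name is made up, and logsBucket and id come from the decoded Cloud Build message:
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();

// build is the decoded Cloud Build event payload.
async function getRawLog(build) {
  // logsBucket is usually a gs:// URL; strip the scheme for the client library.
  const bucketName = build.logsBucket.replace('gs://', '');
  const fileName = `log-${build.id}.txt`;
  const [contents] = await storage.bucket(bucketName).file(fileName).download();
  return contents.toString('utf8');
}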
As @guillaume blaquiere helped, just sharing the piece of code used in the Cloud Function to generate the signed URL of the Cloud Build logs.
// gcs is a Cloud Storage client; BUCKET_NAME is the Cloud Build logs bucket
// (the logsBucket value) and build is the decoded Cloud Build event.
const {Storage} = require('@google-cloud/storage');
const gcs = new Storage();

var filename = 'log-' + build.id + '.txt';
var file = gcs.bucket(BUCKET_NAME).file(filename);

const getURL = async () => {
  return new Promise((resolve, reject) => {
    file.getSignedUrl({
      action: 'read',
      expires: Date.now() + 76000000
    }, (err, url) => {
      if (err) {
        console.error(err);
        reject(err);
        return;
      }
      console.log("URL");
      resolve(url);
    });
  });
};

const signedUrl = await getURL();
If anyone is looking for the whole code, please follow this link: https://github.com/harsh4870/Cloud-build-slack-notification/blob/master/singedURL.js

S3 putObject event - older version received

I am setting up a CloudWatch event to trigger on S3 PutObject and call a Lambda function. I am able to trigger the function successfully, and here is the sample code that I am trying to run.
exports.handler = function(event, context, callback) {
  console.log("Incoming Event: ", event);
  const bucket = event.Records[0].s3.bucket.name;
  const filename = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
  const message = `File is uploaded in - ${bucket} -> ${filename}`;
  console.log(message);
  callback(null, message);
};
I am getting an error because the event data does not contain the property "Records". I checked the AWS docs, and the event data should contain "Records". The version shown in the documentation is "eventVersion": "2.2", while in the event data I am getting eventVersion: '1.07'.
Is there some additional configuration needed to make this work?
Here is what my CloudWatch event looks like:
You've configured CloudTrail API events. The format of those events is different from the event notifications generated by S3 (the docs you linked to).
If you go to the S3 bucket and apply an event trigger there, it will be in the format you expected. See Configuring Event Notifications.
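For illustration, here is a minimal sketch of wiring that bucket notification up programmatically with the AWS SDK for JavaScript (v2); the bucket name and function ARN below are hypothetical, and S3 must also be granted permission to invoke the function (for example via lambda add-permission):
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function addS3Trigger() {
  // Deliver native S3 event notifications (the Records format) to the Lambda.
  await s3.putBucketNotificationConfiguration({
    Bucket: 'my-upload-bucket', // hypothetical bucket name
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [{
        // hypothetical function ARN
        LambdaFunctionArn: 'arn:aws:lambda:us-east-1:123456789012:function:onPutObject',
        Events: ['s3:ObjectCreated:Put'],
      }],
    },
  }).promise();
}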

Should I create a message channel at runtime?

When using an MQ system like RabbitMQ or Google Pub/Sub, should I create a message channel/queue at runtime of the application, or create it manually first?
For example, when using Google Pub/Sub, creating the topic at runtime:
import {PubSub} from '@google-cloud/pubsub';

const pubsubClient = new PubSub();
// logger is the application's logger instance.

async function createTopic(topicName: string): Promise<any> {
  const topicInstance = pubsubClient.topic(topicName);
  const [exists] = await topicInstance.exists();
  if (exists) {
    logger.info(`${topicName} topic already exists`);
    return;
  }
  return pubsubClient
    .createTopic(topicName)
    .then((data) => {
      logger.info(`Created topic ${topicName} successfully`);
      return data;
    })
    .catch((err) => logger.error(err));
}
This is especially relevant when considering development, deployment, and continuous integration processes.
I read in a book that creating a message queue at runtime is not very useful.
There is nothing that prevents you from creating the topic at runtime. However, unless you have clients which are checking for the topic's existence and waiting to subscribe to it, you would be publishing messages that will never be received. A better pattern would be to establish your topic beforehand with autoscaling subscribers (perhaps running in cloud functions) ready to receive messages and take appropriate action whenever your publisher begins generating them.
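As an illustration of establishing the topic beforehand, here is a minimal sketch (an assumption, not from the answer) of an idempotent provisioning step that could run once at deploy time, for example from a CI step, rather than inside the application; the topic and subscription names are hypothetical:
const {PubSub} = require('@google-cloud/pubsub');
const pubSub = new PubSub();

// Run once at deploy time, not on every application start.
async function provision() {
  const topic = pubSub.topic('orders');            // hypothetical topic name
  const [topicExists] = await topic.exists();
  if (!topicExists) {
    await pubSub.createTopic('orders');
  }
  const sub = topic.subscription('orders-worker'); // hypothetical subscription name
  const [subExists] = await sub.exists();
  if (!subExists) {
    await topic.createSubscription('orders-worker');
  }
}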

How can I get log content in AWS Lambda from CloudWatch

I have this basic Lambda that posts an image to a web server.
From the events in CloudWatch, I can successfully log anything that happens in that Lambda function:
From this log group (the Lambda function), I clicked on Stream to AWS Lambda, chose a new Lambda function in which I expect to receive my logs, and didn't put any filters at all so I can get all logs.
The Lambda is triggered properly, but when I persist what I received in the event and context objects, I have all the CloudWatch log stream information but I don't see any of the logs.
What I get:
Do I need to specify a filter to see any logs at all? In the filter section, if I don't put any filters and click on test filter, I get all the logs in the preview window, which seems to mean it should send the whole logs to my Lambda function. Also, it looked to me like the logs were that unreadable stream in awslogs, and that it was in Base64, but I didn't get any results trying to convert it.
Yes, the logs are gzipped and base64-encoded, as mentioned by jarmod.
Sample code in Node.js for extracting them in Lambda would be:
var zlib = require('zlib');

exports.handler = (input, context, callback) => {
  // The log data arrives base64-encoded and gzipped in input.awslogs.data.
  var payload = Buffer.from(input.awslogs.data, 'base64');
  zlib.gunzip(payload, function(e, result) {
    if (e) {
      context.fail(e);
    } else {
      result = JSON.parse(result.toString());
      console.log(result);
    }
  });
};