Google Cloud Firestore Triggers - google-cloud-platform

How do I add a new attribute to a newly created document from a Cloud Function triggered by the onCreate() Cloud Firestore trigger?
Can the same approach be used to update a document on the client side as well as on the server side, i.e. in Cloud Functions?

Per the docs, you can use event.data.ref to perform operations:
exports.addUserProperty = functions.firestore
    .document('users/{userId}')
    .onCreate(event => {
        // Get an object representing the document
        // e.g. {'name': 'Marie', 'age': 66}
        var data = event.data.data();

        // Add a new property to the user object and write it to Firestore
        return event.data.ref.update({
            "born": "Poland"
        });
    });
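Note that the snippet above uses the pre-v1.0 firebase-functions event API. With firebase-functions v1.0 and later, the onCreate() handler receives a document snapshot and a context instead; a minimal equivalent sketch:

exports.addUserProperty = functions.firestore
    .document('users/{userId}')
    .onCreate((snapshot, context) => {
        // snapshot.data() holds the newly created document's fields,
        // e.g. {'name': 'Marie', 'age': 66}
        // Add a new property to the document and write it back to Firestore
        return snapshot.ref.update({
            "born": "Poland"
        });
    });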

Related

Cloud Function to import data into Cloud SQL from a Cloud Storage bucket, but getting a "schema already exists" error

I'm trying to import data into a Cloud SQL instance from a Cloud Storage bucket using a Cloud Function.
How can I delete schemas before importing the data, using a single Cloud Function?
I am using Node.js in the Cloud Function.
Error:
error: exit status 3 stdout(capped at 100k bytes): SET SET SET SET SET set_config ------------ (1 row) SET SET SET SET stderr: ERROR: schema "< >" already exists
https://cloud.google.com/sql/docs/mysql/admin-api/rest/v1beta4/instances/import
In the code below, where do I need to add the deletion of all existing schemas apart from the public schema?
Entry point : importDatabase
index.js
const {google} = require('googleapis');
const {auth} = require("google-auth-library");

var sqlAdmin = google.sqladmin('v1beta4');

exports.importDatabase = (_req, res) => {
  async function doIt() {
    const authRes = await auth.getApplicationDefault();
    let authClient = authRes.credential;

    var request = {
      project: 'my-project',   // TODO: Update placeholder value.
      instance: 'my-instance', // TODO: Update placeholder value.
      resource: {
        importContext: {
          kind: "sql#importContext",
          fileType: "SQL", // CSV
          uri: <bucket path>,
          database: <database-name>
          // Options for importing data as SQL statements.
          // sqlimportOptions: {
          // /**
        },
      },
      auth: authClient,
    };

    sqlAdmin.instances.import(request, function(err, result) {
      if (err) {
        console.log(err);
      } else {
        console.log(result);
      }
      res.status(200).send("Command completed", err, result);
    });
  }
  doIt();
};
package.json
{
  "name": "import-database",
  "version": "0.0.1",
  "dependencies": {
    "googleapis": "^39.2.0",
    "google-auth-library": "3.1.2"
  }
}
The error looks to be occurring because a previous aborted import managed to transfer the "schema_name" schema, and this subsequent import was then run without first re-initializing the DB. Check the helpful document on Cloud SQL import.
One way to prevent this issue is to change the create statements in the SQL file from:
CREATE SCHEMA schema_name;
to
CREATE SCHEMA IF NOT EXISTS schema_name;
As far as removing the already-created schema is concerned: by default, only user or service accounts with the Cloud SQL Admin (roles/cloudsql.admin) or Owner (roles/owner) role have the permission to delete a Cloud SQL instance. Please check the helpful document on cloudsql.instances.delete to help you understand the next steps. You can also define an IAM custom role for the user or service account that includes the cloudsql.instances.delete permission; this permission is supported in IAM custom roles.
As a best practice for import/export operations, we recommend that you adopt the principle of least privilege, which in this case means creating a custom role, adding that specific permission, and assigning it to your service account. Alternatively, the service account could be given the "Cloud SQL Admin" role or the "Cloud Composer API Service Agent" role, both of which include this permission and would therefore allow you to execute this command.
NOTE: It is recommended to revalidate any delete actions you perform, as they may lead to loss of useful data.

aws amplify datastore syncing the whole database

In my DynamoDB I have about 200k data points, and there will be more in the future. When I log out, my local data storage gets cleared. When I log in, DataStore starts to sync it with the cloud. The problem is that syncing over 200k data points takes really long. The data points are sensor data that are displayed on a chart.
My idea is to fetch only the data I need directly from the database, without bloating up my entire local storage.
Is there a way to fetch the data I need without saving it into the offline storage? I was also thinking of using an AWS time series service for my chart data instead.
A syncExpression configuration is required to sync only the specific data you need.
DOC: https://docs.amplify.aws/lib/datastore/sync/q/platform/js/
import { DataStore, syncExpression } from 'aws-amplify';
import { Post, Comment } from './models';

DataStore.configure({
  syncExpressions: [
    syncExpression(Post, () => {
      return post => post.rating.gt(5);
    }),
    syncExpression(Comment, () => {
      return comment => comment.status.eq('active');
    }),
  ]
});
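With a selective sync expression in place, only the matching records are synced into local storage, and subsequent DataStore queries operate on that smaller subset. A rough usage sketch, reusing the Post model and predicate from the example above:

// Only posts matching the sync expression are available locally
const topPosts = await DataStore.query(Post, post => post.rating.gt(5));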

Get generated API key from AWS AppSync API created with CDK

I'm trying to access data from my stack where I'm creating an AppSync API. I want to be able to use the generated stack's url and apiKey, but I'm running into issues with them being encoded/tokenized.
In my stack I'm setting some fields to the outputs of the deployed stack:
this.ApiEndpoint = graphAPI.url;
this.Authorization = graphAPI.graphqlApi.apiKey;
When trying to access these properties I get something like ${Token[TOKEN.209]} and not the values.
If I'm trying to resolve the token like so: this.resolve(graphAPI.graphqlApi.apiKey) I instead get { 'Fn::GetAtt': [ 'AppSyncAPIApiDefaultApiKey537321373E', 'ApiKey' ] }.
But I would like to retrieve the key itself as a string, like da2-10lksdkxn4slcrahnf4ka5zpeemq5i.
How would I go about actually extracting the string values for these properties?
The actual values of such Tokens are available only at deploy-time. Before then you can safely pass these token properties between constructs in your CDK code, but they are opaque placeholders until deployed. Depending on your use case, one of these options can help retrieve the deploy-time values:
If you define a CloudFormation Output for a variable, CDK will (apart from creating the Output in CloudFormation) print its value to the console after cdk deploy, and can optionally write it to a JSON file you pass with the --outputs-file flag.
// AppsyncStack.ts
new cdk.CfnOutput(this, 'ApiKey', {
  value: this.api.apiKey ?? 'UNDEFINED',
  exportName: 'api-key',
});

// at deploy-time, if you use a flag: --outputs-file cdk.outputs.json
{
  "AppsyncStack": {
    "ApiKey": "da2-ou5z5di6kjcophixxxxxxxxxx",
    "GraphQlUrl": "https://xxxxxxxxxxxxxxxxx.appsync-api.us-east-1.amazonaws.com/graphql"
  }
}
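A post-deploy script can then read those values directly from the outputs file; a minimal sketch assuming the stack and output names shown above:

// Reads the file written by: cdk deploy --outputs-file cdk.outputs.json
const outputs = require('./cdk.outputs.json');

const apiKey = outputs.AppsyncStack.ApiKey;
const graphqlUrl = outputs.AppsyncStack.GraphQlUrl;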
Alternatively, you can write a script to fetch the data post-deploy using the listGraphqlApis and listApiKeys commands from the appsync JS SDK client. You can run the script locally or, for advanced use cases, wrap the script in a CDK Custom Resource construct for deploy-time integration.
Thanks to @fedonev I was able to extract the API key and url like so:
// AWS SDK for JavaScript v3 AppSync client and lodash's flatMap (imports added for completeness)
import { AppSyncClient, ListGraphqlApisCommand, ListApiKeysCommand } from "@aws-sdk/client-appsync";
import { flatMap } from "lodash";

const client = new AppSyncClient({ region: "eu-north-1" });
const command = new ListGraphqlApisCommand({ maxResults: 1 });
const res = await client.send(command);

if (res.graphqlApis) {
  const apiKeysCommand = new ListApiKeysCommand({
    apiId: res.graphqlApis[0].apiId,
  });
  const apiKeyResponse = await client.send(apiKeysCommand);
  const urls = flatMap(res.graphqlApis[0].uris);
  if (apiKeyResponse.apiKeys && res.graphqlApis[0].uris) {
    // sendSlackMessage is the author's own helper
    sendSlackMessage(urls[1], apiKeyResponse.apiKeys[0].id || "");
  }
}

GCP Cloud Tasks: shorten period for creating a previously created named task

We are developing a GCP Cloud Tasks based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. The reason we use Cloud Tasks is so a delay can be created (using the scheduledTime property, set 2 minutes in the future) before the email is sent, and to control dedup (by using a task-name formatted as: [firestore-collection-name]-[doc-id]), since the 'write' trigger on the Firestore doc can fire several times as the document is created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud task runs and the email is sent with the updated Firestore document info included. After that, the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected, as the queue is empty at this point and the last task completed successfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task with the same name can't be created for ~1 hour after the original task was deleted or executed.
Question: is there some way in which this restriction can be bypassed by lowering the amount of time, or even removing the restriction altogether?
The short answer is no. As you've already pointed out, the docs are very clear regarding this behavior, and you have to wait about 1 hour to create a task with the same name as one that was previously created. The API and client libraries do not allow you to decrease this time.
Having said that, I would suggest that instead of using the same task ID, you use a different one for each task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime

def create_task(project, queue, location, payload=None, in_seconds=None):
    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/task/' + queue
        }
    }
    if payload is not None:
        converted_payload = payload.encode()
        task['app_engine_http_request']['body'] = converted_payload
    if in_seconds is not None:
        d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
        timestamp = timestamp_pb2.Timestamp()
        timestamp.FromDatetime(d)
        task['schedule_time'] = timestamp
    response = client.create_task(parent, task)
    print('Created task {}'.format(response.name))
    print(response)

# You can change DOCUMENT_ID with USER_ID or something to identify the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of needing to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for the side effects of write-trigger fires - in the OP's case, send 1 email for all writes within 2 minutes.
In our case we are using the Firebase Functions task queue utils rather than interacting with Cloud Tasks directly, but that's immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
// Admin SDK imports (firebase-admin v10+ modular API), added for completeness
const { getFirestore, Timestamp } = require('firebase-admin/firestore');
const { getFunctions } = require('firebase-admin/functions');

async function enqueueTask(shopId) {
  const queueName = 'doSomething';
  const now = new Date();
  const next = new Date(now.getTime() + 2 * 60 * 1000);
  try {
    const shouldEnqueue = await getFirestore().runTransaction(async t => {
      const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
      const doc = await t.get(syncRef);
      let data = doc.data();
      if (data?.timestamp.toDate() > now) {
        return false;
      }
      await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
      return true;
    });
    if (shouldEnqueue) {
      let queue = getFunctions().taskQueue(queueName);
      await queue.enqueue(
        { timestamp: next.toISOString() },
        { scheduleTime: next }
      );
    }
  } catch {
    ...
  }
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
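For completeness, the helper above would be invoked from the Firestore write trigger described in the question; a rough sketch in which the document path and parameter name are assumptions:

exports.onShopWrite = functions.firestore
  .document('shops/{shopId}')
  .onWrite(async (change, context) => {
    // Debounce the email side effect through the helper above
    await enqueueTask(context.params.shopId);
  });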
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
exports.doSomething = functions.tasks.taskQueue({
  retryConfig: {
    maxAttempts: 2,
    minBackoffSeconds: 60,
  },
  rateLimits: {
    maxConcurrentDispatches: 2,
  }
}).onDispatch(async data => {
  let { timestamp } = data;
  await sendYourEmailHere();
  await getFirestore().runTransaction(async t => {
    const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
    const doc = await t.get(syncRef);
    let data = doc.data();
    if (data?.timestamp.toDate() <= new Date(timestamp)) {
      await t.delete(syncRef);
    }
  });
});
This isn't a bulletproof solution (if the doSomething() execution function has high latency, for example), but it's good enough for 99% of our use cases.

How to return an entire Datastore table by name using Node.js on a Google Cloud Function

I want to retrieve a table (with all rows) by name. I want to make an HTTP request with something like this in the body: {"table": user}.
I tried this code without success:
'use strict';

const {Datastore} = require('@google-cloud/datastore');

// Instantiates a client
const datastore = new Datastore();

exports.getUsers = (req, res) => {
  // Get list
  const query = this.datastore.createQuery('users');
  this.datastore.runQuery(query).then(results => {
    const customers = results[0];
    console.log('User:');
    customers.forEach(customer => {
      const cusKey = customer[this.datastore.KEY];
      console.log(cusKey.id);
      console.log(customer);
    });
  })
  .catch(err => { console.error('ERROR:', err); });
}
Google Datastore is a NoSQL database that works with entities rather than tables. What you want is to load all the "records", which are key identifiers in Datastore, along with all their "properties" (the "columns" you see in the Console), filtered by the "Kind" name, which is the "table" you are referring to.
Here is a solution for retrieving all the key identifiers and their properties from Datastore, using an HTTP-triggered Cloud Function running in the Node.js 8 environment.
Create a Google Cloud Function and choose the trigger to HTTP.
Choose the runtime to be Node.js 8
In index.js replace all the code with this GitHub code.
In package.json add:
{
  "name": "sample-http",
  "version": "0.0.1",
  "dependencies": {
    "@google-cloud/datastore": "^3.1.2"
  }
}
Under Function to execute add loadDataFromDatastore, since this is the name of the function that we want to execute.
NOTE: This will log all the loaded records into the Stackdriver logs of the Cloud Function. The response for each record is JSON, so you will have to convert the response to a JSON object to get the data you want. Get the idea and modify the code accordingly.
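For reference, here is a minimal sketch of what such a loadDataFromDatastore handler could look like (the linked GitHub code is not reproduced here; taking the kind name from the request body and the response shape are assumptions):

'use strict';
const { Datastore } = require('@google-cloud/datastore');
const datastore = new Datastore();

exports.loadDataFromDatastore = async (req, res) => {
  try {
    // Assumption: the kind ("table") name is passed in the request body, e.g. {"table": "users"}
    const kind = req.body.table;
    const query = datastore.createQuery(kind);
    const [entities] = await datastore.runQuery(query);

    // Attach each entity's key identifier alongside its properties
    const records = entities.map(entity => ({
      id: entity[datastore.KEY].id || entity[datastore.KEY].name,
      ...entity,
    }));

    console.log(records); // also visible in the Stackdriver logs
    res.status(200).json(records);
  } catch (err) {
    console.error('ERROR:', err);
    res.status(500).send(err.message);
  }
};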