ttl not working in loopback when I set it for a Model - loopbackjs

I am trying to set ttl for a loopback model so that the document gets deleted automatically after specified time.
Here is the property I have added:
"ttl": {
"type": "number",
"required": true
}
This is not AccessToken model but a separate model whose documents I want to be deleted after specified time interval.

AccessTokens don't get deleted after their ttl is up, they just invalidate the token for login purposes. I'm not sure that any database/ORM will just delete rows after they've been around for a certain mount of time I was wrong mongodb does this, however loopback does not actually use this feature. This script will create a job which deletes all rows who have expired according to their ttl column.
server/boot/job-delete-expired.js
module.exports = (server) => {
const myModel = server.models.myModel;
if (!myModel) {
throw new Error("My model not found!");
}
const deleteExpiredModels = async () => {
const now = new Date();
const all = await myModel.find();
// If the created time + the ttl is paste now then it can be deleted
const expired = all.filter(m => (m.created + m.ttl) > now);
// Delete them all
await Promise.all(expired.map(e => myModel.destroyById(e.id)));
};
// Execute this every 10 minutes
setInterval(() => deleteExpiredModels(), 60000)
};
Disclaimer: This code has no error handling, and setInterval does not wait for promises to resolve, if you're using this in production consider a while loop with async/await to make sure that only one instance of deleteExpiredModels is ever executed.

I was able to this solve as follows:
MyCollection.getDataSource().connector.connect(function(err, db) {
if(!err){
var collection = db.collection('MyCollection');
collection.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } );
}
});
Then for each document, I inserted expireAtwhich corresponds to the time the document should expire.
MongoDB automatically deletes documents from the collection at the documents’ expireAt time.

I used the model.json file to solve this
"indexes":{
"expireAt_1":{
"keys": {"createdOn": 1},
"options":{"expireAfterSeconds": 2592000}
}
}
I used a name for the index.
The key defined has the object property that has the date value.
The expireAfterSeconds value needs to be set in the options property.In this case I have set it to 30 days after createdOn date

Related

How to modify the request/response to get directly the prices of an AWS virtual machine?

I am currently learning how to use AWS pricing SDK. My objective is to get all the prices of aws virtual machines, as the prices can be different from a region to another.
Basically, I am running this code :
AmazonPricingClient client = new(keyId, key, RegionEndpoint.USEast1);
// Developement filters to handle smaller amount of data
GetProductsRequest request = new() {
ServiceCode = "AmazonEC2",
Filters = new() {
new ()
{
Field = "vcpu",
Type = "TERM_MATCH",
Value = "2"
},
new()
{
Field = "currentGeneration",
Type = "TERM_MATCH",
Value = "Yes"
},
new()
{
Field = "regionCode",
Type = "TERM_MATCH",
Value = "eu-west-1"
},
new()
{
Field = "operatingSystem",
Type = "TERM_MATCH",
Value = "Windows"
}
}
};
GetProductsResponse response = await client.GetProductsAsync(request);
Taking in consideration the filters (added to reduce the amount of data while testing the code out), I will only get the prices of the matching virtual machines for region eu-west-1.
If I delete this dev filter (for production for exemple), I will get the prices of every region, but each time, I will also get this part of the returned json :
"product":{
"productFamily":"Compute Instance",
"attributes":{
"enhancedNetworkingSupported":"Yes",
"intelTurboAvailable":"No",
"memory":"16 GiB",
"dedicatedEbsThroughput":"Up to 3500 Mbps",
[...]
"operation":"RunInstances:000g",
"availabilityzone":"NA"
},
"sku":"2A56CED7V5PFGAH8"
}
And this part would be duplicated for each region.
Is there a way to tell the api that I just want the different prices of a specific virtual machine ? Using either the request or the response objects ?
I may have missed some possibilities offered by the SDK, feel free to tell me anything I can improve in that snippet, good practices,...
Thanks !

sqlAdmin.backupRuns.list how to set maxresult attribute

I have written a Nodejs code on Cloud Functions to create an on-demand backup and delete backup after retention days. However, calling sqlAdmin.backupRuns.list() returns a default of only 20 backups.
How do I set the setMaxResults attribute? This returns an error stating method does not exist.
sqlAdmin.backupRuns.list.setMaxResults(100);
This is how I'm retrieving all the backups:
sqlAdmin.backupRuns.list(request, function(err, response)
sqlAdmin.backupRuns.list.setMaxResults(100);
sqlAdmin.backupRuns.list(request, function(err, response) {
response.data.items.forEach(element => {
//do useful
});
}
You need to pass maxResults as a parameter based on the documentation. For example:
params = {
"project": "PROJECT_NAME",
"instance": "INSTANCE_ID",
"maxResults": 100
}
const res = await sqlAdmin.backupRuns.list(params);
console.log(res.data)
INSTANCE_ID is Cloud SQL instance ID. This does not include the project ID.

GCP Cloud Tasks: shorten period for creating a previously created named task

We are developing a GCP Cloud Task based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. The reason we use Cloud Tasks is so a delay can be created (using scheduledTime property 2-min in the future) before the email is sent, and to control dedup (by using a task-name formatted as: [firestore-collection-name]-[doc-id]) since the 'write' trigger on the Firestore doc can be fired several times as the document is being created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud-task runs, and the email is sent with updated Firestore document info included. After which the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected as the queue is empty at this point as the last task completed succesfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task
with the same name can't be created for ~1hour after the original task
was deleted or executed.
Question: is there some way in which this restriction can be by-passed by lowering the amount of time, or even removing the restriction all together?
The short answer is No. As you've already pointed, the docs are very clear regarding this behavior and you should wait 1 hour to create a task with same name as one that was previously created. The API or Client Libraries does not allow to decrease this time.
Having said that, I would suggest that instead of using the same Task ID, use different ones for the task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime
def create_task(project, queue, location, payload=None, in_seconds=None):
client = tasks_v2.CloudTasksClient()
parent = client.queue_path(project, location, queue)
task = {
'app_engine_http_request': {
'http_method': 'POST',
'relative_uri': '/task/'+queue
}
}
if payload is not None:
converted_payload = payload.encode()
task['app_engine_http_request']['body'] = converted_payload
if in_seconds is not None:
d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
timestamp = timestamp_pb2.Timestamp()
timestamp.FromDatetime(d)
task['schedule_time'] = timestamp
response = client.create_task(parent, task)
print('Created task {}'.format(response.name))
print(response)
#You can change DOCUMENT_ID with USER_ID or something to identify the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of requiring to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires - in the OP's case, send 1 email for all writes within 2 minutes.
In our case we are using Firebease Functions task queue utils and not directly interacting with Cloud Tasks but thats immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
async function enqueueTask(shopId) {
const queueName = 'doSomething';
const now = new Date();
const next = new Date(now.getTime() + 2 * 60 * 1000);
try {
const shouldEnqueue = await getFirestore().runTransaction(async t=>{
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate()> now) {
return false;
}
await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
return true;
});
if (shouldEnqueue) {
let queue = getFunctions().taskQueue(queueName);
await queue.enqueue({
timestamp: next.toISOString(),
},
{ scheduleTime: next }); }
} catch {
...
}
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
exports.doSomething = functions.tasks.taskQueue({
retryConfig: {
maxAttempts: 2,
minBackoffSeconds: 60,
},
rateLimits: {
maxConcurrentDispatches: 2,
}
}).onDispatch(async data => {
let { timestamp } = data;
await sendYourEmailHere();
await getFirestore().runTransaction(async t => {
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate() <= new Date(timestamp)) {
await t.delete(syncRef);
}
});
});
This isn't a bullet proof solution (if the doSomething() execution function has high latency for example) but good enough for 99% of our use cases.

Not able to solve throttlingException in DynamoDB

I have a lambda function which does a transaction in DynamoDB similar to this.
try {
const reservationId = genId();
await transactionFn();
return {
statusCode: 200,
body: JSON.stringify({id: reservationId})
};
async function transactionFn() {
try {
await docClient.transactWrite({
TransactItems: [
{
Put: {
TableName: ReservationTable,
Item: {
reservationId,
userId,
retryCount: Number(retryCount),
}
}
},
{
Update: {
TableName: EventDetailsTable,
Key: {eventId},
ConditionExpression: 'available >= :minValue',
UpdateExpression: `set available = available - :val, attendees= attendees + :val, lastUpdatedDate = :updatedAt`,
ExpressionAttributeValues: {
":val": 1,
":updatedAt": currentTime,
":minValue": 1
}
}
}
]
}).promise();
return true
} catch (e) {
const transactionConflictError = e.message.search("TransactionConflict") !== -1;
// const throttlingException = e.code === 'ThrottlingException';
console.log("transactionFn:transactionConflictError:", transactionConflictError);
if (transactionConflictError) {
retryCount += 1;
await transactionFn();
return;
}
// if(throttlingException){
//
// }
console.log("transactionFn:e.code:", JSON.stringify(e));
throw e
}
}
It just updating 2 tables on api call. If it encounter a transaction conflict error, it simply retry the transaction by recursively calling the function.
The eventDetails table is getting too much db updates. ( checked it with aws Contributor Insights) so, made provisioned unit to a higher value than earlier.
For reservationTable Provisioned capacity is on Demand.
When I do load test over this api with 400 (or more) users using JMeter (master slave configuration) I am getting Throttled error for some api calls and some api took more than 20 sec to respond.
When I checked X-Ray for this api found that, DynamoDB is taking too much time for this transasction for the slower api calls.
Even with much fixed provisioning ( I tried on demand scaling too ) , I am getting throttled exception for api calls.
ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded.
Consider increasing your provisioning level with the UpdateTable API.
UPDATE
And one more thing. When I do the load testing, I am always uses the same eventId. It means, I am always updating the same row for all the api requests. I have found this article, which says that, a single partition can only have upto 1000 WCU. Since I am always updating the same row in the eventDetails table during load testing, is that causing this issue ?
I had this exact error and it helped me to change the On Demand to Provisioned under Read/write capacity mode. Try to change that, if that doesn't help, we'll go from there.
From the link you cite in your update, also described in an AWS help article here, it sounds like the issue is that all of your load testers are writing to the same entry in the table, which is going to be in the same partition, subject to the hard limit of 1,000 WCU.
Have you tried repeating this experiment with the load testers writing to different partitions?

Copying one table to another in DynamoDB

What's the best way to identically copy one table over to a new one in DynamoDB?
(I'm not worried about atomicity).
Create a backup(backups option) and restore the table with a new table name. That would get all the data into the new table.
Note: Takes considerable amount of time depending on the table size
I just used the python script, dynamodb-copy-table, making sure my credentials were in some environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), and it worked flawlessly. It even created the destination table for me.
python dynamodb-copy-table.py src_table dst_table
The default region is us-west-2, change it with the AWS_DEFAULT_REGION env variable.
AWS Pipeline provides a template which can be used for this purpose: "CrossRegion DynamoDB Copy"
See: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-crossregion-ddb-create.html
The result is a simple pipeline that looks like:
Although it's called CrossRegion you can easily use it for the same region as long the destination table name is different (Remember that table names are unique per account and region)
You can use Scan to read the data and save it to the new table.
On the AWS forums a guy from the AWS team posted another approach using EMR: How Do I Duplicate a Table?
Here's one solution to copy all items from one table to another, just using shell scripting, the AWS CLI and jq. Will work OK for smallish tables.
# exit on error
set -eo pipefail
# tables
TABLE_FROM=<table>
TABLE_TO=<table>
# read
aws dynamodb scan \
--table-name "$TABLE_FROM" \
--output json \
| jq "{ \"$TABLE_TO\": [ .Items[] | { PutRequest: { Item: . } } ] }" \
> "$TABLE_TO-payload.json"
# write
aws dynamodb batch-write-item --request-items file://"$TABLE_TO-payload.json"
# clean up
rm "$TABLE_TO-payload.json"
If you both tables to be identical, you'd want to delete all items in TABLE_TO first.
DynamoDB now supports importing from S3.
https://aws.amazon.com/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/
So, probably in almost all use cases, the easiest and cheapest way to replicate a table is
Use "Export to S3" feature to dump entire table into S3. Since this uses backup to generate the dump, table's throughput is not affected, and is very fast as well. You need to have backups (PITR) enabled. See https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/
Use "Import from S3" to import the dump created in step 1. This automatically requires you to create a new table.
Use this node js module : copy-dynamodb-table
This is a little script I made to copy the contents of one table to another.
It's based on the AWS-SDK v3. Not sure how well it would scale to big tables but as a quick and dirty solution it does the job.
It gets your AWS credentials from a profile in ~/.aws/credentials change default to the name of the profile you want to use.
Other than that it takes two args one for the source table and one for destination
const { fromIni } = require("#aws-sdk/credential-providers");
const { DynamoDBClient, ScanCommand, PutItemCommand } = require("#aws-sdk/client-dynamodb");
const ddbClient = new DynamoDBClient({
credentials: fromIni({profile: "default"}),
region: "eu-west-1",
});
const args = process.argv.slice(2);
console.log(args)
async function main() {
const { Items } = await ddbClient.send(
new ScanCommand({
TableName: args[0],
})
);
console.log("Successfully scanned table")
console.log("Copying", Items.length, "Items")
const putPromises = [];
Items.forEach((item) => {
putPromises.push(
ddbClient.send(
new PutItemCommand({
TableName: args[1],
Item: item,
})
)
);
});
await Promise.all(putPromises);
console.log("Successfully copied table")
}
main();
Usage
node copy-table.js <source_table_name> <destination_table_name>
Python + boto3 🚀
The script is idempotent as far as you maintain the same Keys.
import boto3
def migrate(source, target):
dynamo_client = boto3.client('dynamodb', region_name='us-east-1')
dynamo_target_client = boto3.client('dynamodb', region_name='us-west-2')
dynamo_paginator = dynamo_client.get_paginator('scan')
dynamo_response = dynamo_paginator.paginate(
TableName=source,
Select='ALL_ATTRIBUTES',
ReturnConsumedCapacity='NONE',
ConsistentRead=True
)
for page in dynamo_response:
for item in page['Items']:
dynamo_target_client.put_item(
TableName=target,
Item=item
)
if __name__ == '__main__':
migrate('awesome-v1', 'awesome-v2')
On November 29th, 2017 Global Tables was introduced. This may be useful depending on your use case, which may not be the same as the original question. Here are a few snippets from the blog post:
Global Tables – You can now create tables that are automatically replicated across two or more AWS Regions, with full support for multi-master writes, with a couple of clicks. This gives you the ability to build fast, massively scaled applications for a global user base without having to manage the replication process.
...
You do not need to make any changes to your existing code. You simply send write requests and eventually consistent read requests to a DynamoDB endpoint in any of the designated Regions (writes that are associated with strongly consistent reads should share a common endpoint). Behind the scenes, DynamoDB implements multi-master writes and ensures that the last write to a particular item prevails. When you use Global Tables, each item will include a timestamp attribute representing the time of the most recent write. Updates are propagated to other Regions asynchronously via DynamoDB Streams and are typically complete within one second (you can track this using the new ReplicationLatency and PendingReplicationCount metrics).
Another option is to download the table as a .csv file and upload it with the following snippet of code.
This also eliminates the need for providing your AWS credentials to a packages such as the one #ezzat suggests.
Create a new folder and add the following two files and your exported table
Edit uploadToDynamoDB.js and add the filename of the exported table and your table name
Run npm install in the folder
Run node uploadToDynamodb.js
File: Package.json
{
"name": "uploadtodynamodb",
"version": "1.0.0",
"description": "",
"main": "uploadToDynamoDB.js",
"author": "",
"license": "ISC",
"dependencies": {
"async": "^3.1.1",
"aws-sdk": "^2.624.0",
"csv-parse": "^4.8.5",
"fs": "0.0.1-security",
"lodash": "^4.17.15",
"uuid": "^3.4.0"
}
}
File: uploadToDynamoDB.js
var fs = require('fs');
var parse = require('csv-parse');
var async = require('async');
var _ = require('lodash')
var AWS = require('aws-sdk');
// If your table is in another region, make sure to update this
AWS.config.update({ region: "eu-central-1" });
var ddb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
var csv_filename = "./TABLE_CSV_EXPORT_FILENAME.csv";
var tableName = "TABLENAME"
function prepareData(data_chunk) {
const items = data_chunk.map(obj => {
const keys = Object.keys(obj)
let attr = Object.values(obj)
attr = attr.map(a => {
let newAttr;
// Can we make this an integer
if (isNaN(Number(a))) {
newAttr = { "S": a }
} else {
newAttr = { "N": a }
}
return newAttr
})
let item = _.zipObject(keys, attr)
return {
PutRequest: {
Item: item
}
}
})
var params = {
RequestItems: {
[tableName]: items
}
};
return params
}
rs = fs.createReadStream(csv_filename);
parser = parse({
columns : true,
delimiter : ','
}, function(err, data) {
var split_arrays = [], size = 25;
while (data.length > 0) {
split_arrays.push(data.splice(0, size));
}
data_imported = false;
chunk_no = 1;
async.each(split_arrays, function(item_data, callback) {
const params = prepareData(item_data)
ddb.batchWriteItem(
params,
function (err, data) {
if (err) {
console.log("Error", err);
} else {
console.log("Success", data);
}
});
}, function() {
// run after loops
console.log('all data imported....');
});
});
rs.pipe(parser);
It's been a very long time since the question was posted and AWS has been continuously improvising features. At the time of writing this answer, we have the option to export the Table to S3 bucket then use the import feature to import this data from S3 into a new table which automatically will re-create a new table with the data. Plese refer this blog for more idea on export & import
Best part is that you get to change the name, PK or SK.
Note: You have to enable PITR (might incur additional costs). Always best to refer documents.
Here is another simple python util script for this: ddb_table_copy.py. I use it often.
usage: ddb_table_copy.py [-h] [--dest-table DEST_TABLE] [--dest-file DEST_FILE] source_table
Copy all DynamoDB items from SOURCE_TABLE to either DEST_TABLE or DEST_FILE. Useful for migrating data during a stack teardown/re-creation.
positional arguments:
source_table Name of source table in DynamoDB.
optional arguments:
-h, --help show this help message and exit
--dest-table DEST_TABLE
Name of destination table in DynamoDB.
--dest-file DEST_FILE
2) a valid file path string to save the items to, e.g. 'items.json'.