I developed a sample app whose backend DB is Redshift, and I am trying to execute a query with the following SDK code.
import { RedshiftDataClient, ExecuteStatementCommand } from '@aws-sdk/client-redshift-data';
export const resolvers: IResolvers<unknown, Context> = {
  Query: {
    user: (parent, args, context): User => ({ login: context.login }),
    region: (): string => getRegion(),
    getData: async () => {
      const redshift_client = new RedshiftDataClient({});
      const request = new ExecuteStatementCommand({
        ClusterIdentifier: 'testrs',
        Sql: `select * from test`,
        SecretArn: 'arn:aws:secretsmanager:us-east-1:12345561:secret:test-HYRSWs',
        Database: 'test',
      });
      try {
        const data = await redshift_client.send(request);
        console.log('data', data);
        return data;
      } catch (error) {
        console.error(error);
        throw new Error('Failed fetching data from Redshift');
      } finally {
        // executes regardless of error state
      }
    },
  },
};
It returned the following error:
ERROR AccessDeniedException:
User: arn:aws:sts::12345561:assumed-role/WebsiteStack-Beta-US-EAST-GraphQLLambdaServiceRole1BCPB5P3Q4IS9/GraphQLLambda
is not authorized to perform: redshift-data:ExecuteStatement on resource: arn:aws:redshift:us-east-1:12345561:cluster:testrs
because no identity-based policy allows the redshift-data:ExecuteStatement action
Must I use an SDK package like STS?
If someone has an opinion or materials, will you please let me know?
Thanks
I know that when using the AWS SDK for Java V2 for this exact use case, you can successfully query data by setting up an ExecuteStatementRequest object and passing it to the Data Client's executeStatement, like this:
if (num == 5)
    sqlStatement = "SELECT TOP 5 * FROM blog ORDER BY date DESC";
else if (num == 10)
    sqlStatement = "SELECT TOP 10 * FROM blog ORDER BY date DESC";
else
    sqlStatement = "SELECT * FROM blog ORDER BY date DESC";

ExecuteStatementRequest statementRequest = ExecuteStatementRequest.builder()
    .clusterIdentifier(clusterId)
    .database(database)
    .dbUser(dbUser)
    .sql(sqlStatement)
    .build();

ExecuteStatementResponse response = redshiftDataClient.executeStatement(statementRequest);
As shown here -- the required values are clusterId, database, and dbUser.
I would assume the AWS SDK for JavaScript would work the same way. (I have not tried using that SDK however).
The reference docs confirm this...
Temporary credentials - when connecting to a cluster, specify the cluster identifier, the database name, and the database user name. Also, permission to call the redshift:GetClusterCredentials operation is required. When connecting to a serverless endpoint, specify the database name.
https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-redshift-data/classes/executestatementcommand.html
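For completeness, here is a minimal, untested sketch of what the equivalent call might look like with the AWS SDK for JavaScript v3 when using the temporary-credentials route (DbUser instead of SecretArn; 'awsuser' is just a placeholder user name). Whichever route you use, the Lambda's execution role also needs an identity-based policy allowing redshift-data:ExecuteStatement on the cluster (plus redshift:GetClusterCredentials for the temporary-credentials route), which is exactly what the AccessDeniedException above is complaining about.
import { RedshiftDataClient, ExecuteStatementCommand } from '@aws-sdk/client-redshift-data';

// Sketch only: temporary-credentials variant of ExecuteStatementCommand.
const client = new RedshiftDataClient({});
const command = new ExecuteStatementCommand({
  ClusterIdentifier: 'testrs', // cluster identifier
  Database: 'test',            // database name
  DbUser: 'awsuser',           // database user name (placeholder)
  Sql: 'select * from test',
});
const response = await client.send(command);
console.log(response.Id); // statement id; results are fetched separately (GetStatementResult)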
I want to retrieve a table (with all rows) by name. I want to make an HTTP request with something like this in the body: {"table": user}.
Tried this code without success:
'use strict';
const {Datastore} = require('@google-cloud/datastore');

// Instantiates a client
const datastore = new Datastore();

exports.getUsers = (req, res) => {
  // Get list
  const query = this.datastore.createQuery('users');
  this.datastore.runQuery(query).then(results => {
    const customers = results[0];
    console.log('User:');
    customers.forEach(customer => {
      const cusKey = customer[this.datastore.KEY];
      console.log(cusKey.id);
      console.log(customer);
    });
  })
  .catch(err => { console.error('ERROR:', err); });
}
Google Datastore is a NoSQL database that works with entities, not tables. What you want is to load all the "records", which are "key identifiers" in Datastore, together with all their "properties", which are the "columns" you see in the Console. But you want to load them based on the "Kind" name, which is the "table" you are referring to.
Here is a solution for retrieving all the key identifiers and their properties from Datastore, using an HTTP-triggered Cloud Function running in the Node.js 8 environment.
Create a Google Cloud Function and set the trigger to HTTP.
Choose Node.js 8 as the runtime.
In index.js replace all the code with this GitHub code.
In package.json add:
{
"name": "sample-http",
"version": "0.0.1",
"dependencies": {
"#google-cloud/datastore": "^3.1.2"
}
}
Under Function to execute add loadDataFromDatastore, since this is the name of the function that we want to execute.
NOTE: This will log all the loaded records into the Stackdriver logs of the Cloud Function. The response for each record is JSON, so you will have to convert the response to a JSON object to get the data you want. Get the idea and modify the code accordingly.
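Going back to the code in the question, a minimal sketch of such an HTTP Cloud Function could look like the one below. It assumes @google-cloud/datastore v3, takes the kind ("table") name from the request body, uses the client as datastore rather than this.datastore, and returns the entities in the HTTP response.
'use strict';
const {Datastore} = require('@google-cloud/datastore');

// Instantiates a client
const datastore = new Datastore();

exports.getUsers = (req, res) => {
  // Kind ("table") name taken from the request body, e.g. {"table": "users"}
  const kind = (req.body && req.body.table) || 'users';
  const query = datastore.createQuery(kind);

  datastore.runQuery(query)
    .then(results => {
      const entities = results[0];
      entities.forEach(entity => {
        // The record identifier lives under the datastore.KEY symbol
        console.log(entity[datastore.KEY].id || entity[datastore.KEY].name);
      });
      res.status(200).json(entities);
    })
    .catch(err => {
      console.error('ERROR:', err);
      res.status(500).send(err.message);
    });
};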
I am trying to set a TTL for a LoopBack model so that the document gets deleted automatically after a specified time.
Here is the property I have added:
"ttl": {
"type": "number",
"required": true
}
This is not the AccessToken model but a separate model whose documents I want deleted after a specified time interval.
AccessTokens don't get deleted after their ttl is up; the token is just invalidated for login purposes. I originally thought no database/ORM would simply delete rows after they've existed for a certain amount of time, but I was wrong: MongoDB does this, although LoopBack does not actually use that feature. The script below creates a job which deletes all rows that have expired according to their ttl column.
server/boot/job-delete-expired.js
module.exports = (server) => {
  const myModel = server.models.myModel;

  if (!myModel) {
    throw new Error("My model not found!");
  }

  const deleteExpiredModels = async () => {
    const now = new Date();
    const all = await myModel.find();
    // If the created time plus the ttl (assumed to be in seconds) is in the past, it can be deleted
    const expired = all.filter(m => new Date(m.created).getTime() + m.ttl * 1000 < now.getTime());
    // Delete them all
    await Promise.all(expired.map(e => myModel.destroyById(e.id)));
  };

  // Execute this every 10 minutes
  setInterval(() => deleteExpiredModels(), 10 * 60 * 1000);
};
Disclaimer: this code has no error handling, and setInterval does not wait for promises to resolve. If you're using this in production, consider a while loop with async/await to make sure that only one instance of deleteExpiredModels is ever running.
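As a rough, untested sketch of that suggestion, the setInterval call could be replaced with a self-scheduling async loop, so one cleanup pass always finishes before the next one starts:
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

const runCleanupLoop = async () => {
  // Each pass completes (or fails and is logged) before the next is scheduled,
  // so two passes never overlap.
  while (true) {
    try {
      await deleteExpiredModels();
    } catch (err) {
      console.error('Failed to delete expired models:', err);
    }
    await sleep(10 * 60 * 1000); // pause 10 minutes between passes
  }
};

runCleanupLoop();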
I was able to solve this as follows:
MyCollection.getDataSource().connector.connect(function(err, db) {
if(!err){
var collection = db.collection('MyCollection');
collection.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } );
}
});
Then, for each document, I inserted an expireAt value, which corresponds to the time the document should expire.
MongoDB automatically deletes documents from the collection at the documents’ expireAt time.
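For illustration (the model and field names are placeholders matching the index above), a document that should disappear roughly an hour after creation could be inserted like this:
// Hypothetical example: MongoDB's TTL monitor removes this document shortly
// after its expireAt time passes (the field must match the indexed "expireAt").
MyCollection.create({
  name: 'temporary record',
  expireAt: new Date(Date.now() + 60 * 60 * 1000) // expire ~1 hour from now
});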
I used the model.json file to solve this
"indexes":{
"expireAt_1":{
"keys": {"createdOn": 1},
"options":{"expireAfterSeconds": 2592000}
}
}
I used a name for the index.
The key defined is the model property that holds the date value.
The expireAfterSeconds value needs to be set in the options property. In this case, I have set it to 30 days (2,592,000 seconds) after the createdOn date.
I have tried automating the backup of an AWS EC2 instance using a Lambda function triggered by a CloudWatch Events rule. I am using the free tier.
I have scheduled the backup every 5 minutes, but after the first backup (i.e. AMI creation) there is no further AMI creation.
Can we create multiple AMIs of the same instance?
Below is the Lambda function used.
Regards
Monika
var aws = require('aws-sdk');
aws.config.region = 'us-east-1';
var ec2 = new aws.EC2();
var now = new Date();
var date = now.toISOString().substring(0, 10);
var hours = now.getHours() ;
var minutes = now.getMinutes() ;
exports.handler = function(event, context) {
var instanceparams = {
Filters: [{
Name: 'tag:Backup',
Values: [
'yes'
]
}]
};
ec2.describeInstances(instanceparams, function(err, data) {
if (err) console.log(err, err.stack);
else {
for (var i in data.Reservations) {
for (var j in data.Reservations[i].Instances) {
var instanceid = data.Reservations[i].Instances[j].InstanceId;
var nametag = data.Reservations[i].Instances[j].Tags;
for (var k in data.Reservations[i].Instances[j].Tags) {
if (data.Reservations[i].Instances[j].Tags[k].Key == 'Name') {
var name = data.Reservations[i].Instances[j].Tags[k].Value;
}
}
console.log("Creating AMIs of the Instance: ", name);
var imageparams = {
InstanceId: instanceid,
Name: name + "_" + date + "_" + hours + "-" + minutes,
NoReboot: true
};
ec2.createImage(imageparams, function(err, data) {
if (err) console.log(err, err.stack);
else {
var image = data.ImageId;
console.log(image);
var tagparams = {
Resources: [image],
Tags: [{
Key: 'DeleteOn',
Value: 'yes'
}]
};
ec2.createTags(tagparams, function(err, data) {
if (err) console.log(err, err.stack);
else console.log("Tags added to the created AMIs");
});
}
});
}
}
}
});
};
It is not being created because two AMIs cannot have the same name. Since date, hours and minutes are computed outside the handler, they are evaluated only once per Lambda container, so subsequent invocations try to reuse the same image name and the CreateImage call fails.
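Assuming that is the cause, a minimal sketch of the fix is to build the timestamp inside the handler so each invocation produces a fresh, unique image name:
exports.handler = function (event, context) {
    // Compute the timestamp per invocation instead of at module load time
    var now = new Date();
    var date = now.toISOString().substring(0, 10);
    var hours = now.getHours();
    var minutes = now.getMinutes();

    // ...the existing describeInstances / createImage / createTags logic goes
    // here unchanged, using the per-invocation date, hours and minutes.
};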
An AMI is the same as a Snapshot, except it can also be used to launch a new instance. An AMI can also contain multiple snapshots (multiple drives).
If your system operates from one volume (the boot volume), having an AMI is an easy way to launch a new instance with exactly the same data. This is normally done to launch an instance with pre-installed software (thus making it in a known state), but can also be used for backup purposes.
Having a snapshot as a backup certainly provides a copy of the volume as at the time of snapshot creation, but to restore the snapshot you actually have to restore the snapshot to a new EBS volume, convert the snapshot to an AMI, then launch an instance from it. (It's a bit harder if it is a Windows boot volume.)
Snapshots and AMIs are incremental, only needing to copy blocks that have been added or changed since a previous snapshot/AMI was created. Thus, they can be very fast to create.
It is not immediately obvious why your code is not functioning correctly. I would suggest adding debug statements before each API call and within the callbacks to obtain more information.
For reference, see also an EBS Snapshotter in Python.
You can automate your AMI backups. I'm not a Lambda expert, but it can be done -- make sure the IAM roles have the correct permissions and that your functions look for the EC2 Backup and Retention tags. Then you can schedule it through the management console. Here's an article with solid details on creating this function. There are other ways to automate snapshots/backups in AWS, if interested.
What's the best way to identically copy one table over to a new one in DynamoDB?
(I'm not worried about atomicity).
Create a backup (Backups option) and restore the table with a new table name. That will get all the data into the new table.
Note: this takes a considerable amount of time, depending on the table size.
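As an illustrative, untested sketch (command and parameter names as I understand them from the AWS SDK for JavaScript v3; the table and region names are placeholders), the same backup-and-restore flow could be scripted like this:
const {
  DynamoDBClient,
  CreateBackupCommand,
  RestoreTableFromBackupCommand,
} = require('@aws-sdk/client-dynamodb');

const client = new DynamoDBClient({ region: 'us-east-1' }); // placeholder region

async function copyViaBackup(sourceTable, targetTable) {
  // 1. Create an on-demand backup of the source table
  const backup = await client.send(new CreateBackupCommand({
    TableName: sourceTable,
    BackupName: `${sourceTable}-copy`,
  }));

  // 2. Restore that backup into a brand-new table. In practice you may need to
  //    wait until the backup status becomes AVAILABLE before restoring.
  await client.send(new RestoreTableFromBackupCommand({
    TargetTableName: targetTable,
    BackupArn: backup.BackupDetails.BackupArn,
  }));
}

copyViaBackup('source-table', 'destination-table'); // placeholder table names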
I just used the python script, dynamodb-copy-table, making sure my credentials were in some environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), and it worked flawlessly. It even created the destination table for me.
python dynamodb-copy-table.py src_table dst_table
The default region is us-west-2, change it with the AWS_DEFAULT_REGION env variable.
AWS Pipeline provides a template which can be used for this purpose: "CrossRegion DynamoDB Copy"
See: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-crossregion-ddb-create.html
The result is a simple pipeline.
Although it's called CrossRegion, you can easily use it within the same region as long as the destination table name is different (remember that table names are unique per account and region).
You can use Scan to read the data and save it to the new table.
On the AWS forums a guy from the AWS team posted another approach using EMR: How Do I Duplicate a Table?
Here's one solution to copy all items from one table to another, just using shell scripting, the AWS CLI and jq. Will work OK for smallish tables.
# exit on error
set -eo pipefail
# tables
TABLE_FROM=<table>
TABLE_TO=<table>
# read
aws dynamodb scan \
--table-name "$TABLE_FROM" \
--output json \
| jq "{ \"$TABLE_TO\": [ .Items[] | { PutRequest: { Item: . } } ] }" \
> "$TABLE_TO-payload.json"
# write
aws dynamodb batch-write-item --request-items file://"$TABLE_TO-payload.json"
# clean up
rm "$TABLE_TO-payload.json"
If you want both tables to be identical, you'd want to delete all items in TABLE_TO first.
DynamoDB now supports importing from S3.
https://aws.amazon.com/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/
So, probably in almost all use cases, the easiest and cheapest way to replicate a table is:
Use the "Export to S3" feature to dump the entire table into S3. Since this uses a backup to generate the dump, the table's throughput is not affected, and it is very fast as well. You need to have backups (PITR) enabled. See https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/
Use "Import from S3" to import the dump created in step 1. This imports the data into a new table, which the import creates automatically.
Use this Node.js module: copy-dynamodb-table
This is a little script I made to copy the contents of one table to another.
It's based on the AWS-SDK v3. Not sure how well it would scale to big tables but as a quick and dirty solution it does the job.
It gets your AWS credentials from a profile in ~/.aws/credentials; change "default" to the name of the profile you want to use.
Other than that, it takes two args: one for the source table and one for the destination.
const { fromIni } = require("@aws-sdk/credential-providers");
const { DynamoDBClient, ScanCommand, PutItemCommand } = require("@aws-sdk/client-dynamodb");
const ddbClient = new DynamoDBClient({
credentials: fromIni({profile: "default"}),
region: "eu-west-1",
});
const args = process.argv.slice(2);
console.log(args)
async function main() {
const { Items } = await ddbClient.send(
new ScanCommand({
TableName: args[0],
})
);
console.log("Successfully scanned table")
console.log("Copying", Items.length, "Items")
const putPromises = [];
Items.forEach((item) => {
putPromises.push(
ddbClient.send(
new PutItemCommand({
TableName: args[1],
Item: item,
})
)
);
});
await Promise.all(putPromises);
console.log("Successfully copied table")
}
main();
Usage
node copy-table.js <source_table_name> <destination_table_name>
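If the source table is larger than a single Scan page (roughly 1 MB of data per call), the read step needs pagination. An untested sketch using the same ddbClient and ScanCommand as above:
// Follow LastEvaluatedKey so the whole table is read, not just the first page.
async function scanAll(tableName) {
  const items = [];
  let ExclusiveStartKey;
  do {
    const page = await ddbClient.send(
      new ScanCommand({ TableName: tableName, ExclusiveStartKey })
    );
    items.push(...page.Items);
    ExclusiveStartKey = page.LastEvaluatedKey; // undefined once the scan is complete
  } while (ExclusiveStartKey);
  return items;
}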
Python + boto3 🚀
The script is idempotent as long as you maintain the same keys.
import boto3


def migrate(source, target):
    dynamo_client = boto3.client('dynamodb', region_name='us-east-1')
    dynamo_target_client = boto3.client('dynamodb', region_name='us-west-2')

    dynamo_paginator = dynamo_client.get_paginator('scan')
    dynamo_response = dynamo_paginator.paginate(
        TableName=source,
        Select='ALL_ATTRIBUTES',
        ReturnConsumedCapacity='NONE',
        ConsistentRead=True
    )

    for page in dynamo_response:
        for item in page['Items']:
            dynamo_target_client.put_item(
                TableName=target,
                Item=item
            )


if __name__ == '__main__':
    migrate('awesome-v1', 'awesome-v2')
On November 29th, 2017, Global Tables was introduced. This may be useful depending on your use case, which may not be the same as the original question. Here are a few snippets from the blog post:
Global Tables – You can now create tables that are automatically replicated across two or more AWS Regions, with full support for multi-master writes, with a couple of clicks. This gives you the ability to build fast, massively scaled applications for a global user base without having to manage the replication process.
...
You do not need to make any changes to your existing code. You simply send write requests and eventually consistent read requests to a DynamoDB endpoint in any of the designated Regions (writes that are associated with strongly consistent reads should share a common endpoint). Behind the scenes, DynamoDB implements multi-master writes and ensures that the last write to a particular item prevails. When you use Global Tables, each item will include a timestamp attribute representing the time of the most recent write. Updates are propagated to other Regions asynchronously via DynamoDB Streams and are typically complete within one second (you can track this using the new ReplicationLatency and PendingReplicationCount metrics).
Another option is to download the table as a .csv file and upload it with the following snippet of code.
This also eliminates the need to provide your AWS credentials to a package such as the one @ezzat suggests.
Create a new folder and add the following two files and your exported table
Edit uploadToDynamoDB.js and add the filename of the exported table and your table name
Run npm install in the folder
Run node uploadToDynamoDB.js
File: package.json
{
"name": "uploadtodynamodb",
"version": "1.0.0",
"description": "",
"main": "uploadToDynamoDB.js",
"author": "",
"license": "ISC",
"dependencies": {
"async": "^3.1.1",
"aws-sdk": "^2.624.0",
"csv-parse": "^4.8.5",
"fs": "0.0.1-security",
"lodash": "^4.17.15",
"uuid": "^3.4.0"
}
}
File: uploadToDynamoDB.js
var fs = require('fs');
var parse = require('csv-parse');
var async = require('async');
var _ = require('lodash')
var AWS = require('aws-sdk');
// If your table is in another region, make sure to update this
AWS.config.update({ region: "eu-central-1" });
var ddb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
var csv_filename = "./TABLE_CSV_EXPORT_FILENAME.csv";
var tableName = "TABLENAME"
function prepareData(data_chunk) {
const items = data_chunk.map(obj => {
const keys = Object.keys(obj)
let attr = Object.values(obj)
attr = attr.map(a => {
let newAttr;
// Can we make this an integer
if (isNaN(Number(a))) {
newAttr = { "S": a }
} else {
newAttr = { "N": a }
}
return newAttr
})
let item = _.zipObject(keys, attr)
return {
PutRequest: {
Item: item
}
}
})
var params = {
RequestItems: {
[tableName]: items
}
};
return params
}
rs = fs.createReadStream(csv_filename);
parser = parse({
columns : true,
delimiter : ','
}, function(err, data) {
var split_arrays = [], size = 25;
while (data.length > 0) {
split_arrays.push(data.splice(0, size));
}
data_imported = false;
chunk_no = 1;
async.each(split_arrays, function(item_data, callback) {
const params = prepareData(item_data)
ddb.batchWriteItem(params, function (err, data) {
if (err) {
console.log("Error", err);
} else {
console.log("Success", data);
}
// Tell async.each this chunk is done so the final callback can fire
callback();
});
}, function() {
// run after loops
console.log('all data imported....');
});
});
rs.pipe(parser);
It's been a very long time since the question was posted, and AWS has been continuously improving its features. At the time of writing this answer, we have the option to export the table to an S3 bucket and then use the import feature to load that data from S3 into a new table, which is automatically created with the data. Please refer to this blog for more details on export & import.
The best part is that you get to change the name, PK or SK.
Note: you have to enable PITR (which might incur additional costs). It's always best to refer to the documentation.
Here is another simple Python utility script for this: ddb_table_copy.py. I use it often.
usage: ddb_table_copy.py [-h] [--dest-table DEST_TABLE] [--dest-file DEST_FILE] source_table
Copy all DynamoDB items from SOURCE_TABLE to either DEST_TABLE or DEST_FILE. Useful for migrating data during a stack teardown/re-creation.
positional arguments:
source_table Name of source table in DynamoDB.
optional arguments:
-h, --help show this help message and exit
--dest-table DEST_TABLE
Name of destination table in DynamoDB.
--dest-file DEST_FILE
2) a valid file path string to save the items to, e.g. 'items.json'.