I am new to DynamoDB. I want to auto-increment the id value when I use PutItem with DynamoDB.
Is it possible to do that?
This is an anti-pattern in DynamoDB, which is built to scale across many partitions/shards/servers. DynamoDB does not support auto-increment primary keys: a global counter cannot be guaranteed across multiple servers without becoming a scaling bottleneck.
A better option is to assemble the primary key from multiple indices. The primary key can be up to 2048 bytes. There are a few options:
Use a UUID as your key - possibly a time-based UUID, which makes it unique and evenly distributed, and carries a time value
Use a randomly generated number or timestamp + random suffix (possibly bit-shifted), like: ts << 12 + random_number
Use another service or DynamoDB itself to generate an incremental unique id (requires an extra call)
The following code will atomically increment a counter in DynamoDB; you can then use the returned value as your primary key.
var documentClient = new AWS.DynamoDB.DocumentClient();
var params = {
TableName: 'sampletable',
Key: { HashKey : 'counters' },
UpdateExpression: 'ADD #a :x',
ExpressionAttributeNames: {'#a' : "counter_field"},
ExpressionAttributeValues: {':x' : 1},
ReturnValues: "UPDATED_NEW" // ensures you get value back
};
documentClient.update(params, function(err, data) {
    if (err) { console.error(err); return; }
    // data.Attributes.counter_field holds the new value - use it as your primary key
});
My personal favorite is timestamp + random, inspired by Instagram's sharded ID generation, described at http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
The following function will generate an id for a specific shard (provided as a parameter). This way you get a unique key assembled from a timestamp, the shard number, and some randomness (0-511).
var CUSTOMEPOCH = 1300000000000; // artificial epoch
function generateRowId(shardId /* range 0-63 for shard/slot */) {
    var ts = new Date().getTime() - CUSTOMEPOCH; // limit to recent
    var randid = Math.floor(Math.random() * 512); // 0-511
    ts = (ts * 64);   // bit-shift << 6: make room for the shard id
    ts = ts + shardId;
    return (ts * 512) + randid; // make room for the random part
}
var newPrimaryHashKey = "obj_name:" + generateRowId(4);
// example output: "obj_name:8055517407349240"
DynamoDB doesn't provide this out of the box. You can generate something in your application, such as UUIDs, that "should" be unique enough for most systems.
I noticed you were using Node.js (I removed your tag). Here is a library that provides UUID functionality: node-uuid
Example from the README:
var uuid = require('node-uuid');
var uuid1 = uuid.v1();                                        // time-based UUID
var uuid2 = uuid.v1({node:[0x01,0x23,0x45,0x67,0x89,0xab]}); // with an explicit node id
var uuid3 = uuid.v1({node:[0, 0, 0, 0, 0, 0]});
var uuid4 = uuid.v4();                                        // random UUID
var uuid5 = uuid.v4();
You can probably use AtomicCounters.
With AtomicCounters, you can use the UpdateItem operation to implement
an atomic counter—a numeric attribute that is incremented,
unconditionally, without interfering with other write requests. (All
write requests are applied in the order in which they were received.)
With an atomic counter, the updates are not idempotent. In other
words, the numeric value increments each time you call UpdateItem.
You might use an atomic counter to track the number of visitors to a
website. In this case, your application would increment a numeric
value, regardless of its current value. If an UpdateItem operation
fails, the application could simply retry the operation. This would
risk updating the counter twice, but you could probably tolerate a
slight overcounting or undercounting of website visitors.
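For example, here is a minimal sketch of the two-call flow in Node.js (the table and attribute names - counters, counterName, orders, currentValue - are hypothetical), using the AWS SDK v2 DocumentClient:
var AWS = require('aws-sdk');
var documentClient = new AWS.DynamoDB.DocumentClient();
// Step 1: atomically bump the counter and read the new value back.
documentClient.update({
  TableName: 'counters',                    // hypothetical counters table
  Key: { counterName: 'orders' },           // one item per sequence
  UpdateExpression: 'ADD #v :inc',
  ExpressionAttributeNames: { '#v': 'currentValue' },
  ExpressionAttributeValues: { ':inc': 1 },
  ReturnValues: 'UPDATED_NEW'
}, function (err, data) {
  if (err) { console.error(err); return; }
  var newId = data.Attributes.currentValue;
  // Step 2: use the returned value as the new item's primary key.
  documentClient.put({
    TableName: 'orders',                    // hypothetical data table
    Item: { id: newId, createdAt: Date.now() }
  }, function (putErr) {
    if (putErr) console.error(putErr);
  });
});
Note this costs an extra request per write, and if the put fails after the counter update you simply skip an id, which is usually tolerable.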
I came across a similar issue, where I required an auto-incrementing primary key in my table. We could use some randomization technique to generate a random key and store it, but it wouldn't be incremental.
If you require something incremental, you can use Unix time as your primary key. It won't guarantee exact one-by-one incrementation, but every record you put will have a larger key than the records inserted before it, reflecting the time elapsed between inserts.
It's not a complete solution, but it works if you don't want to read the entire table, get its last id, and then increment it.
Following is the code for inserting a record into DynamoDB using Node.js:
.
.
const params = {
TableName: RANDOM_TABLE,
Item: {
ip: this.ip,
id: new Date().getTime() // millisecond epoch; not collision-proof under concurrent writes
}
}
dynamoDb.put(params, (error, result) => {
console.log(error, result);
});
.
.
If you are using DynamoDB with Dynamoose, you can easily set a default unique id. Here is a simple user-creation example:
// User.model.js
const dynamoose = require("dynamoose");
const userSchema = new dynamoose.Schema(
{
id: {
type: String,
hashKey: true,
},
displayName: String,
firstName: String,
lastName: String,
},
{ timestamps: true },
);
const User = dynamoose.model("User", userSchema);
module.exports = User;
// User.controller.js
const { v4: uuidv4 } = require("uuid");
const User = require("./User.model");
exports.create = async (req, res) => {
  const user = new User({ id: uuidv4(), ...req.body }); // set unique id
  const [err, response] = await to(user.save()); // `to` resolves a promise into [err, result]
  if (err) {
    return badRes(res, err); // badRes/goodRes are the author's response helpers
  }
  return goodRes(res, response);
};
Update for 2022:
I was looking into the same issue and came across the following research.
DynamoDB still doesn't support auto-incrementing primary keys.
https://aws.amazon.com/blogs/database/simulating-amazon-dynamodb-unique-constraints-using-transactions/
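The pattern from that post can be sketched roughly like this (table and key names here are hypothetical): write the real item plus a uniqueness marker item in a single transaction, both guarded by attribute_not_exists:
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();
documentClient.transactWrite({
  TransactItems: [
    {
      Put: {
        TableName: 'users',                            // hypothetical table
        Item: { pk: 'user#123', email: 'jane@example.com' },
        ConditionExpression: 'attribute_not_exists(pk)'
      }
    },
    {
      // Marker item: its mere existence enforces email uniqueness.
      Put: {
        TableName: 'users',
        Item: { pk: 'email#jane@example.com' },
        ConditionExpression: 'attribute_not_exists(pk)'
      }
    }
  ]
}, (err) => {
  if (err) console.error('transaction cancelled (e.g. duplicate email)', err);
});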
Also, the package node-uuid is now deprecated; its authors recommend the uuid package instead, which creates RFC 4122 compliant UUIDs.
npm install uuid
import { v4 as uuidv4 } from 'uuid';
uuidv4(); // ⇨ '9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d'
For Java developers, there is the DynamoDBMapper, a simple ORM. It supports the DynamoDBAutoGeneratedKey annotation. It doesn't increment a numeric value like a typical "Long id", but rather generates a UUID, as other answers here suggest. If you're mapping classes as you would with Hibernate, GORM, etc., this is more natural and requires less code.
I see no caveats in the docs about scaling issues. And it eliminates the under- or over-counting issues you get with auto-incremented numeric values (which the docs do call out).
Related
I am trying to make the primary key of my DynamoDB table something like user_uuid. The user is being created in AWS Cognito, and I can't seem to find a uuid-like field as part of the CognitoUser class. I am trying to avoid using the username as the pk.
Can someone guide me to the right solution? I can't seem to find anything on the internet regarding a user_uuid field, and for some reason I can't even find the documentation of the CognitoUser class that is imported from "amazon-cognito-identity-js".
It depends on whether you plan to use email or phone as a 'username'. In that case, I would use the sub, because it never changes. But the sub is not k-sortable, so it requires an extra DB item and index/join to make users sortable by date added. If you plan to generate your own GUID/KSUID and only use email/phone as an alias, then I would use the 'username' as a common id between your DB and user pool.
Good luck with your project!
FWIW - the KSUID generators found in the wild are massively overbuilt: 3000+ lines of code and 80+ dependencies. I made my own k-sortable, prefixed pseudo-random ID generator for Cognito users. Here's the code.
export function idGen(prefix: any) {
const validPrefix = [
'prefix1',
'prefix2'
];
//check if prefix argument is supplied
if (!prefix) {
return 'error! must supply prefix';
}
  //check if the supplied prefix is in the allowed list
  else if (validPrefix.indexOf(prefix) == -1) {
    return 'error! prefix value supplied must be one of: ' + validPrefix;
} else {
// generate epoch time in seconds
const epoch = Math.round(Date.now() / 1000);
// convert epoch time to 6 character base36 string
const time = epoch.toString(36);
// generate 20 character base36 pseudo random string
const random =
Math.random().toString(36).substring(2, 12) +
Math.random().toString(36).substring(2, 12);
// combine prefix, strings, insert : divider and return id
return prefix + ':' + time + random;
}
}
Cognito user unique identifiers can be saved to a database using a combination of the "sub" value and the username; please refer to this question for a lengthier discussion.
In the description of amazon-cognito-identity-js (found here, use case 5), they show how to get the userAttributes of a CognitoUser. One of the attributes is the sub value, which you can access, for example, like this:
user.getUserAttributes(function(err, attributes) {
if (err) {
// Handle error
} else {
// Do something with attributes
const sub = attributes.find(obj => obj.Name === 'sub').Value;
}
});
I couldn't find any documentation on the available user attributes either; I recommend using the debugger to inspect the attributes returned from the function.
I am working with AWS Keyspaces and trying to insert data from C#, but I am getting this error: "Consistency level LOCAL_ONE is not supported for this operation. Supported consistency levels are: LOCAL_QUORUM". Can anyone please help out here?
AWS keyspace
CREATE KEYSPACE IF NOT EXISTS "DevOps"
WITH REPLICATION={'class': 'SingleRegionStrategy'} ;
Table
CREATE TABLE IF NOT EXISTS "DevOps"."projectdetails" (
"id" UUID PRIMARY KEY,
"name" text,
"lastupdatedtime" timestamp,
"baname" text,
"customerid" UUID)
C# code
public async Task AddRecord(List<projectdetails> projectDetails)
{
try
{
if (projectDetails.Count > 0)
{
foreach (var item in projectDetails)
{
projectdetails projectData = new projectdetails();
projectData.id = item.id;
projectData.name = item.name;
projectData.baname = "Vishal";
projectData.lastupdatedtime = item.lastupdatedtime;
projectData.customerid = item.customerid; // customerid is a UUID column; copy it from the source item
await mapper.InsertAsync<projectdetails>(projectData);
}
}
}
catch (Exception e)
{
    Console.WriteLine(e); // don't swallow the exception silently
}
}
The error clearly says that you need to use the consistency level LOCAL_QUORUM instead of LOCAL_ONE, which is used by default. The AWS documentation says that for write operations it's the only consistency level supported. You can set the consistency level by using the overload of InsertAsync that accepts CqlQueryOptions, like this (ideally, create the query options instance only once, during application initialization):
mapper.InsertAsync<projectdetails>(projectData,
new CqlQueryOptions().SetConsistencyLevel(ConsistencyLevel.LocalQuorum))
I am working on a React app with react-apollo, fetching data through GraphQL. When I inspect the response in the browser's network tab, all elements of the array are different, but what I get from console.log() in my app is an array whose elements are all the same as the first element.
I don't know how to fix this; please help.
The reason this happens is because the items in your array get "normalized" to the same values in the Apollo cache. AKA, they look the same to Apollo. This usually happens because they share the same Symbol(id).
If you print out your Apollo response object, you'll notice that each of the objects have Symbol(id) which is used by Apollo cache. Your array items probably have the same Symbol(id) which causes them to repeat. Why does this happen?
By default, the Apollo cache runs this function for normalization:
export function defaultDataIdFromObject(result: any): string | null {
if (result.__typename) {
if (result.id !== undefined) {
return `${result.__typename}:${result.id}`;
}
if (result._id !== undefined) {
return `${result.__typename}:${result._id}`;
}
}
return null;
}
Your array item properties cause multiple items to return the same data id. In my case, multiple items had _id = null, which caused all of these items to be repeated. When this function returns null, the docs say:
InMemoryCache will fall back to the path to the object in the query,
such as ROOT_QUERY.allPeople.0 for the first record returned on the
allPeople root query.
This is the behavior we actually want when our array items don't work well with defaultDataIdFromObject.
Therefore, the solution is to manually configure these unique identifiers with the dataIdFromObject option passed to the InMemoryCache constructor within your ApolloClient. The following worked for me, as all my objects use _id and have __typename.
const client = new ApolloClient({
link: authLink.concat(httpLink),
cache: new InMemoryCache({
dataIdFromObject: o => (o._id ? `${o.__typename}:${o._id}`: null),
})
});
Put this in your App.js:
cache: new InMemoryCache({
dataIdFromObject: o => o.id ? `${o.__typename}-${o.id}` : `${o.__typename}-${o.cursor}`,
})
I believe the approach in the other two answers should be avoided in favor of the following approach.
It is actually quite simple. To understand how it works, simply log obj as follows:
dataIdFromObject: (obj) => {
  let id = defaultDataIdFromObject(obj);
  console.log('defaultDataIdFromObject OBJ ID', obj, id);
  return id; // remember to return the id
}
You will see in your logs that id is null if you have this problem.
Pay attention to the logged obj; it is printed for every object returned.
These are the objects from which Apollo tries to get a unique id. You have to tell Apollo which field in your objects is unique for each object in the returned array of items - the same way you pass a unique value for key in React when you use map or other iterations to render DOM elements.
From the Apollo docs:
By default, InMemoryCache will attempt to use the commonly found
primary keys of id and _id for the unique identifier if they exist
along with __typename on an object.
So look at the logged obj used by defaultDataIdFromObject - if you don't see id or _id, you should provide the field in your object that is unique for each object.
I changed the example from the Apollo docs to cover three cases where you may have provided incorrect identifiers - for when you have more than one GraphQL type:
dataIdFromObject: (obj) => {
let id = defaultDataIdFromObject(obj);
console.log('defaultDataIdFromObject OBJ ID', obj, id);
if (!id) {
const { __typename: typename } = obj;
switch (typename) {
case 'Blog': {
// if you are using other than 'id' and '_id' - 'blogId' in this case
const undef = `${typename}:${obj.id}`;
const defined = `${typename}:${obj.blogId}`;
console.log('in Blogs -', undef, defined);
      return `${typename}:${obj.blogId}`; // return 'blogId' as it is a unique identifier.
      // Using any other identifier will lead to the problem described above.
}
case 'Post': {
// if you are using hash key and sort key then hash key is not unique.
// If you do query in DB it will always be the same.
// If you do scan in DB quite often it will be the same value.
// So use both hash key and sort key instead to avoid the problem.
// Using both ensures ID used by Apollo is always unique.
// If for post you are using hashKey of blogID and sortKey of postId
const notUniq = `${typename}:${obj.blogId}`;
const notUniq2 = `${typename}:${obj.postId}`;
const uniq = `${typename}:${obj.blogId}${obj.postId}`;
console.log('in Post -', notUniq, notUniq2, uniq);
return `${typename}:${obj.blogId}${obj.postId}`;
}
case 'Comment': {
// lets assume 'comment's identifier is 'id'
// but you dont use it in your app and do not fetch from GraphQl, that is
// you omitted 'id' in your GraphQL query definition.
const undefnd = `${typename}:${obj.id}`;
console.log('in Comment -', undefnd);
// log result - null
// to fix it simply add 'id' in your GraphQL definition.
return `${typename}:${obj.id}`;
}
    default: {
      // reaching default is not good - define this typename in a separate case
      console.log('falling to default', typename);
      return id;
    }
    }
  }
  return id; // id was already non-null, so the default is fine
}
I hope you now see that the approach in the other two answers is risky.
YOU ALWAYS HAVE A UNIQUE IDENTIFIER. SIMPLY HELP APOLLO BY LETTING IT KNOW WHICH FIELD IN YOUR OBJECT IT IS. If it is not fetched, add it to your query definition.
An alternative to the accepted answer: instead of dataIdFromObject, which applies to everything in the query, I was able to provide a keyFields function per type that required it.
const client = new ApolloClient({
cache: new InMemoryCache({
typePolicies: {
ItemType: {
keyFields: (obj) =>
obj.id + "-" + obj.language.id,
},
},
}),
});
In the above example, ItemType can be whichever type is specified in your schema. I happened to be joining a non-unique ID with a language to make a unique key, but you can do it however you wish.
Can a Date.now-style function be used in either map or reduce functions? Can it be used anywhere at all?
More specifically, the view must not cache the Date.now value.
Here is what I tested; it only worked for the first run after a change to any view function:
function (doc){
var n = new Date();
if(doc.TimeStamp > n.getTime() - 30000){
emit(doc._id, doc);
}
}
The view rows will be refreshed only when the particular doc gets updated. But you can request the view for that result: emit doc.TimeStamp as the key and request the view with ?startkey=timestamp, where timestamp is the value of now.getTime() - 30000 computed at request time.
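A sketch of that approach (the design doc and view names below are made up; it assumes doc.TimeStamp is a numeric epoch-milliseconds value):
// Map function: emit the timestamp as the key. The index never depends
// on "now", so it doesn't go stale.
function (doc) {
  if (doc.TimeStamp) {
    emit(doc.TimeStamp, null);
  }
}
The 30-second window is then chosen by the caller at request time, e.g.:
GET /databasename/_design/app/_view/byTime?startkey=<now - 30000>&include_docs=true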
Yes, var now = new Date() should work. Your condition must be evaluating to false. You can test it with this view:
function (doc) {
var now = new Date()
var timestamp = now.getTime()
emit(timestamp,null)
}
It will respond with something like:
{
"total_rows":1,
"offset":0,
"rows" :[{
"id":"ecd99521eeda9a79320dd8a6954ecc2c",
"key":1429904419591, // timestamp as key
"value":null
}]
}
Make sure that doc.TimeStamp is a number (you may have to execute parseInt(doc.TimeStamp)) and greater than timestamp - 30000.
Two notes about your line of code emit(doc._id, doc);:
Emitting doc._id as the key suggests you may not need the view at all: simply request the doc with GET /databasename/:id. Including doc._id in composite keys or in the value of the view row is also mostly unnecessary, because it is included automatically in every row as an additional property. One valid reason would be when you want to sort the view by doc ids.
Emitting the whole doc as the value is not recommended for performance reasons. Simply add ?include_docs=true when you request the view, and every row will have an additional doc property containing the whole doc.
I've been looking through the AWS DynamoDB documentation and the Amazon DynamoDB interface, and it seems like there's no way to remove a column from a table, outside of deleting the entire table with its contents and starting over. Is that true?
If so, why would Amazon not support this?
Try removing all data from that column; it will automatically remove that column.
Using the document client with JavaScript, we can do this:
const paramsUpdate = {
TableName: tableName,
Key: { HashKey: 'hashKey' },
UpdateExpression: 'remove #c ',
ExpressionAttributeNames: { '#c': 'columnName' }
};
documentClient.update(paramsUpdate, (errUpdate) => {
if (errUpdate) log.error(errUpdate);
});
Here we set UpdateExpression with a REMOVE clause.
There is a REMOVE action in the DynamoDB API.
DynamoDB does not have a schema definition, so there is no such thing as a "column". It also means there is no way to delete all attributes with the same name without iterating over every record.
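If you do need to drop the attribute everywhere, here is a rough sketch of that iteration with the DocumentClient (table, key, and attribute names are hypothetical, reusing the ones above; pagination kept minimal):
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();
async function removeAttributeEverywhere() {
  let lastKey; // for scan pagination
  do {
    const page = await documentClient.scan({
      TableName: 'tableName',
      ProjectionExpression: 'HashKey', // only the key is needed for the update
      ExclusiveStartKey: lastKey
    }).promise();
    for (const item of page.Items) {
      await documentClient.update({
        TableName: 'tableName',
        Key: { HashKey: item.HashKey },
        UpdateExpression: 'remove #c',
        ExpressionAttributeNames: { '#c': 'columnName' }
      }).promise();
    }
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
}
removeAttributeEverywhere().catch(console.error);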
A solution I recommend is to keep these attributes and make your code refer to the same data using a fresh attribute name.
For example, attribute content could become content_v2. It might not look so clean, but it's cheap, quick, and your old data remains backed up.
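A tiny sketch of that versioning idea (attribute names are hypothetical): writes use the new name, and reads fall back to the legacy one:
// Reads prefer content_v2; older items still resolve via the legacy content attribute.
function getContent(item) {
  return item.content_v2 !== undefined ? item.content_v2 : item.content;
}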
Setting all instances of the column value to null clears the column.
In C#, this method does the trick using the persistence framework (context is a DynamoDBContext instance):
static void RemoveColumn()
{
    var myItems = context.ScanAsync<MyObjectType>(new List<ScanCondition>()).GetRemainingAsync().Result; // empty condition list scans everything
// Foreach item, update
myItems.ForEach(myObject =>
{
myObject.UnwantedColumn = null;
context.Save(myObject);
});
}
Just remove all the data for that one column. On my end, it automatically refreshed; you might have to refresh the page.