When I update a record in DynamoDB like this
UpdateExpression: "set #audioField = :payload",
ExpressionAttributeValues: {
  ":payload": something,
},
where
var something = { "test.com1": {} }
DynamoDB puts a random character in the record, like this:
{ "test.com1" : { "M" : { } } }
What's up with this? And how do I prevent this?
This is not a random character; this is how DynamoDB stores and represents types.
DynamoDB embeds type information in each value that it stores. See the following for the list of types: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_AttributeValue.html
Based on the documentation linked above, the "M" you are seeing describes the contents of the "test.com1" attribute, which is a map (M for map).
The reason you are not seeing these on your other attributes is probably that the SDK automatically translates this DynamoDB structure into native types for top-level attributes, but not for nested attributes.
What language/SDK are you using? Many SDKs have helpers that you can pass your results through to parse these embedded types and convert them into native types that are easier to work with.
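For example, if you happen to be using the AWS SDK for JavaScript v3 (an assumption; the question doesn't say which SDK is in use), its unmarshall helper converts the typed representation back into plain values:

// Assumes AWS SDK for JavaScript v3; the package and function are real,
// the variable names are only illustrative.
const { unmarshall } = require("@aws-sdk/util-dynamodb");

const stored = { "test.com1": { M: {} } };  // the typed form you see in the table
const plain = unmarshall(stored);           // { "test.com1": {} }
console.log(plain);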
Is it possible to apply a filter based on values inside a dynamodb database?
Let's say the database contains an object info within a table:
info: {
toDo: x,
done: y,
}
Using ExpressionAttributeValues, is it possible to check whether info.toDo = info.done and apply a filter on it without knowing the current values of info.toDo and info.done?
At the moment I tried using ExpressionAttributeNames so it contains:
'#toDo': 'info.toDo', '#done': 'info.done'
and the filter FilterExpression is
#toDo = #done
but I'm retrieving no items doing a query with this filter.
Thanks a lot!
DynamoDB is not designed to perform arbitrary queries the way a relational database is. It is designed for fast lookups based on keys.
Therefore, if you can add an index that lets you access the records you are looking for, you can use it for this new access pattern. For example, you could add an index that uses info.toDo as the partition key and info.done as the sort key, and then query the index with a key condition of PK = x AND SK = x, assuming that the list of possible values is limited and known.
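A minimal sketch of that pattern with the AWS SDK for JavaScript v3 document client is shown below. Note that index key attributes cannot be nested paths, so this assumes the toDo and done values have also been copied into top-level attributes that back the index; the table and index names are made up for the example:

// Hypothetical table/index names; assumes a GSI keyed on top-level copies
// of the toDo (partition key) and done (sort key) values.
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, QueryCommand } = require("@aws-sdk/lib-dynamodb");

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function itemsWhereToDoEqualsDone(value) {
  const out = await client.send(new QueryCommand({
    TableName: "MyTable",             // hypothetical
    IndexName: "toDo-done-index",     // hypothetical
    KeyConditionExpression: "#t = :v AND #d = :v",
    ExpressionAttributeNames: { "#t": "toDo", "#d": "done" },
    ExpressionAttributeValues: { ":v": value },
  }));
  return out.Items;
}

This still requires iterating over the known set of possible values, as noted above.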
"order (S)","method (NULL)","time (L)"
"/1553695740/Bar","true","[ { ""N"" : ""1556593200"" }, { ""N"" : ""1556859600"" }]"
"/1556439461/adasd","true","[ { ""N"" : ""1556593200"" }, { ""N"" : ""1556679600"" }]"
"/1556516482/Foobar","cheque","[ { ""N"" : ""1556766000"" }]"
How do I scan (or query, for that matter) on empty "method" attribute values? https://s.natalian.org/2019-04-29/null.mp4
Unfortunately, the DynamoDB console offers a simple GUI and assumes all the values of the attribute you are filtering on have the same type. When you select filters on columns of type "NULL", it only allows exists or not exists, which makes sense since a column containing only NULL datatypes can either exist or not exist.
What you have here is a column that contains multiple datatypes (since NULL is a different datatype than String). There are many ways to filter what you want here, but I don't believe they are available to you in the console. Here is an example of how you could filter the dataset via the AWS CLI (note: since your column is named with a reserved word, method, you will need to alias it with an expression attribute name):
Using Filter expressions
$ aws dynamodb scan --table-name plocal --filter-expression '#M = :null' --expression-attribute-values '{":null":{"NULL":true}}' --expression-attribute-names '{"#M":"method"}'
An option to consider to avoid this would be to update your logic to write some sort of filler string value instead of a null or empty string when writing your data to the database (e.g. "None" or "N/A"). Then you could operate solely on Strings and search on that value instead.
DynamoDB currently does not allow String values that are the empty string and will give you errors if you try to put such items directly. To make this "easier", many of the SDKs provide mappers/converters from objects to DynamoDB items, and this usually involves converting empty strings to Null types as a way of working around the rule of no empty strings.
If you need to differentiate between null and "", you will need to write some custom logic to marshall/unmarshall empty strings to a unique string value (e.g. "__EMPTY_STRING") when they are stored in DynamoDB.
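As a rough sketch of what that custom logic could look like in JavaScript (the sentinel value and helper names below are purely illustrative, and only top-level string attributes are handled):

// Hypothetical helpers: swap empty strings for a sentinel before writing,
// and swap them back after reading. Only handles top-level string values.
const EMPTY = "__EMPTY_STRING";

function toDynamo(item) {
  return Object.fromEntries(
    Object.entries(item).map(([k, v]) => [k, v === "" ? EMPTY : v])
  );
}

function fromDynamo(item) {
  return Object.fromEntries(
    Object.entries(item).map(([k, v]) => [k, v === EMPTY ? "" : v])
  );
}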
I'm pretty sure that there is no way to filter using the console. But I'm guessing that what you really want is to use such a filter in code.
DynamoDB has a very peculiar way of storing NULLs. There is a "NULL" data type which basically represents the concept of null values but it really is sort of like a boolean.
If you have the opportunity to change the data type of that attribute to be a string, or numeric, I strongly recommend doing so. Then you'll be able to create much more powerful queries with filter conditions to match what you want.
If the data already exists and you don't have a significant number of items that need to be updated, I recommend creating a new attribute to represent your data and backfilling.
Just following up on the comments: if you prefer using the mapper, you can customize how it marshals certain attributes that may be null/empty. Have a look at the Go SDK encoder implementation for some examples: https://git.codingcafe.org/Mirrors/aws/aws-sdk-go/blob/9b5aaeba7a51edcf3f87bda525a08b04b90d2ef8/service/dynamodb/dynamodbattribute/encode.go
I was able to do this inside a FilterExpression:
attribute_type(MyProperty, :nullType) - where :nullType is a string with the value NULL. This one finds null entries.
attribute_type(MyProperty, :stringType) - where :stringType is a string with the value S. This one finds non-null entries.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.OperatorsAndFunctions.html#Expressions.OperatorsAndFunctions.Syntax
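For example, a scan with that filter through the AWS SDK for JavaScript v3 document client might look like the sketch below; the table and attribute names ("plocal", "method") come from the question above, the rest is assumed:

// Find the items whose "method" attribute is stored as the NULL type.
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, ScanCommand } = require("@aws-sdk/lib-dynamodb");

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function nullMethodItems() {
  const out = await client.send(new ScanCommand({
    TableName: "plocal",
    FilterExpression: "attribute_type(#m, :nullType)",
    ExpressionAttributeNames: { "#m": "method" },   // "method" is a reserved word
    ExpressionAttributeValues: { ":nullType": "NULL" },
  }));
  return out.Items;   // pass "S" instead of "NULL" to find the string entries
}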
I need to create a new table on AWS DynamoDB that will have a structure like the following:
{
"email" : String (key),
... : ...,
"someStuff" : SomeType,
... : ...,
"listOfIDs" : Array<String>
}
This table contains users' data and a list of strings that I'll often query (see listOfIDs).
Since I don't want to scan the table every time to get the user linked to a specific ID (scans are slow), and I cannot create an index on it since it's an Array and not a "simple" type, how could I improve the structure of my table? Should I use a different table where I have all my IDs and the users linked to them in a "flat" structure? Is there any other way?
Thank you all!
Perhaps another table that looks like:
ID string / hash key,
Email string / range key,
Any other attributes you may want to access
The unique combination of ID and email will allow you to search on the "List of IDs". You may want to include other attributes within this table to save you from needing to perform another query.
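A sketch of querying that kind of lookup table with the AWS SDK for JavaScript v3 document client; the table name and exact attribute names here are assumptions:

// Hypothetical "UserIdLookup" table with ID as the hash key and Email as the range key.
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, QueryCommand } = require("@aws-sdk/lib-dynamodb");

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function usersForId(id) {
  const out = await client.send(new QueryCommand({
    TableName: "UserIdLookup",                  // hypothetical table name
    KeyConditionExpression: "#id = :id",
    ExpressionAttributeNames: { "#id": "ID" },
    ExpressionAttributeValues: { ":id": id },
  }));
  return out.Items;                             // one item per (ID, Email) pair
}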
Should I use a different table where I have all my IDs and the users linked to them in a "flat" structure?
I think this is going to be your best bet if you want to leverage DynamoDB's parallelism for query performance.
Another option might be using a CONTAINS expression in a query if your listOfIDs is stored as a set, but I can't imagine that will scale performance-wise as your table grows.
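For reference, that contains() check can only appear in a filter expression, so DynamoDB still reads the items before filtering them, which is why it scales poorly. A rough sketch, with the table name assumed:

// Hypothetical table name; contains() works on lists, sets and strings.
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, ScanCommand } = require("@aws-sdk/lib-dynamodb");

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function usersContainingId(id) {
  const out = await client.send(new ScanCommand({
    TableName: "Users",                            // hypothetical
    FilterExpression: "contains(listOfIDs, :id)",
    ExpressionAttributeValues: { ":id": id },
  }));
  return out.Items;
}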
I'm manipulating documents containing a dictionnary of arbitrary metadata, and I would like to search documents based on metadata.
So far my approach is to build an index from the metadata. For each document, I emit each (key, value) pair of the metadata dictionary into the view.
var metaIndexDoc = {
  _id: '_design/meta_index',
  views: {
    by_meta: {
      map: function(doc) {
        if (doc.meta) {
          for (var k in doc.meta) {
            var v = doc.meta[k];
            emit(k, v);
          }
        }
      }.toString()
    }
  }
};
That way, I can query for all the docs that have a metadata date, and the result will be sorted based on the value associated with date. So far, so good.
Now, I would like to make queries based on multiple criteria: docs that have date AND important in their metadata. From what I've gathered so far, it looks like the way I built my index won't allow that. I could create a new index with ['date', 'important'] as keys, but the number of indexes would grow exponentially as I add new metadata keys.
I also read about using a new kind of document to store the fact that document X has metadata (key,value), which is definitely how you would do it in a relational database, but I would rather have self-contained documents if it is possible.
Right now, I'm thinking about keeping my metaIndex, making one query for date, one for important, and then using underscore.intersection to compute the intersection of both lists.
Am I missing something?
EDIT: after discussion with #alexis, I'm reconsidering the option to create custom indexes when I need them and to let PouchDB manage them. It is still true that with a growing number of metadata fields, the number of possible combinations will grow exponentially, but as long as the indexes are created only when they are needed, I guess I'm good to go...
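For reference, a minimal sketch of the two-query-plus-intersection approach mentioned above, assuming the metaIndexDoc design document shown earlier has been saved to a PouchDB database; the database name and variable names are only illustrative:

// Query the by_meta view once per metadata key, then intersect the doc ids.
var PouchDB = require('pouchdb');
var _ = require('underscore');
var db = new PouchDB('documents');   // hypothetical database name

function docIdsWithKey(key) {
  return db.query('meta_index/by_meta', { key: key }).then(function (res) {
    return res.rows.map(function (row) { return row.id; });
  });
}

Promise.all([docIdsWithKey('date'), docIdsWithKey('important')])
  .then(function (results) {
    // Docs that have both a "date" and an "important" metadata key.
    var both = _.intersection(results[0], results[1]);
    console.log(both);
  });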
I have a view like
function (doc, meta) {
  if (doc.Tenant) {
    emit([doc.Tenant.Id, doc.Tenant.User.Name], doc);
  }
}
In this view I want all the values belonging to Tenant.Id == 1 where User.Name contains "a".
I can do this search in my C# code by collecting all the Tenant data belonging to a particular Tenant Id.
But I have millions of records for each Tenant, so I need to do this check on the server side itself.
Is it possible to do this search?
I'm guessing that you want to be able to change which letter you are searching for in the string; unfortunately, Couchbase isn't going to be the best fit for this type of query.
If it will always be the letter 'a' that you want to search for, then you could do a map like this and then query on the Id.
function (doc, meta) {
  if (doc.Tenant) {
    var name = doc.Tenant.User.Name.toLowerCase();
    if (name.indexOf("a") > -1) {
      emit(doc.Tenant.Id, null);
    }
  }
}
If, however, you want to be able to dynamically change which letter or even substring you search for in the name, then you should consider something like Elasticsearch (great for text searching). Couchbase has an Elasticsearch transport plugin that will automatically replicate to your Elasticsearch node(s).
Here is a presentation on ES and Couchbase
http://www.slideshare.net/Couchbase/using-elasticsearch-and-couchbase-together-to-build-large-scale-applications
The documentation for installation and getting started with the ES plugin
http://docs.couchbase.com/couchbase-elastic-search/
And a cool tutorial detailing how to add a GUI on top of your ES data for easy filtering.
http://blog.jeroenreijn.com/2013/07/visitor-analysis-with-couchbase-elasticsearch.html