I have a use case where an API service retrieves Bigtable rows by row key. The service reads some columns holding boolean values, and the front end does boolean comparisons on them. Since Bigtable doesn't support a boolean data type, the front-end comparison doesn't work as expected. I'm using the code below to store the boolean value in Bigtable.
Boolean value = Boolean.parseBoolean(newImageMap.get(Key).toString());
SetCell setCell = SetCell.newBuilder()
    .setFamilyName(Utility.COLUMN_FAMILY)
    .setColumnQualifier(Utility.str_to_bb(Key, StandardCharsets.UTF_8))
    .setTimestampMicros(yearAgoMillis)
    .setValue(ByteString.copyFrom(Bytes.toBytes(value)))
    // .setValue(Utility.str_to_bb(String.valueOf(value), StandardCharsets.UTF_8))
    .build();
But the boolean value is stored as a string in Bigtable, as you can see in the screenshot below. Let me know if there is a way to handle this kind of use case.
[screenshot: boolean-value-ss]
So there are two parts to this: working with the existing boolean string values you have, and converting the current setup into something easier to work with.
To work with the current setup, you can simply compare your results against the strings "true" and "false". Based on your discussion with Pievis in the comments, though, this doesn't seem to be what you want.
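For example, on the read side (a minimal sketch; it assumes the cell's value arrives as a protobuf ByteString, as in the java-bigtable client):

import com.google.protobuf.ByteString;

// With the current encoding the cell holds the text "true" or "false",
// so compare against the stored string directly.
static boolean parseStoredBoolean(ByteString cellValue) {
    return "true".equals(cellValue.toStringUtf8());
}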
So to convert the current setup:
You should use 0 as false and 1 as true (or you can leave the cell empty for false), which is more space-efficient. Then, to convert all of your existing values, you would do a full table scan filtering on the value "true" and update those cells to 1, and another scan filtering on the value "false" where you delete those cells. A sketch of the new write path follows.
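For illustration, here is the question's write code with the single-byte encoding swapped in (a sketch reusing the question's Utility helpers):

boolean value = Boolean.parseBoolean(newImageMap.get(Key).toString());
SetCell setCell = SetCell.newBuilder()
    .setFamilyName(Utility.COLUMN_FAMILY)
    .setColumnQualifier(Utility.str_to_bb(Key, StandardCharsets.UTF_8))
    .setTimestampMicros(yearAgoMillis)
    // one byte: 1 = true, 0 = false (or skip the cell entirely for false)
    .setValue(ByteString.copyFrom(new byte[] { (byte) (value ? 1 : 0) }))
    .build();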
Storing a boolean object in Bigtable is not possible, as Bigtable doesn't support a boolean data type. Instead, I would suggest changing your driver code to cast the value to a suitable type and send that to the front end, which would be more suitable. For that, you would pre-identify the boolean fields at the source and store them as configuration that the driver code takes as input.
Let me know if this solution works.
I have a web application that uses DynamoDB to store large JSON objects and performs simple CRUD operations on them via a web API. I would like to add a new table that acts as a categorization of these values. The user should be able to pick, from a selection box, which category an object belongs to. If a desirable category does not exist, the user should be able to create a new category by specifying a name, which will then be available to other objects in the future.
It is critical to the application that every one of these categories is given an integer ID, auto-incremented starting at 1. These auto-generated numbers will become reproducible serial numbers for back-end reports that will not use the user-visible text name.
So I would like a simple API, callable from the web frontend, that allows me to:
A) GET /category : produces { int : string, ... } of all categories mapped to an ID
B) PUSH /category : accepts a string and stores it under the next integer ID
Here are some ideas for how to handle this kind of project.
Store it in DynamoDB with integer indexes. This has some benefits, but it leaves a lot to be desired. Firstly, there's no auto-incrementing ID in DynamoDB, but I could certainly read the current state of the table, create a new ID, and store the result. This might have issues with consistency and race conditions, though there's probably a way to achieve it safely. It might, however, be a big anti-pattern to use DynamoDB this way.
Store it in DynamoDB as one object in a table with some random index, i.e. just store the whole mapping as a single JSON object. This really abandons the notion of tables in DynamoDB and uses it as a simple file. It might also run into race conditions.
Use AWS ElastiCache to get a Redis key-value store. This might be "the right" decision, but the downside is that ElastiCache is an always-on DB offering where you pay per hour. For a low-traffic web site like mine, I'd be paying a minimum of about $12/mo, I think, and I would really like this to be pay-per-access/update given the low volume. I'm not sure Redis has a built-in auto-increment feature in exactly the form I'd need, but it's pretty trivial to write a transaction that gets the length of the table, adds one, and stores a new value. Race conditions are easily avoided with this solution.
Use a SQL database like AWS Aurora or MySQL. This has the same upsides as Redis, but it's even more overkill, costs a lot more, and is still always on.
Run my own in-memory web service, MongoDB, etc. Here, too, you're paying for containers running constantly. Writing my own thing is obviously silly, and while I'm sure there are services that match this need perfectly, they'd all require an always-on container.
Is there a good way to store a simple list or integer mapping like this that doesn't incur a constant monthly cost? Is there a better way to do this with DynamoDB?
Store the maxCounterValue as an item in DynamoDB.
For the PUSH /category, perform the following:
Get the current maxCounterValue.
TransactWrite:
Put the category name and id into a new item with id = maxCounterValue + 1.
Update maxCounterValue to maxCounterValue + 1, adding a ConditionExpression to check that maxCounterValue = :valueFromGetOperation.
If the TransactWrite fails, start again at step 1 and retry up to X more times (a sketch follows below).
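Here is a minimal sketch of that pattern in Python with boto3. The layout is an assumption for illustration: a single 'categories' table with a numeric key 'id', where the item with id 0 has been seeded to hold maxCounterValue.

import boto3
from botocore.exceptions import ClientError

client = boto3.client('dynamodb')

def push_category(name, retries=3):
    for _ in range(retries):
        # Step 1: read the current counter (item id=0 holds it; assumes it
        # was seeded as {'id': 0, 'maxCounterValue': 0} with the table).
        current = client.get_item(
            TableName='categories',
            Key={'id': {'N': '0'}},
        )['Item']['maxCounterValue']['N']
        next_id = str(int(current) + 1)
        try:
            # Step 2: atomically create the category and bump the counter;
            # the ConditionExpression fails if another writer got there first.
            client.transact_write_items(TransactItems=[
                {'Put': {
                    'TableName': 'categories',
                    'Item': {'id': {'N': next_id}, 'name': {'S': name}},
                }},
                {'Update': {
                    'TableName': 'categories',
                    'Key': {'id': {'N': '0'}},
                    'UpdateExpression': 'SET maxCounterValue = :next',
                    'ConditionExpression': 'maxCounterValue = :current',
                    'ExpressionAttributeValues': {
                        ':next': {'N': next_id},
                        ':current': {'N': current},
                    },
                }},
            ])
            return int(next_id)
        except ClientError:
            continue  # lost the race; re-read the counter and retry
    raise RuntimeError('could not allocate a category id')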
"order (S)","method (NULL)","time (L)"
"/1553695740/Bar","true","[ { ""N"" : ""1556593200"" }, { ""N"" : ""1556859600"" }]"
"/1556439461/adasd","true","[ { ""N"" : ""1556593200"" }, { ""N"" : ""1556679600"" }]"
"/1556516482/Foobar","cheque","[ { ""N"" : ""1556766000"" }]"
How do I scan (or query, for that matter) on empty "method" attribute values? https://s.natalian.org/2019-04-29/null.mp4
Unfortunately, the DynamoDB console offers only a simple GUI and assumes the operations you want to perform all target the same type. When you select filters on columns of type "NULL", it only lets you test for exists or not exists. This makes sense, since a column containing only NULL data types can either exist or not exist.
What you have here is a column that contains multiple data types (NULL is a different data type from String). There are many ways to filter what you want, but I don't believe they're available to you in the console. Here is an example of how you could filter the dataset via the AWS CLI (note: since your column is named with the reserved word method, you need to alias it with an expression attribute name):
Using Filter expressions
$ aws dynamodb scan --table-name plocal --filter-expression '#M = :null' --expression-attribute-values '{":null":{"NULL":true}}' --expression-attribute-names '{"#M":"method"}'
An option to consider to avoid this would be to update your logic to write some sort of filler string value instead of a null or empty string when writing your data to the database (e.g. "None" or "N/A"). Then you could operate solely on Strings and search on that value instead.
DynamoDB currently does not allow String values that are an empty string and will give you errors if you try to put such items directly. To make this "easier", many of the SDKs provide mappers/converters between objects and DynamoDB items, and this usually involves converting empty strings to Null types as a way of working around the no-empty-strings rule.
If you need to differentiate between null and "", you will need to write some custom logic to marshall/unmarshall empty strings to a unique string value (e.g. "__EMPTY_STRING") when they are stored in DynamoDB.
I'm pretty sure that there is no way to filter using the console. But I'm guessing that what you really want is to use such a filter in code.
DynamoDB has a very peculiar way of storing NULLs. There is a "NULL" data type which basically represents the concept of a null value, but it really behaves more like a boolean.
If you have the opportunity to change the data type of that attribute to be a string, or numeric, I strongly recommend doing so. Then you'll be able to create much more powerful queries with filter conditions to match what you want.
If the data already exists and you don't have a significant number of items that need to be updated, I recommend creating a new attribute to represent your data and backfilling.
Just following up on the comments. If you prefer using the mapper, you can customize how it marshals certain attributes that may be null/empty. Have a look at the Go SDK encoder implementation for some examples: https://git.codingcafe.org/Mirrors/aws/aws-sdk-go/blob/9b5aaeba7a51edcf3f87bda525a08b04b90d2ef8/service/dynamodb/dynamodbattribute/encode.go
I was able to do this inside a FilterExpression:
attribute_type(MyProperty, :nullType) - Where :nullType is a string with value NULL. This one finds null entries.
attribute_type(MyProperty, :stringType) - Where :stringType is a string with value S. This one finds non-null entries.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.OperatorsAndFunctions.html#Expressions.OperatorsAndFunctions.Syntax
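If you'd rather run those filters from code than from the console or CLI, here is a sketch with boto3 (the table and attribute names come from the question):

import boto3

client = boto3.client('dynamodb')

# Finds items whose 'method' attribute is stored as the NULL type.
# '#M' aliases the reserved word 'method'; pass 'S' instead of 'NULL'
# to find the String-typed (non-null) entries.
response = client.scan(
    TableName='plocal',
    FilterExpression='attribute_type(#M, :t)',
    ExpressionAttributeNames={'#M': 'method'},
    ExpressionAttributeValues={':t': {'S': 'NULL'}},
)
print(response['Items'])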
I need to have a table with the structure
{
  "userId": 123,
  "Tracking": true
}
It is possible that the user does not exist before the first operation, so by default false should be set. The next request makes this value true, the third request makes it false again, and so on; similar to NOT(Tracking), i.e. writing the negation of the value.
I could do this by reading the table, negating the value in my Lambda function, and updating the table with the new attribute.
That would mean both a GET and an UPDATE request to the DB. I am looking for a way to send a negation flag instead, so that I simply write false if the user does not exist, and otherwise toggle between true and false depending on the existing boolean value.
Just wondering if there is a way to do this; it would then be a single update request to the DB. Any pointers would be helpful.
I did not find much help in the documentation: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB/DocumentClient.html#update-property
Instead of a boolean you can use an integer: increment it by 1 each time, then treat even as true and odd as false (or the reverse). A sketch follows below.
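A minimal sketch of that idea in Python with boto3 (the same single call can be issued with the DocumentClient the question links to; the table and attribute names here are made up):

import boto3

client = boto3.client('dynamodb')

# One request, no prior GET: ADD treats a missing attribute (or item) as 0,
# so the very first call yields 1.
response = client.update_item(
    TableName='users',  # hypothetical table
    Key={'userId': {'N': '123'}},
    UpdateExpression='ADD trackingCount :one',
    ExpressionAttributeValues={':one': {'N': '1'}},
    ReturnValues='UPDATED_NEW',
)

# Odd = true here, so the first request turns tracking on, matching the
# question's flow; the reverse convention works just as well.
tracking = int(response['Attributes']['trackingCount']['N']) % 2 == 1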
Can someone please help me and let me know which AWS tool I should use to perform math operations such as add/sub/mul/div/mod etc. on data inside a DynamoDB table?
Currently I am pushing my data into DynamoDB from a Raspberry Pi, using Python as my programming language. But I need to do some computations on that raw data. Initially, I want to start with small computations.
So which tool is helpful for doing computations in the cloud, such as doing some math, checking whether a number is even, or running some algorithm on an input and getting the result?
I just want to pick data from my table in DynamoDB and do these computations. I saw Redshift on Google, but it's a little expensive; would it be possible to load the data from DynamoDB into Redshift and do the math operations there, or are there better alternatives?
Will you please share any links that will help me get started?
Thank you very much.
In DynamoDB, you can only increment and decrement numeric attributes using the + (plus) and - (minus) operators in the SET action of an UpdateExpression in update_item, or using the ADD action, and only when both the existing attribute and the supplied value are numbers.
Otherwise, you should perform your desired math operations against attribute values before updating them in DynamoDB.
Note on ADD:
If you use ADD to increment or decrement a number value for an item that doesn't exist before the update, DynamoDB uses 0 as the initial value. Similarly, if you use ADD for an existing item to increment or decrement an attribute value that doesn't exist before the update, DynamoDB uses 0 as the initial value.
For example:
import boto3

client = boto3.client('dynamodb')

# Decrements the item's Count attribute by 1 and returns the updated item.
# ('#C' aliases Count via ExpressionAttributeNames.)
response = client.update_item(
    ExpressionAttributeNames={
        '#C': 'Count'
    },
    ExpressionAttributeValues={
        ':val': {
            'N': '1'
        }
    },
    Key={
        'ItemId': {
            'S': 'BC3AB494-EDD8-4F47-B80F-32ACA92D8C5C'
        }
    },
    ReturnValues='ALL_NEW',
    TableName='MyTable',
    UpdateExpression='SET #C = #C - :val'
)
print(response)
See also: Incrementing and Decrementing Numeric Attributes.
I have a sample database in CouchDB with information on a number of aircraft, and a view which shows the manufacturer as the key and the model as the value.
The map function is
function(doc) {
  emit(doc["Manufacturer"], doc._id);
}
and the reduce function is
function(keys, values, rereduce) {
  return values.length;
}
This is pretty simple, and I indeed get the correct result when I view it in Futon, where I have 26 Boeing aircraft:
"BOEING" 26
But if I use a REST client to query the view using
http://localhost:6060/aircrafts/_design/basic/_view/VendorProducts?key="BOEING"
I get
{"rows":[
{"key":null,"value":2}
]}
I have tested different clients (including a web browser, REST client extensions, and curl), and they all give me the value 2, while queries with other keys work correctly.
Is there something wrong with the MapReduce function or my query?
The issue could be due to grouping.
Using group=true (which is Futon's default), you get a separate reduce value for each unique key in the map - that is, all values which share the same key are grouped together and reduced to a single value.
Were you passing group=true as a query parameter when querying with curl, etc.? Since it is passed by default by Futon, you saw results like
BOEING : 26
whereas without group=true only the final reduced value was returned.
So try this query
http://localhost:6060/aircrafts/_design/basic/_view/VendorProducts?key="BOEING"&group=true
You seem to be falling into the re-reduce trap. CouchDB, strictly speaking, uses a map-reduce-rereduce process.
Map: reformats your data into the output format.
Reduce: aggregates the data of several (but not necessarily all) entries with the same key; this works correctly in your case.
Re-reduce: does the same as reduce, but on previously reduced data.
Because you change the shape of the value in the reduce stage (from a list of ids to a count), the re-reduce call counts the already-reduced values instead of summing them.
Solutions:
You can just set the value in the map to 1 and reduce with a sum of the values.
You can check for rereduce == true and in that case return a sum of the values, which will be the integer counts returned by the initial reduce passes.
Both options are sketched below.
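For illustration, both options as CouchDB view functions (a sketch; sum() is provided by CouchDB's JavaScript view server, and the built-in _sum/_count reduces do the same job):

// Option 1: emit 1 per document and sum. A sum of sums is still the
// total, so re-reduce needs no special casing.
function(doc) {
  emit(doc["Manufacturer"], 1);
}
function(keys, values, rereduce) {
  return sum(values);
}

// Option 2: keep the original map (emitting doc._id) and make the
// reduce rereduce-aware.
function(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);   // values are counts from earlier reduce passes
  }
  return values.length;   // values are the emitted doc._ids
}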