DynamoDB ProjectionExpression Set Contains

I have a table whose items each contain a string set along with other attributes. Can I query the table and get back a flag indicating whether the set contains a particular entry? I don't want to filter by the set, just find out whether it contains an entry without pulling back the full set.
For example, say I have the following items:
[
{
things: { "a", "b" },
name: "Coffee"
},
{
things: { "b" },
name: "Tea"
}
]
I would like to query this table with something like Projection: "flag=(things CONTAINS 'a'), name" and get back:
[
{
flag: true,
name: "Coffee"
},
{
flag: false,
name: "Tea"
}
]
Is this possible?

Related

How to aggregate data from another cube in cubejs?

I have the following cubes (I'm only showing the data necessary to reproduce the problem):
SentMessages:
cube(`SentMessages`, {
sql: `Select * from messages_sent`,
dimensions: {
campaignId: {
sql: `campaign_id`,
type: `number`
},
phone: {
sql: `phone_number`,
type: `number`
}
}
});
Campaigns:
cube(`Campaign`, {
sql: `SELECT * FROM campaign`,
joins: {
SentMessages: {
sql: `${Campaign}.id = ${SentMessages}.campaign_id`,
relationship: `hasMany`
}
},
measures: {
messageSentCount: {
sql: `${SentMessages}.phone`,
type: `count`
}
},
dimensions: {
name: {
sql: `name`,
type: `string`
},
}
});
The query being sent looks like this:
"query": {
"dimensions": ["Campaign.name"],
"timeDimensions": [
{
"dimension": "Campaign.createdOn",
"granularity": "day"
}
],
"measures": [
"Campaign.messageSentCount"
],
"filters": []
},
"authInfo": {
"iat": 1578961890,
"exp": 1579048290
},
"requestId": "da7bf907-90de-4ba0-80f8-1a802dd442f6"
For some reason this is resulting in the following error:
Error: 'Campaign.messageSentCount' references cubes that lead to row multiplication. Please rewrite it using sub query.
I've searched quite a bit for this error and can't find anything. Can someone please help or provide some insight into the problem? It would be really helpful if the framework could show the erroneous SQL it generated, just for troubleshooting purposes.

Campaign has many SentMessages, so joining the two to calculate Campaign.messageSentCount can distort the result. There is a simple check that ensures no hasMany cubes are referenced inside an aggregation function. This sanity check is required to avoid situations that lead to incorrect calculation results. For example, if ReceivedMessages is also added as a join to Campaign, then Campaign.messageSentCount will produce incorrect results whenever ReceivedMessages and SentMessages are selected simultaneously.
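The row multiplication described above can be illustrated without any database. The following is a minimal sketch in plain JavaScript, assuming hypothetical in-memory rows; the ReceivedMessages data and both helper functions are invented for illustration and are not part of Cube.js.

```javascript
// Hypothetical in-memory rows; column names mirror the cubes in the question.
const sentMessages = [
  { campaign_id: 1, phone_number: 111 },
  { campaign_id: 1, phone_number: 222 },
];
// ReceivedMessages is the second hasMany join mentioned in the answer (assumed data).
const receivedMessages = [
  { campaign_id: 1, phone_number: 111 },
  { campaign_id: 1, phone_number: 222 },
  { campaign_id: 1, phone_number: 333 },
];

// Naive approach: join both hasMany tables to the campaign, then count rows.
function countSentAfterJoin(campaignId) {
  let rows = 0;
  for (const s of sentMessages) {
    if (s.campaign_id !== campaignId) continue;
    for (const r of receivedMessages) {
      // One joined row per (sent, received) pair, so sent rows are repeated.
      if (r.campaign_id === campaignId) rows += 1;
    }
  }
  return rows;
}

// Sub-query approach: aggregate SentMessages on its own, then attach the number.
function countSentViaSubQuery(campaignId) {
  return sentMessages.filter((s) => s.campaign_id === campaignId).length;
}

const joinedCount = countSentAfterJoin(1); // 2 sent x 3 received = 6
const subQueryCount = countSentViaSubQuery(1); // 2
console.log({ joinedCount, subQueryCount });
```

The joined count is inflated by the number of received rows, which is the kind of wrong result the sanity check is meant to prevent.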
To avoid this sanity check error, replace the measure with a sub query as follows:
SentMessages:
cube(`SentMessages`, {
sql: `Select * from messages_sent`,
measures: {
count: {
type: `count`
}
},
dimensions: {
campaignId: {
sql: `campaign_id`,
type: `number`
},
phone: {
sql: `phone_number`,
type: `number`
}
}
});
Campaigns:
cube(`Campaign`, {
sql: `SELECT * FROM campaign`,
joins: {
SentMessages: {
sql: `${Campaign}.id = ${SentMessages}.campaign_id`,
relationship: `hasMany`
}
},
measures: {
totalMessageSendCount: {
sql: `${messageSentCount}`,
type: `sum`
}
},
dimensions: {
messageSentCount: {
sql: `${SentMessages.count}`,
type: `number`,
subQuery: true
},
name: {
sql: `name`,
type: `string`
},
}
});
For cases where Campaign.messageSentCount doesn't make sense as a dimension, the schema can be simplified and SentMessages.count can be used directly.

I figured part of this out on my own (at least the solution part) and thought I'd post it in case anyone else is having difficulty:
It appears that this definition is problematic (and unnecessary):
messageSentCount: {
sql: `${SentMessages}.phone`,
type: `count`
}
I believe the correct way to do this is to add a measure to the table you want the COUNT to be applied to. In this query I want a count of SentMessages.phone (as shown above), so the following should be added to the SentMessages cube.
count: {
sql: `phone`,
type: `count`
},
Then the query works simply as follows:
"query": {
"dimensions": [
"Campaign.name"
],
"timeDimensions": [
{
"dimension": "SentMessages.createdOn",
"granularity": "day"
}
],
"measures": [
"SentMessages.count"
],
"filters": []
},
"authInfo": {
"iat": 1578964732,
"exp": 1579051132
},
"requestId": "c84b4596-2ee8-48e7-8e0a-974eb284dde3"
And it works as expected. I still don't understand the row multiplication error or why this measure doesn't work when placed on the Campaign cube. I will wait to accept this answer, as I found it experimentally and am still unclear about the underlying problem.

AWS DynamoDB | Check if List of Maps contains a specific value

I'm storing user data in AWS DynamoDB.
One of the attributes is a List of Maps:
skills: [
{
name: 'foo'
},
{
name: 'bar'
}
]
How can I write a Scan that checks whether skills contains a map with name = foo?
I'm using DocumentClient.
I have tried using contains but can't get it to work with Maps nested in a List:
let params = {
TableName: 'tablename',
FilterExpression: 'contains(skills, :val)',
ExpressionAttributeValues: {
':val': 'foo'
}
}

Is it possible to atomically add several items to DynamoDB array with checking their occurrence in it (to avoid duplication)?

I have an email blacklist stored as one item in DynamoDB:
// item example
{
id: "blackList", // PrimaryKey of item
list: [ "email_1#example.com", "email_2#example.com" ]
}
It is possible to add a new email to the list and at the same time check that it is not already present (to avoid duplication) with an atomic update:
const email = "email_new#example.com";
const params = {
TableName: "myTable",
Key: {
id: "blackList"
},
AttributeUpdates: {
list: {
Action: "ADD",
Value: [email] // several emails can also be added with incorrect Expected check
},
},
Expected: {
list: {
ComparisonOperator: "NOT_CONTAINS",
Value: email
},
}
};
await docClient.update(params).promise();
The question is whether it's possible to perform the same atomic operation for several emails at once?

Use a string set if you want there to be no duplicates. If you want to see if they existed in the set before you added them, return the old item.
const email = "email_new#example.com";
const params = {
TableName: "myTable",
Key: {
id: "blackList"
},
UpdateExpression: "ADD #email :email",
ExpressionAttributeNames: {
"#email": "email"
},
ExpressionAttributeValues: {
":email": docClient.createSet(email)
}
};
await docClient.update(params).promise();
This adds email_new#example.com to the email attribute, which is a string set. If the email attribute doesn't exist on the item, it will be created.
See https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.UpdateExpressions.html#Expressions.UpdateExpressions.ADD for the documentation.
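The set semantics this answer relies on can be sketched locally. The following is a plain JavaScript simulation, not a call to DynamoDB: the function addToStringSet and its return shape are hypothetical, and it only mimics the idea that ADD on a set behaves like a set union, so several values can be added at once without creating duplicates. Comparing against the previous values stands in for the answer's suggestion to return the old item.

```javascript
// Hypothetical local simulation of ADD on a string set (set-union semantics).
// This does not talk to DynamoDB; it only illustrates the behavior.
function addToStringSet(item, attribute, values) {
  const oldValues = new Set(item[attribute] || []);
  const merged = new Set(oldValues);
  for (const v of values) merged.add(v); // duplicates are silently ignored
  return {
    newItem: { ...item, [attribute]: merged },
    // Values that were already in the set before the update.
    alreadyPresent: values.filter((v) => oldValues.has(v)),
  };
}

const blackList = {
  id: "blackList",
  email: new Set(["email_1#example.com"]),
};

const result = addToStringSet(blackList, "email", [
  "email_1#example.com",
  "email_new#example.com",
]);

console.log([...result.newItem.email]);
console.log(result.alreadyPresent);
```

Because the attribute is a set, adding a value twice is harmless, which removes the need for a NOT_CONTAINS check per email.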

couchdb - startkey endkey doesn't filter records with key arrays as expected

I have a CouchDB map-reduce view which outputs this:
{
rows: [
{
key: [
"2014-08-20",
2,
"registration"
],
value: 2
},
{
key: [
"2014-08-20",
2,
"search"
],
value: 3
},
{
key: [
"2014-08-21",
2,
"registration"
],
value: 3
},
{
key: [
"2014-08-21",
2,
"search"
],
value: 4
}
]
}
I need to query all the records dated between 2014-08-20 and 2014-08-21. At the same time, I need the integer value in the middle to be 2 and the last value to be "registration".
My curl request URL looks like this:
BASE_URL?group=true&startkey=["2014-08-20",1,"registration"]
&endkey=["2014-08-21",1,"registration"]
This filters by date but not by any of the other elements (the integer value and the string). Does anyone have an idea what's happening?
Any help would be appreciated.
UPDATE
My document structure looks something like this:
[
{
platform_version: 2,
UDID: "EWHSJeNp20sBFuzqcorkKVVi",
session: {
timestamp: "2014-08-20T00:00:00.000Z",
session_id: "kOnNIhCNQ31LlkpEPQ7XnN1D",
ip: "202.150.213.66",
location: "1.30324,103.5498"
},
events: [
{
event_type: "search",
timestamp: "2014-08-21T00:00:00.000Z",
location: "1.30354,103.5301",
attributes: {
}
}
]
},
{
platform_version: 2,
UDID: "EWHSJeNp20sBFuzqcorkKVVi",
session: {
timestamp: "2014-08-21T00:00:00.000Z",
session_id: "kOnNIhCNQ31LlkpEPQ7XnN1D",
ip: "202.150.213.66",
location: "1.30324,103.5498"
},
events: [
{
event_type: "search",
timestamp: "2014-08-21T00:00:00.000Z",
location: "1.30354,103.5301",
attributes: {
}
}
]
},
{
platform_version: 2,
UDID: "EWHSJeNp20sBFuzqcorkKVVi",
session: {
timestamp: "2014-08-20T00:00:00.000Z",
session_id: "kOnNIhCNQ31LlkpEPQ7XnN1D",
ip: "202.150.213.66",
location: "1.30324,103.5498"
},
events: [
{
event_type: "click",
timestamp: "2014-08-21T00:00:00.000Z",
location: "1.30354,103.5301",
attributes: {
}
}
]
}
]
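And the map function looks like this:
function(doc) {
// Date part of the ISO timestamp, e.g. "2014-08-20"
var date = doc.session.timestamp.split("T")[0];
var eventArray = doc.events;
// Emit one row per event, with the date as the last key element
for (var i = 0; i < eventArray.length; i++) {
emit([doc.app_version, eventArray[i].event_type, date], 1);
}
}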
It started working after I changed the order of the keys, but I still can't use a wildcard to query all the event types.

You are getting documents whose second array item is different from 1 because CouchDB compares the limits (startkey and endkey) to map keys using lexicographical order. In lexicographical order, [1,1] < [1,2] < [2,1] < [2,2].
What you need is either multidimensional queries (which CouchDB does not support), additional client-side filtering (which may increase the data transferred between CouchDB and your app), or additional server-side filtering with a list function (which increases the processing time of queries).
If your app needs range filtering only on the first element of the key array (as in your example query), you can solve the problem easily by moving that element to the last position of the array, e.g. emitting [1, "registration", "2014-08-21"] instead of ["2014-08-21", 1, "registration"].
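The effect of key order on a startkey/endkey range can be sketched in plain JavaScript. This is a simplified stand-in for CouchDB's view collation, assuming only numbers and strings inside array keys; compareKeys and inRange are hypothetical helpers, and real CouchDB uses ICU rules for string comparison. The keys use the integer 2 from the view output in the question.

```javascript
// Simplified comparison: numbers sort before strings, arrays compare
// element by element. An approximation of CouchDB collation, not the real thing.
function compareKeys(a, b) {
  if (Array.isArray(a) && Array.isArray(b)) {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = compareKeys(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length; // shorter array sorts first
  }
  const rank = (v) => (typeof v === "number" ? 0 : 1);
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  return a < b ? -1 : a > b ? 1 : 0;
}

// Inclusive range check, like startkey/endkey on a view.
function inRange(key, startkey, endkey) {
  return compareKeys(startkey, key) <= 0 && compareKeys(key, endkey) <= 0;
}

// Date first: the range also lets through a "search" row.
const dateFirstHits = [
  ["2014-08-20", 2, "registration"],
  ["2014-08-20", 2, "search"],
  ["2014-08-21", 2, "registration"],
  ["2014-08-21", 2, "search"],
].filter((k) =>
  inRange(k, ["2014-08-20", 2, "registration"], ["2014-08-21", 2, "registration"])
);

// Date last: only "registration" rows fall inside the range.
const dateLastHits = [
  [2, "registration", "2014-08-20"],
  [2, "search", "2014-08-20"],
  [2, "registration", "2014-08-21"],
  [2, "search", "2014-08-21"],
].filter((k) =>
  inRange(k, [2, "registration", "2014-08-20"], [2, "registration", "2014-08-21"])
);

console.log(dateFirstHits.length, dateLastHits.length);
```

With the date first, any key whose date lies strictly between the two limits is included regardless of its later elements; with the date last, the fixed leading elements must match before the date range is considered.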

What is the format expected by a find(id) request?

My backend replies to find-all requests:
User.find();
Like this
{ 'users' : [ user1_obj, user2_obj ] }
Ember-data is happy about it. Now if I do a simple single object find:
User.find('user1');
I have tried configuring the backend to return any of the following:
user1
{ 'user1' : user1_obj }
{ 'user' : { 'user1' : user1_obj } }
{ 'user' : user1_obj }
But none of those are working. What should I return from the backend in reply to find("obj-id") requests? According to the documentation about JSON ROOT, the right format looks like:
{ 'user' : user1_obj }
Ember does not complain about it, but the Ember Objects processed have a very strange structure, like this:
As you can see, _reference.record is referring to the top record. Also (not shown here) _data field is empty.
What could be causing that strange nesting?
EDIT
As linked by mavilein in his answer, the JSON API suggests using a different format for singular resources:
{ 'users' : [user1_obj] }
That means, the same format as for plural resources. Not sure if Ember will swallow that, I'll check now.

Following this specification, I would expect the following:
{
'users' : [{
"id": "1",
"name" : "John Doe"
},{
"id": "2",
"name" : "Jane Doe"
}]
}
For singular resources the specification says:
Singular resources are represented as JSON objects. However, they are
still wrapped inside an array:
{
'users' : [{
"id": "1",
"name" : "John Doe"
}]
}

User.find() expects the root key to be pluralized and the content to be an array of elements. The response format is the following JSON:
{
users: [
{ id: 1, name: 'Kris' },
{ id: 2, name: 'Luke' },
{ id: 3, name: 'Formerly Alex' }
]
}
With User.find(1), the root key is singular and the content is just one object:
{
user: {
id: 1, name: 'Kris'
}
}
Here is a demo showing this working.
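The two response shapes from this answer could be produced by a backend along these lines. This is a minimal sketch in plain JavaScript; findAllPayload and findOnePayload are hypothetical helper names, and the "users"/"user" root keys simply follow the answer above rather than any confirmed Ember Data contract.

```javascript
// Hypothetical backend helpers that wrap records in the root keys
// described in the answer above.
function findAllPayload(records) {
  // Plural root key, array of records
  return { users: records };
}

function findOnePayload(records, id) {
  // Singular root key, a single record (null if the id is unknown)
  const record = records.find((r) => r.id === id);
  return record ? { user: record } : null;
}

const allUsers = [
  { id: 1, name: "Kris" },
  { id: 2, name: "Luke" },
];

console.log(JSON.stringify(findAllPayload(allUsers)));
console.log(JSON.stringify(findOnePayload(allUsers, 1)));
```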
