Azure Cosmos query to convert into a list

This is my JSON data, which is stored in Cosmos DB:
{
  "id": "e064a694-8e1e-4660-a3ef-6b894e9414f7",
  "Name": "Name",
  "keyData": {
    "Keys": [
      "Government",
      "Training",
      "support"
    ]
  }
}
Now I want to write a query that strips the keyData wrapper and returns only the Keys, like below:
{
  "userid": "e064a694-8e1e-4660-a3ef-6b894e9414f7",
  "Name": "Name",
  "Keys": [
    "Government",
    "Training",
    "support"
  ]
}
So far I have tried a query like
SELECT c.id, k.Keys FROM c
JOIN k IN c.keyPhraseBatchResult
which is not working.
Update 1:
After trying Sajeetharan's suggestion I am now able to get the result, but the issue is that it produces another JSON object inside the array, like this:
{
  "id": "ee885fdc-9951-40e2-b1e7-8564003cd554",
  "keys": [
    {
      "serving": "Government"
    },
    {
      "serving": "Training"
    },
    {
      "serving": "support"
    }
  ]
}
Is there any way to extract only the array without the key/value pairs appearing again?
{
  "userid": "e064a694-8e1e-4660-a3ef-6b894e9414f7",
  "Name": "Name",
  "Keys": [
    "Government",
    "Training",
    "support"
  ]
}

You could try this one:
SELECT C.id, ARRAY(SELECT VALUE serving FROM serving IN C.keyData.Keys) AS Keys FROM C
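The VALUE keyword in the inner SELECT is what returns the bare strings instead of { "serving": ... } objects, so this avoids the nested JSON shown in Update 1.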

You could use a Cosmos DB stored procedure to produce your desired format, based on Sajeetharan's SQL:
function sample() {
  var collection = getContext().getCollection();
  var isAccepted = collection.queryDocuments(
    collection.getSelfLink(),
    'SELECT C.id, ARRAY(SELECT serving FROM serving IN C.keyData.Keys) AS keys FROM C',
    function (err, feed, options) {
      if (err) throw err;
      if (!feed || !feed.length) {
        var response = getContext().getResponse();
        response.setBody('no docs found');
      }
      else {
        var response = getContext().getResponse();
        // Flatten each {"serving": "..."} object into a plain string.
        for (var i = 0; i < feed.length; i++) {
          var keyArray = feed[i].keys;
          var array = [];
          for (var j = 0; j < keyArray.length; j++) {
            array.push(keyArray[j].serving);
          }
          feed[i].keys = array;
        }
        response.setBody(feed);
      }
    });
  if (!isAccepted) throw new Error('The query was not accepted by the server.');
}
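If it helps, here is a rough sketch of invoking that stored procedure from the Python SDK (azure-cosmos v4); the account URL, key, database/container names and partition key value are placeholders, and the procedure is assumed to be registered under the id 'sample':

from azure.cosmos import CosmosClient

# Placeholders: replace with your account URL, key, database and container names.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

# Stored procedures execute within a single partition, so a partition key value is required.
result = container.scripts.execute_stored_procedure(
    sproc="sample",                        # id the procedure was registered under
    partition_key="<partition key value>",
)
print(result)  # the feed with "keys" flattened to plain strings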

Related

Dynamodb update multiple items in one transaction

In my dynamodb table, I have a scenario where when I add a new item to my table, I need to also increment a counter for another entry in my table.
For instance, when USER#1 follows USER#2, I would like to increment followers count for USER#2.
HashKey | RangeKey | counter
--------+----------+--------
USER1   | USER2    |
USER3   | USER2    |
USER2   | USER2    | 2
I do not want to use auto-increment as I want to control how the increment happens.
Naturally, everything works as expected if I make two update calls to dynamodb. One to create the relationship between users and another to update the count for the other user.
The question is whether it is a good approach to make two such calls, or whether a transactWrite would be a better alternative.
If so, how could I perform an increment using the TransactWrite API?
I can add items using the following approach, but I am not sure how to increment:
"TransactItems": [
{
"Put": {
"TableName": "Table",
"Item": {
"hashKey": {"S":"USER1"},
"rangeKey": {"S":"USER2"}
}
}
},
{
"Update": {
"TableName": "TABLE",
"Key": {
"hashKey": {"S":"USER2"},
"rangeKey": {"S":"USER2"}
},
"ConditionExpression": "#cefe0 = :cefe0",
"ExpressionAttributeNames": {"#cefe0":"counter"},
"ExpressionAttributeValues": ?? how do I increment here
}
}
]
Transactions would definitely be the best way to approach it; you can increment using SET in the UpdateExpression:
"TransactItems": [
{
"Put": {
"TableName": "Table",
"Item": {
"hashKey": {"S":"USER1"},
"rangeKey": {"S":"USER2"}
}
}
},
{
"Update": {
"TableName": "TABLE",
"Key": {
"hashKey": {"S":"USER2"},
"rangeKey": {"S":"USER2"}
},
"UpdateExpression": "SET #cefe0 = #cefe0 + :cefe0",
"ExpressionAttributeNames": {"#cefe0":"counter"},
"ExpressionAttributeValues": {"cefe0": {"N": "1"}}
}
}
]
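For reference, a minimal boto3 sketch of the same transaction (table name and key values are the ones from the question; the if_not_exists variant is only a suggestion to cover the case where "counter" does not exist yet):

import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.transact_write_items(
    TransactItems=[
        {
            "Put": {
                "TableName": "Table",
                "Item": {
                    "hashKey": {"S": "USER1"},
                    "rangeKey": {"S": "USER2"},
                },
            }
        },
        {
            "Update": {
                "TableName": "Table",
                "Key": {
                    "hashKey": {"S": "USER2"},
                    "rangeKey": {"S": "USER2"},
                },
                # if_not_exists covers the first follower, when "counter" is not yet set.
                "UpdateExpression": "SET #c = if_not_exists(#c, :zero) + :one",
                "ExpressionAttributeNames": {"#c": "counter"},
                "ExpressionAttributeValues": {
                    ":one": {"N": "1"},
                    ":zero": {"N": "0"},
                },
            }
        },
    ]
)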

How can I get distinct values for the area.names using graphene?

My resolver in schema.py looks like this:
def resolve_areas(self, info, **kwargs):
    result = []
    dupfree = []
    user = info.context.user
    areas = BoxModel.objects.filter(client=user, active=True).values_list('area_string', flat=True)
In GraphiQL I am using this query:
{
  areas {
    edges {
      node {
        id
        name
      }
    }
  }
}
And I get output that starts like this:
{
  "data": {
    "areas": {
      "edges": [
        {
          "node": {
            "id": "QXJlYTpkZWZ",
            "name": "default"
          }
        },
        {
          "node": {
            "id": "QXJlYTptZXN",
            "name": "messe"
          }
        },
        {
          "node": {
            "id": "QXJlYTptZXN",
            "name": "messe"
          }
        },
But I want distinct values for the name field
(I am using a MySQL database, so .distinct('field') does not work).
SOLVED:
distinct was not working, so I just wrote a short loop that tracks the name strings already seen in a list and only appends the whole "area" object if its name has not been added to that list yet:
result = []
dupl_counter = []
for area in areas:
    if area not in dupl_counter:
        dupl_counter.append(area)
        result.append(Area(name=area))
        print(area)
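A slightly shorter variant of the same dedupe, assuming areas is the flat list of name strings from the values_list call above (dicts preserve insertion order in Python 3.7+):

# dict.fromkeys keeps only the first occurrence of each name, in order.
unique_names = dict.fromkeys(areas)
result = [Area(name=name) for name in unique_names]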

Google Cloud DLP tokenization of tabular data with CryptoDeterministicConfig and custom infotype

I am trying to tokenize the string value (passed in tabular format) with a custom regex infoType, but I am having issues when I add more than one row to the table. If I pass a single row, it successfully tokenizes the string_value and returns the encoded string. I'm using the Python library for this.
The custom infoType is currently set to match any value in a string for demo purposes, and the wrapped key is present in Cloud KMS (removed here for security reasons).
Following is the configuration that I am using:
# Construct FPE configuration dictionary
crypto_replace_ffx_fpe_config = {
    "crypto_key": {
        "kms_wrapped": {
            "wrapped_key": wrapped_key,
            "crypto_key_name": key_name,
        }
    }
}

# Add surrogate type
if surrogate_type:
    crypto_replace_ffx_fpe_config["surrogate_info_type"] = {
        "name": surrogate_type
    }

# Construct inspect configuration dictionary
inspect_config = {
    # "info_types": [{"name": info_type} for info_type in info_types],
    # "min_likelihood": "VERY_UNLIKELY",
    "custom_info_types": [
        {
            "info_type": {
                "name": "custom"
            },
            "exclusion_type": "EXCLUSION_TYPE_UNSPECIFIED",
            "likelihood": "POSSIBLE",
            "regex": {
                "pattern": "(?:.*)"
                # "pattern": ".*"
            }
        }
    ]
}

# Construct deidentify configuration dictionary
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "crypto_deterministic_config": crypto_replace_ffx_fpe_config
                }
            }
        ]
    }
}
item = {
    "table": {
        "headers": [
            {"name": header} for header in data_headers
        ],
        "rows": [
            {
                "values": [
                    {"string_value": "asa s.com"}
                ]
            },
            # Issue starts when the below row is added, having any value in string_value
            {
                "values": [
                    {"string_value": "14562#gmail.com"}
                ]
            }
        ]
    }
}

# Call the API
response = dlp.deidentify_content(
    parent,
    inspect_config=inspect_config,
    deidentify_config=deidentify_config,
    item=item,
)

# Print results
return response.item.table
If I send one row of data, I get the response:
headers {
  name: "token"
}
rows {
  values {
    string_value: "EMAIL_ADDRESS(XX):XXXXXXXXXXXXXXXXXXX="
  }
}
And when I send an item with more than one row, I get back exactly what I originally sent to the API:
For example:
headers {
  name: "token"
}
rows {
  values {
    string_value: "asa s.com"
  }
}
rows {
  values {
    string_value: "14562#gmail.com"
  }
}
It seems like you are using InfoTypeTransformations for DeidentifyConfig.
As per the documentation, you should use RecordTransformations instead, as this category of transformation "is applied to values within submitted tabular text data that are identified as a specific infoType, or on an entire column of tabular data" and treats the dataset as structured.
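A rough sketch of what that change could look like with the dictionaries from the question (reusing crypto_replace_ffx_fpe_config and data_headers; untested, so treat the exact field layout as an assumption to check against the RecordTransformations docs):

# Apply the transformation per field of the table instead of treating the
# whole item as free text.
deidentify_config = {
    "record_transformations": {
        "field_transformations": [
            {
                # Which columns to transform; here all headers from the question.
                "fields": [{"name": header} for header in data_headers],
                "info_type_transformations": {
                    "transformations": [
                        {
                            "primitive_transformation": {
                                "crypto_deterministic_config": crypto_replace_ffx_fpe_config
                            }
                        }
                    ]
                },
            }
        ]
    }
}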

elasticsearch in json string (and / or )

I am new to AWS Elasticsearch but need to create queries to search the following data with different criteria.
search_metadata (JSON string with key/value pairs) - "{\"number\":\"111\"; \"area\":\"central\"; \"code\":\"1111\"; \"type\":\"internal\"}"
category - "statement" or "bill" or "email"
datetime - "2019-05-04T00:00:00" or "2019-07-16T00:01:00"
flag - "good" or "bad"
I need to construct a query to do the following:
AND or OR conditions within the search_metadata field (JSON string) -> not sure how to do this.
along with an AND condition for category, datetime range and flag -> Do I need to use multi_match for flag and category?
"query": {
"bool": {
"must": [
{
"match_phrase": {
"search_metadata": "number 111" --> not sure about AND or OR with "area" and others
}
},
{
"range": {
"datetime": {
"gte": "2019-05-04T00:00:00Z",
"lte": "2019-07-16T00:01:00Z"
}
}
}
]
}
}
}
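A sketch of one way these clauses could be combined (untested, and the phrases matched inside search_metadata are assumptions): the inner bool/should gives OR between metadata phrases, while the outer must ANDs that with category, flag and the datetime range, so multi_match is not needed here. Shown as a Python dict for elasticsearch-py 8.x:

from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-domain-endpoint>")  # placeholder endpoint

query = {
    "bool": {
        "must": [
            {
                # OR between phrases found inside the search_metadata JSON string
                "bool": {
                    "should": [
                        {"match_phrase": {"search_metadata": "number 111"}},
                        {"match_phrase": {"search_metadata": "area central"}},
                    ],
                    "minimum_should_match": 1,
                }
            },
            {"match": {"category": "statement"}},
            {"match": {"flag": "good"}},
            {
                "range": {
                    "datetime": {
                        "gte": "2019-05-04T00:00:00Z",
                        "lte": "2019-07-16T00:01:00Z",
                    }
                }
            },
        ]
    }
}

response = es.search(index="<index>", query=query)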

Deleting row using composite key

I have the table 'column_defn' with the following schema. The keys are column_name, database_name and table_name:
column_name STRING(130) NOT NULL
database_name STRING(150) NOT NULL
table_name STRING(130) NOT NULL
column_description STRING(1000) NOT NULL
I am trying to delete a row using the following REST request
{
  "session": "xxxxxxxxx",
  "singleUseTransaction": {
    "readWrite": {}
  },
  "mutations": [
    {
      "delete": {
        "table": "column_defn",
        "keySet": {
          "keys": [
            [
              {
                "column_name": "testd"
              },
              {
                "table_name": "test atbd"
              },
              {
                "database_name": "ASDFDFS"
              }
            ]
          ]
        }
      }
    }
  ]
}
but I keep getting the following error. Any idea as to what is wrong in the above request?
{
  "error": {
    "code": 400,
    "message": "Invalid value for column database_name in table column_defn: Expected STRING.",
    "status": "FAILED_PRECONDITION"
  }
}
Update: The following request seems to be successful; at least it returns success code 200 and a commitTimestamp. However, the row doesn't get deleted:
{
  "singleUseTransaction": {
    "readWrite": {}
  },
  "mutations": [
    {
      "delete": {
        "table": "column_defn",
        "keySet": {
          "keys": [
            [
              "testd",
              "dsafd",
              "test atbd"
            ]
          ]
        }
      }
    }
  ]
}
keys should contain an array-of-arrays. In the outer array, there will be one entry for each row you are trying to delete. Each inner array will be the ordered list of key-values that define a single row (order matters). So in your example, you want:
"keys": [["testd","ASDFDFS","test atbd"]]
Note that the original question is inconsistent in the true ordering of the keys in the table. The above answer assumes the primary key is defined something like:
PRIMARY KEY(column_name,database_name,table_name)
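For completeness, a small sketch of the same delete through the Python client (instance and database ids are placeholders; the key parts are passed positionally in primary-key order):

from google.cloud import spanner

client = spanner.Client()
database = client.instance("<instance-id>").database("<database-id>")

# Key parts in primary-key order: (column_name, database_name, table_name).
keyset = spanner.KeySet(keys=[["testd", "ASDFDFS", "test atbd"]])

with database.batch() as batch:
    batch.delete("column_defn", keyset)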