Django JSONField nested greater than operation - django

Here's the structure of the JSONField - data I have.
data : {
"animals" : [
"animal_id" : 2321,
"legs" : [
{
"name" : "front",
"measured_at" : 1596740795.453353
},
{
"name" : "back",
"measured_at": 1596740795.453353
}
]
]
}
I need to find all records that match the following condition
legs-name = "front" OR
measured_at is greater than 6 hrs from now.
I have tried the following
six_hrs_ago = (datetime.utcnow() - timedelta(hours = 6)).timestamp()
obj = Model.objects.filter(data__has_key='animals').
filter(Q(data__animals__legs__name="front") | Q(data__animals__legs__measured_at__gt = six_hrs_ago))))
This doesn't work, and it fetches no record.
I also tried to just filter the records with legs - name - front, that doesn't work either
obj = Model.objects.filter(data__animals__contains=[{'legs': [{"name": "front"}]}])
Any ideas are appreciated.

Related

MongoDB query to find text in third level array of objects

I have a Mongo collection that contains data on saved searches in a Vue/Laravel app, and it contains records like the following:
{
"_id" : ObjectId("6202f3357a02e8740039f343"),
"q" : null,
"name" : "FCA last 3 years",
"frequency" : "Daily",
"scope" : "FederalContractAwardModel",
"filters" : {
"condition" : "AND",
"rules" : [
{
"id" : "awardDate",
"operator" : "between_relative_backward",
"value" : [
"now-3.5y/d",
"now/d"
]
},
{
"id" : "subtypes.extentCompeted",
"operator" : "in",
"value" : [
"Full and Open Competition"
]
}
]
},
The problem is the value in the item in the rules array that has the decimal.
"value" : [
"now-3.5y/d",
"now/d"
]
in particular the decimal. Because of a UI error, the user was allowed to enter a decimal value, and so this needs to be fixed to remove the decimal like so.
"value" : [
"now-3y/d",
"now/d"
]
My problem is writing a Mongo query to identify these records (I'm a Mongo noob). What I need is to identify records in this collection that have an item in the filters.rules array with an item in the 'value` array that contains a decimal.
Piece of cake, right?
Here's as far as I've gotten.
myCollection.find({"filters.rules": })
but I'm not sure where to go from here.
UPDATE: After running the regex provided by #R2D2, I found that it also brings up records with a valid date string , e.g.
"rules" : [
{
"id" : "dueDate",
"operator" : "between",
"value" : [
"2018-09-10T19:04:00.000Z",
null
]
},
so what I need to do is filter out cases where the period has a double 0 on either side (i.e. 00.00). If I read the regex correctly, this part
[^\.]
is excluding characters, so I would want something like
[^00\.00]
but running this query
db.collection.find( {
"filters.rules.value": { $regex: /\.[^00\.00]*/ }
} )
still returns the same records, even though it works as expected in a regex tester. What am I missing?
To find all documents containing at least one value string with (.) , try:
db.collection.find( {
"filters.rules.value": { $regex: /\.[^\.]*/ }
} )
Or you can filter only the fields that need fix via aggregation as follow:
[direct: mongos]> db.tes.aggregate([ {$unwind:"$filters.rules"}, {$unwind:"$filters.rules.value"}, {$match:{ "filters.rules.value": {$regex: /\.[^\.]*/ } }} ,{$project:{_id:1,oldValue:"$filters.rules.value"}} ])
[
{ _id: ObjectId("6202f3357a02e8740039f343"), oldValue: 'now-3.5y/d' }
]
[direct: mongos]>
Later to update those values:
db.collection.update({
"filters.rules.value": "now-3.5y/d"
},
{
$set: {
"filters.rules.$[x].value.$": "now-3,5y/d-CORRECTED"
}
},
{
arrayFilters: [
{
"x.value": "now-3.5y/d"
}
]
})
playground

Spring data mongo query with regex within an array

I have a collection with structure somewhat like this :
{
"organization" : "Org1",
"active" : true,
"fields" : [
{
"key" : "key1",
"value" : "table"
},
{
"key" : "key2",
"value" : "Harrison"
}
]
}
I need to find all documents with organization : "Org1", active : true, and regex match the 'value' in fields.
In mongo shell, it works perfectly. I tried the query:
db.collection.find({"organization" : "Org1", "active" : true, "fields" : {$elemMatch : {"key" : "key2","value" : {$regex : /iso/i}}}}).pretty()
But when I tried to convert it to a Java code with Spring, it gives wrong results.
1. This one will give documents even if it didn't match the pattern:
#Query("{'organization' : ?0, 'active' : true, 'fields' : {$elemMatch : {'key' : ?1, 'value' : {$regex : ?2}}}}")
List<ObjectCollection> findFieldDataByRegexMatch(String org, String key, String pattern);
This one doesn't give any documents even though it should.
MongoTemplate MONGO_TEMPLATE = null;
try {
MONGO_TEMPLATE = multipleMongoConfig.secondaryMongoTemplate();
} catch (Exception e) {
e.printStackTrace();
}
List<Criteria> criteriaListAnd = new ArrayList<Criteria>();
Criteria criteria = new Criteria();
String pattern = "/iso/i";
criteriaListAnd.add(Criteria.where("organization").is("Org1"));
criteriaListAnd.add(Criteria.where("active").is(true));
criteriaListAnd.add(Criteria.where("fields").elemMatch(Criteria.where("key").is(key).and("value").regex(pattern)));
criteria.andOperator(criteriaListAnd.toArray(new Criteria[criteriaListAnd.size()]));
Query query = new Query();
query.addCriteria(criteria);
List<ObjectCollection> objects = MONGO_TEMPLATE.find(query, ObjectCollection.class);
What am I missing here and how should I form my query?
You are making a very small mistake, in the pattern you are passing / which is the mistake, it took me half an hour to identify it, finally, I got it after enabling the debug log of spring boot.
For the first query, it should be called as below:
springDataRepository.findFieldDataByRegexMatch("Org1", "key2", "iso")
And the query should be modified in the Repository as to hanlde the case sensetivity:
#Query("{'organization' : ?0, 'active' : true, 'fields' : {$elemMatch : {'key' : ?1, 'value' : {$regex : ?2, $options: 'i'}}}}")
List<Springdata> findFieldDataByRegexMatch(String org, String key, String pattern);
The same issue in your second query also, just change String pattern = "/iso/i"; to String pattern = "iso" or String pattern = "iso.*" ;
Both will start working, For details please check the my GitHub repo https://github.com/krishnaiitd/learningJava/blob/master/spring-boot-sample-data-mongodb/src/main/java/sample/data/mongo/main/Application.java#L60
I hope this will resolve your problem.

django-elasticsearch-dsl Mapping for completion suggest field not working

This problem has been driving me crazy for days with not solution.
I create a document as follows from my django model.
from django_elasticsearch_dsl import fields
#registry.register_document class QuestionDocument(Document):
complete = fields.CompletionField(attr='title')
class Index:
name = 'questions'
class Django:
model = QuestionModel
fields = ['text', 'title']
Now i want to perform a completion query like this:
matched_questions = list(QuestionDocument.search().suggest("suggestions", word, completion={'field': 'complete'}).execute())
But i keep getting the following error:
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Field [complete] is not a completion suggest field')
I think the problem is that The mapping for this field is not created correctly, but i don't know how to fix it. Can anybody help with this it is literally driving me crazy.
UPDATE:
I realized that in my mapping, complete is created as a text field, and i don't know why this is happening or how to fix this. This is my mapping:
{
"questions" : {
"mappings" : {
"doc" : {
"properties" : {
"complete" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"text" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
I struggled too with a similar issue.
You should try to declare your Document like this:
#registry.register_document
class QuestionDocument(Document):
title = fields.TextField(
fields={
'raw': fields.TextField(analyzer='standard'),
'suggest': fields.CompletionField(),
}
)
class Index:
name = 'questions'
class Django:
model = QuestionModel
fields = ['text']
then to recover suggests from the title field:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
client = Elasticsearch("localhost:9200")
s = Search(using=client)
query = s.suggest('name_suggestion',"your_prefixword",completion={'field':'title.suggest'})
response = query.execute()
response.suggest['name_suggestion']
Hope it helps. Let me know if that do the job.
Your index is being created automatically instead of you creating it with the mappings you need. You need to, before you index any document, create the index. Not sure how this is done in django_elasticsearch_dsl but in elasticsearch_dsl it would be just calling QuestionDocument.init()
Hope this helps!

dynamoDB update-item python boto3

I have a column in DynamoDB table which will be of the following type:
{"History": {"L": [{"M": {"id": {"S": "id"}, "Flow": {"L":[{"S": "test2"}]},"UUID": {"S": "1234"}}}]}}
Column History is of type 'List' in which each list element is a map with 3 values - id (string), Flow (List), uuid (String)
My code would trigger update-item multiple times and all I want is, given the same id and uuid, new values are to be appended into the Flow list without disturbing anything else.
I have referred the documentation but unable to figure out how to write the UpdateExpression.
My existing code is as below:
response_update = client.update_item(
TableName = 'tableName',
Key = {
'k1': {
'S': 'v1'
},
'k2': {
'S': 'v2'
}
},
UpdateExpression="SET History=list_append(if_not_exists(History, :empty_list), :attrValue)",
ExpressionAttributeValues = {":attrValue" :{"L":[ { "M" : { "id" : { "S" : "123" }, "UUID" : { "S" : "uuid123" }, "Flow" : { "L" : [ { "S" : "now2" } ] } } } ]},":empty_list":{"L":[]}})
With this code, each time I trigger the update function, a new element in getting appended in History list. Instead, I need my desired string to be appended to the Flow list.
Please let me know how the expression should be.

Implement auto-complete feature using MongoDB search

I have a MongoDB collection of documents of the form
{
"id": 42,
"title": "candy can",
"description": "canada candy canteen",
"brand": "cannister candid",
"manufacturer": "candle canvas"
}
I need to implement auto-complete feature based on the input search term by matching in the fields except id. For example, if the input term is can, then I should return all matching words in the document as
{ hints: ["candy", "can", "canada", "canteen", ...]
I looked at this question but it didn't help. I also tried searching how to do regex search in multiple fields and extract matching tokens, or extracting matching tokens in a MongoDB text search but couldn't find any help.
tl;dr
There is no easy solution for what you want, since normal queries can't modify the fields they return. There is a solution (using the below mapReduce inline instead of doing an output to a collection), but except for very small databases, it is not possible to do this in realtime.
The problem
As written, a normal query can't really modify the fields it returns. But there are other problems. If you want to do a regex search in halfway decent time, you would have to index all fields, which would need a disproportional amount of RAM for that feature. If you wouldn't index all fields, a regex search would cause a collection scan, which means that every document would have to be loaded from disk, which would take too much time for autocompletion to be convenient. Furthermore, multiple simultaneous users requesting autocompletion would create considerable load on the backend.
The solution
The problem is quite similar to one I have already answered: We need to extract every word out of multiple fields, remove the stop words and save the remaining words together with a link to the respective document(s) the word was found in a collection. Now, for getting an autocompletion list, we simply query the indexed word list.
Step 1: Use a map/reduce job to extract the words
db.yourCollection.mapReduce(
// Map function
function() {
// We need to save this in a local var as per scoping problems
var document = this;
// You need to expand this according to your needs
var stopwords = ["the","this","and","or"];
for(var prop in document) {
// We are only interested in strings and explicitly not in _id
if(prop === "_id" || typeof document[prop] !== 'string') {
continue
}
(document[prop]).split(" ").forEach(
function(word){
// You might want to adjust this to your needs
var cleaned = word.replace(/[;,.]/g,"")
if(
// We neither want stopwords...
stopwords.indexOf(cleaned) > -1 ||
// ...nor string which would evaluate to numbers
!(isNaN(parseInt(cleaned))) ||
!(isNaN(parseFloat(cleaned)))
) {
return
}
emit(cleaned,document._id)
}
)
}
},
// Reduce function
function(k,v){
// Kind of ugly, but works.
// Improvements more than welcome!
var values = { 'documents': []};
v.forEach(
function(vs){
if(values.documents.indexOf(vs)>-1){
return
}
values.documents.push(vs)
}
)
return values
},
{
// We need this for two reasons...
finalize:
function(key,reducedValue){
// First, we ensure that each resulting document
// has the documents field in order to unify access
var finalValue = {documents:[]}
// Second, we ensure that each document is unique in said field
if(reducedValue.documents) {
// We filter the existing documents array
finalValue.documents = reducedValue.documents.filter(
function(item,pos,self){
// The default return value
var loc = -1;
for(var i=0;i<self.length;i++){
// We have to do it this way since indexOf only works with primitives
if(self[i].valueOf() === item.valueOf()){
// We have found the value of the current item...
loc = i;
//... so we are done for now
break
}
}
// If the location we found equals the position of item, they are equal
// If it isn't equal, we have a duplicate
return loc === pos;
}
);
} else {
finalValue.documents.push(reducedValue)
}
// We have sanitized our data, now we can return it
return finalValue
},
// Our result are written to a collection called "words"
out: "words"
}
)
Running this mapReduce against your example would result in db.words look like this:
{ "_id" : "can", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "canada", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "candid", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "candle", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "candy", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "cannister", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "canteen", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
{ "_id" : "canvas", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
Note that the individual words are the _id of the documents. The _id field is indexed automatically by MongoDB. Since indices are tried to be kept in RAM, we can do a few tricks to both speed up autocompletion and reduce the load put to the server.
Step 2: Query for autocompletion
For autocompletion, we only need the words, without the links to the documents.
Since the words are indexed, we use a covered query – a query answered only from the index, which usually resides in RAM.
To stick with your example, we would use the following query to get the candidates for autocompletion:
db.words.find({_id:/^can/},{_id:1})
which gives us the result
{ "_id" : "can" }
{ "_id" : "canada" }
{ "_id" : "candid" }
{ "_id" : "candle" }
{ "_id" : "candy" }
{ "_id" : "cannister" }
{ "_id" : "canteen" }
{ "_id" : "canvas" }
Using the .explain() method, we can verify that this query uses only the index.
{
"cursor" : "BtreeCursor _id_",
"isMultiKey" : false,
"n" : 8,
"nscannedObjects" : 0,
"nscanned" : 8,
"nscannedObjectsAllPlans" : 0,
"nscannedAllPlans" : 8,
"scanAndOrder" : false,
"indexOnly" : true,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"_id" : [
[
"can",
"cao"
],
[
/^can/,
/^can/
]
]
},
"server" : "32a63f87666f:27017",
"filterSet" : false
}
Note the indexOnly:true field.
Step 3: Query the actual document
Albeit we will have to do two queries to get the actual document, since we speed up the overall process, the user experience should be well enough.
Step 3.1: Get the document of the words collection
When the user selects a choice of the autocompletion, we have to query the complete document of words in order to find the documents where the word chosen for autocompletion originated from.
db.words.find({_id:"canteen"})
which would result in a document like this:
{ "_id" : "canteen", "value" : { "documents" : [ ObjectId("553e435f20e6afc4b8aa0efb") ] } }
Step 3.2: Get the actual document
With that document, we can now either show a page with search results or, like in this case, redirect to the actual document which you can get by:
db.yourCollection.find({_id:ObjectId("553e435f20e6afc4b8aa0efb")})
Notes
While this approach may seem complicated at first (well, the mapReduce is a bit), it is actual pretty easy conceptually. Basically, you are trading real time results (which you won't have anyway unless you spend a lot of RAM) for speed. Imho, that's a good deal. In order to make the rather costly mapReduce phase more efficient, implementing Incremental mapReduce could be an approach – improving my admittedly hacked mapReduce might well be another.
Last but not least, this way is a rather ugly hack altogether. You might want to dig into elasticsearch or lucene. Those products imho are much, much more suited for what you want.
Thanks to #Markus solution, I came up with something similar with aggregations instead. Knowing that map-reduce are flagged as deprecated for later versions.
const { MongoDBNamespace, Collection } = require('mongodb')
//.replace(/(\b(\w{1,3})\b(\W|$))/g,'').split(/\s+/).join(' ')
const routine = `function (text) {
const stopwords = ['the', 'this', 'and', 'or', 'id']
text = text.replace(new RegExp('\\b(' + stopwords.join('|') + ')\\b', 'g'), '')
text = text.replace(/[;,.]/g, ' ').trim()
return text.toLowerCase()
}`
// If the pipeline includes the $out operator, aggregate() returns an empty cursor.
const agg = [
{
$match: {
a: true,
d: false,
},
},
{
$project: {
title: 1,
desc: 1,
},
},
{
$replaceWith: {
_id: '$_id',
text: {
$concat: ['$title', ' ', '$desc'],
},
},
},
{
$addFields: {
cleaned: {
$function: {
body: routine,
args: ['$text'],
lang: 'js',
},
},
},
},
{
$replaceWith: {
_id: '$_id',
text: {
$trim: {
input: '$cleaned',
},
},
},
},
{
$project: {
words: {
$split: ['$text', ' '],
},
qt: {
$const: 1,
},
},
},
{
$unwind: {
path: '$words',
includeArrayIndex: 'id',
preserveNullAndEmptyArrays: true,
},
},
{
$group: {
_id: '$words',
docs: {
$addToSet: '$_id',
},
weight: {
$sum: '$qt',
},
},
},
{
$sort: {
weight: -1,
},
},
{
$limit: 100,
},
{
$out: {
db: 'listings_db',
coll: 'words',
},
},
]
// Closure for db instance only
/**
*
* #param { MongoDBNamespace } db
*/
module.exports = function (db) {
/** #type { Collection } */
let collection
/**
* Runs the aggregation pipeline
* #return {Promise}
*/
this.refreshKeywords = async function () {
collection = db.collection('listing')
// .toArray() to trigger the aggregation
// it returns an empty curson so it's fine
return await collection.aggregate(agg).toArray()
}
}
Please check for very minimal changes for your convenience.