Query Django JSONFields that are a list of dictionaries - django

Given a Django JSONField that is structured as a list of dictionaries:
# JSONField "materials" on MyModel:
[
{"some_id": 123, "someprop": "foo"},
{"some_id": 456, "someprop": "bar"},
{"some_id": 789, "someprop": "baz"},
]
and given a list of values to look for:
myids = [123, 789]
I want to query for all MyModel instances that have a matching some_id anywhere in those lists of dictionaries. I can do this to search in dictionaries one at a time:
# Search inside the third dictionary in each list:
MyModel.objects.filter(materials__2__some_id__in=myids)
But I can't seem to construct a query to search in all dictionaries at once. Is this possible?

Following the clue from Davit Tovmasyan (iterate over the match_targets and build up a set of Q queries), I wrote this function. It takes the name of the field to search, the property name to search against, and a list of target matches. It returns a new list containing the matching dictionaries and the IDs of the source objects they come from.
from django.db.models import Q
from iris.apps.claims.models import Claim
def json_list_search(
    json_field_name: str,
    property_name: str,
    match_targets: list
) -> list:
    """
    Args:
        json_field_name: Name of the JSONField to search in
        property_name: Name of the dictionary key to search against
        match_targets: List of possible values that should constitute a match
    Returns:
        List of dictionaries: [
            {"claim_id": 123, "json_obj": {"foo": "y"}},
            {"claim_id": 456, "json_obj": {"foo": "z"}}
        ]
    Example:
        results = json_list_search(
            json_field_name="materials_data",
            property_name="material_id",
            match_targets=[1, 22]
        )
        # (results truncated):
        [
            {
                "claim_id": 1,
                "json_obj": {
                    "category": "category_kmimsg",
                    "material_id": 1,
                },
            },
            {
                "claim_id": 2,
                "json_obj": {
                    "category": "category_kmimsg",
                    "material_id": 22,
                }
            },
        ]
    """
    # OR together one containment lookup per target value.
    q_keys = Q()
    for match_target in match_targets:
        kwargs = {
            f"{json_field_name}__contains": [{property_name: match_target}]
        }
        q_keys |= Q(**kwargs)
    claims = Claim.objects.filter(q_keys)
    # Now we know which ORM objects contain references to any of the match_targets
    # in any of their dictionaries. Extract *relevant* objects and return them
    # with references to the source claim.
    results = []
    for claim in claims:
        data = getattr(claim, json_field_name)
        for datum in data:
            # Test for the key explicitly so falsy values such as 0 still match.
            if property_name in datum and datum[property_name] in match_targets:
                results.append({"claim_id": claim.id, "json_obj": datum})
    return results
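The extraction step (picking the matching dictionaries out of each object's list) can be tried without a database. The sketch below runs that loop over plain dictionaries; the `rows` list and the `extract_matches` name are hypothetical stand-ins for the Claim queryset and are not part of the original code.

```python
def extract_matches(rows, json_field_name, property_name, match_targets):
    """Collect every dict in each row's JSON list whose property is a target."""
    results = []
    for row in rows:
        for datum in row[json_field_name]:
            # Check for the key first so a missing key never counts as a match.
            if property_name in datum and datum[property_name] in match_targets:
                results.append({"claim_id": row["id"], "json_obj": datum})
    return results


# Hypothetical rows standing in for model instances.
rows = [
    {"id": 1, "materials": [{"some_id": 123, "someprop": "foo"},
                            {"some_id": 456, "someprop": "bar"}]},
    {"id": 2, "materials": [{"some_id": 789, "someprop": "baz"}]},
]

matches = extract_matches(rows, "materials", "some_id", [123, 789])
print(matches)
```

Only the dictionaries whose some_id is in the target list are kept, each paired with the id of the row it came from.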

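contains might help you. Because the field holds a list of dictionaries, wrap the lookup value in a list as well (PostgreSQL's containment check does not match a bare dictionary against a list). It should be something like this:
q_keys = Q()
for _id in myids:
    q_keys |= Q(materials__contains=[{'some_id': _id}])
MyModel.objects.filter(q_keys)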

Related

customized sorting using search term in django

I am searching for the term "john" in a list of dicts.
I have a list of dicts like this:
"response": [
{
"name": "Alex T John"
},
{
"name": "Ajo John"
},
{
"name": "John",
}]
I am using:
response_query = sorted(response, key = lambda i: i['name'])
response_query only returns the results in ascending order, but I need the results ordered with the first name as the priority.
Expected result:
{
"name": "John"
},
{
"name": "Ajo John"
},
{
"name": "Alex T John",
}
The first name containing search term should appear first.
If you need to sort with priorities, you can use a key function that returns a tuple. In your particular case, as far as I understand the question, this function will work fine:
response_query = sorted(
    response,
    key=lambda i: (len(i['name'].split()) > 1, i['name'])
)
In other words, I added the condition len(i['name'].split()) > 1, which returns False (so the entry goes first) if the name consists of one word only, and True otherwise.
If the priority condition should instead be that the name starts with the search term, the code would be:
term = 'john'
...
response_query = sorted(
    response,
    key=lambda i: (not i['name'].lower().startswith(term), i['name'])
)
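As a self-contained illustration of the tuple-key idea, here is a minimal runnable sketch using the sample data from the question. The `priority_key` function name is introduced here for readability and is not part of the original answer.

```python
# Sample data copied from the question.
response = [
    {"name": "Alex T John"},
    {"name": "Ajo John"},
    {"name": "John"},
]
term = "john"


def priority_key(item):
    name = item["name"]
    # False sorts before True, so names starting with the term come first;
    # the name itself breaks ties alphabetically.
    return (not name.lower().startswith(term), name)


response_query = sorted(response, key=priority_key)
print([item["name"] for item in response_query])
```

Tuples compare element by element, so the boolean decides the group and the name decides the order inside each group.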

Django annotate several same objects in QuerySet by different related object

I have:
# models
class Building(models.Model):
    ...
class Flat(models.Model):
    building = models.ForeignKey(Building, on_delete=models.CASCADE)
class Profile(models.Model):
    flats = models.ManyToManyField(Flat)
# logic
building = Building.objects.create()
flat_1 = Flat.objects.create(building=building)
flat_2 = Flat.objects.create(building=building)
profile = Profile.objects.create()
profile.flats.add(flat_1)
profile.flats.add(flat_2)
profiles = Profile.objects.filter(flats__building=building)
I get the same profile twice in profiles. How can I annotate each of them with a different flat, like this: profiles.first().flat == flat_1 and profiles.last().flat == flat_2?
Maybe Subquery(), but how?
UPD: I need this in a DRF list view. The JSON output must be something like:
[
{
"profile_id": 1,
"flat_id": 2
},
{
"profile_id": 1,
"flat_id": 3
}
]
To obtain that output, you could do:
data = Profile.objects.all().values('flats', 'id')
return Response(data=data)
in your DRF view.
You don't have two Profile instances ...
I wrote the code for your exact needs at the end, but first a couple of things that might be of interest.
In your code sample you've created only one profile, so you are not getting 2 different Profile instances: every row in the queryset is that same profile (the join on flats can repeat it once per matching flat).
So, for that queryset:
profiles.first() == profiles.last() # True
since profiles.first() and profiles.last() are the same profile.
You should try creating 2 Profile instances:
building = Building.objects.create()
flat_1 = Flat.objects.create(building=building)
flat_2 = Flat.objects.create(building=building)
profile_1 = Profile.objects.create() # You could/should use bulk_create here.
profile_2 = Profile.objects.create()
profile_1.flats.add(flat_1)
profile_2.flats.add(flat_2)
Then
profiles = Profile.objects.filter(flats__building=building)
will return two different profile objects.
On the other hand, obtaining the JSON like you want ...
Following the example you posted, filter flats by profile and get the values (this also works if you have more than one profile).
Flat.objects.filter(profile=profile_1).values('profile__id', 'id')
This will return something like the following ("id" stands for the flat IDs):
[
{
"profile__id": 1,
"id": 1
},
{
"profile__id": 1,
"id": 3
}
]
If you do not filter by profile (and you have more than one) you could get something like:
[
{
"profile__id": 1,
"id": 1
},
{
"profile__id": 2,
"id": 3
},
{
"profile__id": 2,
"id": 4
},
...
]
Annotating to get the EXACT JSON you want:
Filter as shown previously, annotate, and get the desired values:
from django.db.models import F
Flat.objects.filter(profile=profile_1).annotate(
    flat_id=F('id')
).annotate(
    profile_id=F('profile__id')
).values(
    'profile_id', 'flat_id'
)
This will give exactly what you want:
[
{
"profile_id": 1,
"flat_id": 2
},
{
"profile_id": 1,
"flat_id": 3
}
]
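To see why the rows come out as one entry per (profile, flat) pair, here is a plain-Python sketch that mimics the many-to-many join table without Django. The `through_rows` list and the `profile_flat_pairs` helper are hypothetical stand-ins for the join table and the query above, not Django APIs.

```python
# Hypothetical stand-in for the many-to-many join table:
# each row links one profile to one flat.
through_rows = [
    {"profile_id": 1, "flat_id": 2},
    {"profile_id": 1, "flat_id": 3},
    {"profile_id": 2, "flat_id": 4},
]


def profile_flat_pairs(rows, profile_id):
    """Return one dict per (profile, flat) link for the given profile."""
    return [
        {"profile_id": row["profile_id"], "flat_id": row["flat_id"]}
        for row in rows
        if row["profile_id"] == profile_id
    ]


pairs = profile_flat_pairs(through_rows, profile_id=1)
print(pairs)
```

A profile linked to two flats yields two rows, which is the same reason the original filter on flats__building listed one profile twice.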
You can do that with the right serializer and the right annotation.
The serializer:
class FlatSerializer(serializers.ModelSerializer):
    class Meta:
        model = Flat
        fields = ('flat_id', 'building_id')
    flat_id = serializers.CharField(read_only=True)
Then I would simply query Flats rather than Profiles and serialize:
flats = Flat.objects \
    .annotate(flat_id=F('id')) \
    .filter(building=building)
serialized = FlatSerializer(flats, many=True)
print(serialized.data) # [ { flat_id: 1, building_id: 1 }, { flat_id: 2, building_id: 1 } ]
Let me know if that works for you.

Writing JSON with dict properties to Google Cloud Datastore

Using Apache Beam (Python 2.7 SDK), I am trying to write JSON files as entities into Google Cloud Datastore.
Sample JSON:
{
"CustId": "005056B81111",
"Name": "John Smith",
"Phone": "827188111",
"Email": "john#xxx.com",
"addresses": [
{"type": "Billing", "streetAddress": "Street 7", "city": "Malmo", "postalCode": "CR0 4UZ"},
{"type": "Shipping", "streetAddress": "Street 6", "city": "Stockholm", "postalCode": "YYT IKO"}
]
}
I have written an Apache Beam pipeline with 3 main steps:
beam.io.ReadFromText(input_file_path)
beam.ParDo(CreateEntities())
WriteToDatastore(PROJECT)
In step 2, I convert the JSON object (dict) into an entity:
class CreateEntities(beam.DoFn):
    def process(self, element):
        element = element.encode('ascii', 'ignore')
        element = json.loads(element)
        Id = element.pop('CustId')
        entity = entity_pb2.Entity()
        datastore_helper.add_key_path(entity.key, 'CustomerDF', Id)
        datastore_helper.add_properties(entity, element)
        return [entity]
This works fine for basic properties. However, since addresses is itself a list of dict objects, it fails.
I have read a similar post.
However, I did not find the exact code to convert dict -> entity.
I tried the following to set the addresses element as an entity, but it does not work:
element['addresses'] = entity_pb2.Entity()
Other References:
https://www.the-swamp.info/blog/uploading-data-cloud-datastore-using-dataflow/
https://gcloud-python.readthedocs.io/en/latest/datastore/entities.html
Are you trying to store this as a repeated structured property?
ndb.StructuredProperty values appear in Dataflow with the keys flattened, and for repeated structured properties, each individual property within the structured property object becomes an array. So I think you would need to write it like this:
datastore_helper.add_properties(entity, {
    ...
    "addresses.type": ["Billing", "Shipping"],
    "addresses.streetAddress": ["Street 7", "Street 6"],
    "addresses.city": ["Malmo", "Stockholm"],
    "addresses.postalCode": ["CR0 4UZ", "YYT IKO"],
})
Alternatively, if you're trying to save this as an ndb.JsonProperty, you can do this:
datastore_helper.add_properties(entity, {
    ...
    "addresses": json.dumps(element['addresses']),
})
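Both shapes described above can be produced from the sample addresses list with plain Python before handing them to add_properties. This is only a sketch: the `flatten_repeated` helper is a hypothetical name, it is written for Python 3, and it assumes every dictionary in the list has the same keys (otherwise the per-key arrays would no longer line up).

```python
import json


def flatten_repeated(name, items):
    """Turn a list of dicts into {"name.key": [values...]} columns.

    Hypothetical helper; assumes every dict in `items` has the same keys.
    """
    flattened = {}
    for item in items:
        for key, value in item.items():
            flattened.setdefault(f"{name}.{key}", []).append(value)
    return flattened


# Sample data copied from the question.
addresses = [
    {"type": "Billing", "streetAddress": "Street 7", "city": "Malmo", "postalCode": "CR0 4UZ"},
    {"type": "Shipping", "streetAddress": "Street 6", "city": "Stockholm", "postalCode": "YYT IKO"},
]

# Shape 1: flattened keys, one array per sub-property.
flat = flatten_repeated("addresses", addresses)
# Shape 2: the whole list serialized as a single JSON string.
as_json = json.dumps(addresses)
print(flat)
```

The flattened dictionary could be merged into the other properties, while the JSON string would go under a single "addresses" key.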
I know this is an old question, but I had a similar issue (although Python 3.6 and NDB) and wrote a function to convert all dicts inside a dict into Entity. This uses recursion to go through all nodes converting as necessary:
# Assumption: `datastore` is the google-cloud-datastore client package
# (the import was not shown in the original snippet).
from google.cloud import datastore
def dict_to_entity(data):
    # the data can be a dict or a list, and they are iterated over differently
    # also create a new object to store the child objects
    if type(data) == dict:
        childiterator = data.items()
        new_data = {}
    elif type(data) == list:
        childiterator = enumerate(data)
        new_data = []
    else:
        return
    for i, child in childiterator:
        # if the child is a dict or a list, continue drilling...
        if type(child) in [dict, list]:
            new_child = dict_to_entity(child)
        else:
            new_child = child
        # add the child data to the new object
        if type(data) == dict:
            new_data[i] = new_child
        else:
            new_data.append(new_child)
    # convert the new object to Entity if needed
    if type(data) == dict:
        child_entity = datastore.Entity()
        child_entity.update(new_data)
        return child_entity
    else:
        return new_data

How do I insert or create data with Django ORM programmatically? Or how do I specify a Model field name with a string?

If I have:
class Tag(models.Model):
    number = models.IntegerField()
config = {
    "data": 1,
    "field": "number"
}
How do I do the following?
record = Tag(config["field"]=config["data"])
record.save()
You can unpack a dict into keyword arguments using the ** syntax. For example:
config = {
    "number": 1
}
record = Tag(**config)
record.save()
This will create a new Tag instance with number=1.
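To keep the question's original config shape (a "field" entry naming the attribute and a "data" entry holding the value), a one-item dict can be built on the fly and unpacked the same way. The sketch below uses a plain dataclass as a hypothetical stand-in for the Django model so that it runs without Django; with the real model, the same ** expression would be passed to Tag(...) and followed by record.save().

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """Hypothetical stand-in for the Django model; accepts the same keyword."""
    number: int = 0


config = {"data": 1, "field": "number"}

# Build a one-item dict whose key is the field name, then unpack it
# into keyword arguments: equivalent to Tag(number=1).
record = Tag(**{config["field"]: config["data"]})
print(record)
```

Because the key is computed at runtime, the field name can come from any string, which is what the invalid `Tag(config["field"]=config["data"])` syntax was trying to express.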

groovy: create a list of values with all strings

I am trying to iterate through a map and create a new map from its values. Below is the input:
def map = [[name: 'hello', email: ['on', 'off'] ], [ name: 'bye', email: ['abc', 'xyz']]]
I want the resulting data to be like:
[hello: ['on', 'off'], bye: ['abc', 'xyz']]
The code I have right now:
result = [:]
map.each { key ->
    result[random] = key.email.each { random ->
        "$random"
    }
}
return result
The above code returns
[hello: [on, off], bye: [abc, xyz]]
As you can see above, the quotes around on, off and abc, xyz have disappeared, which causes problems for me when I try to do checks on the list value [on, off].
It should not matter. If you look at the result in the Groovy console, they are still Strings.
The below should be sufficient:
map.collectEntries {
    [ it.name, it.email ]
}
If you still need the single quotes to create a GString instead of a String, then the tweak below would be required:
map.collectEntries {
    [ it.name, it.email.collect { "'$it'" } ]
}
I personally do not see any reason for doing it the latter way. BTW, map is not a Map, it is a List; you can rename it to avoid unnecessary confusion.
You could convert it to a JSON object, and then everything will have quotes.
This does it. There should/may be a groovier way though.
def listOfMaps = [[name: 'hello', email: ['on', 'off'] ], [ name: 'bye', email: ['abc', 'xyz']]]
def result = [:]
listOfMaps.each { map ->
    def list = map.collect { k, v ->
        v
    }
    result[list[0]] = ["'${list[1][0]}'", "'${list[1][1]}'"]
}
println result