Elasticsearch in Django - sort alphabetically

I have the following document:
# brand.doc_type
class BrandDocument(DocType):
    class Meta:
        model = Brand
    id = IntegerField()
    name = StringField(
        fields={
            'raw': {
                'type': 'keyword',
                'fielddata': True,
            }
        },
    )
    lookup_name = StringField(
        fields={
            'raw': {
                'type': 'string',
            }
        },
    )
and I try to run a sorted search like this:
BrandDocument.search().sort({
    'name.keyword': order,
})
The problem is that the results are sorted case-sensitively, which means that instead of 'a', 'A', 'ab', 'AB' I get 'A', 'AB', 'a', 'ab'. How can this be fixed?
EDIT: After some additional searching, I've come up with something like this:
lowercase_normalizer = normalizer(
    'lowercase_normalizer',
    filter=['lowercase']
)
lowercase_analyzer = analyzer(
    'lowercase_analyzer',
    tokenizer="keyword",
    filter=['lowercase'],
)
# brand.doc_type
class BrandDocument(DocType):
    class Meta:
        model = Brand
    id = IntegerField()
    name = StringField(
        analyzer=lowercase_analyzer,
        fields={
            'raw': Keyword(normalizer=lowercase_normalizer, fielddata=True),
        },
    )
The issue persists, however, and I can't find in the docs how this normalizer should be used.

I would suggest creating a custom analyzer with a lowercase filter and applying it to the field at index time.
To do so, update the index settings as follows:
{
  "index": {
    "analysis": {
      "analyzer": {
        "custom_sort": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}
Add a field (the one you need to sort on) to the mapping with the custom_sort analyzer, as below. Sorting on a text field also requires fielddata to be enabled on it:
{
  "properties": {
    "sortField": {
      "type": "text",
      "analyzer": "custom_sort",
      "fielddata": true
    }
  }
}
If the field already exists in the mapping, you can instead add a sub-field with the analyzer to the existing field (with fielddata enabled so that the sub-field can be sorted on).
Assuming a field called name of type keyword already exists, update it as:
{
  "properties": {
    "name": {
      "type": "keyword",
      "fields": {
        "sortval": {
          "type": "text",
          "analyzer": "custom_sort",
          "fielddata": true
        }
      }
    }
  }
}
Once that is done, you need to reindex your data so that the lowercase values are indexed. Then you can sort on the field as follows:
Case 1 (new field):
"sort": [
  {
    "sortField": "desc"
  }
]
Case 2 (sub-field):
"sort": [
  {
    "name.sortval": "desc"
  }
]
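To see why sorting on a lowercased copy of the value changes the ordering, here is a small plain-Python sketch (it does not talk to Elasticsearch). It assumes the custom_sort analyzer behaves as described above: the keyword tokenizer keeps the whole value as a single token and the lowercase filter lowercases it. The function name custom_sort_token and the sample names are made up for illustration.

```python
def custom_sort_token(value: str) -> str:
    """Mimic keyword tokenizer + lowercase filter: one token, lowercased."""
    return value.lower()


names = ["ab", "A", "AB", "a"]

# Sorting on the raw values compares code points, so uppercase sorts first.
raw_order = sorted(names)

# Sorting on the lowercased token ignores case, which is the ordering the
# field analyzed with the lowercase filter is meant to provide.
normalized_order = sorted(names, key=custom_sort_token)

print(raw_order)
print(normalized_order)
```

The raw ordering reproduces the 'A', 'AB', 'a', 'ab' result from the question, while the lowercased key groups 'a' with 'A' and 'ab' with 'AB'.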

Related

How can I get distinct values for the area.names using graphene?

My resolver in schema.py looks like this:
def resolve_areas(self, info, **kwargs):
    result = []
    dupfree = []
    user = info.context.user
    areas = BoxModel.objects.filter(client=user, active=True).values_list('area_string', flat=True)
In GraphiQL I am using this query:
{
  areas {
    edges {
      node {
        id
        name
      }
    }
  }
}
and get output that starts like this:
{
  "data": {
    "areas": {
      "edges": [
        {
          "node": {
            "id": "QXJlYTpkZWZ",
            "name": "default"
          }
        },
        {
          "node": {
            "id": "QXJlYTptZXN",
            "name": "messe"
          }
        },
        {
          "node": {
            "id": "QXJlYTptZXN",
            "name": "messe"
          }
        },
But I want distinct values for the name field.
(I am using a MySQL database, so distinct does not work.)
SOLVED:
distinct was not working, so I just wrote a short loop that keeps track of the names already seen in a list and only appends an Area object if its name has not been added to that list yet:
result = []
dupl_counter = []
for area in areas:
    if area not in dupl_counter:
        dupl_counter.append(area)
        result.append(Area(name=area))
    print(area)
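The loop above can also be written with a set for the membership test, which avoids rescanning a list for every name. This is only a sketch of the same idea on plain strings; the helper name unique_in_order and the sample values are hypothetical, and in the resolver each remaining name would still be wrapped as Area(name=...) as in the loop above.

```python
def unique_in_order(values):
    """Return the values with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Sample area names, standing in for the values_list(...) result.
area_names = ["default", "messe", "messe", "default"]
print(unique_in_order(area_names))
```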

How to stop Graphql + Django-Filters to return All Objects when Filter String is Empty?

Using:
Django 3.x [ Django-Filters 2.2.0, graphene-django 2.8.0, graphql-relay 2.0.1 ]
Vue 2.x [ Vue-Apollo ]
I have a simple Birds Django model with fields like name and habitat, and I applied different filters to these fields, such as icontains or iexact. My goal was to add a simple search field to my frontend (Vue). So far it works, but whenever the filter value is empty or contains only blanks (see Example 3), GraphQL returns all objects.
My first approach was to handle this in the frontend with some logic on my input value, for example sending isnull=true when the string is empty or blank. But then I thought that Django should handle this in the first place.
I guess this issue relates to my filters (see relay_schema.py in the Django section below); in other words, do I have to apply some kind of logic to these filters?
At the moment I am trying to customize a filterset_class, but that feels like too much; maybe I missed something? So I am asking here in case someone has some hints. My question is:
How to stop Graphql + Django-Filters to return All Objects when Filter String is Empty?
GraphiQL IDE
Example 1
query {
  birdsNodeFilter(name_Iexact: "finch") {
    edges {
      node {
        id
        name
      }
    }
  }
}
Returns
{
  "data": {
    "birdsNodeFilter": {
      "edges": [
        {
          "node": {
            "id": "QmlyZHNOb2RlOjE=",
            "name": "Finch",
            "habitat": "Europe"
          }
        }
      ]
    }
  }
}
Fine for me!
Example 2
query {
  birdsNodeFilter(name_Iexact: "Unicorns") {
    edges {
      node {
        id
        name
        habitat
      }
    }
  }
}
Returns
{
  "data": {
    "birdsNodeFilter": {
      "edges": []
    }
  }
}
No Unicorns there - good
Example 3
query {
  birdsNodeFilter(name_Iexact: "") {
    edges {
      node {
        id
        name
      }
    }
  }
}
Returns
{
  "data": {
    "birdsNodeFilter": {
      "edges": [
        {
          "node": {
            "id": "QmlyZHNOb2RlOjE=",
            "name": "Finch",
            "habitat": "Europe"
          }
        },
        {
          "node": {
            "id": "QmlyZHNOb2RlOjI=",
            "name": "Bald Eagle",
            "habitat": "USA"
          }
        },
        <...And so on...>
Not fine for me!
Django
relay_schema.py
class BirdsNode(DjangoObjectType):
    class Meta:
        model = Birds
        filter_fields = {
            'id': ['iexact'],
            'name': ['iexact', 'icontains', 'istartswith', 'isnull'],
            'habitat': ['iexact', 'icontains', 'istartswith'],
        }
        interfaces = (relay.Node, )
class BirdQuery(graphene.ObjectType):
    birdConNode = relay.Node.Field(BirdsNode)
    birdsNodeFilter = DjangoFilterConnectionField(BirdsNode)

This is my solution, which worked in GraphiQL and in my Vue frontend.
I added logic to Birds2Query, in resolve_all_birds2, for each filter (for testing purposes, not for all filters).
Besides that, I also added an ExtendedConnection for counting.
Note: I changed the class names from my original question.
Update: This solution works on the Python side. But the Apollo client also provides the Apollo manager, also known as Dollar Apollo; there you can use the this.$apollo.queries.tags.skip property as a static or dynamic way to start and stop queries.
relay_schema.py
from graphene import Connection, Int, ObjectType, relay
from graphene_django import DjangoObjectType
from graphene_django.filter import DjangoFilterConnectionField
from .models import Birds  # model location assumed
class ExtendedConnection(Connection):
    class Meta:
        abstract = True
    total_count = Int()
    edge_count = Int()
    name_check = ""
    def resolve_total_count(root, info, **kwargs):
        return root.length
    def resolve_edge_count(root, info, **kwargs):
        return len(root.edges)
class Birds2Node(DjangoObjectType):
    class Meta:
        model = Birds
        filter_fields = {
            'id': ['exact', 'icontains'],
            'name': ['exact', 'icontains', 'istartswith', 'iendswith'],
        }
        interfaces = (relay.Node, )
        connection_class = ExtendedConnection
class Birds2Query(ObjectType):
    birds2 = relay.Node.Field(Birds2Node)
    all_birds2 = DjangoFilterConnectionField(Birds2Node)
    def resolve_all_birds2(self, info, **kwargs):
        # Check each text filter for an empty or whitespace-only value
        # before a queryset is returned.
        if 'name__icontains' in kwargs:
            name_icontains = kwargs['name__icontains']
            if not name_icontains.strip():  # empty or only blanks
                return Birds.objects.filter(name=None)
        if 'name__istartswith' in kwargs:
            name_istartswith = kwargs['name__istartswith']
            if not name_istartswith.strip():  # empty or only blanks
                return Birds.objects.filter(name=None)
        # No blank filter value found: nothing to override here.
        return
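The two nearly identical checks in the resolver above could be folded into one helper if more filters need the same treatment. The following is a sketch of that check on a plain dict of keyword arguments; has_blank_filter and TEXT_FILTERS are hypothetical names, not part of graphene-django, and the resolver would still decide which queryset to return when the helper reports a blank value.

```python
def has_blank_filter(filter_kwargs, keys):
    """True if any of the given filter keys is present but empty or only whitespace."""
    for key in keys:
        value = filter_kwargs.get(key)
        if isinstance(value, str) and not value.strip():
            return True
    return False


# Filter argument names as they arrive in the resolver's kwargs.
TEXT_FILTERS = ("name__icontains", "name__istartswith")

print(has_blank_filter({"name__icontains": "   "}, TEXT_FILTERS))  # True
print(has_blank_filter({"name__icontains": " f "}, TEXT_FILTERS))  # False
print(has_blank_filter({}, TEXT_FILTERS))                          # False
```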
GraphiQL
Example 1
query {
  allBirds2(name_Icontains: "") {
    totalCount
    edgeCount
    edges {
      node {
        id
        name
        habitat
      }
    }
  }
}
It no longer returns all objects when the filter is blank:
{
  "data": {
    "allBirds2": {
      "totalCount": 0,
      "edgeCount": 0,
      "edges": []
    }
  }
}
Example 2 with blanks and one letter
query {
  allBirds2(name_Icontains: " f ") {
    totalCount
    edgeCount
    edges {
      node {
        id
        name
        habitat
      }
    }
  }
}
Returns exactly what I wanted:
{
  "data": {
    "allBirds2": {
      "totalCount": 1,
      "edgeCount": 1,
      "edges": [
        {
          "node": {
            "id": "QmlyZHMyTm9kZTox",
            "name": "Finch",
            "habitat": "Europe"
          }
        }
      ]
    }
  }
}

Flatten json return by DRF

My API currently returns JSON in the format shown below.
I want it to return the JSON with the namingzone key decomposed, as specified below.
Could anyone tell me how I can revise the serializer to achieve this?
serializer.py is also shown below.
For models.py and views.py, please refer to my previous post.
Current
{
  "zone": {
    "zone": "office_enclosed",
    "namingzone": [
      {
        "naming": "moffice"
      }
    ]
  },
  "lpd": 11.9,
  "sensor": true
},
{
  "zone": {
    "zone": "office_open",
    "namingzone": [
      {
        "naming": "off"
      },
      {
        "naming": "office"
      }
    ]
  },
  "lpd": 10.5,
  "sensor": true
}
Target
{
  "zone": "office_enclosed",
  "naming": "moffice",
  "lpd": 11.9,
  "sensor": true
},
{
  "zone": "office_open",
  "naming": "off",
  "lpd": 10.5,
  "sensor": true
},
{
  "zone": "office_open",
  "naming": "office",
  "lpd": 10.5,
  "sensor": true
}
serializer.py
class namingNewSerializer(serializers.ModelSerializer):
    class Meta:
        model = Naming
        fields = ('naming',)
class zoneSerializer(serializers.ModelSerializer):
    namingzone = namingNewSerializer(many=True)
    class Meta:
        model = Zone
        fields = ('zone', 'namingzone')
class lightSerializer(serializers.ModelSerializer):
    zone = zoneSerializer()
    class Meta:
        model = Light
        fields = ('zone', 'lpd', 'sensor')
class namingSerializer(serializers.ModelSerializer):
    zone = zoneSerializer()
    class Meta:
        model = Naming
        fields = ('zone', 'naming')

I would say that doing this in the serializer might complicate the implementation. Instead, you can take a Pythonic approach and reshape the serialized data in the view. Try something like this:
class SomeView(APIView):
    ...
    def get(self, request, *args, **kwargs):
        data = lightSerializer(Light.objects.all(), many=True).data
        data = list(data)  # convert lazy object to list
        updated_data = list()
        for item in data:
            newdict = dict()
            zone = item['zone']
            newdict.update({'zone': zone['zone'], 'lpd': item['lpd'], 'sensor': item['sensor']})
            # Emit one flat entry per naming, merged with the shared zone values.
            for naming_zone in zone.get('namingzone'):
                naming_zone.update(newdict)
                updated_data.append(naming_zone)
        return Response(updated_data, status=status.HTTP_200_OK)
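The reshaping step in the view can be tried out on plain dictionaries, without Django or DRF. The sketch below applies the same idea to the sample payload from the question; flatten_lights is a hypothetical helper name, and it builds new dictionaries rather than updating the nested ones in place.

```python
def flatten_lights(lights):
    """Produce one flat row per naming entry of each light's zone."""
    rows = []
    for item in lights:
        zone = item["zone"]
        for naming in zone.get("namingzone", []):
            rows.append({
                "zone": zone["zone"],
                "naming": naming["naming"],
                "lpd": item["lpd"],
                "sensor": item["sensor"],
            })
    return rows


# The "Current" payload from the question, as Python data.
current = [
    {"zone": {"zone": "office_enclosed", "namingzone": [{"naming": "moffice"}]},
     "lpd": 11.9, "sensor": True},
    {"zone": {"zone": "office_open",
              "namingzone": [{"naming": "off"}, {"naming": "office"}]},
     "lpd": 10.5, "sensor": True},
]

for row in flatten_lights(current):
    print(row)
```

A zone with no naming entries produces no rows in this sketch, which matches the behaviour of the loop in the view.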

See the DRF fields documentation on the source argument. It should help you.
https://www.django-rest-framework.org/api-guide/fields/#source

Django graphql graphene removing redundant queries from return value

Over the last few days I have read so much about GraphQL that I can't see the forest for the trees anymore.
The results that this person got in the beginning are almost exactly what I want (his problem, not his solution), but it seems that a lot of the code is deprecated and I can't get it working: link
I have a bunch of containers that I return. All the containers have the amounts for every day in them. I only want to return the amounts for a certain day.
At the moment, I do return these results (day), but all the other results (days) are also returned with a null value.
Current behavior:
{
  "data": {
    "listProductcontainers": [
      {
        "id": "1",
        "productid": {
          "productid": "CBG2",
          "processedstockamountsSet": [
            {
              "timeStampID": {
                "id": "2"
              },
              "id": "77745",
              "prodName": {
                "productid": "CBG2"
              }
            },
            {
              "timeStampID": null, <--------
              "id": "89645",
              "prodName": {
                "productid": "CBG2"
              }
            },
            {
              "timeStampID": null, <--------
              "id": "89848",
              "prodName": {
                "productid": "CBG2"
              }
            },
            // ...
Requested behavior (entries whose timeStampID is null should not be returned):
{
  "data": {
    "listProductcontainers": [
      {
        "id": "1",
        "productid": {
          "productid": "CBG2",
          "processedstockamountsSet": [
            {
              "timeStampID": {
                "id": "2"
              }
The query that I am running looks like this:
query {
  listProductcontainers {
    id
    productid {
      productid
      processedstockamountsSet {
        timeStampID(id: 2) {
          id
        }
        id
        prodName {
          productid
        }
      }
    }
  }
}
Here is the relevant code for the results:
class TimeStampType(DjangoObjectType):
    class Meta:
        model = TimeStamp
class ProcessedStockAmountsType(DjangoObjectType):
    timeStampID = graphene.Field(TimeStampType, id=graphene.Int())
    class Meta:
        model = ProcessedStockAmounts
    def resolve_timeStampID(self, info, **kwargs):
        id = kwargs.get('id')
        if self.timeStampID.id == id:
            return self.timeStampID
class ProductcontainersType(DjangoObjectType):
    class Meta:
        model = Productcontainers
class ProductlistType(DjangoObjectType):
    class Meta:
        model = Productlist
class Query(graphene.ObjectType):
    list_productcontainers = graphene.List(ProductcontainersType)
    def resolve_list_productcontainers(self, context, **kwargs):
        return Productcontainers.objects.all()
I have read almost everything in graphene by now, but if you even have a link that mirrors what I want to do I would really appreciate it.
My final option is to make two calls where I get all the container ids, and another call where I get all the amounts (with container id) for a certain date, and with 2 for loops I just add the amounts into the corresponding container... :(

Search for a mix of numbers and chars with haystack (elasticsearch)

I am using Django Haystack with Elasticsearch. I have a string field called 'code' in this type of format:
76-010
I would like to be able to search
76-
And get as a result
76-111
76-110
76-210
...
and so on.
but I don't want to get these results:
11-760
11-076
...
I already have a custom Elasticsearch backend, but I am not sure how I should index the field to get the desired behavior.
class ConfigurableElasticBackend(ElasticsearchSearchBackend):
    def __init__(self, connection_alias, **connection_options):
        # see http://stackoverflow.com/questions/13636419/elasticsearch-edgengrams-and-numbers
        self.DEFAULT_SETTINGS['settings']['analysis']['analyzer']['edgengram_analyzer']['tokenizer'] = 'standard'
        self.DEFAULT_SETTINGS['settings']['analysis']['analyzer']['edgengram_analyzer']['filter'].append('lowercase')
        super(ConfigurableElasticBackend, self).__init__(connection_alias, **connection_options)

The idea is to use an edgeNGram tokenizer in order to index every prefix of your code field. For instance, we would like 76-111 to be indexed as 7, 76, 76-, 76-1, 76-11 and 76-111. That way you will find 76-111 by searching for any of its prefixes.
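To make the prefix idea concrete, here is a plain-Python sketch of the tokens an edge n-gram tokenizer would be expected to produce for one code. It is only an illustration: edge_ngrams is a hypothetical helper, not an Elasticsearch or Haystack function, and its min_gram and max_gram parameters are meant to mirror the tokenizer settings of the same names.

```python
def edge_ngrams(text, min_gram=1, max_gram=25):
    """Return the prefixes of text whose length is between min_gram and max_gram."""
    upper = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("76-111"))
# A prefix query such as "76-" is among the tokens of 76-111 ...
print("76-" in edge_ngrams("76-111"))
# ... but not among the tokens of 11-760, which only contains "76" in the middle.
print("76-" in edge_ngrams("11-760"))
```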
Note that this article provides a full-fledged solution to your problem. The index settings for your case would look like this in Django code. You can then follow that article to wrap it up, but this should get you started.
import pyelasticsearch
from haystack.backends.elasticsearch_backend import ElasticsearchSearchBackend
class ConfigurableElasticBackend(ElasticsearchSearchBackend):
    DEFAULT_SETTINGS = {
        "settings": {
            "analysis": {
                "analyzer": {
                    "edgengram_analyzer": {
                        "tokenizer": "edgengram_tokenizer",
                        "filter": ["lowercase"]
                    }
                },
                "tokenizer": {
                    "edgengram_tokenizer": {
                        "type": "edgeNGram",
                        "min_gram": "1",
                        "max_gram": "25"
                    }
                }
            }
        },
        "mappings": {
            "your_type": {
                "properties": {
                    "code": {
                        "type": "string",
                        "analyzer": "edgengram_analyzer"
                    }
                }
            }
        }
    }
    def __init__(self, connection_alias, **connection_options):
        super(ConfigurableElasticBackend, self).__init__(connection_alias, **connection_options)
        self.conn = pyelasticsearch.ElasticSearch(connection_options['URL'], timeout=self.timeout)
        self.index_name = connection_options['INDEX_NAME']
        # create the index with the above settings
        self.conn.create_index(self.index_name, self.DEFAULT_SETTINGS)
