Partial Text Matching GAE - python-2.7

I am developing a web application for managing customers. So I have a Customer entity which is made up by usual fields such as first_name, last_name, age etc.
I have a page where these customers are shown as a table. On the same page I have a search field, and I'd like to filter customers and update the table via Ajax while the user is typing in the search field.
Here is how it should work:
Figure 1: The main page showing all of the customers:
Figure 2: As soon as the user types the letter "b", the table is updated with the results:
Given that partial text matching is not supported in GAE, I have worked around it, building on what is shown here:
TL;DR: I have created a Customers Index that contains a Search Document for every customer (doc_id=customer_key). Each Search Document contains Atom Fields for every customer field I want to be able to search on (e.g. first_name, last_name). Each field is built like this: if the last_name is Berlusconi, the field is made up of the Atom Fields "b" "be" "ber" "berl" "berlu" "berlus" "berlusc" "berlusco" "berluscon" "berlusconi".
In this way I am able to perform full text matching in a way that resembles partial text matching. If I search for "Be", the Berlusconi customer is returned.
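To make the prefix idea concrete, here is a minimal sketch in plain Python. The names `prefixes` and `prefix_tokens` are illustrative only and are not part of the Search API; this version generates prefixes per word, which is an assumption about how one might want the tokens built, not a description of the code further down.

```python
def prefixes(word):
    """Return every leading substring of `word`, shortest first, lowercased."""
    word = word.lower()
    return [word[:i] for i in range(1, len(word) + 1)]


def prefix_tokens(text):
    """Build one space-separated string holding the prefixes of every word in `text`."""
    tokens = []
    for word in text.split():
        tokens.extend(prefixes(word))
    return " ".join(tokens)


# Hypothetical usage: the resulting string could be stored as a field value.
berlusconi_prefixes = prefixes("Berlusconi")
full_name_value = prefix_tokens("Silvio Berlusconi")
```

With tokens built per word, a query such as "be" can match the last name even when the first name comes first in the stored value.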
The search is made by Ajax calls: whenever a user types in the search field (the Ajax call is delayed a little to see whether the user keeps typing, to avoid sending a burst of requests), an Ajax call is made with the query string, and a JSON object is returned.
Now, things were working well in debugging, but I was testing with only a few people in the datastore. As soon as I added many people, search became very slow.
This is how I create search documents. This is called every time a new customer is put to the datastore.
def put_search_document(cls, key):
    """
    Called by _post_put_hook in BaseModel
    """
    model = key.get()
    _fields = []
    if model:
        _fields.append(search.AtomField(name="empty", value=""),)  # to retrieve customers when no query string
        _fields.append(search.TextField(name="sort1", value=model.last_name.lower()))
        _fields.append(search.TextField(name="sort2", value=model.first_name.lower()))
        _fields.append(search.TextField(name="full_name", value=Customer.tokenize1(
            model.first_name.lower()+" "+model.last_name.lower()
        )),)
        _fields.append(search.TextField(name="full_name_rev", value=Customer.tokenize1(
            model.last_name.lower()+" "+model.first_name.lower()
        )),)
        # _fields.append(search.TextField(name="telephone", value=Customer.tokenize1(
        #     model.telephone.lower()
        # )),)
        # _fields.append(search.TextField(name="email", value=Customer.tokenize1(
        #     model.email.lower()
        # )),)
        document = search.Document(  # create new document with doc_id=key.urlsafe()
            doc_id=key.urlsafe(),
            fields=_fields)
        index = search.Index(name=cls._get_kind()+"Index")  # not in try-except: defer will catch and retry.
        index.put(document)

@staticmethod
def tokenize1(string):
    s = ""
    for i in range(len(string)):
        if i > 0:
            s = s + " " + string[0:i+1]
        else:
            s = string[0:i+1]
    return s
This is the search code:
@staticmethod
def search(ndb_model, query_phrase):
    # TODO: search returns a limited number of results(20 by default)
    # (See Search Results at https://cloud.google.com/appengine/docs/python/search/#Python_Overview)
    sort1 = search.SortExpression(expression='sort1', direction=search.SortExpression.ASCENDING,
                                  default_value="")
    sort2 = search.SortExpression(expression='sort2', direction=search.SortExpression.ASCENDING,
                                  default_value="")
    sort_opt = search.SortOptions(expressions=[sort1, sort2])
    results = search.Index(name=ndb_model._get_kind() + "Index").search(
        search.Query(
            query_string=query_phrase,
            options=search.QueryOptions(
                sort_options=sort_opt
            )
        )
    )
    print "----------------"
    res_list = []
    for r in results:
        obj = ndb.Key(urlsafe=r.doc_id).get()
        print obj.first_name + " "+obj.last_name
        res_list.append(obj)
    return res_list
Has anyone else had the same experience? If so, how did you solve it?
Thank you guys very much,
Marco Galassi
EDIT: names, email, phone are obviously totally invented.
Edit2: I have now moved to TextField, which looks a little faster, but the problem still persists.

Related

Writing a view to fetch user data and add it to the database

I am making a project which generates a random book when the user presses a button. The view that does that is:
def random_book(request):
    cartile = Books.objects.all()
    random_item = random.choice(cartile)
    return render(request, 'carti/book.html', context={"random_item": random_item})
On the page where you are redirected to see the generated random book, I want to add a button that says "add to read later".
I know how to make a many-to-many relationship between the profile model and the books model, but I have no idea how to write a view that gets the generated random book, adds it to the profile's view_later field in the database, and deletes it with another button, nor which HTML tag to use and what to write inside it for the "add to read later" and delete buttons.
Some help would be appreciated!
I guess that depends on the flow of your website.
If I interpret your idea correctly, one simple way would be to have a view that takes the id of the selected book and adds it to the read-later collection (in the case of the add).
Making a lot of assumptions, something like this should cover the basics:
urlpatterns = [
    ...
    path('site/read_later_add/<int:book_pk>/', views.read_later_add),
]

# View (in site/views.py)
def read_later_add(request, book_pk=1):
    # get user profile
    profile = Profile.objects.get(user=request.user)
    # get the book
    book = Book.objects.get(pk=book_pk)
    # add book to view_later collection
    profile.view_later.add(book)  # <- this is it
    # give some success message
    return render(request, 'read_later_add_success.html')
The remove case would be similar just using profile.view_later.remove(book).
BTW, I did not add any error handling or checks; you should definitely do that.
If I understand your problem correctly, you can get the random book by implementing a function that returns a random number between 1 and the number of rows in the Books table, and passing the integer it returns to Books.objects.get(pk=random_int).

How to enter a list in WTForms?

What I'm trying to do
I'm trying to enter a list of tags in Flask that should be passed along as a list, but I can't figure out how to do it in Flask, nor can I find documentation on adding lists (of strings) in flask_wtf. Does anyone have experience with this?
Ideally I would like the tags to be selectively delete-able, after you entered them. So that you could enter.
The problem
Thus far my form is static. You enter stuff, hit submit, it gets processed into a .json. The tags list is the last element I can't figure out. I don't even know if flask can do this.
A little demo of how I envisioned the entry process:
How I envisioned the entry process:
The current tags are displayed and an entry field to add new ones.
[Tag1](x) | [Tag2](x)
Enter new Tag: [______] (add)
Hit (add)
[Tag1](x) | [Tag2](x)
Enter new Tag: [Tag3__] (add)
New Tag is added
[Tag1](x) | [Tag2](x) | [Tag3](x)
Enter new Tag: [______]
How I envisioned the deletion process:
Hitting the (x) on the side of the tag should kill it.
[Tag1](x) | [Tag2](x) | [Tag3](x)
Hit (x) on Tag2. Result:
[Tag1](x) | [Tag3](x)
The deletion is kind of icing on the cake and could probably be done, once I have a list I can edit, but getting there seems quite hard.
I'm at a loss here.
I basically want to know if it's possible to enter lists in general, since there does not seem to be documentation on the topic.
Your description is not really clear (is Tag1 the key in the JSON or is it Tag the key, and 1 the index?)
But I had a similar issue recently, where I wanted to submit a basic list in JSON and let WTForms handle it properly.
For instance, this:
{
"name": "John",
"tags": ["code", "python", "flask", "wtforms"]
}
So, I had to rewrite the way FieldList works because WTForms, for some reason, wants a list as "tags-1=XXX,tags-2=xxx".
from wtforms import FieldList

class JSONFieldList(FieldList):
    def process(self, formdata, data=None):
        self.entries = []
        if data is None or not data:
            try:
                data = self.default()
            except TypeError:
                data = self.default
        self.object_data = data
        if formdata:
            for (index, obj_data) in enumerate(formdata.getlist(self.name)):
                self._add_entry(formdata, obj_data, index=index)
        else:
            for obj_data in data:
                self._add_entry(formdata, obj_data)
        while len(self.entries) < self.min_entries:
            self._add_entry(formdata)

    def _add_entry(self, formdata=None, data=None, index=None):
        assert not self.max_entries or len(self.entries) < self.max_entries, \
            'You cannot have more than max_entries entries in this FieldList'
        if index is None:
            index = self.last_index + 1
        self.last_index = index
        name = '%s-%d' % (self.short_name, index)
        id = '%s-%d' % (self.id, index)
        field = self.unbound_field.bind(form=None, name=name, id=id, prefix=self._prefix, _meta=self.meta,
                                        translations=self._translations)
        field.process(formdata, data)
        self.entries.append(field)
        return field
On Flask's end to handle the form:
from flask import request
from werkzeug.datastructures import ImmutableMultiDict

@app.route('/add', methods=['POST'])
def add():
    form = MonitorForm(ImmutableMultiDict(request.get_json()))
    # process the form, form.tags.data is a list
And the form (notice the use of JSONFieldList):
class MonitorForm(BaseForm):
    name = StringField(validators=[validators.DataRequired(), validators.Length(min=3, max=5)], filters=[lambda x: x or None])
    tags = JSONFieldList(StringField(validators=[validators.DataRequired(), validators.Length(min=1, max=250)], filters=[lambda x: x or None]), validators=[Optional()])
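As a possible alternative to subclassing FieldList, one could reshape the JSON payload into the indexed keys ("tags-0", "tags-1", ...) that the stock FieldList looks for. The sketch below is plain Python and only does the reshaping; `to_fieldlist_keys` is a hypothetical helper name, and it assumes the default "-" separator. The result would still have to be wrapped in a multidict (for example werkzeug's `MultiDict`) before being handed to the form.

```python
def to_fieldlist_keys(payload):
    """Expand list values into indexed keys such as "tags-0", "tags-1"."""
    flat = {}
    for key, value in payload.items():
        if isinstance(value, (list, tuple)):
            for index, item in enumerate(value):
                flat["%s-%d" % (key, index)] = item
        else:
            flat[key] = value  # scalar values are passed through unchanged
    return flat


payload = {"name": "John", "tags": ["code", "python", "flask", "wtforms"]}
flat = to_fieldlist_keys(payload)
```

This keeps the form definition on the standard FieldList, at the cost of an extra conversion step in the view.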
I found a viable solution in this 2015 book, where a tagging system is built for Flask as part of a blog-building exercise.
It's based on Flask_SQLAlchemy.
Entering lists is therefore possible with WTForms / Flask by submitting the items to the database via, e.g., FieldList and, in the use case of a tagging system, reading them back from the database to render them in the UI.
If, however, you don't want to deal with O'Reilly's paywall (I'm sorry, I can't post copyrighted material here) and all you want is a solution to add tags, check out taggle.js by Sean Coker. It's JavaScript rather than Flask, but it does the job.

Django - Search matches with all objects - even if they don't actually match

This is the model that has to be searched:
class BlockQuote(models.Model):
    debate = models.ForeignKey(Debate, related_name='quotes')
    speaker = models.ForeignKey(Speaker, related_name='quotes')
    text = models.TextField()
I have around a thousand instances on the database on my laptop (with around 50000 on the production server)
I am creating a 'manage.py' command that will search through the database and return all 'BlockQuote' objects whose text field contains the keyword.
I am doing this with Django's (1.11) Postgres search options in order to use the 'rank' attribute, which sounds like something that would come in handy. I used the official Django full-text search documentation for the code below.
Yet when I run this code, it matches all objects, regardless of whether BlockQuote.text actually contains the query.
def handle(self, *args, **options):
    vector = SearchVector('text')
    query = options['query'][0]
    Search_Instance = Search_Instance.objects.create(query=query)
    set = BlockQuote.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
    for result in set:
        match = QueryMatch.objects.create(quote=result, query=Search_Instance)
        match.save()
Does anyone have an idea of what I am doing wrong?
I don't see you actually filtering ever.
BlockQuote.objects.annotate(...).filter(rank__gte=0.5)

Django - Update or create syntax assistance (error)

I've followed the guide in the queryset documentation (https://docs.djangoproject.com/en/1.10/ref/models/querysets/#update-or-create) but I think I'm getting something wrong.
My script checks an inbox for maintenance emails from our ISP, then sends a calendar invite to subscribers and adds the maintenance to the database.
Sometimes we get updates on already planned maintenance, in which case I need to update the database with the new date and time. So I'm trying to use "update or create" on the queryset, using the ref no. from the email to update or create the record.
#Maintenance
if sender.lower() == 'maintenance@isp.com':
    print 'Found maintenance in mail: {0}'.format(subject)
    content = Message.getBody(mail)
    postcodes = re.findall(r"[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}", content)
    if postcodes:
        print 'Found Postcodes'
    else:
        error_body = """
        Email titled: {0}
        With content: {1}
        Failed processing, could not find any postcodes in the email
        """.format(subject,content)
        SendMail(authentication,site_admins,'Unprocessed Email',error_body)
        Message.markAsRead(mail)
        continue
    times = re.findall("\d{2}/\d{2}/\d{4} \d{2}:\d{2}", content)
    if times:
        print 'Found event Times'
        e_start_time = datetime.strftime(datetime.strptime(times[0], "%d/%m/%Y %H:%M"),"%Y-%m-%dT%H:%M:%SZ")
        e_end_time = datetime.strftime(datetime.strptime(times[1], "%d/%m/%Y %H:%M"),"%Y-%m-%dT%H:%M:%SZ")
        subscribers = []
        clauses = (Q(site_data__address__icontains=p) for p in postcodes)
        query = reduce(operator.or_, clauses)
        sites = Circuits.objects.filter(query).filter(circuit_type='MPLS', provider='KCOM')
        subject_text = "Maintenance: "
        m_ref = re.search('\[(.*?)\]',subject).group(1)
        if not len(sites):
            #try use first part of postcode
            h_pcode = postcodes[0].split(' ')
            sites = Circuits.objects.filter(site_data__postcode__startswith=h_pcode[0]).filter(circuit_type='MPLS', provider='KCOM')
        if not len(sites):
            #still cant find a site, send error
            error_body = """
            Email titled: {0}
            With content: {1}
            I have found a postcode, but could not find any matching sites to assign this maintenance too, therefore no meeting has been sent
            """.format(subject,content)
            SendMail(authentication,site_admins,'Unprocessed Email',error_body)
            Message.markAsRead(mail)
            continue
        else:
            #have site(s) send an invite and create record
            for s in sites:
                #create record in circuit maintenance
                maint = CircuitMaintenance(
                    circuit = s,
                    ref = m_ref,
                    start_time = e_start_time,
                    end_time = e_end_time,
                    notes = content
                )
                maint, CircuitMaintenance.objects.update_or_create(ref=m_ref)
                #create subscribers for maintenance
m_ref is the unique field that will match the update, but every time I run this in tests I get
sites_circuitmaintenance.start_time may not be NULL
but I've set it?
If you want to update certain fields provided that a record with certain values exists, you need to explicitly provide the defaults as well as the field names.
Your code should look like this:
CircuitMaintenance.objects.update_or_create(
    defaults={'circuit': s, 'start_time': e_start_time, 'end_time': e_end_time, 'notes': content},
    ref=m_ref)
The particular error you are seeing occurs because update_or_create is creating an object, since one with ref=m_ref does not exist, but you are not passing in values for all the NOT NULL fields. The above code will fix that.
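To illustrate the split between lookup arguments and defaults, here is a toy model in plain Python. It is not Django's implementation: a list of dicts stands in for the table, and the ref "M123" and the timestamps are made-up values for illustration.

```python
def update_or_create(table, defaults=None, **lookup):
    """Toy version of the lookup/defaults behaviour; `table` is a list of dict rows."""
    defaults = defaults or {}
    for row in table:
        if all(row.get(key) == value for key, value in lookup.items()):
            row.update(defaults)   # match found: only the defaults are written
            return row, False
    row = dict(lookup)             # no match: new row built from lookup + defaults
    row.update(defaults)
    table.append(row)
    return row, True


table = []
first, created_first = update_or_create(
    table, defaults={"start_time": "2017-01-01T10:00:00Z"}, ref="M123")
second, created_second = update_or_create(
    table, defaults={"start_time": "2017-02-01T10:00:00Z"}, ref="M123")
```

The point of the sketch is that a newly created row only ever contains what was passed as lookup arguments or defaults, which is why a required column left out of both ends up NULL.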

Django ModelForm Validate custom Autocomplete for M2M, instead of ugly Multi-Select

Given the following models (cut down for understanding):
class Venue(models.Model):
    name = models.CharField(unique=True)

class Band(models.Model):
    name = models.CharField(unique=True)

class Event(models.Model):
    name = models.CharField(max_length=50, unique=True)
    bands = models.ManyToManyField(Band)
    venue = models.ForeignKey(Venue)
    start = models.DateField()
    end = models.DateField()
The admin area works great for what I'm doing, but I'd like to open the site up a bit so that certain users can add new Events. For the public portions, I have several "administrative" fields on these models that I don't want the public to see (which is easy enough to fix).
My specific problem, though, is changing the display of the ManyToMany selections when creating a new Event. Because the number of Bands that could be listed for an event is too large to send along as a multiselect box, I'd like to use an AutoComplete that handles multiples (like the Tags box, here on StackOverflow!).
I have this part working, and it correctly fills in a hidden input with the Band.id's separated by commas for a value. However, I can't understand how to put together letting Django do the validation using the ModelForms, and somehow also validating the 'Bands' selection.
Ideally, I want to auto-complete like the tags here on StackOverflow, and send along the selected Bands ID's in some kind of Delimited string - all while letting Django validate that the bands passed exist, etc, as if I left the annoying multi-select list in place.
Do I have to create my own Auto-Complete Field type for a form or model, and use that? Is there something else I'm overlooking?
I have seen some existing AutoComplete widgets, but I'd really-really-really like to use my own Autocomplete code, since it's already set up, and some of them look a bit convoluted.
There was a lot more text/explanation here, but I cut back because I'm avoiding Wall Of Text. If I left important stuff out, let me know.
It's a little hard to say without knowing exactly what your autocomplete code is doing, but as long as it is sending the ids of the bands like they would be sent with the <select>, the ModelForm should validate them as usual.
Basically, your POST string should look like:
name=FooBar2009&bands=1&bands=3&bands=4&venue=7&start=...
The easiest way to do this might be to use JavaScript to add (and remove) a hidden input field for each band entered, with the name bands and the id of the band as the value. Then, when the user submits the form, the browser will take care of posting the right stuff, and the ModelForm will validate it.
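To show what the repeated `bands` key looks like on the wire, here is a small standard-library sketch that builds such a body and parses it back. The field values are taken from the example POST string above (without the elided `start` part); nothing here is Django-specific.

```python
try:
    from urllib.parse import urlencode, parse_qs  # Python 3
except ImportError:
    from urllib import urlencode                  # Python 2
    from urlparse import parse_qs

# One (name, value) pair per hidden input; "bands" is repeated once per band id.
fields = [("name", "FooBar2009"), ("bands", 1), ("bands", 3), ("bands", 4), ("venue", 7)]
body = urlencode(fields)

# Parsing groups repeated keys into a list, which is what a multi-select would send.
parsed = parse_qs(body)
```

Here `body` is `name=FooBar2009&bands=1&bands=3&bands=4&venue=7`, and `parsed["bands"]` is the list of the three ids as strings.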
Using the anointed jQuery autocomplete plugin,
On the client-side I have something like this:
jQuery("#id_tags").autocomplete('/tagging_utils/autocomplete/tasks/task/', {
    max: 10,
    highlight: false,
    multiple: true,
    multipleSeparator: " ",
    scroll: true,
    scrollHeight: 300,
    matchContains: true,
    autoFill: true,
});
So, I have a view that returns when I type in a:
http://skyl.org/tagging_utils/autocomplete/tasks/task/?q=a&limit=10&timestamp=1259652876009
You can see the view that serves that here:
http://github.com/skyl/skyl.org/blob/master/apps/tagging_utils/views.py
Now, it's going to be a little tricky. You might accept the POST, then in the clean method of the field try to .get() based on the strings and raise a form validation error if you can't get it. Since name is unique=True, something like this (off the top of my head):
def clean_bands(self):
    return Band.objects.filter(name__in=self.cleaned_data['bands'].split(' '))
You could also check each string and raise a form error if there are no bands by that name .. not sure that the clean method should return a qs. Let me know if this helps and you want me to keep going/clarify.
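For the hidden input described in the question, which holds Band ids separated by commas, a first validation step could be sketched in plain Python as below. `parse_band_ids` is a hypothetical helper, not part of Django; inside a clean method its result could then be checked against the database, for example with `Band.objects.filter(pk__in=ids)`, which is an assumption about how the form would be wired up.

```python
def parse_band_ids(raw):
    """Turn "1,3,4" into [1, 3, 4]; raise ValueError for anything that is not a non-negative integer."""
    ids = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input and trailing commas
        if not part.isdigit():
            raise ValueError("invalid band id: %r" % part)
        ids.append(int(part))
    return ids


ids = parse_band_ids("1,3,4")
```

In a form, the ValueError would typically be translated into a validation error so the user sees it next to the field.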