Django haystack with Solr - how to specify GET parameters? - django

I'm using django_haystack with Solr 4.9. I've amended the /select request handler so that all requests use dismax by default.
The problem is that I sometimes want to query specific fields, but I can't find a way to get the SearchQuerySet API to play nicely with dismax. Basically, I want to send the following (or an equivalent) request to Solr: q=hello&qf=content_auto
I've tried the following approaches:
Standard Api
SearchQuerySet().filter(content_auto='hello')
# understandably results in the following being sent to solr:
q=content_auto:hello
AltParser
query = AltParser('dismax', 'hello', qf="content_auto")
sqs = SearchQuerySet().filter(content=query)
# Results in
q=(_query_:"{!dismax+qf%3Dcontent_auto}hello")
Raw
query = Raw('hello&qf=content_auto')
# results in
q=hello%26qf%3Dcontent_auto
The last approach was so close, but since it escaped the = and & it doesn't seem to process the query correctly.
What is the best approach to dealing with this? I have no need for non-dismax querying so it would be preferable to keep the /select request handler the same rather than having to wrap every query in a Raw or AltParser.

In short, it can't be done without creating a custom backend and SearchQuerySet. In the end I reverted to a standard configuration and specified dismax with an AltParser, which is slightly annoying because it affects spelling suggestions.
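The escaping seen with the Raw attempt in the question is ordinary URL encoding of a single parameter value. The following is a minimal standard-library sketch (it does not use Haystack or pysolr, and the dictionaries are only illustrative) of why `&qf=` placed inside the value of q cannot act as a separate parameter, and what the desired request looks like when qf is sent as its own key:

```python
from urllib.parse import urlencode

# Everything placed inside the value of q is percent-encoded,
# so '&' and '=' lose their meaning as parameter separators.
raw_attempt = urlencode({"q": "hello&qf=content_auto"})

# The request the question asks for needs qf as a separate parameter.
desired = urlencode({"q": "hello", "qf": "content_auto"})

print(raw_attempt)  # q=hello%26qf%3Dcontent_auto
print(desired)      # q=hello&qf=content_auto
```

This is consistent with the answer above: the extra parameter has to be added where the request is built, which is why a custom backend is needed.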

Related

Ember dynamic query parameters

I have what I believe to be a common but complicated problem to model. I've got a product configurator that has a series of buttons. Every time the user clicks on a button (corresponding to a change in the product configuration), the url will change, essentially creating a bookmarkable state for that configuration. The big caveat: I do not get to know what the configuration options or values are until after app initialization.
I'm modeling this using EmberCLI. After much research, I don't think it's a wise idea to try to fold these directly into the path component, and I'm looking into using the new Ember query string additions. That should work for allowing bookmarkability, but I still have the problem of not knowing what those query parameters are until after initialization.
What I need is a way to allow my Ember app to query the server initially for a list of parameters it should accept. On the link above, the documentation uses the parameter 'filteredArticles' for a computed property. Within the associated function, they've hard-coded the value that the computed property should filter by. Is it a good idea to try to extend this somehow to be generalizable, with arguments? Can I even add query parameters on the fly? I was hoping for an assessment of the validity of this approach before I get stuck down the rabbit hole with it.
I dealt with a similar issue when generating a preview popup of a user's changes. The previewed model had a dynamic set of properties that could not be predetermined. The solution I came up with was to base64 encode a set of data and use that as the query param.
Your url would have something like this ?filter=ICLkvaDlpb0iLAogICJtc2dfa3
The query param is bound to a 2-way computed that takes in a base64 string and outputs a json obj,
JSON.parse(atob(serializedPreview));
as well as doing the reverse: take in a json obj and output a base64 string.
serializedPreview = btoa(JSON.stringify(filterParams));
You'll need some logic to prevent empty json objects from being serialized. In that case, you should just set the query param as null, and remove it from your url.
Using this pattern, you can store just about anything you want in your query params and still have the url as shareable. However, the downside is that your url's query params are obfuscated from your users, but I imagine that most users don't really read/edit query params by hand.
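For readers who want to try the round trip outside the browser, here is a rough Python rendering of the same idea. It is not Ember code: the function names and the sample parameters are made up for illustration, and it uses URL-safe base64 over UTF-8 rather than btoa/atob.

```python
import base64
import json

def serialize_params(params):
    """Return a URL-safe base64 string, or None for an empty dict."""
    if not params:
        return None  # the caller should drop the query param entirely
    raw = json.dumps(params, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def deserialize_params(token):
    """Inverse of serialize_params; None or '' yields an empty dict."""
    if not token:
        return {}
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))

token = serialize_params({"color": "red", "size": "L"})
restored = deserialize_params(token)
```

Returning None for an empty dict mirrors the advice above about removing the query param instead of serializing an empty object.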

Removing query string from url in django while keeping GET information

I am working on a Django setup where I can receive a url containing a query string as part of a GET. I would like to be able to process the data provided in the query string and return a page that is adjusted for that data but does not contain the query string in the URL.
Ordinarily I would just use reverse(), but I am not sure how to apply it in this case. Here are the details of the situation:
Example URL: .../test/123/?list_options=1&list_options=2&list_options=3
urls.py
urlpatterns = patterns('',
    url(r'test/(?P<testrun_id>\d+)/', views.testrun, name='testrun'),
)
views.py
def testrun(request, testrun_id):
    if 'list_options' in request.GET.keys():
        lopt = request.GET.getlist('list_options')
        # ... [process lopt list] ...
    # ... [other processing] ...
    context = { ...stuff... }
    return render(request, 'test_tracker/testview.html', context)
When the example URL is processed, Django will return the page I want but with the URL still containing the query string on the end. The standard way of stripping off the unwanted query string would be to return the testrun function with return HttpResponseRedirect(reverse('testrun', args=(testrun_id,))). However, if I do that here then I'm going to get an infinite loop through the testrun function. Furthermore, I am unsure if the list_options data that was on the original request will still be available after the redirect given that it has been removed from the URL.
How should I work around this? I can see that it might make sense to move the parsing of the list_options variable out into a separate function to avoid the infinite recursion, but I'm afraid that I will lose the list_options data from the request if I do it that way. Is there a neat way of simultaneously lopping the query string off the end of the URL and returning the page I want in one place, so I can avoid having to separate things out into multiple functions?
EDIT: A little bit of extra background, since there have been a couple of "Why would you want to do this?" queries.
The website I'm designing is to report on the results of various tests of the software I'm working on. This particular page is for reporting on the results of a single test, and often I will link to it from a bigger list of tests.
The list_options array is a way of specifying the other tests in the list I have just come from. This allows me to populate a drop-down menu with other relevant tests to allow me to easily switch between them.
As such, I could easily end up passing in 15-20 different values and creating huge URLs, which I'd like to avoid. The page is designed to have a default set of other tests to fill in the menu in question if I don't suggest any others in the URL, so it's not a big deal if I remove the list_options. If the user wishes to come back to the page directly he won't care about the other tests in the list, so it's not a problem if that information is not available.
First a word of caution. This is probably not a good idea to do for various reasons:
Bookmarking. Imagine that .../link?q=bar&order=foo filters some search results and also sorts them in a particular order. If you automatically strip out the querystring, you effectively prevent users from bookmarking specific search queries.
Tests. Any time you add automation, things can and probably will go wrong in ways you never imagined. It is always better to stick with simple yet effective approaches, since they are widely used and thus less error-prone. I'll give an example of this below.
Maintenance. This is not a standard behavior model, so it will make maintenance harder for future developers, who will first have to understand what is going on.
If you still want to achieve this, one of the simplest methods is to use sessions. The idea is that when there is a querystring, you save its contents into a session and then you retrieve it later on when there is no querystring. For example:
def testrun(request, testrun_id):
    # save the get data
    if request.META['QUERY_STRING']:
        request.session['testrun_get'] = request.GET
        # the following will not have querystring hence no infinite loop
        return HttpResponseRedirect(reverse('testrun', args=(testrun_id,)))
    # there is no querystring so retrieve it from session
    # however someone could visit the url without the querystring
    # without visiting the querystring version first hence
    # you have to test for it
    get_data = request.session.get('testrun_get', None)
    if get_data:
        if 'list_options' in get_data.keys():
            ...
    else:
        # do some default option
        ...
    context = { ...stuff... }
    return render(request, 'test_tracker/testview.html', context)
That should work; however, it can break rather easily, and there is no easy way to fix it. This illustrates the second point above. For example, imagine a user wants to compare two search queries side by side, so he visits .../link?q=bar&order=foo and .../link?q=cat&order=dog in different tabs of the same browser. So far so good, because each page will open the correct results. However, as soon as the user refreshes the first tab, he will get the results from the second tab, since that is what is currently stored in the session and the browser has a single session token for both tabs.
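The two-tab failure can be reproduced without Django. The sketch below is only a simulation: a plain dict stands in for request.session, and the handle function is a hypothetical stand-in for the view that either stores the query and "redirects" or renders from whatever the session currently holds.

```python
from urllib.parse import parse_qs

def handle(session, query_string):
    """Mimic the view: store the query and redirect, or render from the session."""
    if query_string:
        session["testrun_get"] = parse_qs(query_string)
        return ("redirect", None)
    stored = session.get("testrun_get", {})
    return ("render", stored.get("q", ["<default>"])[0])

session = {}  # one browser, so one session shared by both tabs

handle(session, "q=bar&order=foo")   # tab 1 arrives with a querystring and is redirected
tab1_first = handle(session, "")     # tab 1 renders the results for "bar"
handle(session, "q=cat&order=dog")   # tab 2 arrives and overwrites the stored query
tab2_first = handle(session, "")     # tab 2 renders the results for "cat"
tab1_refresh = handle(session, "")   # refreshing tab 1 now also renders "cat"
```

Because both tabs read the same session key, the last stored query wins, which is exactly the behavior described above.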
Even if you will find some other method to achieve what you want without using sessions, I imagine that you will encounter similar issues because HTTP is stateless hence you will have to store state on the server.
There is actually a way to do this without breaking much of the functionality - store state on client instead of server-side. So you will have a url without a querystring and then let javascript query some API for whatever you will need to display on that page. That however will force you to make some sort of API and use some javascript which does not exactly fall into the scope of your question. So it is possible to do cleanly however that will involve more than just using Django.

CAML query in getListItems method returns no rows of items

I am invoking Sharepoint's List Web services and using the getListItems() method. In particular, I am keen on specifying a CAML query because I really want it to just retrieve one item that I am specifically interested in. This I am doing by specifying a query in my XML string, in varying degrees of combinations, either by specifying the EncodedAbsUrl, the LinkFileName, the URL or the FileRef, with most results returning 0.
The XML query looks like this :
<?xml version="1.0" ?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"> <S:Body><GetListItems xmlns="http://schemas.microsoft.com/sharepoint/soap/"><listName>{5cbc4407-3851-4e00-964a-bb7e9b430f9f}</listName> <viewName></viewName> <rowLimit>1000</rowLimit> <webID></webID>
<query><Query><Where><Eq><FieldRef Name = "FileRef"/><Value Type = "Text">"/Shared%20Documents/Ashish/Word_feb27.doc"</Value></Eq></Where></Query></query>
<viewFields><ViewFields><FieldRef Name="FSObjType"/><FieldRef Name="LinkFilename"/><FieldRef Name="UniqueId"/><FieldRef Name="FileRef"/><FieldRef Name="FileRef"/><FieldRef Name="EncodedAbsUrl"/><FieldRef Name="FileSizeDisplay"/><FieldRef Name="_UIVersionString"/><FieldRef Name="_owshiddenversion"/></ViewFields></viewFields></GetListItems> </S:Body></S:Envelope>
Without the <query> tags this SOAP request does in fact work, and it retrieves all the items that are available in the list. The frustration begins when I specify the query tag. In particular, I have attempted the following combinations:
FieldRef.name = {LinkFileName, EncodedAbsUrl, URL,FileRef} and Value.type = {Text, URL}
Either they yield results with 0 items in them or they return internal errors. I figure this is a syntactical issue, and would rather put the question to those of you who have probably done this before, to see where I might be messing it up.
Thanks
I would recommend using CAML Query Builder and Fiddler. Query builder can connect SP using Web services and you can build the query with that. After you got your expected results, capture the Web service request with Fiddler and use it :)
BTW: Have you considered using Sharepoint Client Object model? You do not have to worry about SOAP messages.
Remove the <query><Query> tags.

How do I retrieve Haystack SearchQuery parameters

I am looking for a way to serialize a Haystack search query (not the query results) so that I can reconstruct it later. Is there a way to do this without having to intercept the parameters from off of the request object?
For context, I want users to be able to subscribe to the results of a particular search, including any new results that may pop up over time.
Edit:
I settled on storing the search with:
filter = queryset.query.query_filter
and then loading this back in using:
SearchQuerySet().raw_search(filter)
Though I suspect this will tie me to whichever particular search back-end I'm using now. Is this true? Is there a better way?
You should have the query in your request.GET. Then it should be fairly easy to construct an RSS feed using that query.
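One backend-independent way to follow this suggestion is to store the query string itself rather than the backend's query_filter. The sketch below uses only the standard library. The function names and the parameter names (q, models) are hypothetical, and in a Django view the input would come from request.GET instead of a literal dict.

```python
from urllib.parse import urlencode, parse_qs

def save_search(params):
    """Serialize search parameters (values are lists) into one string."""
    return urlencode(sorted(params.items()), doseq=True)

def load_search(stored):
    """Rebuild the parameter dict; every value comes back as a list."""
    return parse_qs(stored)

stored = save_search({"q": ["solar panels"],
                      "models": ["blog.post", "news.item"]})
params = load_search(stored)
print(stored)  # models=blog.post&models=news.item&q=solar+panels
```

The stored string can later be fed back through whatever code builds the SearchQuerySet from request parameters, so it does not depend on how a particular backend renders its filter.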

Uploading files to django-piston with ASIHTTPRequest

I'm trying to POST some JSON and a binary file from an iPhone to a Django server running django-piston using ASIHTTPRequest
I know how to get it to work if I am ONLY sending JSON strings, and I know how to make it work if I am ONLY sending a file, but doing both is tricky.
So we'll start with ASIHTTPRequest code
ASIFormDataRequest *request = [[ASIFormDataRequest alloc] initWithURL:url];
[request setRequestMethod:@"POST"];
[request setPostFormat:ASIMultipartFormDataPostFormat];
[request appendPostData:[@"{\"save\":{\"name\":\"iostest\"}}" dataUsingEncoding:NSUTF8StringEncoding]];
[request addData:UIImageJPEGRepresentation([UIImage imageNamed:@"test.jpg"], 1.0f)
    withFileName:@"test.jpg"
  andContentType:@"image/jpeg"
          forKey:@"data"];
[request setDelegate:self];
[request startAsynchronous];
My best idea here is that adding raw string data directly to the POST body and then adding a file just doesn't work.
But if I instead try
[request setPostValue:@"{\"name\":\"iostest\"}" forKey:@"save"];
Then the piston data dictionary will store ['save'] as a string instead of a deserialized object, so it will literally deliver the string
"{\"name\":\"iostest\"}"
Here's my Piston handler code
def create(self, request):
    data = request.data
    print(data['save'])  # "{\"name\":\"iostest\"}"
    print("Files: " + request.FILES['data'].name)  # test.jpg
    print("Data Save Name: " + data['save']['name'])  # crash: data['save'] is a string, so this is treated as a string index lookup
Ideas are welcome.
I have basically hacked my way around this.
The basic problem is that the request format in which Django expects files to be submitted to the server is one which django-piston literally just drops the ball on.
When it encounters multipart requests, it simply doesn't try to parse the data.
The solution to this problem is to manually call the parsing engine, which, in the case of JSON, is straight out of django.utils (which is kind of disappointing).
You achieve this by using ASIHTTPRequest (or the request module of your choice) to set a standard post value by key, and then access it the old fashioned way.
from django.utils import simplejson
data = simplejson.loads(request.POST['save'])
Which basically just reduces this handler method at this point to nothing more than a regular old Django view in terms of the steps you have to take to get it going.
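The same decoding step can be tried with the standard library's json module, which provides the same loads call (newer Django releases no longer ship django.utils.simplejson). In this sketch a plain dict stands in for request.POST, and parse_save_field is a hypothetical helper name:

```python
import json

def parse_save_field(post):
    """Decode the JSON string that was sent under the 'save' form key."""
    return json.loads(post["save"])

# A plain dict stands in for request.POST in this sketch.
fake_post = {"save": '{"name": "iostest"}'}
data = parse_save_field(fake_post)
print(data["name"])  # iostest
```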
So clearly, django-piston is not built to deal with files apparently?
"My best idea here is that adding raw string data directly to the POST body and then adding a file just doesn't work."
That wouldn't work, no. If you're POSTing form data using 'application/x-www-form-urlencoded' format, or 'multipart/form-data' you're not going to be able to just tack some extra data on the end - it needs to go in as part of the form data. Something like this I guess...
[request setPostValue:@"{\"save\":{\"name\":\"iostest\"}}" forKey:@"data"];
"But if I remove the string data and only post the file it still doesn't work."
Is more problematic...
"or if it's Piston erroneously misreading the data."
I probably wouldn't look in that direction first - piston doesn't really mess with the request object, so it seems more likely that the ASI request isn't quite right.
I think the place to start would be to inspect the incoming request and check that it really is a valid form POST request:
Check that request.META["CONTENT_TYPE"] is set to 'multipart/form-data' (it will normally be followed by a boundary parameter).
Inspect the request.raw_post_data and make sure that it is valid form data as specified in http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2 - check that the key names are as you expected and that the file content is present. (Obviously you'll want to use a small text file when you're testing this!)
Check which keys actually are present in request.FILES, if any, in case it's as simple as something like a misnamed field.
Failing all that I'd try to narrow down if it's a problem on the client or server side by trying to write a plain python client and seeing if you have the same issue then. Looking around, something like this: http://atlee.ca/software/poster/ might be useful.
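As a rough aid for the inspection steps above, the following standard-library sketch builds a small multipart/form-data body and then lists the key names and filenames it contains. It is not piston or ASIHTTPRequest code: the boundary value, the helper names, and the sample field contents are all made up for the example, and a small text file is used in place of the image.

```python
from email.parser import BytesParser
from email.policy import default

BOUNDARY = "sketch-boundary-123"  # arbitrary value chosen for this example

def build_body(fields, files):
    """Assemble a multipart/form-data body from text fields and files."""
    out = []
    for name, value in fields.items():
        out += [f"--{BOUNDARY}",
                f'Content-Disposition: form-data; name="{name}"',
                "", value]
    for name, (filename, ctype, text) in files.items():
        out += [f"--{BOUNDARY}",
                f'Content-Disposition: form-data; name="{name}"; filename="{filename}"',
                f"Content-Type: {ctype}",
                "", text]
    out += [f"--{BOUNDARY}--", ""]
    return "\r\n".join(out).encode("utf-8")

def list_parts(content_type, body):
    """Map each form key to its filename (None for plain fields)."""
    header = f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
    msg = BytesParser(policy=default).parsebytes(header + body)
    return {part.get_param("name", header="content-disposition"): part.get_filename()
            for part in msg.iter_parts()}

body = build_body({"save": '{"name": "iostest"}'},
                  {"data": ("test.txt", "text/plain", "hello")})
parts = list_parts(f"multipart/form-data; boundary={BOUNDARY}", body)
print(parts)
```

Running list_parts on a captured request body (together with its Content-Type header value) is one way to check whether the expected keys, such as save and data, actually arrive.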