How can flask-whooshalchemy index manually imported data? - python-2.7

I'm using flask-whooshalchemy on SQLite and manually imported a lot of data, but now Whoosh can't find any of it. I think that's because Whoosh never indexed the data, right? How can I build the Whoosh index for that data manually?

You can try my fork: https://github.com/Revolution1/Flask-WhooshAlchemyPlus
Just run
$ pip install flask_whooshalchemyplus
and then
from flask_whooshalchemyplus import index_all
index_all(app)
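A minimal sketch of how that fits together, assuming a Flask-SQLAlchemy model with a __searchable__ attribute (the Post model, its columns, and the database URI are made up for illustration; WHOOSH_BASE is the index directory mentioned below):

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_whooshalchemyplus import index_all

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///app.db'
app.config['WHOOSH_BASE'] = 'whoosh_index'  # directory holding the Whoosh index
db = SQLAlchemy(app)

class Post(db.Model):  # hypothetical model
    __searchable__ = ['title', 'body']  # the columns Whoosh should index
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.Text)
    body = db.Column(db.Text)

index_all(app)  # walks the searchable models and (re)indexes the existing rows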

Have a look at https://gist.github.com/davb5/21fbffd7a7990f5e066c
I've just written this to solve the same issue - rebuild search indices after a bulk data import.
It won't work out of the box for anyone else (my "lib" import pulls in all of my third-party libraries, and you'll need to specify your Flask-SQLAlchemy models in the if __name__ == "__main__" block), but it should be enough to get you started.
As stated in the file comments, you should consider deleting your search.db folder (WHOOSH_BASE) as this script doesn't remove deleted data, only re-indexes the current data set.
I've found it much quicker to import all of my data using SQLAlchemy Core and then run this script than to import via the SQLAlchemy ORM with on-the-fly Whoosh index updates (44s vs 48m for my data set).

The code for the extension is pretty light; you can view it on GitHub. From looking at it, it just watches for changes when SQLAlchemy flushes the session, so externally entered data won't be indexed automatically.
Depending on the amount of data, and if this is a one-off data load, it might be easiest to just delete the Whoosh index (by default a directory called 'whoosh_index'), as it looks like the extension will re-index everything if that index isn't found (see lines 154-165 of its source).
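For a one-off load, that boils down to something like this, assuming the default WHOOSH_BASE of 'whoosh_index' (if the re-index-on-missing behaviour holds, the index is rebuilt the next time the app starts):

import shutil
shutil.rmtree('whoosh_index', ignore_errors=True)  # drop the stale index so it gets rebuilt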

Related

How to unittest a django database migration?

We've changed our database, using django migrations (django v1.7+).
The data that exists in the database is no longer valid.
Basically I want to test a migration by, inside a unittest, constructing the pre-migration database, adding some data, applying the migration, then confirming everything went smoothly.
How does one:
hold back the new migration when loading the unittest
I found some stuff about overriding settings.MIGRATION_MODULES but couldn't work out how to use it. When I inspect executor.loader.applied_migrations it still lists everything. The only way I could prevent the new migration was to actually remove the file; not a solution I can use.
create a record in the unittest database (using the old model)
If we can prevent the migration then this should be pretty straightforward. myModel.object.create(...)
apply the migration
I think I can probably work this out now that I've found the test_executor: set a plan pointing to the migration file and execute it? Um, right? Got any code for that :-D
confirm the old data in the database now matches the new model
Again, I expect this should be pretty easy: just fetch the instance created before the migration and confirm it has changed in all the right ways.
So the challenge is really just working out how to prevent the unittest from applying the latest migration script and then applying it when we're ready?
Perhaps I have the wrong approach? Should I create fixtures, and just confirm that they're all good at the end? Do fixtures get loaded before the migrations are applied, or after they're all done?
By using the MigrationExecutor and picking out specific migrations with .migrate I've been able to, maybe?, roll it back to a specific state, then roll forward one-by-one. But that is popping up doubts; currently chasing down sqlite fudging around due to the lack of an actual ALTER TABLE instruction. Jury still out.
I wasn't able to prevent the unittest from starting with the current database schema, but I did find it is quite easy to revert to earlier points in the migration history:
Where "0014_nulls_permitted" is a file in the migrations directory...
from django.db import connection
from django.db.migrations.executor import MigrationExecutor

executor = MigrationExecutor(connection)
executor.migrate([("workflow_engine", "0014_nulls_permitted")])
executor.loader.build_graph()
NB: running executor.loader.build_graph() between invocations of executor.migrate() seems to be a very important part of completing each migration and making things behave as expected.
The migrations currently applied to the database can be checked with something like:
print [x[1] for x in sorted(executor.loader.applied_migrations)]
[u'0001_initial', u'0002_fix_foreignkeys', ... u'0014_nulls_permitted']
I created a model instance via the ORM then ensured the database was in the old state by running some SQL directly:
job = Job.objects.create(....)
from django.db import connection
cursor = connection.cursor()
cursor.execute('UPDATE workflow_engine_job SET next_job_state=NULL')
Great. Now I know I have a database in the old state, and can test the forwards migration. So where 0016_nulls_banished is a migration file:
executor.migrate([("workflow_engine", "0016_nulls_banished")])
executor.loader.build_graph()
Migration 0015 goes through the database converting all the NULL fields to a default value. Migration 0016 alters the schema. You can scatter some print statements around to confirm things are happening as you think they should be.
And now the test can confirm that the migration has worked. In this case by ensuring there are no nulls left in the database.
jobs = Job.objects.all()
self.assertTrue(all([j.next_job_state is not None for j in jobs]))
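Pulled together, a minimal sketch of the whole test. The app label, migration names, and Job model come from the answer above; the create() arguments are omitted as in the original, and TransactionTestCase is used so the schema changes aren't wrapped in a test transaction:

from django.db import connection
from django.db.migrations.executor import MigrationExecutor
from django.test import TransactionTestCase
from workflow_engine.models import Job

class NullsBanishedTest(TransactionTestCase):
    def test_migration_fills_nulls(self):
        executor = MigrationExecutor(connection)
        # Roll the schema back to the pre-migration state.
        executor.migrate([("workflow_engine", "0014_nulls_permitted")])
        executor.loader.build_graph()
        # Create a record and force it into the old (NULL) state.
        job = Job.objects.create()  # field values omitted here
        cursor = connection.cursor()
        cursor.execute('UPDATE workflow_engine_job SET next_job_state=NULL')
        # Roll forward through 0015 (data) and 0016 (schema).
        executor.migrate([("workflow_engine", "0016_nulls_banished")])
        executor.loader.build_graph()
        self.assertTrue(all(j.next_job_state is not None for j in Job.objects.all()))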
We have used the following code in settings_test.py to ignore the migration for the tests:
MIGRATION_MODULES = dict(
    (app.split('.')[-1], '.'.join([app, 'nonexistent_django_migrations_module']))
    for app in INSTALLED_APPS
)
The idea here being that none of the apps have a nonexistent_django_migrations_module folder, and thus django will simply find no migrations.

How do I import globals in Postman

My team has just started using Team Syncing in Postman which seems to be a great feature, but we want to be able to share the large set of global variables we use within our collections.
These are not synced to the cloud server and there doesn't seem to be a way to import them.
Has anyone got a good way to share these throughout the team without everyone manually entering each one?
To share globals in Postman:
Export as JSON, share JSON file
Import globals from JSON
The same steps in more detail:
To export as JSON:
a) Go to gear in upper right-hand corner, choose Manage Environments from the dropdown
b) Click Globals button
c) Choose Download as JSON
To import from JSON:
a) Choose Import from upper left of Postman window
b) Select your JSON file or drag it into the resulting window:
NOTE: Even though this window says it only imports collections, environments, data dumps, curl commands, and RAML/WADL/Swagger/Runscope, it will also work for globals.
c) Click Open in the system dialog box (after choosing file). Your globals will be imported. You may receive an error along with the confirmation message, but the globals were still imported.
You can take a backup of the collections that Postman saves by locating its storage on disk.
To find the Postman storage location, search for
chrome-extension://fhbjgbiflinjbdggehcddcbncdddomop/
The location given under Paths is the one that contains all the data, e.g. /home/xyz/.config/chromium/Default/Storage/ext/fhbjgbiflinjbdggehcddcbncdddomop/def
Using this, collections can also be transferred from one machine to another.
In the current version (9.31.0), the flow is the same: export the environment in full to a JSON file, then import that file on the other machine. Watch out for duplicates after importing: existing entries aren't replaced, a new copy is created, so you may want to delete older/unused versions of the same environment.

Django doesn't read from database – no error

I just set up the environment for an existing Django project, on a new Mac. I know for certain there is nothing wrong with the code itself (just cloned the repo), but for some reason, Django can't seem to retrieve data from the database.
I know the correct tables and data are in the db.
I know the codebase is as it should be.
I can make queries using the Django shell.
Django doesn't throw any errors despite the data missing on the web page.
I realize that it's hard to debug this without further information, but I would really appreciate a finger pointing me to the right direction. I can't seem to find any useful logs.
EDIT:
I just realized the problem lies elsewhere. Unfortunately I can't delete this post with the bounty still open.
Without seeing any code, I can only suggest some general advice that might help you debug your problem. Please add a link to your repository if you can or some snippets of your database settings, the view which includes the database queries etc...
Debugging the view
The first thing I would recommend is using the Python debugger inside the view which queries the database. If you've not used pdb before, it's a life saver: it lets you set a breakpoint in your Python script and then interactively execute code in the interpreter.
import pdb; pdb.set_trace()  # place this inside the view, then look at the results of your queries
If you are using the Django ORM, the QuerySet returned from the query should have all the data you expect.
If it doesn't then you need to look into your database configuration in settings.py.
If it does, then you might not be returning that object to the template? Unlikely, as you said the code was the same, but double check the objects you pass with your HttpResponse object.
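For example, a hypothetical view showing where the breakpoint goes and what gets handed to the template (the Article model, app name, and template path are made up):

from django.shortcuts import render
from myapp.models import Article  # hypothetical app and model

def article_list(request):
    articles = Article.objects.all()
    import pdb; pdb.set_trace()  # inspect len(articles) and str(articles.query) here
    return render(request, 'articles/list.html', {'articles': articles})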
Debugging the database settings
If you can query the database using the project settings inside settings.py from the Django shell, it sounds unlikely that there is a problem there - but like everything, double check.
You said that you've set up the project on a new Mac. Was it on a different operating system before? Maybe there is a problem with the paths now - to make your project platform-independent, remember to use the os.path.join() method when working with file paths.
And what about the username and password details....
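One classic cause on a fresh machine is a relative SQLite path: a relative NAME resolves against the current working directory, so the shell and the web server can end up looking at two different database files, one with your data and one empty but valid, which produces exactly this "no data, no errors" symptom. A hedged sketch of making the path absolute in settings.py:

import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),  # absolute, platform-independent
    }
}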
Debugging the template
Maybe your template is referencing the wrong variable name or object attribute. You mentioned that
Django doesn't throw any errors despite the data missing on the web page.
This doesn't really tell us much - to quote the Django docs -
If you use a variable that doesn’t exist, the template system will
insert the value of the TEMPLATE_STRING_IF_INVALID setting, which is
set to '' (the empty string) by default.
So to check all the variables available to your template, you could use the debug template tag:
{% debug %}
Probably even better, though, is django-debug-toolbar - this will also let you examine the SQL queries your view is making.
Missing Modules
I would expect this to raise an exception if this were the problem, but have you checked that you have the psycopg module on your new machine?

Storing important singular values in Django

So I'm working on a website where there are a couple important values that get used in various places throughout the site. For example, certain important dates, like the start and end dates for registration.
One way I can do this is making a model that stores these values, but that sounds like overkill (since I'd only have one instance). Another way is to store these values in the settings.py file, but if I wanted to change them, it seems like I would need to restart the webserver for them to take effect. I was wondering what would be the best practice in Django to handle this kind of stuff.
You can store them in settings.py. While there is nothing wrong with this (you can even organize your settings into multiple different files if you have too many custom settings), you're right that you cannot change them at runtime.
We were solving the same problem where I work and came up with a simple app called django-constance (you can get it from GitHub at https://github.com/comoga/django-constance). It lets you keep your settings in settings.py, but once you need them to be configurable at runtime you can switch to a Redis data store with a django-admin frontend. You can even use the value from settings as the default. I suggest you try this app out.
The changes to your code are pretty minimal; as pasted from the docs, you initialize your dynamic settings like this:
CONSTANCE_CONFIG = {
    'MY_SETTINGS_KEY': (42, 'the answer to everything'),
}
And then instead of importing settings from django conf, you do this:
from constance import config

if config.MY_SETTINGS_KEY == 42:
    answer_the_question()
If you want a specific set of variables available to all of your templates, what you are looking for is Context Processors.
http://docs.djangoproject.com/en/dev/ref/templates/api/#writing-your-own-context-processors
More links
http://www.b-list.org/weblog/2006/jun/14/django-tips-template-context-processors/
http://blog.madpython.com/2010/04/07/django-context-processors-best-practice/
The code for your context processors can live anywhere in your project. You just have to register it in your settings.py under TEMPLATE_CONTEXT_PROCESSORS, as in the sketch below.
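A minimal context processor sketch (the module path, function name, and setting names are illustrative):

# myproject/context_processors.py
from django.conf import settings

def important_dates(request):
    # Makes the values available to every template rendered with RequestContext.
    return {
        'registration_start': settings.REGISTRATION_START,
        'registration_end': settings.REGISTRATION_END,
    }

You would then add 'myproject.context_processors.important_dates' to TEMPLATE_CONTEXT_PROCESSORS.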
You could define your constants in your settings.py, or even in a constants.py, and just
from constants import *
However, as you mentioned, you would need to reload your server each time the settings are updated. I think you first need to figure out how often you will be changing these settings. Is it worth the extra effort to be able to reload them automatically?
If you wanted the settings to take effect automatically each time they are updated, you could do the following:
Store the settings in the DB
Upon save/change, write the output to a file
Have settings.py / constants.py read that file
Reload the server
In addition, have a look at the Mezzanine project, which allows you to update settings from the django admin interface and will reload them as well.
See: http://mezzanine.jupo.org/docs/configuration.html
If the variables you need will be updated infrequently, I suggest just storing them in settings.py and adding a custom context processor.
If you are using source control such as Git, updating will be quite easy: just update the file and push to your server. For really simple reloading of the server, you could also create a post-receive hook for Git that automatically reloads the server when new code is pushed.
I would only suggest the other option if you are updating settings fairly regularly.

Django model translation : store translations in database or use gettext?

I'm in the middle of a Django website's I18N process.
I've selected two potentially good Django apps:
django-modeltranslation, which modifies the db schema to store translations
django-dbgettext, which inspects db content to create .po files and uses gettext
From your point of view, what are the pros and cons of those two techniques ?
If you want to let users of your app (or third-party translators) easily update the translations without code changes, then go for one of the solutions that stores the translations in the database.
If you instead want greater quality control (version control, several sets of eyes, etc.), then use gettext. Using gettext also lets you control which strings you want translated.
Just my 2c.
django-modeltranslation is best for storing translated values: you go to django-admin and enter the translated value.
But if you are using django-dbgettext, you don't need to enter any values in django-admin; you can use Rosetta for that. If a value isn't being picked up for translation and you want it translated, you can register the model in dbgettext_registration.py and run "python manage.py dbgettext_export" followed by "python manage.py compilemessages".
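For reference, registering a model with django-modeltranslation looks roughly like this, in a translation.py inside your app (the News model and its fields are made up):

from modeltranslation.translator import translator, TranslationOptions
from news.models import News  # hypothetical app and model

class NewsTranslationOptions(TranslationOptions):
    fields = ('title', 'body')  # the fields that get per-language columns

translator.register(News, NewsTranslationOptions)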
http://packages.python.org/django-easymode/ combines the two:
http://packages.python.org/django-easymode/i18n/index.html
http://packages.python.org/django-easymode/i18n/translation.html
Gettext is used to translate large amounts of data, and the admin is used for day-to-day updates.
I would suggest you always use files for your translations. They're portable and don't have unknown impacts on DB performance (especially an issue when using "magic" packages that monkey-patch your DB schema).
This package looks simple and extensible: https://github.com/ecometrica/django-vinaigrette