Good way to deploy a django app with an asynchronous script running outside of the app - django

I am building a small financial web app with django. The app requires that the database has a complete history of prices, regardless of whether someone is currently using the app. These prices are freely available online.
I currently handle this by running a separate Python script (outside of Django) alongside the app; it downloads the price data and writes it to the Django database using the sqlite3 module.
My plan for deployment is to run the app on an AWS EC2 instance, change the permissions of the folder where the db file resides, and separately run the download script.
Is this a good way to deploy this sort of app? What are the downsides?
Is there a better way to handle the asynchronous downloads and the deployment? (PythonAnywhere?)

You can write daemon code that follows this approach and pushes data to the DB as soon as you get it from the Internet. Since your daemon would be running independently of Django, you'd need to take care of data synchronisation issues as well. One possible solution is to use a DateTimeField in your Django model with auto_now_add=True, which records the time at which each row was entered in the DB. Hope this helps you or someone else looking for a similar answer.
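As a rough illustration of the timestamping idea on the script side, here is a minimal sketch using only the sqlite3 module mentioned in the question. The table name `prices`, its columns, and the helper names are assumptions made up for this example; in a real project the table would be created by Django's migrations and the timestamp column would correspond to the DateTimeField.

```python
import sqlite3
from datetime import datetime, timezone


def open_db(path):
    """Open the SQLite file; the CREATE TABLE is only here so the sketch runs alone."""
    # timeout: wait up to 30 s if another connection (e.g. the web app) holds the lock
    conn = sqlite3.connect(path, timeout=30)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "symbol TEXT NOT NULL, price REAL NOT NULL, fetched_at TEXT NOT NULL)"
    )
    return conn


def record_price(conn, symbol, price):
    """Insert one price row, stamped with the current UTC time."""
    fetched_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO prices (symbol, price, fetched_at) VALUES (?, ?, ?)",
            (symbol, price, fetched_at),
        )
    return fetched_at
```

The download script would call `record_price(conn, symbol, price)` once per fetched quote; the app can then order or filter rows by `fetched_at`.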

Related

Update SQLite database on disk

My Django application (a PoC, not a final product) with a backend library uses a SQLite database - read only. The SQLite database is part of the repo and deployed to Heroku. This is working fine.
I have the requirement to allow updates to this database via the Django admin interface. This is not a Django managed database, so from Django's point of view just a binary file.
I could allow for a FileField to handle this, overwriting the database; I guess this would work in a self-managed server, but I am on Heroku and have the constraints imposed by Disk Backed Storage. My SQLite is not my webapp database, but limitations apply the same: I can not write to the webapp's filesystem and get any guarantee the new data will be visible by the running webapp.
I can think of alternatives, all with drawbacks:
Put the SQLite database in another server (a "media" server), and access it remotely: this will severely impact performance. Besides, accessing SQLite databases over the network does not seem easy.
Create a deploy script for the customer to upload the database via the usual deploy mechanisms. Since the customer is not technically fit, and I can not provide direct support, this is unfeasible.
Move out of Heroku to a self-managed server, so I can implement this quick-and-dirty upload without complications.
Do you have another suggestion?
PythonAnywhere.com
Deploy your app there and you can easily access and update all of your files; changes to your SQLite database are written to persistent disk, so no data is lost.
Heroku's filesystem is ephemeral: dynos are restarted roughly every 24 hours, and anything written to disk since the last deploy, including changes to a SQLite file, is discarded. That's why Heroku is not recommended for web apps that use SQLite.

Deploying Django as standalone internal app?

I'm developing a tool using Django for internal use at my organization. It's used to search and tag documents (using Haystack and Solr), and will be employed on different projects. My team currently has a working prototype and we want to deploy it 'in the wild.'
Our security environment is strict. Project documents are located on subfolders on a network drive, and access to these folders is restricted based on users' Windows credentials (we also have an MS SQL server that uses the same credentials). A user can only access the projects they are involved in. Since we're an exclusively Microsoft shop, if we want to deploy our app on the company intranet, we'll need to use an IIS server to deal with these permissions. No one on the team has the requisite knowledge to work with IIS, Active Directory, and our IT department is already over-extended. In short, we're not web developers and we don't have immediate access to anybody experienced.
My hacky solution is to forgo IIS entirely and have each end user run a lightweight server locally (namely, CherryPy), with every user retaining access to a common project-specific database (e.g. a SQLite DB living on the network drive or a DB on the MS SQL server). In order to use the tool, they would just launch an all-in-one batch script and point their browser to 127.0.0.1:8000. I recognize how ugly this is, but I feel like it leverages the security measures already in place (note that we never expect more than 10 simultaneous users on a given project). Is this a terrible idea, and if so, what's a better solution?
I've dealt with a similar situation (primary development was geared toward a normal deployment situation, but some users have a requirement to use the application on a standalone workstation). Rather than deploy web and db servers on a standalone workstation, I just run the app with the Django internal development server and a SQLite DB. I didn't use CherryPy, but hopefully this is somewhat useful to you.
My current solution makes a nice executable for users not familiar with the command line (who also have trouble remembering the URL to put in their browser) but is also relatively easy to develop:
Use PyInstaller to package up the Django app into a single executable. Once you figure this out, don't continue to do it by hand; add it to your continuous integration system (or at least write a script).
Modify the manage.py to:
Detect if the app is frozen by PyInstaller and there are no arguments (i.e.: user executed it by double clicking it) and if so, then run execute_from_command_line(..) with arguments to start the Django development server.
Right before running the execute_from_command_line(..), pop off a thread that does a time.sleep(2) (to let the development server come up fully) and then webbrowser.open_new("http://127.0.0.1:8000").
Modify the app's settings.py to detect if frozen and change things around such as the path to the DB server, enabling the development server, etc.
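The manage.py changes described above might look roughly like the following. This is a sketch under assumptions, not the answerer's actual code: the helper names are invented, the `--noreload` flag is my addition (the autoreloader spawns a second process, which tends not to work well inside a frozen executable), and the usual `DJANGO_SETTINGS_MODULE` setup is omitted.

```python
import sys
import threading
import time
import webbrowser

URL = "http://127.0.0.1:8000"


def is_frozen():
    # PyInstaller sets sys.frozen on the bundled executable
    return getattr(sys, "frozen", False)


def build_argv(argv, frozen):
    """Return the argv to hand to Django; default to runserver on a double-click."""
    if frozen and len(argv) == 1:
        # --noreload is an assumption of this sketch, not part of the original answer
        return [argv[0], "runserver", "127.0.0.1:8000", "--noreload"]
    return list(argv)


def open_browser_later(url=URL, delay=2.0, opener=webbrowser.open_new):
    """Start a daemon thread that waits `delay` seconds, then opens `url`."""
    def _worker():
        time.sleep(delay)  # give the development server time to come up
        opener(url)

    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread


def main():
    # Imported lazily so the helpers above can be used without Django installed
    from django.core.management import execute_from_command_line

    auto_start = is_frozen() and len(sys.argv) == 1
    if auto_start:
        open_browser_later()
    execute_from_command_line(build_argv(sys.argv, is_frozen()))
```

In a real manage.py, `main()` would be called from the usual `if __name__ == "__main__":` guard. The same `is_frozen()` check can be reused in settings.py to switch the database path.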
A couple additional notes.
If you go with SQLite, Windows file locking on network shares may not be adequate if you have concurrent writing to the DB; concurrent readers should be fine. Additionally, since you'll have different DB files for different projects, you'll have to figure out a way for the user to indicate which file to use. Maybe prompt in the app, or build the same app multiple times with different settings.py files. There are a variety of ways to hit this nail...
If you go with MSSQL (or any client/server DB), the app will have to know the DB credentials (which means they could be extracted by a knowledgeable user). This presents a security risk that may not be acceptable. Basically, don't try to have the only layer of security within the app that the user is executing. The DB credentials used by the app that a user is executing should only have the access that the user is allowed.

wagtail cms content deploy to production

I am studying the popular Django CMS framework Wagtail and have come to a question: how do you deploy the content you have developed (pages, documents, images) to production environments?
I am puzzled because this content (a page, for example) is saved in the database. It is essentially just rows in database tables, not a resource in the git repo, so if I develop a simple web site in dev, deploying it to prod is not as simple as a git push. What is the best practice for this?
I read some code from Torchbox; there are some database dump and record-pulling tasks using Fabric. I'm not sure whether that's the preferred way, and I can't fully understand them either.
Or, for a production site, is it assumed that everyone adds content there and prod is the source of truth, so there is no need for "content deployment" at all, only for schema changes via South migrations and for static resources?
Please help if anyone has got experience on this and provide guidance.
Thanks
On our (Torchbox) sites, all content entry usually happens on the production site, so we don't need to push any database content as part of our regular deployments. Many of our sites have tens or even hundreds of editors, so it would be almost impossible to synchronise the content across multiple installations of the site.
Whenever we need to transfer content from one installation to another (for example, deploying the production site for the first time, or pulling a snapshot of the live site to help with development), we use the Postgresql pg_dump command to make a SQL dump of the complete database, then restore it at the destination using the psql command. Tools like Fabric can be used to automate this, but this isn't essential.
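For readers who want to script the dump-and-restore step without Fabric, a minimal sketch with the standard library could look like this. The function names and the database names passed in are placeholders, the `--no-owner` flag is my own addition, and the sketch assumes the target database already exists, is empty, and that connection settings come from the usual PostgreSQL environment variables.

```python
import subprocess


def dump_command(source_db, dump_path):
    """Build the pg_dump call that writes a plain SQL dump of source_db."""
    # --no-owner (an assumption here) skips ownership statements, which helps
    # when the destination uses a different database role
    return ["pg_dump", "--no-owner", "--file", dump_path, source_db]


def restore_command(target_db, dump_path):
    """Build the psql call that replays the SQL dump into target_db."""
    return ["psql", "--dbname", target_db, "--file", dump_path]


def copy_database(source_db, target_db, dump_path, run=subprocess.run):
    """Dump source_db to dump_path, then restore it into target_db."""
    run(dump_command(source_db, dump_path), check=True)  # raises on failure
    run(restore_command(target_db, dump_path), check=True)
```

The `run` parameter exists only so the command construction can be exercised without a PostgreSQL server; in normal use it is left at its default.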

Django and services

I'm building a simple website with Django that requires constant monitoring of text-based data from another website; that's the way it has to be.
How could I run this service on my web host using Django? Would I have to start a separate app and run it via SSH so that it updates the database used by Django, or is there an easier/better way?
You could use celery to schedule a job that would read data from that other website and do whatever you want with it.
As an alternative to celery, you could also create a cron job that executes a custom django-admin command. That would give you full access to your django install and ORM. The downside is that cron's smallest time resolution is 1 minute, so if you need it to be real-time, you're not going to be able to do that.
If you do need realtime, then creating a python daemon might be a better option.
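If you go the daemon route, the core is just a polling loop. The sketch below is one possible shape for it, with hypothetical `fetch` and `store` callables standing in for "read the other website" and "write to the Django database"; the interval and the change-detection logic are assumptions for illustration.

```python
import time

_UNSET = object()  # sentinel so the very first fetched value is always stored


def poll_forever(fetch, store, interval=5.0, max_iterations=None, sleep=time.sleep):
    """Call fetch() every `interval` seconds and pass each changed value to store()."""
    last = _UNSET
    done = 0
    while max_iterations is None or done < max_iterations:
        try:
            value = fetch()
        except Exception as exc:  # keep the daemon alive on transient errors
            print(f"fetch failed: {exc}")
        else:
            if value != last:  # only store when the remote data changed
                store(value)
                last = value
        done += 1
        sleep(interval)
    return done
```

`max_iterations` and `sleep` are there mainly for testing; a real daemon would leave them at their defaults and be kept running by a process supervisor.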

Rather than using crontab, can Django execute something automatically at a predefined time

How can I make Django execute something automatically at a particular time?
For example, my Django application has to upload files via FTP to remote servers at predefined times. The FTP server addresses, usernames, passwords, time, day and frequency have been defined in a Django model.
I want to run a file upload automatically based on the values stored in the model.
One way to do is to write a python script and add it to the crontab. This script runs every minute and keeps an eye on the time values defined in the model.
Other thing that I can roughly think of is maybe django signals. I'm not sure if they can handle this issue. Is there a way to generate signals at predefined times (Haven't read indepth about them yet).
Just for the record: there is also celery, which lets you schedule messages for future dispatch. It is, however, a different beast from cron, as it requires a message broker (such as RabbitMQ) and is meant for message queues.
I have been thinking about this recently and have found django-cron which seems as though it would do what you want.
Edit: Also if you are not specifically looking for Django based solution, I have recently used scheduler.py, which is a small single file script which works well and is simple to use.
I've had really good experiences with django-chronograph.
You need to set one crontab task: to call the chronograph python management command, which then runs other custom management commands, based on an admin-tweakable schedule
The problem you're describing is best solved using cron, not Django directly. Since it seems that you need to store data about your ftp uploads in your database (using Django to access it for logs or graphs or whatever), you can make a python script that uses Django which runs via cron.
James Bennett wrote a great article on how to do this which you can read in full here: http://www.b-list.org/weblog/2007/sep/22/standalone-django-scripts/
The gist of it is that you can write standalone Django scripts that cron can launch and run periodically, and these scripts can fully utilize your Django database, models, and anything else they want to. This gives you the flexibility to run whatever code you need and populate your database, while not trying to make Django do something it wasn't meant to do (Django is a web framework, and is event-driven, not time-driven).
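A cron-launched script of this kind has to decide, on each run, which uploads are due. A minimal sketch of that check follows; the `UploadSchedule` dataclass is a hypothetical stand-in for the Django model described in the question, and matching on weekday, hour and minute is an assumption about how the schedule is stored.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UploadSchedule:
    # Hypothetical stand-in for the Django model holding the FTP settings
    host: str
    weekday: int  # 0 = Monday ... 6 = Sunday, matching datetime.weekday()
    hour: int
    minute: int


def due_uploads(schedules, now):
    """Return the schedules whose weekday, hour and minute match `now`."""
    return [
        s for s in schedules
        if (s.weekday, s.hour, s.minute) == (now.weekday(), now.hour, now.minute)
    ]
```

With cron invoking the script once a minute, it would call `due_uploads(all_schedules, datetime.now())` and perform the FTP upload for each returned entry.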
Best of luck!