I have been crawling through the Django docs, but they mostly cover using a database with models. The problem is that my database is too large and I don't want to create any models since it's a legacy one, and I will have to query different tables dynamically, so I just want to pull data from it. Is that possible in Django?
You can bypass the model layer and use SQL directly. However, you will then have to process the results in Python yourself, without the convenience of ORM objects.
https://docs.djangoproject.com/en/1.10/topics/db/sql/#executing-custom-sql-directly
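A minimal sketch of that approach (the function and table names below are illustrative, not from the docs):

from django.db import connection

def fetch_rows(table_name):
    # Table names cannot be passed as query parameters, so validate
    # table_name against a whitelist of known tables before formatting it in.
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM {}".format(table_name))
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]

This mirrors the dictfetchall pattern from the linked page and lets you target different tables dynamically, which is what you asked for.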
As pointed out in a comment, Django provides a way to automatically generate models from a legacy database with inspectdb.
This guide describes the few manual steps required to "clean up" the automatically generated models.
While this doesn't directly answer the stated question of avoiding models, it does address your underlying issue of not wanting to write them yourself for a large database.
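The basic usage is a single command run against the database configured in settings.py:

python manage.py inspectdb > models.py

By default, inspectdb marks the generated models with managed = False, so Django will not try to create or migrate those legacy tables.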
Data should be stored somewhere. There are a lot of ways to store data, but the most reliable one is a database (hence the name).
You could store data in a JSON file. You could also store it in environment variables, or even in a plain text file. None of those are recommended. I would just use a database, any type of database (MongoDB / Postgres / MySQL, anything); that's what they are meant for.
I'm a beginner at Django, and as a practice project I would like to create a webpage with a dashboard to track investments on a particular P2P platform. The platform does not have a nice dashboard (but provides an Excel file with all the data). As I see it, the main steps I need to take in this project are as follows:
1. Create a login so that users have an account where they upload their Excel files.
2. Make it possible to import an Excel file into a database.
3. Manipulate/calculate the data so it can later be used in the dashboard.
4. Create the dashboard.
5. Host the webpage.
After some struggle I have implemented point no. 2, and will deal with 1 and 5 later. But number 3 is my biggest issue now.
I'm completely unsure what I need to do, and Google did not help. I need to calculate the data before I can build a dashboard from it: union two of the tables, then join them with a third table, creating some additional calculated fields. Do I create a view in the database and somehow fetch that data into Django? Or do I need to set things up so that a new table is created at import time? I think a table instead of a view would perform better. Or maybe I'm doing this completely wrong and should take a different approach to this kind of task? Also, is SQLite a good database for the job? (I'm using it because it was the default in Django.)
I assume that for the visualization part I will need some JavaScript library, such as D3, which would then use the data from step 3?
For part 3 there are two ways: either do these calculations and save the result in your database, or compute them when you need them using Django model features like annotation and aggregation.
Option 1 requires adding a table for your calculations, which means defining a model in Django.
Option 2 requires doing the annotations in a view or in model managers and then using them in your views.
Django docs: Aggregation
Which is best depends on how big your data is, how complicated the calculations are, and how often you need them.
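As a rough sketch of option 2 (the Investment model and its fields below are assumptions for illustration):

from django.db import models
from django.db.models import F, Sum

class Investment(models.Model):
    # Hypothetical model; adjust fields to your Excel columns.
    platform = models.CharField(max_length=100)
    amount = models.DecimalField(max_digits=12, decimal_places=2)
    fee = models.DecimalField(max_digits=12, decimal_places=2)

# Computed on demand, no extra table needed:
totals = (
    Investment.objects
    .values("platform")  # GROUP BY platform
    .annotate(total=Sum("amount"), net=Sum(F("amount") - F("fee")))
)

The resulting queryset yields dicts like {"platform": ..., "total": ..., "net": ...} that you can pass straight to your dashboard.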
And about the database: SQLite is a database for development use, not for production, and certainly not with a lot of data and heavy calculations. The recommended database for Django is PostgreSQL, which is good at handling millions or even billions of rows and doing heavy calculations.
And for the visualization, you should handle it on the template side, which is basically HTML, CSS, and JS.
For the app I'm building I need to be able to create a new data model in models.py as quickly as possible, automatically.
I created a way to do this by making a separate Python program that opens models.py, edits it, closes it, and runs the migrations automatically, but there must be a better way.
Edit: my method works on my local server but not on PythonAnywhere.
In the Django documentation I found SchemaEditor, which is exactly what you want. Using the SchemaEditor you can create models, delete models, add fields, delete fields, etc.
Here's an excerpt:
Django’s migration system is split into two parts; the logic for calculating and storing what operations should be run (django.db.migrations), and the database abstraction layer that turns things like “create a model” or “delete a field” into SQL - which is the job of the SchemaEditor.
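A minimal sketch of building a model class at runtime and asking the SchemaEditor to create its table (app and field names are placeholders; "myapp" must be an installed app):

from django.db import connection, models

# Build the model class dynamically instead of editing models.py.
attrs = {
    "__module__": "myapp.models",
    "Meta": type("Meta", (), {"app_label": "myapp"}),
    "name": models.CharField(max_length=100),
    "value": models.FloatField(),
}
DynamicModel = type("DynamicModel", (models.Model,), attrs)

# Create the backing table directly; no migration files involved.
with connection.schema_editor() as schema_editor:
    schema_editor.create_model(DynamicModel)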
Don't rewrite your models.py file automatically; that is not how it's meant to work. When you need more flexibility in the way you store data, you should do the following:
Think hard about what kind of data you want to store and make your data model more abstract to fit more cases, if needed.
Use JSON fields to store arbitrary JSON data with your model (e.g. on a Postgres database); see the sketch after this list.
If that's not a fit, don't use Django's ORM and use a different store (e.g. Redis for key-value data or MongoDB for JSON documents).
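A short sketch of the JSON field option (the Document model is hypothetical; models.JSONField requires Django 3.1+, older versions have django.contrib.postgres.fields.JSONField):

from django.db import models

class Document(models.Model):
    # Hypothetical model for illustration.
    name = models.CharField(max_length=100)
    extra = models.JSONField(default=dict)  # arbitrary per-row attributes

# No schema change needed for new attributes:
Document.objects.create(name="report", extra={"color": "red", "pages": 10})
# On Postgres you can still filter on keys inside the JSON:
Document.objects.filter(extra__color="red")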
What are the cons of allowing a user to add a column to a database table at runtime in a production environment? Is there a correct way to do it?
Normally, when using a relational DB, we never extend the schema at runtime.
At any real scale this is basically impossible: adding a column can require rewriting the entire dataset, so the request would hang for the user.
Besides that, we do not really want to give users the power to grow our dataset (adding a new column means an extra field for every row in the DB).
However, some relational DBs like Postgres support unstructured data like JSON. This might serve your purpose.
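As a sketch of that idea (assuming Postgres; the table and column names are hypothetical), a jsonb column can absorb user-defined "columns" without any ALTER TABLE:

from django.db import connection

with connection.cursor() as cursor:
    # Merge a user-defined key into the row's jsonb blob (Postgres || operator).
    cursor.execute(
        "UPDATE records SET attrs = attrs || %s::jsonb WHERE id = %s",
        ['{"user_added_field": 42}', 1],
    )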
I was wondering whether it could cause any problems if I directly add rows to a model's table or remove some. I thought maybe Django records the number of rows in all tables? Or could this mess up the auto-generated IDs?
I don't think it matters, but I'm using MySQL.
No, it's not a problem, because Django does the same thing you do "directly" to the database: it executes SQL statements. The auto-generated ID is handled by the database server (the MySQL server in this case), no matter where the SQL queries come from, whether a MySQL client or Django.
Since Django can work with a pre-existing database (one that wasn't created by Django), I don't think you will have problems if you access or write to the tables of your own app (you might want to avoid modifying Django's internal tables, like auth, permission, content_type, etc., until you are familiar with them).
When you create a model through Django, Django doesn't store the row count or anything like that (unless your app does), so it's okay to create the model with Django on the SQL database and then have another app write to and read from that same SQL table.
If you use Django signals, those will not be triggered by modifying the SQL table directly through the DB, so you might want to pay attention to side effects like that.
Your RDBMS handles its own auto-generated IDs, referential integrity, counts, etc., so you don't have to worry about messing that up.
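For example (a sketch; the Customer model and its table name are hypothetical): a row inserted with raw SQL is immediately visible to the ORM, and the database assigns the auto ID itself.

from django.db import connection
from myapp.models import Customer  # hypothetical model backed by myapp_customer

# Insert directly with SQL; MySQL assigns the AUTO_INCREMENT id.
with connection.cursor() as cursor:
    cursor.execute("INSERT INTO myapp_customer (name) VALUES (%s)", ["Alice"])

# The ORM sees the new row right away; note post_save signals did NOT fire.
alice = Customer.objects.get(name="Alice")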
I'm not sure how to handle the following case (thus my question, obviously).
I have a Django setup with PostgreSQL that contains all the Django model data, but I also have mongoengine managing (let's call it) extended data.
I also have a circular reference between the two (mongo_id points from the Django model to the mongoengine document PK, and db_id points from the mongoengine document to the Django model PK).
Obviously, if I run dumpdata, I only get the Django model data. How can I make it also dump the data from mongoengine? Is there a way to achieve this?
This is to get a backup of the data. Backing up referenced files can easily be done by just grabbing the files on disk.
I did not define another entry in DATABASES in the settings.py file (mainly because I was not required to). Is that what I need to do?
Thanks for any pointers.
As a bonus, I would appreciate it if I could see those mongoengine documents in the admin interface, alongside the base Django models.
First of all, you can dump your MongoDB data using the mongodump command-line tool.
In one project we had to move data from one database to another with a significantly different schema, so we created a management command to do that. Used in a similar manner, it has the advantage of moving only data that is valid for your current Document definitions, leaving out any possible leftovers from older ones.
The dumping part of the management command should contain something like:
from bson import json_util
json_util.dumps([doc.to_mongo() for doc in SomeDocument.objects.all()])
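Wrapped in a management command, a minimal skeleton might look like this (the import path and SomeDocument are assumptions):

from bson import json_util
from django.core.management.base import BaseCommand
from myapp.documents import SomeDocument  # hypothetical mongoengine Document

class Command(BaseCommand):
    help = "Dump mongoengine documents as extended JSON, alongside dumpdata"

    def handle(self, *args, **options):
        # to_mongo() yields BSON-compatible dicts; json_util serializes them.
        docs = [doc.to_mongo() for doc in SomeDocument.objects.all()]
        self.stdout.write(json_util.dumps(docs))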