We are using Django with its ORM in connection with an underlying PostgreSQL database and want to extend the data model and technology stack to store massive amounts of time series data (~5 million entries per day onwards).
The closest questions I found were this and this, which propose combining Django with databases such as TimescaleDB or InfluxDB. But this creates parallel structures to Django's built-in ORM and thus does not seem straightforward.
How can we handle large amounts of time series data while preserving or staying really close to Django's ORM?
Any hints on proven technology stacks and implementation patterns are welcome!
Your best option is to keep your relational data in Postgres and your time series data in a separate database, and to combine them in your code when needed.
With InfluxDB you can do this join with a Flux script by passing it the SQL that Django's ORM would execute, along with your database connection info. This will return your data in InfluxDB's format though, not Django models.
Why not run TimescaleDB in parallel with your existing Postgres for the time series data, and use this Django integration for it: https://pypi.org/project/django-timescaledb/.
Using multiple databases in Django is possible, although I have not done it myself so far. Have a look here for a convenient way to do it (routing certain models to another database instead of the default Postgres one):
Using Multiple Databases with django
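To make the "reroute certain models" idea concrete, here is a minimal sketch of a database router. The app label `timeseries` and the database alias `timescale` are made-up names for illustration; the four method names are the hooks Django's router interface looks for.

```python
class TimeSeriesRouter:
    """Send every model of one app to a second database alias."""

    # Both values are assumptions for this sketch, not names from the question.
    app_label = "timeseries"  # hypothetical app that holds the time series models
    db_alias = "timescale"    # hypothetical key in settings.DATABASES

    def _is_timeseries(self, model_or_obj):
        return model_or_obj._meta.app_label == self.app_label

    def db_for_read(self, model, **hints):
        # None means "no opinion", so Django falls back to the default database.
        return self.db_alias if self._is_timeseries(model) else None

    def db_for_write(self, model, **hints):
        return self.db_for_read(model, **hints)

    def allow_relation(self, obj1, obj2, **hints):
        first, second = self._is_timeseries(obj1), self._is_timeseries(obj2)
        if first and second:
            return True
        if first or second:
            return False  # a foreign key cannot span two databases
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == self.app_label:
            return db == self.db_alias
        if db == self.db_alias:
            return False  # keep all other apps out of the time series database
        return None
```

The router would then be listed in `DATABASE_ROUTERS` in settings.py using its dotted path, which depends on where you put the class in your project.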
I have seen several questions and answers on SO, most were three years old or older and I looked at the Django documentation (hoping I didn't miss it). I have to have a 9+ digit number for an id. Most responses were to do this at the database. I am guessing that means to create the model in Django and then go back to the database and change the id column Django created with a new starting/next value attribute on the column.
If not, how can I create a database table from Django, Code First, with an id column that starts at 100000000? And can it be done with the stock model object methods in Django? I don't really want to do a special hack. If a hack is the only way, I can go to the database and fix the column. I was trying to adhere to the Code First ideas of Django (though I prefer database first, and am afraid using inspectdb will make a mess).
Edit: I didn't want to use UUID. I believe BigAutoField is best.
You should be able to do this in two steps:
1 - Specify your primary key explicitly using primary_key=True in your model definition. See the Django docs for more info. You can then specify BigAutoField or whatever other type you want for the primary key.
2A - If you're populating the database up front, just set pk: 100000000 in your fixture.
OR
2B - If you're not populating the database up front, use Django Model Migration Operations RunSQL as detailed here. For your SQL use ALTER TABLE tableName AUTO_INCREMENT=100000000 (this is MySQL syntax; other databases use a different statement).
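As a rough sketch of step 2B, the helper below builds the statement you would hand to RunSQL. The function name, the table name `myapp_order`, and the PostgreSQL branch are my own additions for illustration; only the MySQL statement comes from the answer above.

```python
def sequence_start_sql(vendor, table, start, column="id"):
    """Return SQL that makes the next auto-generated id equal to `start`.

    `table` and `column` are interpolated directly, so pass only trusted names.
    """
    start = int(start)
    if vendor == "mysql":
        return f"ALTER TABLE {table} AUTO_INCREMENT = {start};"
    if vendor == "postgresql":
        # The third argument `false` makes the next nextval() return `start` itself.
        return (
            f"SELECT setval(pg_get_serial_sequence('{table}', '{column}'), "
            f"{start}, false);"
        )
    raise ValueError(f"no statement known for vendor {vendor!r}")


# In a migration file (needs Django, so shown only as a comment here):
#   operations = [
#       migrations.RunSQL(sequence_start_sql("mysql", "myapp_order", 100000000)),
#   ]
```

The migration containing this operation has to run after the one that creates the table, so it should list that migration in its dependencies.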
I am currently developing a server using Flask/SQLAlchemy. When an ORM model is not present as a table in the database, SQLAlchemy creates it by default.
However, when an ORM class is changed, for instance when an extra column is added, the change is not applied to the database. So the extra column is missing every time I query. I have to adjust my DB manually every time there is a change in the models that I use.
Is there a better way to apply changes in the models during development? I hardly think manual MySQL manipulation is the best solution.
You can proceed as follows:
new_column = Column('new_column', String, default='some_default_value')
new_column.create(my_table, populate_default=True)
You can find more details about sqlalchemy-migrate at: https://sqlalchemy-migrate.readthedocs.org/en/latest/changeset.html
NOTE: I deleted the question as it existed previously and providing only the relevant info here.
Our database server (RH) has TIME_ZONE = "Europe/London" specified. And, within the Django settings.py, we specify TIME_ZONE = "America/New_York".
And, in my Model class I have specified:
created = models.DateTimeField(editable=False,auto_now=False, auto_now_add=True)
modified = models.DateTimeField(editable=False,auto_now=True, auto_now_add=True)
When I then go look at the data in the admin site, I get UTC/GMT time instead of Eastern.
I thought that all time is adjusted automagically by Django since I specified "America/New_York" as Django's Time Zone.
Any help/clarification is appreciated.
Thanks
Eric
Relying on date/time 'automagic' is dangerous and the auto_now / auto_now_add model parameters are a trap. Always understand the timezone(s) you are dealing with. Python makes this easier by attaching a tzinfo member to its datetime objects. While these objects are 'naive' by default, I encourage you to always attach tzinfo detail. Still, Python needs some extra help from either python-dateutil or pytz (what I use). Here's a universal rule though: always store your datetimes in a database as UTC.
Why? Your users may be in different locales, mobile phones and laptops travel, servers are misconfigured or mirrored in different timezones. So many headaches. Datetimes should never be naive, and if they are (as in a database) and you need the context, also include a timezone field in the table.
So in your case.
Don't use the auto_now fields, use a custom save() instead.
Store UTC in the database
If you need to know the timezone - for say a user event - store the timezone in the database as well.
Convert to the necessary/requested timezone
If you are using pytz, the localize() method is great. Python's datetime object has the useful replace() and astimezone().
One more note, if your database is timezone naive (like MySQL) make sure your datetimes are in UTC and then use replace(tzinfo=None) because the database connector can't handle tz-aware objects.
Here is a thread with detail on Django's auto_now fields.
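The rules in the answer above (store UTC, strip tzinfo only for naive databases, convert on the way out) can be sketched roughly like this. The helper names are made up, and a fixed-offset zone stands in for a real one so the snippet has no dependencies. In a custom save() you would call something like `to_storage` on the value before writing it.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def to_storage(dt, naive_db=False):
    """Normalize an aware datetime to UTC before saving it."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("refusing to store a naive datetime; attach tzinfo first")
    as_utc = dt.astimezone(UTC)
    # For a timezone-naive database, drop tzinfo only after converting to UTC.
    return as_utc.replace(tzinfo=None) if naive_db else as_utc


def from_storage(dt, target_tz):
    """Convert a stored UTC value to the zone requested for display."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=UTC)  # naive values from the database are UTC by convention
    return dt.astimezone(target_tz)


# Fixed offset used only to keep the sketch dependency-free. Real code would
# pass a full zone, e.g. zoneinfo.ZoneInfo("America/New_York") or a pytz zone.
EDT = timezone(timedelta(hours=-4), "EDT")

event = datetime(2010, 7, 30, 15, 11, 22, tzinfo=EDT)
stored = to_storage(event, naive_db=True)  # naive value holding 19:11:22 UTC
shown = from_storage(stored, EDT)          # back to 15:11:22 in the EDT offset
```

Raising on naive input is a design choice of this sketch: it forces the caller to say which zone a value is in instead of guessing.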
The simplest/fastest fix [said above by Ajay Yadav] is this:
Just add a TIME_ZONE attribute to the database's section in settings.py.
settings.py
# Database
# https://docs.djangoproject.com/en/3.1/ref/settings/#databases
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3',
'NAME': BASE_DIR / 'db.sqlite3',
'TIME_ZONE': 'Asia/Tokyo',
}
}
For the available time zone choices, see the official documentation linked below:
DJANGO TIMEZONE CHOICES
First off, I would want to store my data as UTC because it's a good starting point.
So let me ask this: why do you need the time in EST? Is this for the end user, or do you need to do logic on the server and need it in EST?
If it's for the end user, an easy fix is to let the user's browser handle converting to the correct time. On the server, convert the datetime object to a timestamp:
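# needs "import calendar"; timegm() reads the tuple as UTC, whereas time.mktime() would read it as server-local time
timestamp = calendar.timegm(datetime_obj.utctimetuple()) * 1000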
And then on the web page instantiate a Date object:
var date_obj = new Date({{ timestamp }});
var datetime_string = date_obj.toString();
// the datetime_string will be in the users local timezone
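For the server-side half of this, a small helper along these lines may be easier to reuse. It is only a sketch: `to_epoch_millis` is a made-up name, and it assumes that naive values coming from the database are UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Return milliseconds since the Unix epoch, as JavaScript's Date expects."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: stored values are UTC
    # Dividing two timedeltas with // yields an exact integer, no float rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)
```

The result can be passed to the template as `timestamp` and fed to `new Date(...)` as in the snippet above.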
On the other hand, if you want to have the time in the correct zone on the server so you can perform logic on it, I recommend using python-dateutil. It will allow you to easily swap to a different timezone:
from datetime import datetime
from dateutil import tz  # dateutil.zoneinfo.gettz is deprecated in favor of dateutil.tz.gettz
from_zone = tz.gettz('UTC')
to_zone = tz.gettz('America/New_York')
utc = created # your datetime object from the db
# Tell the datetime object that it's in UTC time zone since
# datetime objects are 'naive' by default
utc = utc.replace(tzinfo=from_zone)
# Convert time zone
eastern_time = utc.astimezone(to_zone)
Now if you really want to store the datetime in EST, you need to change the time zone on the DB server (like Ajay Yadav and gorus said). I don't know why you want to store them as EST, but then again I don't know what your application is.
When you say auto_now_add=True, the value will be added by your database server and not your django server. So you need to set time zone on your database server.
Since you edited the question, I'll edit my answer :) Django cannot control the time zone of your db, so the way to fix this is to update the time zone for your db. For MySQL, run this query:
SELECT @@global.time_zone, @@session.time_zone;
This should return SYSTEM, SYSTEM by default, which in your case means "Europe/London", and the cause of your problem. Now that you've verified this, follow the instructions in the first comment on this page:
http://dev.mysql.com/doc/refman/5.5/en/time-zone-support.html
Remember to restart the MySQL server after you've updated the time zone for the changes to take effect.
Is there a way to run a custom SQL statement in Django? I have some timestamp fields in my database that have timezone information. Normally you could just enter the time in a format like 2010-7-30 15:11:22 EDT, and in my case PostgreSQL will figure it out. But Django treats timestamps as datetimes, which don't store timezone information, so I can't just update the model object with this string and save it. Any ideas?
I somehow must have missed the link in the documentation that covers this: http://docs.djangoproject.com/en/dev/topics/db/sql/#executing-custom-sql-directly.
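For anyone landing here, this is roughly what that looks like for the timestamp case. The table name `myapp_event`, the column `created`, and the helper name are hypothetical; the helper takes any cursor with an `execute(sql, params)` method, which is what Django's `connection.cursor()` gives you.

```python
# Hypothetical table and column names, for illustration only.
UPDATE_CREATED_SQL = "UPDATE myapp_event SET created = %s WHERE id = %s"


def set_created_raw(cursor, row_id, timestamp_text):
    """Write a timestamp string as-is and let the database parse the zone."""
    # Values go in as parameters, never via string formatting, to avoid SQL injection.
    cursor.execute(UPDATE_CREATED_SQL, [timestamp_text, row_id])


# With Django (left as a comment because it needs a configured project):
#   from django.db import connection
#   with connection.cursor() as cursor:
#       set_created_raw(cursor, 42, "2010-7-30 15:11:22 EDT")
```

Keeping the SQL in one small function means the rest of the code can keep using the ORM and only this write bypasses it.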