Storing wall-clock datetimes in Django/Postgres

I want to save a future wall-clock datetime for events in Django (I have timezone string stored separately).
I can't simply use DateTimeField because on Postgres it maps to timestamp with time zone, so the value is pinned to a fixed instant computed with today's timezone rules. That doesn't account for DST or timezone rule changes between the current date and the date of the actual event.
I could use any of these options:
Pick any timezone to store timestamps and always throw this timezone away before applying actual timezone in Python.
Split timestamp to DateField and TimeField.
Store datetime as string.
Custom field that stores datetime as timestamp without time zone.
But each of these makes queries more difficult and seems quite weird.
Are there any better options I'm missing? This use case seems quite common, so I guess there is a better way to do it.
EDIT: my usecase:
Let's say my user wants to book an appointment for 2019-12-20 10:00 and currently it's 2019-03-10. I know the timezone of this user (it's stored separately as a string like 'US/Eastern').
If I assume that EST starts on November 3, 2019, the best I can do is to store the timestamp as 2019-12-20 15:00:00+00:00 (or 2019-12-20 10:00-05:00). I don't want this because:
I have no idea if my tzdata has correct information for future datetime
Even if it currently does, I have no idea if there would be any unexpected change in US/Eastern timezone and it becomes worse when it's not US. Future DST changes are not guaranteed.
If user moves to different timezone, I'll have to recalculate every single appointment while taking care about DST.
If tzdata changes during this recalculation... let's not think about that.
I'd prefer to store future dates as a naive datetime + timezone string like 'US/Eastern' and (almost) never construct a tz-aware datetime for any date more than a week away. Django + Postgres currently forces me to use timestamp with time zone, which is great for logs and past events, but it stores a fixed instant (not even a timezone name), so it doesn't fit future wall-clock datetimes.
For this use case, let's say that I don't care about ambiguous times: not many users want to book at 02:00 AM.
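A minimal sketch of the "naive datetime + timezone string" idea described in the question, using only Python's standard library (zoneinfo needs Python 3.9+ and an installed tz database). The helper name resolve_wall_clock is hypothetical, and 'America/New_York' is used here as an assumed equivalent of 'US/Eastern'. The offset is looked up only when the instant is needed, so it reflects whatever tzdata is installed at that moment.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def resolve_wall_clock(naive_local: datetime, tz_name: str) -> datetime:
    """Attach the named zone to a stored naive wall-clock value and return UTC."""
    if naive_local.tzinfo is not None:
        raise ValueError("expected a naive (wall-clock) datetime")
    aware_local = naive_local.replace(tzinfo=ZoneInfo(tz_name))
    return aware_local.astimezone(timezone.utc)


# What would be stored: the wall-clock value and the zone name, nothing else.
stored_wall_clock = datetime(2019, 12, 20, 10, 0)
stored_tz_name = "America/New_York"

# Resolved as late as possible, with the tzdata available at that moment.
appointment_utc = resolve_wall_clock(stored_wall_clock, stored_tz_name)
print(appointment_utc.isoformat())
```

Ambiguous or skipped local times around DST transitions are not handled here, in line with the simplification stated above.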

I see a few possible solutions:
Set USE_TZ = False and TIME_ZONE = 'UTC' and use calendar times. No conversions will be done, so essentially you're just storing the calendar time and getting it back as a naive datetime. The main problem is that this setting is global, and is not a good one for many uses (e.g. auto_now).
As above, but set USE_TZ = True. As long as you express your calendar times in UTC, there won't be any untoward conversions. The problem here is that you'll be getting aware datetimes, so you'll have to take care to ignore or remove the time zone everywhere.
Use a separate DateField and TimeField. This may or may not be a good solution depending on what kind of queries you're trying to run.
Create your own field that uses timestamp without time zone. (Or perhaps it already exists?)
Note that this issue has nothing to do with past versus future. It's about wanting to use a fixed moment in time versus a calendar (or wall clock) time. The points you raised are certainly valid objections to using a point in time to represent a calendar time.
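As a rough illustration of two of the options listed above, here is a sketch in plain Python with no Django involved. The variable names are made up for the example; in a real project the values would pass through model fields.

```python
from datetime import datetime, timezone

wall_clock = datetime(2019, 12, 20, 10, 0)  # naive: what the user typed

# USE_TZ = True option: label the wall-clock value as UTC before saving ...
value_to_save = wall_clock.replace(tzinfo=timezone.utc)
# ... and drop the label again after loading, without converting anything.
loaded = value_to_save.replace(tzinfo=None)

# Separate-fields option: store the parts, recombine them on read.
date_part, time_part = wall_clock.date(), wall_clock.time()
recombined = datetime.combine(date_part, time_part)

print(loaded, recombined)
```

Note that replace() only changes the label; astimezone() would shift the clock reading, which is exactly what a calendar time should not do.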

Related

Storing unix timestamp as an IntegerField [duplicate]

Which one is best to use, DateTime or INT (Unix Timestamp) or anything else to store the time value?
I think INT will be better at performance and also more universal, since it can be easily converted to many timezones. (my web visitors from all around the world can see the time without confusion)
But I still have doubts about it.
Any suggestions?
I wouldn't use INT or TIMESTAMP to save your datetime values. There is the "Year 2038 problem"! You can use DATETIME and save your datetimes for a long time.
A TIMESTAMP column, or a signed 32-bit INT holding a Unix timestamp, can only cover 1970 through early 2038 (2038-01-19 03:14:07 UTC). With the DATETIME type you can save dates with years from 1000 to 9999.
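The 2038 limit follows directly from the largest value a signed 32-bit integer can hold. A small sketch of that arithmetic (the constant name is just for illustration):

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

# Interpret that number as seconds since 1970-01-01 00:00:00 UTC.
last_second = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_second.isoformat())  # 2038-01-19T03:14:07+00:00
```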
It is not recommended to use a numeric column type (INT) to store datetime information. MySQL (and other systems too) provides many functions to handle datetime information. These functions are faster and more optimized than custom functions or calculations: https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html
To convert the timezone of your stored value to the client timezone you can use CONVERT_TZ. In this case you need to know the timezone of the server and the timezone of your client. To get the timezone of the server you can see some possibilities on this question.
Changing the client time zone: The server interprets TIMESTAMP values
in the client’s current time zone, not its own. Clients in different
time zones should set their zone so that the server can properly
interpret TIMESTAMP values for them.
And if you want to convert the time to a certain time zone, you can do this:
CONVERT_TZ(@dt,'US/Central','Europe/Berlin') AS Berlin,
I wouldn't store it in an int. You should check out MySQL Cookbook by Paul DuBois; he covers lots of things in it, and a big portion of it is about your question.
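For comparison, the kind of conversion CONVERT_TZ performs can also be sketched in application code. This assumes Python 3.9+ with zoneinfo, uses 'America/Chicago' as a stand-in for 'US/Central', and the sample timestamp is made up.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical sample value, interpreted as US Central wall-clock time.
dt_central = datetime(2019, 1, 15, 9, 0, tzinfo=ZoneInfo("America/Chicago"))

# Same instant, expressed on a Berlin wall clock.
dt_berlin = dt_central.astimezone(ZoneInfo("Europe/Berlin"))
print(dt_berlin.isoformat())
```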

Stop Django translating times to UTC

Timezones are driving me crazy. Every time I think I've got it figured out, somebody changes the clocks and I get a dozen errors. I think I've finally got to the point where I'm storing the right value. My times are timestamp with time zone and I'm not stripping the timezone out before they're saved.
TIME_ZONE = 'Europe/London'
USE_I18N = USE_L10N = USE_TZ = True
Here's a specific value from Postgres through dbshell:
=> select start from bookings_booking where id = 280825;
2019-04-09 11:50:00+01
But here's the same record through shell_plus
Booking.objects.get(pk=280825).start
datetime.datetime(2019, 4, 9, 10, 50, tzinfo=<UTC>)
DAMMIT DJANGO, IT WASN'T A UTC TIME!
These times work fine in templates/admin/etc but when I'm generating PDF and spreadsheet reports, this all goes awry and I suddenly have to re-localise the times manually. I don't see why I have to do this. The data is localised. What is happening between the query going to the database and me getting the data?
I bump into these issues so often I have absolutely no confidence in myself here —something quite unnerving for a senior dev— so I lay myself at your feet. What am I supposed to do?
You're interpreting this wrongly. With USE_TZ = True the database stores an absolute point in time. PostgreSQL's timestamp with time zone type does not keep the zone the value was written with; it normalizes the value to UTC on input and converts it to the session's time zone on output, so for practical purposes (*) it's easiest to just think of the time in your db as stored in UTC. It always represents a correct point in time for which you don't need to remember or assume any timezone. And as far as I know, Django will always send the time to the database as an aware datetime in UTC.
So when you're fetching the time using select in psql, you're getting it back in the time zone of your psql session. If someone whose session is set to "America/New_York" ran the same select query, she would see a -04 timestamp. Had the date been 2019-03-20, you'd have seen 2019-03-20 10:50:00+00 because on that date, Europe/London and UTC were the same.
When fetching the value of a DateTimeField as a python datetime.datetime object, Django always fetches the UTC value, because:
Dealing with aware datetime objects isn’t always intuitive. For
instance, the tzinfo argument of the standard datetime constructor
doesn’t work reliably for time zones with DST. Using UTC is generally
safe; if you’re using other time zones, you should review the pytz
documentation carefully.
This makes it easier to work with these datetime objects in your python code: They're always UTC times.
If you want to print these values in a PDF, use the same methods Django uses for the template rendering:
from django.utils import timezone
print(timezone.template_localtime(Booking.objects.get(pk=280825).start))
This renders the datetime in the default timezone (or if you activate() a different timezone, in the current timezone).
(*) Note: Why you should not give any meaning to the timezone shown by your db and just think about it as if it's all UTC: If you were to run servers in various timezones, they might write timestamps with different offsets. They are still all correct (absolute timestamps) and can be converted to any other timezone. So basically the timezone used for saving is meaningless.
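Outside of templates, the same conversion can be sketched with the standard library alone. This is an illustration of the idea, not Django's own implementation; it assumes Python 3.9+ with zoneinfo and reuses the value from the question.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# What the ORM hands back with USE_TZ = True: an aware datetime in UTC.
start_utc = datetime(2019, 4, 9, 10, 50, tzinfo=timezone.utc)

# Convert for display only; the instant itself does not change.
start_london = start_utc.astimezone(ZoneInfo("Europe/London"))
print(start_london.isoformat())  # 2019-04-09T11:50:00+01:00
```

Both objects compare equal because they describe the same moment; only the wall-clock rendering differs.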
Please be aware that both Django and the Postgresql Database have their own timezone setting.
The Django timezone is set in the settings.py file:
TIME_ZONE = 'UTC'
USE_TZ = True
The Postgresql setting can be checked using:
SHOW TIMEZONE;
and set using:
SET TIMEZONE='UTC';
I'm not an expert on this, but I believe Django wants to store everything in UTC in the database and then convert to the Django timezone setting after it has been queried. On that basis I think you want to set the Postgresql timezone to UTC; it's then up to you whether you change the Django setting to get the automatic conversion or leave it as UTC and handle any conversions yourself in code.

timezones and doing analytics on tables

This strange behavior has recently come to my attention while I was testing my Rails app in my local environment, in which I use around_filter to set the timezone to that of the registered user (the default timezone is UTC).
What I did was register a new user in my app. My current time was 10pm GMT-5 (March 3), and this user's created_at time was saved to the database as 3am UTC (March 4). Now, I know that this time is saved in the database according to the timezone settings, but here comes the problem:
I use a graph for visual representation of daily registered users, and when I called the following function to tell me number of users registered in the last few days:
from ||= Date.today - 1.month
to ||= Date.today
where(created_at: from..to).group('DATE(created_at)').count
It would say that this user was registered on March 4, while it was in fact registered on March 3 from my perspective.
My question is:
How should I call the where function and group by the created_at column, so that the dates will be handled correctly (according to my timezone)?
Or is there something else that I should be doing differently?
I'm not a rubyist, so I'll let someone else give the specific code, but I can answer from a general algorithmic perspective.
If you're storing UTC in the database, then you need to query by UTC as well.
In determining the range of the query (the from and to), you'll need to know the start and stop times for "today" in your local time zone, and convert those each to UTC.
For example, I'm in the US Pacific time zone, and today is March 7th, 2015.
from: 2015-03-07T00:00:00-08:00 = 2015-03-07T08:00:00Z
to: 2015-03-08T00:00:00-08:00 = 2015-03-08T08:00:00Z
If you want to subtract a month like you showed in the example, do it before you convert to UTC. And watch out for daylight saving time. There's no guarantee the offsets will be the same.
Also, you'll want to use a half-open interval range that excludes the upper bound. I believe in Ruby that this is done with three dots (...) instead of two (at least according to this).
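The range calculation described above can be sketched as follows. The function name utc_bounds_for_local_day is hypothetical, and the sketch assumes Python 3.9+ with zoneinfo; the dates mirror the Pacific time example.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def utc_bounds_for_local_day(day: date, tz_name: str) -> tuple[datetime, datetime]:
    """Return [start, end) of a local calendar day as UTC datetimes."""
    tz = ZoneInfo(tz_name)
    start_local = datetime.combine(day, time.min, tzinfo=tz)
    # Move to the next calendar day first, then attach the zone, so each
    # bound gets the offset that is valid on its own date.
    end_local = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


start_utc, end_utc = utc_bounds_for_local_day(date(2015, 3, 7), "America/Los_Angeles")
print(start_utc.isoformat(), end_utc.isoformat())
```

A half-open filter then reads start_utc <= created_at < end_utc, so a row stamped exactly at the upper bound falls into the next day only.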
Grouping is usually a bit more difficult. I assume this is a query against a database, right? Well, if the db you're querying has time zone support, then you could use it to convert the date to your time zone before grouping. Something like this (pseudocode):
groupby(DATE(CONVERT_TZ(created_at,'UTC','America/Los_Angeles')))
Since you didn't state what DB you're using, I can't be more specific. CONVERT_TZ is available on MySQL, and I believe Oracle and Postgres both have time zone support as well.
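If the grouping has to happen in application code instead, the same "convert first, then take the date" order applies. A minimal sketch with made-up created_at values and a fixed UTC-5 offset chosen to match the "GMT-5" in the question (a named zone would be the more realistic choice where DST matters):

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset; an assumption made to keep the sketch dependency-free.
local_tz = timezone(timedelta(hours=-5))

# Hypothetical created_at values as stored (UTC).
created_at_utc = [
    datetime(2015, 3, 4, 3, 0, tzinfo=timezone.utc),   # 10pm on March 3 local
    datetime(2015, 3, 4, 15, 0, tzinfo=timezone.utc),  # 10am on March 4 local
]

# Convert to local time first, then take the date, then count.
per_local_day = Counter(ts.astimezone(local_tz).date() for ts in created_at_utc)
print(per_local_day)
```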
Date.today will default to your system's set timezone (which by the way should always be UTC, here's why) so if you want to use UTC, simply do Time.zone.now.to_date if rails is set to UTC
Otherwise you should do
Time.use_zone('UTC') do
Time.zone.now.to_date
end
After this you should display the created_at dates by doing object.created_at.in_time_zone('EST')
to show it in your current timezone

How do I force boost::posix_time to recognize timezones?

I'm reading timestamp fields from a PostgreSQL database. The timestamp column is defined as:
my_timestamp TIMESTAMP WITH TIME ZONE DEFAULT NOW()
When reading from the database, I convert it to a boost timestamp like this:
boost::posix_time::ptime pt( boost::posix_time::time_from_string( str ) );
The problem seems to be that boost::posix_time::time_from_string() ignores the timezone.
For example:
database text string == "2013-05-30 00:27:04.8299-07" // note -07 timezone
boost::posix_time::to_iso_extended_string(pt) == "2013-05-30T00:27:04.829900"
When I do arithmetic with the resulting ptime object, the time is off by exactly 7 hours. Is there something better I should be doing to not lose the timezone information?
I think you should be using boost::local_time::local_date_time, which handles time zones. There is an example in the documentation that is very similar to what you're trying to do: http://www.boost.org/doc/libs/1_41_0/doc/html/date_time/examples.html#date_time.examples.seconds_since_epoch
EDIT: Boost supports date parsing with specific formats. http://www.boost.org/doc/libs/1_40_0/doc/html/date_time/date_time_io.html#date_time.format_flags
string inp("2013-05-30 00:27:04.8299-07");
string format("%Y-%m-%d %H:%M:%S%F%Q");
date d;
// Note: parser and svp must be set up beforehand; this snippet does not show them.
d = parser.parse_date(inp, format, svp);
// d == 2013-05-30 00:27:04.8299-07
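To make the underlying idea concrete (keep the offset while parsing, then normalize), here is a sketch in Python rather than Boost. It assumes the database always renders the offset as a bare hour field such as "-07", which is why two zeros are appended before parsing.

```python
from datetime import datetime, timezone

raw = "2013-05-30 00:27:04.8299-07"

# strptime's %z wants at least hours and minutes, so pad the bare "-07".
parsed = datetime.strptime(raw + "00", "%Y-%m-%d %H:%M:%S.%f%z")

# Arithmetic on the normalized value no longer depends on the offset.
as_utc = parsed.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2013-05-30T07:27:04.829900+00:00
```

The seven-hour difference seen in the question is exactly the offset that gets dropped when the suffix is ignored.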
I originally asked this question so many years ago, I don't even remember doing it. But since then, all my database date/time code on the client side has been greatly simplified. The trick is to tell PostgreSQL the local time zone when the DB connection is first established, and let the server automatically add or remove the necessary hours/minutes when it sends back timestamps. This way, timestamps are always in local time.
You do that with a 1-time call similar to this one:
SET SESSION TIME ZONE 'Europe/Berlin';
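You can also use one of the many timezone abbreviations, though an abbreviation only stands for a fixed UTC offset while a full name also carries the DST rules. For example, these two lines are equivalent: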
SET SESSION TIME ZONE 'Asia/Hong_Kong';
SET SESSION TIME ZONE 'HKT';
The full list of timezones can be obtained with this:
SELECT * FROM pg_timezone_names ORDER BY name;
Note: there are over 1000 timezone names to pick from!
I have more details on PostgreSQL and timezones available on this post: https://www.ccoderun.ca/programming/2017-09-14_PostgreSQL_timestamps/index.html

How to communicate time data between different zone?

In my Django app I've got a Task model with some date and time fields:
class Task(models.Model):
date = models.DateField()
start_time = models.TimeField(help_text='hh:mm')
end_time = models.TimeField(help_text='hh:mm')
# more stuff
I'll send some Task instances to some Android clients that will be in a time zone (TZ1) different from my server time zone (TZ2).
The start_time and end_time fields must be set to the target time zone (TZ1), i.e. if I enter '13:00' in the start_time field in the Task admin, it should be '13:00' in TZ1.
How can I set the start_time and end_time values to be TZ1 times? If I leave the values entered in the default admin I guess the times will be set to the server time zone (TZ2), right?
Then what's the best format to send these values (through JSON) to the Android clients to get the correct TZ1 time?
Now I'm using Python datetime's isoformat(), which gives something like
2013-02-11T13:17:23.811680
but it has no time zone data...
This is not the best way to handle timezones.
The best way is to convert times to UTC as early as possible and convert them back as late as possible.
In other words, if I enter the current time here as Feb 11, 21:03, it should never be stored like that. Instead it should be changed to UTC before anything else happens.
That way, no matter what happens to it, it's correct. If I send it to Inner Mongolia, it should stay as UTC right up until the point someone wants to look at it. Then and only then should it be converted (and for display only).
Following that rule will save you a lot of grief in any software that has to work across multiple timezones. Trust me on that, we fixed a major Telco up after they'd implemented some hideous system that sent timezones across the wire, meaning that every point had to be able to convert to and from every timezone.
Getting them into UTC as quickly as possible, and only getting them back on demand, saved bucketloads of time and money.
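A small sketch of that rule applied to the JSON question above: convert to UTC before serializing, and convert back only when displaying. The UTC+8 client offset and the sample time are assumptions made for the example; a fixed offset is used only to keep it free of dependencies.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical input: 13:17:23 entered for a client at UTC+8.
client_tz = timezone(timedelta(hours=8))
entered = datetime(2013, 2, 11, 13, 17, 23, tzinfo=client_tz)

# Convert to UTC as early as possible; this is what gets stored and sent.
payload_value = entered.astimezone(timezone.utc).isoformat()
print(payload_value)  # 2013-02-11T05:17:23+00:00

# Convert back only for display on the receiving side.
received = datetime.fromisoformat(payload_value)
shown = received.astimezone(client_tz)
print(shown.isoformat())
```

Because the serialized string carries an explicit +00:00, the receiver never has to guess which zone the sender meant.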