Django Datetime not getting saved - django

I am trying to create an object of the following model and save it using the code below.
chk_list_for_batch = ChkListForBatch(batch, chk_point, False, datetime.datetime.now())
chk_list_for_batch.save()
But, I get the following error
django.core.exceptions.ValidationError: ["'2019-05-25 11:20:23.240094'
value must be either True or False."]
I searched but couldn't find any direction. Kindly suggest.

This happens because positional arguments to a model constructor are matched to fields in the order they are declared on the model, starting with the automatic id primary key, so your datetime most likely ends up assigned to a BooleanField. Pass the values as keyword arguments instead. You can create the object and save it to the database in one step like this:
from django.utils import timezone

chk_list_for_batch = ChkListForBatch.objects.create(
    batch='batch',
    chk_point='chk_point',
    some_field=False,
    creation_time=timezone.now(),
)
Note that objects.create() already saves the object, so a separate save() call is not needed.
The docs explain how to create objects in more detail.
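The positional-argument pitfall can be shown without Django at all. The sketch below uses a hypothetical plain-Python function (not part of any library) whose parameter order mimics a model's field order, including the implicit leading id, so every positional value shifts by one slot:

```python
# Hypothetical stand-in for a model constructor: the first parameter
# plays the role of Django's automatic ``id`` field.
def make_record(id=None, batch=None, chk_point=None, flag=None, created=None):
    return {"id": id, "batch": batch, "chk_point": chk_point,
            "flag": flag, "created": created}

# Keyword arguments land where intended:
ok = make_record(batch="b1", chk_point="cp", flag=False, created="2019-05-25")
print(ok["flag"])   # False

# Positional arguments are shifted by the leading ``id`` parameter,
# so the date string lands in the boolean slot:
bad = make_record("b1", "cp", False, "2019-05-25")
print(bad["flag"])  # 2019-05-25
```

This mirrors the original error: the datetime value was being validated against a boolean field one position earlier than intended.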

Related

Getting the list of timezones supported by PostgreSQL in Django using RawSQL

I am trying to get the list of all the timezones supported by the PSQL database in my Django project, so I can validate timestamps with timezones before sending them to the database. I asked another question and got an answer regarding the PSQL query here:
How to get the list of timezones supported by PostgreSQL?
Using that, I am trying to do the following:
from django.db.models.expressions import RawSQL
RawSQL("SELECT name, abbrev, utc_offset, is_dst FROM pg_timezone_names;", [])
However, it does not seem to work. I saw the docs for RawSQL, and it usually has a model attached to it, which I can't really have. How do I solve this issue? Thanks.
The following should work for you:
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("SELECT name, abbrev, utc_offset, is_dst FROM pg_timezone_names")
    zones = cursor.fetchall()
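fetchall() returns plain tuples; if you would rather have dict rows keyed by column name, the "dictfetchall" recipe from the Django docs applies. A sketch of that pattern, with the stdlib sqlite3 module standing in for Django's connection.cursor() (the cursor.description mechanism works the same way):

```python
import sqlite3

# An in-memory database standing in for the Postgres connection;
# the SELECT fakes one pg_timezone_names-style row.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("SELECT 'UTC' AS name, 'UTC' AS abbrev, 0 AS utc_offset")
rows = cur.fetchall()

# cursor.description holds the column names; zip each tuple row
# against them to build dicts (the Django docs' dictfetchall recipe).
columns = [col[0] for col in cur.description]
zones = [dict(zip(columns, row)) for row in rows]
print(zones)  # [{'name': 'UTC', 'abbrev': 'UTC', 'utc_offset': 0}]
```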

'idf vector is not fitted' error when using a saved classifier/model

Pardon me if I use the wrong terminology, but what I want is to train a model on a set of data (using GaussianNB, the Naive Bayes classifier from scikit-learn), save the model/classifier, and then load it whenever I need it to predict a category.
from sklearn.externals import joblib
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_extraction.text import TfidfVectorizer
self.vectorizer = TfidfVectorizer(decode_error='ignore')
self.X_train_tfidf = self.vectorizer.fit_transform(train_data)
# Fit the model to my training data
self.clf = self.gnb.fit(self.X_train_tfidf.toarray(), category)
# Save the classifier to file
joblib.dump(self.clf, 'trained/NB_Model.pkl')
# Save the vocabulary to file
joblib.dump(self.vectorizer.vocabulary_, 'trained/vectorizer_vocab.pkl')
#Next time, I read the saved classifier
self.clf = joblib.load('trained/NB_Model.pkl')
# Read the saved vocabulary
self.vocab =joblib.load('trained/vectorizer_vocab.pkl')
# Initializer the vectorizer
self.vectorizer = TfidfVectorizer(vocabulary=self.vocab, decode_error='ignore')
# Try to predict a category for new data
X_new_tfidf = self.vectorizer.transform(new_data)
print self.clf.predict(X_new_tfidf.toarray())
# After running the predict command above, I get the error
'idf vector is not fitted'
Can anyone tell me what I'm missing?
Note: The saving of the model, the reading of the saved model and trying to predict a new category are all different methods of a class. I have collapsed all of them into a single screen here to make for easier reading.
Thanks
You need to pickle the self.vectorizer and load it again. Currently you are only saving the vocabulary learnt by the vectorizer.
Change the following line in your program:
joblib.dump(self.vectorizer.vocabulary_, 'trained/vectorizer_vocab.pkl')
to:
joblib.dump(self.vectorizer, 'trained/vectorizer.pkl')
And the following line:
self.vocab =joblib.load('trained/vectorizer_vocab.pkl')
to:
self.vectorizer =joblib.load('trained/vectorizer.pkl')
Delete this line:
self.vectorizer = TfidfVectorizer(vocabulary=self.vocab, decode_error='ignore')
Problem explanation:
You are correct in your thinking to just save the vocabulary learnt and reuse it. But the scikit-learn TfidfVectorizer also has an idf_ attribute, which contains the IDF values for the saved vocabulary, so you would need to save that as well. And even if you save both and load them into a new TfidfVectorizer instance, you will still get the "not fitted" error, because that's just how most scikit-learn transformers and estimators are defined. So, without doing anything hacky, saving the whole vectorizer is your best bet. If you still want to go down the save-the-vocabulary path, take a look here for how to do it properly:
http://thiagomarzagao.com/2015/12/08/saving-TfidfVectorizer-without-pickles/
That page saves the vocabulary as JSON and idf_ as a plain array. You can use pickles there instead, but it will give you a good idea of how TfidfVectorizer works internally.
Hope it helps.
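Putting the fix together, here is a minimal end-to-end sketch of the save/reload round trip. It uses toy data and stdlib pickle.dumps/pickle.loads in memory instead of joblib files (the principle is identical; note that sklearn.externals.joblib was removed in recent scikit-learn, so you would import joblib directly today):

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB

# Toy training data (stand-ins for the real corpus and categories)
train_data = ["good movie", "bad movie", "great film", "awful film"]
categories = ["pos", "neg", "pos", "neg"]

vectorizer = TfidfVectorizer(decode_error="ignore")
X_train = vectorizer.fit_transform(train_data)
clf = GaussianNB().fit(X_train.toarray(), categories)

# Serialize the *whole* fitted vectorizer, not just vocabulary_
vec_bytes = pickle.dumps(vectorizer)
clf_bytes = pickle.dumps(clf)

# "Next time": restore both and predict without refitting
vectorizer = pickle.loads(vec_bytes)
clf = pickle.loads(clf_bytes)
X_new = vectorizer.transform(["great movie"])
print(clf.predict(X_new.toarray()))
```

Because the restored vectorizer carries its vocabulary_ and idf_ state, transform() works without raising the "idf vector is not fitted" error.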

Get the task_name from AsyncResult when submitting chains in celery

How am I supposed to retrieve a task's name when I've got its AsyncResult object and consequently its id?
For example if I launch two of my tasks in a chain:
>>> task_chain = (task_A.s() | task_B.s())
>>> async_result = task_chain.apply_async()
I can retrieve the id of task_B and consequently task_A using the internal _parents() method like this:
>>> async_result.id
>>> 2ed28e84-0673-4491-a56f-c5ab8dfb5725
>>> async_result._parents()[0].id
>>> e793f4dc-5110-4f57-8f98-8caa48c40528
However when I attempt to retrieve the task_name I get nothing back:
>>> async_result.task_name
>>> async_result._parents()[0].task_name
Why is this happening? Could this possibly be a bug?
I have noticed that by submitting a single task, the task_name attribute of AsyncResult works perfectly fine and returns the proper task name.
Is there any other way to retrieve a task's name from the AsyncResult object?
Thank you all in advance.
P.S.
I have already found a similar question here, but no one seems to propose a practical and working solution.
celery-users
UPDATE
It seems I've hit a wall with this one. There is an open ticket on GitHub about the exact same issue, with the difference that it concerns groups instead of chains:
https://github.com/celery/celery/issues/2504
One way is to save the result into the cache, like this:
cache.set(hash_key, result, 30) # save result to cache for 30 seconds
and you can retrieve it by its key:
result = cache.get(hash_key)
Assuming it's not an in-memory cache.
UPDATE
Sorry, I misinterpreted the question.
What I think happened is that the AsyncResult for the task_chain gets initialized with only the task id, as that's the only requirement. Maybe try this:
async_result = task_chain.apply_async(name="mytasks")
but I wouldn't count on it.
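For the record, newer Celery releases (4.3 and later) can store extended task metadata, including the task name, in the result backend when result_extended is enabled. A hedged sketch of the configuration (this needs a running broker and result backend, so treat it as a config fragment rather than a drop-in fix):

```python
# celeryconfig.py (or app.conf) -- assumes Celery >= 4.3
result_extended = True  # store name, args, kwargs in the result backend

# With that enabled, AsyncResult exposes the task name:
# async_result = task_chain.apply_async()
# async_result.name  # e.g. 'tasks.task_B'
```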

How to insert a row of data to a table using Django's ORM

How do I insert a row of data into a SQL table using Django's ORM?
If you need to insert a row of data, see the "Saving objects" documentation for the save method on the model.
FYI, you can perform a bulk insert. See the bulk_create method's documentation.
In fact it's mentioned in the first part of the "Writing your first Django app" tutorial.
As mentioned in the "Playing with the API" section:
>>> from django.utils import timezone
>>> p = Poll(question="What's new?", pub_date=timezone.now())
# Save the object into the database. You have to call save() explicitly.
>>> p.save()
# Now it has an ID. Note that this might say "1L" instead of "1", depending
# on which database you're using. That's no biggie; it just means your
# database backend prefers to return integers as Python long integer
# objects.
>>> p.id
1
Part 4 of the tutorial explains how to use forms and how to save objects from user-submitted data.
If you don't want to call the save() method explicitly, you can create and save a record in one step with MyModel.objects.create(p1=v1, p2=v2, ...):
fruit = Fruit.objects.create(name='Apple')
# get fruit id
print(fruit.id)
See documentation
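The bulk_create mentioned above inserts many rows in a single query. A short sketch, assuming the same hypothetical Fruit model with a name field as in the example above:

```python
# Assumes a Fruit model with a `name` field, as in the example above
Fruit.objects.bulk_create([
    Fruit(name='Apple'),
    Fruit(name='Banana'),
    Fruit(name='Cherry'),
])
```

Keep in mind that bulk_create does not call each object's save() method or send the pre_save/post_save signals, and on some database backends it does not set the primary keys on the returned objects.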

Django File-based session doesn't expire

I just realized that my session doesn't expire when I use the file-based session engine. Looking at the Django code for file-based sessions, Django doesn't store any expiration information for a session, so it never expires unless the session file gets deleted manually.
This looks like a bug to me, as the database-backed session works fine, and I believe regardless of what session back-end developer chooses, they all should behave similarly.
Switching to database-backed session is not an option for me, as I need to store user's session in files.
Can anyone shed some light?
Is this really a bug?
If yes, how do you suggest me to work around it?
Thanks!
So it looks like you're right. At least in Django 1.4, using django.contrib.sessions.backends.file totally ignores SESSION_COOKIE_AGE. I'm not sure whether that's really a bug or just undocumented.
If you really need this functionality, you can create your own session engine based on the file backend in contrib, but extend it with expiry functionality.
Open django/contrib/sessions/backends/file.py and add the following imports:
import datetime
from django.utils import timezone
Then, add two lines to the load method, so that it appears as below:
def load(self):
    session_data = {}
    try:
        session_file = open(self._key_to_file(), "rb")
        if (timezone.now() - datetime.datetime.fromtimestamp(os.path.getmtime(self._key_to_file()))).total_seconds() > settings.SESSION_COOKIE_AGE:
            raise IOError
        try:
            file_data = session_file.read()
        # Don't fail if there is no data in the session file.
        ....
This will actually compare the last modified date on the session file to expire it.
Save this file somewhere in your project and use its path as your SESSION_ENGINE instead of 'django.contrib.sessions.backends.file'.
You'll also need to enable SESSION_SAVE_EVERY_REQUEST in your settings if you want the session to timeout based on inactivity.
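The mtime comparison in the patched load() can be checked in isolation. The sketch below uses a throwaway temp file as the "session file" and a plain constant standing in for settings.SESSION_COOKIE_AGE:

```python
import os
import tempfile
from datetime import datetime

SESSION_COOKIE_AGE = 3600  # stand-in for settings.SESSION_COOKIE_AGE

# Create a throwaway "session file"
fd, path = tempfile.mkstemp()
os.close(fd)

# Age of the file = now minus its last-modified timestamp, in seconds
age = (datetime.now() - datetime.fromtimestamp(os.path.getmtime(path))).total_seconds()
expired = age > SESSION_COOKIE_AGE
print(expired)  # False: the file was just created

os.remove(path)
```

In the real backend, an expired file triggers the raise IOError path, which makes Django treat the session as missing and start a fresh one.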
An option would be to use tmpwatch in the directory where you store the sessions
I hit a similar issue on Django 3.1. In my case, my program calls set_expiry(value) with an integer argument before checking session expiry.
According to the Django documentation, the value argument to set_expiry() can be an int, a datetime, or a timedelta. However, for file-based sessions, the expiry check inside load() doesn't work properly when an int was passed to set_expiry() beforehand; the problem doesn't happen with datetime or timedelta arguments.
The simple solution (workaround?) is to avoid passing an int to set_expiry(value). You can do so by subclassing django.contrib.sessions.backends.file.SessionStore and overriding set_expiry(value) (code sample below), then changing the SESSION_ENGINE parameter accordingly in settings.py:
from datetime import timedelta
from django.contrib.sessions.backends.file import SessionStore as FileSessionStore

class SessionStore(FileSessionStore):
    def set_expiry(self, value):
        """Force conversion to timedelta format."""
        if value and isinstance(value, int):
            value = timedelta(seconds=value)
        super().set_expiry(value=value)
Note:
It's also fine to pass a timedelta or datetime to set_expiry(value), but then you will need to handle the serialization issue with datetime objects.
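The conversion logic in the override can be verified in isolation. Here normalize_expiry is a hypothetical free-standing helper (not part of Django) that mirrors the body of the overridden set_expiry:

```python
from datetime import timedelta

def normalize_expiry(value):
    # Mirrors the override above: coerce a bare int (seconds)
    # into a timedelta before handing it to the base class.
    if value and isinstance(value, int):
        return timedelta(seconds=value)
    return value

print(normalize_expiry(3600))                # 1:00:00
print(normalize_expiry(timedelta(hours=2)))  # 2:00:00
```

datetime and timedelta values pass through unchanged, so only the problematic int case is rewritten.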