Django - How to store emojis in postgres DB properly? - django

I'm running the latest version of Django on Postgres. I'm trying to store emojis in my Postgres DB in a way that a React Native app can render them properly. Below I have the initial emoji variables set up that will go into the table. I copied and pasted the emojis from here. How do I store emojis in my Postgres DB so that a React Native app can render them properly?
I tried following this blog, which suggests adding 'OPTIONS': {'charset': 'utf8mb4'} to DATABASES under settings.py, but I get this error django.db.utils.ProgrammingError: invalid dsn: invalid connection option "charset". Seems like this only works for MySQL DBs. How can I store emojis in a Django postgres DB?

As suggested in the comments, you need to put quotes around the emojis, since they're just characters. Some emojis, such as flags, are actually two characters (two Unicode code points), so that's something to be careful about. All your computer does is convert the Unicode into a rendered emoji, and the rendering is platform dependent.
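To illustrate the point about flags, here is a minimal Python sketch. The variable names and the code_points helper are made up for this example; the emojis are written as escapes so the snippet does not depend on the editor's encoding.

```python
# A flag emoji is a pair of "regional indicator" code points, so Python
# counts it as two characters even though it renders as one glyph.
smiley = "\U0001F604"             # a single code point (grinning face)
flag_fr = "\U0001F1EB\U0001F1F7"  # regional indicators F + R (French flag)


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(len(smiley), code_points(smiley))
print(len(flag_fr), code_points(flag_fr))

# Encoded as UTF-8, an emoji outside the Basic Multilingual Plane takes 4 bytes.
print(len(smiley.encode("utf-8")))
```

This matters mostly when you set max_length on a CharField or truncate strings: a length counted in code points can be larger than the number of emojis the user sees.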
The emojis you're using should be supported by Unicode; on your computer, they definitely are. For the most part, client machines add support for new emojis very quickly once they are published. There should be no problem with emojis in strings. This is a nice video kinda explaining emojis by Tom Scott, who keeps getting interviewed about emojis: https://www.youtube.com/watch?v=sTzp76JXsoY
I'm not an expert so please correct me if I'm wrong.

In your models, use a CharField or a TextField to store emojis, and pass them as strings (for example "😄" and not a bare 😄). Your database must use UTF-8 to support emojis. Connect to your database with a SQL shell and, to check the current encoding, run:
SHOW CLIENT_ENCODING;
If the output is not UTF8, run the following (note that SET only changes the encoding for the current session):
SET CLIENT_ENCODING='UTF8';
Now remove 'OPTIONS': {'charset': 'utf8mb4'} from your Django settings.
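For reference, a Postgres entry in settings.py without the MySQL-only option might look like the sketch below. The database name, user, password, host and port are placeholders, not values from the question.

```python
# settings.py fragment (all names and credentials are hypothetical).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",
        "USER": "myuser",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
        # No 'charset' entry under OPTIONS: that key is a MySQL option and
        # is what triggers the "invalid connection option" error on Postgres.
    }
}
```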

Related

How can I read a dictionary from database without it being turned into a string by django?

My database contains a dictionary. When I read the dictionary from the database and try to do something with it, it fails because the dictionary has been automatically converted into a string. Is there any way to avoid Django turning the dict into a string?
You can also use simplejson.loads() and simplejson.dumps() to deserialize and serialize the dictionary. It is a bit more work, but it ensures that you are not dependent on the database.
There are options for MySQL and Postgres, but I don't think there's an equivalent for SQLite.
For MySQL JSONField: https://django-mysql.readthedocs.io/en/latest/model_fields/json_field.html
Similarly for Postgres:
https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/fields/#jsonfield
There's built-in support for querying the contents of the fields, which is pretty neat. The docs show examples.
Solved the issue with help from the responses I got here.
When capturing and saving the JSON from the webhook (as that's where the JSON is coming from in my project), I had to do the strange step of deserialising and re-serialising the JSON before saving it to my database. This process got rid of all the \r and \t characters which are passed in request.body but make the JSON invalid:
t = Transaction(data=json.dumps(json.loads(request.body)))
t.save()
To load the JSON from database into a python dictionary that I can then use in my code I used json.loads:
data = json.loads(t.data)
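The round trip described above can be tried outside Django with the standard json module. In this sketch, normalize_json_body and the sample payload are made up for illustration, and raw_body stands in for request.body.

```python
import json


def normalize_json_body(body):
    """Parse a raw request body and re-serialize it as single-line JSON text."""
    return json.dumps(json.loads(body))


# Stand-in for request.body: bytes containing \r, \n and \t between tokens.
raw_body = b'{\r\n\t"amount": 10,\r\n\t"currency": "EUR"\r\n}'

stored = normalize_json_body(raw_body)  # what would be saved in the text column
print(stored)

data = json.loads(stored)  # back to a Python dict when reading it out
print(data["amount"])
```

Storing the re-serialized string means every row holds JSON produced by the same serializer, so json.loads on read is predictable.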

django json field: which one?

I am looking for a JSON field for Django.
I have found mainly two jsonfield apps, and I am not sure which one I should use.
The main difference I see is that the first one does not have the native JSON datatype support for PostgreSQL anymore.
It has been removed recently (https://github.com/bradjasper/django-jsonfield/commit/15957c9dab18c546ae5c119f8a6057e5db6b2135). It was related to this issue https://github.com/bradjasper/django-jsonfield/issues/57
but I am not sure if it's the right approach since JSONB is also coming soon with PostgreSQL 9.4. I think it's better to use the native datatype when using PostgreSQL. What do you think?
1) https://github.com/bradjasper/django-jsonfield
2) https://bitbucket.org/schinckel/django-jsonfield/
Since Django 1.9, JSON support is back again with JSONField:
https://docs.djangoproject.com/en/1.9/ref/contrib/postgres/fields/

Django query returning non-unicode strings?

I'm completely baffled by a problem I found today: I have a PostgreSQL database with tables which are not managed by Django, and completely normal queries via QuerySet on these tables. However, I've started getting Unicode exceptions and when I went digging, I found that my QuerySets are returning non-Unicode strings!
Example code:
d = Document.objects.get(id=45787)
print repr(d.title), type(d.title)
The output of the above statement is a normal string (without the u prefix), followed by a <str> type identifier. What's more, this normal string contains UTF-8 data as expected, in raw byte form! If I call d.title.decode('utf-8'), I get valid Unicode strings!
Even more puzzling, some of the fields work correctly. This same table / model contains another field, html_filename of the same type (TextField) which is returned correctly, as a Unicode string!
I have no special options, the database data is correctly encoded, and I don't even know where to begin searching for a solution. This is Django 1.6.2.
Update:
Database Server encoding is UTF8, as usual, and the data is correctly encoded. This is on PostgreSQL 9.1 on Ubuntu.
Update 2:
I think I may have found the cause, but I don't know why it behaves this way: I thought the database fields were defined with the text type, as usual, but instead they are defined as citext (http://www.postgresql.org/docs/9.1/static/citext.html). Since the Django model is unmanaged, it looks like Django doesn't interpret the field type as being worthy of converting to Unicode. Any ideas how to force Django to do this?
Apparently, Django will not treat fields of type citext as textual and return them as Unicode strings.
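Until the field type is handled properly, one workaround is to decode such values yourself, as the question already does with d.title.decode('utf-8'). The helper below is a hypothetical sketch written in Python 3 syntax (the question uses Python 2, where the byte string type is str); it assumes the stored data is valid UTF-8.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding it only if it is raw bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-ins for values read from a citext column and from a text column.
raw_title = b"caf\xc3\xa9"
plain_title = "already text"

print(to_text(raw_title))
print(to_text(plain_title))
```

Because values that are already text pass through unchanged, the same helper can be applied to every field without checking which ones are affected.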

Django: How can I determine why Django isn't displaying certain data?

I have a Django app that runs a tool and displays the results from the tool back to the user using a Django template. Sometimes Django does not display the results. It doesn't complain about anything, it just doesn't display the results. I'm guessing this is something to do with one or more of the characters in the results being illegal as far as Django is concerned. How can I get more information about what it is that Django doesn't like? Also, is there some method I can use to filter out "bad" characters? The results are normally just lots of text. They contain company confidential stuff, so I can't give an example unfortunately. I have DEBUG set to True and TEMPLATE_DEBUG set to DEBUG.
UPDATE:
I added some code to filter out all chars with a decimal value greater than 127 and it now works.
If you are using the development server, put in a breakpoint with pdb and see what is going on. Or print out the string that you think has "bad" characters. If you aren't using the development server you could use the Python logging module to log the string you are getting from the tool.
You might be leaping to conclusions about the data containing bad characters. It may be something else, and without debugging further it is hard to speculate.
You could try using the built-in Django encoding methods to remove illegal characters.
from django.utils.encoding import smart_str
smart_str(your_string)
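Before filtering anything out, it can help to see exactly which characters are outside the ASCII range, since the update above suggests those are the ones involved. This is a plain-Python sketch; the function names and the sample string are invented for illustration.

```python
def find_non_ascii(text):
    """List (index, character, code point) for every character above 127."""
    return [(i, ch, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ord(ch) > 127]


def strip_non_ascii(text):
    """Drop every character above 127, like the filter described in the update."""
    return "".join(ch for ch in text if ord(ch) <= 127)


sample = "caf\u00e9 \u2013 ok"  # contains an accented e and an en dash

for index, char, code_point in find_non_ascii(sample):
    print(index, repr(char), code_point)

print(strip_non_ascii(sample))
```

Logging the output of find_non_ascii shows what the tool is actually emitting, which is usually more informative than silently discarding it.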

Django model translation : store translations in database or use gettext?

I'm in the process of internationalizing (I18N) a Django website.
I've selected two potentially good Django apps:
django-modeltranslation, which modifies the DB schema to store translations
django-dbgettext, which inspects DB content to create .po files and uses gettext
From your point of view, what are the pros and cons of those two techniques?
If you want to let users of your app (or third-party translators) easily update the translations without code changes, then go for one of the solutions that stores the translations in the database.
If you instead want greater quality control (version control, several sets of eyes, etc.), then use gettext. By using gettext you can also control which strings you want to translate.
Just my 2c.
django-modeltranslation is best for storing translated values. You go to the Django admin and enter the translated values there.
But if you are using django-dbgettext, you don't need to enter any values in the Django admin; you can use rosetta for that. If a value you want to translate does not show up for translation, you can add an entry for the model in dbgettext_registration.py and run the command "python manage.py dbgettext_export", then "python manage.py compilemessages".
http://packages.python.org/django-easymode/ combines the two:
http://packages.python.org/django-easymode/i18n/index.html
http://packages.python.org/django-easymode/i18n/translation.html
Gettext is used to translate large amounts of data, and the admin is used for day-to-day updates.
I would suggest you always use files for your translations. It's portable and doesn't have unknown impacts on DB performance (especially an issue when using "magic" packages that monkey patch your DB schema).
This package looks simple and extensible: https://github.com/ecometrica/django-vinaigrette