I created a Django application on top of an existing MySQL database. The problem is that the database encoding is latin1_general_c and UTF-8 characters are stored like this: ñ => Ã±, ó => Ã³. I need to present the information on the page correctly, but Django shows the data from the database like this:
recepciÃ³n, 4 oficinas, 2 baÃ±os
I need it to show like this:
recepción, 4 oficinas, 2 baños
For many reasons I can't change the database to UTF-8.
What can I do to show the information correctly?
Django seems to explicitly require that your database use UTF-8:
Django assumes that all databases use UTF-8 encoding. Using other encodings may result in unexpected behavior.
That said, it should be possible to use the custom OPTIONS setting to pass the desired character set to the database driver. See this answer, in which setting charset solved the problem. But you should check the documentation for the specific MySQL driver and database that you're using, since these options are not used by Django itself.
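As a rough sketch of what that could look like in settings.py, assuming a MySQLdb/mysqlclient-style driver that accepts a charset connection argument. The database name, user and password are placeholders, not values from the question, and whether "latin1" is the right value depends on how the data was actually written:

```python
# settings.py (fragment) -- a sketch, not a verified fix for this database.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "legacy_db",        # placeholder database name
        "USER": "legacy_user",      # placeholder user
        "PASSWORD": "change-me",    # placeholder password
        "HOST": "localhost",
        # OPTIONS is handed to the database driver, not interpreted by Django.
        "OPTIONS": {
            "charset": "latin1",    # assumption: connection charset to try
        },
    }
}
```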
As suggested by the quote above, even if this works to correctly translate strings between the database and unicode, parts of Django still won't work correctly. For example, to correctly validate the length of a CharField Django has to know the encoding, and it will always assume UTF-8.
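If strings still arrive in Python looking like "Ã±", one possible fallback is to reverse the mis-decoding at display time. The helper below is a minimal sketch (the name repair_mojibake is made up for illustration); it assumes the text is UTF-8 that was decoded as Latin-1, and it returns the input unchanged when that assumption does not hold:

```python
def repair_mojibake(text):
    """Undo UTF-8 bytes that were decoded as Latin-1 (e.g. 'Ã±' -> 'ñ')."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really are.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Not mojibake of this kind (or already correct): leave it untouched.
        return text


print(repair_mojibake("recepciÃ³n, 4 oficinas, 2 baÃ±os"))
# recepción, 4 oficinas, 2 baños
```

This only masks the symptom in the presentation layer; lengths and comparisons on the stored values are still based on the garbled form.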
Related
I am writing a ColdFusion program that uses cfquery to get data from an AS/400 iSeries table and then outputs that data to a web page. Sometimes the data is in Chinese, but the Chinese characters are not output correctly.
I built the query below for testing:
<cfprocessingdirective pageEncoding="UTF-8" />
<cfquery name="Test" Datasource = "AS400">
select dsc1 from sales where ref = '123456'
</cfquery>
<cfoutput>#test.dsc1#</cfoutput>
The result should be "M5方头螺栓" but I only get "M5". I did another test running just:
<cfset x = "M5方头螺栓"/>
<cfoutput>#x#</cfoutput>
and it displays the Chinese no problem.
Since ColdFusion can display the characters when they are written out in the code, but not when it goes to get the data through SQL, it seems like the issue is with either my ODBC settings or my ColdFusion Server Data Source Settings but I'm not familiar enough with these settings to know what needs to be changed to get this working.
A workaround was found and discussed within the comments. Adding some details here as an answer for future visitors to this page.
There are a couple of considerations when dealing with Unicode (Chinese) characters:
The data type for the database table must be set to nvarchar
The form processing script (CFML) must be set to utf-8
I believe ColdFusion defaults to this but you can specify the setting to be sure. For example: <cfprocessingdirective pageEncoding="utf-8">
Enable "String Format" within the ColdFusion datasource settings
Under the ColdFusion administrator datasource settings, select the datasource you are using, then click the "Show Advanced Settings" button. That reveals a "String Format" option: "Enable High ASCII characters and Unicode for data sources configured for non-Latin characters". Select this option and save the datasource.
The issue for the OP was that they were using an ODBC datasource, where the "String Format" option was not available. After some research I could not find any way to configure an ODBC datasource for that setting, so I recommended trying the built-in JDBC driver for "DB2 Universal Database" that comes with ColdFusion. Switching to that driver resolved the issue for the OP.
From the comments
Good info. Though is "Enable String Format..." necessary with the added support for cf_sql_nvarchar in CF10+? – #Leigh
I do believe Leigh is correct that the newer versions of ColdFusion (10 and later) have much better support for nvarchar fields.
Also to note, it looks like some older versions of ColdFusion don't always work with the installed DB2 Universal Driver, and the older Standard editions don't appear to include it at all. I'm not sure whether the newer ones have it either, but using the "other" option with jt400.jar should also work. - #MHall
You've already proven that CF can output UTF-8 characters correctly. Have you tried running that query in the DB console or UI? Do you get the correct characters?
If the characters were stored as VARCHAR and not NVARCHAR, then there's nothing you can do. The data has to have been properly stored in the first place.
If the characters are stored correctly in the DB, try adding <cfprocessingdirective pageEncoding="utf-8"> at the top of the request. CF should be using UTF-8 by default, but this will force the correct character set if, for some reason, it isn't.
I got a request from a customer that he wants to be able to type the query string of my web service with parameters in the IE10 address bar and get the service results. The parameters include string in Hebrew, like:
http://mywebsite.com/service.asmx/foo?param1=123&param2=מחרוזתבעברית
It seems to me that IE10 won't encode the query string parameters: every non-ASCII character after the ? mark is turned into a '3f' byte, though it does encode what goes before the ? mark (the URL path itself).
For example, if i try to reach the url (the parameter is fictional, url is not, and I have no connection with the site)
http://www.shlomo.co.il/pageshe/sales/רכב-למכירה.asp?param=פאראם
and look in wireshark for the bytes I send to the server, it shows me
You can see it does substitute the Hebrew part of the URL with a URL-encoded string, but substitutes the Hebrew parameters with ?????, which are '3f' bytes.
The same string in Chrome would be encoded in its entirety:
GET http://www.shlomo.co.il/pageshe/sales/%D7%A8%D7%9B%D7%91-%D7%9C%D7%9E%D7%9B%D7%99%D7%A8%D7%94.asp?param=%D7%A4%D7%90%D7%A8%D7%90%D7%9D HTTP/1.1
I tried it on machines with win7/IE10 and winXPheb/IE8.
My IE settings are (especially checked the "Always show encoded addresses option" to see if it helps and restarted, but made no difference):
I tried to search around for any info about the issue, but didn't find much of it.
My questions are:
Is it indeed like this, or am I missing something?
Is this behavior documented anywhere?
Are there any settings in IE/Win which enable encoding of the parameters?
p.s. Sure if I was developing the client/web ui, I would simply urlencode my query, but my request from customer was exactly to paste the query to IE address bar, that's why I'm interested in this specific behavior.
Thanks.
Yes, your observation of the behavior is accurate. Internet Explorer 10 and below follow a complicated algorithm for encoding the URL. This was allegedly updated in Internet Explorer 11, but I've found that the new option doesn't seem to work.
The "Always show encoded addresses" option concerns whether Punycode is shown for IDN hostnames, and does not impact the query string. "Send UTF-8 URLs" mostly applies to the encoding of the path, although it can also affect other code paths.
The behavior isn't fully documented anywhere. I'd meant to write a full post on my IEInternals blog about it but ended up moving on from Microsoft before doing so. There's a partial explanation in this blog post.
Yes, there are settings that impact the behavior. The Send UTF-8 URLs checkbox inside Tools > Internet Options > Advanced is one of the variables that determines how URLs are sent, but the option does not blindly do what it implies (it only UTF-8 encodes the path, not the query string). Other variables involved include:
Where the URL was typed (e.g. address bar vs. Start > Run, etc)
What the system's ANSI codepage is (e.g. what locale the OS uses as default)
The charset of the currently loaded page in the browser
As a consequence of these variables, you cannot reliably use URLs which are not properly encoded (e.g. %-escaped UTF8) in Internet Explorer.
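To make the difference concrete, here is a small Python sketch using the parameter value from the question. It shows the %-escaped UTF-8 form that Chrome sent, and imitates the observed '3f' outcome by encoding into a codepage that has no Hebrew letters. Latin-1 is only a stand-in here; this does not reproduce IE's actual algorithm:

```python
from urllib.parse import quote

value = "פאראם"  # the fictional parameter value from the question

# Proper form: percent-encode the UTF-8 bytes of the value.
encoded = quote(value, safe="")
print(encoded)  # %D7%A4%D7%90%D7%A8%D7%90%D7%9D

# Lossy form: a codepage without Hebrew turns every letter into '?' (byte 0x3f).
lossy = value.encode("latin-1", errors="replace")
print(lossy)  # b'?????'
```

Once the value has been reduced to 0x3f bytes, the server has no way of recovering the original text, which is why the encoding has to happen before the request is sent.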
Unfortunately this is still true for Internet Explorer 11 (build 11.0.9600.17358, win7-x64)
I saw that, unfortunately, you cannot change the web server. However, those who are developing new services may consider turning request parameters into path variables, e.g. from http://myserver.com/page?τεστ to http://myserver.com/τεστ/
If the client is calling the web service from JavaScript,
encodeURIComponent can be used. In your case: encodeURIComponent("מחרוזתבעברית");
http://www.w3schools.com/jsref/jsref_encodeURIComponent.asp
I get a Unicode error when trying URLs like www.mysite.com/blog/category/πρακτικα/ or www.mysite.com/blog/πρακτικα/
but I don't get the error when trying www.mysite.com/blog/tag/πρακτικα/
UnicodeEncodeError at /blog/category/πρακτικα/ 'latin-1' codec can't encode characters in position 58-65: ordinal not in range(256)
Exception Location: /home/vagrant/sullogos-venv/local/lib/python2.7/site-packages/django/template/loaders/filesystem.py in load_template_source, line 37
It seems to behave differently for categories and for tags.
The difference is that categories can have a custom template and tags can't. So in the category case, a template name is searched for using the category slug - the error you're getting is due to an incorrectly configured locale which doesn't support utf8.
This is not a problem with Mezzanine or Django, but with the environment used to deploy them. See this issue and this documentation for more details. It's not enough for Python to support a specific locale, but it's also necessary for the webserver to be able to handle Unicode files correctly.
How to fix it will depend on the webserver used. If you're using Apache, for instance, you need to set LANG and LC_ALL to Unicode-compatible values (on *NIX systems at least you should find them at /etc/apache2/envvars). An example would be:
export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'
Feel free to replace the language/country codes with ones more suitable for your needs (I used pt_BR instead of en_US and things worked fine for me). From the error message you're seeing, these settings on your system are probably using ISO Latin-1 (ISO-8859-1) instead of UTF-8, and ISO-8859-1 cannot represent the Greek characters in your URLs.
If you're using a different webserver, check its documentation on localization/internationalization to see what needs to be changed. The important thing is to offer support to Unicode file names, as I understood.
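A quick way to see what the process is working with is a small diagnostic like the one below. It is a Python 3 sketch (the traceback in the question comes from Python 2.7), and can_encode is a made-up helper name; it only reproduces the encoding failure and prints the encodings the current process reports:

```python
import locale
import sys


def can_encode(name, encoding):
    """Return True if `name` can be represented in `encoding`."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


slug = "πρακτικα"  # the category slug from the question

print(can_encode(slug, "latin-1"))  # False: Latin-1 has no Greek letters
print(can_encode(slug, "utf-8"))    # True

# What the current process reports; run this under the same environment
# as the web server to check whether LANG/LC_ALL took effect.
print(sys.getfilesystemencoding())
print(locale.getpreferredencoding())
```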
I have a Django application running on a Tornado server.
When I use special characters like ó or ñ, a certain part of a certain Django template is not rendered (the character set has been specified as 'utf-8' in settings.py, and tornado_script.py starts with # -*- coding: utf-8 -*-).
Considering that only a certain part of the template (a form) is not rendered properly, and that everything works perfectly with Django's built-in runserver, I suppose the problem comes from the Tornado server, but I cannot debug that configuration.
If any of you know how to debug this to find the missing configuration, please let me know.
I've been searching for the last 3 hours with no results.
Best Regards
Your browser is probably guessing the character set wrong. Some browsers allow you to set the encoding manually; I would suggest trying UTF-8. If this is indeed the problem, you can declare the encoding in a meta tag so all browsers pick the right one. Add this to the head:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
You should also make sure that your special characters are really in UTF-8. Most editors allow you to enforce this. You can also declare the encoding of your Python files, and Python will then choke if anything else appears. Add the following to the beginning of any Python source file that contains such characters:
# encoding: utf-8
I've found that Tornado's "render" for templates likes to do its own encoding, which may be messing things up for you.
You can look at the Tornado source code to see exactly what it does.
Try using "write" instead and see whether the characters appear in the output; then you may have a better idea of what's happening.
In the admin, if you enter in a slug two things are applied through JS:
The string is made slug-friendly
The string is transliterated if the language is not English, so for example Cyrillic Russian text gets converted into transliterated Russian (typed out in Latin letters)
I'm basically inserting a couple thousand rows and I need access to this. Does Django provide a non-JS, server-side version of this transliterator that I can use for the insertion?
It looks like I'll have to port the usr/lib/pymodules/python2.5/django/contrib/admin/media/js/urlify.js code, unless I can figure out a way to programmatically load all articles on the client side and have them slugified automatically.
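For illustration, the two steps described above (transliterate, then make slug-friendly) could be sketched server-side roughly as follows. This is not Django's urlify.js and not a Django API: the function name and the mapping table are hypothetical, and the table is deliberately tiny, so a real port would need the full character maps:

```python
import re

# Hypothetical, deliberately incomplete Cyrillic -> Latin table.
CYRILLIC_TO_LATIN = {
    "а": "a", "в": "v", "е": "e", "и": "i",
    "м": "m", "п": "p", "р": "r", "т": "t",
}


def transliterate_slug(text):
    """Transliterate known characters, then reduce the result to a slug."""
    lowered = text.lower()
    # Step 1: replace every character we have a mapping for.
    latin = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in lowered)
    # Step 2: collapse anything that is not a-z or 0-9 into single hyphens.
    # Unmapped non-ASCII characters are dropped at this point.
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")


print(transliterate_slug("Привет мир"))  # privet-mir
```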