Using French text in Python script - python-2.7

I am new to Python, and I am trying to change some text from English to French in a series of ArcGIS maps, using a Python script (running version 2.7.12) and editing it in IDLE. Following the suggestions in these posts
Write french characters in python 2.7
How to make the python interpreter correctly handle non-ASCII characters in string operations?
I used
#!/usr/bin/python2.7
# coding: utf-8
as the first lines of my script, and included a 'u' inside the brackets before the text with the French character. However, when I make the substitution, I can no longer save or run the script.
The following code generates the English text correctly:
if name[0] == "Alcids":
    elm_spp.text = '\r\n'.join(textwrap.wrap("Alcids: ANMU, CAAU, COMU, MAMU, PIGU, RHAU, UNAL", 30))
The following does not allow me to save or run the script:
if name[0] == "Alcids":
    elm_spp.text = '\r\n'.join(textwrap.wrap(u"Alcidés: GUCB, SCAS, GUMA, GMRB, GUCO, MARH, ALSP", 30))
Can anyone tell me what I am missing?
Thanks.

Related

Problems in codification - unicode vs. utf-8 in python 2.7

My Python script is supposed to open all UTF-8 YAML files in a directory and show their content to the user. However, words with accents, such as the French word présenter, are shown like this: u"pr\xe9senter". I need them to be displayed properly to the user.
Here is my code:
import glob
import yaml

files = glob.glob("data/*.yaml")

def read_yaml_file(filename):
    with open(filename, 'r') as stream:
        try:
            print(yaml.safe_load(stream))
        except yaml.YAMLError as exc:
            print(exc)

for file in files:
    read_yaml_file(file)
I already tried to use the import from __future__, but it didn't work. Does anyone know how to solve it?
Unicode in 2.x is painful. If you can, use current Python 3, in which text is Unicode and is printed without a 'u' prefix, while bytes are a separate type printed with a 'b' prefix.
>>> print(u"pr\xe9senter") # 3.8
présenter
You also need a system console/terminal or IDE that displays glyphs for the codepoints in your yaml files.
If you are a masochist or otherwise stuck on 2.7, use sys.stdout.write(). Note that you must explicitly write '\n's.
>>> import sys; sys.stdout.write(u"pr\xe9senter\n") # 2.7
présenter
This question is not really about IDLE. However, the above lines work in both standard interactive Python on Windows 10 and in IDLE. IDLE uses tkinter which uses tcl/tk. Tk itself can handle all Basic Multilingual Plane (BMP) characters (the first 64K), but only those. Which BMP characters it can display depends on your OS and its current fonts.
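One likely reason for the u"pr\xe9senter" display on 2.7 is that printing a whole list or dict shows the repr of each element rather than the text itself. A minimal sketch of a workaround, assuming the loaded YAML is a flat mapping: build one text line per entry and print those instead of the container. The helper name format_lines and the sample dict are hypothetical stand-ins, not part of the original script.

```python
def format_lines(data):
    """Return one text line per key/value pair of a loaded mapping."""
    # Formatting each value as text avoids the repr() form (u'pr\xe9senter')
    # that Python 2.7 shows when a whole dict or list is printed.
    return [u"%s: %s" % (key, value) for key, value in sorted(data.items())]


# Hypothetical stand-in for what yaml.safe_load() might return for one file.
sample = {u"verb": u"pr\xe9senter"}

lines = format_lines(sample)
```

Printing each element of lines one at a time (or passing it to sys.stdout.write() with a trailing newline) should then show the accented letter, provided the console can display it.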

Why does termcolor not work in python27 windows?

I just installed termcolor for Python 2.7 on Windows 8.1. When I try to print colored text, I get strange output.
from termcolor import colored
print colored('Hello world','red')
Here is the result:
[31mHello world[0m
How can I fix this problem? Thanks in advance.
See this Stack Overflow post.
It basically says that in order to get the escape sequences working on Windows, you need to run os.system('color') first.
For example:
import termcolor
import os
os.system('color')
print(termcolor.colored("Stack Overflow", "green"))
termcolor or colored works perfectly fine under python 2.7 and I can't replicate your error on my Mac/Linux.
If you look into the source code of colored, it basically builds the string in the format
'\033[%dm%s\033[0m' % (COLORS[color], text)
Somehow your terminal environment does not recognise the non-printing escape sequences that are used on Unix/Linux systems for setting the foreground color in xterm.
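To make the format string above concrete, here is a minimal sketch that builds the same kind of escape sequence by hand. The COLORS table and the colorize function are illustrative assumptions, not termcolor's actual source; 31 and 32 are the standard ANSI foreground codes for red and green.

```python
# Illustrative subset of ANSI foreground color codes (not termcolor's own table).
COLORS = {"red": 31, "green": 32}


def colorize(text, color):
    """Wrap text in an ANSI 'set foreground color' and a 'reset' sequence."""
    return "\033[%dm%s\033[0m" % (COLORS[color], text)


wrapped = colorize("Hello world", "red")
# repr() exposes the normally invisible ESC character (\x1b) around the text.
print(repr(wrapped))
```

A terminal that understands these sequences consumes them and shows colored text. One that does not prints the remaining characters literally, which matches the [31mHello world[0m output in the question.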

Different base64 encoding between python versions

I'm having trouble sending HTML through JSON.
I noticed that my string values differ between Python versions (2.7 and 3.5).
My string being something like: <html><p>PAÇOCA</p></html>
on Python 2.7:
x = '<html><p>PAÇOCA</p></html>'
base64.b64encode(x)
=> PGh0bWw+PHA+UEGAT0NBPC9wPjwvaHRtbD4=
on Python 3.5:
x = '<html><p>PAÇOCA</p></html>'
base64.b64encode(x)
=> b'PGh0bWw+PHA+UEHDh09DQTwvcD48L2h0bWw+'
Why are these values different?
How can I make the 3.5 string equal to the 2.7?
This causes problems with the e-mails I receive, because the accents are lost.
Your example x values are not valid Python, so it is difficult to tell where the code went wrong, but the answer is to use Unicode strings and explicitly encode them to get consistent results. The code below gives the same answer in Python 2 and 3, although Python 3 decorates byte strings with b'' when printed. Save the source file in the encoding declared via #coding. The source encoding can be any encoding that supports the characters used in the source file. Typically UTF-8 is used for non-ASCII source code, but I deliberately chose a different one to show that it doesn't matter.
#coding:cp1252
from __future__ import print_function
import base64
x = u'<html><p>PAÇOCA</p></html>'.encode('utf8')
enc = base64.b64encode(x)
print(enc)
Output using Pylauncher to choose the major Python version:
C:\>py -2 test.py
PGh0bWw+PHA+UEHDh09DQTwvcD48L2h0bWw+
C:\>py -3 test.py
b'PGh0bWw+PHA+UEHDh09DQTwvcD48L2h0bWw+'
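As a sketch of why the two values in the question differ: the Base64 text depends only on the bytes that go in, and those bytes depend on the encoding. The example below assumes the Python 2.7 session ran in a console using a DOS code page such as cp850 (a guess based on the 0x80 byte in the question's output); the Python 3.5 value corresponds to UTF-8.

```python
import base64

# \u00c7 is the letter C with cedilla; the escape keeps this source file ASCII-only.
text = u"<html><p>PA\u00c7OCA</p></html>"

# UTF-8 encodes the letter as two bytes (0xC3 0x87).
utf8_b64 = base64.b64encode(text.encode("utf-8"))

# cp850 (assumed console code page) encodes it as the single byte 0x80.
cp850_b64 = base64.b64encode(text.encode("cp850"))

print(utf8_b64)
print(cp850_b64)
```

Decoding on the receiving side has to use the same encoding that was used before b64encode, otherwise the accented character is lost or garbled.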

Hindi string not supported as an input

I am new to Python programming. I want to assign a Hindi string to a variable in Python, like this:
a='शिक्षा का अधिकार अधिनियम'
When I type this at the command prompt, I get the error message
'Unsupported characters in input'
I am using Python 2.7 on Windows. Can anyone suggest how to get rid of this problem?

How can the Python interpreter read the charset?

I'm learning Python, and I learned that every comment starts with a hash "#". So how can the Python interpreter read this line
# -*- coding: utf-8 -*-
and set the charset to utf-8 ? (I'm using Python 2.7.3)
Thank you in advance.
Yes, it is a comment, but that does not mean Python doesn't see it; it can still parse it.
What Python actually does is apply the regular expression coding[:=]\s*([-\w.]+) to the first two lines of the file. Most likely this happens even before the actual Python parser steps in.
See PEP 263 for details.
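A minimal sketch of that check, using the regular expression quoted above. The function name declared_encoding is hypothetical, and the interpreter's real logic may be stricter than this (the extra test that the line is a comment is an added assumption here).

```python
import re

# Pattern quoted in the answer; the interpreter's own check may differ in detail.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source_lines):
    """Return the encoding named in a comment on the first two lines, or None."""
    for line in source_lines[:2]:
        match = CODING_RE.search(line)
        # Only a comment line counts as an encoding declaration.
        if match and line.lstrip().startswith("#"):
            return match.group(1)
    return None


print(declared_encoding(["#!/usr/bin/python2.7", "# -*- coding: utf-8 -*-"]))
```

The same pattern also matches the shorter forms # coding: utf-8 and #coding:cp1252, which is why all of these spellings work as declarations.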