data.encode('cp1252') throws an exception in Python 2 for characters outside the ASCII range (128 and above)

Python 2 throws an exception when encoding characters outside the ASCII range (code points 128 and above), whereas Python 3 encodes them successfully. Also, why is the exception a UnicodeDecodeError instead of a UnicodeEncodeError? Can someone please explain the reason?
Python 2:
data = 'À'
data.encode('cp1252')
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
Python 3:
data = 'À'
data.encode('cp1252')
output: b'\xc0'
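A likely explanation, sketched here rather than taken from the original thread: in Python 2 the literal 'À' is a byte string (assumed here to be UTF-8 encoded, which matches the 0xc3 byte in the error message), and calling .encode() on a byte string first decodes it implicitly with the default ASCII codec. That hidden decode step is what raises UnicodeDecodeError. The sketch below makes both steps explicit in syntax that also runs on Python 3; the variable names are illustrative.

```python
# In a UTF-8 source file, Python 2's 'À' is really these two bytes.
raw_bytes = u"\u00c0".encode("utf-8")

# Step 1: the implicit ASCII decode that Python 2 performs before
# it can encode a byte string. This is where the error comes from.
try:
    raw_bytes.decode("ascii")
    implicit_decode_error = None
except UnicodeDecodeError as exc:
    implicit_decode_error = str(exc)

# Step 2: decode with the real encoding first, then encode to cp1252.
text = raw_bytes.decode("utf-8")
cp1252_bytes = text.encode("cp1252")

print(implicit_decode_error)
print(repr(cp1252_bytes))
```

In Python 2 itself, writing the literal as u'À' (with a matching source-encoding declaration) gives a unicode object, so .encode('cp1252') no longer needs the implicit decode.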

Related

UnicodeDecodeError: 'utf8' codec can't decode byte 0xaf in position 3: invalid start byte in python 2.7

I am using Windows 10 with Python 2.7.
My decryption code:
def decrypt(self, enc):
    enc = b64decode(enc)
    iv = enc[:16]
    cipher = AES.new(self.key, AES.MODE_CBC, iv)
    print cipher,"======"
    dec = cipher.decrypt(enc[16:])
    #print dec,"========",dec
    unp = unpad(dec)
    print unp,"=========","=fdkjfsdklfsdjndjdjk"
    decode = unp.decode('utf8')
    #decode = unp.decode('utf8')
    print decode
    # unpad(cipher.decrypt(enc[16:])).decode('utf8')
    return decode
When decrypting the encrypted response, the line cipher.decrypt(enc[16:]) gives me the output below, but it should actually be XML.
)^»3(Fm╠¡Oå┤╖¢iOÑ>s▌B¿▌╥≥┐Éj6╬░¢√(å¥ 2?J≤ôGOL═\¥°t╬╚ΓÜ▐╝Φ÷═AQw≥[&nΣ±ƒ∩(╩ûGN~[3bgrHPÜ4%╖H⌡▄wÅ|■Çq≥½÷σHñxìdºwë±!│▐íWÇÿΘ╦σ╖è#X▓┤2ÿ ┘╟ƒΣ°Y░çNßæÅαb3f«─O(Wo9┐A╕t£╧{K [X┴┬ÜHΘ⌠X4┬Æ≡~╠h3ε┘σmÉfú.Fú╜₧c!_╒▐wα²A/╒|─sY%=⌐▒Yö╕[╞ε░::tA┴₧µ≤²∙C─A█₧╕╧τ╙x≤rƒú░uú█å┬-╤`╡f╕^∞tΦ½q╗&╪─╘¥&┐Σ₧▌(╙┌JüñÇäQ¥/*ó▐H!C┬+δà\Bah╘áÆXu╥C█│¼)ë╩╓*E(÷·├à√¿╨╧1Θ·0≈º²║Ås┬xOò}a╪╔╫HÜq┬gqÅÖ⌐4~v╖·9╥Ü$wçZ▌╗┬? /Zj12^}&t$F=SBKhöåε è╝o╪█º8fìîé╫=«·gO:Z╢≡2╔K«Θ uè/╩ {⌐Åwwε^α┼µk4┘Ñ╧:ƒ16║╞ⁿB°¢üdó?eB┼P┌L_90]\5W╥µA⌐
#Mq╤ìⁿ²ç≥Θ·▓F₧▀) ç#ë╒╖às2╡}πL╕╨60ä┌ù6▒.rn╔jⁿR¢∙µIëÉ╝µè}c≈σß_αäcª/╤"lK*└qX2H öφq#â½æΘjÄ% é6#üY█▓aFßα█÷I║n+⌡▄Ä!jTÄ√∩yr¥d"╛¬z√ⁿµº½êYⁿπ¬2[╕¿≡ÿ │Uv?{τæτ°QÜĵ╨íkUFπ╚BπÆ! Hiåƒ╒£αì≥Æεtr█[╤àÆ█oíΩ("┤╞åMÜò╝D3╬¿VτΩr▓ÜÆÿ$┌)⌠≡\~╩▀Rr≡y£₧≤║L>╙ ╘µv9ÿæ├#B≡µ£╕Ew╗yÿtXeY.αÑsú Y±£∩=yy¥óüΣÆF╧╦á─} Oƒ≥-9[≤¢fúΣe3&Öÿ░ìç·ntÄO
l∙m¥\╞&KêëR»s╔E2╨ª│OV≥░m═╬2┬₧ú(ûöz¢¼╣\≤5nqò+╝±Äm{Gσ╝ROφNµàg╛RV╨;Lδa ,é/ⁿY╜|┤ñ╔÷πvⁿ╞W▓π}Rå#h$*πAò¼2╝CÅk*l"h╕≥aÆhæt)9▐░╝.]B}-╢└∩Iσw┬╚D&5≡▒²`WJ╔╫⌡K1∩ fú~A▌c▄mÑ┴?ôQ╩ƒⁿ|╨{ç▒·ΘB╡Φτ▌⌠─╘q?nⁿC/v>σ°┬#'L┌ 0Kè£
╩[Érekx«wë,\¥─K\a╡·┐PDIF╩l╤YH╞F$c6≈G¡Üc^r=pbiµΦ┘±ÿ▓zΦ¿0░ì┐á7┌o■«-ⁿ#,
Decoding, i.e. the line unp.decode('utf8'), gives me the following error:
Traceback (most recent call last):
File "nic_dycrypt_encrypt.py", line 99, in <module>
print('Ciphertext:', AESCipher(key).decrypt(ciphertext))
File "nic_dycrypt_encrypt.py", line 86, in decrypt
decode = unp.decode('utf8').strip()
File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaf in position 3: invalid start byte
Can anyone please help me understand what this format is, why the error occurs, and how to resolve it?
Simply put, not all bytes or byte sequences map to Unicode characters. In fact, most arbitrary byte sequences are not valid UTF-8.
The common solution is to convert the binary data to an encoding that can represent every byte value; the most common are Base64 and hexadecimal.
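As a minimal sketch of that suggestion (the sample bytes are made up for illustration and are not the poster's ciphertext), the snippet below shows a byte string that is not valid UTF-8 and how Base64 or hexadecimal gives a printable, reversible text form of it:

```python
import base64
import binascii

# Illustrative binary data: 0xaf cannot start a UTF-8 sequence.
raw = b"abc\xaf"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = str(exc)

# Both encodings map every possible byte value to plain ASCII text.
as_base64 = base64.b64encode(raw).decode("ascii")
as_hex = binascii.hexlify(raw).decode("ascii")

# Both are lossless, so the original bytes can be recovered.
restored_from_base64 = base64.b64decode(as_base64)
restored_from_hex = binascii.unhexlify(as_hex)

print(decode_error)
print(as_base64, as_hex)
```

This only makes binary data printable. If the decrypted bytes were expected to be XML, unreadable output may also mean that the key, IV, or padding does not match what the sender used.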

Convert Python Exception to string fails with encoding error

I'm rather new at Python and am experiencing some issues printing the contents of an Exception:
# -*- coding: ISO-8859-1 -*-
except Exception as err:
    print(err)
yields "UnicodeEncodeError: 'ascii' codec can't encode character u'\uf260' in position 52: ordinal not in range(128)"
I've tried reading https://docs.python.org/2.7/howto/unicode.html#the-unicode-type but the issue I'm encountering is that in order to decode and encode or use unicode(..,errors='ignore') I need a string and str(err) fails with the above error message.
This is in a Windows environment. I'm thankful for any replies, even if the reply is "learn to search!", because in that case there was indeed a similar question that I missed while searching, and it has hopefully been answered : )
Edit. I've tried
print("Error {0}".format(str(err.args[0])).encode('utf-8', errors='ignore'))
which yields exactly the same error message.
UnicodeEncodeError: 'ascii' codec can't encode character u'\uf260' in
position 52: ordinal not in range(128)
Look at the error: the character \uf260 is not ASCII, but something is trying to encode it as ASCII. What?
Try this instead: print err.__repr__()
Explanation:
err is an Exception object that implements __str__(). Printing an object calls its __str__() method, and that call is what raises this error. You can verify this by calling print err.__str__(), which gives a similar error.
To check that the Exception class has __str__(), you can run dir(Exception).
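The idea above can be sketched as a small helper that falls back to repr() when str() fails. Both describe_exception and LegacyStyleError are hypothetical names introduced for illustration; LegacyStyleError only imitates, in a way that also runs on Python 3, how Python 2 encodes a unicode message with the ASCII codec inside __str__().

```python
class LegacyStyleError(Exception):
    """Hypothetical exception whose __str__ fails on non-ASCII text."""

    def __str__(self):
        message = self.args[0]
        # Raises UnicodeEncodeError if the message has non-ASCII characters.
        return message.encode("ascii").decode("ascii")


def describe_exception(err):
    """Return a printable description of err without raising."""
    try:
        return str(err)
    except UnicodeError:
        # repr() does not go through __str__, so it avoids the failing encode.
        return repr(err)


plain = describe_exception(ValueError("disk full"))
fallback = describe_exception(LegacyStyleError(u"bad path \uf260"))

print(plain)
print(fallback)
```

The fallback text is less readable than the real message, but it keeps the error handler itself from crashing.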
Update: Check this link to understand how print picks up encoding.
References:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)
Python __str__ versus __unicode__
How to print a class or objects of class using print()?
Python: Converting from ISO-8859-1/latin1 to UTF-8
Why does Python print unicode characters when the default encoding is ASCII?

Python: POSTing binary data gives UnicodeDecodeError or Ascii decode error

When POSTing binary data using urllib2, urllib3, or httplib2, I receive the error UnicodeDecodeError: 'utf8' codec can't decode... or UnicodeDecodeError: 'ascii' codec can't decode..., depending on whether the Python script is in Unicode or ASCII mode.
I first thought that the library was the issue, so I tried different libraries but that didn't solve the problem.
End of the stack trace:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 895, in _send_output
msg += message_body
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc4 in position 627: invalid continuation byte
The problem, as noted in the comments on Python bug 11898, is that the url string had at some point become a Unicode string rather than a plain byte string.
Then, when the httplib library builds the byte string for the entire HTTP/S message and executes the line
msg += message_body
Python tries to decode message_body (which contains binary data) as ASCII or UTF-8 so that it can be joined to the Unicode text. In either case, the conversion fails.
Solution
Use str() whenever you modify the url. In my case:
url = baseUrl + "/envelopes" # throws UnicodeDecodeError
url = str(baseUrl + "/envelopes") # works great
If that is not enough, check your other strings to make sure that none of them has become a Unicode string.
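The failure mode described above can be imitated with a short sketch. Python 3 refuses to mix text and bytes at all, so the hypothetical helper py2_style_concat spells out the decode that Python 2 performs silently when a unicode object is joined to a byte string. The request line and body bytes are made-up sample values.

```python
def py2_style_concat(text, body, default_codec="ascii"):
    """Imitate Python 2's `unicode + str`: the bytes are decoded first."""
    return text + body.decode(default_codec)


request_head = u"POST /envelopes HTTP/1.1\r\n\r\n"  # text, like a unicode url
binary_body = b"%PDF-1.4\xc4\x00"                   # illustrative binary payload

try:
    py2_style_concat(request_head, binary_body)
    concat_error = None
except UnicodeDecodeError as exc:
    concat_error = str(exc)

# Keeping every piece as bytes means the body is never decoded.
message = request_head.encode("ascii") + binary_body

print(concat_error)
```

This is why forcing the url back to a byte string with str() helps in Python 2: once every part of the message is bytes, no implicit decode takes place.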

UnicodeEncodeError: 'ascii' codec can't encode character

I am trying to export a pandas DataFrame as CSV with the function:
outcome.to_csv("/Users/john/out_1.csv")
I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 191: ordinal not in range(128)
How do I go to position 191 to check what's wrong?
Many thanks
outcome.to_csv("/Users/john/out_1.csv",encoding="utf-8")
According to the documentation for pandas to_csv, the default encoding on Python 2.7 is "ascii", which needs to be overridden with "utf-8":
encoding : string, optional
A string representing the encoding to use in the output file, defaults
to ‘ascii’ on Python 2 and ‘utf-8’ on Python 3.
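For the "position 191" part of the question, a minimal sketch without pandas: the UnicodeEncodeError object itself records where the failure happened through its start, end, and object attributes. The sample string below is made up, so the position differs from the one in the question.

```python
# Illustrative text containing U+2019 (right single quotation mark).
row_text = u"It\u2019s a test"

try:
    row_text.encode("ascii")
    bad_index = None
    bad_char = None
except UnicodeEncodeError as exc:
    bad_index = exc.start                     # index of the first offending character
    bad_char = exc.object[exc.start:exc.end]  # the character(s) that failed

# UTF-8 can represent the same text without error.
utf8_bytes = row_text.encode("utf-8")

print(bad_index, repr(bad_char))
```

With a DataFrame, the position refers to the string pandas was writing at the time, so searching the data for the reported character (u'\u2019' here) is usually the quicker way to find the affected cell.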

ASCII codec errors with django-tagging in the admin panel

I'm using django-tagging for tags, but I have a problem: some tags contain a non-ASCII character. Example: ı
The error:
UnicodeDecodeError at /admin/tagging/taggeditem/
'ascii' codec can't decode byte 0xc4 in position 15: ordinal not in range(128)
Request Method: GET
Request URL: http://blog.com/admin/tagging/taggeditem/