Interoperability problems between Python 2 and Python 3 - python-2.7

Two microservices are communicating via a message queue (RabbitMQ). The data is encoded using MessagePack.
I have the following scenarios:
python3 -> python3: working fine
python2 -> python3: encoding issues
Encoding is done with:
umsgpack.packb(data)
Decoding with:
umsgpack.unpackb(body)
When encoding and decoding in Python 3, I get:
data={'sender': 'producer-big-red-tiger', 'json': '{"msg": "hi"}', 'servicename': 'echo', 'command': 'run'}
When encoding in Python 2 and decoding in Python 3, I get:
data={b'command': b'run', b'json': b'{"msg": ""}', b'servicename': b'echo', b'sender': b'bla-blah'}
Why is the data not completely decoded? What should I do on the sender / receiver to achieve compatibility between Python 2 and Python 3?

Look at the "Notes" section of the README from msgpack-python:
msgpack can distinguish the string and binary types now, but not in the way Python 2 does: Python 2 added a unicode string type, whereas msgpack renamed raw to str and added a bin type. This was done to keep compatibility with data created by old libraries, since raw was used for text more than for binary.
Currently, while msgpack-python supports the new bin type, the default setting doesn't use it and decodes raw as bytes instead of unicode (str in Python 3).
You can change this by using the use_bin_type=True option in the Packer and the encoding="utf-8" option in the Unpacker.
>>> import msgpack
>>> packed = msgpack.packb([b'spam', u'egg'], use_bin_type=True)
>>> msgpack.unpackb(packed, encoding='utf-8')
['spam', u'egg']
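Applied to the data above, a minimal round-trip sketch (assuming msgpack-python rather than umsgpack on both services; note that in msgpack >= 1.0 the encoding parameter is gone and raw=False is the equivalent default):
import msgpack
data = {'sender': 'producer-big-red-tiger', 'json': '{"msg": "hi"}',
        'servicename': 'echo', 'command': 'run'}
# Sender (Python 2 or 3): mark text values with the msgpack str type
body = msgpack.packb(data, use_bin_type=True)
# Receiver (Python 3): decode msgpack str as unicode instead of bytes
decoded = msgpack.unpackb(body, encoding='utf-8')
print(decoded)  # keys and values come back as str, not b'...'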

Converting python2 byte/string encoding to python3

I'm communicating over a serial port, and currently using python2 code which I want to convert to python3. I want to make sure the bytes I send over the wire are the same, but I'm having trouble verifying that that's the case.
In the original code the commands are sent like this:
serial.Serial().write("\xaa\xb4" + chr(2))
If I print "\xaa\xb4" in python2 I get this: ��.
If I print("\xaa\xb4") in python3 I get this: ª´
Encoding and decoding seem opposite too:
Python2: print "\xaa".decode('latin-1') -> ª
Python3: print("\xaa".encode('latin-1')) -> b'\xaa'
To be crude, what do I need to send in serial.write() in python3 to make sure exactly the same sequence of 1s and 0s are sent down the wire?
Use a bytes sequence.
ser.write(b'\xaa\xb4')
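To reproduce the full original command, including the chr(2) byte, a minimal Python 3 sketch (the port name here is an assumption):
import serial  # pyserial
ser = serial.Serial('/dev/ttyUSB0')  # hypothetical port name
# Same three bytes as the Python 2 call: "\xaa\xb4" + chr(2)
ser.write(b'\xaa\xb4' + bytes([2]))  # sends b'\xaa\xb4\x02'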

Tensorflow- bidirectional_dynamic_rnn: Attempt to reuse RNNCell

The following code (taken from https://github.com/dennybritz/tf-rnn/blob/master/bidirectional_rnn.ipynb)
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
# Create input data
X = np.random.randn(2, 10, 8)
# The second example is of length 6
X[1,6:] = 0
X_lengths = [10, 6]
cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
outputs, states = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=cell,
    cell_bw=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)
output_fw, output_bw = outputs
states_fw, states_bw = states
is giving the following error with
tensorflow 1.1 (for both Python 2.7 and 3.5):
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.LSTMCell object at 0x10ce0c2b0>
with a different variable scope than its first use. First use of cell was with scope
'bidirectional_rnn/fw/lstm_cell', this attempt is with scope 'bidirectional_rnn/bw/lstm_cell'.
Please create a new instance of the cell if you would like it to use a different set of weights.
If before you were using: MultiRNNCell([LSTMCell(...)] * num_layers), change to:
MultiRNNCell([LSTMCell(...) for _ in range(num_layers)]). If before you were using the same cell
instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances
(one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use
existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation,
so this error will remain until then.)
But it works in
tensorflow 1.0.1 for Python 3.5 (not tested on Python 2.7).
I tried multiple code examples I found online, but
tf.nn.bidirectional_dynamic_rnn
gives the same error with tensorflow 1.1.
Is there a bug in tensorflow 1.1, or am I just missing something?
Sorry you ran into this. I can confirm that the error appears in 1.1 (docker run -it gcr.io/tensorflow/tensorflow:1.1.0 python) but not in 1.2 RC0 (docker run -it gcr.io/tensorflow/tensorflow:1.2.0-rc0 python).
So it looks like either 1.2-rc0 or 1.0.1 are your options for the moment.
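If you must stay on 1.1, the error message itself suggests the workaround: create two separate cell instances instead of passing the same one twice. A minimal sketch following that suggestion:
cell_fw = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
cell_bw = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
outputs, states = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=cell_fw,  # separate instance for the forward direction
    cell_bw=cell_bw,  # separate instance for the backward direction
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)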

Reading hexascii in Python 2.7 vs Python 3.5x

I have a function that reads in hex-ascii encoded data; I built it in Python 2.7. I am changing my code over to run on 3.x and hit an unforeseen issue. The function worked flawlessly under 2.7. Here is what I have:
# works with 2.7
data = open('hexascii_file.dat', 'rU').read()
When I run that under 3.x I get a UnicodeError:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x85 in position 500594: invalid start byte
I thought the default codec under Python 2.7 was ascii, so I tried the following under 3.x:
data = open('hexascii_file.dat', 'rU', encoding='ascii').read()
This did not work (same error as above, but specifying 'ascii' instead of 'utf-8'). However, when I use the latin-1 codec, all works well:
data = open('hexascii_file.dat', 'rU', encoding='latin-1').read()
I guess I am looking for a quick sanity check here to ensure I have made the proper change to the script. Does this change make sense?
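The change makes sense: latin-1 maps every byte 0x00-0xFF to a code point, so it can never fail on a stray byte like 0x85, while ascii and utf-8 both reject it. Two hedged variants (assuming the same file; the 'U' mode flag is deprecated in Python 3 and can simply be dropped):
# latin-1 decodes any byte sequence, so reading as text always succeeds
with open('hexascii_file.dat', 'r', encoding='latin-1') as f:
    data = f.read()
# Alternatively, read raw bytes and decode the hex-ascii explicitly
with open('hexascii_file.dat', 'rb') as f:
    raw = f.read()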

Can't open a file with a Japanese filename in Python

Why doesn't this work in the Python interpreter? I am running the Python 2.7 version of python.exe on Windows 7. My locale is en_GB.
open(u'黒色.txt')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] invalid mode ('r') or filename: u'??.txt'
The file does exist, and is readable.
And if I try
name = u'黒色.txt'
name
the interpreter shows
u'??.txt'
Additional:
Okay, I was trying to simplify my problem for the purposes of this forum. Originally the filename was arriving in a cgi script from a web page with a file picker. The idea was to let the web page user upload files to a server:
import cgi
import os  # needed for os.path.split below
form = cgi.FieldStorage()
fileItems = form['attachment[]']
for fileItem in fileItems:
    if fileItem.file:
        fileName = os.path.split(fileItem.filename)[1]
        f = open(fileName, 'wb')
        while True:
            chunk = fileItem.file.read(100000)
            if not chunk:
                break
            f.write(chunk)
        f.close()
but the files created at the server side had corrupted names. I started investigating this in the Python interpreter, reproduced the problem (so I thought), and that is what I put into my original question. However, I think now that I managed to create a separate problem.
Thanks to the answers below, I fixed the cgi script by making sure the file name is treated as unicode:
fileName = unicode(os.path.split(fileItem.filename)[1])
I never got my example in the interpreter to work. I suspect that is because my PC has the wrong locale for this.
Here's an example script that reads and writes the file. You can use any encoding for the source file that supports the characters you are writing but make sure the #coding line matches. You can use any encoding for the data file as long as the encoding parameter matches.
#coding:utf8
import io
with io.open(u'黒色.txt', 'w', encoding='utf8') as f:
    f.write(u'黒色.txt content')
with io.open(u'黒色.txt', encoding='utf8') as f:
    print f.read()
Output:
黒色.txt content
Note the print will only work if the terminal running the script supports Japanese; otherwise, you'll likely get a UnicodeEncodeError. I am on Windows and use an IDE that supports UTF-8 output, since the Windows console uses a legacy US-OEM encoding that doesn't support Japanese.
Run IDLE if you want to work with Unicode strings interactively in Python. Then inputting or printing any characters will just work.
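As a quick verification that a saved filename survived intact, passing a unicode argument to os.listdir in Python 2 makes it return unicode entries (a small sketch, assuming the file sits in the current directory):
import os
# A unicode argument makes listdir return unicode names on Python 2
print os.listdir(u'.')  # expect [u'\u9ed2\u8272.txt', ...] for 黒色.txt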

how to pass a value to c++ from python and back?

I would like to pass values from inside a Python program to a C++ program for encryption, and then return the result to the Python program. How do I do that?
If you want to use some existing Unix-style command line utility that reads from stdin and writes to stdout, you can use subprocess.Popen by using Popen.communicate():
import subprocess
p = subprocess.Popen(["/your/app"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
output = p.communicate(input)[0]
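For example, a concrete round trip through a hypothetical C++ encryption binary that reads plaintext on stdin and writes ciphertext to stdout (the path /your/app is a placeholder):
import subprocess
plaintext = b'secret message'
p = subprocess.Popen(['/your/app'],  # placeholder path to the C++ binary
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
ciphertext, _ = p.communicate(plaintext)  # send input, collect output
print(ciphertext)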
As msw said in the other post, the proper solution is to use PyObject.
If you want two-way communication between C++ and Python, Boost.Python would be interesting for you. Take a look at the Boost Python website.
This post would also be interesting:
How to expose a C++ class to Python without building a module
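Another route, not mentioned above, is the standard-library ctypes module, which can call an extern "C" function in a shared library directly; the library and function names in this sketch are hypothetical:
import ctypes
lib = ctypes.CDLL('./libencrypt.so')  # hypothetical shared library
lib.encrypt_byte.argtypes = [ctypes.c_ubyte]
lib.encrypt_byte.restype = ctypes.c_ubyte
result = lib.encrypt_byte(0x41)  # value goes to C++ and comes back
print(result)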