Python reading a constant serial byte length from device - python-2.7

I have a device that sends 23 characters (numbers and alpha's) via RS232 serial in the following format:
$02 T AAAAA Q CCC PP ZZZ S RR I I NFF $0D
(the spaces in the above string are for readability only)
In this 23 character string the:
$02 represents the start-of-text control character (STX, hex 0x02, decimal 2)
$0D represents a carriage return (CR, decimal 13).
I am currently reading this information in via Python, mostly successfully, but I still feel I am not doing it properly. I rarely program in Python, but I have to use a Raspberry Pi, so I decided to go with Python for the coding.
I setup my RPI serial port with the following function:
import serial

def setupSerialPort():
    ser = serial.Serial(
        port='/dev/ttyAMA0',
        baudrate=9600,
        parity=serial.PARITY_NONE,
        stopbits=serial.STOPBITS_ONE,
        bytesize=serial.EIGHTBITS,
        timeout=1,
        xonxoff=0,
        rtscts=0
    )
    return ser
From a while loop I read the port as follows:
from time import sleep

# setup serial port
cSerial = setupSerialPort()
while 1:
    inbuff = cSerial.inWaiting()
    if inbuff > 0:
        msgCOM = cSerial.read(inbuff)
        #vMsgCOM = re.sub('[^A-Za-z0-9]+', '', msgCOM)
        # insert value into database
    sleep(1)
at which point I insert the value "vMsgCOM" or "msgCOM" into a MySQL database as I read/receive the data. At first I thought this worked pretty well, but after a week of data it became clear that I sometimes capture only partial data, which splits over two database rows. I'll give an example:
A correct 23-character string looks like this: K00000E1120002000063B00.
Now sometimes the string is split into two rows like
(1) K00000E11200020
(2) 00063B00
Another variation is multiple 23-character chunks returned in a single read:
K00000E1120002000063B00K00000E1120002000063B00K00000E1120002000063B00
This happens roughly 15 times out of 400 reads for the above.
Can anyone help me in terms of coding to ensure that I always read the buffer correctly when the 23-character string arrives? I know timing can be an issue, hence the timeout=1, but somehow I read too quickly (or wait too long) when the read is not complete.
I had a look at the code in "pySerial inWaiting returns incorrect number of bytes" (the def read_all(port, chunk_size=200) function part) but haven't tried it yet; I thought it best to first ask advice from those in the know.
I have a bit of corrective code that concatenates the two rows and splits the multiple-chunk case should these instances happen, but I still think it is not the best way of doing things.
If anyone can help me with some example code I will really appreciate it.
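For what it's worth, a common pattern is to stop treating each read() as one message and instead accumulate bytes into a buffer, then cut complete STX...CR frames out of it. Below is a minimal sketch under the assumption that the 23 characters sit between the delimiters (adjust FRAME_LEN if the device counts $02/$0D in the 23); `ser` is the port returned by setupSerialPort() above, and the MySQL insert is only indicated by a comment:

```python
STX = '\x02'    # start-of-text delimiter
CR = '\x0d'     # carriage-return terminator
FRAME_LEN = 23  # assumed payload length between the delimiters

def extract_frames(buf):
    """Split buffered serial data into complete payloads plus leftover bytes."""
    frames = []
    while CR in buf:
        chunk, _, buf = buf.partition(CR)
        start = chunk.rfind(STX)
        payload = chunk[start + 1:] if start != -1 else chunk
        if len(payload) == FRAME_LEN:
            frames.append(payload)
        # any other length is a torn or corrupt frame: drop (or log) it
    return frames, buf

def read_loop(ser, handle):
    """Never assume one read() equals one message: accumulate, then split."""
    pending = ''
    while True:
        data = ser.read(1)                 # block up to `timeout` for a byte
        data += ser.read(ser.inWaiting())  # then drain whatever else arrived
        if not data:
            continue                       # timed out with nothing new
        frames, pending = extract_frames(pending + data)
        for payload in frames:
            handle(payload)                # e.g. the MySQL INSERT goes here
```

This covers both failure modes from the question: a torn frame stays in the buffer until its CR arrives, and a read that returns several concatenated frames yields them one by one.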

Related

Python 2.7 pySerial read() returns all characters in Python interpreter but not Python script. Also need robust way to read until buffer is empty

I am communicating with a microcontroller that automatically initializes its flash memory whenever you open its serial port. So on a serial port read, the microcontroller prints up to 10,000 bytes of data showing the addresses and their initial values. You need to leave the port open for the entirety of the print to ensure that initialization completed. I don't ever perform any writes, just reads.
I modified the pySerial buffer from 4k to 32k since I do not want any breaks between reads (subsequent reads will simply restart the init cycle). Below is a snippet of my code where I read from the microcontroller serial port. When I run this snippet from the interpreter, I can tell from printing temp and its length that it contains all 9956 bytes. However, when I run the py file, I get only 296 bytes. I inserted the sleep() after read(), but this had no effect. I cannot tell from the microcontroller whether the initialization completed.
Is there a robust way to read until the serial buffer is empty? The microcontroller image is application-specific, so I cannot always predict the required read() size or timeout.
Any ideas what I could try? I've searched other threads but haven't found anything specific to this problem.
# Create serial port instance
self.ser_port = serial.Serial()
self.ser_port.port = 0
self.ser_port.timeout = 0
.
.
.
self.ser_port.open()
time.sleep(1)
temp = self.ser_port.read(32768)
time.sleep(4)
self.ser_port.close()
A timeout of 0 tells pySerial to return immediately with whatever data is already available. If you want read() to wait to allow more data to come in, use a non-zero timeout.
The first sleep() could cause you to lose some of the data that comes in after open but before read, because the internal buffer might not be as big as the buffer you are passing to read(). But that is highly application-dependent -- if the device you are communicating with takes awhile after the open() to respond, then it might not be a problem.
The second sleep() does nothing for you, because the read() is already finished at that point.
Note that the timeout value is the maximum total amount of time to wait. read() will return when it has filled its buffer, or it has been this long after it was called, whichever comes first.
Here is some example code that will return under the following circumstances:
a) 10-12 seconds with no data
b) 2-4 silent seconds after the last data arrived, if at least 100 bytes were received
self.ser_port = serial.Serial()
self.ser_port.port = 0
self.ser_port.timeout = 2
.
.
.
self.ser_port.open()

timeout_count = 0
received = []
data_count = 0
while 1:
    buffer = self.ser_port.read(32768)
    if buffer:
        received.append(buffer)
        data_count += len(buffer)
        timeout_count = 0
        continue
    if data_count > 100:
        # Break if we received at least 100 bytes of data,
        # and haven't received any data for at least 2 seconds
        break
    timeout_count += 1
    if timeout_count >= 5:
        # Break if we haven't received any data for at
        # least 10 seconds, even if we never received
        # any data
        break
received = ''.join(received)  # If you need all the data
Depending on your microcontroller, you may be able to reduce these timeout values considerably, and still have a fairly robust solution for retrieving all the startup data from the microcontroller.

Sending ASCII Control Characters to Serial Device using libserial on Linux

I have a very basic device with which I am trying to interact via a serial connection on Linux. I am on the steep part of the learning curve here, still, so please be gentle!
One of the functions involves sending data to an attached printer. You send a command to the device, which then relays the data you input to the printer attached to the device. The command looks like this:
Send "EXEX*". The device echoes back "EXEX" (the '*' is not echoed yet)
Send a single byte indicating the length of the data you will send, including a LF at the end.
Send the data (the device will now echo back the *).
Send "#". The device will now be ready for another command.
I have a small C++ program to communicate with the device, and I can successfully send single characters and such, but when I try to send this command, I do not get the expected results.
Using Hyperterminal in Windows, it is particularly easy, using alt-key combinations to send ASCII control codes. Just connect and:
Type "EXEX*"
Type Alt+010 to send a LF character, indicating that you are sending 10 bytes to the printer (nine characters and a LF).
Type the data you wish to send: "123456789" (nine bytes in length).
Type Alt+010 again to send a final LF character to the printer.
Type "#" to finish.
Here is what I cobbled together to try in C++:
#include <SerialStream.h>
#include <string>
#include <iostream>
#include <fstream>
using namespace std;
using namespace LibSerial;
int main(){
char buffer [50];
int n;
n=sprintf (buffer, "EXEX*%c123456789%c#",10,10);
printf("Variable buffer was set to a %d character string: %s\n",n,buffer);
SerialStream my_serial_stream;
my_serial_stream.Open("/dev/ttyS0") ;
my_serial_stream.SetBaudRate( SerialStreamBuf::BAUD_19200 ) ;
my_serial_stream.SetCharSize( SerialStreamBuf::CHAR_SIZE_8 ) ;
my_serial_stream.SetFlowControl( SerialStreamBuf::FLOW_CONTROL_NONE ) ;
my_serial_stream.SetParity( SerialStreamBuf::PARITY_NONE ) ;
my_serial_stream.SetNumOfStopBits(1) ;
my_serial_stream.SetVTime(1);
my_serial_stream.SetVMin(100);
cout<<"Sending Command:\n";
my_serial_stream << buffer;
//my_serial_stream << printf("%s",buffer);
//my_serial_stream << "EXEX*\n123456789\n#";
my_serial_stream.read(next_char,100);
cout<<"Result: "<<next_char<<"\n";
my_serial_stream.Close();
return 0;
}
I also tried both of the commented-out lines, and neither worked. The device does not receive the proper characters on the other end.
I'm certain that this is pretty basic, perhaps something is grabbing the control characters in the middle? If anyone has any ideas on a better way to do this, I would really appreciate it. Specifically, I might need to send a byte with a value anywhere between 1 and 40, depending on the length of the data I wish to send to the printer.
My apologies for being unclear; please let me know if I need to break this down further.
Many thanks,
Tom
The line you send doesn't include the # that you mention in the character sequence.
Have you checked serial comms works on /dev/ttyS0 using gtkterm / cutecom etc?
To test your interface you could read back the serial port. If you have a second port or computer, you could do that by connecting to another port via a null modem. Otherwise you could short pins 2 and 3 of your serial port and check that you are receiving back the characters you send.
You may want to check the return values of the calls to make to the serial library, to see if any errors are returned.
Perhaps there are timing requirements on the printer, and you may need to wait between sending some characters.
I compiled the code and checked the output on another serial port with gtkterm, it does receive the string you would expect:
45 58 45 58 2A 0A 31 32 - 33 34 35 36 37 38 39 0A EXEX*.12 3456789.
It won't affect the sending part of the code, but the receiving looks suspicious. If the read() member function is like the system call and if next_char is a character array, then it won't null terminate the string. Instead you have to look at the return value to get the size, and then null terminate if you are going to use it as a null-terminated C string.
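To sanity-check the framing independently of libserial, a small scripted sketch can build the command and hex-dump it the way gtkterm's hex view does. This is Python for brevity; build_print_command is an illustrative name, and the 1..40 length limit comes from the question's description, not from any device documentation:

```python
def build_print_command(data):
    """Build the device command described in the question: 'EXEX*', one raw
    length byte covering the payload plus its trailing LF, the payload,
    an LF, then '#'."""
    payload = data + '\n'
    if not 1 <= len(payload) <= 40:
        raise ValueError('length byte must be 1..40 per the device spec')
    return 'EXEX*' + chr(len(payload)) + payload + '#'

def hexdump(s):
    """Render a string byte by byte, like gtkterm's hex view."""
    return ' '.join('%02X' % ord(c) for c in s)

cmd = build_print_command('123456789')
print(hexdump(cmd))  # -> 45 58 45 58 2A 0A 31 32 33 34 35 36 37 38 39 0A 23
```

Note that for a nine-character payload the length byte is chr(10), which happens to equal LF; that is exactly the Alt+010 keystroke from the Hyperterminal walkthrough, and it matches the hex dump shown above byte for byte (plus the trailing 0x23 for '#').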

Changing this protocol to work with TCP streaming

I made a simple protocol for my game:
b = bool
i = int
sINT: = string whose length is INT followed by a : then the string
m = int message id.
Example:
m133s11:Hello Worldi-57989b0b1b0
This would be:
Message ID 133
String 'Hello World' length 11
int -57989
bool false
bool true
bool false
I did not know, however, that TCP is a byte stream and a single receive may deliver only PART of a message. I'm not sure exactly how I could modify this so that I can do the following:
on receive data from client:
    use client's chunk parser
    process data
    if has partial message then try to find matching END
    if no partial messages then try to read a whole message
    for each complete message in queue, dispatch it
I could do this by adding B at the beginning of a message and E at the end, then parsing through for the first char to be B and last to be E.
The only problem is what if
I receive something silly in the middle that does not follow the protocol. Or, what if I was supposed to just receive something that is not a message and is just a string. So if I was somehow intended to receive the string HelloB, then I would parse this as hello and the beginning of a message, but I would never receive that message because it is not a message.
How could I modify my protocol to solve these potential issues? As much as I anticipate only ever receiving correctly formed messages, it would be a nightmare if one was poorly encoded and set everything out of whack.
Thanks
I decided to add the length at the beginning and keep track of if I'm working on a message or not:
so:
p32m133s11:Hello Worldi-57989b0b1b0
I then have 3 states, reading to find 'p', reading to find the length after 'p' or reading bytes until length bytes have been read.
What do you think?
It seems to work great.
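That three-state approach (hunt for 'p', read the length, then read exactly that many bytes) can be sketched as follows. The one assumption is that the length field is the run of ASCII digits immediately after 'p', which works here because the body always starts with a non-digit such as 'm':

```python
class FrameParser(object):
    """Three-state parser for the 'p<len><body>' framing: hunt for 'p',
    read the decimal length, then read exactly that many body bytes."""

    def __init__(self):
        self.buf = ''

    def feed(self, data):
        """Append newly received bytes; return any complete message bodies."""
        self.buf += data
        messages = []
        while True:
            start = self.buf.find('p')
            if start == -1:
                self.buf = ''                # no frame start yet: drop noise
                return messages
            rest = self.buf[start + 1:]
            digits = ''
            for c in rest:
                if c.isdigit():
                    digits += c
                else:
                    break
            if not digits or len(digits) == len(rest):
                self.buf = self.buf[start:]  # length field still incomplete
                return messages
            body = rest[len(digits):]
            if len(body) < int(digits):
                self.buf = self.buf[start:]  # body still incomplete: wait
                return messages
            messages.append(body[:int(digits)])
            self.buf = body[int(digits):]    # keep parsing the remainder
```

Because the parser keeps any incomplete frame in its buffer, it doesn't matter how TCP fragments the stream: each feed() call returns only messages whose full length has arrived.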
What you are doing is pretty old-school, magnetic tape stuff. Nice.
The issue you might have is that if a part of the message is received, you cannot tell if you are partway through a token.
E.g. if you receive:
m12
Is this Message 12, or is it the first part of message 122?
If you receive:
i-12
Is this an integer -12 or is it the first part of an integer -124354?
So I think you need to change it so that the message numbers are fixed width (e.g. four digits), the string length is fixed (e.g. 6 digits) and the integer width is fixed at 10 digits.
So your example would be:
m_133s____11:Hello Worldi____-57989b0b1b0
That way if you get the first part of a message you can store it and wait for the remainder to be received before you process it.
You might also consider using control characters to separate message parts. There are ascii control codes often used for this purpose, RS, FS, GS and US. So a message could be
[RS]FieldName[US]FieldValue[RS]fieldName[US]FieldValue[GS].
You know when you have a complete message because the [GS] marks the end. You can then divide it up into fields using the [RS] as a separator, and split each into name/value using the [US].
See http://en.wikipedia.org/wiki/C0_and_C1_control_codes for a brief bit of information.
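A hedged sketch of that separator-based framing (Python; the encode/decode names are illustrative, not a standard API):

```python
RS, US, GS = '\x1e', '\x1f', '\x1d'  # record / unit / group separators

def encode(fields):
    """Frame a dict of name/value pairs with the C0 separator characters."""
    records = [RS + name + US + value for name, value in sorted(fields.items())]
    return ''.join(records) + GS     # GS marks end-of-message

def decode_stream(buf):
    """Return (complete decoded messages, unconsumed remainder)."""
    messages = []
    while GS in buf:
        msg, _, buf = buf.partition(GS)
        fields = {}
        for record in msg.split(RS):
            if record:
                name, _, value = record.partition(US)
                fields[name] = value
        messages.append(fields)
    return messages, buf
```

Anything after the last GS stays in the remainder, so a partially received message is simply carried over to the next decode_stream call.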

Arduino Ethernet Byte size problem

I'm using an Arduino (Duemilanove) with the official Ethernet shield to send data to the controller driving an LED matrix. I am trying to send raw 32-bit unsigned int values (Unix timestamps) to the controller by taking the 4 bytes of the 32-bit value on the desktop and sending them to the Arduino as 4 consecutive bytes. However, whenever a byte value is larger than 127, the value returned by the Ethernet client library is 63.
The following is a basic example of what I'm doing on the arduino side of things. Some things have been removed for neatness.
byte buffer[32];
memset(buffer, 0, 32);
int data;
int i = 0;

data = client.read();
while(data != -1 && i < 32)
{
    buffer[i++] = (byte)data;
    data = client.read();
}
So, whenever the input byte is bigger than 127 the variable "data" will end up getting set to 63! At first I thought the problem was further down the line (buffer used to be char instead of byte) but when I print out "data" right after the read, it's still 63.
Any ideas what could be causing this? I know client.read() is supposed to output int and internally reads data from the socket as uint8_t which is a full byte and unsigned, so I should be able to at least go to 255...
EDIT: Right you are, Hans. Didn't realize that Encoding.ASCII.GetBytes only supported the first 7 bits and not all 8.
I'm more inclined to suspect the transmit side. Are you positive the transmit side is working correctly? Have you verified with a wireshark capture or some such?
63 is the ASCII code for ?. That's no coincidence: ASCII has no character codes for values over 127, and an ASCII encoder commonly replaces such invalid codes with a question mark. That is, for example, the default behavior of the .NET Encoding.ASCII encoder.
It isn't exactly clear where that might happen. Definitely not in your snippet. Probably on the other end of the wire. Write bytes, not characters.
+1 for Hans Passant and Karl Bielefeldt.
Can you just send the data without encoding? How is the data being sent? TCP/UDP/IP/Ethernet definitely support sending binary data without restriction. If this isn't possible, perhaps converting the data to hex will solve the problem. Base64 will also work (better) but is considerably more work. For small amounts of data, hex is probably the easiest and fastest solution.
+1 again to Karl and Ben for mentioning wireshark. Invaluable for debugging network problems like this.
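The usual fix on the transmit side is to pack the timestamp as 4 raw bytes instead of routing it through a text encoder. A sketch in Python (the actual sender appears to be .NET, so treat this as illustrative; big-endian byte order is an assumption and must match whatever the Arduino code expects):

```python
import struct

def timestamp_to_bytes(ts):
    """Pack an unsigned 32-bit Unix timestamp as 4 raw big-endian bytes.
    Byte values over 127 survive intact; an ASCII encoder would replace
    them with '?' (0x3F, decimal 63), which is exactly the symptom above."""
    return struct.pack('>I', ts)

# Send the result with the socket's raw write/sendall; never pass it
# through a character encoding such as .NET's Encoding.ASCII.
```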

recv windows, one byte per call, what the?

c++
#define BUF_LEN 1024
The code below only receives one byte when it's called, then immediately moves on.
output = new char[BUF_LEN];
bytes_recv = recv(cli, output, BUF_LEN, 0);
output[bytes_recv] = '\0';
Any idea how to make it receive more bytes?
EDIT: the client connecting is Telnet.
The thing to remember about networking is that you will be able to read as much data as has been received. Since your code is asking for 1024 bytes and you only read 1, then only 1 byte has been received.
Since you are using a telnet client, it sounds like you have it configured in character mode. In this mode, as soon as you type a character, it will be sent.
Try to reconfigure your telnet client in line mode. In line mode, the telnet client will wait until you hit return before it sends the entire line.
On my telnet client, to do that I first type ctrl-] to get to the telnet prompt and then type "mode line" to configure telnet in line mode.
Update
On further thought, this is actually a very good problem to have.
In the real world, your data can get fragmented in unexpected ways. The client may make a single send() call of N bytes, but the data may not arrive in a single packet. If your code can handle data arriving byte by byte, then you know it will work no matter how the data arrives.
What you need to do is make sure that you accumulate your data across multiple receives. After your recv call returns, you should append the data to a buffer. Something like:
char *accumulate_buffer = new char[BUF_LEN];
size_t accumulate_buffer_len = 0;
...
bytes_recv = recv(fd,
                  accumulate_buffer + accumulate_buffer_len,
                  BUF_LEN - accumulate_buffer_len,
                  0);
if (bytes_recv > 0)
    accumulate_buffer_len += bytes_recv;
if (can_handle_data(accumulate_buffer, accumulate_buffer_len))
{
    handle_data(accumulate_buffer, accumulate_buffer_len);
    accumulate_buffer_len = 0;
}
This code keeps accumulating the recv into a buffer until there is enough data to handle. Once you handle the data, you reset the length to 0 and you start accumulating afresh.
First, in this line:
output[bytes_recv] = '\0';
you need to check if bytes_recv < 0 first before you do that because you might have an error. And the way your code currently works, you'll just randomly stomp on some random piece of memory (likely the byte just before the buffer).
Secondly, the fact you are null terminating your buffer indicates that you're expecting to receive ASCII text with no embedded null characters. Never assume that, you will be wrong at the worst possible time.
Lastly stream sockets have a model that's basically a very long piece of tape with lots of letters stamped on it. There is no promise that the tape is going to be moving at any particular speed. When you do a recv call you're saying "Please give me as many letters from the tape as you have so far, up to this many.". You may get as many as you ask for, you may get only 1. No promises. It doesn't matter how the other side spit bits of the tape out, the tape is going through an extremely complex bunch of gears and you just have no idea how many letters are going to be coming by at any given time.
If you care about certain groupings of characters, you have to put things in the stream (on the tape) saying where those units start and/or end. There are many ways of doing this. Telnet itself uses several different ones in different circumstances.
And on the receiving side, you have to look for those markers and put the sequences of characters you want to treat as a unit together yourself.
So, if you want to read a line, you have to read until you get a '\n'. If you try to read 1024 bytes at a time, you have to take into account that the '\n' might end up in the middle of your buffer and so your buffer may contain the line you want and part of the next line. It might even contain several lines. The only promise is that you won't get more characters than you asked for.
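That read-until-'\n' accumulation can be sketched like this (Python rather than C++ for brevity; the buffering logic is language-independent):

```python
def split_lines(pending, data):
    """Append newly received bytes to the carry-over buffer and return
    (complete lines, remainder). recv() may deliver a line in many
    fragments, or several lines at once; both cases are handled."""
    pending += data
    lines = []
    while '\n' in pending:
        line, _, pending = pending.partition('\n')
        lines.append(line)
    return lines, pending

# Usage with a connected socket `sock` (byte decoding omitted for brevity):
#     carry = ''
#     while True:
#         data = sock.recv(1024)
#         if not data:
#             break                      # peer closed the connection
#         lines, carry = split_lines(carry, data)
#         for line in lines:
#             handle(line)
```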
Nagle's algorithm on the sending side can coalesce small writes into larger packets, but even then you cannot rely on receiving whole messages per recv() call; you still need to accumulate and frame the data yourself.