pyserial data gaps - python-2.7

I have python code that acquires serial data from 2 devices and writes to a .txt file. Every 4-15 minutes there is approx 30-45 seconds of data missing in the .txt file and this is not acceptable for our use case. I've spent hours googling and searching SO about multiprocessing and serial port data acquisition and haven't come up with a solution.
Here is my code
gpsser = input(("Enter GPS comport as 'COM_': "))
ser = serial.Serial(port=gpsser,
baudrate=38400,
timeout=2,
parity=serial.PARITY_NONE,
stopbits=serial.STOPBITS_ONE,
bytesize=serial.EIGHTBITS)
root = Tk()
root.title("DualEM DAQ")
path = filedialog.asksaveasfilename() + ".txt"
file = glob.glob(path)
filename = path
with open(filename, 'wb') as f:
w = csv.writer(f, dialect='excel')
w.writerow(['header'])
def sensor():
while True:
try:
NMEA1 = dser.readline().decode("ascii")
while dser.inWaiting() == 0:
pass
NMEA1_array = NMEA1.split(',')
NMEA2_array = NMEA2.split(',')
NMEA3_array = NMEA3.split(',')
NMEA4_array = NMEA4.split(',')
if NMEA1_array[0] == '$PDLGH':
value1 = NMEA1_array[2]
value2 = NMEA1_array[4]
if NMEA1_array[0] == '$PDLG1':
value3 = NMEA1_array[2]
value4 = NMEA1_array[4]
if NMEA1_array[0] == '$PDLG2':
value5 = NMEA1_array[2]
value6 = NMEA1_array[4]
return (float(value1), float(value2), float(value3),
float(value4), float(value5), float(value6),
except (IndexError, NameError, ValueError, UnicodeDecodeError):
pass
def gps():
while True:
try:
global Status, Latitude, Longitude, Speed, Truecourse, Date
global GPSQuality, Satellites, HDOP, Elevation, Time
while ser.inWaiting() == 0:
pass
msg = ser.readline()
pNMEA = pynmea2.parse(msg)
if isinstance(pNMEA, pynmea2.types.talker.RMC):
Latitude = pynmea2.dm_to_sd(pNMEA.lat)
Longitude = -(pynmea2.dm_to_sd(pNMEA.lon))
Date = pNMEA.datestamp
Time = datetime.datetime.now().time()
if () is not None:
return (Longitude, Latitude, Date, Time)
except (ValueError, UnboundLocalError, NameError):
pass
while True:
try:
with open(filename, "ab") as f:
data = [(gps() + sensor())]
writer = csv.writer(f, delimiter=",", dialect='excel')
writer.writerows(data)
f.flush()
print(data)
except (AttributeError, TypeError) as e:
pass
The program is writing to the file but I need help understanding why I'm losing 30-45 seconds of data every so often. Where is my bottle neck that is causing this to happen?
Here is an example of where the breaks are, note the breaks are approx 50 seconds in this case.
Breaks in writing data to csv
DB

Back when I used PySerial, I did this:
nbytes = ser.inWaiting()
if nbytes > 0:
indata = ser.read(nbytes)
#now parse bytes in indata to look for delimiter, \n in your case
#and if found process the input line(s) until delimiter not found
else:
#no input yet, do other processing or allow other things to run
#by using time.sleep()
Also note that new versions (3.0+) of PySerial have .in_waiting as a property not a method, so no (), it used to be .inWaiting().

You should not flush the serial port input. Data is arriving on its own timing into a buffer in the driver, not when your read happens, so you are throwing away data with your flush. You may need to add code to synchronize with the input stream.

I used threading with a queue and changed my mainloop to look like this.
while True:
try:
with open(filename, "ab") as f:
writer = csv.writer(f, delimiter=",", dialect='excel')
data = []
data.extend(gpsdata())
data.extend(dualemdata())
writer.writerows([data])
f.flush()
f.close()
dser.flushInput()
ser.flushInput()
print(data)
sleep(0.05)
except (AttributeError, TypeError) as e:
pass
I had to flush the serial port input data before the looping back to the read functions so it was reading new realtime data(this eliminated any lag of the incoming data stream). I've ran a 30 minute test and the time gaps appear to have go away. Thankyou to Cmaster for giving me some diagnostic ideas.

Related

Memory Error when exporting data to csv file

Hello I was hoping someone could help me with my college coursework, I have an issue with my code. I keep running into a memory error with my data export.
Is there any way I can reduce the memory that is being used or is there a different approach I can take?
For the course work I am given a file of 300 records about customer orders from a CSV file and then I have to export the Friday records to a new CSV file. Also I am required to print the most popular method for customer's orders and the total money raised from the orders but I have an easy plan for that.
This is my first time working with CSV so I'm not sure how to do it. When I run the program it tends to crash instantly or stop responding. Once it appeared with 'MEMORY ERROR' however that is all it appeared with. I'm using a college provided computer so I am not sure on the exact specs but I know it runs 4GB of memory.
defining count occurences predefined function
def countOccurences(target,array):
counter = 0
for element in array:
if element == target:
counter= counter + 1
print counter
return counter
creating user defined functions for the program
dataInput function used for collecting data from provided file
def dataInput():
import csv
recordArray = []
customerArray = []
f = open('E:\Portable Python 2.7.6.1\Choral Shield Data File(CSV).csv')
csv_f = csv.reader(f)
for row in csv_f:
customerArray.append(row[0])
ticketID = row[1]
day, area = datasplit(ticketID)
customerArray.append(day)
customerArray.append(area)
customerArray.append(row[2])
customerArray.append(row[3])
recordArray.append(customerArray)
f.close
return recordArray
def datasplit(variable):
day = variable[0]
area = variable[1]
return day,area
def dataProcessing(recordArray):
methodArray = []
wed_thursCost = 5
friCost = 10
record = 0
while record < 300:
method = recordArray[record][4]
methodArray.append(method)
record = record+1
school = countOccurences('S',methodArray)
website = countOccurences('W',methodArray)
if school > website:
school = True
elif school < website:
website = True
dayArray = []
record = 0
while record < 300:
day = recordArray[record][1]
dayArray.append(day)
record = record + 1
fridays = countOccurences('F',dayArray)
wednesdays = countOccurences('W',dayArray)
thursdays = countOccurences('T', dayArray)
totalFriCost = fridays * friCost
totalWedCost = wednesdays * wed_thursCost
totalThurCost = thursdays * wed_thursCost
totalCost = totalFriCost + totalWedCost + totalThurCost
return totalCost,school,website
My first attempt to writing to a csv file
def dataExport(recordArray):
import csv
fridayRecords = []
record = 0
customerIDArray = []
ticketIDArray = []
numberArray = []
methodArray = []
record = 0
while record < 300:
if recordArray[record][1] == 'F':
fridayRecords.append(recordArray[record])
record = record + 1
with open('\Courswork output.csv',"wb") as f:
writer = csv.writer(f)
for record in fridayRecords:
writer.writerows(fridayRecords)
f.close
My second attempt at writing to the CSV file
def write_file(recordArray): # write selected records to a new csv file
CustomerID = []
TicketID = []
Number = []
Method = []
counter = 0
while counter < 300:
if recordArray[counter][2] == 'F':
CustomerID.append(recordArray[counter][0])
TicketID.append(recordArray[counter][1]+recordArray[counter[2]])
Number.append(recordArray[counter][3])
Method.append(recordArray[counter][4])
fridayRecords = [] # a list to contain the lists before writing to file
for x in range(len(CustomerID)):
one_record = CustomerID[x],TicketID[x],Number[x],Method[x]
fridayRecords.append(one_record)
#open file for writing
with open("sample_output.csv", "wb") as f:
#create the csv writer object
writer = csv.writer(f)
#write one row (item) of data at a time
writer.writerows(recordArray)
f.close
counter = counter + 1
#Main Program
recordArray = dataInput()
totalCost,school,website = dataProcessing(recordArray)
write_file(recordArray)
In the function write_file(recordArray) in your second attempt the counter variable counter in the first while loop is never updated so the loop continues for ever until you run out of memory.

PyAudio recorder script IOError: [Errno Input overflowed] -9981

The code below is what I use to record audio until the "Enter" key is pressed it returns an exception,
import pyaudio
import wave
import curses
from time import gmtime, strftime
import sys, select, os
# Name of sub-directory where WAVE files are placed
current_experiment_path = os.path.dirname(os.path.realpath(__file__))
subdir_recording = '/recording/'
# print current_experiment_path + subdir_recording
# Variables for Pyaudio
chunk = 1024
format = pyaudio.paInt16
channels = 2
rate = 48000
# Set variable for the labelling of the recorded WAVE file.
timestamp = strftime("%Y-%m-%d-%H:%M:%S", gmtime())
#wave_output_filename = '%s.wav' % self.get('subject_nr')
wave_output_filename = '%s.wav' % timestamp
print current_experiment_path + subdir_recording + wave_output_filename
# pyaudio recording stuff
p = pyaudio.PyAudio()
stream = p.open(format = format,
channels = channels,
rate = rate,
input = True,
frames_per_buffer = chunk)
print "* recording"
# Create an empty list for audio recording
frames = []
# Record audio until Enter is pressed
i = 0
while True:
os.system('cls' if os.name == 'nt' else 'clear')
print "Recording Audio. Press Enter to stop recording and save file in " + wave_output_filename
print i
if sys.stdin in select.select([sys.stdin], [], [], 0)[0]:
line = raw_input()
# Record data audio data
data = stream.read(chunk)
# Add the data to a buffer (a list of chunks)
frames.append(data)
break
i += 1
print("* done recording")
# Close the audio recording stream
stream.stop_stream()
stream.close()
p.terminate()
# write data to WAVE file
wf = wave.open(current_experiment_path + subdir_recording + wave_output_filename, 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(format))
wf.setframerate(rate)
wf.writeframes(''.join(frames))
wf.close()
The exception produced is this
Recording Audio. Press Enter to stop recording and save file in 2015-11-20-22:15:38.wav
925
Traceback (most recent call last):
File "audio-record-timestamp.py", line 51, in <module>
data = stream.read(chunk)
File "/Library/Python/2.7/site-packages/pyaudio.py", line 605, in read
return pa.read_stream(self._stream, num_frames)
IOError: [Errno Input overflowed] -9981
What is producing the exception? I tried changing the chunk size (512,256,8192) it doesn't work. Changed the while loop condition and it didn't work.
I had a similar problem; there are 3 ways to solve it (that I could find)
set rate=24000
add option "exception_on_overflow=False" to the "read()" call, that is, make it "stream.read(chunk, exception_on_overflow=False)"
use callbacks
Here is, for your convenience, an example o "using callbacks"
#!/usr/bin/python
import sys, os, math, time, pyaudio
try:
import numpy
except:
numpy = None
rate=48000
chan=2
sofar=0
p = pyaudio.PyAudio()
def callback(in_data, frame_count, time_info, status):
global sofar
sofar += len(in_data)
if numpy:
f = numpy.fromstring(in_data, dtype=numpy.int16)
sys.stderr.write('length %6d sofar %6d std %4.1f \r' % \
(len(in_data), sofar, numpy.std(f)))
else:
sys.stderr.write('length %6d sofar %6d \r' % \
(len(in_data), sofar))
data = None
return (data, pyaudio.paContinue)
stream = p.open(format=pyaudio.paInt16, channels=chan, rate=rate,
input=True, stream_callback=callback)
while True:
time.sleep(1)

How to use user-defined functions in python?

I'm just learning python so be gentle. I want to read a file and that to be one function and then have another function work on what the previous function "read". I am having trouble passing the result on one function to another. Here is what I have no far:
I want to call read_file more than once and to be able to pass its result to more than one function, therefore, I do not want frame to be a global variable. How do I get read_file to pass 'frame' to cost_channelID directly? Perhaps, for cost_channelID to call read_file?
def read_file():
user_input = raw_input("please put date needed in x.xx form: ")
path = r'C:\\Users\\CP\\documents\\' + user_input
allFiles = glob.glob(path + '/*.csv')
frame = pd.DataFrame()
list = []
for file in allFiles:
df = pd.read_csv(file,index_col=None,header=0)
list.append(df)
frame =pd.concat(list,ignore_index=True)
def cost_channelID():
numbers =r'[0,1,2,3,4,5,6,7,8,9]'
Ads = frame['Ad']
ID = []
for ad in Ads:
num = ''.join(re.findall(numbers,ad)[1:7])
ID.append(num)
ID = pd.Series(ID)
pieces = [frame,ID]
frame2 = pd.concat(pieces,ignore_index=True,axis=1)
frame2 = frame2.rename(columns={0:'Ad',1:'Ad Impressions',2:'Total Ad Spend',3:'eCPM (Total Ad Spend/Ad)',4:'Ad Attempts',5:'ID'})
Any and all help is greatly appreciated!
Here is your modified code (comments in uppercase for easier finding, not rudeness):
def read_file():
user_input = raw_input("please put date needed in x.xx form: ")
path = r'C:\\Users\\CP\\documents\\' + user_input
allFiles = glob.glob(path + '/*.csv')
frame = pd.DataFrame()
list = []
for file in allFiles:
df = pd.read_csv(file,index_col=None,header=0)
list.append(df)
frame =pd.concat(list,ignore_index=True)
return(frame) #YOUR FUNCTION WILL BE RETURNING THE READ FRAME
def cost_channelID(read_frame): #YOU WILL BE RECEIVING A DIFFERENT FRAME EVERY TIME
numbers =r'[0,1,2,3,4,5,6,7,8,9]'
Ads = read_frame['Ad']
ID = []
for ad in Ads:
num = ''.join(re.findall(numbers,ad)[1:7])
ID.append(num)
ID = pd.Series(ID)
pieces = [frame,ID]
frame2 = pd.concat(pieces,ignore_index=True,axis=1)
frame2 = frame2.rename(columns={0:'Ad',1:'Ad Impressions',2:'Total Ad Spend',3:'eCPM (Total Ad Spend/Ad)',4:'Ad Attempts',5:'ID'})
#HERE YOU MEANT TO USE FRAME2 WITHIN THE FUNCTION? IT IS WHAT HAPPENING BECAUSE OF THE INDENTATION
frame1 = read_file() #YOU CAN READ AS MANY FRAMES AS YOU WANT AND THEY'LL BE KEPT IN SEPARATED FRAMES
frame2 = read_file()
#...framexx = read_file()
#AND YOU CAN JUST CALL cost_channelID in any of them (or any other function)
const_channelID(frame1)
const_channelID(frame2)
#... AND SO ON
If you are trying to pass frame from one function to the other, you need to declare it outside the scope of the function. Otherwise we need more information about what you are trying to accomplish.
frame = None
def read_file():
user_input = raw_input("please put date needed in x.xx form: ")
path = r'C:\\Users\\CP\\documents\\' + user_input
allFiles = glob.glob(path + '/*.csv')
frame = pd.DataFrame()
list = []
for file in allFiles:
df = pd.read_csv(file,index_col=None,header=0)
list.append(df)
frame =pd.concat(list,ignore_index=True)
def cost_channelID():
numbers =r'[0,1,2,3,4,5,6,7,8,9]'
Ads = frame['Ad']
ID = []
for ad in Ads:
num = ''.join(re.findall(numbers,ad)[1:7])
ID.append(num)
ID = pd.Series(ID)
pieces = [frame,ID]
frame2 = pd.concat(pieces,ignore_index=True,axis=1)
frame2 = frame2.rename(columns={0:'Ad',1:'Ad Impressions',2:'Total Ad Spend',3:'eCPM (Total Ad Spend/Ad)',4:'Ad Attempts',5:'ID'})
Here is a good reference for scope rules in python
Short Description of the Scoping Rules?

IndexError, but more likely I/O error

Unsure of why I am getting this error. I'm reading from a file called columns_unsorted.txt, then trying to write to columns_unsorted.txt. There error is on fan_on = string_j[1], saying list index out of range. Here's my code:
#!/usr/bin/python
import fileinput
import collections
# open document to record results into
j = open('./columns_unsorted.txt', 'r')
# note this is a file of rows of space-delimited date in the format <1384055277275353 0 0 0 1 0 0 0 0 22:47:57> on each row, the first term being unix times, the last human time, the middle binary indicating which machine event happened
# open document to read from
l = open('./columns_sorted.txt', 'w')
# CREATE ARRAY CALLED EVENTS
events = collections.deque()
i = 1
# FILL ARRAY WITH "FACTS" ROWS; SPLIT INTO FIELDS, CHANGE TYPES AS APPROPRIATE
for line in j: # columns_unsorted
line = line.rstrip('\n')
string_j = line.split(' ')
time = str(string_j[0])
fan_on = int(string_j[1])
fan_off = int(string_j[2])
heater_on = int(string_j[3])
heater_off = int(string_j[4])
space_on = int(string_j[5])
space_off = int(string_j[6])
pump_on = int(string_j[7])
pump_off = int(string_j[8])
event_time = str(string_j[9])
row = time, fan_on, fan_off, heater_on, heater_off, space_on, space_off, pump_on, pump_off, event_time
events.append(row)
You are missing the readlines function, no?
You have to do:
j = open('./columns_unsorted.txt', 'r')
l = j.readlines()
for line in l:
# what you want to do with each line
In the future, you should print some of your variables, just to be sure the code is working as you want it to, and to help you identifying problems.
(for example, if in your code you would print string_j you would see what kind of problem you have)
Problem was an inconsistent line in the data file. Forgive my haste in posting

NMEA data parsing using python 2.7

i want readings of present GPS location, when i run the below code in raspberry pi, program prints 10-12 outputs and then it displays the error as below:
Traceback (most recent call last):
File "simplegpsparsing.py", line 24, in
get_present_gps()
File "simplegpsparsing.py", line 16, in get_present_gps
lat, _, lon= line.split(',')[2:5]
ValueError: need more than 0 values to unpack
i want present value of GPS( buffers should be updated with immediate GPS) so that present GPS values can be known.
my code goes as below :
import os
import serial
def get_present_gps:
ser= serial.Serial('/dev/ttyAMA0',9600)
ser.open()
while True :
f=open('/home/pi/Desktop/gps1','w')
data=ser.read(ser.inWaiting()) # read no.of bytes in waiting
f.write(data) #write data into file
f.flush() # flush(method) from buffer into os buffer
os.fsync(f.fileno()) #ensure to write from os buffer(internal buffers)into disk
f = open('/home/pi/Desktop/gps1','r') # fetch the required file
for line in f.read().split('\n') :
if line.startswith( '$GPGGA' ) :
lat, _, lon = line.strip().split(',')[2:5]
try :
lat = float( lat )
lon = float( lon )
print lat
print lon
except :
pass
# something wrong happens with your data, print some error messages
get_present_gps()
if the serial port is left open without closing, will it create any problem? will the above code meet my requirement i.e getting the instantaneous value?
Here is a thought: parse the data string into chunks that are newline terminated; leave "unfinished" strings in the buffer until the next time around the loop. Untested, this will look something like this:
keepThis = ''
while True :
data=keepThis + ser.read(ser.inWaiting()) # read no.of bytes in waiting
m = data.split("\n") # get the individual lines from input
if len(m[-1]==0): # true if only complete lines present (nothing after last newline)
processThis = m
keepThis = ''
else:
processThis = m[0:-1] # skip incomplete line
keepThis = m[-1] # fragment
# process the complete lines:
for line in processThis:
if line.startswith( '$GPGGA' ) :
try :
lat, _, lon = line.strip().split(',')[2:5] # move inside `try` just in case...
lat = float( lat )
lon = float( lon )
print lat
print lon
except :
pass