Reading date from file - python-2.7

I've got the following code which errors and I'm not sure why.
from datetime import datetime
import os
Right_now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print('Right now is ' + Right_now)
filename = 'sysdate.txt'
with open(filename,"r") as fin:
    last_date = fin.read()
    my_date = datetime.strptime(str(last_date), '%Y-%m-%d %H:%M:%S')
fin.close()
The file contains the following date format
2018-01-18 11:01:54
However, I'm getting the following error message:
Right now is 2018-01-18 11:16:13
Traceback (most recent call last):
  File "test1.py", line 11, in <module>
    my_date = datetime.strptime(str(last_date), '%Y-%m-%d %H:%M:%S')
  File "/usr/lib64/python2.7/_strptime.py", line 328, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains:
Python version is 2.7.5

Assuming you are interested only in the last date (if there is more than one), you need to modify your program to account for the empty string at the very end:
with open(filename,"r") as fin:
    last_date = fin.read().split('\n')[-2]
    my_date = datetime.strptime(last_date, '%Y-%m-%d %H:%M:%S')
Using .read() reads the whole file at once, trailing newline included. Splitting that text on '\n' leaves an empty string as the last element, because nothing follows the final newline. You can read more about it in this post and in the documentation.
Your corrected program reads the file, splits it into lines on the newline character, and selects the second-to-last element, which is your target date. The strptime call then runs successfully. Note that [-2] assumes the file ends with exactly one newline.
You don't need the .close() at the end; the with statement closes the file automatically:
As of Python 2.5, you can avoid having to call this method explicitly
if you use the with statement. For example, the following code will
automatically close f when the with block is exited:
as written in the documentation.
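A minimal sketch of a variant that does not depend on how many newlines end the file: it keeps only non-blank lines and parses the last one. The helper name read_last_date and the DATE_FORMAT constant are made up for illustration, and the file is assumed to hold one date per line.

```python
from datetime import datetime

DATE_FORMAT = '%Y-%m-%d %H:%M:%S'


def read_last_date(path):
    """Return the last non-blank line of `path` parsed as a datetime."""
    with open(path, 'r') as fin:
        # Strip surrounding whitespace (including '\n') and drop blank lines.
        lines = [line.strip() for line in fin if line.strip()]
    # The file is already closed here; no explicit close() is needed.
    if not lines:
        raise ValueError('no date found in %s' % path)
    return datetime.strptime(lines[-1], DATE_FORMAT)
```

Calling read_last_date('sysdate.txt') would then return the datetime for the last date in the file, whether or not the file ends with a newline.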

Your problem is the newline character.
I had written a very similar program, reading one date from each row in a text file, and got the same error message.
I solved it by using rstrip() to remove the newline in a similar fashion:
with open(filename,"r") as fin:
    last_date = fin.read()
    my_date = datetime.strptime(last_date.rstrip(), '%Y-%m-%d %H:%M:%S')
Called without arguments, rstrip() removes all trailing whitespace, and newline characters count as whitespace even though the manual does not list them explicitly.
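To see the effect in isolation, here is a small sketch that reproduces the failure on a string with a trailing newline and then parses the stripped version. The variable names are illustrative only.

```python
from datetime import datetime

DATE_FORMAT = '%Y-%m-%d %H:%M:%S'
raw = '2018-01-18 11:01:54\n'  # what fin.read() returns for a one-line file

try:
    datetime.strptime(raw, DATE_FORMAT)
    failed = False
except ValueError:
    # "unconverted data remains": the '\n' is not covered by the format
    failed = True

# Stripping trailing whitespace first lets the same format match.
parsed = datetime.strptime(raw.rstrip(), DATE_FORMAT)
print(failed, parsed)
```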

Related

Python import/insert a CSV (without headers) in an Oracle DB using cx_Oracle

Can anyone suggest a way to import a CSV file into an Oracle DB using cx_Oracle? The code below works, but I have to manually delete the CSV header row (row 1) before I run the Python script. Is there a way to change the code to ignore line 1 of the CSV file?
import cx_Oracle
import csv
connection = cx_Oracle.connect(USER,PASSWORD,'adhoc_serv')#DADs
cursor = connection.cursor()
insert = """
INSERT INTO MUK (CODE, UNIT_NAME, GROUP_CODE, GROUP_NAME)
VALUES(:1, :2, :3, :4)"""
# Initialize list that will serve as a container for bind values
L = []
reader = csv.reader(open(r'C:\Projects\MUK\MUK_Latest_PY.csv'),delimiter=',')
for row in reader:
    L.append(tuple(row))
# prepare insert statement
cursor.prepare(insert)
print insert
# execute insert with executemany
cursor.executemany(None, L)
# report number of inserted rows
print 'Inserted: ' + str(cursor.rowcount) + ' rows.'
# commit
connection.commit()
# close cursor and connection
cursor.close()
connection.close()
If you want to simply ignore line 1 of the CSV file, that is easily accomplished by performing this immediately after the reader has been created:
next(reader)
This will simply get the first row from the CSV file and discard it.
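As a sketch of where that call sits, the snippet below runs the same reader loop on an in-memory CSV instead of the real file. The sample rows are invented, and the syntax assumes Python 3 (io.StringIO with plain string literals); passing None as the second argument to next() avoids a StopIteration on an empty file.

```python
import csv
import io

# Hypothetical CSV content standing in for MUK_Latest_PY.csv.
sample = io.StringIO(
    "CODE,UNIT_NAME,GROUP_CODE,GROUP_NAME\n"
    "A1,Unit one,G1,Group one\n"
    "B2,Unit two,G2,Group two\n"
)

reader = csv.reader(sample, delimiter=',')
header = next(reader, None)  # consume the header row; None if the file is empty
rows = [tuple(row) for row in reader]  # only data rows remain for executemany
print(header)
print(rows)
```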

ValueError Converting UTC time to a desired format and time zone python

I am trying to convert a UTC time to a normal format and time zone. The docs are making me throw toys! Can someone please write me a quick, simple example? My code in Python:
m.startAt = datetime.strptime(r['StartAt'], '%d/%m/%Y %H:%M')
Error
ValueError: time data '2016-10-28T12:42:59.389Z' does not match format '%d/%m/%Y %H:%M:'
For datetime.strptime to work you need to specify a formatting string appropriately matching the string you're parsing from. The error indicates you don't - so parsing fails. See strftime() and strptime() Behavior for the formatting arguments.
The string you get is indicated in the error message: '2016-10-28T12:42:59.389Z' (a Z/Zulu/ISO 8601 datetime string).
The matching string for that would be '%Y-%m-%dT%H:%M:%S.%f%z' or, after dropping the final Z from the string, '%Y-%m-%dT%H:%M:%S.%f'.
The final Z in the string is a bit tricky: it can be parsed by %z, but that directive may not be supported in the GAE-supported Python version (in my 2.7.12 it's not supported):
>>> datetime.strptime('2016-10-28T12:42:59.389', '%Y-%m-%dT%H:%M:%S.%f%z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/_strptime.py", line 324, in _strptime
    (bad_directive, format))
ValueError: 'z' is a bad directive in format '%Y-%m-%dT%H:%M:%S.%f%z'
So I stripped Z and used the other format:
>>> stripped_z = '2016-10-28T12:42:59.389Z'[:-1]
>>> stripped_z
'2016-10-28T12:42:59.389'
>>> that_datetime = datetime.strptime(stripped_z, '%Y-%m-%dT%H:%M:%S.%f')
>>> that_datetime
datetime.datetime(2016, 10, 28, 12, 42, 59, 389000)
To obtain a string use strftime:
>>> that_datetime.strftime('%d/%m/%Y %H:%M')
'28/10/2016 12:42'
It'll be more complicated if you want to use a timezone, but my recommendation is to stick with UTC on the backend storage and leave timezone conversion for the frontend/client side.
You might want to use a DateTimeProperty to store the value, in which case you can assign it directly:
entity.datetime_property = that_datetime
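If the conversion to another zone does have to happen in Python, a minimal sketch could look like the following. It assumes Python 3 (datetime.timezone is not available in 2.7) and uses a made-up fixed UTC+2 offset as the target zone, so it ignores daylight-saving rules.

```python
from datetime import datetime, timedelta, timezone

raw = '2016-10-28T12:42:59.389Z'

# Strip the trailing 'Z' and mark the result as UTC explicitly.
utc_dt = datetime.strptime(raw[:-1], '%Y-%m-%dT%H:%M:%S.%f').replace(tzinfo=timezone.utc)

# Hypothetical target zone: a fixed UTC+2 offset (no daylight-saving rules).
target_zone = timezone(timedelta(hours=2))
local_dt = utc_dt.astimezone(target_zone)

formatted = local_dt.strftime('%d/%m/%Y %H:%M')
print(formatted)
```

A named zone with daylight-saving rules would need a time zone database rather than a fixed offset.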
That error is telling you the problem: the format string has to match the datetime string you gave it.
For example:
x = datetime.strptime("2016-6-9 08:57", "%Y-%m-%d %H:%M")
Notice the second string matches the format of the first one.
Your time string looks like this:
2016-10-28T12:42:59.389Z
Which does not match your format string.

os.walk set start and end point - python

I'm trying to find out how to stop an os.walk after it has walked through a particular file.
I have a directory of log files organized by date. I'm trying to replace grep searches by allowing a user to find IP addresses stored in a date range they specify.
The program will take the following arguments:
-i ipv4 or ipv6 address with subnet
-s start date, e.g. 2013/12/20 (matches the file structure)
-e end date
I'm assuming that, because of the topdown option, there is some logic that should allow me to declare an endpoint. What is the best way to do this? I'm thinking of a while loop.
I apologize in advance if something is off with my question. Just checked blood sugar, it's low 56, gd type one.
Additional information
The file structure will be situated in flows/index_border as such
2013
--01
--02
----01
----...
----29
2014
Hope this is clear: the year folder contains month folders, which contain day folders, which contain hourly files. Dates increase downwards.
The end date will need to be inclusive (I didn't focus too much on it because I can just add code to move one day up).
I have been trying to make a date range function. I was surprised I didn't see this in any datetime docs; it seems like it would be useful.
import os, gzip, netaddr, datetime, argparse
startDir = '.'
def sdate_format(s):
    try:
        return (datetime.datetime.strptime(s, '%Y/%m/%d').date())
    except ValueError:
        msg = "Bad start date. Please use yyyy/mm/dd format."
        raise argparse.ArgumentTypeError(msg)
def edate_format(e):
    try:
        return (datetime.datetime.strptime(e, '%Y/%m/%d').date())
    except ValueError:
        msg = "Bad end date. Please use yyyy/mm/dd format."
        raise argparse.ArgumentTypeError(msg)
parser = argparse.ArgumentParser(description='Locate IP address in log files for a particular date or date range')
parser.add_argument('-s', '--start_date', action='store', type=sdate_format, dest='start_date', help='The first date in range of interest.')
parser.add_argument('-e', '--end_date', action='store', type=edate_format, dest='end_date', help='The last date in range of interest.')
parser.add_argument('-i', action='store', dest='net', help='IP address or address range, IPv4 or IPv6 with optional subnet accepted.', required=True)
results = parser.parse_args()
start = results.start_date
end = results.end_date
target_IP = results.net
startDir = '/flows/index_border/{0}/{1:02d}/{2:02d}'.format(start.year, start.month, start.day)
print('searching...')
for root, dirs, files in os.walk(startDir):
    for contents in files:
        if contents.endswith('.gz'):
            f = gzip.open(os.path.join(root, contents), 'r')
        else:
            f = open(os.path.join(root, contents), 'r')
        text = f.readlines()
        f.close()
        for line in text:
            for address_item in netaddr.IPNetwork(target_IP):
                if str(address_item) in line:
                    print line,
You need to describe what works or does not work. The argparse of your code looks fine, though I haven't done any testing. The use of type is refreshingly correct. :) (posters often misuse that parameter.)
But as for the stopping, I'm guessing you could do:
endDir = '/flows/index_border/{0}/{1:02d}/{2:02d}'.format(end.year, end.month, end.day)
for root, dirs, files in os.walk(startDir):
    for contents in files:
        ....
    if endDir in <something based on dirs and files>:
        break
I don't know enough about your file structure to be more specific. It's also been some time since I worked with os.walk. In any case, I think a conditional break is the way to stop the walk early.
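One way the conditional stop could be made concrete is sketched below. It assumes a top/YYYY/MM/DD layout with zero-padded, purely numeric directory names, and the function name walk_date_range is made up for illustration. Because topdown=True lets the caller edit dirs in place, sorting it fixes the visiting order, so the walk can end as soon as a day past the end date appears.

```python
import datetime
import os


def walk_date_range(top, start, end):
    """Yield (directory, filenames) for each day directory in [start, end].

    Assumes a top/YYYY/MM/DD layout whose directory names are all numeric.
    """
    for root, dirs, files in os.walk(top, topdown=True):
        dirs.sort()  # topdown=True lets us fix the visiting order in place
        rel = os.path.relpath(root, top)
        parts = [] if rel == os.curdir else rel.split(os.sep)
        if len(parts) < 3:
            continue  # still at the top, year or month level
        dirs[:] = []  # do not descend below the day level
        day = datetime.date(int(parts[0]), int(parts[1]), int(parts[2]))
        if day > end:
            return  # sorted order means every later directory is also too late
        if day >= start:
            yield root, sorted(files)
```

This sketch walks from the top of the tree rather than from the start day's directory, since a walk rooted at one day directory never reaches its siblings.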
#!/usr/bin/env python
import os, gzip, netaddr, datetime, argparse, sys
searchDir = '.'
searchItems = []
def sdate_format(s):
    try:
        return (datetime.datetime.strptime(s, '%Y/%m/%d').date())
    except ValueError:
        msg = "Bad start date. Please use yyyy/mm/dd format."
        raise argparse.ArgumentTypeError(msg)
def edate_format(e):
    try:
        return (datetime.datetime.strptime(e, '%Y/%m/%d').date())
    except ValueError:
        msg = "Bad end date. Please use yyyy/mm/dd format."
        raise argparse.ArgumentTypeError(msg)
parser = argparse.ArgumentParser(description='Locate IP address in log files for a particular date or date range')
parser.add_argument('-s', '--start_date', action='store', type=sdate_format, dest='start_date',
                    help='The first date in range of interest.', required=True)
parser.add_argument('-e', '--end_date', action='store', type=edate_format, dest='end_date',
                    help='The last date in range of interest.', required=True)
parser.add_argument('-i', action='store', dest='net',
                    help='IP address or address range, IPv4 or IPv6 with optional subnet accepted.', required=True)
results = parser.parse_args()
start = results.start_date
end = results.end_date + datetime.timedelta(days=1)
target_IP = results.net
dateRange = end - start
for addressOfInterest in netaddr.IPNetwork(target_IP):
    searchItems.append(str(addressOfInterest))
print('searching...')
for eachDay in range(dateRange.days):
    period = start + datetime.timedelta(days=eachDay)
    searchDir = '/flows/index_border/{0}/{1:02d}/{2:02d}'.format(period.year, period.month, period.day)
    for contents in os.listdir(searchDir):
        if contents.endswith('.gz'):
            f = gzip.open(os.path.join(searchDir, contents), 'rb')
            text = f.readlines()
            f.close()
        else:
            f = open(os.path.join(searchDir, contents), 'r')
            text = f.readlines()
            f.close()
        #for line in text:
        #    break
        for addressOfInterest in searchItems:
            for line in text:
                if addressOfInterest in line:
                    # if str(address_item) in line:
                    print contents
                    print line,
I was banging my head because I thought I was printing a duplicate. It turns out the file I was given to test with contains duplicates. I ended up removing os.walk because the file system layout is predictable, but @hpaulj did provide a correct solution. Much appreciated!
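The day-by-day loop in the script above can also be factored into a small date range generator, which is the piece the question found missing from the datetime module. This is only a sketch: date_range and day_directories are hypothetical names, and the end date is treated as inclusive so no extra day has to be added by the caller.

```python
import datetime


def date_range(start, end):
    """Yield every date from start to end, both inclusive."""
    for offset in range((end - start).days + 1):
        yield start + datetime.timedelta(days=offset)


def day_directories(base, start, end):
    """Build one base/YYYY/MM/DD path per day in the range."""
    return ['{0}/{1}/{2:02d}/{3:02d}'.format(base, d.year, d.month, d.day)
            for d in date_range(start, end)]
```

Each returned path can then be passed to os.listdir, as the script already does with searchDir.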

how to read and overwrite part of batch file in Python

I have a batch file that looks something like this:
.....
set ARGS=%ARGS% /startDate:2015-07-15T15:20:00.000
set ARGS=%ARGS% /endDate:2015-07-15T17:30:00.000
set ARGS=%ARGS% /IDs:250
set ARGS=%ARGS% /values:10000,20000
.....
Now I want to read it and overwrite it with new dates (one day after the current start and end dates). My code below works fine if I write to a new file, but it doesn't work if I try to overwrite the original. Any idea how to fix it?
from datetime import datetime, timedelta

WANTED = 19  # length of a 'YYYY-MM-DDTHH:MM:SS' timestamp
with open('myfile.bat') as searchfile, open('mynewfile.bat', 'w') as outfile:
    for line in searchfile:
        left, sep, right = line.partition('startDate:')
        if sep:  # True iff 'startDate:' is in line
            startdatestr = right[:WANTED]
            startdate = datetime.strptime(startdatestr, "%Y-%m-%dT%H:%M:%S")
            newstartdate = startdate + timedelta(days=1)
            newstartdatestr = newstartdate.strftime("%Y-%m-%dT%H:%M:%S")
            line = line.replace(startdatestr, newstartdatestr)
        left, sep, right = line.partition('endDate:')
        if sep:  # True iff 'endDate:' is in line
            enddatestr = right[:WANTED]
            enddate = datetime.strptime(enddatestr, "%Y-%m-%dT%H:%M:%S")
            newenddate = enddate + timedelta(days=1)
            newenddatestr = newenddate.strftime("%Y-%m-%dT%H:%M:%S")
            line = line.replace(enddatestr, newenddatestr)
        outfile.write(line)
While you are inside the 'with open()' block, the file is still open for reading, so you cannot safely overwrite it. Store the content and your modifications in variables, then leave the 'with open()' block so that the file handle closes.
After that, open the file for writing and output your data to it.
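A minimal sketch of that read-then-write approach follows. It swaps the partition logic from the question for a regular expression, which is a different technique chosen here for brevity. The names bump_dates_in_place, shift_one_day and DATE_PATTERN are invented for the example.

```python
import re
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"
# Matches "/startDate:" or "/endDate:" followed by a 19-character timestamp.
DATE_PATTERN = re.compile(
    r"(/(?:startDate|endDate):)(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
)


def shift_one_day(match):
    """Return the matched argument with its timestamp moved one day forward."""
    shifted = datetime.strptime(match.group(2), DATE_FORMAT) + timedelta(days=1)
    return match.group(1) + shifted.strftime(DATE_FORMAT)


def bump_dates_in_place(path):
    # First pass: read everything, then let the with block close the file.
    with open(path) as infile:
        lines = infile.readlines()
    new_lines = [DATE_PATTERN.sub(shift_one_day, line) for line in lines]
    # Second pass: reopen the same path for writing and replace its content.
    with open(path, "w") as outfile:
        outfile.writelines(new_lines)
```

Anything after the seconds, such as the trailing .000, is left untouched because the pattern only covers the first 19 characters of the timestamp.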

Is it possible to do a time and date plot by reading the data from a temporary file?

I have Twitter data in a text file in the following format: "RT Find out how AZ is targeting escape pathways to further personalise breastcancer treatment SABCS14 Thu Dec 11 03:09:12 +0000 2014". From this I need to find tweets with the same text and get their times and dates. After that I have to do a time and date plot. The following is the code I have tried.
import matplotlib
import os
import tempfile
temp = tempfile.TemporaryFile(mode = 'w+t')
count = 0
f = open('bcd.txt','r')
lines = f.read().splitlines()
for i in range(len(lines)):
    line = lines[i]
    try:
        next_line = lines[i+1]
    except IndexError:
        continue
    if line == next_line:
        count +=1
        temp.writelines([line+'\n'])
temp.seek(0)
for line in temp:
    line.split('\t')
First I compared each line with the next line, then I wrote the matches into a temporary file, and then I tried to go through that temporary file and extract the time and date of particular tweets, but I was unsuccessful. Any kind of help would be appreciated. Thanks in advance.
link to data file
https://drive.google.com/file/d/0BxE3rA3-6F8eVGVwOFlVRjNFTUE/view?usp=sharing
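As a rough sketch of the extraction step, the code below groups tweet times by tweet text without a temporary file. It assumes every line ends in a timestamp shaped like "Thu Dec 11 03:09:12 +0000 2014", as in the sample above. The actual data file may differ, and the function names are made up for illustration. Month names are parsed with %b, which assumes an English locale.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line shape: "<tweet text> Thu Dec 11 03:09:12 +0000 2014"
LINE_PATTERN = re.compile(
    r'^(?P<text>.*?)\s+\w{3} (?P<stamp>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}) '
    r'\+0000 (?P<year>\d{4})$'
)


def group_tweet_times(lines):
    """Map each tweet text to the sorted list of datetimes it was posted at."""
    times_by_text = defaultdict(list)
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not end in a timestamp
        stamp = '{0} {1}'.format(match.group('stamp'), match.group('year'))
        posted = datetime.strptime(stamp, '%b %d %H:%M:%S %Y')
        times_by_text[match.group('text')].append(posted)
    return {text: sorted(times) for text, times in times_by_text.items()}


def repeated_tweets(lines):
    """Keep only the texts that occur more than once."""
    return {t: ts for t, ts in group_tweet_times(lines).items() if len(ts) > 1}
```

The lists of datetime objects returned by repeated_tweets can be used directly as x values in a matplotlib plot, since matplotlib accepts datetime objects on an axis.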