RegEx for matching the month, day and year - regex

I'm trying to find a regular expression to extract the month, day and year from a datetime stamp in this format:
01/20/2019 12:34:54
It should return a list:
['01', '20', '2019']
I know this can be solved using:
dt.split(' ')[0].split('/')
But, I'm trying to find a regex to do it:
[^\/\s]+
But, I need it to exclude everything after the space.

Since you expect the month, day and year to be returned as a list, you can use this Python code:
import re
s = '01/20/2019 12:34:54'
print(re.findall(r'\d+(?=[ /])', s))
Prints,
['01', '20', '2019']
Alternatively, a more explicit regex is:
(\d{2})/(\d{2})/(\d{4})
The month, day and year are then available as group 1, group 2 and group 3.
The Python code for this approach would be:
import re
s = '01/20/2019 12:34:54'
m = re.search(r'(\d{2})/(\d{2})/(\d{4})', s)
if m:
    print([m.group(1), m.group(2), m.group(3)])
Prints,
['01', '20', '2019']
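If the stamp is known to start with the date, the grouped pattern can also be anchored with re.match and the three groups turned into a list in one step. This is only a minimal sketch; the function name extract_mdy is illustrative and does not come from the answer above.

```python
import re

def extract_mdy(stamp):
    """Return [month, day, year] as strings, or None if the stamp
    does not start with MM/DD/YYYY followed by whitespace."""
    m = re.match(r'(\d{2})/(\d{2})/(\d{4})\s', stamp)
    return list(m.groups()) if m else None

print(extract_mdy('01/20/2019 12:34:54'))  # ['01', '20', '2019']
print(extract_mdy('2019-01-20 12:34:54'))  # None
```

Returning None for a non-matching stamp keeps the caller from having to check the match object itself.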

You should absolutely be taking advantage of Python's date/time API here. Use strptime to parse your input datetime string to a bona fide Python datetime. Then, just build a list, accessing the various components you need.
from datetime import datetime
dt = "01/20/2019 12:34:54"
dto = datetime.strptime(dt, '%m/%d/%Y %H:%M:%S')
parts = [dto.month, dto.day, dto.year]  # named parts to avoid shadowing the built-in list
print(parts)
[1, 20, 2019]
If you really want/need to work with the original datetime string, then split provides an option, without even formally using a regex:
dt = "01/20/2019 12:34:54"
dt = dt.split()[0].split('/')
print(dt)
['01', '20', '2019']
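The strptime version above returns integers, while the question asks for zero-padded strings. One possible way to get both validation and the string form is to parse first and then format the date back out with strftime. This is a sketch, and the variable name parts is just an illustration.

```python
from datetime import datetime

dt = "01/20/2019 12:34:54"
# strptime raises ValueError if the stamp is not a valid date/time
dto = datetime.strptime(dt, '%m/%d/%Y %H:%M:%S')
# Format only the date part again, then split it into zero-padded strings
parts = dto.strftime('%m/%d/%Y').split('/')
print(parts)  # ['01', '20', '2019']
```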

This RegEx might help you to do so.
([0-9]+)\/([0-9]+)\/([0-9]+)\s[0-9]+:[0-9]+:[0-9]+
Code:
import re
string = '01/20/2019 12:34:54'
matches = re.search(r'([0-9]+)/([0-9]+)/([0-9]+)', string)
if matches:
    print([matches.group(1), matches.group(2), matches.group(3)])
else:
    print('Sorry! No matches! Something is not right! Call 911')
Output
['01', '20', '2019']

Related

Age checker Python3 18+

I need a Python 3 program that checks whether the user is 18 or older.
Input: the date of birth, in one of four formats (25/12/2000, 25-12-2000, 25.12.2000, 25_12_2000).
If the format is wrong, print "wrong format".
Output: "welcome to system" or "sorry comeback when you will be 18+"
In case it helps, here is my attempt so far:
from datetime import datetime, date
def try_parsing_date(text):
    for fmt in ('%d/%m/%Y', '%d.%m.%Y', '%d-%m-%Y', '%d_%m_%Y'):
        try:
            return datetime.strftime(text,fmt)
        except ValueError:
            pass
    raise ValueError('no valid date format')
dob = input('Enter your date of birth (dd/mm/yyyy): ')
try_parsing_date(dob)
Maybe I should handle it with regular expressions?
```re_age_checker= "^(0[1-9]|[12][0-9]|3[01])[- \/.,_](0[1-9]|1[012])[- \/.,_](19|20)\d\d"```
Your attempt was a good start, but you mixed up strftime with strptime; change that.
"Maybe I should handle it with regular expressions?"
That is not advisable. After the above change, the result of your function try_parsing_date can be used to compute the 18th birthday, which is then simply compared with today's date:
import time
dt = try_parsing_date(dob)
# compute the 18th birthday:
d = date.fromtimestamp(time.mktime((dt.year+18, dt.month, dt.day, *(0,)*6)))
if d <= date.today():
    print("welcome to system")
else:
    print("sorry comeback when you will be 18+")
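For reference, the whole check can also be sketched without time.mktime by using date.replace to compute the 18th birthday. The names parse_dob and is_adult are hypothetical, and treating a 29 February birthday as 1 March in the target year is an assumption of this sketch, not something the question specifies.

```python
from datetime import date, datetime

FORMATS = ('%d/%m/%Y', '%d.%m.%Y', '%d-%m-%Y', '%d_%m_%Y')

def parse_dob(text):
    """Try each accepted format in turn and return a date."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError('wrong format')

def is_adult(dob, today):
    """Return True if the 18th birthday is on or before today."""
    try:
        eighteenth = dob.replace(year=dob.year + 18)
    except ValueError:
        # Born on 29 February: the target year has no such day, so use 1 March
        eighteenth = date(dob.year + 18, 3, 1)
    return eighteenth <= today

dob = parse_dob('25-12-2000')
# A fixed "today" keeps the example reproducible; use date.today() in practice
print(is_adult(dob, date(2018, 12, 24)))  # False
print(is_adult(dob, date(2018, 12, 25)))  # True
```

Passing today in as a parameter makes the function easy to test against known dates.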

python Find the most reported month

I am trying to find the month that is mentioned most often (here October, mentioned 2 times), and my idea was to use a dictionary. However, I struggled to separate out the months, and my solution does not work for the first str value (commented out below), where there are spaces around the commas. Can someone please suggest how to modify the split section so that it handles both "," and whitespace?
import re
#str="May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990"
str="May-29-1990,Oct-18-1980,Sept-1-1980,Oct-2-1990"
val=re.split(',',str)
monthList=[]
myDictionary={}
#put the months in a list
def sep_month():
    for item in val:
        if not item.isdigit():
            month,day,year=item.split("-")
            monthList.append(month)
#process the month list from above
def count_month():
    for item in monthList:
        if item not in myDictionary.keys():
            myDictionary[item]=1
        else:
            myDictionary[item]=myDictionary.get(item)+1
    for k,v in myDictionary.items():
        if v==2:
            print(k)
sep_month()
count_month()
from datetime import datetime
import calendar
from collections import Counter
datesString = "May-29-1990,Oct-18-1980,Sep-1-1980,Oct-2-1990"
datesListString = datesString.split(",")
datesList = []
for dateStr in datesListString:
    datesList.append(datetime.strptime(dateStr, '%b-%d-%Y'))
monthsOccurrencies = Counter((calendar.month_name[date.month] for date in datesList))
print(monthsOccurrencies)
# Counter({'October': 2, 'May': 1, 'September': 1})
One thing to be aware of in my solution, which uses %b for the month, is that Sept had to be changed to Sep for it to work (%b is the month as the locale's abbreviated name). You can use either full month names (%B) or abbreviated names (%b). If you cannot get the input string with correctly formatted month names, just replace the wrong ones (for example "Sept" with "Sep") and always work with date objects.
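The "replace the wrong ones" step might look like the sketch below, which also strips the stray spaces from the question's first input. It assumes an English (or C) locale for %b and %B, and the helper name parse_date is only illustrative.

```python
from collections import Counter
from datetime import datetime

raw = "May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990"

def parse_date(text):
    # Strip surrounding whitespace and normalise the non-standard "Sept"
    cleaned = text.strip().replace("Sept-", "Sep-")
    return datetime.strptime(cleaned, "%b-%d-%Y")

dates = [parse_date(item) for item in raw.split(",")]
# Count full month names (%B), assuming an English locale
counts = Counter(d.strftime("%B") for d in dates)
print(counts["October"])  # 2
```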
I'm not sure regex is the best tool for this job. I would just use strip() along with split() to handle the whitespace issues and get a list of just the month abbreviations. Then you can create a dict with counts by month using the list method count(). For example:
dates = 'May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990'
months = [d.split('-')[0].strip() for d in dates.split(',')]
month_counts = {m: months.count(m) for m in set(months)}
print(month_counts)
# {'May': 1, 'Oct': 2, 'Sept': 1}
Or even better with collections.Counter:
from collections import Counter
dates = 'May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990'
months = [d.split('-')[0].strip() for d in dates.split(',')]
month_counts = Counter(months)
print(month_counts)
# Counter({'Oct': 2, 'May': 1, 'Sept': 1})
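To go from the counts to the most reported month itself, one option is to take the highest count and collect every month that reaches it, which also covers ties. This is a sketch building on the Counter approach; the names top_count and most_reported are illustrative.

```python
from collections import Counter

dates = 'May-29-1990, Oct-18-1980 ,Sept-1-1980, Oct-2-1990'
month_counts = Counter(d.split('-')[0].strip() for d in dates.split(','))

# Highest number of mentions, then every month that reaches it
top_count = max(month_counts.values())
most_reported = [m for m, c in month_counts.items() if c == top_count]
print(most_reported, top_count)  # ['Oct'] 2
```

Unlike the `if v==2` check in the question, this does not hard-code the expected count.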

How to extract files with date pattern using python

I have n-files in a folder like
source_dir
abc_2017-07-01.tar
abc_2017-07-02.tar
abc_2017-07-03.tar
pqr_2017-07-02.tar
Let's consider a single pattern for now, 'abc'
(in practice I get this pattern from a database, so I need double filtering: one for the pattern and one for the last day).
I want to extract the file for the last day (yesterday), i.e. '2017-07-02'.
With the code below I get all files for the pattern, but not exactly the last-day file.
Code
import os
pattern = 'abc'
allfiles = os.listdir(source_dir)
m_files = [f for f in allfiles if str(f).startswith(pattern)]
print(m_files)
output:
[ 'abc_2017-07-01.tar' , 'abc_2017-07-02.tar' , 'abc_2017-07-03.tar' ]
This gives me all files matching the abc pattern, but how can I filter out only the last-day file for that pattern?
Expected :
[ 'abc_2017-07-02.tar' ]
Thanks
Just a minor tweak to your code gets you the desired result.
import os
from datetime import datetime, timedelta
allfiles=os.listdir(source_dir)
file_date = datetime.now() + timedelta(days=-1)  # yesterday
pattern = 'abc_' +str(file_date.date())  # e.g. 'abc_2017-07-02'
m_files=[f for f in allfiles if str(f).startswith(pattern)]
Hope this helps!
latest = max(m_files, key=lambda x: x[-14:-4])
will find the filename with latest date among filenames in m_files.
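Since the prefix comes from a database and the date from the calendar, both filters can be folded into one expected file name. The sketch below works on an in-memory list instead of os.listdir so that it runs as-is; the function name files_for_day and the fixed today value are assumptions for illustration.

```python
from datetime import date, timedelta

def files_for_day(names, prefix, day):
    """Return the names that match '<prefix>_<YYYY-MM-DD>.tar' for the given day."""
    expected = f"{prefix}_{day.isoformat()}.tar"
    return [n for n in names if n == expected]

names = ['abc_2017-07-01.tar', 'abc_2017-07-02.tar',
         'abc_2017-07-03.tar', 'pqr_2017-07-02.tar']
today = date(2017, 7, 3)  # fixed for the example; use date.today() in practice
print(files_for_day(names, 'abc', today - timedelta(days=1)))
# ['abc_2017-07-02.tar']
```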
Use Python's re module, for example:
import re
import os
files = os.listdir(source_dir)
for file in files:
    match = re.search(r'abc_2017-07-(\d{2})\.tar', file)
    day = match.group(1)
You can then work with day inside the loop to do whatever you want, such as building this list:
import re
import os
def extract_day(name):
    # use the function argument, not a global variable
    match = re.search(r'abc_2017-07-(\d{2})\.tar', name)
    day = match.group(1)
    return day
files = os.listdir(source_dir)
days = [extract_day(file) for file in files]
If the month is also variable, you can substitute '07' with '\d\d' or '\d{2}'. Be careful if you have files that do not match the pattern at all: match.group() will then raise an error, since match is None. In that case use:
def extract_day(name):
    match = re.search(r'abc_2017-07-(\d{2})\.tar', name)
    try:
        day = match.group(1)
    except AttributeError:  # match is None when the name does not fit the pattern
        day = None
    return day
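A more general variant could capture the prefix and the full date with named groups and return None for names that do not fit. This is a sketch under the assumption that prefixes are purely alphabetic; the names NAME_RE and parse_name are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: <letters>_<YYYY-MM-DD>.tar
NAME_RE = re.compile(r'(?P<prefix>[A-Za-z]+)_(?P<date>\d{4}-\d{2}-\d{2})\.tar')

def parse_name(name):
    """Return (prefix, date) for a well-formed name, otherwise None."""
    m = NAME_RE.fullmatch(name)
    if m is None:
        return None
    try:
        day = datetime.strptime(m.group('date'), '%Y-%m-%d').date()
    except ValueError:  # digits in the right places but not a real date
        return None
    return m.group('prefix'), day

print(parse_name('abc_2017-07-02.tar'))  # ('abc', datetime.date(2017, 7, 2))
print(parse_name('notes.txt'))           # None
```

Parsing the captured text into a real date also rejects names such as abc_2017-13-40.tar.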

Would DateTimeField() work if I have time in this format 1/7/11 9:15 ? If not what would?

I am importing data from a JSON file and it has the date in the following format 1/7/11 9:15
What would be the best variable type/format to define in order to accept this date as it is? If not what would be the most efficient way to accomplish this task?
Thanks.
"What would be the best variable type/format to define in order to accept this date as it is?"
The DateTimeField.
"If not what would be the most efficient way to accomplish this task?"
You should use the datetime.strptime method from Python's standard datetime module:
>>> from datetime import datetime
>>> import json
>>> json_datetime = '"1/7/11 9:15"' # still encoded as JSON (note the inner quotes)
>>> py_datetime = json.loads(json_datetime) # now decoded to a Python string
>>> datetime.strptime(py_datetime, "%m/%d/%y %H:%M") # coerced into a datetime object; %H assumes a 24-hour clock
datetime.datetime(2011, 1, 7, 9, 15)
# Now you can save this object to a DateTimeField in a Django model.
If you take a look at https://docs.djangoproject.com/en/dev/ref/models/fields/#datetimefield, it says that Django uses the Python datetime library, which is documented at http://docs.python.org/2/library/datetime.html.
Here is a working example (with many debug prints and step-by-step instructions):
from datetime import datetime
json_datetime = "1/7/11 9:15"
json_date, json_time = json_datetime.split(" ")
print(json_date)
print(json_time)
day, month, year = map(int, json_date.split("/"))  # maps each string from the split to an int; assumes day/month/year order
year = 2000 + year  # be careful here! 2 digits for a year may cause trouble (could be 1911 as well)
hours, minutes = map(int, json_time.split(":"))
print(day)
print(month)
print(year)
my_datetime = datetime(year, month, day, hours, minutes)
print(my_datetime)
# Generate a JSON-style date:
new_json_style = "{0}/{1}/{2} {3}:{4}".format(my_datetime.day, my_datetime.month, my_datetime.year, my_datetime.hour, my_datetime.minute)
print(new_json_style)
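The manual splitting can usually be replaced by a strptime call plus a small formatter for the way back. The sketch below assumes month/day/two-digit-year order and a 24-hour clock, which the question does not confirm; note that %y maps 00-68 to 2000-2068 and 69-99 to 1969-1999. The names parse_stamp and format_stamp are illustrative.

```python
from datetime import datetime

def parse_stamp(text):
    # Assumes month/day/two-digit year and a 24-hour clock
    return datetime.strptime(text, "%m/%d/%y %H:%M")

def format_stamp(dt):
    # Rebuild the compact style: no zero padding except for year and minutes
    return f"{dt.month}/{dt.day}/{dt.year % 100:02d} {dt.hour}:{dt.minute:02d}"

dt = parse_stamp("1/7/11 9:15")
print(dt)                # 2011-01-07 09:15:00
print(format_stamp(dt))  # 1/7/11 9:15
```

The resulting datetime object is what would be assigned to a DateTimeField.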

How do you remove seconds and milliseconds from a date time string in python

How can I convert a date in the format "2013-03-15 05:14:51.327" to "2013-03-15 05:14", i.e. remove the seconds and milliseconds? I don't think there is a way to do this in Robot Framework. Please let me know if anyone has a solution for this in Python.
Try this (Thanks Blender!)
>>> date = "2013-03-15 05:14:51.327"
>>> newdate = date.rpartition(':')[0]
>>> print(newdate)
2013-03-15 05:14
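If the string should also be validated as a real timestamp, a possible alternative is to parse it with datetime and format it back without the seconds. This is a sketch that assumes the input always has the fractional part; the variable names are illustrative.

```python
from datetime import datetime

stamp = "2013-03-15 05:14:51.327"
# %f accepts one to six fractional digits, so ".327" parses as 327000 microseconds
parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
trimmed = parsed.strftime("%Y-%m-%d %H:%M")
print(trimmed)  # 2013-03-15 05:14
```

Unlike rpartition, this raises ValueError for malformed input instead of silently returning a shortened string.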
In Robot Framework the most straightforward way is to use Split String From Right from the String library (Get From List comes from the Collections library):
${datestring}=    Set Variable    2019-03-15 05:14:51.327
${parts}=    Split String From Right    ${datestring}    :    max_split=1
# parts is a list of two elements - everything before the last ":", and everything after it
# take the 1st element, it is what we're after
${no seconds}=    Get From List    ${parts}    0
Log    ${no seconds}    # 2019-03-15 05:14