Get date from Arabic string - regex

I have an Arabic string that shows a detailed date. How can I extract dd mm yyyy from it with a regex?
Example:
الأحد 21 مايو 2017 01:20 م
I use this regex, but it doesn't work:
^\d{2} [\u0600-\u06FF] \d{4}
What can I do?

This regex
(\d{4}\ \d{2}:\d{2})
matches 2017 01:20.
You can add capture groups to pull out the individual parts and reorder them if you want to use the result afterwards.

Regex
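Here is the pattern. Because Arabic is displayed right to left, the pattern follows the stored (logical) order of the text, which is the reverse of how it looks on screen when read left to right.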
\d\d\d\d\s\d\d:\d\d
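Neither pattern above returns the day and month the question asks for, so here is a minimal Python sketch of one way to get dd mm yyyy. The original attempt fails for two reasons: ^ anchors the match to the start of the string, which begins with the day name rather than two digits, and [\u0600-\u06FF] without a + matches only a single Arabic character. The ARABIC_MONTHS table and the extract_date helper are hypothetical names introduced for illustration, and the month spellings assume the Gregorian names used in the example (other regions use different names).

```python
import re

# Hypothetical lookup table: Gregorian month names as spelled in the example's
# dialect. Other Arabic-speaking regions use different month names.
ARABIC_MONTHS = {
    "يناير": 1, "فبراير": 2, "مارس": 3, "أبريل": 4,
    "مايو": 5, "يونيو": 6, "يوليو": 7, "أغسطس": 8,
    "سبتمبر": 9, "أكتوبر": 10, "نوفمبر": 11, "ديسمبر": 12,
}

# No ^ anchor (the string starts with the day name), and + after the
# character class so the whole month word is captured.
DATE_RE = re.compile(r"(\d{1,2})\s+([\u0600-\u06FF]+)\s+(\d{4})")


def extract_date(text):
    """Return the date as 'dd mm yyyy', or None if no known date is found."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    day, month_name, year = m.groups()
    month = ARABIC_MONTHS.get(month_name)
    if month is None:
        return None
    return f"{int(day):02d} {month:02d} {year}"


print(extract_date("الأحد 21 مايو 2017 01:20 م"))  # 21 05 2017
```

The string is matched in its stored order (day name, day, month, year, time), so the pattern is written in that order even though the text is rendered right to left.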

Related

How to extract time out of text with regex in Google Sheets

I need to get the time values out of this string:
SomeText 02/02/2020 9:00 AM-02/02/2020 9:15 AM;"Text" 02/02/2020 10:45 AM-02/02/2020 11:15 AM;"Text" 02/02/2020 12:45 PM-02/02/2020 1:00 PM;
The pattern and length are not consistent. But time always comes after the date.
Any suggestions?
See if this works
=regexextract(J7, rept("\s(\d+:\d+\s[AP]M).+", len(J7)-len(substitute(J7, ":",))))
Note: to convert the returned values to numbers, try
=ArrayFormula(regexextract(J7, rept("\s(\d+:\d+\s[AP]M).+", len(J7)-len(substitute(J7, ":",))))+0)
and format the output as desired.
You could use
\d+:\d+ [AP]M
See a demo on regex101.com.
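For anyone who wants to check the shorter pattern outside Google Sheets, here is a small Python sketch. It only shows that \d+:\d+ [AP]M picks up every time in the sample string; extract_times is a hypothetical helper name, not part of any answer above.

```python
import re

TIME_RE = re.compile(r"\d+:\d+ [AP]M")


def extract_times(text):
    """Return every 'h:mm AM/PM' substring in the order it appears."""
    return TIME_RE.findall(text)


sample = (
    'SomeText 02/02/2020 9:00 AM-02/02/2020 9:15 AM;"Text" '
    '02/02/2020 10:45 AM-02/02/2020 11:15 AM;"Text" '
    '02/02/2020 12:45 PM-02/02/2020 1:00 PM;'
)

print(extract_times(sample))
# ['9:00 AM', '9:15 AM', '10:45 AM', '11:15 AM', '12:45 PM', '1:00 PM']
```

The dates contain slashes rather than colons, so they are never picked up by this pattern.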

Regex for date format dd Mmm yyyy from email header

I have the following regex that I have been working on:
^(\d\d)\s(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s(\d{4})?$
I am trying to grab the date from an email header that is formatted like so:
"Mon, 18 Nov 2019 09:19:17 -0700 (MST)"
and I want the result to be:
18 Nov 2019
It seems that the \s for whitespace could be the culprit, but I have yet to find another forum result that grabs dates with whitespace instead of "-" or "/".
Does anyone have any suggestions for getting this working to extract as described above? Thanks in advance.
The problem is that you have added the "^" and "$" anchors at the start and end of the regex.
"^": matches only at the beginning of the string.
"$": matches only at the end of the string.
Since the text does not start with two digits (\d\d) and does not end with the four-digit year (\d{4}), this regex returns no match.
You can simply remove those two symbols, or use the following code:
/(\d{2}\s(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{4})/.exec("Mon, 18 Nov 2019 09:19:17 -0700 (MST)")[1]
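The same idea as a Python sketch, in case it helps to see the anchored and unanchored versions side by side. extract_header_date is a hypothetical helper name; the pattern is the one from the answer above with the anchors dropped.

```python
import re

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Unanchored: may match anywhere inside the header.
DATE_RE = re.compile(rf"\d{{2}}\s(?:{MONTHS})\s\d{{4}}")
# Anchored like the original attempt: the whole string must be the date.
ANCHORED_RE = re.compile(rf"^\d{{2}}\s(?:{MONTHS})\s\d{{4}}$")


def extract_header_date(header):
    """Return the first 'dd Mmm yyyy' substring, or None if there is none."""
    m = DATE_RE.search(header)
    return m.group(0) if m else None


header = "Mon, 18 Nov 2019 09:19:17 -0700 (MST)"
print(extract_header_date(header))   # 18 Nov 2019
print(ANCHORED_RE.search(header))    # None
```

The \s was never the culprit: it matches the single spaces in the header fine once the anchors are gone.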

Remove everything after a certain word in OpenRefine

I'd like to remove everything after a certain word ("am") in a cell with OpenRefine.
My data:
Workshop im Rahmen des Weiterbildungsprogramms am 02. November 2015
Brainstorming am 09. November 2015 in Bremen
Workshop "Auswählen und bewerten" am 17. November 2015 in Hamburg
Example for Regex: [\n\r].*am\s*([^\n\r]*)
See it in action here: http://rubular.com/r/bBlXOMoos1
That works. I'd like to have the following result.
Workshop im Rahmen des Weiterbildungsprogramms
Brainstorming
Workshop "Auswählen und bewerten"
I tried: value.replace(/[\n\r].*am\s*([^\n\r]*)/, '')
The problem is not so much the regex, I could remove the "am" in a 2nd step, if necessary. But I can't get the regex to work in combination with value.replace.
Could you try this with Python/Jython?
import re
# \s+ and \b keep "am" from matching inside words such as "Weiterbildungsprogramms"
return re.sub(r"\s+am\b.*", "", value)
I think Python's regular expressions are often more consistent than GREL's. But if you want to use GREL, doesn't this work?
value.replace(/\s+am.+/, '')
I feel you are mixing the syntax of value.match() (which requires you to match the whole string in a cell, then select the substring you want) and value.replace() (where you can only match the substring you need).
The issue is pretty simple, actually: you're missing the . before your * to remove all the trailing text. Right now your regex says that zero or more whitespace characters follow the am, but you want it to strip everything after it. This works:
value.replace(/\sam.*/,'')
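To check the result against the three sample cells outside OpenRefine, here is a plain Python sketch. It assumes each cell holds a single line, which would also explain why the original pattern never matched: it starts with [\n\r], so it requires a line break before the text. strip_after_am is a hypothetical helper name, and the \b is an addition of mine so that a word merely starting with "am" is left alone.

```python
import re

# Whitespace, then the standalone word "am", then everything after it.
AM_TAIL_RE = re.compile(r"\s+am\b.*")


def strip_after_am(value):
    """Drop the word 'am' and everything that follows it."""
    return AM_TAIL_RE.sub("", value)


cells = [
    "Workshop im Rahmen des Weiterbildungsprogramms am 02. November 2015",
    "Brainstorming am 09. November 2015 in Bremen",
    'Workshop "Auswählen und bewerten" am 17. November 2015 in Hamburg',
]

for cell in cells:
    print(strip_after_am(cell))
# Workshop im Rahmen des Weiterbildungsprogramms
# Brainstorming
# Workshop "Auswählen und bewerten"
```

Requiring whitespace before "am" matters here: a bare am.+ would already match inside "Weiterbildungsprogramms" and cut the first cell short.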

Validate Month Year Format

I need to validate a text box in the format MMM YYYY (e.g. FEB 2014).
I am using the following regular expression string
^(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\-\d{4}$
The only issue is that my input uses a space rather than '-', i.e. JUN 2012, not JUN-2012.
Can someone please amend the above regex to allow for a space?
Thanks
Try the regex below to match month and year in MMM YYYY format:
^(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC) \d{4}$
Use \s instead of \- in your regex, like this:
^(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s\d{4}$
@Avinash is right: \s also matches \r, \n, \t and \f, so it is better to use a literal space " " instead.
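A quick Python sketch of the difference between the two variants, for anyone who wants to see it. is_valid_strict and is_valid_loose are hypothetical helper names; re.fullmatch stands in for the ^...$ anchors.

```python
import re

MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"

STRICT_RE = re.compile(rf"(?:{MONTHS}) \d{{4}}")   # literal space only
LOOSE_RE = re.compile(rf"(?:{MONTHS})\s\d{{4}}")   # any whitespace character


def is_valid_strict(text):
    """True only for 'MMM YYYY' separated by exactly one space."""
    return STRICT_RE.fullmatch(text) is not None


def is_valid_loose(text):
    """True for 'MMM YYYY' separated by any single whitespace character."""
    return LOOSE_RE.fullmatch(text) is not None


print(is_valid_strict("JUN 2012"))    # True
print(is_valid_strict("JUN-2012"))    # False
print(is_valid_strict("JUN\t2012"))   # False
print(is_valid_loose("JUN\t2012"))    # True
```

Both variants are case-sensitive, so "Jun 2012" is rejected unless the input is upper-cased first.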

Parsing dates from string - regex

I'm terrible with regex and I can't seem to wrap my head around this simple task.
I need to parse out the two dates in a string which always has one of two formats:
"Inquiry at your property for December 29, 2013 - January 03, 2014"
OR
"Inquiry at your property for 29 December , 2013 - 03 January, 2014"
The two different date formats are throwing me off. Any insights would be appreciated!
/(\d+ \w+, \d+|\w+ \d+, \d+)/ for example. Try it out on Rubular.
Of course, it would also pick up other things, like 2013 NotReallyAMonth, 12345. But if your input doesn't contain anything that looks like a date without actually being one, this might work.
You could make the regexp stricter by applying more restrictions to what is matched:
/(\d{2} (?:January|December), \d{4}|(?:January|December) \d{2}, \d{4})/
In this case the day is always two digits and the year is always four. Months are listed explicitly (you would have to list all of them).
Update: For ranges it would be a different regexp:
/((?:Jan|Dec) \d+ - \d+, \d{4})/
Obviously they can all be combined together:
/(\d{2} (?:January|December), \d{4}|(?:January|December) \d{2}, \d{4}|(?:Jan|Dec) \d+ - \d+, \d{4})/
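Here is a Python sketch of the first, looser pattern applied to both sample strings. One adjustment is mine rather than the answer's: the second sample contains "29 December , 2013" with a space before the comma, which the pattern as written above would not match, so this version allows an optional space there. extract_dates is a hypothetical helper name.

```python
import re

# Either "29 December, 2013" (optional space before the comma)
# or "December 29, 2013".
DATE_RE = re.compile(
    r"\d{1,2} [A-Za-z]+ ?, \d{4}"
    r"|[A-Za-z]+ \d{1,2}, \d{4}"
)


def extract_dates(text):
    """Return all date-like substrings in the order they appear."""
    return DATE_RE.findall(text)


first = "Inquiry at your property for December 29, 2013 - January 03, 2014"
second = "Inquiry at your property for 29 December , 2013 - 03 January, 2014"

print(extract_dates(first))   # ['December 29, 2013', 'January 03, 2014']
print(extract_dates(second))  # ['29 December , 2013', '03 January, 2014']
```

Like the original, this does not check that the word is a real month name; swapping [A-Za-z]+ for an explicit list of months would tighten it.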