Regular Expression for a string of latitude longitude pairs - regex

I am receiving an input like this in my program
lat1,long1;lat2,long2;lat3,long3
These are latitude,longitude pairs separated by semicolons.
Now I want to validate this input so that I don't receive wrong data.
I have created a regular expression:
((^(\-?\d+(\.\d+)?),(\-?\d+(\.\d+)?));?)
The problem is that it only validates a single pair, not a string of pairs separated by ; as I wish.
If you could help me come up with an expression to validate my data I would be grateful.

Your expression checks only one pair at the start of the string, followed by one optional semicolon.
You can try this :
^\-?\d+(\.\d+)?,\-?\d+(\.\d+)?(;\-?\d+(\.\d+)?,\-?\d+(\.\d+)?)*$
That's your regex (without some unnecessary brackets) followed by zero or more repetitions of a semicolon and a pair.
A test with your 4 examples.
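As a quick check, the pattern above can be dropped into Python's re module. This is only a sketch: the name is_valid_pair_list is made up for illustration, and fullmatch is used so that a trailing newline is not silently accepted.

```python
import re

# The pattern from the answer above: one pair, then zero or more ";pair" groups.
PAIR_LIST = re.compile(
    r'^-?\d+(\.\d+)?,-?\d+(\.\d+)?(;-?\d+(\.\d+)?,-?\d+(\.\d+)?)*$'
)

def is_valid_pair_list(text):
    """Return True if text is a well-formed lat,long;lat,long;... string."""
    return PAIR_LIST.fullmatch(text) is not None

for sample in ['-10,10;10.5,-10.337', '1,2', '1,2;', '1,2;3', '']:
    print(repr(sample), is_valid_pair_list(sample))
```

Note that this only checks the shape of the input. It does not check that a latitude lies between -90 and 90 or a longitude between -180 and 180; that would need a separate numeric check after parsing.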

How about something like this?
(-?[\d\.]+,-?[\d\.]+)+;?+
I tried it on Rubular.com (permalink), with the following test string:
-10,10;10.5,-10.337
Result?
Match 1
1. -10,10
Match 2
1. 10.5,-10.337

I would use the following pattern, which can also accept values like .2 along with 0.2.
^-?(\d+(\.\d+)?|\.\d+),-?(\d+(\.\d+)?|\.\d+)(;-?(\d+(\.\d+)?|\.\d+),-?(\d+(\.\d+)?|\.\d+)){2}$
Here's the breakdown.
^ # beginning of string
-? # optional minus sign
( # either
\d+(\.\d+)? # an integer with an optional decimal part
| # or
\.\d+ # a decimal number
)
, # a comma
-? # optional minus sign
( # either
\d+(\.\d+)? # an integer and an optional decimal part
| # or
\.\d+ # a decimal number
)
(
; # a semicolon
-? # optional minus sign
( # either
\d+(\.\d+)? # an integer and an optional decimal part
| # or
\.\d+ # a decimal number
)
, # a comma
-? # optional minus sign
( # either
\d+(\.\d+)? # an integer and an optional decimal part
| # or
\.\d+ # a decimal number
)
){2} # with 2 repetitions (of the pattern)
$ # end of string
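Because the number sub-pattern is repeated four times, it can be easier to build the expression from parts. The sketch below does that in Python; build_pair_pattern and its extra_pairs parameter are hypothetical names. Note that {2} in the pattern above requires exactly three pairs in total, so the sketch also offers * for "any number of pairs", which is an assumption about what the question wants.

```python
import re

# One number: optional minus, then either digits with an optional
# fractional part, or a bare fractional part such as .2
NUMBER = r'-?(?:\d+(?:\.\d+)?|\.\d+)'
PAIR = NUMBER + ',' + NUMBER

def build_pair_pattern(extra_pairs=None):
    """Compile a pattern for one pair followed by ';pair' groups.

    extra_pairs=None allows any number of additional pairs;
    an integer requires exactly that many.
    """
    quantifier = '*' if extra_pairs is None else '{%d}' % extra_pairs
    return re.compile('(?:%s)(?:;%s)%s' % (PAIR, PAIR, quantifier))

exactly_three = build_pair_pattern(2)
any_count = build_pair_pattern()
print(bool(exactly_three.fullmatch('.2,0.2;1,-2;3.5,.5')))
print(bool(any_count.fullmatch('.2,0.2')))
```

The patterns are applied with fullmatch, which plays the role of the ^ and $ anchors in the original expression.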

Related

Parsing digits and decimals out of string with re

I have a string that looks like this:
'Home Cookie viewed item "yada_yada.mov" (22.4338.241384081)'
I need to parse the last set of numbers, the ones between the last period and the closing paren (in this case, 241384081) out of the string, keeping in mind that there may be one or more sets of parentheses in the filename "yada_yada.mov".
So far I have this:
mo = re.match('.*([0-9])\)$', data1)
...where data1 is the string. But that is only returning the very last digit.
Any help, please?
Thanks!
You may use
(\d[\d.]*)\)$
See the regex demo.
Details
(\d[\d.]*) - Capturing group 1: a digit and then any amount of . and digits, 0 or more times
\) - a )
$ - end of string.
See the Python demo:
import re
s='Home Cookie viewed item "yada_yada.mov" (22.4338.241384081)'
m = re.search(r'(\d[\d.]*)\)$', s)
if m:
    print(m.group(1)) # => 22.4338.241384081
    # print(m.group(1).replace(".", "")) # => 224338241384081
Alternative patterns:
(\d+(?:\.\d+)*)\)$ # To match digits and then 0 or more repetitions of . + digits
(\d+(?:\.\d+)*)\)\s*$ # To allow any 0+ trailing whitespaces
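The patterns above capture the whole dotted run. If only the digits after the last period are wanted, as the question describes, a narrower capture might look like the sketch below. The function name last_number_group is made up for illustration, and it assumes the string always ends with ".digits)" plus optional whitespace.

```python
import re

def last_number_group(text):
    """Return the digits between the last period and the final ')', or None."""
    m = re.search(r'\.(\d+)\)\s*$', text)
    return m.group(1) if m else None

s = 'Home Cookie viewed item "yada_yada.mov" (22.4338.241384081)'
print(last_number_group(s))
```

Because the pattern is anchored at the end of the string, parentheses inside the filename do not affect the result.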

Regex to Extract Last Part of URL that Contains User ID Strings

I'm having a hard time figuring this one out and could use some help.
I'm using Google Analytics filters to reduce the number of unique pages being reported in our app by stripping out ID strings from the URLs that are coming in.
What I need is a regex that will look for URLs that have these IDs in the URL. Here's what sets them apart from the rest of the URL:
ID strings are always the last part of the URL
ID strings always contain both letters and numbers
ID strings are always either 16 or 32 characters in length
ID strings can show up twice in a URL
ID strings can end with either a "/" or without
Here are some example URLs that show how they appear in our reporting:
/app/6be031b9672be9b5/
/app/admin/client/settings/6be031b9672be9b5
/app/subscribers/ea33fb38c9efc4dc0367819f23434f99/
/app/subscribers/customfieldsettings/0359c487066727ae/
/app/reports/6fa92d36be0e6c16/dc5aa096fba9cbb97eea1dae616d4b3c/
The second part of my question is that this regex should also group everything before these ID strings into a capturing group so that I can call that group later on in the filter, effectively stripping out these ID strings to look like the following:
/app/6be031b9672be9b5/ --> /app/
/app/subscribers/ea33fb38c9efc4dc0367819f23434f99/ --> /app/subscribers/
etc.
I've tried a couple different approaches but none seem to work perfectly, so I could really use the help, thank you!
Here's a solution:
^(.*?)(?:\/[a-zA-Z0-9]{16}|\/[a-zA-Z0-9]{32}){0,2}\/?$
Demo
This will remove the last part or 2 parts of URLs which are 16 or 32 characters long and contain only letters and digits.
You can make sure these parts contain both letters and numbers like this, if the tool supports lookaheads:
^(.*?)(?:\/(?=.{0,15}?\d)(?=.{0,15}?[a-zA-Z])[a-zA-Z0-9]{16}|\/(?=.{0,31}?\d)(?=.{0,31}?[a-zA-Z])[a-zA-Z0-9]{32}){0,2}\/?$
Demo
This adds assertions to the pattern.
Breakdown:
^(.*?) # Start of URL
(?:
\/ # a slash
(?=.{0,15}?\d) # check there's a digit at most 16 chars ahead
(?=.{0,15}?[a-zA-Z]) # check there's a letter at most 16 chars ahead
[a-zA-Z0-9]{16} # check the next 16 chars are digits or letters
| # .. or:
\/ # a slash
(?=.{0,31}?\d) # check there's a digit at most 32 chars ahead
(?=.{0,31}?[a-zA-Z]) # check there's a letter at most 32 chars ahead
[a-zA-Z0-9]{32} # check the next 32 chars are digits or letters
){0,2} # .. at most 2 times
\/?$ # optional slash at end
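To try the lookahead version outside of Google Analytics, here is a small Python sketch run against the example URLs from the question. The helper name strip_ids is hypothetical. Note that group 1 comes back without a trailing slash, so one has to be appended if the /app/ form is wanted.

```python
import re

# Same pattern as above; "/" needs no escaping in Python.
ID_SUFFIX = re.compile(
    r'^(.*?)'
    r'(?:/(?=.{0,15}?\d)(?=.{0,15}?[a-zA-Z])[a-zA-Z0-9]{16}'
    r'|/(?=.{0,31}?\d)(?=.{0,31}?[a-zA-Z])[a-zA-Z0-9]{32}){0,2}'
    r'/?$'
)

def strip_ids(path):
    """Return the part of the path before the trailing ID segment(s)."""
    m = ID_SUFFIX.match(path)
    return m.group(1) if m else path

urls = [
    '/app/6be031b9672be9b5/',
    '/app/admin/client/settings/6be031b9672be9b5',
    '/app/subscribers/ea33fb38c9efc4dc0367819f23434f99/',
    '/app/subscribers/customfieldsettings/0359c487066727ae/',
    '/app/reports/6fa92d36be0e6c16/dc5aa096fba9cbb97eea1dae616d4b3c/',
]
for url in urls:
    print(url, '-->', strip_ids(url) + '/')
```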
This will do it:
([a-z0-9]+)(?:\/?$)
Demo
Explanation:
([a-z0-9]+) matches and captures the alphanumeric part
(?:\/?$) matches (but doesn't capture) the optional final / and then the end of the string ($)
Modified: I totally missed that there can be 1 or 2 IDs at the end.
Oh well, revised fwiw.
# (?i)^(.*?)/((?:(?=[^/]{0,31}[a-f])(?=[^/]{0,31}[0-9])(?:[a-f0-9]{16}|[a-f0-9]{32})(?:(?:/[a-z])?/?$|/)){1,2})$
(?i) # Case insensitive modifier
^ # BOS, begin the ride ..
( .*? ) # (1), Kreep up on the first ID
/ # Trim this / junk
( # (2 start), 1-2 ID's separated by a /
(?:
(?= [^/]{0,31} [a-f] ) # Use largest range (32), Must be a letter AND number
(?= [^/]{0,31} [0-9] )
(?: # One of 16 or 32 length
[a-f0-9]{16}
| [a-f0-9]{32}
)
(?:
(?: / [a-z] )? # optional / letter
/? $ # /? EOS for end of 1 or 2
| # or,
/ # / between 2 only
)
){1,2}
) # (2 end)
$ # EOS, rides over !!
Sample output:
** Grp 0 - ( pos 195 , len 63 )
/app/reports/6fa92d36be0e6c16/dc5aa096fba9cbb97eea1dae616d4b3c/
** Grp 1 - ( pos 195 , len 12 )
/app/reports
** Grp 2 - ( pos 208 , len 50 )
6fa92d36be0e6c16/dc5aa096fba9cbb97eea1dae616d4b3c/

Regular expression allow up to 2 decimal places optional leading 0

I have the following regular expression
/^\d*[0-9](?:\.[0-9]{1,2})?$/
How can I modify it so that it will allow numbers like .12 and .0? I want to keep it so they can only enter numeric values, but I need to allow values as seen above with no leading digits.
At the moment it works well but only if you provide a leading zero.
Thank you!
You can use alternation:
/^(?:\d*[0-9](?:\.[0-9]{1,2})?|\.[0-9]{1,2})$/
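A quick way to see what the alternation accepts is to run it over a few inputs. The sketch below uses Python, which is an assumption since the question does not name a language; the /.../ delimiters are dropped, and is_valid_amount is a made-up name.

```python
import re

# First branch: digits with an optional 1-2 digit fraction.
# Second branch: a bare fraction such as .12 or .0
AMOUNT = re.compile(r'^(?:\d*[0-9](?:\.[0-9]{1,2})?|\.[0-9]{1,2})$')

def is_valid_amount(text):
    """Return True for a number with at most two decimal places."""
    return AMOUNT.fullmatch(text) is not None

for sample in ['12', '0.5', '.12', '.0', '.123', '1.', '.', '']:
    print(repr(sample), is_valid_amount(sample))
```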
Here is another regexp:
/
^ # The string start with...
\s* # Any leading spaces
0* # Any leading zeros
# Now the number we want to match
(
[0-9]? # Maybe a positive number
# An optional decimal part
(?:
[.,] # A decimal point or a comma
[0-9]{1,2} # One or two digits after the separator
)?
)
0* # Perhaps some trailing zeros
/gmx
Demo : http://regex101.com/r/cI8xP4/1
^(?!^$)\d*[0-9]?(?:\.[0-9]{1,2})?$
Try this. See demo.
http://regex101.com/r/lE9oV4/1

Match any except list of values - oracle regex

I need an Oracle regex that will match a file-name in the format ABCD_EFG_YYYYMMDD_HH(24)MISS.csv, except if the time-part is one of three specific values: 110000, 140000, or 180000.
So, for example, it will match the file-name ABCD_EFG_20120925_110001.csv, but not the file-name ABCD_EFG_20120925_110000.csv.
The following non-Oracle regex works:
^ABCD_EFG_[0-9]*_(?!110000|140000|180000)[0-9]*\.csv$
but I don't know how to write it as an Oracle regex.
Oracle doesn't support lookahead assertions, so you'll have to spell out all the valid matches:
^ABCD_EFG_[0-9]*_([02-9]|1[0235679]|1[148]0{0,3}[1-9])[0-9]*\.csv$
should work (assuming that the time part is always 6 digits long).
Explanation:
ABCD_EFG_ # Match ABCD_EFG_
[0-9]*_ # Match first number (date part) and _
( # Match a number that starts with
[02-9] # 0 or 2-9
| # or
1[0235679] # 1, followed by 0, 2, 3, 5, 6, 7, or 9
| # or
1[148] # 11, 14, or 18
0{0,3} # followed by up to three zeroes
[1-9] # but then one digit 1-9
) # End of alternation
[0-9]* # Fill the rest with any digits
\.csv # Match .csv (mind the backslash!)
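Since Python's re module does support lookahead, it can be used to check that the spelled-out pattern agrees with the original lookahead pattern for every valid six-digit time. This is only a sanity-check sketch: the function name disagreements and the sample date are made up, and it says nothing about Oracle's own regex engine.

```python
import re

WITH_LOOKAHEAD = re.compile(
    r'^ABCD_EFG_[0-9]*_(?!110000|140000|180000)[0-9]*\.csv$'
)
WITHOUT_LOOKAHEAD = re.compile(
    r'^ABCD_EFG_[0-9]*_([02-9]|1[0235679]|1[148]0{0,3}[1-9])[0-9]*\.csv$'
)

def disagreements(date='20120925'):
    """List file names on which the two patterns give different answers."""
    bad = []
    for hour in range(24):
        for minute in range(60):
            for second in range(60):
                name = 'ABCD_EFG_%s_%02d%02d%02d.csv' % (date, hour, minute, second)
                a = WITH_LOOKAHEAD.match(name) is not None
                b = WITHOUT_LOOKAHEAD.match(name) is not None
                if a != b:
                    bad.append(name)
    return bad

print(disagreements())
```

An empty list means both patterns accept and reject the same names across all 86,400 times of day.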

Regular expression for phone numbers

I'm trying:
\d{3}|\d{11}|\d{11}-\d{1}
to match three-digit numbers, eleven-digit numbers, eleven-digit followed by a hyphen, followed by one digit.
But, it only matches three digit numbers!
I also tried \d{3}|\d{11}|\d{11}-\d{1} but doesn't work.
Any ideas?
There are many ways of punctuating phone numbers. Why don't you remove everything but the digits and check the length?
Note that there are several ways of indicating "extension":
+1 212 555 1212 ext.35
If the first part of an alternation matches, then the regex engine doesn't even try the second part.
Presuming you want to match only three-digit, 11 digit, or 11 digit hyphen 1 digit numbers, then you can use lookarounds to ensure that the preceding and following characters aren't digits.
(?<!\d)(\d{3}|\d{11}|\d{11}-\d{1})(?!\d)
\d{7}+\d{4} will select an eleven digit number. I could not get \d{11} to actually work.
This should work: /(?:^|(?<=\D))(\d{3}|\d{11}|\d{11}-\d{1})(?:$|(?=\D))/
or combined /(?:^|(?<!\d))(\d{3}|\d{11}(?:-\d{1})?)(?:$|(?![\d-]))/
expanded:
/ (?:^ | (?<!\d)) # either start of string or not a digit before us
( # capture grp 1
\d{3} # a 3 digit number
| # or
\d{11} # an 11 digit number
(?:-\d{1})? # optional '-' plus 1 digit number
) # end capture grp 1
(?:$ | (?![\d-])) # either end of string or not a digit nor '-' after us
/
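To compare the behaviour of the patterns discussed in this thread, here is a short Python sketch. The question does not name a language, so Python is an assumption, and find_numbers is a hypothetical helper.

```python
import re

# The original attempt: the first alternative wins wherever three digits appear.
NAIVE = re.compile(r'\d{3}|\d{11}|\d{11}-\d{1}')
# Lookarounds keep a match from starting or ending inside a longer digit run.
BOUNDED = re.compile(r'(?<!\d)(\d{3}|\d{11}|\d{11}-\d{1})(?!\d)')
# The combined form makes the "-digit" suffix optional and forbids a trailing "-".
COMBINED = re.compile(r'(?:^|(?<!\d))(\d{3}|\d{11}(?:-\d{1})?)(?:$|(?![\d-]))')

def find_numbers(pattern, text):
    """Return every full match of pattern in text, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]

text = '911 12345678901 12345678901-2'
for name, pattern in [('naive', NAIVE), ('bounded', BOUNDED), ('combined', COMBINED)]:
    print(name, find_numbers(pattern, text))
```

On this input the naive pattern chops the long numbers into three-digit pieces. The bounded pattern finds both eleven-digit numbers but stops before the -2, because \d{11} succeeds before the longer alternative is tried. Only the combined pattern returns 12345678901-2 as a single match.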