Regular Expressions for finding files - regex

OK, I'm giving up and asking the question, since I read through the regex help article and still don't have a clue what I'm looking for:
I have a list of files:
files <- c("files_combined.csv","file_1-10.csv","file_11-20.csv",
"file_21-30.csv","file_2731-2740.csv","file_2731-2740.txt")
I want only the csv files that start with "file_" and end with ".csv". I know it looks something like this:
grep(pattern = "^file_???.csv$" ,files)
But I need to find the correct regular expression that ignores the number of characters between the first and the second pattern ("file_" + ".csv"). I'd really appreciate it if somebody knows of a complete list of the regular expressions in R, since it is tedious to read through the help every time and, as in my case, sometimes not successful...

R offers a function for doing wildcard expansion using glob patterns for those who don't like regex:
files <- Sys.glob("file_*.csv")
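This should match your pattern. Note that Sys.glob expands the pattern against files on disk (here, in the working directory) rather than filtering an existing character vector.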

Thanks a lot! Seems David Arenburg and Heroka, you came up with the solution at the same time. Also thanks to MichaelChirico for providing the cheatsheet.
This is the answer to my specific problem:
grep("^file_.+\\.csv$", files, ignore.case = TRUE)
For working through regex problems, txt2re is helpful as well.
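For readers outside R, the same pattern carries over to other regex engines. Below is a minimal Python sketch (Python is my choice for illustration and is not used in the thread; the variable names are made up) that applies the pattern to the file list from the question. In Python a raw string needs only a single backslash before the dot.

```python
import re

files = [
    "files_combined.csv", "file_1-10.csv", "file_11-20.csv",
    "file_21-30.csv", "file_2731-2740.csv", "file_2731-2740.txt",
]

# ^file_  : the name must start with "file_"
# .+      : one or more of any character (the part we do not care about)
# \.csv$  : the name must end with a literal ".csv"
pattern = re.compile(r"^file_.+\.csv$", re.IGNORECASE)

csv_files = [f for f in files if pattern.search(f)]
print(csv_files)
```

"files_combined.csv" is rejected because "files_" is not "file_", and the .txt file is rejected by the anchored ".csv" ending.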

Related

Regex nothing or some options

I am trying to develop a regular expression to extract this: PT~MM:SS~EQP>G-G<EQP from a file.
PT is optional but if it is present it's only valid if it is 1P, 2P, 1EP or 2EP.
So if the example is: 3EP~101:37~POR>4-2<ISL it shouldn't match anything, but I am getting 2EP~101:37~POR>4-2<ISL as a match.
So far I've tried this:
(((1|2)P|(1|2)EP)~)?(0{0,1}([0-9]|[1-8][0-9]|9[0-9]|1[01][0-9]|120)):(0*([0-9]|[1-4][0-9]|5[0-9]))~[A-Z]{3}>[0-9]-[0-9]<[A-Z]{3}
Can someone help me?
This might be what you are looking for: ^(?:[12]E?P)?~?\w+?:\w+?~\w+?>\w-\w<\w{3} (https://regex101.com/r/T8Cy4C/6). However, you did not fully specify the requirements for each part.
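A quick way to check the suggested pattern is to run it against a few inputs. The sketch below does that in Python; the sample strings other than the two from the question are made up for illustration. The leading ^ is what stops the engine from skipping over an invalid prefix such as 3EP and matching the rest of the string.

```python
import re

# Pattern copied from the answer above.
pattern = re.compile(r"^(?:[12]E?P)?~?\w+?:\w+?~\w+?>\w-\w<\w{3}")

samples = [
    "2EP~101:37~POR>4-2<ISL",  # valid optional prefix
    "1P~05:30~POR>4-2<ISL",    # valid optional prefix (hypothetical sample)
    "101:37~POR>4-2<ISL",      # prefix omitted (hypothetical sample)
    "3EP~101:37~POR>4-2<ISL",  # invalid prefix, should be rejected
]

results = {s: bool(pattern.match(s)) for s in samples}
for sample, ok in results.items():
    print(sample, "->", ok)
```

Note that this pattern is looser than the attempt in the question: \w+? accepts any word characters, so it does not enforce the minute and second ranges.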

Regex to match everything."LettersNumbers"."extension" and forum searching tip

I need a regex to match my files named "something".Title"numberFrom1to99".mp4 in Windows File Explorer. My first approach as a regex newbie was something like
"..mp4"
, but it didn't work, so I tried
"*.Title[1-9][0-9].mp4"
, which also did not work.
I would also like a tip on how to search for regex-related advice in the Stack Overflow archive and on the web, so that I can be specific without the regex itself interacting with the search bar.
Thank you!
EDIT
About the second part of the question: the question itself shows "..mp4", but I wrote "asterisk"."asterisk".mp4. Is there any universal way to write a regex on the web without it taking effect and without escaping the characters? (With escaping, the backslash shows up inside the regex, and that could be misunderstood.)
Try something like this:
(.*)\.[A-Za-z]+\d+\.mp4
See this Regex Demo to get an explanation on the regex.
Use regex101.com to test your regexes.
Here it is:
^[\s\S]*\.Title[1-9][0-9]?\.mp4$
I suggest regexr.com, which has many interesting regexes (Favourites tab) and a simple tutorial.
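To see what the second pattern accepts, here is a small Python sketch run over made-up file names. It only checks the pattern in a regex engine; it does not show that the File Explorer search box accepts regexes.

```python
import re

# [1-9][0-9]? allows 1 through 99: a non-zero digit plus an optional second digit.
pattern = re.compile(r"^[\s\S]*\.Title[1-9][0-9]?\.mp4$")

# Hypothetical file names for illustration.
names = [
    "holiday.Title7.mp4",
    "holiday.Title42.mp4",
    "holiday.Title0.mp4",      # 0 is outside 1..99
    "holiday.Title100.mp4",    # three digits
    "holiday.Title7.mp4.bak",  # does not end in .mp4
]

matched = [n for n in names if pattern.search(n)]
print(matched)
```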

Rewrite regex without negation

I have written this regex to help me extract some links from some text files:
https?:\/\/(?:.(?!https?:\/\/))+$
Because I am using the golang/regexp lib, I'm not able to use it, due to my negative lookahead (?!..
What I would like to do with it is select all the text from the last occurrence of http/https until the end.
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2
=> Output: http://websites.com/path/subpath/#query2
Can anyone help me with a solution? I've spent several hours trying different ways of reproducing the same result, with no success.
Try this regex:
https?:[^:]*$
Regex live here.
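This pattern relies on there being no colon after the final "http:" or "https:". A minimal Python sketch (the second input string is made up) shows it working on the sample from the question and failing once the last URL contains a port number:

```python
import re

pattern = re.compile(r"https?:[^:]*$")

text = ("sometextsometexhttp://websites.com/path/subpath/#query1"
        "sometexthttp://websites.com/path/subpath/#query2")
m = pattern.search(text)
last_url = m.group(0) if m else None
print(last_url)

# Hypothetical input: the last URL has a ":8080" port, so [^:]* cannot reach the end.
with_port = "xhttp://a.example/1yhttp://b.example:8080/2"
port_match = pattern.search(with_port)
print(port_match)
```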
The lookaheads exist for a reason.
However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:
(?!xyz)
is somewhat equivalent to:
$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)
With that said, hopefully I didn't make any mistakes:
https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
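Another way to avoid the lookahead is to let a greedy .* consume as much as possible and capture what follows it. This is sketched in Python below; the function name and the second test string are made up. Since it uses only greedy repetition and a capture group, the same idea should carry over to Go's regexp package, though I have only tried it here in Python.

```python
import re

# The greedy .* backtracks from the end of the text, so the capture group
# starts at the last position where "http://" or "https://" can match.
pattern = re.compile(r".*(https?://.*)$", re.DOTALL)

def last_url(text):
    """Return the text from the last http(s):// to the end, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

text = ("sometextsometexhttp://websites.com/path/subpath/#query1"
        "sometexthttp://websites.com/path/subpath/#query2")
print(last_url(text))
print(last_url("xhttp://a.example/1yhttp://b.example:8080/2"))
print(last_url("no links here"))
```

The result is in capture group 1 rather than in the whole match, which is the price of dropping the lookahead.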

Regular Expression for recognizing files with following patterns az.4.0.0.119.tgz

I am trying to find a regular expression that will recognize files with the pattern as az.4.0.0.119.tgz. I have tried the regular expression below:
([a-z]+)[.][0-9]{0,3}[.][0-9]{0,3}[.][0-9]{0,3}[.]tgz
But, no luck. Can anyone please point me in the right direction.
Regards,
Shreyas
Better to use a simple regex like this:
^([a-z]+)\.(?:[0-9]+\.)+tgz$
You just forgot one number part:
([a-z]+)[.][0-9]{0,3}[.][0-9]{0,3}[.][0-9]{0,3}[.][0-9]{0,3}[.]tgz
or
([a-z]+)[.]([0-9]{0,3}[.]){4}tgz
Depending on where and how you use the regex, you might want to surround it in ^...$.
Your file name has four groups of digits; your regexp has only three.
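The difference between the attempts can be checked directly. A minimal Python sketch follows; the variable names and the "az.....tgz" edge case are made up for illustration.

```python
import re

name = "az.4.0.0.119.tgz"

# Original attempt: only three digit groups.
three_groups = re.compile(r"([a-z]+)[.][0-9]{0,3}[.][0-9]{0,3}[.][0-9]{0,3}[.]tgz")
# Corrected: exactly four digit groups, anchored.
four_groups = re.compile(r"^([a-z]+)[.]([0-9]{0,3}[.]){4}tgz$")
# Simpler alternative: one or more non-empty digit groups.
any_groups = re.compile(r"^([a-z]+)\.(?:[0-9]+\.)+tgz$")

print(bool(three_groups.search(name)))
print(bool(four_groups.search(name)))
print(bool(any_groups.search(name)))

# {0,3} also allows empty groups, so a name made only of dots gets through.
print(bool(four_groups.search("az.....tgz")))
print(bool(any_groups.search("az.....tgz")))
```

If empty groups should be rejected, {1,3} in place of {0,3} is the usual adjustment.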

Searching my code with regex

It happens all the time: I need to scan my code for places that contain two or more given keywords.
For example $json["VALID"]
So, I would need to find json, and VALID.
Some places in the code may contain:
// a = $json['VALID']; // (note the apostrophes)
(I am using EditPlus which is a great text editor, letting me use regex in my searches)
What would be the regex string to find json and VALID (in this example)?
Thanks in advance!
Use this regex:
\$json\[["']VALID['"]\]
The following would find $json, then any two characters, then VALID:
\$json.{2}VALID
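The two patterns differ in how strict they are. The Python sketch below runs both over a few made-up lines of code; EditPlus's regex dialect may differ in details, so treat this only as an illustration of the patterns themselves.

```python
import re

# Strict: requires the bracket and a single or double quote on each side.
strict = re.compile(r"""\$json\[["']VALID['"]\]""")
# Loose: any two characters between $json and VALID.
loose = re.compile(r"\$json.{2}VALID")

# Hypothetical source lines for illustration.
lines = [
    '$x = $json["VALID"];',
    "// a = $json['VALID'];",
    "$json->VALID",
    "$jsonXVALID",
]

strict_hits = [line for line in lines if strict.search(line)]
loose_hits = [line for line in lines if loose.search(line)]
print(strict_hits)
print(loose_hits)
```

The loose pattern also matches "$json->VALID", which may or may not be wanted.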