%:s/\([0-9]*\)_\(*\)/\2 will not rename files - regex

Can someone please edit %:s/\([0-9]*\)_\(*\)/\2 so that I can rename files? For example, if the file name is 5555_word_word.jpg, then I want the file name to be word_word.jpg. I feel like I am so close!

You may want to simplify and have it just delete leading numbers and the underscore:
s/^[0-9]\+_//

Try this:
:%s/\([0-9]*\)_\(.*\)/\2
The . will match any character (part of the second grouping) and the * will greedily match any amount of them. Your original regex was missing that directive. This will also rename files of the form _word_word.txt to word_word.txt. If you want to require digits to match (probably a good idea), use:
:%s/\([0-9]\+\)_\(.*\)/\2
The \+ directive means to match 1 or more instances.

Your version is fine but you forgot a period and you should probably anchor it to the beginning of a line or to a word boundary using either ^ or \<.
:%s/^\([0-9]*\)_\(.*\)/\2/
You can use \v to clean up some of those slashes.
:%s/\v^([0-9]*)_(.*)/\2/
You can use \ze to avoid capture groups.
:%s/^[0-9]*_\ze.*//
But the trailing .* is superfluous, because it matches anything. So use Seth's version, it's the simplest.
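For anyone doing the same rename outside Vim, here is a minimal sketch of the anchored pattern in Python. It assumes Python's re module (where + needs no backslash), and strip_numeric_prefix is a hypothetical helper name. It only computes the new name and does not touch the file system.

```python
import re

# One or more leading digits followed by an underscore, anchored at the start.
PREFIX = re.compile(r"^[0-9]+_")


def strip_numeric_prefix(name: str) -> str:
    """Return name without a leading '<digits>_' prefix, if it has one."""
    return PREFIX.sub("", name, count=1)


for name in ["5555_word_word.jpg", "_word_word.txt", "word_12_word.jpg"]:
    print(name, "->", strip_numeric_prefix(name))
```

Because the pattern requires at least one digit and is anchored, names such as _word_word.txt or word_12_word.jpg are left alone.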

Related

VIM - Replace based on a search regex

I've got a file with several (1000+) records like :
lbc3.*'
ssa2.*'
lie1.*'
sld0.*'
ssdasd.*'
I can find them all by :
/s[w|l].*[0-9].*$
What I want to do is replace the final part of each pattern found with \.*'
I can't do :%s//s[w|l].*[0-9].*$/\\\\\.\*' because it would replace the whole string, and I only need to replace the end of it, from
.'
to
\.'
So the file output is like:
lbc3\\.*'
ssa2\\.*'
lie1\\.*'
sld0\\.*'
ssdasd\\.*'
Thanks.
In general, the solution is to use a capture. Put \(...\) around the part of the regex that matches what you want to keep, and use \1 to include whatever matched that part of the regex in the replacement string:
s/\(s[w|l].*[0-9].*\)\.\*'$/\1\\.*'/
Since you're really just inserting a backslash between two strings that you aren't changing, you could use a second set of parens and \2 for the second one:
s/\(s[w|l].*[0-9].*\)\(\.\*'\)$/\1\\\2/
Alternatively, you could use \zs and \ze to delimit just the part of the string you want to replace:
s/s[w|l].*[0-9].*\zs\ze\.\*'$/\\/
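The same two ideas (keep parts with capture groups, or match only the insertion point) can be sketched in Python. This is an illustration under assumptions, not the Vim commands above: it uses Python's re syntax, simplifies the leading part of the pattern to .*, and the function names are made up for the example. A lookahead stands in for \zs\ze.

```python
import re


def escape_with_captures(line: str) -> str:
    # Group 1 is everything before the tail, group 2 is the literal .*' tail.
    # In the replacement template, \\ produces one literal backslash.
    return re.sub(r"^(.*)(\.\*')$", r"\1\\\2", line)


def escape_with_lookahead(line: str) -> str:
    # Zero-width match just before the .*' tail, so nothing is consumed
    # and the replacement only inserts a backslash at that position.
    return re.sub(r"(?=\.\*'$)", r"\\", line)


for sample in ["lbc3.*'", "ssdasd.*'"]:
    print(escape_with_captures(sample), escape_with_lookahead(sample))
```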

Regex - Filename may contain a parentheses group

I want to match the main name and the file count without parentheses.
For example:
8680733046449.png
8680733046449 (3).png
These files have the same name. I want to separate the second file's name (8680733046449) and the file count (3) (without parentheses).
If the file name does not contain any parentheses, just match the name.
My regex is:
/^(.*)\s?\((\d+)\)\.png$/
This regex matches files that have parentheses but not the ones without.
Test here : http://www.regexr.com/38pup
You need to use a non-greedy quantifier for the name part. Otherwise, it will match the space and parentheses. You also need to make the part in parentheses optional.
/^(.*?)\s?(\((\d+)\))?\.png$/
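If I understood you correctly, I think this would work for you:
^(.+?)(\s?\((\d+)\))?\.png$
Notice that I changed the * after the first dot to + to avoid empty filenames. The quantifier has to stay non-greedy (+?), otherwise the first group swallows the space and the count.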
Kind regards.
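A quick way to check the optional-group idea is a small Python sketch. It assumes Python's re module and a hypothetical helper split_name. Unlike the patterns above, it makes the outer group non-capturing, so the count lands in group 2.

```python
import re
from typing import Optional, Tuple

# Lazy name, then an optional " (N)" suffix, then the .png extension.
PATTERN = re.compile(r"^(.+?)(?:\s?\((\d+)\))?\.png$")


def split_name(filename: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (name, count) where count is None if there is no '(N)' part."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    return m.group(1), m.group(2)


for f in ["8680733046449.png", "8680733046449 (3).png"]:
    print(f, "->", split_name(f))
```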

regex limiting wildcards for url folders

I'd like to set up a regular expression that matches certain patterns for a URL:
http://www.domain.com/folder1/folder2/anything/anything/index.html
This matches, and gets the job done:
/^http:\/\/www\.domain\.com\/folder1\/folder2\/.*\/.*\/index\.html([\?#].*)?$/.test(location.href)
I'm unsure how to limit the wildcards to one folder each. So how can I prevent the following from matching:
http://www.domain.com/folder1/folder2/folder3/folder4/folder5/index.html
(note: folder 5+ is what I want to prevent)
Thanks!
Try this regular expression:
/^http:\/\/www\.domain\.com\/(?:\w+\/){1,3}index\.html([\?#].*)?$/
Change the number 3 to the maximum depth of folders possible.
. matches any character.
[^/] matches any character except /.
Since the / character marks the beginning and end of regex literals, you may have to escape it like this: [^\/].
So, replacing .* by [^\/]* will do what you want:
/^http:\/\/www\.domain\.com\/folder1\/folder2\/[^\/]*\/[^\/]*\/index\.html([\?#].*)?$/.test(location.href)
/^http:\/\/www\.domain\.com\/folder1\/folder2\/[^/]*\/[^/]*\/index\.html([\?#].*)?$/
I don't remember whether we should escape the slashes within the []. I don't think so.
EDIT: Acknowledging tom's comment, using + instead of *:
/^http:\/\/www\.domain\.com\/folder1\/folder2\/[^/]+\/[^/]+\/index\.html([\?#].*)?$/
/^http:\/\/www\.domain\.com\/\([^/]*\/\)\{2\}/
And you can change 2 to whatever number of directories you want to match.
You may use:
/^http:\/\/www\.domain\.com\/folder1\/folder2\/(\w*\/){2}index\.html([\?#].*)?$/.test(location.href)
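To see the effect of swapping .* for a negated character class, here is a rough Python sketch. It assumes Python's re module (no need to escape / there), and is_two_levels_deep is a hypothetical name. It uses [^/]+ so each wildcard covers exactly one non-empty folder.

```python
import re

# Each [^/]+ can match one folder name only, because it cannot cross a slash.
URL = re.compile(
    r"^http://www\.domain\.com/folder1/folder2/[^/]+/[^/]+/index\.html([?#].*)?$"
)


def is_two_levels_deep(url: str) -> bool:
    """True if exactly two folders sit between folder2 and index.html."""
    return URL.match(url) is not None


print(is_two_levels_deep(
    "http://www.domain.com/folder1/folder2/anything/anything/index.html"))
print(is_two_levels_deep(
    "http://www.domain.com/folder1/folder2/folder3/folder4/folder5/index.html"))
```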

How to capture text between two markers?

For clarity, I have created this:
http://rubular.com/r/ejYgKSufD4
My strings:
http://blablalba.com/foo/bar_soap/foo/dir2
http://blablalba.com/foo/bar_soap/dir
http://blablalba.com/foo/bar_soap
My Regular expression:
\/foo\/(.*)
This returns:
/foo/bar_soap/dir/dir2
/foo/bar_soap/dir
/foo/bar_soap
But I only want
/foo/bar_soap
Any ideas how I can achieve this? As illustrated above, I want everything after foo up until the first forward slash.
Thanks in advance.
Edit. I only want the text after foo until the next forward slash. Some directories may also be named foo, and this would give incorrect results. Thanks
. will match anything, so you should change it to [^/] (not slash) instead:
\/foo\/([^\/]*)
Some of the other answers use + instead of *. That might be correct depending on what you want to do. Using + forces the regex to match at least one non-slash character, so this URL would not match since there isn't a trailing character after the slash:
http://blablalba.com/foo/
Using * instead would allow that to match since it matches "zero or more" non-slash characters. So, whether you should use + or * depends on what matches you want to allow.
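The difference is easy to check with a short Python sketch (Python's re module is assumed here, and capture is a hypothetical helper):

```python
import re
from typing import Optional


def capture(pattern: str, url: str) -> Optional[str]:
    """Return group 1 of the first match, or None if the pattern does not match."""
    m = re.search(pattern, url)
    return None if m is None else m.group(1)


url = "http://blablalba.com/foo/"
print(repr(capture(r"/foo/([^/]*)", url)))  # matches, with an empty capture
print(repr(capture(r"/foo/([^/]+)", url)))  # no match at all
```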
Update
If you want to filter out query strings too, you could also filter against ?, which must come at the front of all query strings. (I think the examples you posted below are actually missing the leading ?):
\/foo\/([^?\/]*)
However, rather than rolling your own solution, it might be better to just use split from the URI module. You could use URI::split to get the path part of the URL, then use String#split to split it up by / and grab the first one. This would handle all the weird cases for URLs. One that you probably haven't thought of yet is a URL with a specified fragment, e.g.:
http://blablalba.com/foo#bar
You would need to add # to your filtered-character class to handle those as well.
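As a rough Python analogue of the URI::split idea (not the Ruby API itself), the standard urllib.parse.urlsplit separates the path from the query and fragment before any splitting on /. The helper segment_after_foo is a hypothetical name for this sketch.

```python
from typing import Optional
from urllib.parse import urlsplit


def segment_after_foo(url: str) -> Optional[str]:
    """Return the path segment right after the first 'foo' segment, if any."""
    # urlsplit drops the query string and fragment from the path for us.
    segments = urlsplit(url).path.split("/")
    if "foo" not in segments:
        return None
    i = segments.index("foo")
    if i + 1 < len(segments) and segments[i + 1]:
        return segments[i + 1]
    return None


print(segment_after_foo("http://blablalba.com/foo/bar_soap/foo/dir2"))
print(segment_after_foo("http://blablalba.com/foo#bar"))
```

Matching whole segments also avoids false hits on directories whose names merely contain foo.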
You can try this regular expression
/\/foo\/([^\/]+)/
\/foo\/([^\/]+)
[^\/]+ gives you a series of characters that are not a forward slash.
The parentheses cause the regex engine to store the matched contents in a group ([^\/]+), so you can get bar_soap out of the entire match of /foo/bar_soap.
For example, in JavaScript you would get the matched group as follows:
const regexp = /\/foo\/([^\/]+)/;
const match = regexp.exec("/foo/bar_soap/dir"); // null if nothing matches
console.log(match[1]); // prints bar_soap

How to distinguish between saved segment and alternative?

From the following text...
Acme Inc.<SPACE>12345<SPACE or TAB>bla bla<CRLF>
... I need to extract company name + zip code + rest of the line.
Since either a TAB or a SPACE character can separate the second from the third tokens, I tried using the following regex:
FIND:^(.+) (\d{5})(\t| )(.+)$
REPLACE:\1\t\2\t\3
However, the contents of the alternative part is put in the \3 part, so the result is this:
Acme Inc.<TAB>12345<TAB><TAB or SPACE here>$
How can I tell the (Perl) regex engine that (\t| ) is an alternative instead of a token to be saved in RAM?
Thank you.
You want:
^(.+?) (\d{5})[\t ](.+)$
Since you are matching one character or the other, you can use a character class instead. Also, I made your first quantifier non-greedy (+? instead of +) to reduce the amount of backtracking the engine has to do to find the match.
In general, if you want a group to be used only for grouping and not captured, add ?: right after its opening parenthesis, like:
^(.+?) (\d{5})(?:\t| )(.+)$
Use non-capturing parentheses:
^(.+) (\d{5})(?:\t| )(.+)$
Another way is to use \s instead of ( |\t). Note that \s matches any whitespace character, including newlines, not just space and tab.
See Backslash-sequences for how Perl defines "whitespace".
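Both fixes can be tried out in a small Python sketch. It assumes Python's re module rather than Perl (the group syntax is the same for this pattern), and to_tsv is a hypothetical helper name.

```python
import re

# Variant 1: a character class, so the separator is never captured.
CHAR_CLASS = re.compile(r"^(.+?) (\d{5})[\t ](.+)$")
# Variant 2: a non-capturing group, so \3 still refers to the rest of the line.
NON_CAPTURING = re.compile(r"^(.+?) (\d{5})(?:\t| )(.+)$")


def to_tsv(line: str, pattern: "re.Pattern[str]" = CHAR_CLASS) -> str:
    """Rewrite 'name zip rest' as three tab-separated fields."""
    return pattern.sub(r"\1\t\2\t\3", line)


print(repr(to_tsv("Acme Inc. 12345 bla bla")))
print(repr(to_tsv("Acme Inc. 12345\tbla bla", NON_CAPTURING)))
```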