clarification on vim pattern matching - regex

I want to convert the 5-space indentation in a Python file to 4-space indentation. I want the command to do the following:
remove a single space from every line that starts with a space followed by characters.
I issued the command %s/^\ [a-zA-Z]*// which seems to work. Later I figured out that the command should actually do this:
remove a single space from every line that starts with a space followed by any number of spaces followed by characters.
However, I still cannot figure out how the command above works. It should report "pattern not found" for the following text, but it still works.
class H:
     def __init__():
          hell()

It's working because * means "match zero or more of the previous atom". In your case, it's matching zero. You probably wanted to use \+ instead which means "match one or more of the previous atom".
In actuality, you could have just dropped the * entirely because just a space followed by a single character would have matched what you were originally searching for. There are better regular expressions for what you're trying to accomplish, but that's not what you're asking here.
Edit (clarification):
Your regex as it stands (^\ [a-zA-Z]*) translates to:
^: From the start of the line
\ : Match a space (a backslash followed by a space)
[a-zA-Z]: Followed by a letter
*: Zero or more times (of the previous atom - a letter)
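The zero-or-more behaviour can be checked outside Vim. The sketch below uses Python's `re` module with `^ [a-zA-Z]*` as an assumed equivalent of the Vim pattern `^\ [a-zA-Z]*`; the sample line and the names `STAR` and `PLUS` are made up for illustration.

```python
import re

# Python translation of the Vim pattern ^\ [a-zA-Z]* (assumed equivalent here).
STAR = re.compile(r"^ [a-zA-Z]*")
# Same pattern with "one or more" letters instead of "zero or more".
PLUS = re.compile(r"^ [a-zA-Z]+")

line = " " * 5 + "def __init__():"  # five leading spaces

# '*' accepts zero letters, so the match is just the first space.
star_result = STAR.sub("", line)

# '+' needs at least one letter right after the first space; the second
# character is another space, so nothing matches and the line is unchanged.
plus_result = PLUS.sub("", line)

print(repr(star_result))
print(repr(plus_result))
```

With `*` the line loses exactly one leading space; with `+` the substitution finds nothing to replace on this line.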

Related

Find and Partially Replace Notepad++ Regex

I have a file with lines containing a space, 9 digits, 6 spaces and 5C18. Finding it is easy; I'm using
\s\d{9}\s{6}\5C18
The problem is that I need to replace the space at the beginning of the line with a letter, say F, so that everything else remains intact. Every time I try to do it, the entire line is replaced with the expression. I know this is probably something stupidly basic, but any help would be appreciated.
Move the part that you do not wish to replace into a lookahead expression:
^\s(?=\d{9}\s{6}5C18)
Now the portion in (?= ... ) is not considered part of the match; only the initial space is. Hence, running a replace with this regex would let you replace the initial space with whatever characters that you want.
It's text on a single line. The F needs to go where that first space is at the beginning of the line.
Note the use of ^ anchor to ensure that the match of the initial space is tied to the beginning of the line.
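As a rough check of the lookahead approach outside Notepad++, here is a small Python sketch. The sample lines and the helper name `mark_lines` are invented for illustration; the pattern is the one from the answer, with multiline mode assumed so that `^` anchors at every line start.

```python
import re

# Pattern from the answer; re.MULTILINE makes ^ match at each line start.
PATTERN = re.compile(r"^\s(?=\d{9}\s{6}5C18)", re.MULTILINE)


def mark_lines(text: str, letter: str = "F") -> str:
    """Replace only the leading space of matching lines with `letter`."""
    return PATTERN.sub(letter, text)


# Hypothetical input: the first line fits the description, the second does not
# (it has only five digits).
matching = " 123456789" + " " * 6 + "5C18"
other = " 12345" + " " * 6 + "5C18"
sample = matching + "\n" + other + "\n"

result = mark_lines(sample)
print(result)
```

Only the leading space of the first line is replaced, because the lookahead checks the rest of the line without consuming it.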

Why doesn't my non-greedy match work in vim?

This is test
There are two tabs (\t) in this line. I want to get rid of the part from the beginning to the first tab key, which is "This ", and I used the following pattern:
:s/.\{-}\t//g
It says it can't find the pattern. If I use the following, both tabs are replaced, which isn't what I want. Why doesn't the first pattern work?
:s/.*\t//g
Your first attempt does not work because you are matching the fewest number of any character followed by a tab. The fewest number of any character is zero (0). So both of your tabs match without any other characters.
Based on the comments, the above explanation was incorrect.
Here is one possible solution.
:s/^[^\t]*\t//
This starts at the beginning of the line (^) and matches any number of non-tab characters ([^\t]*) until it reaches a tab (\t).
Your pattern /.\{-}\t didn't work because of the g flag in the :s command. This flag enables global matching, so after the first replacement the pattern matches again, up to the second tab. Just remove the flag and it will work. In addition, when deleting something you can omit the replacement part in :s:
:s/.\{-}\t
The full :s/.\{-}\t// is fine as well. Note that in either case it should not say "pattern not found" as you described. If you see that message, there is something else different between your example and your actual text.
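The effect of the g flag can be imitated in Python, where `.*?` is assumed to play the role of Vim's `.\{-}` and the `count` argument of `re.sub` stands in for the presence or absence of g. The sample line is a guess at the original text, with the two tabs written as `\t`.

```python
import re

# Hypothetical reconstruction of the line from the question (two tabs).
line = "This \tis \ttest"

# Non-greedy, first match only: like :s/.\{-}\t// without the g flag.
first_only = re.sub(r".*?\t", "", line, count=1)

# Non-greedy, every match: like :s/.\{-}\t//g, so both tabs get consumed.
every_match = re.sub(r".*?\t", "", line)

# Greedy, first match only: like :s/.*\t//, which runs to the last tab.
greedy = re.sub(r".*\t", "", line, count=1)

print(repr(first_only))
print(repr(every_match))
print(repr(greedy))
```

Only the first variant leaves the text after the first tab in place; the other two both reduce the line to the part after the last tab, for different reasons.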

PowerShell RegEx with multiple options

Given a file name of 22-PLUMB-CLR-RECTANGULAR.0001.rfa I need a RegEx to match it. Basically it's any possible characters, then . and 4 digits and one of four possible file extensions.
I tried ^.?\.\d{4}\.(rvt|rfa|rte|rft)$ , which I would have thought would be correct, but I guess my RegEx understanding has not progressed as much as I thought/hoped. Now, .?\.\d{4}\.(rvt|rfa|rte|rft)$ does work and the ONLY difference is that I am not specifying the start of the string with ^. In a different situation where the file name is always in the form journal.####.txt I used ^journal\.\d{4}\.txt$ and it matched journal.0001.txt like a champ. Obviously the difference is that there I specify a literal string rather than any characters with .?, but I don't understand the WHY.
That never matches the mentioned string, since ^.? means: match the beginning of the input string, then one optional single character. After that it looks for a dot followed by four digits, and they are not there, because we have not yet moved past the first character.
Why does it work without ^? Because without ^ the engine is allowed to try every starting position. It finds a match starting at the final R of RECTANGULAR and continues matching up to the end.
That's fine, but with the first approach it should be ^.*. The Kleene star matches everything greedily and then backtracks, whereas ? only makes the preceding pattern optional. That means at most one character, not many characters.
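The three variants can be compared side by side. The sketch below uses Python's `re` module rather than PowerShell's .NET engine; for these particular patterns the two are assumed to behave the same.

```python
import re

name = "22-PLUMB-CLR-RECTANGULAR.0001.rfa"
tail = r"\.\d{4}\.(rvt|rfa|rte|rft)$"

# ^.? allows at most one character before the dot, so this cannot match.
anchored_optional = re.search(r"^.?" + tail, name)

# Without ^ the engine may start anywhere; it matches from the last 'R'.
unanchored_optional = re.search(r".?" + tail, name)

# ^.* allows any number of characters before the dot.
anchored_star = re.search(r"^.*" + tail, name)

print(anchored_optional)
print(unanchored_optional.group(0))
print(anchored_star.group(0))
```

The unanchored pattern reports a match, but the matched text is only the end of the file name, which is why `^.*` is the safer choice when the whole name has to be validated.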

How do I regex search in x and y for a, and only include the replacement of y if a was found in x?

I need to search through a larger text file.
This is an example of what I'm searching through.
https://pastebin.com/JFVy2TEt
recipes.addShaped("basemetals:adamantine_arrow", <basemetals:adamantine_arrow> * 4, [[<ore:nuggetAdamantine>], [<basemetals:adamantine_rod>], [<minecraft:feather>]]);
I need to look for lines that match a specific part in the first argument.
For example the "_arrow" part in the above line.
And erase everything that doesn't match on the "_arrow" in the first argument.
And the arguments differ across all of them.
And also with different names in the place where "basemetals:adamantine" is in the above line.
And since the further arguments are all different I can't wrap my head around on how to include the end only when the first thing matches.
Edit: The end goal is to make it easier to sort my 3k+ line text file.
basic, blacksmith, carpenter, chef, chemist, engineer, farmer, jeweler, mage, mason, scribe, tailor
I think what you're trying to do is filter your text file by removing lines that don't fit a set criteria. I've chosen the Atom text editor for this solution (because I'm running Windows OS and can't install gedit, and I want to ensure you have a working example).
To remove only lines that don't have a first argument ending in _arrow, one could do (?!recipes\.addShaped\("[^"]+_arrow")recipes.+\r?\n? and replace with nothing.
As a note: this task is made more difficult by Atom's limited regex support. In an environment with better support, my answer would probably be ^recipes\.addShaped\("[^"]+(?<!_arrow)".+\r?\n? (with multiline mode).
Also, please read "What should I do when someone answers my question?".
Regex explained:
(?! ) is a negative lookahead, which peeks at the succeeding text to ensure it doesn't contain "_arrow" at end of the first argument.
\. is an escaped literal period
[^"] is a character class that signifies a character that is not a ".
+ is a quantifier which tells the regex to match the preceding character or subexpression as many times as possible, with a minimum of one time.
. is a wildcard, representing any character
\r?\n? is used to match any kind of newline, with the ? quantifier making each character optional.
Everything else is literal characters; each represents exactly what it matches.
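For anyone who prefers a script to an editor, the same negative-lookahead idea can be sketched in Python. The `^` anchor and multiline mode are additions, the helper name `keep_arrow_recipes` is made up, and the second sample line is a hypothetical recipe, not one taken from the pastebin.

```python
import re

# Matches whole "recipes..." lines whose first argument does NOT end in _arrow.
DROP_NON_ARROW = re.compile(
    r'^(?!recipes\.addShaped\("[^"]+_arrow")recipes.+\r?\n?',
    re.MULTILINE,
)


def keep_arrow_recipes(text: str) -> str:
    """Delete every recipe line whose first argument does not end in _arrow."""
    return DROP_NON_ARROW.sub("", text)


arrow_line = (
    'recipes.addShaped("basemetals:adamantine_arrow", '
    "<basemetals:adamantine_arrow> * 4, [[<ore:nuggetAdamantine>], "
    "[<basemetals:adamantine_rod>], [<minecraft:feather>]]);\n"
)
# Hypothetical non-arrow recipe, invented for this example.
axe_line = (
    'recipes.addShaped("basemetals:adamantine_axe", '
    "<basemetals:adamantine_axe>, [[<ore:ingotAdamantine>]]);\n"
)

filtered = keep_arrow_recipes(arrow_line + axe_line)
print(filtered)
```

Swapping `_arrow` for another suffix gives the filter for a different group of recipes.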

RegExp adaption with new line

I've the following RegExp to find the URIs listed above:
"^w{3}\.[\S\-\n|\S]+[^\s.!?,():]+$"
URLs to find:
1. www.example.org
2. www.example-example.org
3. www.example-example.org/product
4. You'll find it at www.example-
   example.org/product.
5. www.example.org
   You'll find it there.
Numbers 1, 2 and 3 are found, but number 4 delivers "www.example-" as the URI.
When there is no period at the end of number 4, it is delivered correctly.
EDIT: After deleting ^ and $, only number 5 is not working.
Can anyone help here?
Your pattern
^w{3}\.[\S\-\n|\S]+[^\s.!?,():]+$
can be simplified to
^w{3}\.[\S\n]+[^\s.!?,():]$
[\S\-\n|\S] this is a character class, no OR possible, no repetition needed, - is included in \S. So [\S\n] is doing the same.
[^\s.!?,():]+ because you match every non whitespace with the expression before this one, here the + is not needed. I assume you just want your pattern not to end with one of the characters from the class.
See your pattern on Regexr (I added \r to your first class, because the line breaks there need it).
This is a very useful tool for testing regexes.
I think your problem is that you want to allow line breaks in the link. How do you want to handle this? When a line ends with a link, how do you want to tell whether the word on the next line is just a word or part of the link? I think this is not possible!
The problem is the '^\s' in the second bracketed character class. Depending on your programming language, '\s' might match the newline. So you are telling it to match anything that is not whitespace, and it finds whitespace (the newline).
However, this should only be one of your issues. Your regex uses the '^' and '$' characters which mean start and end of line respectively. Try this URL example:
hello from www.example.org
Did it match? I think it will not.
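Both points, the effect of the anchors and the ambiguity of a link that may continue on the next line, can be tried out with a short Python sketch. It uses the simplified pattern from the first answer, once with and once without `^` and `$`; the variable names are made up.

```python
import re

ANCHORED = re.compile(r"^w{3}\.[\S\n]+[^\s.!?,():]$")
UNANCHORED = re.compile(r"w{3}\.[\S\n]+[^\s.!?,():]")

sentence = "hello from www.example.org"
wrapped = "You'll find it at www.example-\nexample.org/product."
followed = "www.example.org\nYou'll find it there."

# With ^ and $ the URL must be the entire input, so a sentence fails.
anchored_hit = ANCHORED.search(sentence)

# Without the anchors the URL is found inside the sentence.
in_sentence = UNANCHORED.search(sentence).group(0)

# A URL wrapped across a line break is picked up, minus the final period.
across_break = UNANCHORED.search(wrapped).group(0)

# The same newline tolerance also swallows the first word of the next line.
over_match = UNANCHORED.search(followed).group(0)

print(anchored_hit)
print(repr(in_sentence))
print(repr(across_break))
print(repr(over_match))
```

The last result shows why example 5 cannot be fixed by the pattern alone: once newlines are allowed inside the link, the regex has no way to know that the text on the next line is ordinary prose.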