Regex -- replace all spaces before a particular character - regex

The goal is to make something like
This is some text=This is some text
become:
This\ is\ some\ text=This is some text
I've been playing with variations of things I know will grab spaces/whitespaces (like "\ " or \s) in front of (?==) which seems to select until the = character, but nothing seems to be working in Intellij IDEA's search and replace.
Any suggestions?

Copying the answer from the comments in order to remove this question from the "Unanswered" filter:
This - (\s)(?=.*=) should work. Replace it with \$1
~ answer per Rohit Jain
This was additionally confirmed by the OP:
That worked, though I used a literal space instead of the \s because it was picking up some additional white space I didn't want replaced. Also had to do some silly escaping for the replace (\\$1 )

Related

Notepad++ RegEX how do I append a character based on start of the character and before a character?

I would like to append _OLD to the end of each strings that starts with SR_ but before the symbol ' or without it
For example my text is the following:
SR_Apple
When the 'SR_APPLE' rotten, we must discard it.
I would like the find and replace to do:
SR_Apple_OLD
When the 'SR_APPLE_OLD' rotten, we must discard it.
I have tried (SR_*)+$.*(?='\s) based on what i Learned but no luck so far. Please help. Thx in Adv
For simple cases you should be able to use
Find: (\bSR_[\w]+)
Replace: $1_OLD
(\bSR_.+?)('|$) and $1_OLD$2 could also work if the text after SR_ is more complex
The lookbehind you're using is only matching the string if it ends with a ' so it won't find the text not in quotes.
regex101 is a useful tool for debugging expressions

Notepad ++ regex. Finding and replacing with wildcard, but without allowing any spaces?

I have something like this in txt
[[asdfg]] [[abcd|qwerty]]
in a row, but I want it to look like that
[[asdfg]] [[qwerty]]
using wildcards ( [[.*\| ) when trying to search, results in it finding the whole line up to the "|" Not allowing it to have a space in between should work, but I don't know how to do that.
Edit 1
It's from a wikipedia dump, so the first part is the word in it's basic form and the second is how it fits into the sentence. Something like [[I]] [[be|was]] [[at]] [[the]] [[doctor]] And I want to change it into normal sentences
[[I]] [[was]] [[at]] [[the]] [[doctor]]
Edit 2
I found somewhat of a solution. I just put every word in a new line, did the first regex and then deleted newlines. That did kinda mess up my spacing though...
Try this regex:
\[\[\w+\|(\w+)\]\]
Replace with:
[[$1]]
Make sure you choose Regular expression at the bottom before you click Replace All in Notepad++.
You can do it all in one run like so
\[{2}(?:(?!\]{2}).)+?\|([^\]]+)
This needs to be replaced by
[[$1
See a demo on regex101.com.
Broken down this says:
\[{2} # match [[
(?:(?!\]{2}).)+? # do not overrun ]]
\| # |
([^\]]+) # capture anything not ] into group 1
Afterwards, you'll only need to replace the open brackets and the content of group $1

regex to replace everything after period with nothing

I want to remove everything comes after " . " with nothing using simple regex on notepad++ It seems to be really simple one. I tried with regex "..*$" but no luck.
Eg: 129.435456
I would like to replace everything after . and get just 129
Since dot is a regex character (match one character) you need to escape the first dot
:
"\..*$"
Try without quotes in Notepad++ making sure you select "Regular expression" radio button in the bottom. Works great!

What is the regexpression for fixed words and a variable

I am using sublime to search and replace text in a code.
This is what I want to find
pre + <anyvariable> + post
and then replace it like this
sanitize(<anyvariable>)
I don't know how to come up with regexp after finding "pre + "
Came up with something like this
/(\pre \+)(?=.*)/g
Try this:
/^(.*\+)\s*(\S+)\s*(\+.*)$/
This will give you everything in three capture groups, starting from the beginning of the line between the first and second +, and then everything following to the end of the line. If there's other content such as more +'s, then this probably won't work blanketly.
Without a computer, and not sure a out what you want, but if you search for:
pre \+ (.*?) \+ post
And replace with:
sanityze($1) (or \1 I never remember)
It should do what you want :-)
If you want to be able to have linebreaks, replace .* by (?:.|\n)

Change `"` quotation marks to latex style

I'm editing a book in LaTeX and its quotation marks syntax is different from the simple " characters. So I want to convert "quoted text here" to ``quoted text here''.
I have 50 text files with lots of quotations inside. I tried to write a regular expression to substitute the first " with `` and the second " with '', but I failed. I searched on internet and asked some friends, but I had no success at all. The closest thing I got to replace the first quotation mark is
s/"[a-z]/``/g
but this is clearly wrong, since
"quoted text here"
will become
``uoted text here"
How can I solve my problem?
I'm a little confused by your approach. Shouldn't it be the other way round with s/``/"[a-z]/g? But then, I think it'll be better with:
s/``(.*?)''/"\1"/g
(.*?) captures what's between `` and ''.
\1 contains this capture.
If it's the opposite that you're looking for (i.e. I wrongly interpreted your question), then I would suggest this:
s/"(.*?)"/``\1''/g
Which works on the same principles as the previous regex.
Use the following to tackle multiple quotations, replacing all " in one step.
echo '"Quote" she said, "again."' | sed "s/\"\([^\"]*\)\"/\`\`\1''/g"
The [^\"]* avoids the need for ungreedy matching, which does not seem possible in sed.
If you are using the TeXmaker software, you could use a regular expression with the Replace command (CTRL+R), and put the following into the Find field:
"([^}]*)"
and into the Replace field:
``$1''
And then just press the Replace All button. But after that, you still have to check that everything is fine, and maybe you need to do some corrections. This has worked pretty well for me.
Try grouping the word:
sed 's/"\([a-z]\)/``\1/'
On my PC:
abhishekm71#PC:~$ echo \"hello\" | sed 's/"\([a-z]\)/``\1/'
``hello"
It depends a little on your input file (are quotes always paired, or can there be ommissions?). I suggest the following robust approach:
sed 's/"\([0-9a-zA-Z]\)/``\1/g'
sed "s/\([0-9a-zA-Z]\)\"/\1\'\'/g"
Assumption: An opening quotation mark is always immediately followed by a letter or digit, a closing quotation mark is preceeded by one. Quotations can span over several words an even several input lines (some of the other solutions don't work when this happens).
Note that I also replace the closing quotation mark: Depending on the fonts you use the double quotation mark can be typeset as neutral straight quotation mark.
You are looking for something contained in straight quotation marks not containing a quotation mark, so the best regex is "([^"]*?)". Replace it with ``\1''. In Perl this can be simplified to s/"([^"]*?)"/``\1''/g. I would be very careful with this approach, it only works if all opening quotation marks have matching closing ones, for example in "one" two "three" four. But it will fail in "one" t"wo "three" four producing ``one'' t``wo ''three".