I am fairly new to regular expressions, so this may be the simplest question you've seen on StackOverflow :-)
I have a large JSON file with text like this:
{..., "text": "BLAH BLAH", ...}
The text may contain any special characters, including sequences like \", which I understand can be read as an escape character in a regular expression. I am trying to find the colon character : and replace it with a tilde ~ within the value that follows "text", preferably in Notepad++. Any help will be greatly appreciated.
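This regex finds a : in the value of fields named text and replaces it with a ~. Each pass replaces only the first remaining : in a value, so if a value contains several, repeat Replace All until nothing more matches. Note there were issues using regular expressions with Notepad++ v5; my demo here was tested in Notepad++ v6.3.3.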
Regex: ("text":\s"[^"]*?):
Replace with: $1~
Input string: {"not text": "12:34", "text": "BLAH:BLAH", "Never get a": ":oskupee"}
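For anyone doing the same thing outside Notepad++, here is a minimal Python sketch of the idea. It is not the answer's exact regex: it assumes the value is a normal JSON string in which \" is the only way a quote can appear, and it replaces every colon in one pass. The names TEXT_FIELD and tilde_colons_in_text are made up for this example.

```python
import re

# Assumption: the value is a JSON-style string, so a backslash always
# escapes the next character (this is how \" stays inside the value).
TEXT_FIELD = re.compile(r'("text":\s*")((?:[^"\\]|\\.)*)(")')

def tilde_colons_in_text(line):
    """Replace every ':' inside the value of a "text" field with '~'."""
    def fix(match):
        # Groups 1 and 3 are the key and closing quote; only group 2 is edited.
        return match.group(1) + match.group(2).replace(":", "~") + match.group(3)
    return TEXT_FIELD.sub(fix, line)

sample = '{"not text": "12:34", "text": "BLAH:BLAH", "Never get a": ":oskupee"}'
result = tilde_colons_in_text(sample)
print(result)
```

Fields whose key merely ends in text (such as "not text") are left alone because the pattern requires the opening quote directly before text.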
Here is what I did (thanks for all the help @Mike, but I had to make many edits, which is why I am answering my own question so other users can get the complete answer):
Search for \"text\": \".*? : .*?\", in Notepad++
Find and replace with \1~\2 to replace all : with ~
Manually correct the mistakes
You can do this:
find: ("(?:[^"]+|(?<=\\)")*")\s*:
replace: $1~
The idea is to capture the content inside double quotes, to put it in the replacement.
I use a lookbehind to allow escaped double quotes inside double quotes.
I would like to append _OLD to the end of each string that starts with SR_, placing it before the closing ' when the string is quoted and at the end of the string when it is not.
For example my text is the following:
SR_Apple
When the 'SR_APPLE' rotten, we must discard it.
I would like the find and replace to do:
SR_Apple_OLD
When the 'SR_APPLE_OLD' rotten, we must discard it.
I have tried (SR_*)+$.*(?='\s) based on what I have learned, but no luck so far. Please help. Thanks in advance.
For simple cases you should be able to use
Find: (\bSR_[\w]+)
Replace: $1_OLD
(\bSR_.+?)('|$) and $1_OLD$2 could also work if the text after SR_ is more complex
The lookahead you're using, (?='\s), only matches when the string is followed by a ', so it won't find the text that is not in quotes.
regex101 is a useful tool for debugging expressions
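As a quick check of the simple case, here is a small Python sketch using the same \bSR_\w+ idea on the two sample lines from the question. The names SR_NAME and mark_old are made up for illustration.

```python
import re

# \b keeps the match from starting in the middle of a longer word;
# \w+ stops at the closing quote because ' is not a word character.
SR_NAME = re.compile(r"\bSR_\w+")

def mark_old(text):
    """Append _OLD to every identifier that starts with SR_."""
    return SR_NAME.sub(lambda m: m.group(0) + "_OLD", text)

lines = ["SR_Apple", "When the 'SR_APPLE' rotten, we must discard it."]
converted = [mark_old(line) for line in lines]
for line in converted:
    print(line)
```

Note that running it a second time on the output would append _OLD again, since SR_Apple_OLD still matches the pattern.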
I have a file with rows like this:
"B4P(6-3,5)-VH(LF)(SN)",JST,2018+,34000,SMD
893D226X0016C8W,VISHAY,2018+,"30,000",SMD
BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN
"TLP721F(D4-GR,M,F)",NSC,2001+,114,AUCDIP-16
How can I find all commas inside quotes? For example, I need to find these:
"B4P(6-3 >>,<< 5)-VH(LF)(SN)",JST,2018+,34000,SMD
893D226X0016C8W,VISHAY,2018+,"30 >>,<< 000",SMD
BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN
"TLP721F(D4-GR >>,<< M >>,<< F)",NSC,2001+,114,AUCDIP-16
So far I can only match the text in quotes. How can I select only the commas inside it, using a single regular expression?
("(?:\[??[^\[]*?"))
Regex101 - online regex editor and debugger
Here is a simplistic solution that works with your example:
It matches only quoted strings that have one or more , inside.
grep '"[^,]*,[^"]*"'
Hope it works for you.
Explanation
"[^,]* match " and following non , chars
, match the first , char
[^"]*" match the following non-" chars up to and including the next "
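The grep above selects whole quoted strings rather than the individual commas. If a script is an option, one way to get at the commas themselves is to match each quoted field and then handle the commas inside it. The sketch below does that in Python and reproduces the >>,<< markers from the question. It assumes quotes are balanced on each line; QUOTED and mark_commas are names invented for this example.

```python
import re

# Assumption: every opening quote has a closing quote on the same line.
QUOTED = re.compile(r'"[^"]*"')

def mark_commas(line):
    """Wrap each comma that sits inside a quoted field in >>,<< markers."""
    return QUOTED.sub(lambda m: m.group(0).replace(",", " >>,<< "), line)

rows = [
    '"B4P(6-3,5)-VH(LF)(SN)",JST,2018+,34000,SMD',
    '893D226X0016C8W,VISHAY,2018+,"30,000",SMD',
    'BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN',
    '"TLP721F(D4-GR,M,F)",NSC,2001+,114,AUCDIP-16',
]
marked = [mark_commas(row) for row in rows]
for row in marked:
    print(row)
```

Commas outside quotes are never touched, because the replacement only runs on the text of each quoted match.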
I'm trying to clean a huge geoJson datafile. I need to change the format of "text" field from
"text": "(2:Placename,Placename)"
to
"text": "Placename".
In Sublime text I managed to write a regular expression which enabled me to select and remove the first part leaving something like this:
"text": "Placename)"
With the following regexp I can select the text above, but I need to narrow it down to the last character:
text\": \".*?\)
No matter what I try, I can't figure out how to select the ")" character at the end of the Placename string throughout the whole file and remove it. Note that "Placename" here can be any place name, like New York, London etc.
I tried to build an expression where first part finds the text field, then ignores n-amount of characters until it finds the ")" character.
After experimenting and Googling I couldn't find a solution here.
You can capture the value of the second placemark field with the following regexp:
/"text": "+\(\d+:[^,]+,(.*?)\)/
Which will capture "Placename" in $1
More info on capturing parentheses: http://www.regular-expressions.info/brackets.html
The trick is to use negated character classes (such as [^,]) and to escape any parentheses you want to match literally.
HTH
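To show the capture in action, here is a rough Python sketch that does the whole rewrite in one step. It is a variation on the regexp above rather than a copy of it: the key and the closing quote are captured too, so they can be put back in the replacement. PLACE and simplify_text are hypothetical names.

```python
import re

# Group 1: the key and opening quote. Group 2: the second place name.
# Group 3: the closing quote. The "(N:first," prefix and ")" are dropped.
PLACE = re.compile(r'("text": ")\(\d+:[^,]+,(.*?)\)(")')

def simplify_text(line):
    """Turn "text": "(2:Name,Name)" into "text": "Name"."""
    return PLACE.sub(r"\1\2\3", line)

before = '"text": "(2:Placename,Placename)"'
after = simplify_text(before)
print(after)
```

Lines that do not follow the "(digits:name,name)" shape are returned unchanged, so it should be safe to run over the whole file.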
I do not know if you are using a Unix system, but probably sed can do much of the work for you. It can interpret regular expressions, capture groups, and substitute by other groups of characters. I have tried an example with sed and the following sed command worked for me:
echo "\"text\": \"(2:Placename,Placename)\"" | sed -r 's/(\"text\": )\"\([[:digit:]]:[^0-9]+,([^0-9]+)\)\"/\1\"\2\"/g'
-r makes sed use extended regular expressions (this is the GNU sed spelling; -E is the equivalent on BSD/macOS), so parentheses form groups without needing backslashes. I am using parentheses to capture groups that I will use later in the substitution (e.g., a group for "text", and a group for the second placename). In the substitution part of sed, you can refer to a group with \n, where n is the number of the group you want to use. This expression should help you achieve your desired result.
In Notepad++, how do you Find and Insert (instead of Find and Replace) while using a regular expression as the search criteria?
For non regular expression, you can simply include what you are finding in the replace value, but for regular expression, that won't work. Ideas?
Very simple: if you need to add some text to every match of your search, you can use backreferences in regular expressions. For example, you have:
this is a table.
and you want to get "this is a red table",
so you do search for:
(this is a)
and replace with (in regular expression mode):
\1 red
Also note that we've used parentheses in our search. Each set of parentheses can be accessed in the replacement with the corresponding \N tag. So you can, for example, search for
(this is).*(table)
and replace it with
\1 not a \2
to get "this is not a table"
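The same two replacements behave identically in most regex engines. As a sanity check outside Notepad++, here is a short Python sketch of both (the variable names are only for this example):

```python
import re

sentence = "this is a table."

# \1 re-inserts the matched text, so the result is an insertion, not a replacement.
inserted = re.sub(r"(this is a)", r"\1 red", sentence)

# Two groups: keep both ends and put new text between them.
negated = re.sub(r"(this is).*(table)", r"\1 not a \2", sentence)

print(inserted)
print(negated)
```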
Dmitry Avtonomov answered it right but I just wanted to add in case you have something dynamic in between two strings.
Example:
Line 1: Question 1
Line 2: Question 2
If you just want to add a dot after each question number, you can do it this way.
In Notepad++
Replace : (Question)(.*)(\r\n)
With : \1\2.\3
Result:
Line 1: Question 1.
Line 2: Question 2.
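For comparison, a small Python sketch of the same insertion. It uses a slightly different pattern, assumed here for convenience: instead of capturing the \r\n, it captures everything up to the line break, so the last line is handled even when the file does not end with a newline. add_dots is a made-up name.

```python
import re

def add_dots(text):
    """Append '.' after 'Question ...' on every line."""
    # [^\r\n]* takes the rest of the line without consuming the line break.
    return re.sub(r"(Question[^\r\n]*)", r"\1.", text)

text = "Line 1: Question 1\r\nLine 2: Question 2"
dotted = add_dots(text)
print(dotted)
```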
Have you checked other posts?
Maybe this will help you get your answers:
Using regular expressions to do mass replace in Notepad++ and Vim
http://markantoniou.blogspot.com/2008/06/notepad-how-to-use-regular-expressions.html
Need some help in Notepad++
Example how it looks at the moment
http://www.test.com/doc/rat.rar">rat.rar
http://www.test.com/down/ung.rar">ung.rar
http://www.test.com/read/add.rar">add.rar
......
How I want it (just remove after ">....rar)
http://www.test.com/doc/rat.rar
http://www.test.com/down/ung.rar
http://www.test.com/read/add.rar
It's a list of about 1000 lines, so help would be appreciated.
Use the following expression:
">[^.]+\.rar
Explanation:
"> # literal `"` followed by literal `>`
[^.]+ # any character that is not a `.`, repeated at least once
\. # literal `.` character
rar # literal string `rar`
Note: a couple of other answers pointed out that just ">.* will work. This is true, because Notepad++ doesn't appear to support multi-line regular expressions, even with [\s\S]+. Either way will work so it's personal preference. The regex I gave in this answer is very verbose and would reduce the likelihood of false positives. ">.*, on the other hand, is shorter.
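If the list ever needs to be cleaned up in a script instead, the same expression carries over directly. A minimal Python sketch, with a $ anchor added (my assumption, not part of the expression above) so that only a label at the very end of the line is removed; TRAILING_LABEL and strip_label are invented names:

```python
import re

# Same pattern as above plus $, so only a trailing ">name.rar is removed.
TRAILING_LABEL = re.compile(r'">[^.]+\.rar$')

def strip_label(line):
    """Remove the trailing ">name.rar part from a link line."""
    return TRAILING_LABEL.sub("", line)

links = [
    'http://www.test.com/doc/rat.rar">rat.rar',
    'http://www.test.com/down/ung.rar">ung.rar',
    'http://www.test.com/read/add.rar">add.rar',
]
cleaned = [strip_label(link) for link in links]
for link in cleaned:
    print(link)
```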
In regexp mode, replace the pattern ">.* with an empty string.
">.*
Search for this and replace with nothing.
Your search string should be ">.+\.rar, and you can just blank out the replace box. This should do the job.
Also, check that you've got regex selected at the bottom of the replace box ;)
If you put this in find ".* and nothing in replace, that should do what you're looking for.
Remember to check that you've got regex selected at the bottom of the replace box.
Flick the "regular expression" radio button and then use this for your FIND:
">[a-z]+\.[a-z]+
Then leave the REPLACE field empty and replace all.
Use:
Find What : (.*)">(.*)
Replace With : \1
And check Regular expression option at the bottom of the dialog.