Remove everything except digits with Notepad++ (regex)

I want to remove everything except the digits in Notepad++ using a regular expression.
Can anyone help me with the pattern to use? It should take me
from
416385-creativelive-photo-week-2014-hd-full-day-5.html
416668-creativelive-photo-week-2014-hd-full-day-4.html
421733-creativelive-photo-week-2014-day-2.html
to
416385
416668
421733

According to this sentence:
I wish to remove everything except Digits in my notepad ++ with regular expression.
do:
Find what: \D+
Replace with: (leave empty)
This gives 416385201454166682014442173320142, but I'm pretty sure that's not what you want.
Another option is to keep the line breaks as well:
Find what: [^\d\r\n]+
Replace with: (leave empty)
This gives:
41638520145
41666820144
42173320142
Finally, according to your example, I guess you want:
Find what: ^(\d+).*$
Replace with: $1
NB: Make sure the ". matches newline" option is unchecked.
This gives:
416385
416668
421733
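For anyone who wants to check the three patterns outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module treats these simple patterns the same way Notepad++ does; note that Python writes the backreference as \1 where Notepad++ uses $1, and that re.MULTILINE is needed so ^ and $ match at every line.

```python
import re

# The three sample lines from the question.
text = (
    "416385-creativelive-photo-week-2014-hd-full-day-5.html\n"
    "416668-creativelive-photo-week-2014-hd-full-day-4.html\n"
    "421733-creativelive-photo-week-2014-day-2.html"
)

# 1) Remove every non-digit, line breaks included.
all_digits = re.sub(r"\D+", "", text)

# 2) Remove every non-digit except line breaks.
digits_per_line = re.sub(r"[^\d\r\n]+", "", text)

# 3) Keep only the number at the start of each line.
leading_number = re.sub(r"^(\d+).*$", r"\1", text, flags=re.MULTILINE)

print(all_digits)
print(digits_per_line)
print(leading_number)
```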

Related

Parse a string using regex in Notepad++

I am trying to parse out some data using a Notepad++ macro. Here is an example of the data I have:
<abcdefghkdadajsdkdjg><hhDate>2019-12-31 <dklajdlajdkjasd>
I want hhDate 2019-12-31 from the above data. I am very new to regex, so I haven't tried any patterns yet. I tried Notepad++'s select-and-delete techniques to remove the unnecessary text, but that didn't work out.
Any help is appreciated.
Thanks
This assumes each string is on its own line, because you have to match the whole line to remove the 'junk' and keep the good stuff. Find the start of the line (^), then find the first part you want to capture and wrap it in (), then find the second part and wrap it in (), then continue to the end of the line ($).
So in Notepad++, first get all the strings onto separate lines if they are not already. Then run find/replace with the 'Regular expression' search mode selected:
Find:
^.*?<.*?<(hhDate)>(\d+-\d+-\d+).*$
Replace:
$1 $2
https://regex101.com/r/BKha4m/1
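The same find/replace can be tried in Python as a rough check. This is only a sketch under the assumption that the pattern behaves the same in Python's re module; the replacement uses \1 and \2 instead of Notepad++'s $1 and $2.

```python
import re

# The sample line from the question.
line = "<abcdefghkdadajsdkdjg><hhDate>2019-12-31 <dklajdlajdkjasd>"

# Group 1 captures the tag name, group 2 captures the date;
# everything else on the line is matched and discarded.
pattern = r"^.*?<.*?<(hhDate)>(\d+-\d+-\d+).*$"

result = re.sub(pattern, r"\1 \2", line, flags=re.MULTILINE)
print(result)
```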
If you don't want the < before hhDate to be removed, try this shorter pattern:
Find what: \s<.*?>
Replace with: (leave empty)
Otherwise use this: \s<.*?><|<.*>
Uncheck "Match case".

What is the regular expression for fixed words and a variable

I am using Sublime Text to search and replace text in code.
This is what I want to find
pre + <anyvariable> + post
and then replace it like this
sanitize(<anyvariable>)
I don't know how to build the rest of the regexp after matching "pre + ".
I came up with something like this:
/(\pre \+)(?=.*)/g
Try this:
/^(.*\+)\s*(\S+)\s*(\+.*)$/
This gives you three capture groups: everything from the beginning of the line up to and including the first +, the token between the first and second +, and everything from the second + to the end of the line. If the line contains other content, such as more +'s, this probably won't work in every case.
I'm away from a computer and not sure about what you want, but if you search for:
pre \+ (.*?) \+ post
And replace with:
sanitize($1) (or \1, I never remember)
It should do what you want :-)
If the variable part may contain line breaks, replace the . in (.*?) with (?:.|\n), giving ((?:.|\n)*?)
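A quick way to sanity-check the second suggestion is a small Python sketch. The input line and the variable name user_input are made up for illustration, and Python's \1 stands in for the editor's $1 or \1.

```python
import re

# Hypothetical line of code to rewrite; "user_input" is a made-up name.
code = "query = pre + user_input + post"

# (.*?) lazily captures whatever sits between "pre + " and " + post".
result = re.sub(r"pre \+ (.*?) \+ post", r"sanitize(\1)", code)
print(result)
```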

How to use RegEx to add "_" between two words with notepad++

I want to use the Notepad++ replace option with a regular expression to accomplish this:
From: IntegrationName
To: Integration_Name
How can I do this?
My regex to search is: .[A-Z]
This finds: "nN"
But I don't know what to put in the replace box so that it only adds "_" between the "n" and the "N"...
Another solution using lookaround assertions would be:
(?<=[a-z])(?=[A-Z])
and replace with
_
Note: The "Match case" option needs to be active, otherwise Notepad++ will find a match between every two letters.
This regex will find every position where a lowercase is on the left and an uppercase is on the right.
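The effect of the "Match case" note can be illustrated with a short Python sketch. It assumes that Python's re.IGNORECASE flag is a fair stand-in for unchecking "Match case" in Notepad++.

```python
import re

word = "IntegrationName"

# Case-sensitive: the zero-width match sits only between a lowercase
# letter on the left and an uppercase letter on the right.
case_sensitive = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", word)

# Case-insensitive: [a-z] and [A-Z] both match any letter, so the
# position between every pair of adjacent letters matches.
case_insensitive = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", word, flags=re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```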
You can make use of capture groups. If I have to take your current attempt and edit it as little as possible, you would get:
(.)([A-Z])
This will store the match of . into $1 and the uppercase letter in $2, thus, you can use the following in the replace entry box:
$1_$2
I know you've accepted an answer, but when I ran it, I got "From: _Integration_Name".
Here's my idea:
(:\s)(.{1})([a-z]*)([A-Z]{1})
And use the following replace
$1$2$3_$4
I finally did it like this:
Find: ([a-z])([A-Z])
Replace with: $1_$2
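The difference between (.)([A-Z]) and ([a-z])([A-Z]) shows up as soon as a capital follows a space. Here is a minimal Python sketch, assuming case-sensitive matching and using \1 and \2 in place of $1 and $2:

```python
import re

line = "From: IntegrationName"

# (.) accepts any character, including the space before "I",
# so an underscore is also inserted in front of "Integration".
any_char = re.sub(r"(.)([A-Z])", r"\1_\2", line)

# ([a-z]) only accepts a lowercase letter before the capital.
lowercase_only = re.sub(r"([a-z])([A-Z])", r"\1_\2", line)

print(any_char)
print(lowercase_only)
```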

Replace multiple sentences between 2 expressions in multiple files with Notepad++

I have 58K files where I need to find this expression
()">A Random sentence.</A></P>
and I need to replace "A Random sentence." with nothing.
In Notepad++ I was trying something like
Find What: ()">[[:alnum:][:punct:][:space:]]</A></P>
Replace: <empty>
I'm not even getting results from the search...
Any feedback is welcome.
Try to find
(\(\)">).*(<\/A><\/P>)
and replace it with
$1$2
The idea is to capture the left and right parts by wrapping them in parentheses ().
The ".*" matches any run of characters in between.
In the replacement, $1 and $2 refer to the captured parts, so only the text between them is dropped.
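As a rough check of the capture-group approach, here is a Python sketch on the sample string from the question. It assumes the pattern behaves the same in Python's re module, with \1\2 standing in for $1$2.

```python
import re

# The sample fragment from the question.
fragment = '()">A Random sentence.</A></P>'

# Group 1 keeps the left delimiter, group 2 keeps the right one;
# the .* in between is matched but not written back.
result = re.sub(r'(\(\)">).*(<\/A><\/P>)', r"\1\2", fragment)
print(result)
```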
You can also try:
(?<=\(\)">)[a-z \.-]+(?=</A></P>)
Inside [a-z \.-] you list every character that the sentence may contain.
Also, literal parentheses must be escaped with \ in a Notepad++ regex.
This should work for you:
Find: (?<=\(\)">)A Random sentence.(?=<\/A><\/P>)
Replace: <empty>
If "A Random sentence." is not the actual sentence, you can replace the find pattern with:
(?<=\(\)">).*?(?=<\/A><\/P>)
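The lazy .*? matters when a line holds more than one such block. Here is a Python sketch on a made-up line (the HREF values and sentences are hypothetical, not taken from the real files):

```python
import re

# Hypothetical line containing two blocks; attribute values are invented.
line = (
    '<A HREF="javascript:x()">First sentence.</A></P>'
    '<A HREF="javascript:y()">Second sentence.</A></P>'
)

# Lazy: each match stops at the nearest </A></P>.
lazy = re.sub(r'(?<=\(\)">).*?(?=<\/A><\/P>)', "", line)

# Greedy: one match runs to the last </A></P> and swallows the second block.
greedy = re.sub(r'(?<=\(\)">).*(?=<\/A><\/P>)', "", line)

print(lazy)
print(greedy)
```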

Matching all occurrences of a html element attribute in notepad++ regex

I have a file which has hundreds of links like this:
<h3>aspnet</h3>
Ex 1
Ex 2
Ex 3
So I want to remove all the attributes
icon="..."
from all the lines. I went through the official Notepad++ regex wiki and have come up with this after several trials:
icon=\"[^\.]+\"
The problem with this is, it is selecting past the second double quote and stopping at the next occurring double quote. To illustrate, this will select the following content:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt...">EX 1</a> <a href="
If I modify the above regex to,
icon=\"[^\.]+\">
Then it is almost perfect, but it is also selecting the >:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt...">
The regex I am looking for would select like this:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt..."
I also tried the following, but it doesn't match anything at all:
icon=\"[^\.]+\"$
Just match anything but a quote, followed by a quote:
icon="[^"]+"
I just tested this with Notepad++ 6.2.2 and confirmed that it matches correctly as written.
Broken down:
icon="
This is fairly obvious, match the literal text icon=".
[^"]+
This means to match any character that is not a ". Adding the + after it means "one or more times."
Finally we match another literal ".
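To see the difference between [^.]+ and [^"]+ outside the editor, here is a Python sketch. The sample line is invented (the original links are not reproduced here), and the leading \s* in the removal pattern is an extra assumption, added only so the space before the attribute disappears as well.

```python
import re

# Hypothetical sample line; the attribute values are made up.
html = (
    '<a href="page1" icon="data:image/png;base64,AAAA">Ex 1</a> '
    '<a href="page2" icon="data:image/png;base64,BBBB">Ex 2</a>'
)

# [^.]+ is greedy and only stops at a dot, so it runs past the closing quote.
overshoot = re.search(r'icon="[^.]+"', html).group(0)

# [^"]+ cannot cross a double quote, so each match ends at the
# attribute's own closing quote.
attributes = re.findall(r'icon="[^"]+"', html)

# Remove each attribute together with the whitespace in front of it.
cleaned = re.sub(r'\s*icon="[^"]+"', "", html)

print(overshoot)
print(attributes)
print(cleaned)
```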
I am not a Notepad++ user, so I don't know how Notepad++ handles regex, but can you try replacing
icon=\"[^>]* with an empty string?
Try this solution:
I just checked that it works the way you wanted.
To achieve your goal:
Find what: (icon.*")|.*?
Replace with: $1