Notepad++ regex to replace a defined sort of data - regex

I want to replace text in a huge text file using Notepad++. I don't know how to replace text only if its length falls within a range, for example 50-100 characters. As far as I know, the regex should look like [a-zA-Z0-9 -+]{50,100}, but it doesn't work in Notepad++. I'm not a regex specialist.
Example input:
<a>short text</a>
<a>veeeeryyyyy lloooooonnngggg teeeexxxtttt</a>
Expected output:
<a>short text</a>
<a>shrt txt</a>

Even better would be [^<>]{15,100}, so that it replaces everything between the tags.
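
A minimal sketch of the length-limited replacement, written in Python's re module rather than Notepad++ itself. It assumes the tag is always <a> and that "shrt txt" (taken from the expected output) is the fixed replacement text. Two details may explain the original failure: inside [a-zA-Z0-9 -+] the sequence " -+" is read as a range from space to plus, and the long sample text is well under 50 characters, so {50,100} cannot match it. The same find pattern and a \1shrt txt\2 replacement should carry over to Notepad++'s "Regular expression" search mode, though that is not verified here.

```python
import re

# Sample input from the question.
text = "<a>short text</a>\n<a>veeeeryyyyy lloooooonnngggg teeeexxxtttt</a>"

# Groups 1 and 2 keep the tags; only runs of 15 to 100 non-tag characters
# between them are replaced. Shorter content such as "short text" is skipped.
pattern = re.compile(r"(<a>)[^<>]{15,100}(</a>)")

# "shrt txt" is the replacement text shown in the expected output.
result = pattern.sub(r"\1shrt txt\2", text)
print(result)
```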

Related

replace text with regular expression keeping structure match on sublime text

I've been trying a few options, but I can't figure it out.
This is the text I'm looking for:
php[whatever_is_in_between]myfunction
and I want to change it to this:
=[whatever_is_in_between]myfunction
where [whatever_is_in_between] = \n or \n\t or \n\t\t or \n[bare space] or \n[bare space]\t and so on
So I found this regexp that matches the search text:
php[\n]*[\t]*[ ]*myfunction\(
this is the "replace with" text:
=[\n]*[\t]*[ ]*myfunction\(
But the regexp does not work in the replacement; it is inserted as literal text.
Can anybody help me with this?
Thanks
I think the problem is that you are not using a capturing group ( ). Among other things, a capturing group lets you take input from the matched text and inject it into your replacement text.
I'd use a search pattern like this:
php\[([^\]]*)\]([\w\W]*)
It looks complicated, but I've set up a sample on Regex 101 that you can check out. The replacement text should look something like this:
[\1]\2
Please note that how you insert a capturing group will depend on what programming language you're using. The above should work for php.
I hope that helps,
--Jonathan
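
As a hedged illustration of the capturing-group idea applied to the whitespace case the question describes (not the bracketed pattern above), here is a small Python sketch. The sample string is made up for the demonstration. In Sublime Text the equivalent would presumably be a find pattern of php(\s*)myfunction\( with a replacement of =$1myfunction(, but that is an assumption, not something tested here.

```python
import re

# Hypothetical sample: "php", then a newline and two tabs, then the call.
source = "<?php\n\t\tmyfunction(1);"

# (\s*) captures whatever run of newlines, tabs, and spaces sits in between,
# so the replacement can put it back unchanged with \1.
result = re.sub(r"php(\s*)myfunction\(", r"=\1myfunction(", source)
print(repr(result))
```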

multi-line xml regex in sublime

I have a large logfile (+100 000 lines) in XML like so:
<container>
<request:getApples xml="...">
...
</request:getApples>
<request:getOranges xml="...">
...
</request:getOranges>
</container>
...
I want to extract the :getXXXX part to
getApples
getOranges
by doing a regex find & replace in Sublime Text 2.
Something like
Find: [^(request:)]*(.*) xml
Replace: $1\n
Any regex masters that can assist?
Correcting mart1n's answer and actually using ST2 and your sample input, I came up with the following:
First, press Ctrl+A to select all. Then press Ctrl+H and enter:
Search: .*?(get\w+) .*
Replace: $1
Replace All
Then,
Search: ^[^get].*$
Replace: nothing
Replace All
Finally,
Search: ^\n
Replace: nothing
Replace All
And you're left with:
getApples
getOranges
I'm not familiar with Sublime Text, but you can do it in two parts:
Find .*?\(get\w+\) .* and replace with \1. Now those get* strings are on separate lines with nothing else. All that remains is to remove the cruft.
So, many ways to do this. Easy one: find ^[^g][^e][^t].*$ and replace with nothing (an empty string).
Now you have your file that contains just the string you want and some empty lines, which (I hope) Sublime can get rid of with some delete-empty-lines function.
You can quickly throw all of the above in a macro and execute at will for any input following the same format ;-)
If you're willing to take the problem out of sublime text, you can use the dotall flag along with lazy matching to extract only the getXXX parts.
Replacing
.*?(get\w*) .*?
with
$1\n
should get you most of the way, leaving only a few easily removable closing tags at the end of the file that I can't figure out at present.
You can check this solution here.
Maybe someone could take this and figure out a way to remove the extra closing tags.
Try this
Find what: :(\w+)>|.\s?
Replace with: $1
If it doesn't work as intended, let me know.
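
If taking the problem out of the editor is acceptable, the extraction can also be sketched as a single pass in Python instead of several find-and-replace rounds. This is only an illustration on the sample input from the question; it assumes every wanted name follows an opening <request: tag. It also sidesteps the line-filter patterns above, where [^get] is a character class (any one character other than g, e, or t) rather than a test for the word "get".

```python
import re

# Sample log from the question.
log = """<container>
<request:getApples xml="...">
...
</request:getApples>
<request:getOranges xml="...">
...
</request:getOranges>
</container>"""

# "<request:" matches only opening tags; closing tags start with "</request:"
# and are skipped because of the slash.
names = re.findall(r"<request:(\w+)", log)
print("\n".join(names))
```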

How can I find a regex for separating strings

I have this file
xorg-fonts-misc-1.0b-1
Xorg-font-bitstream-75dpi-1.0.0-2.i386
Xorg-font-bitstream-100dpi-1.2a-2.arm
Other-Third-Party-1.2.2-1-any
I want to separate the name from the version and get output like this
xorg-fonts-misc- 1.0b-1
Xorg-font-bitstream-75dpi- 1.0.0-2.i386
Xorg-font-bitstream-100dpi- 1.2a-2.arm
Other-Third-Party- 1.2.2-1-any
I tried this
-[^a-zA-Z][0-9\.\w-]+[^a-zA-Z][\w-]*?[\d\w]*\n
This will put your text into two matching groups. I put a space between the two groups, but you can put tabs or whatever else in there if you want.
/^(.*?)((\d[a-z]?\.)+.*)$/\1 \2/gmi
regex101 is a great place to test out regexes. That link has the regex and your test input, and gives a full explanation of how the regex works.
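
For reference, the substitution above can be tried outside regex101 with a short Python sketch. The g, m, and i flags are assumed to map to a global re.sub with re.MULTILINE and re.IGNORECASE; the input is the file from the question.

```python
import re

# Package names from the question.
packages = """xorg-fonts-misc-1.0b-1
Xorg-font-bitstream-75dpi-1.0.0-2.i386
Xorg-font-bitstream-100dpi-1.2a-2.arm
Other-Third-Party-1.2.2-1-any"""

# Group 1 lazily takes the name; group 2 starts at the first digit
# (optionally followed by one letter) that is directly followed by a dot.
pattern = re.compile(r"^(.*?)((\d[a-z]?\.)+.*)$", re.MULTILINE | re.IGNORECASE)

# Re-emit both groups with a single space between them.
separated = pattern.sub(r"\1 \2", packages)
print(separated)
```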

Replace text with regular expressions in a text editor

I need to edit lines in a text file.
The text files contains more than 100 lines of data in the below format.
Cosmos Rh Us (Paperback) $10.99 Shipped:
The Good Earth (Paperback) $6.66 Shipped:
BEST OF D.H. LAWRENCE (Paperback) $7.89 Shipped:
...
These are excerpts from the online book shop I use to buy books
I have this data in a text editor. How do I edit it [Find/Replace] so that the data becomes like this
$10.99
$6.66
$7.89
or better, without the dollar sign, since that will make it easy to total.
I use notepad++ as text editor.
Search for (don't forget to enable regular expressions in the replace box!)
^.*\$(\d+\.\d+).*$
and replace all with
\1
You could simply match full lines and capture all numbers after the $ sign:
Find what: ^[^$]*\$(\d+\.\d+).*$
Replace with: $1
Make sure that you don't check the ". matches newline" option. And note that this will behave unexpectedly if you have multiple $ signs in a line.
You might need to update to Notepad++ 6. Before that some regex features were not working properly.
Find:
((?<=\$)[\d\.]+)
Replace With:
\1 or $1 (whichever Notepad++ uses)
The first regex, to be replaced with nothing:
[a-zA-Z0-9].*\)
The second regex, also to be replaced with nothing:
[a-zA-Z]+\:
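
Since the goal is to total the prices, a small Python sketch may be a useful alternative to editing the file by hand. It assumes each line holds exactly one price in the form $N.NN, as in the excerpt from the question, and uses Decimal to avoid floating-point rounding in the sum.

```python
import re
from decimal import Decimal

# Excerpt from the question.
lines = """Cosmos Rh Us (Paperback) $10.99 Shipped:
The Good Earth (Paperback) $6.66 Shipped:
BEST OF D.H. LAWRENCE (Paperback) $7.89 Shipped:"""

# Capture only the digits after each dollar sign.
prices = [Decimal(p) for p in re.findall(r"\$(\d+\.\d+)", lines)]
total = sum(prices)

for price in prices:
    print(price)
print("Total:", total)
```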

How to write a regex for [mm:ss.SS] text <mm:ss.SSS> text

How can I write a regex to extract the time from something like the below:
Example:
[00:16.150]Why <00:16.600>do you <00:16.800>look <00:17.150>so <00:17.600>glum?
\[\d{2}:\d{2}\.\d{3}\](?:.*?<\d{2}:\d{2}\.\d{3}>)*.*
should match (if I haven't made a mistake), but I don't know what you want to do with the matched text.
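
If the aim is to pull out the individual time values rather than match the whole line, one possible approach is sketched below in Python. It assumes every timestamp has the mm:ss.SSS shape seen in the example and is wrapped in either square or angle brackets.

```python
import re

# Example line from the question.
lyric = "[00:16.150]Why <00:16.600>do you <00:16.800>look <00:17.150>so <00:17.600>glum?"

# An opening "[" or "<", the captured timestamp, then a closing "]" or ">".
times = re.findall(r"[\[<](\d{2}:\d{2}\.\d{3})[\]>]", lyric)
print(times)
```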