How to remove a specific text with multiple slashes by Regex - regex

I am quite new at Regex and would like to remove the following text:
1/10 2/10 3/10 4/10 5/10 6/10 7/10 8/10 9/10 10/10
I was thinking something like:
/1(.*)10(.*)2(.*)10(.*)3(.*)10(.*)10/s
but this doesn't seem to do the trick. It does remove the text, but it removes some other things too. Some images also contain numbers, so the match starts at the number in the image and removes everything from there on.
So what I am looking for is a way to remove only the exact text above.

You have a couple of problems here.
1) You are matching multiple characters with .* when there's only one character there (either a slash or a space). You could simply use a . to match a single character.
2) You don't even need to do that. Why not use a literal, escaped slash \/ and space respectively?

If you want to remove that exact text, I suggest using string.Replace instead of using regular expressions... that is if you're using a language with a string replace function.
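As a minimal sketch of that suggestion in Python (the sample input is made up for illustration): a plain string replacement removes only the exact text, and if a regex is required anyway, re.escape turns every character of the target into a literal so nothing like .* can run away over unrelated numbers.

```python
import re

# The exact text to remove, copied from the question.
TARGET = "1/10 2/10 3/10 4/10 5/10 6/10 7/10 8/10 9/10 10/10"


def remove_literal(text: str) -> str:
    """Remove TARGET with a plain string replacement (no regex involved)."""
    return text.replace(TARGET, "")


def remove_with_regex(text: str) -> str:
    """Same result via a regex; re.escape makes every character literal."""
    return re.sub(re.escape(TARGET), "", text)


# Hypothetical input: the image name contains digits that must survive.
sample = "img_1210.png 1/10 2/10 3/10 4/10 5/10 6/10 7/10 8/10 9/10 10/10 end"
print(remove_literal(sample))
```

Either function leaves the digits in the image name alone, because only the full literal sequence can match.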

Thanks for the help! As I mentioned, I am new at Regex, so please forgive my terminology.
Anyway, I have matched the text with /1.10.2.10.3.10.4.10.5.10.6.10.7.10.8.10.9.10.10.10/ and replaced it with a blank field, and that has done the trick!
Thanks for the hints and the support, it is really appreciated!

Related

RegEx for Google Analytics that picks text within urls

I am trying to build a RegEx that picks urls that end with "/topic". These urls have a different number of folders so whereas one might be www.example.com/pijamas/topic another could be www.example.com/pijamas/strippedpijamas/topic
What regular expression can I use to do that? My attempt is ^www.example.com/[a-zA-Z][1,]/topic$ but this hasn't worked. Even if it worked I'd like to have a shorter RegEx to do this really.
Any help on this would be much appreciated.
Thank you, A.
Try this:
^www\.example\.com\/[\w\/]*topic$
You need to make a few changes to your regex. Firstly, the dot (.) is a special character and needs to be escaped by prefacing it with a backslash.
Secondly, you probably meant {1,} instead of [1,] – the latter defines a character class. You can substitute {1,} with +.
Then there's the fact that your second URL has one more subdirectory, so you need to somehow incorporate a / into your regex.
Putting all this together:
^www\.example\.com/[a-zA-Z]+(/[a-zA-Z]+)*/topic$
To shorten it, you can use the i option to match regardless of case, cutting down the two [a-zA-Z] to [a-z].
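A quick way to check the pattern outside Google Analytics is a small Python sketch. This assumes Python's re module behaves like the target engine for this simple pattern; re.IGNORECASE stands in for the i option, and the function name is made up for illustration.

```python
import re

# Pattern from the answer above, shortened with the case-insensitive flag.
TOPIC_URL = re.compile(r"^www\.example\.com/[a-z]+(/[a-z]+)*/topic$", re.IGNORECASE)


def is_topic_url(url: str) -> bool:
    """Return True when the URL has one or more folders and ends with /topic."""
    return TOPIC_URL.match(url) is not None


print(is_topic_url("www.example.com/pijamas/topic"))
print(is_topic_url("www.example.com/pijamas/strippedpijamas/topic"))
print(is_topic_url("www.example.com/pijamas/topic/more"))
```

Note that this pattern requires at least one folder before /topic, so www.example.com/topic on its own is rejected.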

Remove digits after decimal notepad++

Basically all over the document I have values like
2014-01-23 15:09:31.879958
I want to remove the last 6 digits and the . using find and replace. I've gotten
(\d{6})
To find the 6 digits but I also need it to find the . so I can replace it with nothing
Try: \.\d{6} - the \. escapes the dot.
In this case, you should be able to simply add a period to your find/replace.
As a test, I copied your example multiple times in a document. I then attempted to Find the following: \.(\d{6}) and replace with a blank
Give that a try and see if that works for you.
Cheers,
Edited to add the slash that I apparently didn't type. Silly.
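For anyone doing the same outside Notepad++, here is a minimal Python sketch of the same find/replace. The function name is illustrative, and the pattern is the one suggested above.

```python
import re


def strip_microseconds(text: str) -> str:
    """Remove a literal dot followed by exactly six digits."""
    return re.sub(r"\.\d{6}", "", text)


print(strip_microseconds("2014-01-23 15:09:31.879958"))
# prints: 2014-01-23 15:09:31
```

The backslash is what matters: without it the dot would match any character, not just the decimal point.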

Regex Match between brackets (...)

I'm trying to grab 2 items from a simple line.
[Title](Description)
EDIT: The second item is actually a URL. I called it a description because I want it displayed, not actually parsed.
[Trivium](https://www.youtube.com/user/trivium)
Grabbing the text between the brackets (...) doesn't seem to work at all for me. I've googled and found several variations, with no luck. Thanks in advance :)
EDIT:
Tried the following:
[(.+?)]\((.*)\)
[(.+?)]\([^\(\r\n]*\)
[(.+?)]((.+?))
and a couple more I can't find again
The first regex you listed almost has it right. Try using this regex instead:
\[(.+?)\]\((.*)\)
As @PM 77-1 pointed out, you need to escape the square brackets by placing a backslash in front of them. The reason is that square brackets are regex metacharacters, i.e. characters with a special meaning: they define a character class, which matches any single character listed inside it.
Your original regex [(.+?)]\((.*)\) is actually doing this:
[(.+?)] match exactly one character out of ( . + ? )
\((.*)\) match (anything), i.e. anything contained in parentheses
So this regex would match .(stuff) but would not match [Title](Description), which is what you really want.
Here is a link where you can test out the working regex:
Regex 101
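Since the question asks for two items, here is a small Python sketch that wraps both the title and the URL in capturing groups (an assumption on my part; the helper name is made up). It also shows that the unescaped version from the question does not match the sample line.

```python
import re
from typing import Optional, Tuple

# Escaped square brackets are literal; both parts are captured.
LINK = re.compile(r"\[(.+?)\]\((.*)\)")


def parse_link(line: str) -> Optional[Tuple[str, str]]:
    """Return (title, target) for a [title](target) line, or None."""
    match = LINK.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_link("[Trivium](https://www.youtube.com/user/trivium)"))

# The unescaped pattern treats [(.+?)] as a character class, so it fails here.
print(re.search(r"[(.+?)]\((.*)\)", "[Title](Description)"))
```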

Notepad++ masschange using regular expressions

I am having trouble performing a mass change in a huge logfile.
Apart from the file size, which is causing issues for Notepad++, I have a problem using more than 10 parameters in a replacement; up to 9 it works fine.
I need to change numerical values in a file where these values are enclosed in quotation marks, with a leading and a trailing comma: ,"123,456,789,012.999",
I used this exp to find and replace the format to:
,123456789012.999, (so that there are no quotation marks and no comma within the num.value)
The exp used to find is:
([,])(["])([0-9]+)([,])([0-9]+)([,])([0-9]+)([,])([0-9]+)([\.])([0-9]+)(["])([,])
and the exp to replace is:
\1\3\5\7\9\10\11\13
The problem is that parameters \11 and \13 are not working (the characters, e.g. .999 in the example, do not appear in the changed values).
So now the question is: is there any limit on the number of parameters?
It seems to me that it's not working above 10. For shorter numeric values, where I only need up to 9 parameters, the search and replacement strings work fine. For the example above the search works but the replacement does not: the end of the changed value gets corrupted.
Also, it occurred to me that instead of using Notepad++ I could maybe change the logfile directly on the Unix server; however, I had trouble building the correct Perl syntax. Could anyone help with that?
After having a little play myself, it looks like back-references \11-\99 are invalid in Notepad++ (which is not that surprising, since this is commonly omitted from regex languages). However, there are several things you can do to improve that regular expression, in order to make this work.
Firstly, you should consider using fewer groups, or alternatively non-capturing groups. Did you really need to store 13 variables in that regex in order to do the replacement? Clearly not, since you're not using several of them!
To put it simply, you could just remove some brackets from the regex:
[,]["]([0-9]+)[,]([0-9]+)[,]([0-9]+)[,]([0-9]+)[.]([0-9]+)["][,]
And replace with:
,\1\2\3\4.\5,
...But that's not all! Why are you using square brackets to say "match anything inside", if there's only one thing inside?? We can get rid of these, too:
,"([0-9]+),([0-9]+),([0-9]+),([0-9]+)\.([0-9]+)",
(Note I added a "\" before the ".", so that it matches a literal "." rather than "anything".)
Also, although this isn't a big deal, you can use "\d" instead of "[0-9]".
This makes your final, optimised regex:
,"(\d+),(\d+),(\d+),(\d+)\.(\d+)",
And replace with:
,\1\2\3\4.\5,
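The question also asked about doing this on the server instead of in Notepad++. I can't vouch for the Perl one-liner, but as a rough sketch the same five-group replacement looks like this in Python (the function name and sample line are made up for illustration):

```python
import re

# Five capture groups: four comma-separated digit runs and the decimals.
QUOTED_NUMBER = re.compile(r',"(\d+),(\d+),(\d+),(\d+)\.(\d+)",')


def unquote_numbers(text: str) -> str:
    """Drop the quotes and the thousands separators inside each quoted value."""
    return QUOTED_NUMBER.sub(r",\1\2\3\4.\5,", text)


print(unquote_numbers('id,"123,456,789,012.999",ok'))
# prints: id,123456789012.999,ok
```

With only five groups there is no need for a two-digit back-reference at all, which sidesteps the original problem.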
I'm not sure whether regex groups have a limit, but you could use lookarounds to save 2 groups, and you could also merge some groups in your example. But first, let's get rid of some useless character classes:
(,)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
We could merge those groups:
(,)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
                                       ^^^^^^^^^^^^^^^^^^^^
We get:
(,)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(,)
Let's add lookarounds:
(?<=,)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(?=,)
The replacement would be \2\4\6\8 (the lookarounds leave the surrounding commas untouched).
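To see the lookaround idea in action, here is a hedged Python sketch. It assumes the leading delimiter is a comma, as in the find expression from the question, and the function name is made up. The lookbehind and lookahead only check for the commas without consuming them, so they do not need to be put back in the replacement.

```python
import re

# (?<=,) and (?=,) assert the surrounding commas without capturing them.
WITH_LOOKAROUNDS = re.compile(
    r'(?<=,)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(?=,)'
)


def unquote_with_lookarounds(text: str) -> str:
    """Keep only the digit groups (2, 4, 6) and the merged tail (8)."""
    return WITH_LOOKAROUNDS.sub(r"\2\4\6\8", text)


print(unquote_with_lookarounds('id,"123,456,789,012.999",ok'))
# prints: id,123456789012.999,ok
```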
If the numbers always have a fixed length, it's fairly simple to do what you have done. Even though your expression is poorly written, it does the job. If this is the case, look at Tom Lord's answer.
I played around with it a little bit myself, and I would probably use two expressions, which makes it much easier. If you have to do it in one, this would work, but it is pretty unsafe:
(?:"|(\d+),)|(\.\d+)"(?=,) replace by \1\2
Live demo: http://regex101.com/r/zL3fY5

how to eliminate dots from filenames, except for the file extension

I have a bunch of files that look like this:
A.File.With.Dots.Instead.Of.Spaces.Extension
Which I want to transform via a regex into:
A File With Dots Instead Of Spaces.Extension
It has to be in one regex (because I want to use it with Total Commander's batch rename tool).
Help me, regex gurus, you're my only hope.
Edit
Several people suggested two-step solutions. Two steps really make this problem trivial, and I was really hoping to find a one-step solution that would work in TC. I did, BTW, manage to find a one-step solution that works as long as there's an even number of dots in the file name. So I'm still hoping for a silver bullet expression (or a proof/explanation of why one is strictly impossible).
It appears Total Commander's regex library does not support lookaround expressions, so you're probably going to have to replace a number of dots at a time, until there are no dots left. Replace:
([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$
with
$1 $2 $3.$4
(Repeat the sequence and the number of backreferences for more efficiency. You can go up to $9, which may or may not be enough.)
It doesn't appear there is any way to do it with a single, definitive expression in Total Commander, sorry.
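To illustrate the repeat-until-done idea, here is a small Python simulation. It is only a sketch: Total Commander writes back-references as $1 where Python uses \1, and the loop stands in for re-running the rename by hand.

```python
import re

# Same pattern as above: the last four dot-separated parts of the name.
STEP = re.compile(r"([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$")


def dots_to_spaces_stepwise(name: str) -> str:
    """Apply the replacement repeatedly until the name stops changing."""
    while True:
        new_name = STEP.sub(r"\1 \2 \3.\4", name, count=1)
        if new_name == name:
            return name
        name = new_name


print(dots_to_spaces_stepwise("A.File.With.Dots.Instead.Of.Spaces.Extension"))
# prints: A File With Dots Instead Of Spaces.Extension
```

Each pass turns two dots into spaces, so with this particular pattern a name that has an even number of dots ends up with one stray dot left over (for example a.b.c.d.e becomes a.b c d.e).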
Basically:
/\.(?=.*?\.)//
will do it in pure regex terms. This means, replace any period that is followed by a string of characters (non-greedy) and then a period with nothing. This is a positive lookahead.
In PHP this is done as:
$output = preg_replace('/\.(?=.*?\.)/', '', $input);
Other languages vary but the principle is the same.
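For comparison, a Python sketch of the same lookahead. Note one deliberate difference: it substitutes a space rather than an empty string, since the desired output in the question has spaces. The function name is illustrative.

```python
import re


def dots_to_spaces(name: str) -> str:
    """Replace every dot that still has another dot somewhere after it."""
    return re.sub(r"\.(?=.*?\.)", " ", name)


print(dots_to_spaces("A.File.With.Dots.Instead.Of.Spaces.Extension"))
# prints: A File With Dots Instead Of Spaces.Extension
```

The last dot is never followed by another dot, so the lookahead fails there and the extension separator is kept.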
Here's one based on your almost-solution:
/\.([^.]*(\.[^.]+$)?)/\1/
This is, roughly, "any dot stuff, minus the dot, and maybe plus another dot stuff at the end of the line." I couldn't quite tell if you wanted the dots removed or turned to spaces - if the latter, change the substitution to " \1" (minus the quotes, of course).
[Edited to change the + to a *, as Helen's below.]
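Here is a quick Python check of this lookaround-free pattern, using the " \1" substitution so the dots become spaces. This is a sketch, not something tested in Total Commander, and the function name is made up.

```python
import re


def dots_to_spaces_no_lookaround(name: str) -> str:
    """Each match is a dot plus the text after it; the final match also
    swallows the last dot and the extension, so that dot survives in \\1."""
    return re.sub(r"\.([^.]*(\.[^.]+$)?)", r" \1", name)


print(dots_to_spaces_no_lookaround("A.File.With.Dots.Instead.Of.Spaces.Extension"))
# prints: A File With Dots Instead Of Spaces.Extension
```

One caveat I noticed while trying it: a name with only a single dot, such as a.ext, has that dot replaced too, because there is no earlier dot for the optional group to attach to.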
Or substitute all dots with space, then substitute [space][Extension] with .[Extension]
A.File.With.Dots.Instead.Of.Spaces.Extension
to
A File With Dots Instead Of Spaces Extension
to
A File With Dots Instead Of Spaces.Extension
Another pattern to find all dots but the last in a (windows) filename that I've found works for me in Mass File Renamer is:
(?!\.\w*$)\.
I don't know how useful that is to other users, but this page was an early search result and if that had been on here it would have saved me some time.
It excludes the result if it's followed by an uninterrupted sequence of alphanumeric characters leading to the end of the input (filename) but otherwise finds all instances of the dot character.
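A minimal Python sketch of the same negative lookahead, assuming the tool's regex flavour treats it the way Python's re module does (the function name is illustrative):

```python
import re


def dots_to_spaces_negative(name: str) -> str:
    """Replace every dot except one followed only by word characters to the end."""
    return re.sub(r"(?!\.\w*$)\.", " ", name)


print(dots_to_spaces_negative("A.File.With.Dots.Instead.Of.Spaces.Extension"))
# prints: A File With Dots Instead Of Spaces.Extension
```

Because the check uses \w, an extension containing other characters (a hyphen, say) would not count as the final segment, so its dot would be replaced as well.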
You can do that with Lookahead. However I don't know which kind of regex support you have.
/\.(?=.*\.)//
This roughly translates to: any dot (\.) that has something and then another dot after it. Obviously the last dot is the only one that does not comply. I leave out the "optionality" of something between dots, because the data looks like something will always be in between and the "optionality" has a performance cost.
Check:
http://www.regular-expressions.info/lookaround.html