LookBehind - Find string occurring after a pattern - regex

I need help with a regex query:
C: - total 79.45 Gb - used: 33.82 Gb (43%) - free 45.63 Gb (57%)
This is my sample text. I want to find the percentage of disk space used, i.e. 43 in my case.
I am using a lookbehind to find what occurs after the "used" keyword.
This is the pattern I am using: (?<=(\bused))(.*?\(\d*%\)). But it gives me : 33.82 Gb (43%) as the output. I only need 43.
Can anyone please help?

Try capturing only the \d* part:
(\bused)(.*?\((\d*)%\))
Group 3, the inner (\d*), is the group you want.
Or you can make every other group non capturing, and get group 1:
(?:\bused)(?:.*?\((\d*)%\))

Using Python 3 you can write this:
import re
line = "C: - total 79.45 Gb - used: 33.82 Gb (43%) - free 45.63 Gb (57%)"
# Raw string so the backslashes reach the regex engine unchanged
re_findall = re.findall(r"used:.*\(([0-9]*)%\) -", line)  # ['43']
This is similar to what Sweeper says; you just capture only the number, here using Python's re module.

Regex to remove number and sign '/' before or after '/' if less than 20 in productfeed attribute

We have created a product feed from a Magento webshop using the Koongo Module. We want to submit this feed to a marketplace. Our customer does not fill in the size field consistently when adding products, so the sizes are in the wrong format for the marketplace. Size examples are: 39/6 or 40 / 6.5 or 8/42 or 8.5 / 42.5. In short, either first the European size and then the UK size, or first the UK size and then the European size. We want to display the size correctly in the feed using a regular expression, namely only the European size. In short, we no longer want to include either '/ UK size' or 'UK size /'. It is important that the / sign is also removed from the result in our product feed. Can you help?
We have the option to fill in two fields, namely 'Rewrite from' and 'Rewrite to'. We can use a regular expression for each of these fields.
Thanks in advance for the help.
It would be simple to only take the numbers >= 20. Does that help you?
[2-5][0-9](\.5)?
This will match all sizes between 20 and 59 with the option of half sizes
I am not familiar with Magento. If it supports regular expression replace with capture groups try this:
Rewrite From: ^.*?([234][0-9](\.5)?).*?$
Rewrite To: $1
Explanation:
^ - anchor at start of string
.*? - non-greedy scan
( - capture group start
[234][0-9] - number between 20 and 49
(\.5)? - optional .5 fraction
) - capture group end
.*? - non-greedy scan
$ - anchor at end of string
use capture group in replace: $1
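As a quick check of the pattern above, here is a minimal Python sketch run against the sizes from the question. It assumes the feed tool's replace behaves like Python's re.sub, which is not confirmed for Koongo. The helper name european_size is made up for illustration, and Python writes the replacement as \1 where the answer uses $1.

```python
import re

# Pattern from the answer above. Group 1 is the first two-digit
# number from 20 to 49, with an optional .5 half size.
SIZE_PATTERN = re.compile(r"^.*?([234][0-9](\.5)?).*?$")

def european_size(raw):
    """Hypothetical helper: keep only the European size from a raw size field."""
    # The pattern spans the whole string, so replacing the match with
    # group 1 drops the UK size and the slash in one step.
    return SIZE_PATTERN.sub(r"\1", raw)

for raw in ["39/6", "40 / 6.5", "8/42", "8.5 / 42.5"]:
    print(raw, "->", european_size(raw))
```

A value with no two-digit number in the 20 to 49 range does not match and is returned unchanged, so such rows would still need manual attention.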

Regular expression for matching a specific substring of a string

I have a log file that logs connection drops of computers in a LAN. I want to extract name of each computer from every line of the log file and for that I am doing this: (?<=Name:)\w+|(-PC)
The target text:
`[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: | (11/23 10:54:09 - 11/23 10:54:44) | Average limit (300) exceeded while pinging www.google.com [74.125.224.147] 8x
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: | (11/23 17:02:00) | Client is connected to agent.`
The problem is that some computer names have -PC in them and some don't. The expression I have created matches computers without -PC in their names, but if a computer has -PC in its name, it treats that as a separate match, and I don't want that. In short, it gives me 3 matches, but I want only 2. That's why I need help here; I am a beginner in regex.
You may use
(?<=Name:)\w+(?:-PC)?
Details
(?<=Name:) - a position immediately preceded by Name:
\w+ - 1+ word chars
(?:-PC)? - an optional non-capturing group that matches 1 or 0 occurrences of the -PC substring.
Consider using a word boundary if you need to match PC as a whole word:
(?<=Name:)\w+(?:-PC\b)?
See the regex demo.
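To see the effect of the optional group, here is a minimal Python sketch. The sample text is a shortened version of the two log lines from the question, and the variable names are chosen for illustration only. Python's re module accepts this lookbehind because Name: has a fixed width.

```python
import re

# Shortened copies of the two log lines from the question (assumption:
# the part after UserID does not matter for extracting the name).
log = """[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked)
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C"""

# The pattern has no capturing groups, so findall returns whole matches.
names = re.findall(r"(?<=Name:)\w+(?:-PC)?", log)
print(names)
```

Each line yields exactly one match, so the -PC suffix is no longer reported as a separate third match.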

Regex - Multiple matches, colon and empty spaces

I'm trying to get some RegEx working but fail slightly at my specific use case.
Given the following string for example
Device-1: P0_Node0_Channel0_Dimm0 size: 32 GB speed: 2133 MHz type: DDR4
I want to extract information from this string, preferably like this:
Device-1: P0_Node0_Channel0_Dimm0
size: 32 GB
speed: 2133 MHz
type: DDR4
So I experimented a bit and tested some expressions:
(.*?):\s
This works to some degree. It catches the first parameter name properly, but after that it gets confused by the spaces.
:\s(.*?)\s\w*?:!?
Although this catches the space inside the third parameter value, it only gives me the first and the third value, and no parameter names.
Does anyone have an idea how I could achieve the expected behaviour?
Note: I'm doing this in Excel VBA, not sure if all functions are supported there.
Thanks
You may use
([^\s:]+):\s*(.*?)\s*(?=[^\s:]+:|$)
See the regex demo
Details
([^\s:]+) - Group 1: one or more chars other than whitespace and :
: - a colon
\s* - zero or more whitespaces
(.*?) - Group 2: any 0+ chars other than line break chars, as few as possible, up to the first occurrence of...
\s* - zero or more whitespaces that are followed with...
(?=[^\s:]+:|$) - either one or more chars other than whitespace and : followed by a colon (i.e. the next parameter name), or the end of the string
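The question targets Excel VBA, so the following is only a sketch in Python of what the two groups capture on the sample string; the variable names are illustrative. The pattern uses a lookahead but no lookbehind, so it should port to engines that only support lookaheads, though that is not tested here.

```python
import re

text = "Device-1: P0_Node0_Channel0_Dimm0 size: 32 GB speed: 2133 MHz type: DDR4"

# Group 1 is the parameter name, group 2 the value. The lookahead stops
# the lazy value right before the next "name:" token or at the end.
PAIR = re.compile(r"([^\s:]+):\s*(.*?)\s*(?=[^\s:]+:|$)")

pairs = PAIR.findall(text)
for name, value in pairs:
    print(f"{name}: {value}")
```

Because the lookahead does not consume the next parameter name, each match starts exactly where the previous value ended, which is what lets values such as 32 GB keep their inner space.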
This is an AutoIt example:
#include <Array.au3>
Local $str = 'Device-1: P0_Node0_Channel0_Dimm0 size: 32 GB speed: 2133 MHz type: DDR4'
; Flag 3 returns an array of all global matches
$r_A = StringRegExp($str, '([\w-]*\:)\s*(\w*\s|.*)', 3)
ConsoleWrite(_ArrayToString($r_A, @CR) & @CRLF)

REGEX : Extract group of number where digits are more than 3

Hi, I have a question regarding regex.
This sounds very simple and I remember doing it but somehow it got deleted and I am finding it hard to get it back.
I want to extract group of numbers from one line.
If the count of digits > 3 - select that.
EG:
ga3rdparty/phpMyAdmin/i0ndex.php?&t0oken=abf540063shakk
This line can be different every time, but there will be only one group of digits with more than 2 digits.
OUTPUT: 540063
Thank you in advance
You can use \d{3,}, where 3 is the minimum number of digits. You can take a look at the following Python code:
import re
var = "ga3rdparty/phpMyAdmin/i0ndex.php?&t0oken=abf540063shakk"
# Match any run of three or more consecutive digits
pattern = re.compile(r'\d{3,}')
for match in pattern.findall(var):
    print(match)

Regex ordinal digit and file size

I have been thinking about a regular expression that can transform a list like this:
1. 10.Things.I.Hate.About.You[1999]DvDrip[Eng]-Ray 699.68 MB
2. 100.Feet.2008.DvDRip-FxM 701.14 MB
3. 11 - 14 1 286.22 MB
4. 13_going_on_30(2004)[Brizzly] 700.23 MB
...
1 523. Waz 699.93 MB
1 524. We.Own.the.Night[2007]DvDrip[Eng]-Ray 700.87 MB
1 525. Webs [2003]DVDRip[Xvid AC3[5.1]-RoCK&BlueLadyRG 1 347.70 MB
into:
10.Things.I.Hate.About.You[1999]DvDrip[Eng]-Ray,699.68
100.Feet.2008.DvDRip-FxM,701.14
11 - 14,1286.22
13_going_on_30(2004)[Brizzly],700.23
...
Waz,699.93
We.Own.the.Night[2007]DvDrip[Eng]-Ray,700.87
Webs [2003]DVDRip[Xvid AC3[5.1]-RoCK&BlueLadyRG,1347.70
Assumption : The filesize is never > 9999.99MB
So far I have a partially working regex:
^[^\.]+\. (.+?) (?:([0-9])(?: ))?([0-9]+\.[0-9]{2}) MB.*$
that maps to
$1:$2$3
to complete the transformation.
I used the colon because no desktop OS would allow that in a filename, so I am safe.
I built the regex without any formal method (i.e. by intuition), and that very same intuition tells me this regex is horrifically complicated and slow!
I wish RegExBuddy had an online version or something similar.
How do I build a better regex for this? Hints, tips...
I use The Regex Coach.
In Perl:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
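while ( <DATA> ) {
    no warnings 'uninitialized';    # $2 is undef when there is no thousands digit
    # The dot before the decimals is escaped so it only matches a literal "."
    next unless /^[^.]+\. (.+?) (?:(\d+) )?(\d+(?:\.\d+)?) MB$/ ;
    print "$1,$2$3\n";
}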
__DATA__
1. 10.Things.I.Hate.About.You[1999]DvDrip[Eng]-Ray 699.68 MB
2. 100.Feet.2008.DvDRip-FxM 701.14 MB
3. 11 - 14 1 286.22 MB
4. 13_going_on_30(2004)[Brizzly] 700.23 MB
...
1 523. Waz 699.93 MB
1 524. We.Own.the.Night[2007]DvDrip[Eng]-Ray 700.87 MB
1 525. Webs [2003]DVDRip[Xvid AC3[5.1]-RoCK&BlueLadyRG 1 347.70 MB
Output:
C:\Temp> zcx
10.Things.I.Hate.About.You[1999]DvDrip[Eng]-Ray,699.68
100.Feet.2008.DvDRip-FxM,701.14
11 - 14,1286.22
13_going_on_30(2004)[Brizzly],700.23
Waz,699.93
We.Own.the.Night[2007]DvDrip[Eng]-Ray,700.87
Webs [2003]DVDRip[Xvid AC3[5.1]-RoCK&BlueLadyRG,1347.70
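For comparison, the same transformation can be sketched in Python. This is an illustration rather than the asker's or the answerer's code: the names ROW and to_csv are made up, and the pattern assumes, as the question does, that the size has two decimals and at most one space-separated thousands digit.

```python
import re

# ordinal up to the first ".", a space, the name (lazy), an optional
# single thousands digit, then the size with two decimals and " MB".
ROW = re.compile(r"^[^.]+\. (.+?) (?:(\d) )?(\d+\.\d{2}) MB$")

def to_csv(line):
    """Hypothetical helper: return 'name,size' or None if the line does not fit."""
    m = ROW.match(line)
    if m is None:
        return None
    name, thousands, size = m.groups()
    # thousands is None when the size is below 1000 MB
    return f"{name},{thousands or ''}{size}"

rows = [
    "1. 10.Things.I.Hate.About.You[1999]DvDrip[Eng]-Ray 699.68 MB",
    "3. 11 - 14 1 286.22 MB",
    "...",
    "1 525. Webs [2003]DVDRip[Xvid AC3[5.1]-RoCK&BlueLadyRG 1 347.70 MB",
]
for row in rows:
    converted = to_csv(row)
    if converted is not None:
        print(converted)
```

Returning None for lines that do not fit the layout mirrors the next unless in the Perl version, so filler lines such as ... are skipped rather than mangled.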
"I used the colon because no desktop OS would allow that in a filename, so I am safe."
Nice try. It is allowed under GNU/Linux.
More importantly, you have only given examples. You have not described what the regex is intended to do. You also have obviously pointless constructs, like (?: ), which could just be a single space.
Finally, it's unclear what role the colon actually plays, as it's not in your replacement text. Perhaps it would help if you told us what language you're using.