Conditional regex for percentage-based values [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed last year.
I've never been very good at regex, but I really need to grab the percentage information from these log entries; however, the warn/critical message moves around depending on where the warning was located in either the In or the Out utilization. I just can't figure out the regex. Here are two example entries that show both in and out issues:
["XXXXXXX"], (up), MAC: XX:XX:XX:XX:XX:XX, Speed: 2 GBit/s, In: 0 Bit/s (0%), Out: 6.53 GBit/s (warn/crit at 1.6 GBit/s/1.8 GBit/s) (326.45%)(!!)
["XXXXXXX"], (up), MAC: XX:XX:XX:XX:XX:XX, Speed: 2 GBit/s, In: 0 Bit/s (warn/crit at 1.6 GBit/s/1.8 GBit/s) (95.45%), Out: 6.53 GBit/s (32.00%)(!!)
Ultimately I need to use capture groups to capture both the in and out utilization percentage. But every regex I try only finds a single percentage. Help on this would be greatly appreciated. Thanks in advance.
EDIT SHOWING EXPECTED RESULT:
For each line, the regex capture groups should identify both the In and the Out utilization so the program can see both values. The program expects a key-value pair from every log entry, like the following:
IN:0% OUT:326.45%
IN:95.45% OUT:32.00%

Do you need something like this?
In:.+\(([0-9\.]+%)\).+Out:.+\(([0-9\.]+%)
If you only need to pull out the two percentage values, this should help: the first capture group holds the In percentage and the second holds the Out percentage.
https://regex101.com/r/B9pZeO/1
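As a quick check of the pattern above, here is a minimal Python sketch that runs it against the two log entries from the question and prints the expected key-value pairs. The function name `utilization` and the constant names are made up for illustration; only the pattern and the sample lines come from the post.

```python
import re

# The pattern from the answer above, compiled as a raw string.
PATTERN = re.compile(r"In:.+\(([0-9.]+%)\).+Out:.+\(([0-9.]+%)")

# The two sample log entries from the question.
LOG_LINES = [
    '["XXXXXXX"], (up), MAC: XX:XX:XX:XX:XX:XX, Speed: 2 GBit/s, '
    'In: 0 Bit/s (0%), Out: 6.53 GBit/s (warn/crit at 1.6 GBit/s/1.8 GBit/s) (326.45%)(!!)',
    '["XXXXXXX"], (up), MAC: XX:XX:XX:XX:XX:XX, Speed: 2 GBit/s, '
    'In: 0 Bit/s (warn/crit at 1.6 GBit/s/1.8 GBit/s) (95.45%), Out: 6.53 GBit/s (32.00%)(!!)',
]


def utilization(line):
    """Return (in_pct, out_pct) for one log entry, or None if it does not match."""
    match = PATTERN.search(line)
    return match.groups() if match else None


for line in LOG_LINES:
    result = utilization(line)
    if result:
        print(f"IN:{result[0]} OUT:{result[1]}")
```

The "(warn/crit at ...)" group is skipped because the pattern requires a digit or dot right after the opening parenthesis, so its position in the line does not matter.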

Related

How to match regex pattern multiple times in Pyspark? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
The email data below is stored in a single column.
The requirement is to extract only the text from "Call Example" through "additional details".
Input:
Summary:
Below are the details:
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Please check out the call details.
Second Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
Some random text.
Output:
Both call examples need to be populated in the new column 'Calldetails1', in two different rows, using PySpark.
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
The regexp_extract call I used to extract from "Call Example" to "additional details":
result = df.withColumn('result', regexp_extract('comments', '(?s)(?=Call Example)(.?additional details:\s[\w+])', 1))
It works for one group only. Please suggest options to match globally in Python.
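As mentioned in the chat:
(?=Call Example)([\w\s:\*]+?[\S])$
(?=Call Example) is a positive lookahead: it asserts that the text at the current position starts with Call Example, without consuming it.
[\w\s:\*]+? lazily matches one or more word characters, whitespace characters, colons, or asterisks; the trailing [\S]$ then requires the match to end on a non-whitespace character at the position matched by $ (the end of the input, or the end of a line in multiline mode).
See also, for extracting multiple captured groups using PySpark: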
https://stackoverflow.com/questions/58930893/extracting-several-regex-matches-in-pyspark
https://stackoverflow.com/questions/54597183/i-have-an-issue-with-regex-extract-with-multiple-matches
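To illustrate what "matching globally" looks like outside Spark, here is a minimal sketch in plain Python using `re.findall` on the sample text from the question. It deliberately uses a simpler pattern than the one in the answer (an assumption on my part, chosen so that each block ends at its "additional details" field), and it does not touch the PySpark API; the variable names are hypothetical.

```python
import re

# Sample input copied from the question.
text = """Summary:
Below are the details:
Call Example:
dialFromNumber:***** dialToNumber:***** date:*** time:*** additional details:xxxx
Please check out the call details.
Second Call Example:
dialFromNumber:*****
dialToNumber:*****
date:***
time:***
additional details:xxxx
Some random text."""

# Lazily match from each "Call Example:" up to the first following
# "additional details:" field and its value. DOTALL lets "." cross newlines.
CALL_RE = re.compile(r"Call Example:.*?additional details:\S*", re.DOTALL)

# findall returns every non-overlapping match, not just the first one.
call_details = CALL_RE.findall(text)

for block in call_details:
    print(block)
    print("---")
```

Each element of `call_details` would correspond to one row of the desired 'Calldetails1' column; how to get there in Spark is covered by the linked questions.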

Why is part of a long line in Fortran code highlighted and displayed like a comment? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I'm trying to write Fortran code for single-phase flow in porous media.
In the discretized equations and other long lines I have the problem shown in the picture.
After the highlighted part, the rest of my code is displayed like a comment.
Can anyone help me?
According to this:
In its current state, the syntax highlighting for fixed-form Fortran in the extension only supports a line length of 72 characters. Anything after column 72 appears as comments (green) in the source code, which also affects the appearance of the following lines (when a closing parenthesis is in this green region for example)
You can change this in the fortran_fixed-form.tmLanguage.json file or in the VS Code settings.

Regex (Bigquery) get specific values from STRING [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have a STRING like this: TX1234XT batch 44, 1111ABCDEF
TX1234XT (can be a different length)
batch 44 (the number can be a different length)
ABCDEF (can be a different length, but always has 1111 at the start)
What I need is to generate two columns:
BatchNumber  Name
44           1111ABCDEF
1            1111SAMPLE
999          1111Example
Starting point:
The first column is done:
REGEXP_EXTRACT(reference, r'1111[a-zA-Z0-9_.+-]+') AS Name
The second:
REGEXP_REPLACE(REGEXP_EXTRACT(reference, r'batch [0-9_.+-]+'), r'batch ', '') AS BatchNumber
SORTED ^_^
I don't really know Google BigQuery, but if you want to extract the batch number and the value at the end, you could use this regular expression:
/^.*?batch\s*(\d+),\s*(1111.+)$/
(\d+) will capture your batch id.
(1111.+) will capture the value starting with 1111.
Example here: https://regex101.com/r/SJXmIV/2
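The two capture groups can be tried out with a short Python sketch. This uses Python's `re` module rather than BigQuery's regex functions, so it only illustrates the pattern itself. The first sample string comes from the question; the other two are hypothetical inputs invented to reproduce the remaining rows of the expected table, and `split_reference` is a made-up helper name.

```python
import re

# The pattern from the answer above, without the surrounding / delimiters.
PATTERN = re.compile(r"^.*?batch\s*(\d+),\s*(1111.+)$")


def split_reference(reference):
    """Return (batch_number, name) as strings, or None if the string does not match."""
    match = PATTERN.match(reference)
    if match is None:
        return None
    return match.group(1), match.group(2)


samples = [
    "TX1234XT batch 44, 1111ABCDEF",        # from the question
    "AB batch 1, 1111SAMPLE",               # hypothetical input
    "LONGPREFIX99 batch 999, 1111Example",  # hypothetical input
]

print("BatchNumber  Name")
for reference in samples:
    batch_number, name = split_reference(reference)
    print(f"{batch_number:<12} {name}")
```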

How to find dates in any text with regex? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have text extracted by an OCR program. So far I have managed to get every element I wanted except the date. In some cases my date looks like ASDICA>31.04.2019END($> and in others it is surrounded by spaces (which is easy to extract). My question:
Is there a quick function, without nested for loops, that parses the text and extracts dates?
My first amateur thought was to build a list of the common date separators, parse the text, save the positions of the separators found, and then look at their neighbours to build a date.
This took a lot of time and proved troublesome because I hit many escape characters due to the OCR's behavior.
My ideal output would be 31/04/2019, but I can handle the symbol replacement as long as I get a list of the dates from the text.
To begin with, ASDICA>31.04.2019END($> does not contain a valid date :) April has only 30 days.
But to answer your question, you can use the dateutil module, specifically the parser.parse function:
from dateutil import parser
# Parse a date from the string; fuzzy=True lets the parser skip the surrounding non-date text
print(parser.parse('ASDICA>31.01.2019END($>', fuzzy=True))
The output will be 2019-01-31 00:00:00
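If a list of all dates in the text is wanted, a regex-based alternative using only the standard library might look like the sketch below. It assumes the dates are always day, month, four-digit year separated by ".", "/" or "-"; that format, the function name `find_dates`, and the second sample string are assumptions, not something stated in the post. Candidates that are not real calendar dates (such as 31.04.2019) are dropped.

```python
import re
from datetime import datetime

# Assumed format: 1-2 digit day, 1-2 digit month, 4 digit year,
# separated by ".", "/" or "-".
DATE_RE = re.compile(r"(\d{1,2})[./-](\d{1,2})[./-](\d{4})")


def find_dates(text):
    """Return every valid date found in text, formatted as DD/MM/YYYY."""
    found = []
    for day, month, year in DATE_RE.findall(text):
        try:
            parsed = datetime(int(year), int(month), int(day))
        except ValueError:
            # Not a real calendar date, e.g. 31 April.
            continue
        found.append(parsed.strftime("%d/%m/%Y"))
    return found


print(find_dates("ASDICA>31.04.2019END($>"))  # 31 April is rejected
print(find_dates("ASDICA>31.01.2019END($> and later 05-12-2020"))
```

A single `findall` pass replaces the nested loops over separators, and the `datetime` constructor does the validity check.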

Regex: finding a number between a range with decimals [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 8 years ago.
I cannot for the life of me get my head around this regex stuff. After a few days of fiddling around, I find myself seeking help from those wiser than I. Could any of you kind souls write me a line (or lines) that will find and match a number between 0.00 and x.xx? I do need the decimals, however, so hopefully this can be done.
I actually tried using
(\b|^)(0.00|0.01|0.02)(\b|$)
until x.xx and so forth, but I couldn't fit the rest of it in because I need it to go up to 100.00+. Would anyone mind whipping something up real quick for me? : ) I would appreciate it more than you can imagine! Thanks very much for your time.
Ray.
Edit:
So I forgot to explain what I'm trying to achieve here. I'm using it in conjunction with a Chrome add-on called Page Monitor (life saver, folks, try it out when you have time to kill!), which pings every time a website updates. This also works for shares, but I'm trying to make it alert me only when the price drops below a certain point, e.g. $4.99 per share. Will (\b|^)([0-9]+\.[0-9]{2})(\b|$) and ([0-9]+.[0-9]+) suffice?
Why isn't this good enough: ([0-9]+\.[0-9]+) ?
If you can give an example of input and what is the output you expect, it would be easier to write a regex.
Updated: the $ sign is a reserved character in regex that means end-of-line, so you need to write \$ if you want to match a literal dollar sign.
So your regex would be \$([0-9]+\.[0-9]+). This would capture any price, such as $4.99 or $5.10, not just $4.99.
Regexes in general are good at capturing data and less good at analyzing it, but if you must, you can do this to determine when the price goes below $4.99:
\$(([0-3]\.[0-9]+)|(4\.[0-8][0-9])|(4\.9[0-8]))
It should be obvious that this is a waste of resources :)
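Where a programming language is available (which may not be the case inside a browser add-on), the "capture with regex, analyze with code" split could look like this minimal Python sketch. The page text, the function name `prices_below`, and the choice of `Decimal` are assumptions made for illustration.

```python
import re
from decimal import Decimal

# Capture any dollar amount with a decimal part; the comparison happens in code.
PRICE_RE = re.compile(r"\$([0-9]+\.[0-9]+)")


def prices_below(text, threshold):
    """Return all captured prices that are strictly below threshold."""
    prices = [Decimal(p) for p in PRICE_RE.findall(text)]
    return [p for p in prices if p < threshold]


# Hypothetical page content.
page = "ACME $5.10, Foo $4.99, Bar $3.20"
print(prices_below(page, Decimal("4.99")))
```

Changing the alert threshold then means changing one number instead of rewriting the alternation.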
You didn't provide enough info, but this will match if the number is the entire value or if it sits within a larger string, as long as the number is not embedded in something else like "foo8.9bar". It matches one or more digits on the left side of the decimal point and exactly two digits on the right side:
(\b|^)([0-9]+\.[0-9]{2})(\b|$)
(\b|^) and (\b|$) are redundant here: because the pattern starts and ends with a digit, \b already matches at the start and end of the string.
This regex should do it: (\d+\.\d{2})
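The effect of the word boundaries can be checked with a small Python sketch. The sample strings are made up; the two patterns are the bounded and unbounded forms discussed above, with `[0-9]` written out instead of `\d`.

```python
import re

# Two decimals, with and without word boundaries around the number.
WITH_BOUNDARIES = re.compile(r"\b[0-9]+\.[0-9]{2}\b")
WITHOUT_BOUNDARIES = re.compile(r"[0-9]+\.[0-9]{2}")

samples = ["8.90", "price 8.90 today", "foo8.90bar"]

for sample in samples:
    bounded = WITH_BOUNDARIES.findall(sample)
    unbounded = WITHOUT_BOUNDARIES.findall(sample)
    print(f"{sample!r}: with \\b -> {bounded}, without -> {unbounded}")
```

With `\b`, the number is matched when it is the whole string or stands alone inside a larger string, but not when it is glued to letters; without `\b`, the number inside "foo8.90bar" is matched as well.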