.NET/PowerShell regex single match

I have spent days trying to work this out. I managed to capture the text, but I need only one of the lines.
I have tried various ways but always get all matches returned.
This line of text appears 3 times:
<![LOG[Property SerialNumber is now = serial]LOG]!>
using the Regex
(?<=Property\sSerialNumber\sis\snow\s\=\s)[^<]+(?=]LOG]!>)
I get three matches of the word serial. I only need 1.
Where am I going wrong?

If that exact line appears three times, then [Regex]::Matches will return all three of them, of course.
You can use [Regex]::Match if you're only interested in the first.
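The same distinction can be sketched in Python rather than PowerShell (a substitution for illustration, since the question itself is about .NET): re.findall behaves roughly like [Regex]::Matches and re.search like [Regex]::Match. The log text below is made up for the example, and the redundant backslash before = is dropped.

```python
import re

# Hypothetical log content: the SerialNumber line appears three times.
LOG = (
    "<![LOG[Property SerialNumber is now = serial]LOG]!>\n"
    "<![LOG[Some other entry]LOG]!>\n"
    "<![LOG[Property SerialNumber is now = serial]LOG]!>\n"
    "<![LOG[Property SerialNumber is now = serial]LOG]!>\n"
)

PATTERN = re.compile(
    r"(?<=Property\sSerialNumber\sis\snow\s=\s)[^<]+(?=\]LOG\]!>)"
)

# Like [Regex]::Matches: every occurrence is returned.
all_hits = PATTERN.findall(LOG)

# Like [Regex]::Match: the search stops at the first occurrence.
first = PATTERN.search(LOG)
first_value = first.group(0) if first else None

print(len(all_hits), first_value)
```

If the three lines could ever carry different values, taking the first match is a choice worth stating explicitly in the script.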

Related

Regex Extract with two different delimiters

Working in Google Data Studio and having trouble extracting a string between two different delimiters
For example if I have the following line item:
Company_Clothes_Shirt:Red_Online_US
I would like to extract just Red
I’ve tried
REGEXP_EXTRACT(Dimension,'^(?:[^\\_]*\\_){2}([^\\:]*\\:){1}') but it just gives me Shirt:
Tried several other iterations but have only been able to extract the first part (Shirt), rather than the second (Red).
Would appreciate any help on this!
You don't need to extract based on the whole string, you can just extract the value between the two delimiters:
SELECT REGEXP_EXTRACT(Dimension,':([^_]+)_')
For an input value of Company_Clothes_Shirt:Red_Online_US, this will give Red.
Note that neither _ nor : is a special character in regex, so they don't need to be escaped.
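To try the pattern outside Data Studio, here is a minimal Python sketch. Python's re module stands in for REGEXP_EXTRACT, which is an assumption: the two engines may differ in details, but this pattern uses only basic syntax. The variable names are hypothetical.

```python
import re

dimension = "Company_Clothes_Shirt:Red_Online_US"

# Capture everything between the first ':' and the next '_'.
match = re.search(r":([^_]+)_", dimension)
color = match.group(1) if match else None

print(color)
```

The capture group is what makes the delimiters themselves drop out of the result.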

KNIME regex expression to return 6th line

I have a column with string values present in several lines. I would like to only have the values in the 6th line, all the lines have varying lengths, but all the cells in the column have the information I need in the 6th line.
I am completely new to this and have no background in Java or KNIME. I have scoured this forum and other internet sources, and none seem to tackle what I need in KNIME specifically. I found something similar, but it doesn't work in KNIME:
Regex for nth line in a text file
Your answer will probably need to be broken into two parts:
1. How to do a regex search in KNIME
2. How to do a regex search for the 6th line
I can help with the regex search, but I don't know KNIME.
To start with, you need a pattern that matches a single line, which is
([^\n]*\n)
This looks for
*: 0 or more of
[^\n]: anything that isn't a new line
followed by \n: a new line
and (): groups them together into a single match
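We can then expand this into: ([^\n]*\n){5}([^\n]*\n){1} which skips the first 5 lines and then matches the 6th. It has 2 capture groups: the second holds the 6th line, while the first, because it is repeated, holds only the last line it matched (the 5th) in most regex engines rather than all five.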
If KNIME supports Non-Capturing groups you can then expand that into the following so that you only have one matching capture group. You can decide for yourself which you like best.
(?:[^\n]*\n){5}([^\n]*\n){1}
I've created an example you can test on RegExr
Regardless of which way you go, make sure to document the regex with comments or store it in a variable with a very clear name, since regexes aren't particularly human-readable.
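The regex part can be checked outside KNIME. Here is a small Python sketch (Python is an assumption; KNIME itself is Java-based, and the sample cell is invented). It also shows what a repeated capture group actually keeps in Python's re module.

```python
import re

# Hypothetical cell content with seven lines of varying length.
cell = "line 1\nline two\n3\nfourth line\nline 5\nsixth line\nline 7\n"

# Non-capturing group skips five lines; the capture group takes the sixth
# (without its trailing newline).
m = re.match(r"(?:[^\n]*\n){5}([^\n]*)", cell)
sixth = m.group(1) if m else None

# Two-capture-group variant: a repeated group keeps only its last repetition.
m2 = re.match(r"([^\n]*\n){5}([^\n]*\n)", cell)
repeated_group = m2.group(1)  # the 5th line, not lines 1-5
second_group = m2.group(2)    # the 6th line, newline included

print(sixth)
```

re.match anchors the pattern at the start of the string, which is what makes "the 6th line" mean the 6th line of the cell rather than of some later position.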

REGEX to find first instance after set length

I'm probably going to get pilloried for asking this question, but after searching and trying to work out this regex on my own, I'm tired of wasting time on it. Here's the problem I'm trying to solve. I frequently use EditPad Pro to convert character strings so they will fit into a mainframe.
For instance, I want to convert a column of words from excel into an IN clause for sql. The column is 5000 words or so.
I can easily copy and paste that into the text editor and then using find and replace convert that from a column of words to a single row with ',' separating each word.
Once that's done, though I want to use a regex to split this row before or after a comma after 70 characters have gone by.
(?P<start>^.{0,70})
This will give me the first 70 characters, but then I get stuck as I can't figure out how to create the next group to find all the characters up to the next comma so I can refer to it like this
(?P<start>^.{0,70})(?P<next>????,)
If I could get that, then I could do a find and replace that would break the row after the first comma that appears after the 70th character.
I know that given the rest of the day I could figure it out, but I need to move on. I've tried this before. I would even be willing to find only the first 70 characters plus the next few characters up to the comma, and then repeat the find and replace multiple times if necessary, but I cannot get the regex to work.
Any assistance with this would be greatly appreciated.
Here is some sample data that I have added line breaks into as an example of what I want it to look like after the regex runs.
'Ability','Absence','Absolute','Absorb','Accident','Acclaim','Accompany',
'Accomplish','Achievement','Acquaintance','Acquire','Across','Acting','Address',
'Admire','Adorable','Advance','Advertisement','Afraid','Agriculture','Align',
'All','Allow','Allowance','Allowed','Alone','Aluminium','Always','America',
'Analyze','Android','Angle','Announce','Annual','Ant','Antarctica','Antler',
I think you should consider restricting your initial concatenation, but here's a solution to your specific implementation:
^.{0,70}[^,]*
This will select the first 70 characters (if available), then every character up to the one before the next comma.
I don't think you need groups here, but you can obviously add them to the regex:
(?P<start>^.{0,70})(?P<next>[^,]*)
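For the full find-and-replace, one possible variation is sketched below in Python (an assumption; the question uses EditPad Pro). It drops the ^ anchor and includes the comma in the match, so that a single substitution breaks the whole row. The word list is taken from the sample data in the question.

```python
import re

words = [
    "Ability", "Absence", "Absolute", "Absorb", "Accident", "Acclaim",
    "Accompany", "Accomplish", "Achievement", "Acquaintance", "Acquire",
    "Across", "Acting", "Address", "Admire", "Adorable", "Advance",
    "Advertisement", "Afraid", "Agriculture", "Align",
]
row = ",".join(f"'{w}'" for w in words)

# 70 characters, then everything up to and including the next comma;
# a line break is inserted after each such chunk.
wrapped = re.sub(r"(.{70}[^,]*,)", r"\1\n", row)
lines = wrapped.split("\n")

for line in lines:
    print(line)
```

A final stretch with no comma after its 70th character is left as it is, so the last line may be shorter or longer than the others.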

Find in Files (Regex) - Returning Consecutive Lines

I need to return all occurrences of three specific lines appearing consecutively in a file. I'm looking for the following:
FieldName=<some name>
Operator=<some operator>
Value=<some value>
Example File Content
MatchAny=FALSE
FieldValue=TRUE
Operator=Is less than
TotalFields=1
[OutputTarget0SelField0]
FieldName=ORIG-DATE
Operator=Is greater than
Value=20000101
[OutputTarget1]
To do this, I have been trying to use Notepad++ Find in Files functionality but I cannot seem to get the correct regular expression.
Here is what I've tried (in this case I'm assuming the two lines after FieldName= will always be Operator= and Value=)
Find what: (FieldName=|Operator=|Value) is also close, but obviously doesn't account for the fact that these lines need to be consecutive ("FieldName=" followed by "Operator=" followed by "Value=") and returns all single occurrences as well.
You can use ^FieldName=[^\n\r]*[\n\r]+Operator=[^\n\r]*[\n\r]+Value=[^\n\r]* to match your 3 consecutive lines:
^FieldName=[^\n\r]*[\n\r]+ matches the start of a line, followed by FieldName=, any number of non-linebreak characters, and then one or more linebreaks. As you tagged your question with Windows, you might be able to replace [\n\r]+ with \r\n, which also prevents empty lines from slipping into the match (currently possible).
Operator=[^\n\r]*[\n\r]+ is basically the same for the Operator-Line
Value=[^\n\r]* is again the same for the Value-Line, this time without the finishing linebreaks
As stated in the comments, this will only show you the first matched line in the find in files overview, but you can double click it, so it shows the whole match.
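To check the pattern outside Notepad++, here is a minimal Python sketch (Python is an assumption; the file content is the example from the question). re.MULTILINE is needed so that ^ matches at the start of every line, not just the start of the text.

```python
import re

CONTENT = """MatchAny=FALSE
FieldValue=TRUE
Operator=Is less than
TotalFields=1
[OutputTarget0SelField0]
FieldName=ORIG-DATE
Operator=Is greater than
Value=20000101
[OutputTarget1]
"""

TRIPLE = re.compile(
    r"^FieldName=[^\n\r]*[\n\r]+Operator=[^\n\r]*[\n\r]+Value=[^\n\r]*",
    re.MULTILINE,
)

# Each match is the full three-line block; the lone Operator= line is skipped.
blocks = TRIPLE.findall(CONTENT)

for block in blocks:
    print(block)
```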

REGEX: How to match several lines?

I have a huge CSV list that needs to be broken up into smaller pieces (say, groups of 100 values each). How do I match 100 lines? The following does not work:
(^.*$){100}
If you must, you can use (flags: multi-line, not global):
(^.*[\r\n]+){100}
But, realistically, using regex to find lines probably is the worst-performing method you could come up with. Avoid.
You don't need regex for this. There should be other tools in your language, and even if there are none, you can do simple string processing to get those lines.
However this is the regular expression that should match 100 lines:
/([^\n]+\n){100}/
But you really shouldn't use that; it's just to show how such a task could be done if ever needed. It searches for one or more non-newline characters ([^\n]+) followed by a newline (\n), repeated {100} times, so it will not match across empty lines.
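Both approaches can be compared in a short Python sketch (Python is an assumption, as the question names no language). A group size of 3 stands in for 100 to keep the sample small. As for the original attempt, (^.*$){100} fails because $ is zero-width and does not consume the line break, so the second repetition can never start at the beginning of a line.

```python
import re

GROUP_SIZE = 3  # stands in for the 100 from the question

# Hypothetical input: seven values, one per line.
csv_text = "".join(f"value{i}\n" for i in range(1, 8))

# Regex approach: each match is GROUP_SIZE complete lines.
# A shorter tail (here the 7th line) is not matched at all.
regex_chunks = re.findall(r"(?:[^\n]*\n){%d}" % GROUP_SIZE, csv_text)

# Plain string processing: split once, then slice. The tail is kept.
lines = csv_text.splitlines()
plain_chunks = [
    lines[i:i + GROUP_SIZE] for i in range(0, len(lines), GROUP_SIZE)
]

print(len(regex_chunks), len(plain_chunks))
```

The slicing version keeps the leftover lines as a final, shorter group, which is usually what is wanted when breaking up a list.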