Search for text with Regex - regex

I need a regular expression to get the text out from between [ and ] within a sentence.
Example Text:
Hello World - Test[**This is my string**]. Good bye World.
Desired Result:
**This is my string**
The regex that I have come up with is Test\\[[a-zA-Z].+\\], but this returns the entire **Test[This is my string]** instead of only the text inside the brackets.

You could use a capturing group to access the text of interest:
\[([^\]]+)\]
A quick proof of concept using JavaScript:
var text = 'Hello World - Test[This is my string]. Good bye World.'
var match = /\[([^\]]+)\]/.exec(text)
if (match) {
    console.log(match[1]) // "This is my string"
}
If the regular expression engine you are using supports both lookahead and lookbehind, Tim's solution is more appropriate.

Match m = Regex.Match(@"Hello World - Test[This is my string]. Good bye World.",
    @"Test\[([a-zA-Z].+)\]");
Console.WriteLine(m.Groups[1].Value);

(?<=Test\[)[^\[\]]*(?=\])
should do what you want.
(?<=Test\[) # Assert that "Test[" can be matched before the current position
[^\[\]]* # Match any number of characters except brackets
(?=\]) # Assert that "]" can be matched after the current position
Read up on lookaround assertions.
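For comparison, here is a minimal Python sketch of the two approaches above (capture group versus lookarounds), run against the example sentence from the question. The variable names are illustrative only, and Python's built-in re module is assumed; its lookbehind must be fixed-width, which Test\[ is.

```python
import re

text = "Hello World - Test[This is my string]. Good bye World."

# Capture group: the brackets are part of the match, group 1 holds the inside.
grouped = re.search(r"\[([^\]]+)\]", text)
inside_by_group = grouped.group(1) if grouped else None

# Lookarounds: "Test[" and "]" are only asserted, so the whole match is the inside.
looked = re.search(r"(?<=Test\[)[^\[\]]*(?=\])", text)
inside_by_lookaround = looked.group(0) if looked else None

print(inside_by_group)
print(inside_by_lookaround)
```

Both variables end up holding the same text; the difference is only whether you read group 1 or the whole match.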

Related

How to replace all from string except of part which follow regex pattern?

I have the pattern \+\d.AT and a string like "test 123 +1 AT test end". I need to remove everything except the part that matches the pattern. How can I do that? Right now my code does the opposite: if the string contains a match, it removes the matching part.
val comment = "test 123 +1 AT test end"
if ("\\+\\d.AT".toRegex().containsMatchIn(comment)) {
    val regexpString = comment.replace("\\+\\d.AT".toRegex(), "")
    print(regexpString)
}
Match the entire string by putting .* before and after the pattern, and put a capture group around the part you want to keep. Then use a back-reference in the replacement to copy that to the result.
val regexpString = comment.replace(".*(\\+\\d.AT).*".toRegex(), "$1")
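The same idea can be sketched in Python, assuming the built-in re module, where the back-reference is written \1 instead of $1. The variable names are illustrative. A plain search for the pattern is shown next to it as an alternative when only the first match is needed.

```python
import re

comment = "test 123 +1 AT test end"

# Match the whole string, keep only the captured part via a back-reference.
kept = re.sub(r".*(\+\d.AT).*", r"\1", comment)

# Alternative: search for the pattern directly and take the match itself.
found = re.search(r"\+\d.AT", comment)
kept_by_search = found.group(0) if found else ""

print(kept)
print(kept_by_search)
```

Note that when the pattern does not occur at all, re.sub returns the input unchanged, whereas the search variant here falls back to an empty string.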

regex split string by specific values

Trying to split a string by specific characters and values with a regex expression.
I have the following string for example:
abc.def.ghi:wxyz_1234
I would like to get both 'wxyz' and '1234'.
i.e. the string between ':' and '_' and the string after '_'
Cheers!
Method 1
Maybe,
([^\s:_]+)_(\S+)
might work OK.
Method 2
With lookbehind, to create a left boundary for pre-underscore string:
(?<=:)([^_]+)_(.+)
Test
import re
string = '''
abc.def.ghi:wxyz_1234
abc.def.ghi:abcd_78910
abc.def.ghi: foo_baz123
'''
expression = r'([^\s:_]+)_(\S+)'
for i in re.findall(expression, string):
    print(i[0])
    print(i[1])
Output
wxyz
1234
abcd
78910
foo
baz123
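Method 2 can be checked the same way. This is a small sketch, assuming Python's re module; split_after_colon is a hypothetical helper name. Because [^_]+ accepts whitespace, the third sample keeps its leading space in the first group, unlike Method 1.

```python
import re

# Lookbehind anchors the match right after ':'; groups split on the first '_'.
pattern = re.compile(r"(?<=:)([^_]+)_(.+)")

samples = [
    "abc.def.ghi:wxyz_1234",
    "abc.def.ghi:abcd_78910",
    "abc.def.ghi: foo_baz123",
]

def split_after_colon(line):
    """Return (before_underscore, after_underscore), or None if no match."""
    m = pattern.search(line)
    return m.groups() if m else None

for line in samples:
    print(split_after_colon(line))
```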
If you wish to simplify, modify, or explore the expression, regex101.com explains each part of it in the top right panel and shows how it matches against sample inputs.
string str = "abc.def.ghi:wxyz_1234";
Regex rx = new Regex(":(.*)_(.*)");
Match match = rx.Match(str);
string first = match.Groups[1].Value;
string second = match.Groups[2].Value;
I managed to create the following
Case A - (?<=:)(.+)(?=_)
Case B - (?<=_).*
Guess the options are endless...
Thanks for your assistance!

Building a Regex String - Any assistance provided

I'm very new to regex. I understand its purpose, but I'm still struggling to fully comprehend how to use it. I'm trying to build a regex to pull A8OP2B out of the following (or whatever gets dumped in that 5th group).
{"RfReceived":{"Sync":9480,"Low":310,"High":950,"Data":"A8OP2B","RfKey":"None"}}
The other items in the line above will change in character length, so I cannot say the 51st to the 56th character. What I want to pull out will always be the 5th group in quotation marks, though.
I've tried building up various regex strings, but it's still mostly a foreign language to me and I still have much reading to do on it.
Could anyone provide me a working example with the above, so I can reverse engineer and understand better?
Thanks
Demo 1: Reference the JSON to a var, then use either dot or bracket notation.
Demo 2: Using RegEx is not recommended, but here's one in JavaScript:
/\b(\w{6})(?=","RfKey":)/g
First Match
\b: word boundary, here between the non-word character " and the word character A; it consumes nothing
(\w{6}): capture group of exactly six word characters; it consumes A8OP2B
(?=","RfKey":): lookahead asserting that ","RfKey": follows; it consumes nothing
Demo 1
var obj = {"RfReceived":{"Sync":9480,"Low":310,"High":950,"Data":"A8OP2B","RfKey":"None"}};
var dataDot = obj.RfReceived.Data;
var dataBracket = obj['RfReceived']['Data'];
console.log(dataDot);
console.log(dataBracket)
Demo 2
Note: This is consuming a string of 3 consecutive patterns. 3 matches are expected.
var rgx = /\b(\w{6})(?=","RfKey":)/g;
var str = `{"RfReceived":{"Sync":9480,"Low":310,"High":950,"Data":"A8OP2B","RfKey":"None"}},{"RfReceived":{"Sync":8080,"Low":102,"High":1200,"Data":"PFN07U","RfKey":"None"}},{"RfReceived":{"Sync":7580,"Low":471,"High":360,"Data":"XU89OM","RfKey":"None"}}`;
var res = str.match(rgx);
console.log(res);
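The two demos translate to Python roughly as follows. This is a sketch under the assumption that the input arrives as a JSON string; the variable names are illustrative. The regex variant keys on the "Data" field name rather than on a fixed length of six characters, which is a deliberate change from the pattern above so that values of other lengths still match.

```python
import json
import re

raw = '{"RfReceived":{"Sync":9480,"Low":310,"High":950,"Data":"A8OP2B","RfKey":"None"}}'

# Preferred: parse the JSON and index into it.
data_from_json = json.loads(raw)["RfReceived"]["Data"]

# Regex fallback: capture whatever sits in quotes after "Data":
found = re.search(r'"Data":"([^"]*)"', raw)
data_from_regex = found.group(1) if found else None

print(data_from_json)
print(data_from_regex)
```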

Replace pattern with pattern in vb.net string

I want to replace the patterns "0A ","0B ",...,"1A ","1B ",... with "0A|","0B|",...,"1A|","1B|",... in a VB.NET string.
I can write individual replace lines like
string = string.Replace("0A ", "0A|")
string = string.Replace("0B ", "0B|")
.
.
.
string = string.Replace("0Z ", "0Z|")
But I would have to write too many lines (26*10*2; the factor of two is because this scenario occurs twice), and that doesn't seem like a good solution. Can someone give me a good regex solution for this?
Use Regex.Replace:
result = Regex.Replace(string, "(\d+[A-Z]+) ", "$1|")
I used the pattern \d+[A-Z]+ to represent the text under the assumption that your series of data might see more than one digit/letter. This seems to be working in the demo below.
Demo
Regex: \s Substitution: |
Details:
\s Matches any whitespace character
VB.NET code:
Regex.Replace("0A ", "\s", "|") ' Output: 0A|
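The two answers behave differently once the string contains other spaces. The following Python sketch (built-in re module, made-up sample text, illustrative variable names) contrasts the capture-group replacement with the blanket \s replacement; in Python the back-reference is written \1 rather than $1.

```python
import re

# Hypothetical sample: digit-letter tokens mixed with plain numbers.
text = "0A 12 0B 34 1A 56"

# Only the space that follows a digit+letter token becomes '|'.
targeted = re.sub(r"(\d+[A-Z]+) ", r"\1|", text)

# Every whitespace character becomes '|', wherever it is.
blanket = re.sub(r"\s", "|", text)

print(targeted)
print(blanket)
```

If the input only ever contains tokens like "0A ", both give the same result; the capture-group version is the safer choice when other spaces must be preserved.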

How to capture a group and exclude a word in the capture?

Example:
Book = string containing whole text
startChar = where it should begin capturing = |
endChar = where it should end capturing = §
word to ignore in capture = gray
So if it weren't for the word "gray", my capture would simply be: \|(.+)§
Here is an example of what i mean:
Book = "The gray fox is |so gray that its pretty gray§."
Captured = "so that its pretty "
I am using C# and PHP, but I do not want to use any replace function; I just want a pure regex expression.
You can use this pattern in a global search:
(?:\G(?!\A)|\|)(?:\bgray\b)?\K((?:(?!\bgray\b)[^§])+)(?=(?:gray)?(§)?)
details
(?: # the two entry points
\G(?!\A) # position at the end of the previous match
|
\| # the start
)
(?:\bgray\b)? # optional "gray"
\K
((?:(?!\bgray\b)[^§])+) # all that is not the word "gray" (see the note)
(?=(?:gray)?(§)?) # trick to capture the last §
Note: this subpattern is a well-known trick for matching text while avoiding a word. However, it is slow, particularly on long text that contains few occurrences of the word to avoid.
It can be replaced with ((?>[^g§]+|\Bg|g(?!ray\b))+), which may be faster (but is harder to build programmatically).
Example of use with PHP:
$book = "The gray fox is |so gray that its pretty gray§.";
$reg = '~(?:\G(?!\A)|\|)(?:\bgray\b)?\K((?:(?!\bgray\b)[^§])+)(?=(?:gray)?(§)?)~';
if ( preg_match_all($reg, $book, $matches) && !empty(end($matches[2])) )
    echo implode('', $matches[1]);
Note: the last capture group is only here to ensure that the end has been reached. The "if" condition checks it with !empty(end($matches[2]))
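Python's built-in re module supports neither \G nor \K, so the pattern above cannot be used there as-is. As a rough sketch of the same chunk-by-chunk logic, and not a pure-regex solution, the contiguity that \G provides can be emulated by calling match() with an explicit start position. The function name capture_without_word and the CHUNK pattern are assumptions made for this illustration.

```python
import re

BOOK = "The gray fox is |so gray that its pretty gray§."

# Optional "gray", then a run of characters that neither starts the word
# "gray" nor is the end delimiter (same avoid-a-word subpattern as above).
CHUNK = re.compile(r"(?:\bgray\b)?((?:(?!\bgray\b)[^§])*)")

def capture_without_word(book):
    """Join the text between '|' and '§' with the word 'gray' left out."""
    start = book.find("|")
    if start == -1:
        return None
    pos = start + 1
    parts = []
    while pos < len(book) and book[pos] != "§":
        m = CHUNK.match(book, pos)  # anchored at pos, like \G
        if m.end() == pos:          # safety net: no progress
            break
        parts.append(m.group(1))
        pos = m.end()
    # Mirror the PHP check: only accept if the closing delimiter was reached.
    if pos < len(book) and book[pos] == "§":
        return "".join(parts)
    return None

print(repr(capture_without_word(BOOK)))
```

Removing the word leaves the spaces on both sides of it, so the joined result contains two consecutive spaces where "gray" used to be.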