Function to check a value in a String - regex

I'm working with an application that tries to check the location of a user in a User Directory.
I have strings similar to:
CN=John Mayor,OU=Users,OU=NA,OU=Local,DC=domain,DC=application,DC=com
or
CN=Annette Luis Morgant,OU=Users,OU=CH,OU=Local,DC=domain,DC=application,DC=com
I'm trying to filter in javascript the string in order to print out ONLY the value of the second "OU".
So for the first case it will be "NA", for the second case it will be "CH".
Trying to use substring and trim or something similar, but I'm confusing myself!
Can you help me?
Thanks!!!!
edit-----
This is what I was trying to do:
public class SplitUser {
    public static void main(String[] args) {
        String MyStringContent = "CN=John Mayor,OU=Users,OU=NA,OU=Local,DC=domain,DC=application,DC=com";
        String[] arrSplit = MyStringContent.split(",");
        for (int i = 0; i < arrSplit.length; i++) {
            System.out.println(arrSplit[i]);
        }
        //System.out.println(arrSplit[2]);
        String p = arrSplit[2].substring(3, arrSplit[2].length());
        System.out.println(p);
    }
}

You could try with this:
(?:,|^)OU=[^,]+(?=.*?,OU=([^,]+)).*
It will work even if there are some other FOO=BAR values inserted among the "OU"'s
Explained:
(?:,|^) # start at the beginning of the line or at a ','
OU=
[^,]+ #Anything but a ',' 1 or more times
#We have found 1 OU, let's find the next
(?= # Lookahead expression
.*? # Anything (ungreedy)
,OU=
([^,]+)
)
.* # Anything, just to match the whole line
# and avoid multiple matches for the same line (g flag)
Demo here
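Since the code in the question is Java, here is a minimal sketch of how the pattern above might be applied with java.util.regex. The class name SecondOuFinder and the method secondOu are made up for illustration; only the regex itself comes from the answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of the original answer.
class SecondOuFinder {
    // Pattern from the answer: match the first OU, then capture the next OU inside a lookahead.
    private static final Pattern SECOND_OU =
            Pattern.compile("(?:,|^)OU=[^,]+(?=.*?,OU=([^,]+)).*");

    // Returns the value of the second OU component, or empty if there are fewer than two OUs.
    static Optional<String> secondOu(String dn) {
        Matcher m = SECOND_OU.matcher(dn);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String first = "CN=John Mayor,OU=Users,OU=NA,OU=Local,DC=domain,DC=application,DC=com";
        String second = "CN=Annette Luis Morgant,OU=Users,OU=CH,OU=Local,DC=domain,DC=application,DC=com";
        System.out.println(secondOu(first).orElse("(none)"));  // prints NA
        System.out.println(secondOu(second).orElse("(none)")); // prints CH
    }
}
```

Returning an Optional is just one way to signal that a string has fewer than two OU components; returning null or throwing would work as well.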

Related

Regex - get list comma separated allow spaces before / after the comma

I try to extract an images array/list from a commit message:
String commitMsg = "#build #images = image-a, image-b,image_c, imaged , image-e #setup=my-setup fixing issue with px"
I want to get a list that contains:
["image-a", "image-b", "image_c", "imaged", "image-e"]
NOTES:
A) should allow a single space before/after the comma (,)
B) ensure that #images = exists but exclude it from the group
C) I am also searching for other parameters like #build and #setup, so I need to ignore them when looking for #images
What I have until now is:
/(?i)#images\s?=\s?<HERE IS THE MISSING LOGIC>/
I use find() method:
def matcher = commitMsg =~ /(?i)#images\s?=\s?([^,]+)/
if(matcher.find()){
println(matcher[0][1])
}
You can use
(?i)(?:\G(?!^)\s?,\s?|#images\s?=\s?)(\w+(?:-\w+)*)
See the regex demo. Details:
(?i) - case insensitive mode on
(?:\G(?!^)\s?,\s?|#images\s?=\s?) - either the end of the previous regex match and a comma enclosed with single optional whitespaces on both ends, or #images string and a = char enclosed with single optional whitespaces on both ends
(\w+(?:-\w+)*) - Group 1: one or more word chars followed with zero or more repetitions of - and one or more word chars.
See a Groovy demo:
String commitMsg = "#build #images = image-a, image-b,image_c, imaged , image-e #setup=my-setup fixing issue with px"
def re = /(?i)(?:\G(?!^)\s?,\s?|#images\s?=\s?)(\w+(?:-\w+)*)/
def res = (commitMsg =~ re).collect { it[1] }
print(res)
Output:
[image-a, image-b, image_c, imaged, image-e]
An alternative Groovy code:
String commitMsg = "#build #images = image-a, image-b,image_c, imaged , image-e #setup=my-setup fixing issue with px"
def re = /(?i)(?:\G(?!^)\s?,\s?|#images\s?=\s?)(\w+(?:-\w+)*)/
def matcher = (commitMsg =~ re).collect()
for(m in matcher) {
println(m[1])
}
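The same \G-based pattern should also work in plain Java, since java.util.regex supports the \G anchor. The following is a minimal sketch under that assumption; the class name ImageListExtractor and the method extractImages are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex is taken from the answer above.
class ImageListExtractor {
    private static final Pattern IMAGES = Pattern.compile(
            "(?i)(?:\\G(?!^)\\s?,\\s?|#images\\s?=\\s?)(\\w+(?:-\\w+)*)");

    // Collects Group 1 of every consecutive match that follows "#images =".
    static List<String> extractImages(String commitMsg) {
        List<String> images = new ArrayList<>();
        Matcher m = IMAGES.matcher(commitMsg);
        while (m.find()) {
            images.add(m.group(1));
        }
        return images;
    }

    public static void main(String[] args) {
        String commitMsg = "#build #images = image-a, image-b,image_c, imaged , image-e "
                + "#setup=my-setup fixing issue with px";
        System.out.println(extractImages(commitMsg));
        // prints [image-a, image-b, image_c, imaged, image-e]
    }
}
```

The chain stops at the first position where no comma follows the previous match, which is why #setup and the trailing text are ignored.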

regex to extract substring for special cases

I have a scenario where I want to extract a substring based on the following conditions.
search for any pattern myvalue=123& , extract myvalue=123
If the "myvalue" present at end of the line without "&", extract myvalue=123
for ex:
The string is abcdmyvalue=123&xyz => then it should return myvalue=123
The string is abcdmyvalue=123 => then it should return myvalue=123
The first scenario works for me with the following regex: myvalue*=(.*?(?=[&,""]))
I am looking for how to modify this regex to include my second scenario as well. I am using https://regex101.com/ to test this.
Thanks in advance!
Some notes about the pattern that you tried
if you want to only match, you can omit the capture group
e* matches 0+ times an e char
the part .*?(?=[&,""]) matches as few chars as possible until it can assert either & , or " to the right, so the positive lookahead expects a single char to be present to the right; that is why a value at the end of the string does not match
You could shorten the pattern to a match only, using a negated character class that matches 0+ times any character except a whitespace char or &
myvalue=[^&\s]*
Regex demo
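As a quick check of the negated character class, here is a small Java sketch (the class name MyValueExtractor and the method extract are hypothetical) that covers both scenarios from the question:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the pattern comes from the answer above.
class MyValueExtractor {
    // myvalue= followed by zero or more chars that are neither & nor whitespace.
    private static final Pattern MY_VALUE = Pattern.compile("myvalue=[^&\\s]*");

    // Returns the first "myvalue=..." occurrence, or empty if the key is absent.
    static Optional<String> extract(String input) {
        Matcher m = MY_VALUE.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("abcdmyvalue=123&xyz").orElse("(none)")); // prints myvalue=123
        System.out.println(extract("abcdmyvalue=123").orElse("(none)"));     // prints myvalue=123
    }
}
```

Because the character class simply stops at & or at the end of the input, no lookahead is needed for the end-of-line case.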
function regex(data) {
var test = data.match(/=(.*)&/);
if (test === null) {
return data.split('=')[1]
} else {
return test[1]
}
}
console.log(regex('abcdmyvalue=123&3e')); //123
console.log(regex('abcdmyvalue=123')); //123
Here is working code. If there is no & after the value, match() returns null and the if branch simply splits the string on = and returns the part after it. If an & is present, the regex extracts the value between = and &.
If you would rather do it with a single regex, you can do it like this:
var test = data1.match(/=(.*)&|=(.*)/)
const result = test[1] ? test[1] : test[2];
console.log(result);

DART Conditional find and replace using Regex

I have a string that sometimes contains a certain substring at the end and sometimes does not. When the string is present I want to update its value. When it is absent I want to add it at the end of the existing string.
For example:
int _newCount = 7;
_myString = 'The count is: COUNT=1;'
_myString2 = 'The count is: '
_rRuleString.replaceAllMapped(RegExp('COUNT=(.*?)\;'), (match) {
//if there is a match (like in _myString) update the count to value of _newCount
//if there is no match (like in _myString2) add COUNT=1; to the string
}
I have tried using a return of:
return "${match.group(1).isEmpty ? _myString + ;COUNT=1;' : 'COUNT=$_newCount;'}";
But it is not working.
Note that replaceAllMapped will only perform a replacement if there is a match, else, there will be no replacement (insertion is still a replacement of an empty string with some string).
Your expected matches are always at the end of the string, and you may leverage this in your current code. You need a regex that optionally matches COUNT= and then some text up to the first ; including the char and then checks if the current position is the end of string.
Then, just follow the logic: if Group 1 is matched, set the new count value, else, add the COUNT=1; string:
The regex is
(COUNT=[^;]*;)?$
See the regex demo.
Details
(COUNT=[^;]*;)? - an optional group 1: COUNT=, any 0 or more chars other than ; and then a ;
$ - end of string.
Dart code:
_myString.replaceFirstMapped(RegExp(r'(COUNT=[^;]*;)?$'), (match) {
  // group(0) is the whole match; it is empty when there is no COUNT=...; at the end
  return (match.group(0) ?? '').isEmpty ? "COUNT=1;" : "COUNT=$_newCount;";
});
Note the use of replaceFirstMapped, you need to replace only the first match.
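For comparison, the same "update if present, append if absent" logic can be sketched in Java. This is only an illustration under stated assumptions: the class name CountUpdater and the method upsertCount are invented here, and the replacement is built with substring so that it does not depend on a particular Java version.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex is taken from the answer above.
class CountUpdater {
    // Optional "COUNT=...;" group followed by the end of the string.
    private static final Pattern TRAILING_COUNT = Pattern.compile("(COUNT=[^;]*;)?$");

    // Replaces a trailing COUNT=...; with the new count, or appends COUNT=1; if none exists.
    static String upsertCount(String text, int newCount) {
        Matcher m = TRAILING_COUNT.matcher(text);
        if (!m.find()) {
            return text; // defensive only: the pattern can always match the empty string at the end
        }
        String replacement = m.group(0).isEmpty() ? "COUNT=1;" : "COUNT=" + newCount + ";";
        // Only the first match is replaced, mirroring replaceFirstMapped.
        return text.substring(0, m.start()) + replacement + text.substring(m.end());
    }

    public static void main(String[] args) {
        System.out.println(upsertCount("The count is: COUNT=1;", 7)); // prints The count is: COUNT=7;
        System.out.println(upsertCount("The count is: ", 7));         // prints The count is: COUNT=1;
    }
}
```

Note that in Java, $ also matches just before a final line terminator, so input ending in a newline may need \z instead.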

find a string pattern in a string array

I need to count the occurrences of specified patterns in the input strands and produce a report for each pattern.
The input string would contain 1 AA AATTCGAA end
The 1 signifies one pattern to search for, AA is the pattern, and the next token is the sequence in which to search for AA.
My idea is:
public static void main(String[] args){
    Scanner s = new Scanner(System.in);
    System.out.println("How many patterns do you want and enter patterns and DNA Sequence(type 'end' to signify end):");
    String DNA = s.nextLine();
    process(DNA);
}
public static void process(String DNA){
    String number = DNA.replaceFirst(".*?(\\d+).*", "$1");
    int N = Integer.parseInt(number);
    DNA.toUpperCase();
    String[] DNAarray;
    DNAarray = DNA.split(" ");
    for(int i=1; i<N; i++){
        int count=0;
        for(int j =0; j < DNAarray.length; j++) {
            if(DNAarray[i+N].contains(DNAarray[i])){
                count= count++;
            }
        }
        System.out.println("Pattern:"+DNAarray[i]+ "Count:"+count);
    }
}
This should do it:
using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        Console.WriteLine(PatternCount("1 AA AADDRRSSAA"));
    }

    public static int PatternCount(string sDNA)
    {
        // Split the input into count, pattern and data
        Regex reParts = new Regex("(\\d+)\\s(\\w\\w)\\s(\\w+)");
        Match m = reParts.Match(sDNA);
        if (m.Success)
        {
            // Count the non-overlapping occurrences of the pattern in the data
            return Regex.Matches(m.Groups[3].Value, m.Groups[2].Value).Count;
        }
        return 0;
    }
}
The first RE splits the input into count, pattern and data. (Not sure why you want to limit the number of patterns to search for. This code ignores that. Adapt it to your needs...)
The second RE is the wanted pattern itself, and "Matches" counts the number of non-overlapping occurrences. Work from here.
Regards
(I feel good today, doing people's work ;))
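Since the question is in Java, here is a rough Java sketch of the same idea without regular expressions. It assumes the input has the shape "N p1 ... pN seq1 seq2 ... end", which is my reading of the question rather than something it states outright; the class name PatternCounter and both method names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the input layout handled here is an assumption.
class PatternCounter {
    // Counts occurrences of pattern in sequence, optionally allowing overlapping hits.
    static int countOccurrences(String sequence, String pattern, boolean allowOverlap) {
        if (pattern.isEmpty()) {
            return 0;
        }
        int step = allowOverlap ? 1 : pattern.length();
        int count = 0;
        int from = 0;
        int idx;
        while ((idx = sequence.indexOf(pattern, from)) >= 0) {
            count++;
            from = idx + step; // resume right after the hit, or one char later for overlaps
        }
        return count;
    }

    // Parses "N p1 ... pN seq1 seq2 ... end" and returns the total count per pattern.
    static Map<String, Integer> report(String input, boolean allowOverlap) {
        String[] tokens = input.trim().toUpperCase().split("\\s+");
        int n = Integer.parseInt(tokens[0]);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 1; i <= n; i++) {
            int total = 0;
            for (int j = n + 1; j < tokens.length && !tokens[j].equals("END"); j++) {
                total += countOccurrences(tokens[j], tokens[i], allowOverlap);
            }
            counts.put(tokens[i], total);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(report("1 AA AATTCGAA end", false));       // prints {AA=2}
        System.out.println(report("3 AA TT PP AAATTCGAA end", true)); // prints {AA=3, TT=1, PP=0}
    }
}
```

With allowOverlap set to false the count matches what Regex.Matches would report; with true, a run such as AAA counts as two hits for AA.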
Really no need to put the number of searches. And, actually this could be done
with a single regex. I can't remember if Dot-Net supports the \G anchor,
but this is really not necessary anyway. I left it in.
Each Match:
Finds a new key.
Captures the key's sub-string matches at the end.
Advances the search position by just the key.
So, sit in a Find loop.
On each match print the 'Key' capture buffer,
then print the capture collection 'Values' count.
That's all there is to it.
The regex will search for overlapping keys. To change it to exclusive keys,
change the = to : as shown in the comments.
You can also make it a little more specific. For example, change all the \w's to [A-Z], etc...
The regex:
(?:
^ [ \d]*
| \G
)
(?<Key> \w+ ) #_(1)
[ ]+
(?=
(?: \w+ [ ]+ )*
(?= \w )
(?:
(?= # <- Change the = to : to get non-overlapped matches
(?<Values> \1 ) #_(2)
)
| .
)*
$
)
This is a perl test case
# $str = '2 6 AA TT PP AAATTCGAA';
# $count = 0;
#
# while ( $str =~ /(?:^[ \d]*|\G)(\w+)[ ]+(?=(?:\w+[ ]+)*(?=\w)(?:(?=(\1)(?{ $count++ }))|.)*$)/g )
# {
# print "search = '$1'\n";
# print "found = '$count'\n";
# $count = 0;
#
# }
#
# Output >>
#
# search = 'AA'
# found = '3'
# search = 'TT'
# found = '1'
# search = 'PP'
# found = '0'
#
#

Solution required to create Regex pattern

I am developing a Windows application in C#. I have been searching for a solution to my problem of creating a Regex pattern. I want to create a Regex pattern matching either of the following strings:
XD=(111111) XT=( 588.466)m3 YT=( .246)m3 G=( 3.6)V N=(X0000000000) M=(Y0000000000) O=(Z0000000000) Date=(06.01.01)Time=(00:54:55) Q=( .00)m3/hr
XD=(111 ) XT=( 588.466)m3 YT=( .009)m3 G=( 3.6)V N=(X0000000000) M=(Y0000000000) O=(Z0000000000) Date=(06.01.01)Time=(00:54:55) Q=( .00)m3/hr
The specific requirement is that I need all the values from the above given string which is a collection of key/value pairs. Also, would like to know the right approach (in terms of efficiency and performance) out of the two...Regex pattern matching or substring, for the above problem.
Thank you all in advance and let me know, if more details are required.
I don't know C#, so there probably is a better way to build a key/value array. I constructed a regex and handed it to RegexBuddy which generated the following code snippet:
StringCollection keyList = new StringCollection();
StringCollection valueList = new StringCollection();
StringCollection unitList = new StringCollection();
try {
    Regex regexObj = new Regex(
        @"(?<key>\b\w+)     # Match an alphanumeric identifier
        \s*=\s*             # Match a = (optionally surrounded by whitespace)
        \(                  # Match a (
        \s*                 # Match optional whitespace
        (?<value>[^()]+)    # Match the value string (anything except parens)
        \)                  # Match a )
        (?<unit>[^\s=]+     # Match an optional unit (anything except = or space)
        \b                  # which must end at a word boundary
        (?!\s*=)            # and not be an identifier (i. e. followed by =)
        )?                  # and is optional, as mentioned.",
        RegexOptions.IgnorePatternWhitespace);
    Match matchResult = regexObj.Match(subjectString);
    while (matchResult.Success) {
        keyList.Add(matchResult.Groups["key"].Value);
        valueList.Add(matchResult.Groups["value"].Value);
        unitList.Add(matchResult.Groups["unit"].Value);
        matchResult = matchResult.NextMatch();
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
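To illustrate how the key/value/unit pattern behaves, here is a minimal Java sketch of the same regex (Java also supports named groups). The class name KeyValueUnitParser, the method parse, and the choice to trim the value and to return an empty string for a missing unit are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the regex is the one from the answer above, written on one line.
class KeyValueUnitParser {
    private static final Pattern ENTRY = Pattern.compile(
            "(?<key>\\b\\w+)"                      // alphanumeric identifier
            + "\\s*=\\s*"                          // = with optional whitespace
            + "\\(\\s*"                            // opening paren and optional whitespace
            + "(?<value>[^()]+)"                   // value: anything except parens
            + "\\)"                                // closing paren
            + "(?<unit>[^\\s=]+\\b(?!\\s*=))?");   // optional unit that is not the next key

    // Returns one {key, value, unit} triple per entry; unit is "" when absent.
    static List<String[]> parse(String line) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(line);
        while (m.find()) {
            String unit = m.group("unit"); // null when the optional group did not participate
            entries.add(new String[] {
                m.group("key"), m.group("value").trim(), unit == null ? "" : unit });
        }
        return entries;
    }

    public static void main(String[] args) {
        String line = "XT=( 588.466)m3 Date=(06.01.01)Time=(00:54:55) Q=( .00)m3/hr";
        for (String[] e : parse(line)) {
            System.out.println(e[0] + " => " + e[1] + " [" + e[2] + "]");
        }
    }
}
```

The negative lookahead is what keeps Time from being read as the unit of Date when the two entries are not separated by a space.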
Regex re=new Regex(@"(\w+)\=\(([\d\s\.]+)\)");
MatchCollection m=re.Matches(s);
m[0].Groups will have { XD=(111111), XD, 111111 }
m[1].Groups will have { XT=( 588.466), XT, 588.466 }
String[] rows = { "XD=(111111) XT=( 588.466)m3 YT=( .246)m3 G=( 3.6)V N=(X0000000000) M=(Y0000000000) O=(Z0000000000) Date=(06.01.01)Time=(00:54:55) Q=( .00)m3/hr",
"XD=(111 ) XT=( 588.466)m3 YT=( .009)m3 G=( 3.6)V N=(X0000000000) M=(Y0000000000) O=(Z0000000000) Date=(06.01.01)Time=(00:54:55) Q=( .00)m3/hr" };
foreach (String s in rows) {
MatchCollection Pair = Regex.Matches(s, @"
(\S+) # Match all non-whitespace before the = and store it in group 1
= # Match the =
(\([^)]+\S+) # Match the part in brackets and following non-whitespace after the = and store it in group 2
", RegexOptions.IgnorePatternWhitespace);
foreach (Match item in Pair) {
Console.WriteLine(item.Groups[1] + " => " + item.Groups[2]);
}
Console.WriteLine();
}
Console.ReadLine();
If you also want to extract the units, use this regex:
@"(\S+)=(\([^)]+\)(\S*))"
I matched the closing bracket explicitly and added a set of brackets around the part that follows it, so you will find the unit (without the bracket) in item.Groups[3]