Count how many times new line is present? - regex

For example,
string="help/nsomething/ncrayons"
Output:
String word count is: 3
This is what I have, but the program loops through the method several times and it looks like I am only getting the last string created. Here's the code block:
Regex regx = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
MatchCollection matches = regx.Matches(output);
//int counte = 0;
foreach (Match match in matches)
{
//counte = counte + 1;
links = links + match.Value + '\n';
if (links != null)
{
string myString = links;
string[] words = Regex.Split(myString, @"\n");
word_count.Text = words.Length.ToString();
}
}
The escape sequence for a newline is \n, not /n.

Not sure if regex is a must for your case but you could use split:
string myString = "help/nsomething/ncrayons";
string[] separator = new string[] { "/n" };
string[] result = myString.Split(separator, StringSplitOptions.None);
MessageBox.Show(result.Count().ToString());
Another way using regex:
string myString = "help/nsomething/ncrayons";
string[] words = Regex.Split(myString, @"/n");
word_count.Text = words.Length.ToString();
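For comparison, the same split-and-count idea can be sketched in JavaScript rather than the C# of the question. The helper name countPieces is hypothetical. It drops empty pieces, so a trailing separator (such as the '\n' the question's loop appends after every match) does not inflate the count.

```javascript
// Count the non-empty pieces of `text` that are separated by `separator`.
// `countPieces` is a made-up helper name, not part of any library.
function countPieces(text, separator) {
  return text
    .split(separator)                      // "a\nb\n" -> ["a", "b", ""]
    .filter((piece) => piece.length > 0)   // drop the empty trailing piece
    .length;
}

console.log(countPieces("help\nsomething\ncrayons\n", "\n"));
console.log(countPieces("help/nsomething/ncrayons", "/n"));
```

Counting once after the loop has finished, instead of on every iteration, avoids overwriting the displayed value with intermediate results.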

Related

Regex Matching using Matcher and Pattern

I am trying to apply regexes to a number based on the conditions below, but it is returning an empty string.
import java.util.regex.Matcher
import java.util.regex.Pattern
object clean extends App {
val ALPHANUMERIC: Pattern = Pattern.compile("^[a-zA-Z0-9]*$")
val SPECIALCHAR: Pattern = Pattern.compile("[a-zA-Z0-9\\-#\\.\\(\\)\\/%&\\s]")
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
val TRAILINGZEROES: Pattern = Pattern.compile("\\.0*$|(\\.\\d*?)0+$")
def evaluate(codes: String): String = {
var str2: String = codes.toString
var text:Matcher = LEADINGZEROES.matcher(str2)
str2 = text.replaceAll("")
text = ALPHANUMERIC.matcher(str2)
str2 = text.replaceAll("")
text = SPECIALCHAR.matcher(str2)
str2 = text.replaceAll("")
text = TRAILINGZEROES.matcher(str2)
str2 = text.replaceAll("")
str2
}
}
The code returns an empty string for the LEADINGZEROES match.
scala> println("cleaned value :" + evaluate("0001234"))
cleaned value :
What should I change to make the code work as I expect? Basically, I am trying to remove leading/trailing zeroes, and if the number contains special characters or alphanumeric values, the entire value should be returned as null.
Your LEADINGZEROES pattern works correctly, as
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
println(LEADINGZEROES.matcher("0001234").replaceAll(""))
gives
//1234
But the next step,
text = ALPHANUMERIC.matcher(str2)
matches the whole string whenever it is purely alphanumeric, so replaceAll("") replaces it with "" and leaves str2 empty.
For example, when you do
val ALPHANUMERIC: Pattern = Pattern.compile("^[a-zA-Z0-9]*$")
val LEADINGZEROES: Pattern = Pattern.compile("^[0]+(?!$)")
println(ALPHANUMERIC.matcher(LEADINGZEROES.matcher("0001234").replaceAll("")).replaceAll(""))
it prints an empty string.
Updated
As you have commented
if there is a code that is alphanumeric i want to make that value NULL
but in case of leading or trailing zeroes its pure number, which should return me the value after removing zeroes
but its also returning null for trailing and leading zeroes matches
and also how can I skip a match , suppose i want the regex to not match the number 0999 for trimming leading zeroes
You can write your evaluate function and regexes as below
val LEADINGTRAILINGZEROES = """(0*)(\d{4})(0*)""".r
val ALPHANUMERIC = """[a-zA-Z]""".r
def evaluate(codes: String): String = {
val LEADINGTRAILINGZEROES(first, second, third) = if(ALPHANUMERIC.findAllIn(codes).length != 0) "0010" else codes
if(second.equalsIgnoreCase("0010")) "NULL" else second
}
which should give you
println("cleaned value : " + evaluate("000123400"))
// cleaned value : 1234
println("alphanumeric : " + evaluate("0001A234"))
// alphanumeric : NULL
println("skipping : " + evaluate("0999"))
// skipping : 0999
I hope the answer is helpful
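The requirement stated in the question (return null for anything that is not purely numeric, otherwise trim zeroes) can also be approached by validating first and trimming second. Below is a minimal JavaScript sketch of that ordering, not the Scala from the answer. The function name cleanCode and the validation pattern are assumptions; the two trimming patterns are the LEADINGZEROES and TRAILINGZEROES patterns from the question. It trims trailing zeroes only after a decimal point, as the question's pattern does, and it does not implement the "skip 0999" rule.

```javascript
// Hypothetical helper: returns null for non-numeric codes,
// otherwise the code with leading and trailing zeroes trimmed.
function cleanCode(code) {
  // Assumption: a valid code is digits with an optional decimal part.
  if (!/^\d+(\.\d+)?$/.test(code)) {
    return null;
  }
  // Same idea as LEADINGZEROES: strip leading zeroes but keep a lone "0".
  let cleaned = code.replace(/^0+(?!$)/, "");
  // Same idea as TRAILINGZEROES: drop ".000", or trailing zeroes after ".5".
  cleaned = cleaned.replace(/\.0*$|(\.\d*?)0+$/, "$1");
  return cleaned;
}

console.log(cleanCode("0001234"));   // "1234"
console.log(cleanCode("0001A234"));  // null
console.log(cleanCode("12.500"));    // "12.5"
```

Running the validation before any replaceAll-style step avoids the original problem, where a pattern meant as a check ended up erasing the value.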

RegularExpression get strings between new lines

I want to capture every string that sits on its own line, using a regular expression.
string someStr = "first
second
third
"
Expected result:
string str1 = "first";
string str2 = "second";
string str3 = "third";
Or, if you just want the first word of each line:
^(\w+).*$ with multi-line flag.
Regex101 has a nice regex testing tool: https://regex101.com/r/JF3cKR/1
Just split it with "\n":
someStr.split("\n")
And you can filter the empty strings if you'd like
Or if you really want regex, do /^.*$/ with multiline flag
List<String> listOfLines = new ArrayList<String>();
Pattern pattern = Pattern.compile("^.*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("first\nsecond\nthird\n");
while (matcher.find()) {
listOfLines.add(matcher.group());
}
Then you have:
listOfLines.get(0) = first
listOfLines.get(1) = second
listOfLines.get(2) = third
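The two simpler suggestions above (split and filter out empty strings, or take the first word of each line with a multiline regex) can be sketched as follows. This is JavaScript rather than the Java or C# used in the thread, and the function names nonEmptyLines and firstWords are hypothetical.

```javascript
// Split on newlines (also tolerating Windows "\r\n") and drop empty lines.
function nonEmptyLines(text) {
  return text.split(/\r?\n/).filter((line) => line.length > 0);
}

// First word of each line: ^(\w+).*$ with the global and multiline flags.
function firstWords(text) {
  return Array.from(text.matchAll(/^(\w+).*$/gm), (match) => match[1]);
}

console.log(nonEmptyLines("first\nsecond\nthird\n"));
console.log(firstWords("first one\nsecond two\nthird"));
```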
You can use the following regex:
(\w+)(?=\n|"|$)
see demo

Regex pattern misses match on a 2 char word

Using regex101 I have developed this regex:
^(\S+)\s_(\S)(\S[^;\s]+)?.*
This works great 99.999% of the time, but occasionally it is run against a string containing a 2-char word that should have matched.
For example it would normally capture...
string _asdf = string.empty;
bool _ttfnow;
//$1 = string
//$2 = a
//$3 = sdf
and
//$1 = bool
//$2 = t
//$3 = tfnow
But for some reason this fails to match the third group?
string _qw = string.empty;
//$1 = string
//$2 = q
//$3 =
Again using regex101, if I add a char it suddenly matches:
string _qwx = string.empty;
//$1 = string
//$2 = q
//$3 = wx
Any ideas? Thank You
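^(\S+)\s_(\S)(\S[^;\s]*)?.*
                     ^^
Just change the quantifier. See demo.
https://regex101.com/r/pG1kU1/33
Change [^;\s]+ to [^;\s]*. The third group \S[^;\s]+ requires at least two characters after the one captured by $2, so for _qw only w is left and the optional group matches nothing. With * a single remaining character is enough.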
/^(\S+)\s_(\S)(\S[^;\s]*)?.*/
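To see the difference between the two quantifiers side by side, here is a small JavaScript sketch that runs both patterns against the lines from the question. The variable names and the helper captureGroups are made up for illustration.

```javascript
// Original pattern: third group needs at least two characters.
const strictPattern = /^(\S+)\s_(\S)(\S[^;\s]+)?.*/;
// Fixed pattern: third group needs at least one character.
const relaxedPattern = /^(\S+)\s_(\S)(\S[^;\s]*)?.*/;

// Return the three capture groups, or null if the line does not match.
function captureGroups(pattern, line) {
  const match = pattern.exec(line);
  return match ? [match[1], match[2], match[3]] : null;
}

const shortName = "string _qw = string.empty;";
console.log(captureGroups(strictPattern, shortName));   // third group is undefined
console.log(captureGroups(relaxedPattern, shortName));  // third group is "w"
console.log(captureGroups(relaxedPattern, "bool _ttfnow;"));
```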

How to highlight a string within a string ignoring whitespace and non alphanumeric chars?

What is the best way to produce a highlighted string found within another string?
I want to ignore all characters that are not alphanumeric but retain them in the final output.
So for example a search for 'PC3000' in the following 3 strings would give the following results:
ZxPc 3000L = Zx<font color='red'>Pc 3000</font>L
ZXP-C300-0Y = ZX<font color='red'>P-C300-0</font>Y
Pc3 000 = <font color='red'>Pc3 000</font>
I have the following code, but the only way I can highlight the search term within the result is to remove all whitespace and non-alphanumeric characters and then set both strings to lowercase. I'm stuck!
public string Highlight(string Search_Str, string InputTxt)
{
// Setup the regular expression and add the Or operator.
Regex RegExp = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
// Highlight keywords by calling the delegate each time a keyword is found.
string Lightup = RegExp.Replace(InputTxt, new MatchEvaluator(ReplaceKeyWords));
if (Lightup == InputTxt)
{
Regex RegExp2 = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
RegExp2.Replace(" ", "");
Lightup = RegExp2.Replace(InputTxt.Replace(" ", ""), new MatchEvaluator(ReplaceKeyWords));
int Found = Lightup.IndexOf("<font color='red'>");
if (Found == -1)
{
Lightup = InputTxt;
}
}
RegExp = null;
return Lightup;
}
public string ReplaceKeyWords(Match m)
{
return "<font color='red'>" + m.Value + "</font>";
}
Thanks guys!
Alter your search string by inserting an optional non-alphanumeric character class ([^a-z0-9]?) between each character. Instead of PC3000 use
P[^a-z0-9]?C[^a-z0-9]?3[^a-z0-9]?0[^a-z0-9]?0[^a-z0-9]?0
With case-insensitive matching (RegexOptions.IgnoreCase, as in your code), this matches Pc 3000, P-C300-0 and Pc3 000.
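Building that pattern by hand gets tedious, so it can be generated from the search string. The following is a JavaScript sketch of the idea, not the C# from the question. The names escapeRegExp and highlight are hypothetical, and the sketch assumes the search term itself contains no characters that should be skipped.

```javascript
// Escape characters that have a special meaning inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every match of `search` in `input` in a red <font> tag,
// allowing one optional non-alphanumeric character between letters.
function highlight(search, input) {
  const pattern = Array.from(search)
    .map(escapeRegExp)
    .join("[^a-z0-9]?");
  const regex = new RegExp(pattern, "gi"); // g: all matches, i: ignore case
  return input.replace(regex, (found) => "<font color='red'>" + found + "</font>");
}

console.log(highlight("PC3000", "ZxPc 3000L"));
console.log(highlight("PC3000", "ZXP-C300-0Y"));
console.log(highlight("PC3000", "Pc3 000"));
```

Changing the `?` in the joined separator to `*` would allow more than one ignored character between letters, at the cost of looser matches.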
One way to do this would be to create a version of the input string that only contains alphanumerics and a lookup array that maps character positions from the new string to the original input. Then search the alphanumeric-only version for the keyword(s) and use the lookup to map the match positions back to the original input string.
Pseudo-code for building the lookup array:
cleanInput = "";
lookup = [];
lookupIndex = 0;
for ( index = 0; index < input.length; index++ ) {
if ( isAlphaNumeric(input[index]) ) {
cleanInput += input[index];
lookup[lookupIndex] = index;
lookupIndex++;
}
}
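The pseudo-code above only builds the lookup array; a runnable JavaScript sketch of the whole approach might look like this. The function name highlightViaLookup is hypothetical, the sketch assumes the search term is purely alphanumeric, and it highlights only the first occurrence.

```javascript
function highlightViaLookup(search, input) {
  let cleanInput = "";
  const lookup = []; // lookup[i] = index in `input` of cleanInput[i]

  for (let index = 0; index < input.length; index++) {
    if (/[a-z0-9]/i.test(input[index])) {
      cleanInput += input[index].toLowerCase();
      lookup.push(index);
    }
  }

  const needle = search.toLowerCase();
  const position = cleanInput.indexOf(needle);
  if (needle.length === 0 || position === -1) {
    return input; // nothing to highlight
  }

  // Map the match boundaries back to positions in the original string.
  const start = lookup[position];
  const end = lookup[position + needle.length - 1] + 1;
  return input.slice(0, start)
    + "<font color='red'>" + input.slice(start, end) + "</font>"
    + input.slice(end);
}

console.log(highlightViaLookup("PC3000", "ZXP-C300-0Y"));
```

Unlike the generated-regex approach, this tolerates any number of ignored characters between two letters of the search term.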

In a RegEx with multiple subexpressions (i.e. using parenthesis), how do I know which one it matched?

So, for example:
//The string to search through
var str = "This is a string /* with some //stuff in here";
//I'm matching three possible things: "here" or "//" or "/*"
var regEx = new RegExp( "(here)|(\\/\\/)|(\\/\\*)", "g" );
//Loop and find them all
while ( match = regEx.exec( str ) )
{
//Which one is matched? The first parenthesis subexpression? The second?
alert( match[ 0 ] );
}
How do I know I matched the "(//)" instead of the "(here)" without running another regex against the returned match?
You can check which group is defined:
var str = "This is a string /* with some //stuff in here";
var regEx = /(here)|(\/\/)|(\/\*)/g;
while(match = regEx.exec(str)){
var i;
for(i = 1; i < 3; i++){
if(match[i] !== undefined)
break;
}
alert("matched group " + i + ": " + match[i]);
}
Running at http://jsfiddle.net/zLD5V/
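In engines that support named capture groups (ES2018) and String.prototype.matchAll (ES2020), the same check can be written without counting group indexes. This is a sketch under that assumption; the group names word, line and block and the function classifyMatches are invented for the example.

```javascript
// Report which alternative produced each match by looking for the
// one named group that is not undefined.
function classifyMatches(str) {
  const regEx = /(?<word>here)|(?<line>\/\/)|(?<block>\/\*)/g;
  const results = [];
  for (const match of str.matchAll(regEx)) {
    const name = Object.keys(match.groups)
      .find((key) => match.groups[key] !== undefined);
    results.push({ name: name, text: match[0], index: match.index });
  }
  return results;
}

const sample = "This is a string /* with some //stuff in here";
for (const result of classifyMatches(sample)) {
  console.log(result.name + " matched " + result.text + " at " + result.index);
}
```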