Regex returning extra empty value

Set Regex = New RegExp
Regex.Pattern = """[^""]*""|[^,]*"
Regex.Global = True
' I have a For loop here to loop through the records
text = Cells.Item(r, 7).Value
For Each Match In Regex.Execute(text)
    count = count + 1
Next Match
This is my regex code, and here is the table I am pulling the data from.
When I run the code in debug mode, the PCBaa count comes up as two, c3 and c4 come up as 14, and C6-c36 comes up as 36. Is my regex wrong for extracting the codes between the commas?

OK, I tried this myself. First off, it seems you don't reset the count value to 0 after each row. That could be intentional, but just so you know.
Second, the regular expression works nearly fine but always gives you double the expected count, because it matches a zero-length string after each real match.
So for the last line (C6-C26) it matches:
1) "C6" 2) "" 3) "C7" 4) "" ... and so on.
To be honest, I'm a little surprised myself and don't know exactly why that's the case for now.
But the solution is pretty easy: since you don't want zero-length strings in the result (so they don't get counted), simply exchange each * for a +, which tells the regular expression to match only if there's at least one character.
So your regular expression string should look like:
Regex.Pattern = """[^""]+""|[^,]+"
Why you got a count of 14 for c3, c4 surprises me... I got 4, which makes sense because of the double counting due to the zero-length matches.
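A likely explanation for the empty matches is that [^,]* is allowed to match zero characters, so after a code has been matched the engine tries again at the comma (and at the end of the string) and succeeds with an empty match. The sketch below illustrates this in Python rather than VBA, assuming Python 3.7 or later, whose re module reports empty matches in the same way here; the sample value "C3,C4" and the helper name codes are made up for illustration.

```python
import re

# Original pattern from the question: both alternatives may match zero characters.
STAR = re.compile(r'"[^"]*"|[^,]*')
# Suggested fix: require at least one character per match.
PLUS = re.compile(r'"[^"]+"|[^,]+')

def codes(pattern, text):
    """Return every match of pattern in text, including empty ones."""
    return [m.group(0) for m in pattern.finditer(text)]

sample = "C3,C4"  # hypothetical cell value
star_matches = codes(STAR, sample)  # contains empty strings next to the codes
plus_matches = codes(PLUS, sample)  # only the codes themselves
print(star_matches)
print(plus_matches)
```

Counting len(plus_matches) instead of len(star_matches) corresponds to the change from * to + in the VBA pattern.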

Related

Optional parts of regex pattern in vba

I am trying to build a regex pattern for text like this:
numb contra: 1.29151306 number mafo: 66662308
numb contra 1.30789668number mafo 60.046483
numb contra/ 1.29154056 number mafo: 666692638
numb contra 137459625
mafo: 666692638
mafo: 666692638 numb contra/ 1.29154056
Here's the pattern I could build
contra?.\s+?(\d+\.?\d+)(.+mafo.?\s+(\d+\.?\d+))?
It works fine for all the lines except the last one. How can I cover all the possibilities so that the last line is included too?
Please have a look at this link:
https://regex101.com/r/pSThAU/1
Everything is OK for contra but not for mafo.
I think the key here is to make your regexp do less and your VBA do more. What I see here is either the word 'mafo' or 'contra' with a number following it. We don't know the order, whether each one is present, or how many times each occurs. So you can scan each of your strings for ALL occurrences with a regexp like this:
(?:^|[^A-Z])(?:(mafo)|(contra))[^A-Z]\s*(\d*\.?\d+)
Then process it with some VBA code like this, which I created in Excel:
Sub BreakItUp()
    ' Early binding: requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim rg As RegExp, scanned As MatchCollection, eachMatch As Match, i As Long, col As Long
    Set rg = New RegExp
    rg.Pattern = "(?:^|[^A-Z])(?:(mafo)|(contra))[^A-Z]\s*(\d*\.?\d+)"
    rg.IgnoreCase = True
    rg.Global = True
    i = 1
    ' Walk down column A until the first empty cell
    Do While (Not IsEmpty(ActiveSheet.Cells(i, 1).Value))
        Set scanned = rg.Execute(ActiveSheet.Cells(i, 1).Value)
        col = 2
        For Each eachMatch In scanned
            ' Only one of SubMatches(0)/(1) is filled, so this yields "mafo" or "contra"
            ActiveSheet.Cells(i, col).Value = eachMatch.SubMatches(0) & eachMatch.SubMatches(1)
            ' The leading apostrophe stores the number as text
            ActiveSheet.Cells(i, col + 1).Value2 = "'" & eachMatch.SubMatches(2)
            col = col + 2
        Next eachMatch
        i = i + 1
    Loop
End Sub
The MatchCollection object gets one item for each match that occurs, and the SubMatches array contains each capturing group. You should be able to write your own logic within this processing loop to interpret what was extracted. When I ran it on your data it created all the fields in blue:
Notice that I added a line to your data that had two contra entries and one mafo, and it found all the occurrences. You should be able to modify this to interpret the meanings.
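For readers who want to try the scanning idea outside Excel, here is a minimal sketch in Python (not VBA) that applies the same pattern and collects (keyword, number) pairs per line. The function name scan is hypothetical, and re.IGNORECASE stands in for rg.IgnoreCase = True.

```python
import re

# Pattern from the answer; IGNORECASE mirrors rg.IgnoreCase = True.
PATTERN = re.compile(
    r"(?:^|[^A-Z])(?:(mafo)|(contra))[^A-Z]\s*(\d*\.?\d+)",
    re.IGNORECASE,
)

def scan(line):
    """Return (keyword, number) pairs in the order they occur in line."""
    pairs = []
    for m in PATTERN.finditer(line):
        keyword = m.group(1) or m.group(2)  # exactly one of the two groups is set
        pairs.append((keyword.lower(), m.group(3)))
    return pairs

lines = [
    "numb contra: 1.29151306 number mafo: 66662308",
    "numb contra 1.30789668number mafo 60.046483",
    "mafo: 666692638 numb contra/ 1.29154056",
]
for line in lines:
    print(scan(line))
```

Because every occurrence is reported separately, the order of mafo and contra in the input no longer matters; the calling code decides what to do with each pair.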

Removing zeros in between a string in Java

I want to remove zeros in a String.
For example,
String A = "AY000120";
then the output should be
AY120
So basically, any zeros between AY and the next nonzero digit should be removed. Also, a zero that occurs after a nonzero digit should not be deleted.
A regex would be very useful.
replace ^(AY)0*([1-9].*)
by \1\2 (written $1$2 in Java's replaceAll)
Or, if you know your input is in the fixed format AY+(zero or more 0)+(other numbers), you can just:
replace ^AY0* by AY
Looks like you are using Java, but here is a solution that I wrote in JavaScript. I think the regex part should work for you:
let str = "AY000120";
let result = str.replace(/(0*)(?=[1-9])/g, ''); // AY120
Post any questions you still have.
(0*) - matches any number of 0s (greedy)
(?=[1-9]) - positive lookahead to make sure a digit other than 0 follows
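The two approaches behave differently once zeros appear later in the string. The following sketch ports both to Python (not Java) to make the difference visible; the function names and the second sample input "AY100203" are made up for illustration. The anchored version only strips zeros directly after AY, while the lookahead version strips zeros in front of every nonzero digit.

```python
import re

def strip_after_prefix(s):
    """Drop only the zeros that directly follow the leading AY."""
    return re.sub(r"^(AY)0*([1-9].*)", r"\1\2", s)

def strip_before_nonzero(s):
    """Port of the lookahead version: drop zeros before any nonzero digit."""
    return re.sub(r"0*(?=[1-9])", "", s)

for sample in ("AY000120", "AY100203"):
    print(sample, strip_after_prefix(sample), strip_before_nonzero(sample))
```

If zeros after a nonzero digit must be preserved, as the question asks, the anchored version appears to be the safer choice.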

regex interval with possible characters before and after number VBA

I'm trying to produce a regular expression that can identify a number within an interval in a string in VBA. Sometimes this number has characters around it, other times not (inconsistent notation from a supplier). The expression should identify that 1413 in the three examples below is within the number range 500-2000 (or alternatively that it's not in the number range 0-50 or 51-499).
Example:
Test 12/2014. Tot.flow:1413 m3 or
Test 12/2014. Tot.flow:1413m3 or
Test 12/2014. Tot.flow: 1413
These strings have some identifiers:
there will always be a colon before the number
there may be a white space between the colon and the number
there may be a white space between the number and the m3
m3 is not necessarily always present, and if not, the number is at the end of the string
So far, my attempt at a regex that finds the number range is ([5-9][0-9][0-9]|[1]\d{3}|2000), but this matches all three-digit numbers as well (2001 gives a match on 200). However, I understand that I'm missing a couple of concepts needed to achieve the ultimate goal here. I guess my problems are as follows:
How to start the interval at something other than zero (I found lots of questions on intervals starting at zero)
How to take into account the variations in notation for both flow: and m3?
I'm only interested in checking that the number lies within the number range. This is driving me bonkers, all help is highly appreciated!
You can just extract the number with regExp.Replace() using the following regex:
^.*:\s*(\d+).*$
The replacement part is $1.
Then, use an ordinary numeric comparison to check whether the value is in the expected range (e.g. If CLng(result) > 499 And CLng(result) < 2001 Then ...).
Test macro:
Dim re As RegExp, tgt As String, src As String
Set re = New RegExp
With re
    .Pattern = "^.*:\s*(\d+).*$"
    .Global = False
End With
src = "Test 12/2014. Tot.flow: 1413"
tgt = re.Replace(src, "$1")
MsgBox (CLng(tgt) > 499 And CLng(tgt) < 2001)
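The extract-then-compare idea can be sketched in Python (not VBA) as follows. The function name flow_in_range and its default bounds are assumptions for illustration; the pattern is the one from the answer, and strings with no colon followed by digits are treated as out of range.

```python
import re

# Same idea as the answer: capture the digits after the last colon.
NUMBER_AFTER_COLON = re.compile(r"^.*:\s*(\d+).*$")

def flow_in_range(text, low=500, high=2000):
    """True if the number after the colon lies in [low, high]."""
    m = NUMBER_AFTER_COLON.match(text)
    if m is None:
        return False  # assumption: no colon followed by digits counts as out of range
    return low <= int(m.group(1)) <= high

for s in ("Test 12/2014. Tot.flow:1413 m3",
          "Test 12/2014. Tot.flow:1413m3",
          "Test 12/2014. Tot.flow: 1413"):
    print(s, flow_in_range(s))
```

Doing the range check numerically keeps the regex simple and makes it easy to change the bounds later.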
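You can try with:
:\s?([5-9]\d\d|1\d{3}|2000)\s?(m3|$)
The trailing (m3|$) requires the number to be followed by m3 or by the end of the string, so a longer number is not matched on its first digits. Also, your regex ([5-9][0-9][0-9]|[1]\d{3}|2000) is fine in my opinion; it should not match numbers below 500 or above 2000.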

can any one tell me regular expression for this? UPDATED

I tried many patterns in Visual Studio and on this site, but nothing helped.
The expression would be _ct_(anyDigitHere)_
like
adkasjflasdjfsdf asdfkl sjfdsf _ct150_ asdfasd // so it would match this _ct150_
any thing here doens't matter Random stuff..afd a&r9qwr89 ((
_ct415487_ anything here doesn't matter // this will match _ct415487_
basically any _ctAndAnyNumberHere_ (underscore at start and end)
A couple I tried: ^*(_ct)(:z)(+_)+*$, ^*(_ct[0-9]+_)+*$. But neither helps!
EDIT
Thanks for the replies. That did work, but now my problem is replacing those matched elements with a hidden field value. Say:
if the value in the hidden field has 1 digit (any value from 0-9), I have to take the last digit of the matched expression and replace it with the value in the hidden field.
if the value in the hidden field has 2 digits (any value from 0-99), I have to take the last two digits of the matched expression and replace them with the value in the hidden field.
So, basically:
if the value in the hidden field has n digits, I have to take the last n digits of the matched expression and replace them with the value in the hidden field.
How do I do that?
I don't know which Visual Studio language you're talking about, but this should work:
_ct\d+_
or this:
_ct([0-9]+)_
EDIT:
Regex rg = new Regex("_ct([0-9]+)_");
string text = "any thing here doens't matter Random stuff..afd a&r9qwr89 ((_ct415487_ anything here doesn't matter";
var match = rg.Match(text).Groups[1].Value;
int sizeHiddenField = HiddenField1.Value.Length;
var newString = text.Replace(match, match.Substring(0, match.Length - sizeHiddenField) + HiddenField1.Value);
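As an illustration of the "replace the last n digits" rule, here is a small Python sketch (not C#). The function name replace_tail is hypothetical, and leaving the token unchanged when the hidden value is longer than the digit run is an assumption, since the question does not say what should happen in that case.

```python
import re

def replace_tail(text, hidden):
    """Overwrite the last len(hidden) digits of each _ct<digits>_ token with hidden."""
    n = len(hidden)

    def swap(m):
        digits = m.group(1)
        if n > len(digits):
            return m.group(0)  # assumption: leave the token alone if hidden is longer
        return "_ct" + digits[:len(digits) - n] + hidden + "_"

    return re.sub(r"_ct([0-9]+)_", swap, text)

print(replace_tail("Random stuff ((_ct415487_ anything here", "9"))
print(replace_tail("Random stuff ((_ct415487_ anything here", "12"))
```

Using a replacement function confines the change to the matched token, so the same digit sequence elsewhere in the text is left untouched.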
/_ct\d*_/
This is a regular expression for your problem. Give it a try. Note that \d* also accepts _ct_ with no digits at all.

Keypress ISSUE VB.NET

I have spent many hours trying to solve this problem, without success.
All I need is to validate a textbox:
Valid strings:
10%
0%
1111111.12%
15.2%
10
2.3
Invalid strings:
.%
12.%
.02%
%
123456789123.123
I need to validate the textbox against these valid strings, using the KeyPress event.
I tried:
Private Sub prices_KeyPress(ByVal sender As Object, ByVal e As System.Windows.Forms.KeyPressEventArgs) Handles wholeprice_input_new_item.KeyPress, dozenprice_input_new_item.KeyPress, _
        detailprice_input_new_item.KeyPress, costprice_input_new_item.KeyPress
    Dim TxtB As TextBox = CType(sender, TextBox)
    Dim fullText As String = TxtB.Text & e.KeyChar
    Dim rex As Regex = New Regex("^[0-9]{1,9}([\.][0-9]{1,2})?[\%]?$ ")
    If (Char.IsDigit(e.KeyChar) Or e.KeyChar.ToString() = "." Or e.KeyChar = CChar(ChrW(Keys.Back))) Then
        If (fullText.Trim() <> "") Then
            If (rex.IsMatch(fullText) = False And e.KeyChar <> CChar(ChrW(Keys.Back))) Then
                e.Handled = True
                MessageBox.Show("You are Not Allowed To Enter More then 2 Decimal!!")
            End If
        End If
    Else
        e.Handled = True
    End If
End Sub
NOTE: The regex has to validate a maximum of 2 decimal places and 9 integer digits, with an optional percent symbol.
Please help, I feel so frustrated trying to solve this problem without success.
I think that you almost had the right answer. When I run your regex against the samples you supplied, they all fail. But if I remove the extra space at the end of the regex I get the expected successes and failures.
So currently your regex looks like this:
Dim rex As Regex = New Regex("^[0-9]{1,9}([\.][0-9]{1,2})?[\%]?$ ")
and it should look like
Dim rex As Regex = New Regex("^[0-9]{1,9}([\.][0-9]{1,2})?[\%]?$")
EDIT:
OK, I understand the issue better now. The problem with the regex is that it only allows a period if it is followed by one or two digits. That works fine if you are evaluating the textbox value after someone has finished typing. But in your code you are evaluating on each keypress, so you never get a chance to type a digit after the "."
I can see two possible solutions:
1. Change the regex to allow 1. as a valid entry
2. Change when you evaluate the regex, perhaps by finding a way to evaluate it only when the person has paused typing.
If you go with option 1, then we need to tweak the regex to something like this:
"^[0-9]{1,9}((\.)|(\.[0-9]{1,2}(%)?)|(%))?$"
I changed the regex so that it accepts three optional endings to the text string: (\.) allows the string to end in a period, (\.[0-9]{1,2}(%)?) allows the string to end in a period followed by one or two digits and an optional percent sign, and (%) allows the string to end in a percent sign. I broke the ending into these three options because I didn't want something like 12.% to be valid. For this to work, you will also need to add the percent sign to your first If statement:
If (Char.IsDigit(e.KeyChar) Or e.KeyChar.ToString() = "." Or e.KeyChar.ToString() = "%" Or e.KeyChar = CChar(ChrW(Keys.Back))) Then
so that the regex runs when someone types the percent sign.
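To check the patterns against the sample lists from the question, here is a short Python sketch (not VB.NET). The names BROKEN, FINAL and TYPING are labels made up for this example: BROKEN is the original pattern with the trailing space, FINAL is the same pattern without it, and TYPING is the keypress-tolerant variant from option 1.

```python
import re

BROKEN = re.compile(r"^[0-9]{1,9}([\.][0-9]{1,2})?[\%]?$ ")  # trailing space, as in the question
FINAL = re.compile(r"^[0-9]{1,9}([\.][0-9]{1,2})?[\%]?$")    # corrected pattern
TYPING = re.compile(r"^[0-9]{1,9}((\.)|(\.[0-9]{1,2}(%)?)|(%))?$")  # tolerates a trailing "."

VALID = ["10%", "0%", "1111111.12%", "15.2%", "10", "2.3"]
INVALID = [".%", "12.%", ".02%", "%", "123456789123.123"]

for s in VALID + INVALID + ["12."]:
    print(repr(s), bool(BROKEN.search(s)), bool(FINAL.search(s)), bool(TYPING.search(s)))
```

In this sketch the only input on which FINAL and TYPING disagree is the half-typed "12.", which is exactly the intermediate state a KeyPress handler has to let through.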