Removing zeros in between a string in Java - regex

I want to remove zeros in a String.
For example,
String A = "AY000120";
then the output should be
AY120
So basically any zeros between AY and the next digit greater than 0 should be removed. Also, a zero that occurs after a nonzero digit should not be deleted.
A regex would be very useful.

replace ^(AY)0*([1-9].*)
by \1\2 (in Java's String.replaceAll the group references are written $1$2)
Or, if you know your input is in the fixed format AY + (zero or more 0s) + (other digits), you can just:
replace ^AY0* by AY
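As a minimal sketch of the anchored replacement above, here it is in Python rather than Java (a choice made purely for illustration; the function name strip_zeros_after_prefix is hypothetical). Python, like the answer, writes the group references as \1\2.

```python
import re

def strip_zeros_after_prefix(code: str) -> str:
    """Drop the zeros between the AY prefix and the first nonzero digit."""
    # ^(AY)      keep the prefix
    # 0*         the run of zeros to drop
    # ([1-9].*)  the first nonzero digit and everything after it
    return re.sub(r'^(AY)0*([1-9].*)', r'\1\2', code)

print(strip_zeros_after_prefix("AY000120"))  # AY120
```

Because the pattern is anchored at the start of the string, zeros that come after the first nonzero digit are left alone, and a string with no nonzero digit is returned unchanged.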

It looks like you are using Java, but here is a solution that I wrote in JavaScript. I think the regex part should work for you.
let str = "AY000120";
let result = str.replace(/(0*)(?=[1-9])/g, ''); // AY120
Post any questions if you still have them.
(0*) - matches any number of 0s (greedy)
(?=[1-9]) - positive lookahead to make sure a digit other than 0 follows
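One thing worth checking with the lookahead approach is that it is not anchored to the AY prefix, so it removes every run of zeros that is directly followed by a nonzero digit, including a zero that sits between two nonzero digits. A small Python sketch of the same pattern (Python is used only for illustration; remove_zeros_before_nonzero is a hypothetical name):

```python
import re

def remove_zeros_before_nonzero(text: str) -> str:
    # Delete every run of zeros that is directly followed by a digit 1-9.
    return re.sub(r'0*(?=[1-9])', '', text)

print(remove_zeros_before_nonzero("AY000120"))   # AY120
print(remove_zeros_before_nonzero("AY0001020"))  # AY120 (the zero between 1 and 2 is removed too)
```

If interior zeros must be kept, as the question asks, an anchored pattern is the safer choice.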

Related

Regex to get values of format Number-Decimal-Number (eg 1.2)

I need to ensure that the text entered in a textbox has a specific format: a number taken from a variable, then a decimal point, then any other number (1.10, 2.6 etc.). The important bit is that the first number must come from a variable, followed by a decimal point and another number.
I have not been able to find anything specific enough, and the Regex functionality looks like it needs some investigation to understand. A quick result here would be great though!
Instinctively (although I didn't expect it to work) I tried:
If System.Text.RegularExpressions.Regex.IsMatch(txbCriterionNo.Text, OutcomeNo.ToString() + "." + "^[0-9]+$") Then
...
where OutcomeNo is an integer variable - so I hope you can see what I am aiming to get. So, the format MUST be integer variable - decimal point then another integer value.
What should work:
1.5 or 5.42 or 10.5
What shouldn't work:
.14 or a.1 or 1.c
etc...
Thanks!
Chris85 pointed me in the right direction, but I also needed to ensure that the first number matched a variable value so I have arrived at the following which works a treat...
If System.Text.RegularExpressions.Regex.IsMatch(txbCriterionNo.Text, "^\d+\.\d+$") AndAlso txbCriterionNo.Text.Substring(0, InStr(txbCriterionNo.Text, ".") - 1) = OutcomeNo.ToString() Then
Here we first use the regex "^\d+\.\d+$" to make sure the format is correct, [number][decimal point][number], and then a second check gets the position of the decimal point and uses it to extract the substring to compare against my variable OutcomeNo.
Thanks all!!
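An alternative to the two-step check is to build the variable into the pattern itself, so one match covers both the format and the leading number. A rough sketch in Python (the original is VB.NET; matches_outcome and outcome_no are hypothetical names, and outcome_no is assumed to be a non-negative integer):

```python
import re

def matches_outcome(text: str, outcome_no: int) -> bool:
    """True if text is '<outcome_no>.<digits>', e.g. '10.5' for outcome_no=10."""
    # re.escape guards against any regex metacharacters in the inserted value.
    pattern = re.escape(str(outcome_no)) + r'\.\d+'
    # fullmatch requires the whole string to match, like ^...$ in the answer.
    return re.fullmatch(pattern, text) is not None

print(matches_outcome("10.5", 10))  # True
print(matches_outcome("10.5", 1))   # False
```

The same idea should carry over to VB.NET by concatenating OutcomeNo.ToString() into the pattern string, though that variant is not tested here.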
TextBox: This will allow only digits and a dot to be entered, and the text will have to start with a digit.
Private Sub txtValue_KeyPress(ByVal sender As Object, ByVal e As System.Windows.Forms.KeyPressEventArgs) Handles txtValue.KeyPress
    ' The sender is the TextBox that raised the event.
    Dim box As TextBox = DirectCast(sender, TextBox)
    ' Accept digits, control keys (e.g. Backspace), and a single dot.
    If Not (Char.IsDigit(e.KeyChar) OrElse Char.IsControl(e.KeyChar) OrElse (e.KeyChar = "."c AndAlso box.Text.IndexOf(".") < 0)) Then
        e.Handled = True
        ' Clear the box if its text has ended up starting with a dot.
        If box.Text.StartsWith(".") Then
            box.Text = ""
        End If
    End If
End Sub

regex interval with possible characters before and after number VBA

I'm trying to produce a regular expression that can identify a number within an interval in a string in VBA. Sometimes this number has characters around it, other times not (inconsistent notation from a supplier). The expression should identify that 1413 in the three examples below is within the number range 500-2000 (or alternatively that it's not in the number range 0-50 or 51-499).
Example:
Test 12/2014. Tot.flow:1413 m3 or
Test 12/2014. Tot.flow:1413m3 or
Test 12/2014. Tot.flow: 1413
These strings have some identifiers:
there will always be a colon before the number
there may be a white space between the colon and the number
there may be a white space between the number and the m3
m3 is not necessarily always present, and if not, the number is at the end of the string
So far, my attempt at a regex that finds the number range is ([5-9][0-9][0-9]|[1]\d{3}|2000), but this matches all three-digit numbers as well (2001 gives a match on 200). However, I understand that I'm missing a couple of concepts needed to achieve the ultimate goal here. I guess my problems are as follows:
How to start the interval at something not being zero (found lots of questions on intervals starting on zero)
How to take into account the variations in notation both for flow: and m3?
I'm only interested in checking that the number lies within the number range. This is driving me bonkers, all help is highly appreciated!
You can just extract the number with regExp.Replace() using the following regex:
^.*:\s*(\d+).*$
The replacement part is $1.
Then, use an ordinary numeric comparison to check whether the value is in the expected range (e.g. If CLng(result) > 499 And CLng(result) < 2001 Then ...).
Test macro:
Dim re As RegExp, tgt As String, src As String
Set re = New RegExp
With re
    .pattern = "^.*:\s*(\d+).*$"
    .Global = False
End With
src = "Test 12/2014. Tot.flow: 1413"
tgt = re.Replace(src, "$1")
MsgBox (CLng(tgt) > 499 And CLng(tgt) < 2001)
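The extract-then-compare idea can be sketched in Python as follows (the thread is about VBA; flow_in_range and its low/high parameters are hypothetical names chosen for this illustration). The pattern is the one from the macro above, and the sketch returns False when no number follows a colon, a case the macro does not handle.

```python
import re

def flow_in_range(line: str, low: int = 500, high: int = 2000) -> bool:
    """Pull the number after the last colon and compare it numerically."""
    m = re.match(r'^.*:\s*(\d+).*$', line)
    if m is None:
        return False  # no colon followed by digits
    return low <= int(m.group(1)) <= high

for sample in ("Test 12/2014. Tot.flow:1413 m3",
               "Test 12/2014. Tot.flow:1413m3",
               "Test 12/2014. Tot.flow: 1413"):
    print(flow_in_range(sample))  # True for all three
```

Doing the range check on the integer rather than in the regex keeps the pattern simple and makes other intervals (0-50, 51-499) a matter of changing two numbers.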
You can try:
:\s?([5-9]\d\d|1\d{3}|2000)\s?(m3|$)
Also, your regex ([5-9][0-9][0-9]|[1]\d{3}|2000) is fine in my opinion; it should not match numbers below 500 or above 2000.
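To see how the surrounding context keeps the range pattern from matching part of a longer number, here is a small Python sketch (Python is used for illustration only, and number_in_500_2000 is a hypothetical helper). It assumes the number is followed either by m3 or by the end of the string, and writes the latter as $.

```python
import re

# Colon, optional space, a number in 500-2000, optional space, then m3 or end of string.
IN_RANGE = re.compile(r':\s?([5-9]\d\d|1\d{3}|2000)\s?(?:m3|$)')

def number_in_500_2000(line: str):
    """Return the matched number as a string, or None if it is out of range."""
    m = IN_RANGE.search(line)
    return m.group(1) if m else None

print(number_in_500_2000("Test 12/2014. Tot.flow:1413 m3"))  # 1413
print(number_in_500_2000("Test 12/2014. Tot.flow: 2001"))    # None
```

Without the colon in front and the m3 or end-of-string check behind, an input such as 5000 would match on its first three digits.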

VB.Net Beginner: Replace with Wildcards, Possibly RegEx?

I'm converting a text file to a Tab-Delimited text file, and ran into a bit of a snag. I can get everything I need to work the way I want except for one small part.
One field I'm working with has the home addresses of the subjects as a single entry ("1234 Happy Lane Somewhere, St 12345") and I need each broken down by Street(Tab)City(Tab)State(Tab)Zip. The one part I'm hung up on is the Tab between the State and the Zip.
I've been using input=input.Replace throughout, and it's worked well so far, but I can't think of how to untangle this one. The wildcards I'm used to don't seem to be working, I can't replace ("?? #####") with ("??" + ControlChars.Tab + "#####")...which I honestly didn't expect to work, but it's the only idea on the matter I had.
I've read a bit about using Regex, but have no experience with it, and it seems a bit...overwhelming.
Is Regex my best option for this? If not, are there any other suggestions on solutions I may have missed?
Thanks for your time. :)
EDIT: Here's what I'm using so far. It makes some edits to the line in question, taking care of spaces, commas, and other text I don't need, but I've got nothing for the State/Zip situation; I've a bad habit of wiping something if it doesn't work, but I'll append the last thing I used to the very end, if that'll help.
If input Like "Guar*###/###-####" Then
input = input.Replace("Guar:", "")
input = input.Replace(" ", ControlChars.Tab)
input = input.Replace(",", ControlChars.Tab)
input = "C" + ControlChars.Tab + strAccount + ControlChars.Tab + input
End If
input = System.Text.RegularExpressions.Regex.Replace(" #####", ControlChars.Tab + "#####") <-- Just one example of something that doesn't work.
This is what's written to input in this example
" Guar: LASTNAME,FIRSTNAME 999 E 99TH ST CITY,ST 99999 Tel: 999/999-9999"
And this is what I can get as a result so far
C 99999/9 LASTNAME FIRSTNAME 999 E 99TH ST CITY ST 99999 999/999-9999
With everything being exactly what I need besides the "ST 99999" bit (with actual data obviously omitted for privacy and professional whatnots).
UPDATE: Just when I thought it was all squared away, I've got another snag. The raw data gives me this.
# TERMINOLOGY ######### ##/##/#### # ###.##
And the end result is giving me this, because this is a chunk of data that was just fine as-is...before I removed the Tabs. Now I need a way to replace them after they've been removed, or to omit this small group of code from a document-wide Tab genocide I initiate the code with.
#TERMINOLOGY###########/##/########.##
Would a variant on rgx.Replace work best here? Or can I copy the code to a variable, remove Tabs from the document, then insert the variable without losing the tabs?
I think what you're looking for is
Dim rgx As New System.Text.RegularExpressions.Regex(" (\d{5})(?!\d)")
input = rgx.Replace(input, ControlChars.Tab + "$1")
The first line compiles the regular expression. The \d matches a digit, and the {5}, as you can guess, matches 5 repetitions of the previous atom. The parentheses surrounding the \d{5} form what is known as a capture group, which is responsible for putting what's captured in a pseudovariable named $1. The (?!\d) is a more advanced concept known as a negative lookahead assertion: it peeks at the next character to check that it's not a digit (because otherwise the match could be the first 5 digits of a number with 6 or more digits). Another version is
" (\d{5})\b"
where the \b is a word boundary, disallowing alphanumeric characters following the digits.
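The capture group and the negative lookahead can be tried out with a short Python sketch (the thread is VB.NET; tab_before_zip is a hypothetical name, and the sample line is made up to resemble the masked data in the question). Python writes the group reference as \1 where .NET uses $1.

```python
import re

# A space, exactly five digits (captured), and no further digit after them.
ZIP_AFTER_SPACE = re.compile(r' (\d{5})(?!\d)')

def tab_before_zip(line: str) -> str:
    # '\t' is a literal tab; \1 re-inserts the five captured digits.
    return ZIP_AFTER_SPACE.sub('\t' + r'\1', line)

print(repr(tab_before_zip("CITY ST 99999 999/999-9999")))
# 'CITY ST\t99999 999/999-9999'
```

Note that the pattern replaces the space before any five-digit number, so a five-digit house number elsewhere in the line would be affected as well.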

Regex Returning extra empty Value

Set Regex = New RegExp
Regex.Pattern = """[^""]*""|[^,]*"
Regex.Global = True
' I have a for loop here to loop through records
text = Cells.Item(r, 7).Value
For Each Match In Regex.Execute(text)
    count = count + 1
Next Match
This is my regex code, and here is the table I am pulling the data from.
When I run the code in debug mode, the PCBaa count comes up as two, c3 and c4 come up as 14, and C6-c36 comes up as 36. Is my regex code wrong for extracting the codes between the commas?
OK, I have tried that myself. First off, it seems you don't reset the count value to 0 after each line. That could be intentional, but just so you know.
The second thing is that the regular expression works nearly fine but always gives you double the amount, because it matches a zero-length string at the end of each match.
So for the last line (C6-C26) it matches:
1) "C6" 2) "" 3) "C7" 4) "" ... and so on.
To be honest, I'm a little bit surprised myself and don't know exactly why that's the case for now.
But the solution is pretty easy: since you want no zero-length strings in the result (so they don't get counted), you simply have to exchange the * for a +, which tells the regular expression to match only if there's at least one character.
So your regular expression string should look like:
Regex.Pattern = """[^""]+""|[^,]+"
Why you got a count of 14 on the c3, c4 line surprises me... I got 4, which makes sense because of the double counting due to the zero-length matches.
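The doubled count can be reproduced outside VBA. The likely reason is that [^,]* is allowed to match zero characters, so after each token the engine finds an empty match at the comma (and again at the end of the string). The sketch below uses Python's re module, whose handling of empty matches may differ in detail from VBScript's RegExp, so treat it as an illustration rather than a model of the VBA engine.

```python
import re

cell = "C3,C4"

# Original pattern: * lets the second alternative match an empty string.
with_star = re.findall(r'"[^"]*"|[^,]*', cell)
# Fixed pattern: + requires at least one character per match.
with_plus = re.findall(r'"[^"]+"|[^,]+', cell)

print(with_star)  # includes empty strings between the real tokens
print(with_plus)  # ['C3', 'C4']

# Dropping the empty strings from the first result gives the same tokens.
non_empty = [m for m in with_star if m]
```

Either switching to + or skipping zero-length matches inside the loop should give the expected count.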

Regex: How to match a string that is not only numbers

Is it possible to write a regular expression that matches all strings that do not contain only numbers? If we have these strings:
abc
a4c
4bc
ab4
123
It should match the first four, but not the last one. I have tried fiddling around in RegexBuddy with lookaheads and stuff, but I can't seem to figure it out.
(?!^\d+$)^.+$
This says: look ahead to check that the line does not consist entirely of digits, then match the entire line.
Unless I am missing something, I think the most concise regex is...
/\D/
...or in other words, is there a not-digit in the string?
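Both ideas so far, the negative lookahead and the single \D test, can be checked against the question's sample strings with a short Python sketch (Python chosen for illustration; the two function names are hypothetical):

```python
import re

def has_non_digit(s: str) -> bool:
    # True as soon as one non-digit character is found anywhere in s.
    return re.search(r'\D', s) is not None

def not_all_digits(s: str) -> bool:
    # Lookahead rejects an all-digit string, then the whole line is matched.
    return re.match(r'(?!^\d+$)^.+$', s) is not None

for sample in ("abc", "a4c", "4bc", "ab4", "123"):
    print(sample, has_non_digit(sample), not_all_digits(sample))
```

On these five inputs the two agree: only 123 is rejected. The \D version answers yes or no, while the lookahead version also returns the whole string as the match, which matters if the matched text itself is needed.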
jjnguy had it correct (if slightly redundant) in an earlier revision.
.*?[^0-9].*
@Chad, your regex,
\b.*[a-zA-Z]+.*\b
should probably allow for non letters (eg, punctuation) even though Svish's examples didn't include one. Svish's primary requirement was: not all be digits.
\b.*[^0-9]+.*\b
Then, you don't need the + in there since all you need is to guarantee 1 non-digit is in there (more might be in there as covered by the .* on the ends).
\b.*[^0-9].*\b
Next, you can do away with the \b on either end since these are unnecessary constraints (invoking reference to alphanum and _).
.*[^0-9].*
Finally, note that this last regex shows that the problem can be solved with just the basics, those basics which have existed for decades (eg, no need for the look-ahead feature). In English, the question was logically equivalent to simply asking that 1 counter-example character be found within a string.
We can test this regex in a browser by copying the following into the location bar, replacing the string "6576576i7567" with whatever you want to test.
javascript:alert(new String("6576576i7567").match(".*[^0-9].*"));
/^\d*[a-z][a-z\d]*$/
Or, case insensitive version:
/^\d*[a-z][a-z\d]*$/i
There may be digits at the beginning, then at least one letter, then letters or digits.
Try this:
/^.*\D+.*$/
It matches if there is any symbol that is not a digit. Works fine with all languages.
Since you said "match", not just validate, the following regex will match correctly
\b.*[a-zA-Z]+.*\b
Passing Tests:
abc
a4c
4bc
ab4
1b1
11b
b11
Failing Tests:
123
If you are trying to match words that contain at least one letter and are made up of digits and letters (or just letters), this is what I have used:
(\d*[a-zA-Z]+\d*)+
If we want to restrict valid characters so that string can be made from a limited set of characters, try this:
(?!^\d+$)^[a-zA-Z0-9_-]{3,}$
or
(?!^\d+$)^[\w-]{3,}$
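A quick way to see what the restricted-character version accepts is to run it over a few inputs. The sketch below is in Python for illustration, with a hypothetical helper name is_valid_name; it uses the first of the two patterns above.

```python
import re

# Not all digits, at least 3 characters, only letters, digits, _ and -.
VALID = re.compile(r'(?!^\d+$)^[a-zA-Z0-9_-]{3,}$')

def is_valid_name(s: str) -> bool:
    return VALID.match(s) is not None

print(is_valid_name("a4c"))    # True
print(is_valid_name("123"))    # False: digits only
print(is_valid_name("ab"))     # False: shorter than 3 characters
print(is_valid_name("a b c"))  # False: space is not in the allowed set
```

Note that a string such as 12- passes, because the hyphen counts as a non-digit character from the allowed set.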
/\w+/:
Matches any letter, digit or underscore, i.e. any word character.
.*[^0-9]{1,}.*
Works fine for us.
We want to use the used answer, but it's not working within YANG model.
The one I provided here is easy to understand and clear:
the start and end can be any characters, but there must be at least one non-numeric character.
I am using /^[0-9]*$/gm in my JavaScript code to see if a string contains only digits. If it does, the check fails; otherwise the string is returned.
Below is a working code snippet with test cases:
function isValidURL(string) {
    // A match means the string consists only of digits.
    var res = string.match(/^[0-9]*$/gm);
    if (res == null)
        return string;
    else
        return "fail";
}
var testCase1 = "abc";
console.log(isValidURL(testCase1)); // abc
var testCase2 = "a4c";
console.log(isValidURL(testCase2)); // a4c
var testCase3 = "4bc";
console.log(isValidURL(testCase3)); // 4bc
var testCase4 = "ab4";
console.log(isValidURL(testCase4)); // ab4
var testCase5 = "123";
console.log(isValidURL(testCase5)); // fail
I had to do something similar in MySQL, and the following, whilst oversimplified, seems to have worked for me:
where fieldname regexp '^[a-zA-Z0-9]+$'
and fieldname NOT REGEXP '^[0-9]+$'
This shows all fields that are alphabetical or alphanumeric, while any fields that are purely numeric are hidden. This seems to work.
example:
name1 - Displayed
name - Displayed
name2 - Displayed
name3 - Displayed
name4 - Displayed
n4ame - Displayed
324234234 - Not Displayed