How can I get the four-part version number at the end of $MyBuildNumber, where a part could also be an asterisk? I want $NewVersion to return "2.8.1.*"; however, this code does not return anything for me:
$MyBuildNumber = "MyBuildNumberIs_2.8.1.*"
$VersionRegex = "\d+[*]?\.\d+[*]?\.\d+[*]?\.\d+[*]?"
$VersionData = [regex]::matches($MyBuildNumber,$VersionRegex)
switch($VersionData.Count)
{
0
{
Write-Error "Could not find version number data in MyBuildNumber."
exit 1
}
1 {}
}
$NewVersion = $VersionData[0]
Write-Host "Version: $NewVersion"
You can use a character class containing digits and *:
[\d*]+\.[\d*]+\.[\d*]+\.[\d*]+
anubhava beat me to it, but here's a more compact version of the same thing because why not:
([\d*]+\.){3}[\d*]+
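As a quick cross-check outside PowerShell, here is a minimal Python sketch that runs the original pattern and the compact one against the sample string. The variable names are illustrative, and the group is written as non-capturing, which is an assumption of convenience rather than part of the answers above.

```python
import re

build_number = "MyBuildNumberIs_2.8.1.*"

# Original pattern: every component must start with at least one digit,
# so a component that is only "*" can never match.
original = re.compile(r"\d+[*]?\.\d+[*]?\.\d+[*]?\.\d+[*]?")

# Suggested pattern: each component is one or more digits or asterisks.
suggested = re.compile(r"(?:[\d*]+\.){3}[\d*]+")

print(original.search(build_number))            # None
print(suggested.search(build_number).group())   # 2.8.1.*
```

Note that [\d*]+ is permissive: it would also accept a component such as 2*3. If a component must be either all digits or a single asterisk, (?:\d+|\*) is a stricter alternative.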
Is it possible to use regex to round decimal places?
I have lines that look like this but without any spaces (space added for readability).
0, 162.3707542, -162.3707542
128.2, 151.8299471, -23.62994709 // this 151.829 should lead to 151.83
I want to remove all numbers after the second decimal position and if possible round the second decimal position based on the third position.
0, 162.37, -162.37
128.2, 151.82, -23.62 // already working .82
..., 151.83, ... // intended .83 <- this is my question
What is working
The following regex (see this sample on regex101.com) almost does what I want:
([0-9]+\.)([0-9]{2})(\d{0,}) // search
$1$2 // replace
My understanding
The search works like this
group: ([0-9]+\.) find 1 up to n numbers and a point
group: ([0-9]{2}) followed by 2 numbers
group: (\d{0,}) followed by 0 or more numbers / digits
In Visual Studio Code, the replacement field references only groups 1 and 2: $1$2.
This results in this substitution (regex101.com)
Question
Is it possible to change the last digit of $2 (group two) based on the first digit in $3 (group three) ?
My intention is to round correctly. In the sample above this would mean
151.8299471 // source
151.82 // current result
151.83 // desired result 2 was changed to 3 because of third digit 9
Updating the last digit of $2 is not enough: if the number is 199.995, rounding changes every digit of the result (200.00).
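Outside the editor, the carry problem goes away if each matched number is rounded as a number rather than digit by digit. The following is a minimal Python sketch of that idea, not the extension-based approach: the function names are hypothetical, and ROUND_HALF_UP is an assumed rounding rule.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Only touch numbers that have three or more decimals.
NUMBER = re.compile(r"-?\d+\.\d{3,}")

def round_match(match):
    # Decimal parses the exact digits from the text, so there are no binary
    # floating-point surprises; ties are rounded away from zero (assumption).
    value = Decimal(match.group())
    return str(value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))

def round_line(line):
    # re.sub accepts a function that computes each replacement.
    return NUMBER.sub(round_match, line)

print(round_line("128.2,151.8299471,-23.62994709"))  # 128.2,151.83,-23.63
print(round_line("199.995"))                         # 200.00
```

A plain replacement string such as $1$2 cannot do this, because the replacement has to be computed from the matched value.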
You can use the extension Regex Text Generator.
You can use a predefined set of regexes.
"regexTextGen.predefined": {
"round numbers": {
"originalTextRegex": "(-?\\d+\\.\\d+)",
"generatorRegex": "{{=N[1]:fixed(2):simplify}}"
}
}
Put the same regex, (-?\d+\.\d+), in the VS Code Find dialog to select all the numbers you want; you can use Find in Selection and Alt+Enter.
Then execute the command: Generate text based on Regular Expression.
Select the predefined option and press Enter a few times. You get a preview of the result, you can escape the UI and get back the original text.
In the process you can edit generatorRegex to change the number of decimals or to remove the simplify.
It was easier than I thought, once I found the Number.toFixed(2) method.
Using this extension I wrote, Find and Transform, make this keybinding in your keybindings.json:
{
"key": "alt+r", // whatever keybinding you want
"command": "findInCurrentFile",
"args": {
"find": "(-?[0-9]+\\.\\d{3,})", // only need the whole number as one capture group
"replace": [
"$${", // starting wrapper to indicate a js operation begins
"return $1.toFixed(2);", // $1 from the find regex
"}$$" // ending wrapper to indicate a js operation ends
],
// or simply in one line
// "replace": "$${ return $1.toFixed(2); }$$",
"isRegex": true
},
}
This could also be put into a setting, see the README, so that a command appears in the Command Palette with the title of your choice.
Also note that javascript rounds -23.62994709 to -23.63. You had -23.62 in your question, I assume -23.63 is correct.
If you do want to truncate things like 4.00 to 4 or 4.20 to 4.2 use this replace instead.
"replace": [
"$${",
"let result = $1.toFixed(2);",
"result = String(result).replace(/0+$/m, '').replace(/\\.$/m, '');",
"return result;",
"}$$"
],
We can round off decimal numbers correctly using regular expressions.
We basically need these regexes:
secondDD_regx = /(?<=[\d]*\.[\d]{1})[\d]/g; // round-off digit
thirdDD_regx = /(?<=[\d]*\.[\d]{2})[\d]/g; // first discard digit
isNonZeroAfterThirdDD_regx = /(?<=[\d]*\.[\d]{3,})[1-9]/g;
isOddSecondDD_regx = /[13579]/g;
Full code (round-off digit up to two decimal places):
const uptoOneDecimalPlaces_regx = /[\+\-\d]*\.[\d]{1}/g;
const secondDD_regx = /(?<=[\d]*\.[\d]{1})[\d]/g;
const thirdDD_regx = /(?<=[\d]*\.[\d]{2})[\d]/g;
const isNonZeroAfterThirdDD_regx = /(?<=[\d]*\.[\d]{3,})[1-9]/g;
const num = '5.285';
const uptoOneDecimalPlaces = num.match(uptoOneDecimalPlaces_regx)?.[0];
const secondDD = num.match(secondDD_regx)?.[0];
const thirdDD = num.match(thirdDD_regx)?.[0];
const isNonZeroAfterThirdDD = num.match(isNonZeroAfterThirdDD_regx)?.[0];
const isOddSecondDD = /[13579]/g.test(secondDD);
// check carry
const carry = !thirdDD ? 0 : thirdDD > 5 ? 1 : thirdDD < 5 ? 0 : isNonZeroAfterThirdDD ? 1 : isOddSecondDD ? 1 : 0;
let roundOffValue;
if(/9/g.test(secondDD) && carry) {
roundOffValue = (Number(`${uptoOneDecimalPlaces}` + `${secondDD ? Number(secondDD) : 0}`) + Number(`0.0${carry}`)).toString();
} else {
roundOffValue = (uptoOneDecimalPlaces + ((secondDD ? Number(secondDD) : 0) + carry)).toString();
}
// Beautify output: show exactly 2 decimal places if output is x.y or x
const dd = roundOffValue.match(/(?<=[\d]*[\.])[\d]*/g)?.toString().length;
roundOffValue = roundOffValue + (dd=== undefined ? '.00' : dd === 1 ? '0' : '');
console.log(roundOffValue);
For more details check: Round-Off Decimal Number properly using Regular Expression🤔
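The carry rule in the code above rounds a tie (a third decimal of exactly 5 with nothing non-zero after it) up only when the second decimal is odd, which is round-half-to-even. As a hedged cross-check of that rule, and not a port of the regex code, Python's decimal module offers the same rounding mode; the helper name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(text, places="0.01"):
    # Parse from the string so the digits are exactly those in the text.
    return str(Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_EVEN))

for sample in ["5.285", "5.275", "5.2851", "5.295", "7"]:
    print(sample, "->", round_half_even(sample))
```

With these inputs, 5.285 and 5.275 both become 5.28 (ties go to the even digit), 5.2851 becomes 5.29, 5.295 carries into 5.30, and 7 is padded to 7.00.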
When trying to validate that a string is made up of alphabetic characters only, two possible regex solutions come to my mind.
The first one checks that every character in the string is alphabetic:
/^[a-z]+$/
The second one tries to find a character somewhere in the string that is not alphabetic:
/[^a-z]/
(Yes, I could use character classes here.)
Is there any significant performance difference for long strings?
(If anything, I'd guess the second variant is faster.)
Just by looking at it, I'd say the second method is faster.
However, I made a quick non-scientific test, and the results seem to be inconclusive:
Regex Match vs. Negation.
P.S. I removed the group capture from the first method. It's superfluous, and would only slow it down.
Wrote this quick Perl code:
@testStrings = qw(asdfasdf asdf as aa asdf as8up98;n;kjh8y puh89uasdf ;lkjoij44lj 'aks;nasf na ;aoij08u4 43[40tj340ij3 ;salkjaf; a;lkjaf0d8fua ;alsf;alkj
a a;lkf;alkfa as;ldnfa;ofn08h[ijo ok;ln n ;lasdfa9j34otj3;oijt 04j3ojr3;o4j ;oijr;o3n4f;o23n a;jfo;ie;o ;oaijfoia ;aosijf;oaij ;oijf;oiwj;
qoeij;qwj;ofqjf08jf0 ;jfqo;j;3oj4;oijt3ojtq;o4ijq;onnq;ou4f ;ojfoqn;aonfaoneo ;oef;oiaj;j a;oefij iiiii iiiiiiiii iiiiiiiiiii);
print "test 1: \n";
foreach my $i (1..1000000) {
foreach (@testStrings) {
if ($_ =~ /^([a-z])+$/) {
#print "match"
} else {
#print "not"
}
}
}
print `date` . "\n";
print "test 2: \n";
foreach my $j (1..1000000) {
foreach (@testStrings) {
if ($_ =~ /[^a-z]/) {
#print "match"
} else {
#print "not"
}
}
}
then ran it with:
date; <perl_file>; date
It isn't 100% scientific, but it gives a good idea: the first regex took 10 or 11 seconds to execute, and the second took 8 seconds.
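For long strings specifically, a small timing harness makes the comparison easy to repeat. This is a rough Python sketch, not a rigorous benchmark: the sample strings and repetition count are arbitrary assumptions, and results will differ between regex engines.

```python
import re
import timeit

all_letters = re.compile(r"^[a-z]+$")   # positive check: every char is a-z
non_letter = re.compile(r"[^a-z]")      # negative check: any char outside a-z

def is_alpha_positive(s):
    return all_letters.match(s) is not None

def is_alpha_negative(s):
    # Caveat: an empty string passes this check but fails the positive one.
    return non_letter.search(s) is None

# Long inputs: all valid, invalid at the end, invalid at the start.
samples = ["a" * 10_000, "a" * 9_999 + "8", "8" + "a" * 9_999]

for name, fn in [("positive", is_alpha_positive), ("negative", is_alpha_negative)]:
    seconds = timeit.timeit(lambda: [fn(s) for s in samples], number=200)
    print(f"{name}: {seconds:.3f}s")
```

Both checks have to scan the whole string when it is valid, so any difference should show up mainly in how early each one can give up on invalid input.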
I'm hoping someone might know of a script that can take an arbitrary word list and generate the shortest regex that could match that list exactly (and nothing else).
For example, suppose my list is
1231
1233
1234
1236
1238
1247
1256
1258
1259
Then the output should be:
12(3[13468]|47|5[689])
This is an old post, but for the benefit of those finding it through web searches as I did, there is a Perl module that does this, called Regexp::Optimizer, here: http://search.cpan.org/~dankogai/Regexp-Optimizer-0.23/lib/Regexp/Optimizer.pm
It takes a regular expression as input, which can consist just of the list of input strings separated with |, and outputs an optimal regular expression.
For example, this Perl command-line:
perl -mRegexp::Optimizer -e "print Regexp::Optimizer->new->optimize(qr/1231|1233|1234|1236|1238|1247|1256|1258|1259/)"
generates this output:
(?^:(?^:12(?:3[13468]|5[689]|47)))
(assuming you have installed Regexp::Optimizer), which matches the OP's expectation quite well.
Here's another example:
perl -mRegexp::Optimizer -e "print Regexp::Optimizer->new->optimize(qr/314|324|334|3574|384/)"
And the output:
(?^:(?^:3(?:[1238]|57)4))
For comparison, an optimal trie-based version would output 3(14|24|34|574|84). In the above output, you can also search and replace (?: and (?^: with just ( and eliminate redundant parentheses, to obtain this:
3([1238]|57)4
You are probably better off saving the entire list, or if you want to get fancy, create a Trie:
1231
1234
1247
    1
    |
    2
   / \
  3   4
 / \   \
1   4   7
Now when you take a string, check whether it reaches a leaf node. If it does, it's valid.
If you have variable length overlapping strings (eg: 123 and 1234) you'll need to mark some nodes as possibly terminal.
You can also use the trie to generate the regex if you really like the regex idea:
Nodes from the root to the first branching are fixed (eg: 12)
Branches create |: (eg: 12(3|4))
Leaf nodes generate a character class (or single character) that follows the parent node: (eg 12(3[14]|47))
This might not generate the shortest regex; to do that you'll need some extra work:
"Compact" ranges if you find them (eg [12345] becomes [1-5])
Add quantifiers for repeated elements (eg: [1234][1234] becomes [1234]{2})
???
I really don't think it's worth it to generate the regex.
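For anyone who still wants to try it, the trie-to-regex rules above can be sketched in a few lines of Python. This is an illustration under assumptions, not a shortest-regex generator: build_trie and trie_to_regex are hypothetical names, an empty-string key is used to mark "possibly terminal" nodes, and the range and quantifier compaction steps are left out.

```python
import re

def build_trie(words):
    """Nested dicts keyed by character; the "" key marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root

def trie_to_regex(node):
    """Return a pattern matching exactly the suffixes stored below node."""
    children = {ch: sub for ch, sub in node.items() if ch != ""}
    if not children:
        return ""
    leaves, branches = [], []
    for ch in sorted(children):
        suffix = trie_to_regex(children[ch])
        if suffix:
            branches.append(re.escape(ch) + suffix)
        else:
            leaves.append(re.escape(ch))
    if len(leaves) > 1:
        branches.append("[" + "".join(leaves) + "]")  # leaf siblings -> class
    else:
        branches.extend(leaves)
    optional = "" in node  # a shorter word ends here, so the rest is optional
    if len(branches) == 1 and not optional:
        return branches[0]  # fixed prefix, no group needed
    return "(?:" + "|".join(branches) + ")" + ("?" if optional else "")

words = ["1231", "1233", "1234", "1236", "1238", "1247", "1256", "1258", "1259"]
print(trie_to_regex(build_trie(words)))  # 12(?:3[13468]|47|5[689])
```

Overlapping words such as 123 and 1234 come out as 123(?:4)?, which is where the terminal marker is needed.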
This project generates a regexp from a given list of words: https://github.com/bwagner/wordhierarchy
It almost does the same as the above JavaScript solution, but avoids certain superfluous parentheses.
It only uses "|", non-capturing group "(?:)" and option "?".
There's room for improvement when there's a row of single characters:
Instead of e.g. (?:3|8|1|6|4) it could generate [38164].
The generated regexp could easily be adapted to other regexp dialects.
Sample usage:
java -jar dist/wordhierarchy.jar 1231 1233 1234 1236 1238 1247 1256 1258 1259
-> 12(?:5(?:6|9|8)|47|3(?:3|8|1|6|4))
Here's what I came up with (JavaScript). It turned a list of 20,000 6-digit numbers into a 60,000-character regular expression. Compared to a naive (word1|word2|...) construction, that's almost 60% "compression" by character count.
I'm leaving the question open, as there's still a lot of room for improvement and I'm holding out hope that there might be a better tool out there.
var list = new listChar("");
function listChar(s, p) {
this.char = s;
this.depth = 0;
this.parent = p;
this.add = function(n) {
if (!this.subList) {
this.subList = {};
this.increaseDepth();
}
if (!this.subList[n]) {
this.subList[n] = new listChar(n, this);
}
return this.subList[n];
}
this.toString = function() {
var ret = "";
var subVals = [];
if (this.depth >=1) {
for (var i in this.subList) {
subVals[subVals.length] = this.subList[i].toString();
}
}
if (this.depth === 1 && subVals.length > 1) {
ret = "[" + subVals.join("") + "]";
} else if (this.depth === 1 && subVals.length === 1) {
ret = subVals[0];
} else if (this.depth > 1) {
ret = "(" + subVals.join("|") + ")";
}
return this.char + ret;
}
this.increaseDepth = function() {
this.depth++;
if (this.parent) {
this.parent.increaseDepth();
}
}
}
function wordList(input) {
var listStep = list;
while (input.length > 0) {
var c = input.charAt(0);
listStep = listStep.add(c);
input = input.substring(1);
}
}
words = [/* WORDS GO HERE*/];
for (var i = 0; i < words.length; i++) {
wordList(words[i]);
}
document.write(list.toString());
Using
words = ["1231","1233","1234","1236","1238","1247","1256","1258","1259"];
Here's the output:
(1(2(3[13468]|47|5[689])))
I am trying to pass the value 95%:
string numexu = "95%";
Regex regex = new Regex("^((>|GT|>=|GE|<|LT|<=|LE|==|EQ|!=|NE)?\\s*\\d?[%]?)$");
if (!regex.IsMatch(numexu))
throw new ArgumentException("Percent expression is in an invalid format.");
The code throws the exception.
Your pattern allows only one digit (\\d?). Try \\d{0,2} instead, which accepts 0, 1 or 2 digits. The ? quantifier matches 0 or 1 times.
The % is not a regex metacharacter, so it does not need escaping. Additionally, a character class with only one character is redundant, so [%] can be written as just %.
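To see the effect of the quantifier change, here is a small Python sketch that runs the original pattern and the widened one on a few inputs. The regex syntax used here behaves the same in Python and .NET; the names OPERATORS, original and widened are illustrative only.

```python
import re

OPERATORS = r"(>|GT|>=|GE|<|LT|<=|LE|==|EQ|!=|NE)?"

# Original: at most one digit (\d?). Widened: up to two digits (\d{0,2}).
original = re.compile(r"^(" + OPERATORS + r"\s*\d?[%]?)$")
widened = re.compile(r"^(" + OPERATORS + r"\s*\d{0,2}%?)$")

for text in ["5%", "95%", ">= 95%", "195%"]:
    print(text, bool(original.match(text)), bool(widened.match(text)))
```

Only "5%" passes the original pattern, while the widened one also accepts "95%" and ">= 95%" and still rejects "195%". Because every part is optional, both patterns also accept an empty string; \d{1,2} would require at least one digit.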
This Function will work for your requirement
function check() {
var txtfield; txtfield =document.getElementById('txtbox').value;
var reg=/^(\d{0,2}%?$)/;
if(reg.test(txtfield)){
alert("match");
}
else { alert("Try again"); }
}
I have been looking for a regular expression with Google for an hour or so now and can't seem to work this one out :(
If I have a number, say:
2345
and I want to find any other number with the same digits but in a different order, like this:
2345
For example, it should match
3245 or 5432 (same digits but different order)
How would I write a regular expression for this?
There is an "elegant" way to do it with a single regex:
^(?:2()|3()|4()|5()){4}\1\2\3\4$
will match the digits 2, 3, 4 and 5 in any order. All four are required.
Explanation:
(?:2()|3()|4()|5()) matches one of the numbers 2, 3, 4, or 5. The trick is now that the capturing parentheses match an empty string after matching a number (which always succeeds).
{4} requires that this happens four times.
\1\2\3\4 then requires that all four backreferences have participated in the match - which they do if and only if each number has occurred once. Since \1\2\3\4 matches an empty string, it will always match as long as the previous condition is true.
For five digits, you'd need
^(?:2()|3()|4()|5()|6()){5}\1\2\3\4\5$
etc...
This will work in nearly any regex flavor except JavaScript.
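A short Python sketch to try the pattern, on the assumption that Python's re module is one of the flavors where a backreference to a group that never participated fails to match. The candidate list is borrowed from the Perl example elsewhere in this thread.

```python
import re

# Each empty group records that its digit was seen; the backreferences
# then require that all four groups took part in the match.
same_digits = re.compile(r"^(?:2()|3()|4()|5()){4}\1\2\3\4$")

for candidate in ["2345", "3245", "5432", "5542", "1234", "12345"]:
    print(candidate, bool(same_digits.match(candidate)))
```

Only the first three candidates are permutations of 2345, so only those should be reported as matches.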
I don't think a regex is appropriate. So here is an idea that is faster than a regex for this situation:
check string lengths, if they are different, return false
make a hash from the character (digits in your case) to integers for counting
loop through the characters of your first string:
increment the counter for that character: hash[character]++
loop through the characters of the second string:
decrement the counter for that character: hash[character]--
break if any count is negative (or nonexistent)
loop through the entries, making sure each is 0:
if all are 0, return true
else return false
EDIT: Java Code (I'm using Character for this example, not exactly Unicode friendly, but it's the idea that matters now):
import java.util.*;
public class Test
{
public boolean isSimilar(String first, String second)
{
if(first.length() != second.length())
return false;
HashMap<Character, Integer> hash = new HashMap<Character, Integer>();
for(char c : first.toCharArray())
{
if(hash.get(c) != null)
{
int count = hash.get(c);
count++;
hash.put(c, count);
}
else
{
hash.put(c, 1);
}
}
for(char c : second.toCharArray())
{
if(hash.get(c) != null)
{
int count = hash.get(c);
count--;
if(count < 0)
return false;
hash.put(c, count);
}
else
{
return false;
}
}
for(Integer i : hash.values())
{
if(i.intValue()!=0)
return false;
}
return true;
}
public static void main(String ... args)
{
//tested to print false
System.out.println(new Test().isSimilar("23445", "5432"));
//tested to print true
System.out.println(new Test().isSimilar("2345", "5432"));
}
}
This will also work for comparing letters or other character sequences, like "god" and "dog".
Put the digits of each number in two arrays, sort the arrays, find out if they hold the same digits at the same indices.
RegExes are not the right tool for this task.
You could do something like this to ensure the right characters and length
[2345]{4}
Ensuring they only exist once is trickier and why this is not suited to regexes
(?=.*2.*)(?=.*3.*)(?=.*4.*)(?=.*5.*)[2345]{4}
The simplest regular expression is just all 24 permutations added up via the or operator:
/2345|3245|5432|.../;
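Writing out all 24 alternatives by hand is error-prone, so the pattern can be generated instead. A minimal Python sketch, with permutation_pattern as a hypothetical helper name and anchors added so that longer numbers are rejected:

```python
import re
from itertools import permutations

def permutation_pattern(digits):
    # The set removes duplicates that arise when a digit repeats, e.g. "2234".
    variants = sorted({"".join(p) for p in permutations(digits)})
    return re.compile("^(?:" + "|".join(map(re.escape, variants)) + ")$")

pattern = permutation_pattern("2345")
print(len(pattern.pattern.split("|")))   # 24
print(bool(pattern.match("5432")))       # True
print(bool(pattern.match("5542")))       # False
```

The number of alternatives grows factorially with the number of digits, which is one more reason to prefer a non-regex check for longer inputs.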
That said, you don't want to solve this with a regex if you can get away with it. A single pass through the two numbers as strings is probably better:
1. Check the string length of both strings - if they're different you're done.
2. Build a hash of all the digits from the number you're matching against.
3. Run through the digits in the number you're checking. If you hit a match in the hash, mark it as used. Keep going until you don't get an unused match in the hash or run out of items.
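The three steps above can be sketched in Python as follows. The function name is hypothetical, and a Counter stands in for the hash, with "mark it as used" implemented as decrementing the remaining count.

```python
from collections import Counter

def same_digits(reference, candidate):
    # Step 1: different lengths can never be permutations of each other.
    if len(reference) != len(candidate):
        return False
    # Step 2: digit -> number of occurrences not yet used.
    remaining = Counter(reference)
    # Step 3: consume one unused occurrence per digit of the candidate.
    for digit in candidate:
        if remaining[digit] == 0:   # no unused match left for this digit
            return False
        remaining[digit] -= 1
    return True

print(same_digits("2345", "5432"))   # True
print(same_digits("2345", "5542"))   # False
```

Because the lengths are equal and no count ever goes negative, reaching the end of the loop means every digit of the reference was used exactly once.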
I think it's very simple to achieve if you're OK with matching a number that doesn't use all of the digits. E.g. if you have a number 1234 and you accept a match with the number of 1111 to return TRUE;
Let me use PHP for an example as you haven't specified what language you use.
$my_num = 1245;
$my_pattern = '/[' . $my_num . ']{4}/'; // this resolves to pattern: /[1245]{4}/
$my_pattern2 = '/[' . $my_num . ']+/'; // as above, but numbers can be of any length
$number1 = 4521;
$match = preg_match($my_pattern, $number1); // returns 1 (match)
$number2 = 2222444111;
$match2 = preg_match($my_pattern2, $number2); // returns 1 (match)
$number3 = 888;
$match3 = preg_match($my_pattern, $number3); // returns 0 (no match)
$match4 = preg_match($my_pattern2, $number3); // returns 0 (no match)
Something similar will work in Perl as well.
Regular expressions are not appropriate for this purpose. Here is a Perl script:
#!/usr/bin/perl
use strict;
use warnings;
my $src = '2345';
my @test = qw( 3245 5432 5542 1234 12345 );
my $canonical = canonicalize( $src );
for my $candidate ( @test ) {
next unless $canonical eq canonicalize( $candidate );
print "$src and $candidate consist of the same digits\n";
}
sub canonicalize { join '', sort split //, $_[0] }
Output:
C:\Temp> ks
2345 and 3245 consist of the same digits
2345 and 5432 consist of the same digits