PowerShell: How to validate a range of 1 to 15 and accept input such as 1,2,3,4,5 up to 15 - regex

I am new to PowerShell scripting.
I am trying to write a function that accepts user input meeting the criteria below:
Only digits
Range between 1 and 15
Should accept a comma-separated list of values, e.g. 1,2,3,4,14,15
May contain spaces between the commas
Values must not be duplicated
The returned value must be an array
So far, I have tried:
Function Validate-Choice{
[cmdletbinding()]
Param(
[Parameter(Position=0,Mandatory=$True)]
[ValidateRange(1,15)]
[string[]]$Item
)
Process {$Item}
}
Validate-Choice 1,2,3,4,5,6,7,8,9,10,11,13 # Similar Way i want O/p
Output:
1
2
3
4
5
6
7
8
9
10
11
13
$ReadInput = Read-Host -prompt "Please Choose from the list [1/2/3/4/5/6/7/8/9/10/11/12/13/14] You can select multiple Values EX: 1, 2, 3 -- "
$userchoices = Validate-Choice -item $ReadInput
$userchoices
If I read the same input from the host, I get the error below:
Validate-Choice : Cannot validate argument on parameter 'Item'. The argument cannot be validated because
its type "String" is not the same type (Int32) as the maximum and minimum limits of the parameter. Make sure the argument is of type Int32 and then try the command again. At line:10 char:21
+ Validate-Choice '1,2,3,4,5,6,7,8,9,10,11,13'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [Validate-Choice], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Validate-Choice
I have also tried different regex patterns, without success:
Function Test-Something {
[cmdletbinding()]
Param(
[Parameter(Position=0,Mandatory=$True)]
[ValidatePattern('(?:\s*\d{1,15}[1-15]\s*(?:,|$))+$')]
[string[]]$Item
)
Process { $Item }
}
The functions above only partially work.
Can anyone please help me here?

This would probably be easiest if you just changed the parameter type to [int[]]; then your ValidateRange attribute does most of the work. It doesn't handle duplicates, though. It turns out you can't use [ValidateScript()] for that, as @PetSerAl points out. So that leaves checking the parameter the old-fashioned way, in a begin block:
Function Test-Something {
[cmdletbinding()]
Param(
[Parameter(Position=0, Mandatory)]
[ValidateRange(1, 15)]
[int[]]$Item
)
Begin {
$ht = @{}
foreach ($i in $Item) {
if ($ht.ContainsKey("$i")) {
throw "Parameter Item must contain unique values - duplicate '$i'"
}
else {
$ht["$i"] = $i
}
}
}
Process {
[string]$Item
}
}
Test-Something (1,2,3,4,3)
Note that this won't work if you make the Item parameter accept pipeline input. In the pipeline input case, $Item will not be set in the begin block.

Naren_Ch
Your 1st usage of the advanced function (AF)
Validate-Choice 1,2,3,4,5,6,7,8,9,10,11,13 # Similar Way i want O/p
is correct - you are inputting data as expected by the AF.
Now, look at the example reading the input from the host:
$ReadInput = Read-Host -prompt "Please Choose from the list [1/2/3/4/5/6/7/8/9/10/11/12/13/14] You can select multiple Values EX: 1, 2, 3 -- "
When you do this, $ReadInput is a string, and in this case, it's a string full of commas!
Consequently, passing that data to the AF fails the validation that you wrote yourself.
To correct the situation, just do this:
$ReadInput = (Read-Host -prompt "Please Choose etc...") -split ','
$userchoices = Validate-Choice -item $ReadInput
You must remember that data read by Read-Host is a string (just 1 string).
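The checks listed in the question (digits only, range 1 to 15, spaces allowed around commas, no duplicates, array result) can also be sketched outside PowerShell. The following is a minimal Python illustration of that logic, not a replacement for the advanced function; the name parse_choices and its low/high parameters are hypothetical.

```python
def parse_choices(raw, low=1, high=15):
    """Parse a string such as '1, 2, 3' into a list of unique ints in [low, high]."""
    choices = []
    for token in raw.split(','):
        token = token.strip()  # allow spaces around the commas
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not a number: {token!r}")
        value = int(token)
        if not low <= value <= high:
            raise ValueError(f"{value} is outside {low}..{high}")
        if value in choices:
            raise ValueError(f"duplicate value: {value}")
        choices.append(value)
    return choices


print(parse_choices("1, 2,3 ,14,15"))
```

The key point is the same as in the answer above: the single string read from the user has to be split into separate items before any per-item validation can run.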


RegEx vscode - replace decimal places and round correctly

Is it possible to use regex to round decimal places?
I have lines that look like this but without any spaces (space added for readability).
0, 162.3707542, -162.3707542
128.2, 151.8299471, -23.62994709 // this 151.829 should lead to 151.83
I want to remove all numbers after the second decimal position and if possible round the second decimal position based on the third position.
0, 162.37, -162.37
128.2, 151.82, -23.62 // already working .82
..., 151.83, ... // intended .83 <- this is my question
What is working
The following regex (see this sample on regex101.com) almost does what i want
([0-9]+\.)([0-9]{2})(\d{0,}) // search
$1$2 // replace
My understanding
The search works like this
group: ([0-9]+\.) find 1 up to n numbers and a point
group: ([0-9]{2}) followed by 2 numbers
group: (\d{0,}) followed by 0 or more numbers / digits
In visual-studio-code in the replacement field only group 1 and 2 are referenced $1$2.
This results in this substitution (regex101.com)
Question
Is it possible to change the last digit of $2 (group two) based on the first digit in $3 (group three) ?
My intention is to round correctly. In the sample above this would mean
151.8299471 // source
151.82 // current result
151.83 // desired result 2 was changed to 3 because of third digit 9
It is not only the last digit of $2 that needs updating: if the number is 199.995, the rounding carries and you have to modify all digits of your result.
You can use the extension Regex Text Generator.
You can use a predefined set of regexes.
"regexTextGen.predefined": {
"round numbers": {
"originalTextRegex": "(-?\\d+\\.\\d+)",
"generatorRegex": "{{=N[1]:fixed(2):simplify}}"
}
}
With the same regex (-?\\d+\\.\\d+) in the VSC Find dialog, select all the numbers you want; you can use Find in Selection and Alt+Enter.
Then execute the command: Generate text based on Regular Expression.
Select the predefined option and press Enter a few times. You get a preview of the result, you can escape the UI and get back the original text.
In the process you can edit generatorRegex to change the number of decimals or to remove the simplify.
It was easier than I thought, once I found the Number.toFixed(2) method.
Using this extension I wrote, Find and Transform, make this keybinding in your keybindings.json:
{
"key": "alt+r", // whatever keybinding you want
"command": "findInCurrentFile",
"args": {
"find": "(-?[0-9]+\\.\\d{3,})", // only need the whole number as one capture group
"replace": [
"$${", // starting wrapper to indicate a js operation begins
"return $1.toFixed(2);", // $1 from the find regex
"}$$" // ending wrapper to indicate a js operation ends
],
// or simply in one line
// "replace": "$${ return $1.toFixed(2); }$$",
"isRegex": true
},
}
[The empty lines above are there just for readability.]
This could also be put into a setting, see the README, so that a command appears in the Command Palette with the title of your choice.
Also note that javascript rounds -23.62994709 to -23.63. You had -23.62 in your question, I assume -23.63 is correct.
If you do want to truncate things like 4.00 to 4 or 4.20 to 4.2 use this replace instead.
"replace": [
"$${",
"let result = $1.toFixed(2);",
"result = String(result).replace(/0+$/m, '').replace(/\\.$/m, '');",
"return result;",
"}$$"
],
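Both extension-based answers above share one idea: the regex only finds the numbers, and ordinary code does the rounding, because a plain replacement string cannot carry digits (199.995 has to become 200.00). A minimal Python sketch of that idea follows. The function names are hypothetical, and half-up rounding on the decimal text is an assumption about what "round correctly" means here.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Numbers with three or more decimals; shorter ones are left untouched.
NUMBER = re.compile(r"-?\d+\.\d{3,}")


def round_match(match):
    """Round one matched number to two decimals (assumed mode: half up)."""
    value = Decimal(match.group(0))
    return str(value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


def round_decimals(text):
    """Replace every long decimal in text with its two-decimal rounding."""
    return NUMBER.sub(round_match, text)


print(round_decimals("128.2,151.8299471,-23.62994709"))
print(round_decimals("199.995"))
```

Passing a function as the replacement is what makes the carry possible; the pattern itself stays as simple as the one in the question.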
We are able to round-off decimal numbers correctly using regular expressions.
We need basically this regex:
secondDD_regx = /(?<=[\d]*\.[\d]{1})[\d]/g; // round-off digit
thirdDD_regx = /(?<=[\d]*\.[\d]{2})[\d]/g; // first discard digit
isNonZeroAfterThirdDD_regx = /(?<=[\d]*\.[\d]{3,})[1-9]/g;
isOddSecondDD_regx = /[13579]/g;
Full code (round-off digit up to two decimal places):
const uptoOneDecimalPlaces_regx = /[\+\-\d]*\.[\d]{1}/g;
const secondDD_regx = /(?<=[\d]*\.[\d]{1})[\d]/g;
const thirdDD_regx = /(?<=[\d]*\.[\d]{2})[\d]/g;
const isNonZeroAfterThirdDD_regx = /(?<=[\d]*\.[\d]{3,})[1-9]/g;
const num = '5.285';
const uptoOneDecimalPlaces = num.match(uptoOneDecimalPlaces_regx)?.[0];
const secondDD = num.match(secondDD_regx)?.[0];
const thirdDD = num.match(thirdDD_regx)?.[0];
const isNonZeroAfterThirdDD = num.match(isNonZeroAfterThirdDD_regx)?.[0];
const isOddSecondDD = /[13579]/g.test(secondDD);
// check carry
const carry = !thirdDD ? 0 : thirdDD > 5 ? 1 : thirdDD < 5 ? 0 : isNonZeroAfterThirdDD ? 1 : isOddSecondDD ? 1 : 0;
let roundOffValue;
if(/9/g.test(secondDD) && carry) {
roundOffValue = (Number(`${uptoOneDecimalPlaces}` + `${secondDD ? Number(secondDD) : 0}`) + Number(`0.0${carry}`)).toString();
} else {
roundOffValue = (uptoOneDecimalPlaces + ((secondDD ? Number(secondDD) : 0) + carry)).toString();
}
// Beautify output: show exactly 2 decimal places if output is x.y or x
const dd = roundOffValue.match(/(?<=[\d]*[\.])[\d]*/g)?.toString().length;
roundOffValue = roundOffValue + (dd=== undefined ? '.00' : dd === 1 ? '0' : '');
console.log(roundOffValue);
For more details check: Round-Off Decimal Number properly using Regular Expression🤔

Comparing filenames and determine their incremental digits

Imagine i have a sequence of files, e.g.:
...
segment8_400_av.ts
segment9_400_av.ts
segment10_400_av.ts
segment11_400_av.ts
segment12_400_av.ts
...
When the filenames are known, i can match against the filenames with a regular expression like:
/segment(\d+)_400_av\.ts/
Because i know the incremental pattern.
But what would be a generic approach to this? I mean how can i take two file names out of the list, compare them and find out where in the file name the counting part is, taking into account any other digits that can occur in the filename (the 400 in this case)?
Goal: What i want to do is to run the script against various file sequences to check for example for missing files, so this should be the first step to find out the numbering scheme. File sequences can occur in many different fashions, e.g.:
test_1.jpg (simple counting suffix)
test_2.jpg
...
or
segment9_400_av.ts (counting part inbetween, with other static digits)
segment10_400_av.ts
...
or
01_trees_00008.dpx (padded with zeros)
01_trees_00009.dpx
01_trees_00010.dpx
Edit 2: Probably my problem can be described more simple: With a given set of files, i want to:
Find out, if they are a numbered sequence of files, with the rules below
Get the first file number, get the last file number and file count
Detect missing files (gaps in the sequence)
Rules:
As melpomene summarized in his answer, the file names only differ in one substring, which consists only of digits
The counting digits can occur anywhere in the filename
The digits can be padded with 0's (see example above)
I can do #2 and #3, what i am struggling with is #1 as a starting point.
You tagged this question regex, so here's a regex-based solution:
use strict;
use warnings;
my $name1 = 'segment12_400_av.ts';
my $name2 = 'segment10_400_av.ts';
if (
"$name1\0$name2" =~ m{
\A
( \D*+ (?: \d++ \D++ )* ) # prefix
( \d++ ) # numeric segment 1
( [^\0]* ) # suffix
\0 # separator
\1 # prefix
( \d++ ) # numeric segment 2
\3 # suffix
\z
}xa
) {
print <<_EOT_;
Result of comparing "$name1" and "$name2"
Common prefix: $1
Common suffix: $3
Varying numeric parts: $2 / $4
Position of varying numeric part: $-[2]
_EOT_
}
Output:
Result of comparing "segment12_400_av.ts" and "segment10_400_av.ts"
Common prefix: segment
Common suffix: _400_av.ts
Varying numeric parts: 12 / 10
Position of varying numeric part: 7
It assumes that
the strings are different (guard the condition with $name1 ne $name2 && ... if that's not guaranteed)
there's only one substring that's different between the input strings (otherwise it won't find any match)
the differing substring consists of digits only
all digits surrounding the first point of difference are part of the varying increment (e.g. the example above recognizes segment as the common prefix, not segment1)
The idea is to combine the two names into a single string (separated by NUL, which is unambiguous because filenames can't contain \0), then let the regex engine do the hard work of finding the longest common prefix (using greediness and backtracking).
Because we're in a regex, we can get a bit more fancy than just finding the longest common prefix: We can make sure that the prefix doesn't end with a digit (see the segment1 vs. segment case above) and we can verify that the suffix is also the same.
See if this works for you:
use strict;
use warnings;
sub compare {
my ( $f1, $f2 ) = @_;
my @f1 = split /(\d+)/sxm, $f1;
my @f2 = split /(\d+)/sxm, $f2;
my $i = 0;
my $out1 = q{};
my $out2 = q{};
foreach my $p (@f1) {
if ( $p eq $f2[$i] ) {
$out1 .= $p;
$out2 .= $p;
}
else {
$out1 .= sprintf ' ((%s)) ', $p;
$out2 .= sprintf ' ((%s)) ', $f2[$i];
}
$i++;
}
print $out1 . "\n";
print $out2 . "\n";
return;
}
print "Test1:\n";
compare( 'segment8_400_av.ts', 'segment9_400_av.ts' );
print "\n\nTest2:\n";
compare( 'segment999_8_400_av.ts', 'segment999_9_400_av.ts' );
You basically split the strings on runs of digits, then loop through the items and compare each of the 'pieces'. If they are equal, you accumulate. If not, then you highlight the differences and accumulate.
Output (I'm using ((number)) for the highlight)
Test1:
segment ((8)) _400_av.ts
segment ((9)) _400_av.ts
Test2:
segment999_ ((8)) _400_av.ts
segment999_ ((9)) _400_av.ts
I assume that only the counter differs across the strings
use warnings;
use strict;
use feature 'say';
my ($fn1, $fn2) = ('segment8_400_av.ts', 'segment12_400_av.ts');
# Collect all numbers from all strings
my @nums = map { [ /([0-9]+)/g ] } ($fn1, $fn2);
my ($n, $pos); # which number in the string, at what position
# Find which differ
NUMS:
for my $j (1..$#nums) { # strings
for my $i (0..$#{$nums[0]}) { # numbers in a string
if ($nums[$j]->[$i] != $nums[0]->[$i]) { # it is i-th number
$n = $i;
$fn1 =~ /($nums[0]->[$i])/g; # to find position
$pos = $-[$i];
say "It is $i-th number in a string. Position: $pos";
last NUMS;
}
}
}
We loop over the array with arrayrefs of numbers found in each string, and over elements of each arrayref (eg [8, 400]). Each number in a string (0th or 1st or ...) is compared to its counterpart in the 0-th string (array element); all other numbers are the same.
The number of interest is the one that differs and we record which number in a string it is ($n-th).
Then its position in the string is found by matching it again and using the @- regex variable with (the just established) index $n, so the offset of the start of the n-th match. This part may be unneeded; while the question edits helped, I am still unsure whether the position is useful.
Prints, with position counting from 0
It is 0-th number in a string. Position: 7
Note that, once it is found that it is the $i-th number, we can't use index to find its position; a number earlier in the string may happen to be the same as the $i-th one.
To test, modify input strings by adding the same number to each, before the one of interest.
Per question update, to examine the sequence (for missing files for instance), with the above findings you can collect counters for all strings in an array with hashrefs (num => filename)
use Data::Dump qw(dd);
my @seq = map { { $nums[$_]->[$n] => $fnames[$_] } } 0..$#fnames;
dd \@seq;
where @fnames contains the filenames (like the two picked for the example above, $fn1 and $fn2). This assumes that the file list was sorted to begin with; add the sort if it wasn't:
my @seq =
sort { (keys %$a)[0] <=> (keys %$b)[0] }
map { { $nums[$_]->[$n] => $fnames[$_] } }
0..$#fnames;
The order is maintained by array.
Adding this to the above example (with two strings) adds to the print
[
{ 8 => "segment8_400_av.ts" },
{ 12 => "segment12_400_av.ts" },
]
With this, all goals in "Edit 2" should be straightforward.
I suggest that you build a regex pattern by changing all digit sequences to (\d+) and then see which captured values have changed
For instance, with segment8_400_av.ts and
segment9_400_av.ts you would generate a pattern /segment(\d+)_(\d+)_av\.ts/. Note that s/\d+/(\d+)/g will return the number of numeric fields, which you will need for the subsequent check
The first would capture 8 and 400, while the second would capture 9 and 400. 8 is different from 9, so that is the region of the string where the number varies
I can't really write much code as you don't say what sort of result you want from this process
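The last suggestion (turn every digit run into a capture group, then see which capture changes) can be sketched as follows, together with the gap check from "Edit 2". This is a minimal Python illustration under the question's stated rule that exactly one digit run varies; the function names find_counter and missing_numbers are hypothetical.

```python
import re


def find_counter(names):
    """Return (field_index, numbers) for the one digit run that varies, else None."""
    # re.split with a capture group keeps the digit runs at the odd positions.
    parts = re.split(r"(\d+)", names[0])
    pattern = "".join(
        r"(\d+)" if i % 2 else re.escape(part) for i, part in enumerate(parts)
    )
    rows = []
    for name in names:
        match = re.fullmatch(pattern, name)
        if match is None:  # name does not fit the scheme of the first file
            return None
        rows.append(match.groups())
    varying = [i for i in range(len(rows[0])) if len({row[i] for row in rows}) > 1]
    if len(varying) != 1:  # no counter found, or more than one field changes
        return None
    idx = varying[0]
    return idx, [int(row[idx]) for row in rows]


def missing_numbers(numbers):
    """List the gaps between the smallest and largest counter value."""
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]


files = ["segment8_400_av.ts", "segment9_400_av.ts", "segment12_400_av.ts"]
idx, numbers = find_counter(files)
print(idx, min(numbers), max(numbers), len(numbers), missing_numbers(numbers))
```

Captures are compared as strings, so zero-padded counters such as 00008 and 00009 are still seen as different, and they are only converted to integers once the varying field is known.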

Why is max number ignoring two-digit numbers?

At the moment I am saving a set of variables to a text file. I am doing the following to check whether my code works, but whenever I use a two-digit number such as 10, it is not printed as the max number.
If my text file looked like this.
tom:5
tom:10
tom:1
It would output 5 as the max number.
name = input('name')
score = 4
if name == 'tom':
fo= open('tom.txt','a')
fo.write('Tom: ')
fo.write(str(score ))
fo.write("\n")
fo.close()
if name == 'wood':
fo= open('wood.txt','a')
fo.write('Wood: ')
fo.write(str(score ))
fo.write("\n")
fo.close()
tomL2 = []
woodL2 = []
fo = open('tom.txt','r')
tomL = fo.readlines()
tomLi = tomL2 + tomL
fo.close
tomLL=max(tomLi)
print(tomLL)
fo = open('wood.txt','r')
woodL = fo.readlines()
woodLi = woodL2 + woodL
fo.close
woodLL=max(woodLi)
print(woodLL)
You are comparing strings, not numbers. You need to convert them into numbers before using max. For example, you have:
tomL = fo.readlines()
This contains a list of strings:
['tom:5\n', 'tom:10\n', 'tom:1\n']
Strings are ordered lexicographically (much like how words would be ordered in an English dictionary). If you want to compare numbers, you need to turn them into numbers first:
tomL_scores = [int(s.split(':')[1]) for s in tomL]
The parsing is done in the following way:
….split(':') separates the string into parts using a colon as the delimiter:
'tom:5\n' becomes ['tom', '5\n']
…[1] chooses the second element from the list:
['tom', '5\n'] becomes '5\n'
int(…) converts a string into an integer:
'5\n' becomes 5
The list comprehension [… for s in tomL] applies this sequence of operations to every element of the list.
Note that int (and similarly float) is rather picky about what it accepts: the string must be in the form of a valid numeric literal or it will be rejected with an error (although leading and trailing whitespace is allowed). This is why you need ….split(':')[1] to massage the string into a form that it's willing to accept.
This will yield:
[5, 10, 1]
Now, you can apply max to obtain the largest score.
As a side-note, the statement
fo.close
will not close a file, since it doesn't actually call the function. To call the function you must enclose the arguments in parentheses, even if there are none:
fo.close()
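Putting the two points together, here is a small sketch that contrasts the string comparison with the numeric one and reads the file with a with-block, so it is closed without an explicit close() call. The helper names max_score and read_lines are hypothetical.

```python
def max_score(lines):
    """Return the largest integer score from lines shaped like 'name:score'."""
    scores = [int(line.split(':')[1]) for line in lines if line.strip()]
    return max(scores)


def read_lines(path):
    """Read all lines; the with-block closes the file even if an error occurs."""
    with open(path) as fo:
        return fo.readlines()


sample = ['tom:5\n', 'tom:10\n', 'tom:1\n']
print(repr(max(sample)))  # string comparison picks 'tom:5\n'
print(max_score(sample))  # numeric comparison picks 10
```

In the original script this would be used as max_score(read_lines('tom.txt')).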

Error in writing output file through AWK scripting

I have an AWK script that writes specific values matching a specific pattern to a .csv file.
The code is as follows:
BEGIN{print "Query Start,Query End, Target Start, Target End,Score, E,P,GC"}
/^\>g/ { Query=$0 }
/Query =/{
split($0,a," ")
query_start=a[3]
query_end=a[5]
query_end=gsub(/,/,"",query_end)
target_start=a[8]
target_end=a[10]
}
/Score =/{
split($0,a," ")
score=a[3]
score=gsub(/,/,"",score)
e=a[6]
e=gsub(/,/,"",e)
p=a[9]
p=gsub(/,/,"",p)
gc=a[12]
printf("%s,%s,%s,%s,%s,%s,%s,%s\n",query_start, query_end,target_start,target_end,score,e,p,gc)
}
The input file is as follows:
>gi|ABCDEF|
Plus strand results:
Query = 100 - 231, Target = 100 - 172
Score = 20.92, E = 0.01984, P = 4.309e-08, GC = 51
But I received the output in a .csv file as provided below:
100 0 100 172 0 0 0 51
The program failed to copy the values of:
Query end
Score
E
P
(Note: all the failed values are present before comma (,))
Any help to obtain the right output will be great.
Best regards,
Amit
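As @Jidder mentioned, you don't need to call split(), and as @jaypal mentioned, you're using gsub() incorrectly (it modifies its target in place and returns the number of substitutions, so assigning its result overwrites the value with a count). You also don't need to call gsub() at all if you just include , in your FS.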
Try this:
BEGIN {
FS = "[[:space:],]+"
OFS = ","
print "Query Start","Query End","Target Start","Target End","Score","E","P","GC"
}
/^\>g/ { Query=$0 }
/Query =/ {
query_start=$4
query_end=$6
target_start=$9
target_end=$11
}
/Score =/ {
score=$4
e=$7
p=$10
gc=$13
print query_start,query_end,target_start,target_end,score,e,p,gc
}
Does that work? Note that the field numbers are bumped up by 1: when you don't use the default FS, awk no longer skips leading white space, so there's an empty field before the white space in your input.
Obviously, you are not using your Query variable so the line that populates it is redundant.
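The shift in field numbers is easy to see by splitting the two sample lines on the same separator outside awk. The Python sketch below assumes, as the answer does, that the real input lines are indented; the helper names awk_fields and field are hypothetical.

```python
import re

# Same separator as the awk FS above: runs of white space and/or commas.
FIELD_SEP = re.compile(r"[\s,]+")


def awk_fields(line):
    """Split a line the way awk does with a regex FS (no leading-space trimming)."""
    return FIELD_SEP.split(line)


def field(fields, n):
    """Return field n using awk's 1-based numbering ($1, $2, ...)."""
    return fields[n - 1]


# Assumed: the lines start with white space, which yields an empty first field.
query_line = "  Query = 100 - 231, Target = 100 - 172"
score_line = "  Score = 20.92, E = 0.01984, P = 4.309e-08, GC = 51"

q = awk_fields(query_line)
s = awk_fields(score_line)
row = [field(q, 4), field(q, 6), field(q, 9), field(q, 11),
       field(s, 4), field(s, 7), field(s, 10), field(s, 13)]
print(",".join(row))
```

Because the commas are consumed by the separator, no value needs a separate clean-up step afterwards.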

decision on regular expression length

I want to accomplish the following requirements using regex only (no C# code can be used):
• If the BTN length is 12 and the BTN starts with 0[123456789], it should remove one digit from the left and one digit from the right.
WORKING CORRECTLY
• If the BTN length is 12 and the case above does not apply, it should always return the 10 rightmost digits by removing 2 from the start (e.g. 491234567891 should be changed to 1234567891).
NOT WORKING CORRECTLY
• If the BTN length is 11, it should remove one digit from the left. WORKING CORRECTLY
For BTNs of length <= 10, nothing needs to be done; they can remain as they are, or the regex may fail on them, which is acceptable.
Using SQL, this can be achieved like this:
case when len(BTN) = 12 and BTN like '0[123456789]%' then SUBSTRING(BTN,2,10) else RIGHT(BTN,10) end
But how can this be done using regex?
So far I have been able to get some correct results using this regex:
[0*|\d\d]*(.{10}) but with this regex I cannot correctly remove the first and last characters of a BTN such as 015732888810 to get 1573288881; it returns 5732888810, which is wrong.
The code is:
string s = "111112573288881,0573288881000,057328888105,005732888810,15732888815,344956345335,004171511326,01777203102,1772576210,015732888810,494956345335";
string[] arr = s.Split(',');
foreach (string ss in arr)
{
// Match mm = Regex.Match(ss, @"\b(?:00(\d{10})|0(\d{10})\d?|(\d{10}))\b");
// Match mm = Regex.Match(ss, "0*(.{10})");
// ([0*|\\d\\d]*(.{10}))|
Match mm = Regex.Match(ss, "[0*|\\d\\d]*(.{10})");
// Match mm = Regex.Match(ss, "(?(^\\d{12}$)(.^{12}$)|(.^{10}$))");
// Match mm = Regex.Match(ss, "(info)[0*|\\d\\d]*(.{10}) (?(1)[0*|\\d\\d]*(.{10})|[0*|\\d\\d]*(.{10}))");
string m = mm.Groups[1].Value;
Console.WriteLine("Original BTN :"+ ss + "\t\tModified::" + m);
}
This should work:
(0(\d{10})0|\d\d(\d{10}))
UPDATE:
(0(\d{10})0|\d{1,2}(\d{10}))
1st alternate will match 12-digits with 0 on left and 0 on right and give you only 10 in between.
2nd alternate will match 11 or 12 digits and give you the right 10.
EDIT:
The regex matches the spec, but your code doesn't read the results correctly. Try this:
Match mm = Regex.Match(ss, "(0(\\d{10})0|\\d{1,2}(\\d{10}))");
string m = mm.Groups[2].Value;
if (string.IsNullOrEmpty(m))
m = mm.Groups[3].Value;
Groups are as follows:
index 0: returns the full match
index 1: returns everything inside the outer parentheses
index 2: returns only what the parentheses inside the first alternate match
index 3: returns only what the parentheses inside the second alternate match
NOTE: This does not deal with anything greater than 12 digits or less than 11. Those entries will either fail or return 10 digits from somewhere. If you want results for those use this:
"(0(\\d{10})0|\\d*(\\d{10}))"
You'll get rightmost 10 digits for more than 12 digits, 10 digits for 10 digits, nothing for less than 10 digits.
EDIT:
This one should cover your additional requirements from the comments:
"^(?:0|\\d*)(\\d{10})0?$"
The (?:) makes a grouping excluded from the Groups returned.
EDIT:
This one might work:
"^(?:0?|\\d*)(\\d{10})\\d?$"
(?(^\d{12}$)(?(^0[1-9])0?(?<digit>.{10})|\d*(?<digit>.{10}))|\d*(?<digit>.{10}))
which does exactly the same thing as the SQL query and always gives the result in Groups[1], so I didn't have to change the code at all :)
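The same decision can be written without the .NET-only conditional syntax by putting a lookahead at the start of the first alternative. The Python sketch below mirrors the SQL CASE expression from the question; the names BTN_RE, normalize_btn and the group names inner and tail are hypothetical.

```python
import re

BTN_RE = re.compile(
    # 12 digits starting with 0[1-9]: drop the first and the last digit.
    r"^(?:(?=0[1-9]\d{10}$)0(?P<inner>\d{10})\d"
    # Anything else with at least 10 digits: keep the rightmost 10.
    r"|\d*(?P<tail>\d{10}))$"
)


def normalize_btn(btn):
    """Return the 10-digit BTN, or None when the input has fewer than 10 digits."""
    match = BTN_RE.match(btn)
    if match is None:
        return None
    return match.group("inner") or match.group("tail")


for btn in ["015732888810", "491234567891", "01777203102", "1772576210"]:
    print(btn, "->", normalize_btn(btn))
```

The lookahead plays the role of the length and prefix test, and only one of the two named groups is ever set, so the caller picks whichever one matched.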