get first char of each string in a list

I am trying to get the first char of each string in a List. The list contains:
hehee_one
hrhr_one
test_two
I am using ForEach-Object to loop over the list:
ForEach-Object {($jsonRules.name[0])}
But this gets only the first element, which makes sense. And if I do this:
ForEach-Object {($jsonRules.name[0][0])}
I only get the first char of the first element, but not of the rest.
Please help, thank you.

Santiago Squarzon provided the crucial pointer in a comment:
Provide the strings you want the ForEach-Object cmdlet to operate on via the pipeline, which allows you to refer to each via the automatic $_ variable (the following uses an array literal as input for brevity):
PS> 'tom', 'dick', 'harry' | ForEach-Object { $_[0] }
t
d
h
Alternatively, for values already in memory, use the .ForEach array method for better performance:
('tom', 'dick', 'harry').ForEach({ $_[0] })
The foreach statement provides the best performance:
foreach ($str in ('tom', 'dick', 'harry')) { $str[0] }
As for what you tried:
ForEach-Object { ... } - without pipeline input - is essentially the same as executing ... directly.
Thus, expressed in terms of the sample input above:
You executed:
ForEach-Object { ('tom', 'dick', 'harry')[0][0] }
which is the same as:
('tom', 'dick', 'harry')[0][0]
which therefore extracts the first element from the input array (the first [0]) and then applies the second [0] to that string only, and therefore only yields 't'.
In other words: use of ForEach-Object only makes sense with input from the pipeline.
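As a side note (my addition, not part of the answer above): indexing into a string yields a [char], not a one-character [string], which can matter if you compare or concatenate the result later. A minimal sketch:

```powershell
# Indexing a string returns a [char]:
$first = 'tom'[0]
$first.GetType().Name    # Char

# Cast it if you need a [string]:
[string]$first           # 't'

# Or use Substring, which returns a [string] directly:
'tom'.Substring(0, 1)    # 't'
```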


PowerShell regex to get just hex part in strings

I'm working on a function that gets the map of string keys and their hex values. I got the string-key part working, but I'm having trouble getting the hex part to work. This is my function so far:
function Get-Contents4_h {
    [CmdletBinding()]
    Param ([string]$fileContent)
    #define Error_Failed_To_Do_A 0x81A0 /* random comments */
    #define Error_Failed_To_Do_B 0x810A
    # create an ordered hashtable to store the results
    $errorMap = [ordered]@{}
    # process the lines one-by-one
    switch -Regex ($fileContent -split '\r?\n') {
        'define ([\w]*)' { # Error_Failed_To_Do_  #this works fine
            $key = ($matches[1]).Trim()
        }
        '([0x\w]*)' { # 0x04A etc  #this does not work
            $errorMap[$key] = ($matches[1]).Trim()
        }
    }
    # output the completed data as object
    #[PsCustomObject]$errorMap
    return $errorMap
}
I'm going to be looping through the returned map and matching the hex value with the key in another object.
This is what the string parameter to the function looks like:
#define Error_Failed_To_Do_A 0x81A0 /* random comments */
#define Error_Failed_To_Do_B 0x810A
For some reason my 0x\w regex is not returning anything on regex101.com. I've had luck with it on other hex numbers, but not this time.
I've tried this and other variations as well: ^[\s\S]*?#[\w]*[\s\S]+([0x\w]*)
This is with PowerShell 5.1 and VS Code.
You need to remove the [...] character-class construct around 0x\w - the 0x occurs exactly once in the input string, and the following word characters appear at least once - whereas the expression [0x\w]* can be satisfied by an empty string (thanks to the *, the 0-or-more quantifier).
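A quick way to see the empty-match behavior (illustrative, using .NET's Regex class directly):

```powershell
# [0x\w]* can match the empty string, so the first match - at position 0,
# in front of the '#' - is empty:
[regex]::Match('#define Foo 0x81A0', '[0x\w]*').Value   # '' (empty)

# Without the character class, the literal 0x must actually be present:
[regex]::Match('#define Foo 0x81A0', '0x\w+').Value     # 0x81A0
```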
I'd suggest matching the whole line at once with a single pattern instead:
switch -Regex ($fileContent -split '\r?\n') {
    '^\s*#define\s+(\w+)\s+(0x\w+)' {
        $key, $value = $Matches[1, 2] | ForEach-Object Trim
        $errorMap[$key] = $value
    }
}
This works for me. The square brackets match any one character inside them at a time. The pattern with the square brackets has 18 matches in this line, the first match being the empty string '' - regex101.com says the same thing (null): https://regex101.com/r/PZ8Y8C/1
0x[\w]* would work, but then you might as well drop the brackets. I made an example data file and then a script showing how I would do it.
'#define Error_Failed_To_Do_A 0x81A0 /* random comments */' |
    Select-String '[0x\w]*' -AllMatches | % matches | measure | % count
18
'#define Error_Failed_To_Do_A 0x81A0 /* random comments */
#define Error_Failed_To_Do_B 0x810A' |
set-content file.txt
# Get-Contents4_h.ps1
Param ($file)
switch -Regex -File $file {
    'define (\w+).*(0x\w+)' {
        [pscustomobject]@{
            Error = $matches[1]
            Hex   = $matches[2]
        }
    }
}
.\Get-Contents4_h.ps1 file.txt
Error Hex
----- ---
Error_Failed_To_Do_A 0x81A0
Error_Failed_To_Do_B 0x810A

using a partial variable as part of regular expression string

I'm trying to loop through an array of alarm codes and use a regular expression to look it up in cpp code. I know my regex works when I hard code a value in and use double quotes for my regex, but I need to pass in a variable because it's a list of about 100 to look up with separate definitions. Below is what I want to use in general. How do I fix it so it works with $lookupItem instead of hard-coding "OTHER-ERROR" for example in the Get-EpxAlarm function? I tried single quotes and double quotes around $lookupItem in the $fullregex definition, and it returns nothing.
Function Get-EpxAlarm {
    [cmdletbinding()]
    Param ( [string]$fileContentsToParse, [string]$lookupItem)
    Process
    {
        $lookupItem = "OTHER_ERROR"
        Write-Host "In epx Alarm" -ForegroundColor Cyan
        # construct regex
        $fullregex = [regex]'$lookupItem',           # Start of error message ########variable needed
                     ":[\s\Sa-zA-Z]*?=",             # match anything, non-greedy
                     "(?<epxAlarm>[\sa-zA-Z_0-9]*)", # Capture epxAlarm Num
                     '' -join ''
        # run the regex
        $Values = $fileContentsToParse | Select-String -Pattern $fullregex -AllMatches
        # Convert Name-Value pairs to object properties
        $result = $Values.Matches
        Write-Host $result
        #Write-Host "result:" $result -ForegroundColor Green
        return $result
    }#process
}#function
#main code
...
Get-EpxAlarm -fileContentsToParse $epxContents -lookupItem $item
...
where $fileContentsToParse is
case OTHER_ERROR:
bstrEpxErrorNum = FATAL_ERROR;
break;
case RI_FAILED:
case FILE_FAILED:
case COMMUNICATION_FAILURE:
bstrEpxErrorNum = RENDERING_ERROR;
break;
So if I look for OTHER_ERROR, it should return FATAL_ERROR.
I tested my regular expression in regex editor and it works with the hard-coded value. How can I define my regex so that I use the parameter and it returns the same thing as hard-coding the parameter value?
I wouldn't recommend trying to construct a single regular expression to do complex source code parsing - it gets quite unreadable really quickly.
Instead, write a small error mapping parser that just reads the source code line by line and constructs the error mapping table as it goes along:
function Get-EpxErrorMapping {
    param([string]$EPXFileContents)
    # create hashtable to hold the final mappings
    $errorMap = @{}
    # create array to collect keys that are grouped together
    $keys = @()
    switch -Regex ($EPXFileContents -split '\r?\n') {
        'case (\w+):' {
            # add relevant key to key collection
            $keys += $Matches[1]
        }
        'bstrEpxErrorNum = (\w+);' {
            # we've reached the relevant error, set it for all relevant keys
            foreach ($key in $keys) {
                $errorMap[$key] = $Matches[1]
            }
        }
        'break' {
            # reset/clear key collection
            $keys = @()
        }
    }
    return $errorMap
}
Now all you need to do is call this function and use the resulting table to resolve the $lookupItem value:
Function Get-EpxAlarm {
    [CmdletBinding()]
    param(
        [string]$fileContentsToParse,
        [string]$lookupItem
    )
    $errorMap = Get-EpxErrorMapping $fileContentsToParse
    return $errorMap[$lookupItem]
}
Now we can get the corresponding error code:
$epxContents = @'
case OTHER_ERROR:
bstrEpxErrorNum = FATAL_ERROR;
break;
case RI_FAILED:
case FILE_FAILED:
case COMMUNICATION_FAILURE:
bstrEpxErrorNum = RENDERING_ERROR;
break;
'@
# this will now return the string "FATAL_ERROR"
Get-EpxAlarm -fileContentsToParse $epxContents -lookupItem OTHER_ERROR

Regex performance

I am benchmarking different approaches to RegEx and seeing something I really don't understand. I am specifically comparing using the -match operator vs using the [regex]::Matches() accelerator.
I started with
(Measure-Command {
    foreach ($i in 1..10000) {
        $path -match $testPattern
    }
}).TotalSeconds
(Measure-Command {
    foreach ($i in 1..10000) {
        [regex]::Matches($path, $testPattern)
    }
}).TotalSeconds
and -match is always very slightly faster. But it's also not apples to apples because I need to assign the [Regex] results to a variable to use it. So I added that
(Measure-Command {
    foreach ($i in 1..10000) {
        $path -match $testPattern
    }
}).TotalSeconds
(Measure-Command {
    foreach ($i in 1..10000) {
        $test = [regex]::Matches($path, $testPattern)
    }
}).TotalSeconds
And now [Regex] is consistently slightly faster, which makes no sense because I added to the workload with the variable assignment. The performance difference is ignorable, 1/100th of a second when doing 10,000 matches, but I wonder what is going on under the hood to make [Regex] faster when there is a variable assignment involved?
For what it's worth, without the variable assignment -match is faster, .05 seconds vs .03 seconds. With variable assignment [Regex] is faster by .03 seconds vs .02 seconds. So while it IS all negligible, adding the variable cuts [Regex] processing time more than in half, which is a (relatively) huge delta.
The outputs of both tests are different.
The accelerator outputs a lot more text.
Even though they are not displayed when wrapped in the Measure-Command cmdlet, they are part of the calculation.
Output of $path -match $testPattern
True
Output of [regex]::Matches($path, $testPattern)
Groups : {0}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 0
Value :
Writing stuff is slow.
In your second example, you take care of the accelerator output by assigning it to a variable. That's why it is significantly faster.
You can see the difference without assignment by voiding the outputs
If you do that, you'll see the accelerator is consistently slightly faster.
(Measure-Command {
    foreach ($i in 1..10000) {
        [void]($path -match $testPattern)
    }
}).TotalSeconds
(Measure-Command {
    foreach ($i in 1..10000) {
        [void]([regex]::Matches($path, $testPattern))
    }
}).TotalSeconds
Additional note
[void] is always more efficient than Command | Out-Null.
The pipeline is slower, but memory-efficient.
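As an illustrative sketch (the $path and $testPattern values below are made up for the demo; actual timings vary by machine), the three common suppression techniques can be compared the same way:

```powershell
# Hypothetical inputs for the demo - substitute your own.
$path = 'C:\Users\test\Documents\file.txt'
$testPattern = '\w+'

# [void] cast - no pipeline involved
(Measure-Command {
    foreach ($i in 1..10000) { [void]($path -match $testPattern) }
}).TotalSeconds

# Out-Null - spins up a pipeline per iteration, typically slower
(Measure-Command {
    foreach ($i in 1..10000) { ($path -match $testPattern) | Out-Null }
}).TotalSeconds

# $null = assignment - roughly comparable to [void]
(Measure-Command {
    foreach ($i in 1..10000) { $null = $path -match $testPattern }
}).TotalSeconds
```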
This isn't an answer to the direct question asked, but it's an expansion on the performance of pre-compiled regexes that I mentioned in comments...
First, here's my local performance benchmark for the original code in the question for comparison (with some borrowed text and patterns):
$text = "foo" * 1e6;
$pattern = "f?(o)";
$count = 1000000;
# example 1
(Measure-Command {
    foreach ($i in 1..$count) {
        $text -match $pattern
    }
}).TotalSeconds
# 8.010825

# example 2
(Measure-Command {
    foreach ($i in 1..$count) {
        $result = [regex]::Matches($text, $pattern)
    }
}).TotalSeconds
# 6.8186813
And then using a pre-compiled regex, which according to Compilation and Reuse in Regular Expressions emits a native assembly to process the regex rather than the default "sequence of internal instructions" - whatever that actually means :-).
$text = "foo" * 1e6;
$pattern = "f?(o)";
$count = 1000000;
# example 3
$regex = [regex]::new($pattern, "Compiled");
(Measure-Command {
    foreach ($i in 1..$count) {
        $result = $regex.Matches($text)
    }
}).TotalSeconds
# 5.8794981

# example 4
(Measure-Command {
    $regex = [regex]::new($pattern, "Compiled");
    foreach ($i in 1..$count) {
        $result = $regex.Matches($text)
    }
}).TotalSeconds
# 3.6616832

# example 5
# see https://github.com/PowerShell/PowerShell/issues/8976
(Measure-Command {
    & {
        $regex = [regex]::new($pattern, "Compiled");
        foreach ($i in 1..$count) {
            $result = $regex.Matches($text);
        }
    }
}).TotalSeconds
# 1.5474028
Note that Example 3 has a performance overhead of finding / resolving the $regex variable from inside each iteration because it's defined outside the Measure-Command's -Expression scriptblock - see https://github.com/PowerShell/PowerShell/issues/8976 for details.
Example 5 defines the variable inside a nested scriptblock and so is a lot faster. I'm not sure why Example 4 sits in between the two in performance, but it's useful to note there's a definite difference :-)
Also, as an aside, in my comments above, my original version of Example 5 didn't have the &, which meant I was timing the effort required to define the scriptblock, not execute it, so my numbers were way off. In practice, the performance increase is a lot less than my comment suggested, but it's still a decent improvement if you're executing millions of matches in a tight loop...

Reading list style text file into powershell array

I am provided a list of string blocks in a text file, and I need this to be in an array in PowerShell.
The list looks like this
a:1
b:2
c:3
d:
e:5
[blank line]
a:10
b:20
c:30
d:
e:50
[blank line]
...
and I want this in a PowerShell array to further work with it.
I'm using
$output = @()
Get-Content ".\Input.txt" | ForEach-Object {
    $splitline = ($_).Split(":")
    if ($splitline.Count -eq 2) {
        if ($splitline[0] -eq "a") {
            #Write-Output "New Block starting"
            $output += ($string)
            $string = "$($splitline[1])"
        } else {
            $string += ",$($splitline[1])"
        }
    }
}
Write-Host $output -ForegroundColor Green
$output | Export-Csv ".\Output.csv" -NoTypeInformation
$output | Out-File ".\Output.txt"
But this whole thing feels quite cumbersome, and the output is not a CSV file - which at this point I think is because of the way I use the array. Out-File does produce a file that contains rows separated by commas.
Maybe someone can give me a push in the right direction.
Thanks
One solution is to convert your data to an array of hash tables that can be read into a custom object. Then the output array object can be exported, formatted, or read as required.
$hashtables = (Get-Content Input.txt) -replace '(.*?):','$1=' | ConvertFrom-StringData
$ObjectShell = "" | Select-Object ($hashtables.Keys | Select-Object -Unique)
$output = foreach ($hashtable in $hashtables) {
    $obj = $ObjectShell.psobject.Copy()
    foreach ($n in $hashtable.GetEnumerator()) {
        $obj.($n.Key) = $n.Value
    }
    $obj
}
$output
$output | Export-Csv Output.csv -NoTypeInformation
Explanation:
The first colons (:) on each line are replaced with =. That enables ConvertFrom-StringData to create an array of hash tables with values on the LHS of the = being the keys and values on the RHS of the = being the values. If you know there is only one : on each line, you can make the -replace operation simpler.
$ObjectShell is just an object with all of the properties your data presents. You need all of your properties present for each line of data whether or not you assign values to them. Otherwise, your CSV output or table view within the console will have issues.
The first foreach iterates through the $hashtables array. Then we need to enumerate through each hash table to find the keys and values, which is performed by the second foreach loop. Each key/value pair is stored as a copy of $ObjectShell. The .psobject.Copy() method is used to prevent references to the original object. Updating data that is a reference will update the data of the original object.
$output contains the array of objects of all processed data.
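A minimal sketch (my addition, not from the answer) of why the .psobject.Copy() call matters:

```powershell
$a = [pscustomobject]@{ x = 1 }

# Plain assignment copies the reference, not the object:
$b = $a
$b.x = 2
$a.x        # now 2 - $a was changed through $b

# .psobject.Copy() produces an independent (shallow) copy:
$c = $a.psobject.Copy()
$c.x = 3
$a.x        # still 2
```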
Usability of output:
# Console Output
$output | format-table
a b c d e
- - - - -
1
2
3
5
10
20
30
50
# Convert to CSV
$output | ConvertTo-Csv -NoTypeInformation
"a","b","c","d","e"
"1",,,,
,"2",,,
,,"3",,
,,,"",
,,,,"5"
,,,,
"10",,,,
,"20",,,
,,"30",,
,,,"",
,,,,"50"
# Accessing Properties
$output.b
2
20
$output[0],$output[1]
a : 1
b :
c :
d :
e :
a :
b : 2
c :
d :
e :
Alternative Conversion:
$output = ((Get-Content Input.txt -Raw) -split "(?m)^\r?\n") | ForEach-Object {
    $data = $_ -replace "(.*?):(.*?)(\r?\n)",'"$1":"$2",$3'
    $data = $data.Remove($data.LastIndexOf(','),1)
    ("{1}`r`n{0}`r`n{2}" -f $data,'{','}') | ConvertFrom-Json
}
$output | ConvertTo-Csv -NoType
$output | ConvertTo-Csv -NoType
Alternative Explanation:
Since ConvertFrom-StringData does not guarantee hash table key order, this alternative readies the file for a JSON conversion. This will maintain the property order listed in the file provided each group's order is the same. Otherwise, the property order of the first group will be respected.
All properties and their respective values are divided by the first : character on each line. The property and value are each surrounded by double quotes. Each property line is separated by a ,. Then finally the opening { and closing } are added. The resulting JSON-formatted string is converted to a custom object.
You can split by \n newline, see example:
$text = @"
a:1
b:2
c:3
d:
e:5
a:10
b:20
c:30
d:
e:50
e:50
e:50
e:50
"@
$Array = $text -split '\n' | ? {$_}
$Array.Count
15
if you want to exclude the empty lines, add ? {$_}
With your example:
$Array = (Get-Content ".\Input.txt") -split '\n' | ? {$_}

Perl regex replacement string special variable

I'm aware of the match, prematch, and postmatch predefined variables. I'm wondering if there is something similar for the evaluated replacement part of the s/// operator.
This would be particularly useful in dynamic expressions so they don't have to be evaluated a 2nd time.
For example, I currently have %regexs which is a hash of various search and replace strings.
Here's a snippet:
while (<>) {
    foreach my $key (keys %regexs) {
        while (s/$regexs{$key}{'search'}/$regexs{$key}{'replace'}/ee) {
            # Here I want to do something with just the replaced part
            # without reevaluating.
        }
    }
    print;
}
Is there a convenient way to do it? Perl seems to have so many convenient shortcuts, and it seems like a waste to have to evaluate twice (which appears to be the alternative).
EDIT: I just wanted to give an example: $regexs{$key}{'replace'} might be the string '"$2$1"' thus swapping the positions of some text in the string $regexs{$key}{'search'} which might be '(foo)(bar)' - thus resulting in "barfoo". The second evaluation that I'm trying to avoid is the output of $regexs{$key}{'replace'}.
Instead of using string eval (which I assume is what's going on with s///ee), you could define code references to do the work. Those code references can then return the value of the replacement text. For example:
use strict;
use warnings;
my %regex = (
    digits => sub {
        my $r;
        return unless $_[0] =~ s/(\d)(\d)_/$r = $2.$1/e;
        return $r;
    },
);
while (<DATA>) {
    for my $k (keys %regex) {
        while ( my $replacement_text = $regex{$k}->($_) ) {
            print $replacement_text, "\n";
        }
    }
    print;
}
__END__
12_ab_78_gh_
34_cd_78_yz_
I'm pretty sure there isn't any direct way to do what you're asking, but that doesn't mean it's impossible. How about this?
{
    my $capture;
    sub capture {
        $capture = $_[0] if @_;
        $capture;
    }
}
while (s<$regexes{$key}{search}>
       <"capture('" . $regexes{$key}{replace} . "')">eeg) {
    my $replacement = capture();
    #...
}
Well, except to do it really properly you'd have to shoehorn a little more code in there to make the value in the hash safe inside a single-quotish string (backslashing single quotes and backslashes).
If you do the second eval manually you can store the result yourself.
my $store;
s{$search}{ $store = eval $replace }e;
Why not assign to a local var beforehand:
my $replace = $regexs{$key}{'replace'};
Now you're evaluating only once.