I'm trying to improve on the original solution (INI file parsing in PowerShell) so I can parse an INI file with entries like the example below.
[proxy]
; IP address and port number
server = 192.168.0.253
port = 8080
logfile=session.log ; log session
[user]
; default username and settings
name=J. Doe ;name
address="377 Sunrise Way;Santa Monica;CA" ; address
[program files]
root="C:\Program Files\Windows " ; path name
path="C:\Program Files\Windows;%windir" ; path name
;
[program]
root=C:\Program Files\Windows ; path name
path=C:\Program Files\Windows;%windir ; path name
I'm using the following PowerShell code to populate a nested hash table (if that is the right description) containing the name/value pairs for each section.
I have no problem dealing with the first section where I have lines ending in a comment, or with the value in the second section which contains spaces, but things go wrong when I try to mix quoted strings and comments.
Given that a string begins and ends with a double quote I think it should be possible to get the results I want but I am obviously missing something somewhere (I am a little new to this).
function Parse-INI-File() {
Param ([parameter()][string]$_file = '')
# Don't prompt to continue if '-Debug' is specified.
If ($DebugPreference -eq "Inquire") {$DebugPreference = "Continue"}
$_settings=@{}
switch -Regex -file $_file {
'(?:^ ?\[\s*(?<section>[^\s]+[^#;\r\n\[\]]+)\s*\])' {
$_section = $Matches.section.trim()
$_settings[$_section] = @{}
}
'(?:^\s*?(?<name>[^\[\]\r\n=#;]+))(?: ?=\s*"?(?<value>[^;#\\\r\n]*(?:\\.[^"#\\\r\n]*)*))' {
$_name, $_value = $Matches.name.trim(), $matches.value.trim()
$_settings[$_section][$_name] = $_value
Write-Debug "/$_section/ /$_name//$_value/" # Debug
}
}
$_settings
}
$_file='./ini-example.ini'
$_output=Parse-INI-File -Debug ($_file)
What I'd like is for the parsing of the sample ini file to result in the following name/value pairs:
DEBUG: /proxy/ /server//192.168.0.253/
DEBUG: /proxy/ /port//8080/
DEBUG: /proxy/ /logfile//session.log/
DEBUG: /user/ /name//J. Doe/
DEBUG: /user/ /address//377 Sunrise Way;Santa Monica;CA/
DEBUG: /program files/ /root//C:\Program Files\Windows/
DEBUG: /program files/ /path//C:\Program Files\Windows;%windir/
DEBUG: /program/ /root//C:\Program Files\Windows/
DEBUG: /program/ /path//C:\Program Files\Windows/
I don't mind if quoted strings include the original quotes or not.
Thank you.
Updated 10 Sep 19: I have tried the Get-IniContent function in the PsIni module, but it doesn't ignore comments at the end of a line.
PS C:\> $_output = Get-IniContent (".\ini-example.ini")
PS C:\> $_output["program files"]
Name Value
---- -----
root "C:\Program Files\Windows "' ; path name
path "C:\Program Files\Windows;;%windir" ; path name
Comment1 ;
PS C:\>
I think I've solved it. There is probably a better solution, but I got it working by using a separate regex for quoted strings; this complicates the logic a bit but seems to solve the problem reliably.
function Parse-INI-File() {
Param ([parameter()][string]$_file = '')
# Don't prompt to continue if '-Debug' is specified.
If ($DebugPreference -eq "Inquire") {$DebugPreference = "Continue"}
$_settings=@{}
switch -Regex -file $_file {
'(?:^ ?\[\s*(?<section>[^\s]+[^\r\n\[\]]+)\s*\])' {
$_section = $Matches.section.trim()
$_settings[$_section] = @{}
#Write-Debug "1/$_section/" # Debug
}
'(?:^\s*?(?<name>[^\[\]\r\n]+))(?: ?=\s*(?<value>[^";#\\\r\n]*(?:\\.[^";#\\\r\n]*)*))' {
If ($matches.value -ne '' ) {
$_name, $_value = $Matches.name.trim(), $matches.value.trim()
$_settings[$_section][$_name] = $_value
Write-Debug "2/$_section//$_name//$_value/" # Debug
}
}
'(?:^\s*?(?<name>[^\[\]\r\n]+))(?: ?=\s*(?<value>\"+[^\"\r\n]*\")*)' {
#If ($matches.value -ne $null ) {
If (-not [string]::IsNullOrEmpty($matches.value)) {
$_name, $_value = $Matches.name.trim(), $matches.value.trim()
$_settings[$_section][$_name] = $_value
Write-Debug "3/$_section//$_name//$_value/" # Debug
}
}
}
$_settings
}
This seems to produce the results I'd expect:
PS C:\> $_output = Parse-INI-File -Debug (".\ini-example.ini")
DEBUG: 2/proxy//server//192.168.0.253/
DEBUG: 2/proxy//port//8080/
DEBUG: 2/proxy//logfile//session.log/
DEBUG: 2/user//name//J. Doe/
DEBUG: 3/user//address//"377 Sunrise Way;Santa Monica;CA"/
DEBUG: 3/program files//root//"C:\Program Files\Windows "/
DEBUG: 3/program files//path//"C:\Program Files\Windows;%windir"/
DEBUG: 2/program//root//C:\Program Files\Windows/
DEBUG: 2/program//path//C:\Program Files\Windows/
PS C:\> $_output["user"]
Name Value
---- -----
name J. Doe
address "377 Sunrise Way;Santa Monica;CA"
PS C:\>
Note that if there are multiple values with the same name in a section then only the last value is returned (try parsing system.ini to see what I mean)
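For comparison, the same two-pattern idea (try a quoted value first, then fall back to an unquoted value that stops at the first comment character) can be sketched in Python. This is only an illustration under stated assumptions: the names `parse_ini`, `SECTION`, `QUOTED` and `PLAIN` are made up for this sketch, quoted values are assumed not to contain escaped quotes, and the surrounding quotes are dropped.

```python
import re

# Section header such as "[program files]".
SECTION = re.compile(r'^\s*\[\s*(?P<section>[^\]]+?)\s*\]')
# Quoted value: keep everything between the double quotes, ignore what follows.
QUOTED = re.compile(r'^\s*(?P<name>[^=\[\];#]+?)\s*=\s*"(?P<value>[^"]*)"')
# Unquoted value: stop at the first ';' or '#'.
PLAIN = re.compile(r'^\s*(?P<name>[^=\[\];#]+?)\s*=\s*(?P<value>[^;#]*)')


def parse_ini(lines):
    """Return {section: {name: value}}; a later duplicate name overwrites an earlier one."""
    settings, section = {}, None
    for line in lines:
        m = SECTION.match(line)
        if m:
            section = m.group("section")
            settings[section] = {}
            continue
        # Try the quoted pattern first so ';' inside quotes is not treated as a comment.
        m = QUOTED.match(line) or PLAIN.match(line)
        if m and section is not None:
            settings[section][m.group("name").strip()] = m.group("value").strip()
    return settings
```

As with the PowerShell version, only the last value survives when a name is repeated within a section.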
I have the draw.io app installed on my PC. I want to export all tabs with drawings to separate files. The only option I have found is:
"c:\Program Files\draw.io\draw.io.exe" --crop -x -f jpg c:\Users\user-name\Documents\_xxx_\my-file.drawio
Help for draw.io
Usage: draw.io [options] [input file/folder]
Options:
(...)
-x, --export export the input file/folder based on the
given options
-r, --recursive for a folder input, recursively convert
all files in sub-folders also
-o, --output <output file/folder> specify the output file/folder. If
omitted, the input file name is used for
output with the specified format as
extension
-f, --format <format> if output file name extension is
specified, this option is ignored (file
type is determined from output extension,
possible export formats are pdf, png, jpg,
svg, vsdx, and xml) (default: "pdf")
(default: 0)
-a, --all-pages export all pages (for PDF format only)
-p, --page-index <pageIndex> selects a specific page, if not specified
and the format is an image, the first page
is selected
-g, --page-range <from>..<to> selects a page range (for PDF format only)
(...)
does not show a way to do this. I can use one of these:
-p, --page-index <pageIndex> selects a specific page, if not specified
and the format is an image, the first page
is selected
-g, --page-range <from>..<to> selects a page range (for PDF format only)
but how do I get the page range, or the number of pages, so that I can select an index?
There is no easy way to find the number of pages out of the box with Draw.io's CLI options.
One solution would be to export the diagram as XML.
draw.io --export --format xml --uncompressed test-me.drawio
And then count how many diagram elements there are. It should equal the number of pages (I briefly tested this but I'm not 100% sure if diagram element only appears once per page).
grep -o "<diagram" "test-me.xml" | wc -l
Here is an example of putting it all together in a bash script (I tried this on macOS 10.15):
#!/bin/bash
file=test-me # File name excluding extension
# Export diagram to plain XML
draw.io --export --format xml --uncompressed "$file.drawio"
# Count how many pages based on <diagram element
count=$(grep -o "<diagram" "$file.xml" | wc -l)
# Export each page as a PNG
# Page index is zero based
for ((i = 0 ; i <= $count-1; i++)); do
draw.io --export --page-index $i --output "$file-$i.png" "$file.drawio"
done
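The counting step can also be done with an XML parser rather than grep, which avoids miscounting if the text `<diagram` ever appears inside an attribute or label. The following is a minimal sketch, assuming the uncompressed export has `diagram` elements somewhere under the root, one per page (the same assumption as above); `count_pages` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def count_pages(xml_text: str) -> int:
    """Count <diagram> elements in an uncompressed draw.io XML export."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter.
    return sum(1 for _ in root.iter("diagram"))


if __name__ == "__main__":
    example = (
        '<mxfile>'
        '<diagram name="Page-1"><mxGraphModel/></diagram>'
        '<diagram name="Page-2"><mxGraphModel/></diagram>'
        '</mxfile>'
    )
    print(count_pages(example))
```

The resulting count can then drive the same zero-based `--page-index` loop as in the script above.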
OP did ask the question with reference to the Windows version, so here's a PowerShell solution inspired by eddiegroves:
$DIR_DRAWIO = "."
$DrawIoFiles = Get-ChildItem $DIR_DRAWIO *.drawio -File
foreach ($file in $DrawIoFiles) {
"File: '$($file.FullName)'"
$xml_file = "$($file.DirectoryName)/$($file.BaseName).xml"
if ((Test-Path $xml_file)) {
Remove-Item -Path $xml_file -Force
}
# export to XML
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'xml' $file.FullName
# wait for XML file creation
while ($true) {
if (-not (Test-Path $xml_file)) {
Start-Sleep -Milliseconds 200
}
else {
break
}
}
# load to XML Document (cast text array to object)
$drawio_xml = [xml](Get-Content $xml_file)
# for each page export png
for ($i = 0; $i -lt $drawio_xml.mxfile.pages; $i++) {
$file_out = "$($file.DirectoryName)/$($file.BaseName)$($i + 1).png"
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--border' '10' '--page-index' $i '--output' $file_out $file.FullName
}
# wait for the last PNG image file
while ($true) {
if (-not (Test-Path "$($file.DirectoryName)/$($file.BaseName)$($drawio_xml.mxfile.pages).png")) {
Start-Sleep -Milliseconds 200
}
else {
break
}
}
# remove/delete XML file
if ((Test-Path $xml_file)) {
Remove-Item -Path $xml_file -Force
}
# export 'vsdx' & 'pdf'
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'vsdx' $file.FullName
Start-Sleep -Milliseconds 1000
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'pdf' $file.FullName
}
I am trying to edit and run a snakemake pipeline. In a nutshell, the snakemake pipeline calls a default genome aligner (minimap) and produces output files with this name. I am trying to add a variable aligner to config.yaml to specify the aligner I want to call. Also (where I am actually stuck), the output files should have the name of the aligner specified in config.yaml.
My config.yaml looks like this:
# this config.yaml is passed to Snakefile in pipeline-structural-variation subfolder.
# Snakemake is run from this pipeline-structural-variation folder; it is necessary to
# pass an appropriate path to the input-files (the ../ prefix is sufficient for this demo)
aligner: "ngmlr" # THIS IS THE VARIABLE I AM ADDING TO THIS FILE. VALUES COULD BE minimap or ngmlr
# FASTQ file or folder containing FASTQ files
# check if this has to be gzipped
input_fastq: "/nexusb/Gridion/20190917PGD2staal2/PD170815/PD170815_cat_all.fastq.gz" # original is ../RawData/GM24385_nf7_chr20_af.fastq.gz
# FASTA file containing the reference genome
# note that the original reference sequence contains only the sequence of chr20
reference_fasta: "/nexus/bhinckel/19/ONT_projects/PGD_breakpoint/ref_hg19_local/hg19_chr1-y.fasta" # original is ../ReferenceData/human_g1k_v37_chr20_50M.fasta
# Minimum SV length
min_sv_length: 300000 # original value was 40
# Maximum SV length
max_sv_length: 1000000 # original value was 1000000. Note that the value I used to run the pipeline for the sample PD170677 was 100000000000, which will be coerced to NA in the R script (/home/bhinckel/ont_tutorial_sv/ont_tutorial_sv.R)
# Min read length. Shorter reads will be discarded
min_read_length: 1000
# Min mapping quality. Reads with lower mapping quality will be discarded
min_read_mapping_quality: 20
# Minimum read support required to call a SV (auto for auto-detect)
min_read_support: 'auto'
# Sample name
sample_name: "PD170815" # original value was GM24385.nf7.chr20_af. Note that this can be a list
I am posting below the sections of my snakefile which generate output files with the extension _minimap2.bam, which I would like to replace by either _minimap2.bam or _ngmlr.bam, depending on aligner on config.yaml
# INPUT BAM folder
bam = None
if "bam" in config:
    bam = os.path.join(CONFDIR, config["bam"])
# INPUT FASTQ folder
FQ_INPUT_DIRECTORY = []
if not bam:
    if not "input_fastq" in config:
        print("\"input_fastq\" not specified in config file. Exiting...")
    FQ_INPUT_DIRECTORY = os.path.join(CONFDIR, config["input_fastq"])
    if not os.path.exists(FQ_INPUT_DIRECTORY):
        print("Could not find {}".format(FQ_INPUT_DIRECTORY))
    MAPPED_BAM = "{sample}/alignment/{sample}_minimap2.bam" # Original
    #MAPPED_BAM = "{sample}/alignment/{sample}_{alignerName}.bam" # this did not work
    #MAPPED_BAM = f"{sample}/alignment/{sample}_{config['aligner']}.bam" # this did not work either
else:
    MAPPED_BAM = find_file_in_folder(bam, "*.bam", single=True)
...
if config['aligner'] == 'minimap':
    rule index_minimap2:
        input:
            REF = FA_REF
        output:
            "{sample}/index/minimap2.idx"
        threads: config['threads']
        conda: "env.yml"
        shell:
            "minimap2 -t {threads} -ax map-ont --MD -Y {input.REF} -d {output}"
    rule map_minimap2:
        input:
            FQ = FQ_INPUT_DIRECTORY,
            IDX = rules.index_minimap2.output,
            SETUP = "init"
        output:
            BAM = "{sample}/alignment/{sample}_minimap2.bam",
            BAI = "{sample}/alignment/{sample}_minimap2.bam.bai"
        conda: "env.yml"
        threads: config["threads"]
        shell:
            "cat_fastq {input.FQ} | minimap2 -t {threads} -K 500M -ax map-ont --MD -Y {input.IDX} - | samtools sort -@ {threads} -O BAM -o {output.BAM} - && samtools index -@ {threads} {output.BAM}"
else:
    print(f"Aligner is {config['aligner']} - skipping indexing step for minimap2")
    rule map_ngmlr:
        input:
            REF = FA_REF,
            FQ = FQ_INPUT_DIRECTORY,
            SETUP = "init"
        output:
            BAM = "{sample}/alignment/{sample}_minimap2.bam",
            BAI = "{sample}/alignment/{sample}_minimap2.bam.bai"
        conda: "env.yml"
        threads: config["threads"]
        shell:
            "cat_fastq {input.FQ} | ngmlr -r {input.REF} -t {threads} -x ont - | samtools sort -@ {threads} -O BAM -o {output.BAM} - && samtools index -@ {threads} {output.BAM}"
I initially tried to create an alignerName parameter, similar to the sample parameter, as shown below:
# Parameter: sample_name
sample = "sv_sample01"
if "sample_name" in config:
sample = config['sample_name']
###############
#
# code below created by me
#
###############
# Parameter: aligner_name
alignerName = "defaultAligner"
if "aligner" in config:
alignerName = config['aligner']
Then I tried to put {alignerName} wherever I have minimap2 in my input/output files (see the commented-out MAPPED_BAM definitions above), but this throws an error. I guess snakemake interprets {alignerName} as a wildcard, whereas what I want is simply to pass the value defined in config['aligner'] into the input/output file names. I also tried an f-string (MAPPED_BAM = f"{sample}/alignment/{sample}_{config['aligner']}.bam"), but that did not work either.
You are close!
The way wildcards work in snakemake is they get interpreted 'last', while f-strings get interpreted first. To not interpret a curly brace in an f-string you can escape it with another curly brace, like so:
print(f"{{keep curly}}")
>>> {keep curly}
So all we need to do is
MAPPED_BAM = f"{{sample}}/alignment/{{sample}}_{config['aligner']}.bam"
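To see what the escaped braces produce, here is a small plain-Python check that runs outside Snakemake. The `config` dict is a stand-in for the one Snakemake loads from config.yaml, and the `str.format` call only imitates wildcard substitution for illustration; Snakemake performs that step with its own machinery.

```python
# Stand-in for the dict Snakemake builds from config.yaml (assumption for this sketch).
config = {"aligner": "ngmlr"}

# Doubled braces survive f-string evaluation as single literal braces,
# while {config['aligner']} is substituted immediately.
MAPPED_BAM = f"{{sample}}/alignment/{{sample}}_{config['aligner']}.bam"
print(MAPPED_BAM)

# Imitate the later wildcard substitution with str.format (illustration only).
resolved = MAPPED_BAM.format(sample="PD170815")
print(resolved)
```

The first string still contains `{sample}`, so it remains usable as a wildcard pattern, and the aligner name is already fixed in place.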
I am trying to compress a folder containing files and subfolders (with files) into a single zip. I'm limited to the core perl modules so I'm trying to work with IO::Compress::Zip. I want to remove the working directory file path but seem to end up with a blank first folder before my zipped folder, like there is a trailing "/" I haven't been able to get rid of.
use Cwd;
use warnings;
use strict;
use File::Find;
use IO::Compress::Zip qw(:all);
my $cwd = getcwd();
$cwd =~ s/[\\]/\//g;
print $cwd, "\n";
my $zipdir = $cwd . "\\source_folder";
my $zip = "source_folder.zip";
my @files = ();
sub process_file {
next if (($_ eq '.') || ($_ eq '..'));
if (-d && $_ eq 'fp'){
$File::Find::prune = 1;
return;
}
push @files, $File::Find::name if -f;
}
find(\&process_file, $cwd . "\\source_folder");
zip \@files => "$zip", FilterName => sub{ s|\Q$cwd|| } or die "zip failed: $ZipError\n";
I have also attempted using the option "CanonicalName => 1, " which appears to leave the filepath except the drive letter (C:).
Substitution with
s[^$dir/][]
did nothing and
s<.*[/\\]><>
left me with no folder structure at all.
What am I missing?
UPDATE
The red level is unexpected and not wanted; Windows Explorer is not able to see beyond this level.
There are two issues with your script.
First, you are mixing Windows and Linux/Unix paths in the script. Let me illustrate.
I've created a subdirectory called source_folder to match your script
$ dir source_folder
Volume in drive C has no label.
Volume Serial Number is 7CF0-B66E
Directory of C:\Scratch\source_folder
26/11/2018 19:48 <DIR> .
26/11/2018 19:48 <DIR> ..
26/11/2018 17:27 840 try.pl
01/06/2018 13:02 6,653 url
2 File(s) 7,493 bytes
When I run your script unmodified I get an apparently empty zip file when I view it in Windows explorer. But, if I use a command-line unzip, I see that source_folder.zip isn't empty, but it has non-standard filenames that are part Windows and part Linux/Unix.
$ unzip -l source_folder.zip
Archive: source_folder.zip
Length Date Time Name
--------- ---------- ----- ----
840 2018-11-26 17:27 \source_folder/try.pl
6651 2018-06-01 13:02 \source_folder/url
--------- -------
7491 2 files
The mix-and-match of Windows and Unix paths is created in this line of your script:
find(\&process_file, $cwd . "\\source_folder");
You are concatenating a Unix-style path in $cwd with a Windows-style part, "\source_folder".
Change the line to use a forward slash rather than a backslash to get a consistent Unix-style path.
find(\&process_file, $cwd . "/source_folder");
The second problem is this line
zip \@files => "$zip",
FilterName => sub{ s|\Q$cwd|| },
BinmodeIn =>1
or die "zip failed: $ZipError\n";
The substitute, s|\Q$cwd||, needs an extra "/", like this s|\Q$cwd/|| to make sure that the path added to the zip archive is a relative path. So the line becomes
zip \@files => "$zip", FilterName => sub{ s|\Q$cwd/|| } or die "zip failed: $ZipError\n";
Once those two changes are made I can view the zip file in Explorer, and I get Unix-style relative paths when I use the command-line unzip:
$ unzip -l source_folder.zip
Archive: source_folder.zip
Length Date Time Name
--------- ---------- ----- ----
840 2018-11-26 17:27 source_folder/try.pl
6651 2018-06-01 13:02 source_folder/url
--------- -------
7491 2 files
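The effect of the trailing slash can be shown with a few lines of Python. This is a sketch of the path arithmetic only, not of IO::Compress::Zip; `archive_name` is a hypothetical helper and the example paths follow the C:\Scratch layout shown above.

```python
def archive_name(full_path: str, base_dir: str) -> str:
    """Return full_path relative to base_dir, using forward slashes throughout."""
    full = full_path.replace("\\", "/")
    # Normalise the base and make sure it ends in exactly one slash,
    # so the separator is removed together with the prefix.
    base = base_dir.replace("\\", "/").rstrip("/") + "/"
    if full.startswith(base):
        return full[len(base):]
    return full


if __name__ == "__main__":
    path = "C:/Scratch/source_folder/try.pl"
    # Stripping the prefix without its slash leaves a leading "/",
    # which shows up as an unnamed top-level folder in the archive.
    print(path[len("C:/Scratch"):])
    # Stripping the prefix including the slash gives a relative member name.
    print(archive_name(path, "C:/Scratch"))
```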
This works for me:
use Cwd;
use warnings;
use strict;
use File::Find;
use IO::Compress::Zip qw(:all);
use Data::Dumper;
my $cwd = getcwd();
$cwd =~ s/[\\]/\//g;
print $cwd, "\n";
my $zipdir = $cwd . "/source_folder";
my $zip = "source_folder.zip";
my @files = ();
sub process_file {
next if (($_ eq '.') || ($_ eq '..'));
if (-d && $_ eq 'fp') {
$File::Find::prune = 1;
return;
}
push @files, $File::Find::name if -f;
}
find(\&process_file, $cwd . "/source_folder");
print Dumper \@files;
zip \@files => "$zip", FilterName => sub{ s|\Q$cwd/|| } or die "zip failed: $ZipError\n";
I changed the path separator to '/' in your call to find() and also stripped the trailing '/' after $cwd in the FilterName sub.
console:
C:\Users\chris\Desktop\devel\experimente>mkdir source_folder
C:\Users\chris\Desktop\devel\experimente>echo 1 > source_folder/test1.txt
C:\Users\chris\Desktop\devel\experimente>echo 1 > source_folder/test2.txt
C:\Users\chris\Desktop\devel\experimente>perl perlzip.pl
C:/Users/chris/Desktop/devel/experimente
Exiting subroutine via next at perlzip.pl line 19.
$VAR1 = [
'C:/Users/chris/Desktop/devel/experimente/source_folder/test1.txt',
'C:/Users/chris/Desktop/devel/experimente/source_folder/test2.txt'
];
C:\Users\chris\Desktop\devel\experimente>tar -tf source_folder.zip
source_folder/test1.txt
source_folder/test2.txt
I currently have the following PS script to extract the SW version from a Git tag and integrate it into the built assembly.
This works for tags like v1.2.3 and creates file versions and product versions such as 1.2.3.16 and 1.2.3.16-13b05b79.
# Get version info from Git. For example: v1.2.3-45-g6789abc
$gitVersion = git describe --match "v[0-9]*" --long --always --dirty;
# Get name of current branch
$gitBranch = git rev-parse --abbrev-ref HEAD;
# Write Git information to version.txt
$versionFile = $args[1] + "\version.txt";
"version: " + $gitVersion > $versionFile;
"branch: " + $gitBranch >> $versionFile;
# Parse Git version info into semantic pieces
$gitVersion -match '[v](.*)-(\d+)-[g](.+)$';
$gitTag = $Matches[1];
$gitCount = $Matches[2];
$gitSHA1 = $Matches[3];
# Define file variables
$assemblyFile = $args[0] + "\Properties\AssemblyInfo.cs";
# Read template file, overwrite place holders with git version info
$newAssemblyContent = Get-Content $assemblyFile |
%{$_ -replace '\$FILEVERSION\$', ($gitTag + "." + $gitCount) } |
%{$_ -replace '\$INFOVERSION\$', ($gitTag + "." + $gitCount + "-" + $gitSHA1) };
echo "Injecting Git Version Info to AssemblyInfo.cs"
$newAssemblyContent > $assemblyFile;
I would now like to extend the regex in this script, so that I can use tags with a brief description such as v1.2.3-description, where description can be of variable length.
Ideally, the regex should allow for dashes in the description so that v1.2.3-description-with-dashes would also be valid and any other characters that are allowed in Git tags.
What makes this difficult for me (I have tried) is that the git describe command will output this as v1.2.3-description-with-dashes-16, so how can I distinguish between the dashes that belong to the Git output and those that belong to the description?
Using RegEx (and using the new examples) this is what you can do:
$gitVersion -match '(?<tag>v\d+\.\d+\.\d+)(?:-?(?<description>\D+)?)(?:-?(?<count>\d+)?)(?:-?(?<sha1>g[0-9a-f]+))(?:-?(?<dirty>.+)?)'
$gitTag = $Matches['tag']
$gitDescription = ($Matches['description']).Trim("-")
$gitCount = if($Matches['count']) { $Matches['count'] } else { 1 } # if no count is found, we assume 1 ??
$gitSHA1 = $Matches['sha1']
$gitDirty = $Matches['dirty']
Test results:
teststring tag description count sha1 dirty
--------------------------------------------------- ------- ------------------------- ----- --------- -----
v1.2.3-123-gd9b5a775-dirty v1.2.3 123 gd9b5a775 dirty
v1.2.3-description-123-gd9b5a775-dirty v1.2.3 description- 123 gd9b5a775 dirty
v1.2.3-description-with-dashes-123-gd9b5a775-dirty v1.2.3 description-with-dashes- 123 gd9b5a775 dirty
v1.2.3-description-with-dashes-123-gd9b5a775 v1.2.3 description-with-dashes- 123 gd9b5a775
v1.2.3-description-with-dashes-gd9b5a775 v1.2.3 description-with-dashes- gd9b5a775
v1.2.3-45-gd9b5a775 v1.2.3 45 gd9b5a775
v1.2.3-gd9b5a775 v1.2.3 gd9b5a775
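An alternative way to separate the description from Git's own suffix is to anchor on the end of the string: with `--long`, git describe appends `-<count>-g<sha>` (plus `-dirty` when requested), so everything between the version and that suffix can be treated as the description, digits included. The sketch below shows the idea in Python; `DESCRIBE` and `parse_describe` are hypothetical names, and it assumes a tag was found, so the count and the `g`-prefixed hash are always present.

```python
import re

# Anchored at both ends: version, optional description, then the suffix git appends.
DESCRIBE = re.compile(
    r"^(?P<tag>v\d+\.\d+\.\d+)"      # v1.2.3
    r"(?:-(?P<description>.+?))?"    # optional free-form description, may contain dashes
    r"-(?P<count>\d+)"               # commits since the tag
    r"-g(?P<sha1>[0-9a-f]+)"         # abbreviated hash, without the 'g' prefix
    r"(?P<dirty>-dirty)?$"           # optional dirty marker
)


def parse_describe(text: str):
    """Return a dict of the named parts, or None if the text does not match."""
    m = DESCRIBE.match(text.strip())
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(parse_describe("v1.2.3-description-with-dashes-16-g13b05b79"))
    print(parse_describe("v1.2.3-45-g6789abc"))
```

Because the description group is lazy and the suffix is mandatory, a description such as `rc-2` is kept whole and only the final numeric field is read as the count.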
Using ActiveState perl 5.8 on windows. I am placing the results of sc qc MyServiceName into a variable.
$MSSQLResults=`sc qc $MSSQLServiceName`;
print "MSSQLResults $MSSQLResults";
If I print the variable to STDOUT I get something like:
[SC] QueryServiceConfig SUCCESS
SERVICE_NAME: MSSQL$INSTANCE1
TYPE : 10 WIN32_OWN_PROCESS
START_TYPE : 3 DEMAND_START
ERROR_CONTROL : 1 NORMAL
BINARY_PATH_NAME : "C:\Program Files\Microsoft SQL Server\MSSQL10_50.INSTANCE1\MSSQL\Binn\sqlservr.exe" -sINSTANCE1
LOAD_ORDER_GROUP :
TAG : 0
DISPLAY_NAME : SQL Server (INSTANCE1)
DEPENDENCIES :
SERVICE_START_NAME : TESTLAB\svc_SQLServer
The string I want returned from a grep or regex match is TESTLAB\svc_SQLServer. Should I use grep, or regex, or something else? What line of perl would accomplish what I want? The text TESTLAB\svc_SQLServer will vary depending on which machine I run it on.
If I understand you correctly, you have a scalar variable (e.g. $MSSQLResults) which you want to search. In that case:
if (my ($service_start_name) = $MSSQLResults=~ m/SERVICE_START_NAME\s+:\s+(.*)/m) {
# do something
}
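The same capture can be sketched in Python to show what the pattern is doing: match the label at the start of a line in multi-line mode, skip the colon and padding, and capture the rest of that line. The function name `service_start_name` is hypothetical, and the trimming of a trailing carriage return is an extra precaution for Windows line endings rather than part of the Perl answer.

```python
import re

# Multi-line mode lets ^ and $ match at each line of the sc qc output.
PATTERN = re.compile(
    r"^[ \t]*SERVICE_START_NAME[ \t]*:[ \t]*(.*?)[ \t\r]*$",
    re.MULTILINE,
)


def service_start_name(sc_output: str):
    """Return the account name from 'sc qc' output, or None if the line is absent."""
    m = PATTERN.search(sc_output)
    return m.group(1) if m else None


if __name__ == "__main__":
    sample = (
        "[SC] QueryServiceConfig SUCCESS\r\n"
        "SERVICE_NAME: MSSQL$INSTANCE1\r\n"
        "        DEPENDENCIES       :\r\n"
        "        SERVICE_START_NAME : TESTLAB\\svc_SQLServer\r\n"
    )
    print(service_start_name(sample))
```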