Parsing files to populate placeholders with values - build

Background Info
I have a JS library which consists of many constructor functions. I am using grunt-concat & -uglify to compile these into a single file.
Each constructor has a readme.md file.
The library is used to create advertising banners. Around 10 developers use it to build their own advertising templates in a Templates folder. The templates are .xml files that provide a CDATA section where the developers can insert their JavaScript code.
Question
I would like to populate the readme files with a counter, so that the developers can see how popular a particular constructor is directly in its documentation.
Number of occurrences (<% occurrences %>)
What I've already done
I can get the number of occurrences by executing
find . -name "*.xml" -exec grep -e "new\Foo\.Bar" {} \; | wc -l
It would be great if I could grab this value and insert it into the readme file.

grunt.registerTask('count_occurrences', '', function () {
    var exec = require('child_process').execSync;
    // Run the find/grep pipeline synchronously and capture its stdout.
    // Note the single-quoted JS string and the doubled backslashes, so the
    // shell still receives "new\Foo\.Bar" and "\;".
    var result = exec('find . -name "*.xml" -exec grep -e "new\\Foo\\.Bar" {} \\; | wc -l', { encoding: 'utf8' });
    grunt.log.writeln(result);
    // Now write the result to your README file.
    // Note: this overwrites the whole file with just the count.
    grunt.file.write("README.md", result);
});
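To fill the <% occurrences %> placeholder from the question rather than overwriting the whole README, one option is to keep a template file and substitute the count into it. A minimal shell sketch (README.template.md is a hypothetical template containing the placeholder):
# Capture the count; tr strips the padding wc adds on some platforms.
count=$(find . -name "*.xml" -exec grep -e "new\Foo\.Bar" {} \; | wc -l | tr -d ' ')
# Substitute the placeholder and write the real README.
sed "s/<% occurrences %>/$count/" README.template.md > README.md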
or
You can use the grunt plugin grunt-exec (https://github.com/jharding/grunt-exec) to execute command-line programs such as find.
You would probably want something like this in your GruntFile.js:
exec: {
    count_occurrences: {
        cmd: function() {
            // Doubled backslashes so the shell receives "new\Foo\.Bar" and "\;"
            return 'find . -name "*.xml" -exec grep -e "new\\Foo\\.Bar" {} \\; | wc -l';
        }
    }
}
Then call grunt exec:count_occurrences.

Related

Draw.io - how to export all tabs to images using command line

I have the draw.io app installed on my PC. I want to export all tabs with drawings to separate files. The only option I have found is:
"c:\Program Files\draw.io\draw.io.exe" --crop -x -f jpg c:\Users\user-name\Documents\_xxx_\my-file.drawio
Help for draw.io
Usage: draw.io [options] [input file/folder]
Options:
(...)
  -x, --export                       export the input file/folder based on the given options
  -r, --recursive                    for a folder input, recursively convert all files in sub-folders also
  -o, --output <output file/folder>  specify the output file/folder. If omitted, the input file name is used for output with the specified format as extension
  -f, --format <format>              if output file name extension is specified, this option is ignored (file type is determined from output extension; possible export formats are pdf, png, jpg, svg, vsdx, and xml) (default: "pdf")
  -a, --all-pages                    export all pages (for PDF format only)
  -p, --page-index <pageIndex>       selects a specific page, if not specified and the format is an image, the first page is selected
  -g, --page-range <from>..<to>      selects a page range (for PDF format only)
(...)
The --all-pages option does not support image formats. I can use one of these:
-p, --page-index <pageIndex>   selects a specific page, if not specified and the format is an image, the first page is selected
-g, --page-range <from>..<to>  selects a page range (for PDF format only)
but how do I get the page range, or the number of pages, in order to select each index?
There is no easy way to find the number of pages out of the box with Draw.io's CLI options.
One solution would be to export the diagram as XML.
draw.io --export --format xml --uncompressed test-me.drawio
And then count how many diagram elements there are. That count should equal the number of pages (I briefly tested this, but I'm not 100% sure the diagram element appears exactly once per page).
grep -o "<diagram" "test-me.xml" | wc -l
Here is an example of putting it all together in a bash script (I tried this on macOS 10.15):
#!/bin/bash
file=test-me # File name excluding extension
# Export diagram to plain XML
draw.io --export --format xml --uncompressed "$file.drawio"
# Count how many pages based on <diagram element
count=$(grep -o "<diagram" "$file.xml" | wc -l)
# Export each page as a PNG
# Page index is zero based
for ((i = 0 ; i <= $count-1; i++)); do
draw.io --export --page-index $i --output "$file-$i.png" "$file.drawio"
done
The OP asked the question with reference to the Windows version, so here is a PowerShell solution inspired by eddiegroves:
$DIR_DRAWIO = "."
$DrawIoFiles = Get-ChildItem $DIR_DRAWIO *.drawio -File
foreach ($file in $DrawIoFiles) {
"File: '$($file.FullName)'"
$xml_file = "$($file.DirectoryName)/$($file.BaseName).xml"
if ((Test-Path $xml_file)) {
Remove-Item -Path $xml_file -Force
}
# export to XML
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'xml' $file.FullName
# wait for XML file creation
while ($true) {
if (-not (Test-Path $xml_file)) {
Start-Sleep -Milliseconds 200
}
else {
break
}
}
# load to XML Document (cast text array to object)
$drawio_xml = [xml](Get-Content $xml_file)
# for each page export png
for ($i = 0; $i -lt $drawio_xml.mxfile.pages; $i++) {
$file_out = "$($file.DirectoryName)/$($file.BaseName)$($i + 1).png"
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--border' '10' '--page-index' $i '--output' $file_out $file.FullName
}
# wait for the last PNG image file
while ($true) {
if (-not (Test-Path "$($file.DirectoryName)/$($file.BaseName)$($drawio_xml.mxfile.pages).png")) {
Start-Sleep -Milliseconds 200
}
else {
break
}
}
# remove/delete XML file
if ((Test-Path $xml_file)) {
Remove-Item -Path $xml_file -Force
}
# export 'vsdx' & 'pdf'
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'vsdx' $file.FullName
Start-Sleep -Milliseconds 1000
& "C:/Program Files/draw.io/draw.io.exe" '--export' '--format' 'pdf' $file.FullName
}

sed command issues

Background
Original json (test.json): {"rpc-password": "password"}
Expected changed json: {"rpc-password": "somepassword"}
replace_json_str is a function used to replace password with somepassword using sed.
replace_json_str() {
    x=$1
    sed -i -e 's/\({ "'"$2"'":\)"[^"]*" }/\1"'"$3"'" }/g' $x
}
Unit test: replace_json_str test.json rpc-password somepassword
Issue
After running the above test, I get a file named test.json-e, and the contents of the file are the same as before the test was run. Why?
There is a handy command-line JSON tool called jq.
cat input.json
{"rpc-password": "password"}
cat update_json.sh
givenkey=$1
givenvalue=$2
inputfile=input.json
outfile=output.json
cat $inputfile | jq . # show input json
jq --arg key1 "$givenkey" --arg val1 "$givenvalue" '.[$key1] = $val1' "$inputfile" > "$outfile"
cat "$outfile" | jq . # render output json
Keep in mind jq can handle multiple such key/value updates in a single invocation (see the sketch after the output below). Execute it:
update_json.sh rpc-password somepassword
{
"rpc-password": "password"
}
{
"rpc-password": "somepassword"
}
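For example, a sketch of updating two keys in one jq call (rpc-user is a hypothetical second key, just for illustration):
jq '."rpc-password" = "somepassword" | ."rpc-user" = "admin"' input.json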
Depends on which sed you're using.
The command you ran will work as expected with GNU sed.
But BSD sed does not allow an empty argument to -i, so when you run the same command it consumes the next argument, -e, as the backup suffix, which is where test.json-e comes from.
Also, the positioning of the spaces in your pattern doesn't match your example JSON.
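For a version that behaves the same under both GNU and BSD sed, here is a sketch (my rewrite, not the original function) that avoids -i entirely and matches the spacing actually present in the sample JSON:
replace_json_str() {
    # write to a temporary file and move it back instead of editing in place
    sed -e 's/{"'"$2"'": *"[^"]*"}/{"'"$2"'": "'"$3"'"}/g' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
The unit-test invocation stays the same: replace_json_str test.json rpc-password somepassword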

Piping output of shell command to grep causes "grep: write error"

I wrote a tool in Crystal that takes some command line parameters, and turns those into, basically, "find stuff | xargs grep" where xargs is instructed to use multiple processes. This is run via Process.run, and output and error are redirected into a custom IO object which filters what comes out of grep a bit, writing everything that isn't filtered into STDOUT.
When I run this normally, it mostly seems to run fine. There do seem to be some instances of output getting cut off before the search completes though, so I'm not sure I can fully trust the results. When I pipe the output from this command into grep, however, it always cuts off the search early and says "grep: write error". I have no idea why this is happening, and would love some help. Eventually I'll likely rewrite this to do everything in pure Crystal, but for now this is a quick solution to search the codebase I'm working on.
Here is the code that is getting run:
class FindFilterIO
  include IO

  @@generic_filter = [".diff", ".iml", "/target/"]
  @@web_filter = [".css", ".js", ".jsp", ".ftl"]

  def initialize(@web_search : Bool = false)
  end

  def read(slice : Bytes)
    raise "FindFilterIO does not support reading!"
  end

  def write(slice : Bytes)
    str = String.new slice
    if @@generic_filter.any? { |e| str.includes? e }
      return
    end
    if @web_search
      if !@@web_filter.any? { |e| str.includes? e }
        return
      end
    end
    STDOUT.write(slice)
  end
end

cmd = "find . -not \\( -path ./.svn -prune \\) " \
      "-not \\( -path ./.idea -prune \\) " \
      "-type f -print0 " \
      "| xargs -0 -P 1 -n 100 grep -E -n --color=always "
cmd += if @html_id
         "'id=['\"'\"'\"]#{@search_text}['\"'\"'\"]|\##{@search_text}'"
       elsif @html_class
         "'class=['\"'\"'\"]#{@search_text}['\"'\"'\"]|\\.#{@search_text}'"
       else
         "'#{@search_text}'"
       end
io = FindFilterIO.new web_search: (@html_id || @html_class)
Process.run(cmd, output: io, error: io, shell: true, chdir: File.join(@env.home_dir, @env.branch, "repodir"))
This seems to have been fixed now that the issue at https://github.com/crystal-lang/crystal/issues/2065 has been closed. Will need to do some more testing to make sure it's totally fixed, but using my older code seems to be working fine now.

How to download latest version of software from same url using wget

I would like to download the latest source code of some software (WRF) from a URL and automate the installation process thereafter. A sample URL is given below:
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
In the above URL, the version number may change from time to time as the developers release new versions. I would like to download the latest available version from my main script. I tried the following:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV'
With the above code, I can pull all available versions of the software. The output of the above code is below:
http://www2.mmm.ucar.edu/wrf/src/WRFV2.0.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Var-do-not-use.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3_OVERLAY_3.0.1.1.TAR.gz
However, I am unable to go further and filter out only the latest version from the list.
Usually I would recommend some Perl tools for processing HTML pages, but because this is a directory-index listing, it can (probably) be done with bash tools like grep, sed, and such.
The following code is divided into several smaller bash functions for easy changes:
#!/bin/bash
#getdata - should output html source of the page
getdata() {
#use wget with output to stdout or curl or fetch
curl -s "http://www2.mmm.ucar.edu/wrf/src/"
#cat index.html
}
#filter_rows - get the filename and the date columns
filter_rows() {
sed -n 's:<tr><td.*href="\([^"]*\)">.*>\([0-9].*\)</td>.*</td>.*</td></tr>:\2#\1:p' | grep "${1:-.}"
}
#sort_by_date - probably don't need comment... sorts the lines by date... ;)
sort_by_date() {
while IFS=# read -r date file
do
echo "$(date --date="$date" +%s)#$file"
done | sort -gr
}
#MAIN
file=$(getdata | filter_rows WRFV | sort_by_date | head -1 | cut -d# -f2)
echo "You want download: $file"
prints
You want download: WRFV3-Chem-3.6.1.TAR.gz
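From there, the actual download is one more step (my addition; the hrefs in the index are relative to the same directory):
curl -O "http://www2.mmm.ucar.edu/wrf/src/$file"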
What about adding a numeric sort and taking the top line:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV[0-9]*[0-9]\.[0-9]' | sort -r -n | head -1
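One caveat (my addition): -n compares the lines numerically from the first character, so here it effectively falls back to reverse lexical order, and WRFV3.6.TAR.gz would beat WRFV3.6.1.TAR.gz. If GNU sort is available, its -V (version sort) flag compares dotted version numbers properly:
curl -s "http://www2.mmm.ucar.edu/wrf/src/" | grep -o 'http:[^"]*\.gz' | grep 'WRFV[0-9][0-9]*\.[0-9]' | sort -Vr | head -1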

Unix CLI script to rename folders using their pre-existing names

I have a directory with folder structure as follows:
-- DATA
   -- ABD 1231345
      -- 01-08-12   // dates in mm-dd-yy format
      -- 03-09-12
      -- 06-11-12
   -- DEF 4859480
      -- 02-10-12
      -- 05-10-12
      -- 07-10-12
I would like to batch rename this DATA folder as follows
-- DATA
   -- ABD 1231345
      -- 2012_01_08   // dates in yyyy_mm_dd format with underscores
      -- 2012_03_09
      -- 2012_06_11
   -- DEF 4859480
      -- 2012_02_10
      -- 2012_05_10
      -- 2012_07_10
Do you have a suggestion on how to accomplish this using the command line on Mac OS X / Unix?
You could use a for loop and awk, parsing each file name into your specified format and then using mv to rename the original to the new name:
for dir in DATA/*; do
    pushd "$dir"    # change directory
    for file in *; do
        path=$(echo "$file" | awk -F- '{print "20"$3"_"$1"_"$2}')
        mv "$file" "$path"    # "rename" the file
    done
    popd    # restore original directory
done
This can be executed in the folder above DATA. If you want to execute it directly in DATA, change the first loop to read for dir in *; do instead of DATA/*. The awk invocation uses - as the field delimiter (instead of whitespace) and reconstructs each "mm-dd-yy" name as "20yy_mm_dd".
Using pushd and popd lets the script change into each subdirectory inside DATA (pushd) and, after moving all the necessary files, change back to the original directory (popd). Doing this saves you a lot of effort parsing and carrying around directory paths.
You could use string manipulations and arrays to do that with bash only.
Something like:
for f in * ; do
    parts=(${f//-/ })    # "01-08-12" -> (01 08 12), i.e. (mm dd yy)
    mv "$f" "20${parts[2]}_${parts[0]}_${parts[1]}"
done
Search this site for various options to recurse into directories, e.g.:
Shell script to traverse directories
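For example, here is a sketch (my addition, reusing the bash array trick above) that recurses with find and renames the date-named directories wherever they sit:
find DATA -type d -name '[0-9][0-9]-[0-9][0-9]-[0-9][0-9]' | while IFS= read -r d; do
    base=$(basename "$d")
    parts=(${base//-/ })    # (mm dd yy)
    mv "$d" "$(dirname "$d")/20${parts[2]}_${parts[0]}_${parts[1]}"
done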
Use the date command to convert the file name:
$ date -j -f %m-%d-%y 01-08-12 +%Y_%m_%d
2012_01_08
Getting to the files is a little trickier. We'll just switch directories to avoid dealing with long file paths.
for d in DATA/*; do
    pushd "$d"
    for f in *; do
        # date -j -f is the BSD/macOS way to parse a date without setting the clock
        new_f=$(date -j -f %m-%d-%y "$f" +%Y_%m_%d)
        mv "$f" "$new_f"
    done
    popd
done
This site gives a good snippet (written for renaming .avi files; adapt the glob as needed):
for i in *.avi; do j=$(echo "$i" | sed -E 's/([0-9]{2})-([0-9]{2})-([0-9]{2})/20\3_\1_\2/g'); mv "$i" "$j"; done