sed command issues - regex

Background
Original json (test.json): {"rpc-password": "password"}
Expected changed json: {"rpc-password": "somepassword"}
replace_json_str is a function that uses sed to replace password with somepassword:
replace_json_str() {
    x=$1
    sed -i -e 's/\({ "'"$2"'":\)"[^"]*" }/\1"'"$3"'" }/g' $x
}
Unit test: replace_json_str test.json rpc-password somepassword
Issue
After running the above test, I get a file named test.json-e, and the contents of the file are the same as before the test was run. Why?

There is a handy command-line JSON tool called jq.
cat input.json
{"rpc-password": "password"}
cat update_json.sh
givenkey=$1
givenvalue=$2
inputfile=input.json
outfile=output.json
cat $inputfile | jq . # show input json
jq --arg key1 "$givenkey" --arg val1 "$givenvalue" '.[$key1] = $val1' "$inputfile" > "$outfile"
cat "$outfile" | jq . # render output json
Keep in mind jq can handle multiple such key/value updates. Execute it:
update_json.sh rpc-password somepassword
{
"rpc-password": "password"
}
{
"rpc-password": "somepassword"
}
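The multiple-updates remark above can be sketched as a single jq pass; the second key, rpc-user, is made up for illustration:

```shell
#!/usr/bin/env bash
# Two key/value updates in one jq filter (rpc-user is a hypothetical key)
updated=$(echo '{"rpc-password": "password", "rpc-user": "old"}' \
  | jq -c --arg p somepassword --arg u admin \
      '."rpc-password" = $p | ."rpc-user" = $u')
echo "$updated"   # → {"rpc-password":"somepassword","rpc-user":"admin"}
```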

Depends on which sed you're using.
The command you ran will work as expected with GNU sed.
But BSD sed requires an argument to -i, so when you run the same command it consumes the next argument, -e, as the backup suffix: that is where the test.json-e file comes from.
Also, the spacing in your pattern doesn't match your example JSON (the pattern expects { "key": with spaces, while the file has {"key":), which is why nothing is replaced.
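A corrected version of the function is sketched below: it quotes the file argument, matches the actual spacing of the sample JSON (no space after the opening brace), and uses -i.bak, which both GNU and BSD sed accept, then deletes the backup. This assumes the simple one-key layout shown above; for anything more, jq is the safer tool.

```shell
#!/usr/bin/env bash
# Portable sed-based replacement for the one-key JSON shown in the question
replace_json_str() {
  local file=$1 key=$2 value=$3
  # -i.bak (no space) is accepted by both GNU and BSD sed
  sed -i.bak -e 's/\("'"$key"'": \)"[^"]*"/\1"'"$value"'"/g' "$file"
  rm -f "$file.bak"
}

echo '{"rpc-password": "password"}' > test.json
replace_json_str test.json rpc-password somepassword
cat test.json   # → {"rpc-password": "somepassword"}
```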

Related

How to have 2 outputs of newman with a single execution

Given the following scenario:
execute collections using newman and save the results in a report file, but meanwhile also show the standard output of newman so that we can see the details (as when executing normally, without saving to any report file).
So my problem is that when executing newman with the option to save to a report file, it seems to redirect the standard output and turn it into the report file. During this execution I see nothing in the standard output at all.
As of now I can do this in two steps, which seems a bit unprofessional.
inside of:
ExecutePostmanCollection.ps1
...
newman run $collection -e $env --insecure -r junitfull --reporter-junitfull-export $result
...
newman run $collection -e $env --insecure --disable-unicode | Out-File -FilePath "./output.txt"
Get-Content "./output.txt"
Thank you and
Regards
CP
Use the cli reporter:
newman run $collection -e $env --insecure -r cli,junitfull --reporter-junitfull-export $result ...
newman run $collection -e $env --insecure -r cli --disable-unicode | Out-File -FilePath "./output.txt"
Get-Content "./output.txt"
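If a single run is the goal, the cli reporter's console output can also be duplicated into a file by piping through tee; echo stands in here for the newman invocation, since the collection and environment are not available:

```shell
#!/usr/bin/env bash
# tee writes its stdin both to the console and to a file in one pass;
# replace `echo ...` with the real `newman run ... -r cli,junitfull ...`
echo "newman cli output" | tee output.txt
cat output.txt   # → newman cli output
```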

PCF: How to find the list of all the spaces where you have Developer privileges

If you are not an admin in Pivotal Cloud Foundry, how will you find or list all the orgs/spaces where you have developer privileges? Is there a command or menu to get that, instead of going into each space and verifying it?
Here's a script that will dump the org & space names of which the currently logged in user is a part.
A quick explanation: it calls the /v2/spaces API, which already filters to show only the spaces the currently logged-in user can see (if you run it as a user with admin access, it will list all orgs and spaces). We then iterate over the results, take each space's organization_url field, and cf curl that to get the organization name (a hashmap caches the lookups).
This script requires Bash 4+ for the hashmap support. If you don't have that, you can remove that part and it will just be a little slower. It also requires jq, and of course the cf cli.
#!/usr/bin/env bash
#
# List all spaces available to the current user
#
set -e

function load_all_pages {
  URL="$1"
  DATA=""
  until [ "$URL" == "null" ]; do
    RESP=$(cf curl "$URL")
    DATA+=$(echo "$RESP" | jq .resources)
    URL=$(echo "$RESP" | jq -r .next_url)
  done
  # dump the data
  echo "$DATA" | jq '.[]' | jq -s .
}

function load_all_spaces {
  load_all_pages "/v2/spaces"
}

function main {
  declare -A ORGS # cache org name lookups
  # load all the spaces & properly paginate
  SPACES=$(load_all_spaces)
  # filter out the name & org_url
  SPACES_DATA=$(echo "$SPACES" | jq -rc '.[].entity | {"name": .name, "org_url": .organization_url}')
  printf 'Org\tSpace\n'
  for SPACE_JSON in $SPACES_DATA; do
    SPACE_NAME=$(echo "$SPACE_JSON" | jq -r '.name')
    # take the org_url and look up the org name, cache responses for speed
    ORG_URL=$(echo "$SPACE_JSON" | jq -r '.org_url')
    ORG_NAME="${ORGS[$ORG_URL]}"
    if [ "$ORG_NAME" == "" ]; then
      ORG_NAME=$(cf curl "$ORG_URL" | jq -r '.entity.name')
      ORGS[$ORG_URL]="$ORG_NAME"
    fi
    printf '%s\t%s\n' "$ORG_NAME" "$SPACE_NAME"
  done
}

main "$@"
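The pagination loop is the subtle part of the script. It can be exercised without a real Cloud Foundry API by stubbing cf with a shell function that serves two made-up pages (jq is still required):

```shell
#!/usr/bin/env bash
# Stub `cf curl` with two hypothetical /v2/spaces pages
cf() {
  case "$2" in
    /v2/spaces) echo '{"next_url": "/v2/spaces?page=2", "resources": [{"entity": {"name": "dev"}}]}' ;;
    *)          echo '{"next_url": null, "resources": [{"entity": {"name": "prod"}}]}' ;;
  esac
}

load_all_pages() {
  local url=$1 data=""
  until [ "$url" == "null" ]; do
    resp=$(cf curl "$url")
    data+=$(echo "$resp" | jq .resources)
    url=$(echo "$resp" | jq -r .next_url)
  done
  # merge the per-page arrays into one
  echo "$data" | jq '.[]' | jq -s .
}

load_all_pages "/v2/spaces" | jq -r '.[].entity.name'   # prints dev, then prod
```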

Regex each line of stdout and push to array in shell/bash

I am using AWS CLI to ls an S3 bucket. The output is:
Austins-MacBook-Pro:~ austin$ aws s3 ls s3://obscured-bucket-name
PRE 2016-02-24-03-42/
PRE 2016-02-25-22-25/
PRE 2016-02-26-00-34/
PRE 2016-02-26-00-42/
PRE 2016-02-26-03-43/
Using either Bash or shell script, I need to take each line, strip the spaces/tabs and the PRE before the prefix names, and put each prefix in an array so I can subsequently rm the oldest folder.
TLDR;
I need to turn the output of aws s3 ls s3://obscured-bucket-name to an array of values like this: 2016-02-26-03-43/
Thanks for reading!
Under bash, you could:
mapfile myarray < <(aws s3 ls s3://obscured-bucket-name)
echo ${myarray[@]#*PRE }
2016-02-24-03-42/ 2016-02-25-22-25/ 2016-02-26-00-34/ 2016-02-26-00-42/ 2016-02-26-03-43/
or
mapfile -t myarray < <(aws s3 ls s3://obscured-bucket-name)
myarray=( "${myarray[@]#*PRE }" )
printf '<%s>\n' "${myarray[@]%/}"
<2016-02-24-03-42>
<2016-02-25-22-25>
<2016-02-26-00-34>
<2016-02-26-00-42>
<2016-02-26-03-43>
Note: the -t switch removes the trailing newline from each line read.
See help mapfile and/or man -Pless\ +/readarray bash
mapfile was introduced in 2009 with version 4 of bash.
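The mapfile approach can be checked without AWS credentials by feeding it sample ls output (the prefixes are copied from the question):

```shell
#!/usr/bin/env bash
# Sample `aws s3 ls` output standing in for the real command
sample=$'                           PRE 2016-02-24-03-42/\n                           PRE 2016-02-26-03-43/'
mapfile -t myarray < <(printf '%s\n' "$sample")
myarray=( "${myarray[@]#*PRE }" )   # strip everything up to "PRE "
printf '%s\n' "${myarray[@]}"
```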
try this:
aws s3 ls s3://obscured-bucket-name | sed -e "s/[^0-9]*//"
so if you want to get the oldest folder:
aws s3 ls s3://obscured-bucket-name | sed -e "s/[^0-9]*//" | sort | head -n 1
You could also bring awk to the rescue:
aws s3 ls <s3://obscured-bucket-name>/ | awk '/PRE/ { print $2 }' | tail -n+2
This prints the second column of every PRE line, and the tail -n+2 drops the first line of the listing; capture the output with mapfile if you need it in an array.

How to download latest version of software from same url using wget

I would like to download the latest source code of a piece of software (WRF) from a URL and automate the installation process thereafter. A sample URL is given below:
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
In the above URL, the version number may change from time to time as the developers release new versions. Now I would like to download the latest available version from the main script. I tried the following:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV'
With the above code, I could pull all available versions of the software. Its output is below:
http://www2.mmm.ucar.edu/wrf/src/WRFV2.0.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Var-do-not-use.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3_OVERLAY_3.0.1.1.TAR.gz
However, I am unable to go further and filter out only the latest version from the list.
Usually I recommend some Perl tools for processing HTML pages, but because this is a directory-index listing, it can (probably) be done with bash tools like grep, sed, and such.
The following code is divided into several smaller bash functions for easy changes:
#!/bin/bash

# getdata - should output the html source of the page
getdata() {
  # use wget with output to stdout, or curl, or fetch
  curl -s "http://www2.mmm.ucar.edu/wrf/src/"
  #cat index.html
}

# filter_rows - get the filename and the date columns
filter_rows() {
  sed -n 's:<tr><td.*href="\([^"]*\)">.*>\([0-9].*\)</td>.*</td>.*</td></tr>:\2#\1:p' | grep "${1:-.}"
}

# sort_by_date - probably doesn't need a comment... sorts the lines by date ;)
sort_by_date() {
  while IFS=# read -r date file
  do
    echo "$(date --date="$date" +%s)#$file"
  done | sort -gr
}

# MAIN
file=$(getdata | filter_rows WRFV | sort_by_date | head -1 | cut -d# -f2)
echo "You want download: $file"
prints
You want download: WRFV3-Chem-3.6.1.TAR.gz
What about adding a numeric sort and taking the top line:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV[0-9]*[0-9]\.[0-9]' | sort -r -n | head -1
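Note that a plain byte-wise sort compares 3.6 and 3.6.1 lexicographically, so WRFV3.6.TAR.gz can outrank WRFV3.6.1.TAR.gz. If GNU sort is available, its version sort (-V) compares the numeric components properly; the list below is a made-up excerpt of the grep output above:

```shell
#!/usr/bin/env bash
# Extract the version numbers from plain WRFV archives and version-sort them
urls='http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.TAR.gz'
latest=$(printf '%s\n' "$urls" \
  | sed -n 's/.*WRFV\([0-9][0-9.]*[0-9]\)\.TAR\.gz/\1/p' \
  | sort -V | tail -n 1)
echo "WRFV${latest}.TAR.gz"   # → WRFV3.6.1.TAR.gz
```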

regular expression to extract data from html page

I want to extract all anchor tags from html pages. I am using this in Linux.
lynx --source http://www.imdb.com | egrep "<a[^>]*>"
but that is not working as expected, since the result contains unwanted matches such as
<a class="amazon-affiliate-site-name" href="http://www.fabric.com">Fabric</a><br>
I want just
<a href >...</a>
any good way ?
If you have a -P option in your grep so that it accepts PCRE patterns, you should be able to use better regexes. Sometimes a minimal quantifier like *? helps. Also, you’re getting the whole input line, not just the match itself; if you have a -o option to grep, it will list only the part that matches.
grep -Po '<a[^<>]*>'
If your grep doesn’t have those options, try
perl -00 -nle 'print $1 while /(<a[^<>]*>)/gi'
Which now crosses line boundaries.
To do a real parse of HTML requires regexes substantially more complex than you are apt to wish to enter on the command line. Here’s one example, and here’s another. Those may not convince you to try a non-regex approach, but they should at least show you how much harder it is in the general case than in specific ones.
This answer shows why all things are possible, but not all are expedient.
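The effect of -o can be checked on a small made-up snippet with two anchors on one line; each match is printed separately instead of the whole matching line:

```shell
#!/usr/bin/env bash
# Two anchors on one line; grep -Eo emits one match per output line
html='<p><a href="/one">one</a> and <a class="x" href="/two">two</a></p>'
matches=$(printf '%s\n' "$html" | grep -Eo '<a[^<>]*>')
printf '%s\n' "$matches"
```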
Why can't you use options like --dump?
lynx --dump --listonly http://www.imdb.com
Try grep -Eo:
$ echo '<a class="amazon-affiliate-site-name" href="http://www.fabric.com">Fabric</a><br>' | grep -Eo '<a[^>]*>'
<a class="amazon-affiliate-site-name" href="http://www.fabric.com">
But please read the answer that MAK linked to.
Here's some examples of why you should not use regex to parse html.
To extract values of 'href' attribute of anchor tags, run:
$ python -c'import sys, lxml.html as h
> root = h.parse(sys.argv[1]).getroot()
> root.make_links_absolute(base_url=sys.argv[1])
> print "\n".join(root.xpath("//a/@href"))' http://imdb.com | sort -u
Install lxml module if needed: $ sudo apt-get install python-lxml.
Output
http://askville.amazon.com
http://idfilm.blogspot.com/2011/02/another-class.html
http://imdb.com
http://imdb.com/
http://imdb.com/a2z
http://imdb.com/a2z/
http://imdb.com/advertising/
http://imdb.com/boards/
http://imdb.com/chart/
http://imdb.com/chart/top
http://imdb.com/czone/
http://imdb.com/features/hdgallery
http://imdb.com/features/oscars/2011/
http://imdb.com/features/sundance/2011/
http://imdb.com/features/video/
http://imdb.com/features/video/browse/
http://imdb.com/features/video/trailers/
http://imdb.com/features/video/tv/
http://imdb.com/features/yearinreview/2010/
http://imdb.com/genre
http://imdb.com/help/
http://imdb.com/helpdesk/contact
http://imdb.com/help/show_article?conditions
http://imdb.com/help/show_article?rssavailable
http://imdb.com/jobs
http://imdb.com/lists
http://imdb.com/media/index/rg2392693248
http://imdb.com/media/rm3467688448/rg2392693248
http://imdb.com/media/rm3484465664/rg2392693248
http://imdb.com/media/rm3719346688/rg2392693248
http://imdb.com/mymovies/list
http://imdb.com/name/nm0000207/
http://imdb.com/name/nm0000234/
http://imdb.com/name/nm0000631/
http://imdb.com/name/nm0000982/
http://imdb.com/name/nm0001392/
http://imdb.com/name/nm0004716/
http://imdb.com/name/nm0531546/
http://imdb.com/name/nm0626362/
http://imdb.com/name/nm0742146/
http://imdb.com/name/nm0817980/
http://imdb.com/name/nm2059117/
http://imdb.com/news/
http://imdb.com/news/celebrity
http://imdb.com/news/movie
http://imdb.com/news/ni7650335/
http://imdb.com/news/ni7653135/
http://imdb.com/news/ni7654375/
http://imdb.com/news/ni7654598/
http://imdb.com/news/ni7654810/
http://imdb.com/news/ni7655320/
http://imdb.com/news/ni7656816/
http://imdb.com/news/ni7660987/
http://imdb.com/news/ni7662397/
http://imdb.com/news/ni7665028/
http://imdb.com/news/ni7668639/
http://imdb.com/news/ni7669396/
http://imdb.com/news/ni7676733/
http://imdb.com/news/ni7677253/
http://imdb.com/news/ni7677366/
http://imdb.com/news/ni7677639/
http://imdb.com/news/ni7677944/
http://imdb.com/news/ni7678014/
http://imdb.com/news/ni7678103/
http://imdb.com/news/ni7678225/
http://imdb.com/news/ns0000003/
http://imdb.com/news/ns0000018/
http://imdb.com/news/ns0000023/
http://imdb.com/news/ns0000031/
http://imdb.com/news/ns0000128/
http://imdb.com/news/ns0000136/
http://imdb.com/news/ns0000141/
http://imdb.com/news/ns0000195/
http://imdb.com/news/ns0000236/
http://imdb.com/news/ns0000344/
http://imdb.com/news/ns0000345/
http://imdb.com/news/ns0004913/
http://imdb.com/news/top
http://imdb.com/news/tv
http://imdb.com/nowplaying/
http://imdb.com/photo_galleries/new_photos/2010/
http://imdb.com/poll
http://imdb.com/privacy
http://imdb.com/register/login
http://imdb.com/register/?why=footer
http://imdb.com/register/?why=mymovies_footer
http://imdb.com/register/?why=personalize
http://imdb.com/rg/NAV_TWITTER/NAV_EXTRA/http://www.twitter.com/imdb
http://imdb.com/ri/TRAILERS_HPPIRATESVID/TOP_BUCKET/102785/video/imdb/vi161323033/
http://imdb.com/search
http://imdb.com/search/
http://imdb.com/search/name?birth_monthday=02-12
http://imdb.com/search/title?sort=num_votes,desc&title_type=feature&my_ratings=exclude
http://imdb.com/sections/dvd/
http://imdb.com/sections/horror/
http://imdb.com/sections/indie/
http://imdb.com/sections/tv/
http://imdb.com/showtimes/
http://imdb.com/tiger_redirect?FT_LIC&licensing/
http://imdb.com/title/tt0078748/
http://imdb.com/title/tt0279600/
http://imdb.com/title/tt0377981/
http://imdb.com/title/tt0881320/
http://imdb.com/title/tt0990407/
http://imdb.com/title/tt1034389/
http://imdb.com/title/tt1265990/
http://imdb.com/title/tt1401152/
http://imdb.com/title/tt1411238/
http://imdb.com/title/tt1411238/trivia
http://imdb.com/title/tt1446714/
http://imdb.com/title/tt1452628/
http://imdb.com/title/tt1464174/
http://imdb.com/title/tt1464540/
http://imdb.com/title/tt1477837/
http://imdb.com/title/tt1502404/
http://imdb.com/title/tt1504320/
http://imdb.com/title/tt1563069/
http://imdb.com/title/tt1564367/
http://imdb.com/title/tt1702443/
http://imdb.com/tvgrid/
http://m.imdb.com
http://pro.imdb.com/r/IMDbTabNB/
http://resume.imdb.com
http://resume.imdb.com/
https://secure.imdb.com/register/subscribe?c=a394d4442664f6f6475627
http://twitter.com/imdb
http://wireless.amazon.com
http://www.3news.co.nz/The-Hobbit-media-conference--full-video/tabid/312/articleID/198020/Default.aspx
http://www.amazon.com/exec/obidos/redirect-home/internetmoviedat
http://www.audible.com
http://www.boxofficemojo.com
http://www.dpreview.com
http://www.endless.com
http://www.fabric.com
http://www.imdb.com/board/bd0000089/threads/
http://www.imdb.com/licensing/
http://www.imdb.com/media/rm1037220352/rg261921280
http://www.imdb.com/media/rm2695346688/tt1449283
http://www.imdb.com/media/rm3987585536/tt1092026
http://www.imdb.com/name/nm0000092/
http://www.imdb.com/photo_galleries/new_photos/2010/index
http://www.imdb.com/search/title?sort=num_votes,desc&title_type=tv_series&my_ratings=exclude
http://www.imdb.com/sections/indie/
http://www.imdb.com/title/tt0079470/
http://www.imdb.com/title/tt0079470/quotes?qt0471997
http://www.imdb.com/title/tt1542852/
http://www.imdb.com/title/tt1606392/
http://www.imdb.de
http://www.imdb.es
http://www.imdb.fr
http://www.imdb.it
http://www.imdb.pt
http://www.movieline.com/2011/02/watch-jon-hamm-talk-butthole-surfers-paul-rudd-impersonate-jay-leno-at-book-reading-1.php
http://www.movingimagesource.us/articles/un-tv-20110210
http://www.npr.org/blogs/monkeysee/2011/02/10/133629395/james-franco-recites-byron-to-the-worlds-luckiest-middle-school-journalist
http://www.nytimes.com/2011/02/06/books/review/Brubach-t.html
http://www.shopbop.com/welcome
http://www.smallparts.com
http://www.twinpeaks20.com/details/
http://www.twitter.com/imdb
http://www.vanityfair.com/hollywood/features/2011/03/lauren-bacall-201103
http://www.warehousedeals.com
http://www.withoutabox.com
http://www.zappos.com
To extract values of 'href' attribute of anchor tags you may also use xmlstarlet after converting HTML to XHTML using HTML Tidy (Mac OS X version released on 25 March 2009):
curl -s www.imdb.com |
tidy -q -c -wrap 0 -numeric -asxml -utf8 --merge-divs yes --merge-spans yes 2>/dev/null |
xmlstarlet sel -N x="http://www.w3.org/1999/xhtml" -t -m "//x:a/@href" -v '.' -n |
grep '^[[:space:]]*http://' | sort -u | nl
On Mac OS X you may also use the command line tool linkscraper:
linkscraper http://www.imdb.com
see: http://codesnippets.joyent.com/posts/show/10772