Scanning APIs with ZAP Docker image - replacer with regex

I'm trying to use the API scan Docker image as described here: https://www.zaproxy.org/blog/2017-06-19-scanning-apis-with-zap/ and I want to rewrite some requests using a regex. I'm using this command:
docker run -v $(pwd):/zap/wrk/:rw --network=host -t owasp/zap2docker-weekly zap-api-scan.py --hook=/zap/wrk/authentication-hooks.py -t docs/openapi.yaml -f openapi -w output/oppenapi.md -z "-configfile /zap/wrk/zapproxy.prop" -d
with "zapproxy.prop":
replacer.full_list(0).description=customerId
replacer.full_list(0).enabled=true
replacer.full_list(0).matchtype=REQ_HEADER_STR
replacer.full_list(0).matchstr=/api/customers/\d+
replacer.full_list(0).regex=true
replacer.full_list(0).replacement=/api/customers/1
and the replacement doesn't work for the URL I want to modify: GET /api/customers/10. The same rule entered via the GUI works just fine.
I've also tried:
replacer.full_list(0).description=customerId
replacer.full_list(0).enabled=true
replacer.full_list(0).matchtype=REQ_HEADER_STR
replacer.full_list(0).matchstr=/api/customers/10
replacer.full_list(0).regex=false
replacer.full_list(0).replacement=/api/customers/1
It also works fine.
Simon Bennetts suggested checking how the GUI saves those settings: https://www.zaproxy.org/faq/how-do-you-find-out-what-key-to-use-to-set-a-config-value-on-the-command-line/. As you can see, there aren't any escapes in matchstr.
Is there something that I need to do to pass this regex correctly?

Escaping was the issue. The backslash in the regex has to be doubled in the properties file:
replacer.full_list(0).description=clientId
replacer.full_list(0).enabled=true
replacer.full_list(0).matchtype=REQ_HEADER_STR
replacer.full_list(0).matchstr=/api/customers/\\d+
replacer.full_list(0).regex=true
replacer.full_list(0).replacement=/api/customers/2
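To avoid hand-escaping, the rule lines could be generated by a small script. This is only a sketch: it assumes the file passed to -configfile is read with Java .properties conventions, where a backslash starts an escape sequence and has to be doubled to survive. The helper names are made up for illustration; the keys are the ones from the question.

```python
def escape_properties_value(value):
    """Double every backslash so a .properties reader yields the original text."""
    return value.replace("\\", "\\\\")


def replacer_rule(index, description, match, replacement):
    """Build the lines for one regex replacer rule (keys copied from the question)."""
    prefix = f"replacer.full_list({index})"
    lines = [
        f"{prefix}.description={description}",
        f"{prefix}.enabled=true",
        f"{prefix}.matchtype=REQ_HEADER_STR",
        f"{prefix}.matchstr={escape_properties_value(match)}",
        f"{prefix}.regex=true",
        f"{prefix}.replacement={escape_properties_value(replacement)}",
    ]
    return "\n".join(lines)


# The regex is written once, unescaped, as a raw string.
rule_text = replacer_rule(0, "customerId", r"/api/customers/\d+", "/api/customers/1")
print(rule_text)
```

The printed matchstr line carries the doubled backslash, matching the working configuration above.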

Related

How do you pipe and filter text from tail as input for a variable in a script?

Backstory
I am trying to create a script that updates a "device" through the device's CLI, but the device doesn't accept any form of command following the establishment of an ssh connection.
For this reason I have started using screen to log the output from the device, and I'm attempting to filter the log for relevant info so I can pass commands back to the remote device by stuffing them into screen's buffer (kind of a ramshackle way of doing it, but it's all I can think of).
Issue
I need to use some combination of grep and sed or awk to pick out one of two outputs from screenlog.2: a serial number such as "SN12345678" (my pattern: '\w[a-zA-Z]\d{6-10}') and the word "finished". I have regex patterns for both of these, but I cannot seem to get the right output and assign it to a variable.
.screenrc (relevant excerpt)
screen -t script 0 ./script
screen -t local 1 bash
screen -t remote 2 bash
screen -t Shell 3 bash
./script
screen -p 2 -X log on #turns logging on window 2
screen -p 3 -X stuff 'tail -Fn 0 screenlog.2 | #SOMESED Function that i cant figure out'
screen -p 2 -X stuff 'ssh -o "UserKnownHostsFile /dev/null" -o "StrictHostKeyChecking=no" admin@192.168.0.1^M' && echo "Stuffed ssh login -> window 2"
sleep 2 # wait for ssh connection
screen -p 2 -X stuff admin^M && echo "stuffed pw"
sleep 4 # wait for auth
screen -p 2 -X stuff "copy sw ftp://ftpuser:admin@192.168.0.2/dev_uimage-4_4_5-26222^M" && echo "initiated flash"
screen -p 2 -X stuff "copy license ftp://ftpuser:admin@192.168.0.2/$(result of sed from screenlog.2).lic^M" && echo "uploading license"
Sorry if this is a bit long-winded; I've been racking my brain for the last few days trying to get this to work.
Thank you for your time!
Answer
Regular Expression
Looking at the example regex you provided, I'm going to assume SN can't just be hardcoded, and that the first character can be an uppercase letter, a lowercase letter, or a digit while the second is an uppercase or lowercase letter, so I think you are looking for:
grep -Eo '[[:alnum:]][[:alpha:]][[:digit:]]{6,10}' # Works regardless of the computer's locale settings
# OR
egrep -o '[[:alnum:]][[:alpha:]][[:digit:]]{6,10}' # Works regardless of the computer's locale settings
# OR
grep -Eo '[0-9A-Za-z][A-Za-z][0-9]{6,10}'
# OR
egrep -o '[0-9A-Za-z][A-Za-z][0-9]{6,10}'
These are exact conversions of your regular expression, reading {6-10} as {6,10} (they include _ as a possibility for the first character):
grep -Eo '[[:alnum:]_][[:alpha:]][[:digit:]]{6,10}' # Works regardless of the computer's locale settings
# OR
grep -Eo '[0-9A-Za-z_][A-Za-z][0-9]{6,10}'
# OR (non-extended regular expressions)
grep -o '[[:alnum:]_][[:alpha:]][[:digit:]]\{6,10\}'
grep -o '[0-9A-Za-z_][A-Za-z][0-9]\{6,10\}'
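For comparison, here is the same pattern checked in Python. This is only a quick sketch for trying the expression against sample text; the log line is made up.

```python
import re

# Same shape as the grep pattern: one alphanumeric, one letter, then 6 to 10 digits.
SERIAL_RE = re.compile(r"[0-9A-Za-z][A-Za-z][0-9]{6,10}")


def find_serials(text):
    """Return every substring of text that looks like a serial number."""
    return SERIAL_RE.findall(text)


sample = "Device serial: SN12345678 status ok"  # made-up log line
print(find_serials(sample))  # ['SN12345678']
```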
Reuse the Match
I don't know how you would assign the output to a variable, but I would just write it to a file and delete the file afterwards (assuming the "script" and "Shell" windows have the same pwd [present working directory]):
. . .
screen -p 3 -X stuff 'tail -Fn1 screenlog.2 | grep -Eo "[[:alnum:]][[:alpha:]][[:digit:]]{6,10}" >> SerialNumberOrID^M'
. . .
screen -p 2 -X stuff "copy license ftp://ftpuser:admin@192.168.0.2/$(cat SerialNumberOrID).lic^M" && echo "uploading license"
rm -f SerialNumberOrID
Explanation
Regular Expression
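I'm fairly confident that POSIX regular expressions, as used by grep, sed, and awk, don't define \w and \d. Those are Perl-style shorthand classes; GNU grep accepts \w as an extension, but \d needs grep -P. Also note that the range quantifier is written {6,10}, not {6-10}. You can pass -E to grep and sed to make them use extended regular expressions (which saves you from having to do as much escaping).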
Command Changes
Writing the match to a file seemed like the best way to reuse it. Using >> ensures that we append to the file, so that grep only adds the matching text and never overwrites an earlier match with an empty file. This is why it's necessary to delete the file at the end of your script (so that it won't interfere with the next run, and so you don't have unnecessary files lying around). In the license upload command, we use cat to output the contents of the file inline. I also changed the tail command to tail -Fn1 because I'm pretty sure you need at least 1 for it to feed a line into grep. If nothing reaches the file while tail is still running, grep may be buffering its output; GNU grep's --line-buffered option is meant for that case.
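If a scripting language is available on the local machine, the "assign it to a variable" part could also be done by reading the log directly. This is a rough sketch rather than a replacement for the screen setup: the log lines are invented, and it reuses the serial number pattern from above.

```python
import re

SERIAL_RE = re.compile(r"[0-9A-Za-z][A-Za-z][0-9]{6,10}")


def scan_log(lines):
    """Return (first serial number seen or None, whether 'finished' appeared)."""
    serial = None
    finished = False
    for line in lines:
        if serial is None:
            match = SERIAL_RE.search(line)
            if match:
                serial = match.group(0)
        if "finished" in line:
            finished = True
    return serial, finished


# Made-up log content; for the real file, pass open("screenlog.2", errors="replace").
log_lines = ["login ok\r\n", "Serial number: SN12345678\r\n", "copy finished\r\n"]
serial, finished = scan_log(log_lines)
print(f"{serial}.lic", finished)
```

The serial value can then be interpolated into the license file name, which is what $(cat SerialNumberOrID) does in the shell version.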
Resources
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX_Basic_Regular_Expressions
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions
grep, sed, and awk man pages

using sed to replace a line with back slashes in a shell script

I am trying to replace the bottom one of these 2 lines with sed in a file.
<rule>out_prefix=orderid ^1\\d\+ updatemtnotif/</rule>\n\
<rule>out_prefix=orderid ^2\\d\+ updatemtnotif/</rule>\n\
And the following command seems to do that when executed as a command at the bash prompt
sed -i 's#out_prefix=orderid ^2\\\\d\\+ updatemtnotif/#out_prefix=orderid ^2\\\\d\\+ updatemtnotif_fr/#g' /opt/temp/rules.txt
However, when I try to execute the same command remotely over ssh using a here document, the command fails to modify the file.
I think this is probably an escaping issue, but I have had no luck modifying the command in numerous ways. Can anyone tell me what I should do to get it working over ssh? Thanks in advance!
to clarify,
input: <rule>out_prefix=orderid ^2\\d\+ updatemtnotif/</rule>\n\
output: <rule>out_prefix=orderid ^2\\d\+ updatemtnotif_fr/</rule>\n\
You can use it with ssh and heredoc like this:
ssh -t -t user@localhost<<'EOF'
sed 's~out_prefix=orderid ^2\\\\d\\+ updatemtnotif/~out_prefix=orderid ^2\\\\d\\+ updatemtnotif_fr/~' ~/path/to/file
exit
EOF
PS: It is important to quote the 'EOF' as shown.
I managed to fix it. I had to escape the backslashes in the command I used inside the shell script:
's#out_prefix=orderid ^2\\\\\\\\d\\\\+ updatemtnotif/#out_prefix=orderid ^2\\\\\\\\d\\\\+ updatemtnotif_fr/#g' /opt/temp/rules.txt
That's a whole lot of backslashes but it did the trick.
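Each quoting layer (the shell script, the here document, sed's regex syntax) can consume a level of backslashes, which is how the count grows. As an aside, a literal string replacement sidesteps the regex layer entirely. The sketch below swaps sed for Python's str.replace, which is not what either answer used; the sample lines mirror the question but leave off the trailing \n\ for brevity.

```python
def retarget_rule(text):
    """Replace the ^2 rule's target using a plain (non-regex) string match."""
    # Raw strings keep the backslashes exactly as they appear in the file.
    old = r"out_prefix=orderid ^2\\d\+ updatemtnotif/"
    new = r"out_prefix=orderid ^2\\d\+ updatemtnotif_fr/"
    return text.replace(old, new)


rules = "\n".join([
    r"<rule>out_prefix=orderid ^1\\d\+ updatemtnotif/</rule>",
    r"<rule>out_prefix=orderid ^2\\d\+ updatemtnotif/</rule>",
])
updated = retarget_rule(rules)
print(updated)
```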

Make a POST request using ab (apache benchmarking) on a django server

I'm trying to make a HTTP POST request using ab to a form built with django.
I'm using the following line:
ab -n 10 -C csrftoken=my_token -p ab_file.data -T application/x-www-form-urlencoded http://localhost:8000/
My ab_file.data looks like this:
url=my_encoded_url&csrfmiddlewaretoken=my_token
It always returns a 403 status code.
When I use curl using the same parameters, it works. The curl line:
curl -X POST -d "url=my_encoded_url&csrfmiddlewaretoken=my_token" --cookie "csrftoken=my_token" http://localhost:8000/
How can I do that?
The file must contain properly URL-encoded data. If you URL-encode by hand, it is too easy to introduce typos such as stray blanks or wrong encodings, so it is best to do it programmatically.
See another answer, "Apache Bench and POST data",
on how to use Python to create such a file (e.g. post.data).
Then use:
ab -T 'application/x-www-form-urlencoded' -n 10 -p post.data http://localhost:8080/
When using ab, the entire contents of the data file must be wrapped onto a single line - it fails silently if it's normally expanded JSON. So a post from a data file that works fine with curl will fail with ab until you do this.
Tip: If using Atom or VSCode, select all and hit Cmd-J to wrap everything to one line.
@jacobm654321,
For sure, the best thing to do is to encode the URL programmatically, but that wasn't my problem. My problem was that the file containing the POST data had a blank line at the end of the file; EditorConfig put it there. After removing that blank line, everything worked well.
Thanks anyway.
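Both points in this thread (encode programmatically, and leave no trailing newline) can be covered by generating the data file from a script. This is a minimal sketch: the field values are placeholders, and the file name is the one from the question.

```python
from urllib.parse import urlencode


def build_post_body(fields):
    """URL-encode the form fields on a single line with no trailing newline."""
    return urlencode(fields).encode("ascii")


body = build_post_body({
    "url": "http://example.com/some page?x=1",  # placeholder value
    "csrfmiddlewaretoken": "my_token",          # placeholder token
})
print(body.decode("ascii"))

if __name__ == "__main__":
    # Binary mode writes the bytes as-is, so no newline is appended.
    with open("ab_file.data", "wb") as handle:
        handle.write(body)
```

The resulting file can be passed to ab with -p ab_file.data as in the question.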

Is my regex too greedy?

Background: We're using a tape library and the backup software NetWorker to back up data here. The client that's installed is fairly basic, and when we need to restore more than one target directory we create a script that simply calls X client instances in the background via a script with X of the following lines:
recover -c client-srv -t "Mon Dec 10 08:00:00" -s barckup-srv -d /dest/dir/ -f -a /src/dir &
The trouble is that different partitions/directories backed up from the same machine at the same time might be spread across several different tapes, and some of those tapes may have been removed from the library between the backup and restore.
Up until recently the only ways the people here have been finding out about which tapes are needed were to either wait for the library to complain that it doesn't have a particular tape, or to set up a fake restore in a crappy old desktop GUI client and hit a particular menu option. The first option is super bad when the tape turns out to be off-site and takes a day to get back, and the second is tedious and time-consuming.
Actual Question: I've written a "meta"-script that reads the script that we've already created with the commands above, feeds it into the interactive CLI client, and gets it to spit out what tapes are required, and if they're actually in the library. To do this, the script uses the following regular expressions to pull out necessary info:
# pull out a list of the -a targets
restore_targets="`sed 's/^.* -a \([^ ]*\) .*$/\1/' $rec_script`"
# pull out a list of -c clients
restore_clients="`sed 's/^.* -c \([^ ]*\) .*$/\1/' $rec_script`"
numclients=`echo $restore_clients | uniq | wc -l`
# pull out a list of -t dates
restore_dates="`sed 's/^.* -t \"\([^\"]*\)\" .*$/\1/' $rec_script`"
numdates=`echo $restore_dates | uniq | wc -l`
I am not terribly familiar with using s/\(x\)/\1/ types of regexes, to the point that I don't remember the name, but is this the best way of accomplishing what I am doing? The commands work, but I'm wondering if I'm using the .* needlessly.
\1 refers to the first capturing group. If you replace foo(.*) with \1 and feed in foobar, the resulting text becomes bar, as \1 stands for the text captured by the first capturing group (in sed's basic syntax the group is written \(...\)).
As for your question, it might be safer and easier to parse the arguments using Python (or another high-level scripting language):
>>> import shlex
>>> shlex.split('recover -c client-srv -t "Mon Dec 10 08:00:00" -s barckup-srv -d /dest/dir/ -f -a /src/dir &')
['recover', '-c', 'client-srv', '-t', 'Mon Dec 10 08:00:00', '-s', 'barckup-srv', '-d', '/dest/dir/', '-f', '-a', '/src/dir', '&']
Now, this is much easier to work with. The quotes are gone and all of the components of the command are nicely split up into a list.
If you want this to be completely foolproof, you could use argparse and implement your own parser for this command line pretty easily. This will enable you to easily get the info, but it might be overkill for your situation.
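A sketch of what the argparse route might look like follows. It assumes, as the sed expressions in the question do, that -a is followed by the source path and that -f is a plain flag; the dest names are made up, and the real recover client may define these options differently.

```python
import argparse
import shlex


def build_recover_parser():
    """Parser for the option layout used in the example line (an assumption)."""
    parser = argparse.ArgumentParser(prog="recover", add_help=False)
    parser.add_argument("-c", dest="client")
    parser.add_argument("-t", dest="date")
    parser.add_argument("-s", dest="server")
    parser.add_argument("-d", dest="destination")
    parser.add_argument("-f", dest="force", action="store_true")
    parser.add_argument("-a", dest="target")
    return parser


def parse_recover_line(line):
    tokens = shlex.split(line)
    # Drop the command name and the trailing "&" that backgrounds the job.
    args = [tok for tok in tokens[1:] if tok != "&"]
    return build_recover_parser().parse_args(args)


parsed = parse_recover_line(
    'recover -c client-srv -t "Mon Dec 10 08:00:00" -s barckup-srv '
    "-d /dest/dir/ -f -a /src/dir &"
)
print(parsed.client, parsed.date, parsed.target)
```

Collecting parsed.client and parsed.date from every line into a set would give the unique clients and dates that the script counts.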
As for your actual question, you can dissect the regex:
^.* -t "([^\"]*)" .*$
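The capture group [^"]* cannot run past the next double quote, so that part is not too greedy. It does not understand escaped quotes, though: given -t "foo \" bar" it captures foo \ and stops there. The greedy part is the leading ^.*, which only matters if a line contains more than one -t "..." option: the match then settles on the last one, while a non-greedy version would pick the first.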
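The effect of the leading .* is easiest to see on a line with two -t options. This is an illustrative sketch using Python's re module rather than sed, and the input line is invented; sed's basic regular expressions have no non-greedy quantifier, so the lazy variant is shown only for contrast.

```python
import re

line = 'recover -c a -t "first" -t "second" -a /src &'  # made-up input

# Greedy prefix: .* grabs as much as it can, so the LAST -t "..." is captured.
greedy = re.search(r'^.* -t "([^"]*)" .*$', line).group(1)

# Lazy prefix: .*? grabs as little as it can, so the FIRST -t "..." is captured.
lazy = re.search(r'^.*? -t "([^"]*)" .*$', line).group(1)

print(greedy, lazy)  # second first
```

For lines with a single -t option, as in the restore script, both variants capture the same text.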

Mirroring with regex in wget

I'm using wget and trying to mirror all 98 folders on a website. What would be the syntax to do "wget -mk http://example.com/folder[1-98]/"?
Thanks.
for i in $(seq 1 98);do echo "http://example.com/folder${i}/";done|wget -mki -
wget does not support specifying URL ranges in the format you've described. You would be better served building out the range of links with bash or some other programming language into a text file and then reading that text file with wget.
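If your shell is bash, you can use brace expansion to generate the range of numbers, for example "wget -mk http://example.com/folder{1..98}". The shell expands the braces into 98 separate URLs before wget runs.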
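The "build the list with some other programming language" suggestion could look like the sketch below. The function and script names are hypothetical; the base URL is the one from the question.

```python
def folder_urls(base="http://example.com", first=1, last=98):
    """Return one URL per numbered folder, first through last inclusive."""
    return [f"{base}/folder{i}/" for i in range(first, last + 1)]


urls = folder_urls()
print("\n".join(urls))
```

Saved as, say, make_urls.py, its output can be piped into wget the same way as the shell loop: python make_urls.py | wget -mki -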