I have some text containing a password that may include special characters (such as /, *, ., [], () and others that have a meaning in regular expressions). How can I remove that password from the text using Korn shell, or perhaps sed or awk? I need an answer that is compatible with both Linux and IBM AIX.
For example, the password is "123*(/abc" (it is contained in a variable).
The text (it is also contained in a variable) may look like below:
"connect user/123*(/abc#connection_string"
As a result I want to obtain following text:
"connect user/#connection_string"
I tried to use tr -d but got the wrong result:
l_pwd='1234/\#1234'
l_txt='connect target user/1234/\#1234#connection'
print $l_txt | tr -d $l_pwd
connect target user\connection
tr -d removes every character listed in l_pwd from l_txt, wherever it occurs, which is why the result is so strange.
Try this:
l_pwd="1234/\#1234";
escaped_pwd=$(printf '%s\n' "$l_pwd" | sed -e 's/[]\/$*.^[]/\\&/g')
l_txt="connect target user/1234/\#1234#connection";
printf '%s\n' "$l_txt" | sed "s/$escaped_pwd//g";
It prints connect target user/#connection in bash at least.
The big caveat is that this does not work when the password contains newlines, and possibly in other cases; check out where I got this solution from.
With ksh 93u+
Here's some parameter expansion madness:
$ echo "${l_txt//${l_pwd//[\/\\]/?}/}"
connect target user/#connection
That takes the password variable, substitutes forward and back slashes into ?, then uses that as a pattern to remove from the connection string.
A more robust version:
x=${l_pwd//[^[:alnum:]]/\\\0}
typeset -p x # => x='1234\/\\\#1234'
conn=${l_txt%%${x}*}${l_txt#*${x}}
typeset -p conn # => connect target user/#connection
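If escaping the password for a regex or a shell pattern feels fragile, a literal substring search avoids the problem entirely. Below is a minimal sketch in POSIX sh and awk, which should behave the same on Linux and AIX as long as the awk supports ENVIRON. The function name remove_literal and the environment variable names RL_TEXT and RL_NEEDLE are made up for this example. The values are passed through the environment rather than with -v so that awk does not interpret backslash sequences in them.

```shell
# remove_literal TEXT NEEDLE
# Print TEXT with every literal occurrence of NEEDLE removed.
# (Hypothetical helper; no regex or glob matching is involved.)
remove_literal() {
    RL_TEXT=$1 RL_NEEDLE=$2 awk 'BEGIN {
        s = ENVIRON["RL_TEXT"]
        n = ENVIRON["RL_NEEDLE"]
        if (n == "") { print s; exit }     # nothing to remove
        out = ""
        # index() does a plain substring search, so no escaping is needed
        while ((i = index(s, n)) > 0) {
            out = out substr(s, 1, i - 1)  # keep the part before the match
            s = substr(s, i + length(n))   # continue after the match
        }
        print out s
    }'
}

l_pwd='1234/\#1234'
l_txt='connect target user/1234/\#1234#connection'
remove_literal "$l_txt" "$l_pwd"    # prints: connect target user/#connection
```

Because the search is literal, characters such as *, ( and / in the password need no special treatment.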
I am making this function for my Vim statusline:
function! GitCommitSymbol(timer_id)
let l:uncommited = system('echo -n $(git checkout | grep -oP "(\M+)" -c)')
if (uncommited == number)
let g:uncommited = ''
hi MyGitSymbol ctermfg=11
else
let g:uncommited = ''
hi MyGitSymbol ctermfg=7
endif
call timer_start(30000, 'GitCommitSymbol')
endfunction
I made this command, git checkout | grep -oP "(\M+)" -c, specifically to return only the number of uncommitted changes (that's why I use grep).
So basically, if I'm in a directory that is a git repository and I type the command in the shell, it returns "0", "1" and so on, depending on how many uncommitted changes I have.
However, if I type it in a directory that is not a git repository, it returns something like:
fatal: not a git repository (or any parent up to mount point /) Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set). 0
So, on this line of the function: if (uncommited == number), I need a condition that is true IF the output of the command is numeric, with the "else" branch handling the case where it is not numeric (like the "fatal" output shown above). But I don't know how to write this condition.
You don't need the extra echo, just call the command directly, then use Vimscript's trim() to remove whitespace around the output.
let uncommitted = trim(system('git checkout | grep -oP "(\M+)" -c'))
Then you can check if it's a number by using the =~ operator, together with a regex such as '^\d\+$' to match numeric output:
if uncommitted =~ '^\d\+$'
If later on you would like to access the count as a number, you can use str2nr(uncommitted).
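If you would rather do the numeric check on the shell side, before the value ever reaches Vim, the same "digits only" test can be sketched with a case pattern. The helper name is_number is made up for this example; it mirrors the '^\d\+$' regex above (one or more ASCII digits, nothing else).

```shell
# is_number STRING
# Succeed only if STRING is non-empty and consists solely of ASCII digits.
# (Hypothetical helper, equivalent in spirit to the Vim regex '^\d\+$'.)
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

for candidate in 0 12 'fatal: not a git repository'; do
    if is_number "$candidate"; then
        printf '%s -> numeric\n' "$candidate"
    else
        printf '%s -> not numeric\n' "$candidate"
    fi
done
```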
I am trying to get a parameter from a YAML file using sed.
The file looks like:
service:
  vhost: host1
  port: 8080
database:
  vhost: host2
  port: 8080
and I need the value of service.vhost.
To do that I am using:
echo $value | sed -e 's/.*service.*vhost:\s\?\(\S*\).*/\1/';
but instead of host1 I am getting host2
What is the reason for this behavior?
I need the first occurrence of the matching expression; the file may contain other, unknown entries, and the word "vhost" may appear multiple times.
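The reason is twofold: because $value is unquoted, echo collapses the whole file onto a single line, and .* is greedy, so service.*vhost: stretches all the way to the last vhost: on that line, which is the one under database. As I wrote in my comment above, you're much better off using one of the many programming languages with built-in YAML support, like Ruby: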
$ echo "$value" | ruby -ryaml -e 'puts YAML.load(ARGF)["service"]["vhost"]'
# => host1
If you're committed to using sed, though, something like this might work most of the time:
echo "$value" | sed -nE '
/^service:$/,$ !b
s/^\s+vhost: (\S+)/\1/
T
p; q
'
You'll notice that this uses -E to use extended regexes and avoid all of those backslashes, and -n to suppress automatic printing, since we really just want to print one thing.
Breaking it down by line:
/^service:$/,$ is an address range that matches a line containing service: and every subsequent line until the end of the file ($). ! inverts the match (i.e. causes the subsequent command to be executed for lines not within the range). b unconditionally branches to the end of the cycle. In other words, don't do anything until we get to service:.
s/^\s+vhost: (\S+)/\1/ should look familiar. It replaces a line like vhost: foo with foo.
T is like b but branches to the end of the cycle only if no successful substitution has been made in this cycle (it's the inverse of t). That is, if we didn't match vhost: above, don't do anything else on this line.
p prints the pattern space (which now contains the result of the above substitution). We need this because we used the -n switch. q quits without further processing. Since we found our match, we're done.
This can, of course, be made a one-liner:
echo "$value" | sed -nE '/^service:$/,$!b; s/^\s+vhost: (\S+)/\1/; T; p; q'
You can see it in action on TiO.
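The sed script above relies on GNU extensions (T and \s). If portability matters, the same idea (wait for the section header, then take the first matching key beneath it) can be sketched in POSIX awk. The function name yaml_get is made up for this example, and it rests on the same assumptions as the sed version: section headers are unindented lines of the form name:, and keys sit on indented lines of the form key: value. It is not a YAML parser.

```shell
# yaml_get SECTION KEY   (reads YAML-like text on stdin)
# Print the value of the first "KEY: value" line found under "SECTION:".
# (Hypothetical helper; handles only this simple two-level layout.)
yaml_get() {
    awk -v section="$1" -v key="$2" '
        # An unindented, non-comment line starts a new top-level section.
        /^[^ \t#]/ { in_section = ($0 == (section ":")); next }
        in_section {
            line = $0
            sub(/^[ \t]+/, "", line)              # strip the indentation
            if (index(line, (key ":")) == 1) {    # line begins with "key:"
                sub(/^[^:]*:[ \t]*/, "", line)    # drop "key:" and spaces
                print line
                exit                              # first occurrence only
            }
        }
    '
}

value='service:
  vhost: host1
  port: 8080
database:
  vhost: host2
  port: 8080'

printf '%s\n' "$value" | yaml_get service vhost    # prints: host1
```

Note the quotes around "$value": without them the shell would join all lines into one before awk sees them.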
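You can also try the command-line tool yq, written in Go (the read subcommand below is yq v3 syntax; in v4 the equivalent is yq eval '.service.vhost' file.yaml):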
% yq read file.yaml service.vhost
host1
When I run sed as below:
sed -e 's,</Context>,<Resource name="ABC" password="'"$DB_PASS"'"/>\n&,' -i /path
sed drops the backslash and mangles other characters.
For example, with
DB_PASS='1!2#3#4$5%6^7&8*9(0)[{]}\|'
the output is
<Resource name="jdbc/KARDB"
password=""1!2#3#4$5%6^7</Context>8*9(0)[{]}|""/> </Context>
If my DB_PASS contains a backslash, &, a single quote, a double quote or any other special character, I don't want sed to change the password contents; it should be substituted exactly as it is.
Thanks,
Kusuma
The specific problem you run into is that your password contains &, which in the replacement part of an s command refers to the matched text.
As has become a litany of mine, it is generally not a good idea to substitute shell variables into sed code precisely for reasons like these: sed cannot differentiate between code and your data. This is one of the more harmless things that could go wrong; imagine what GNU sed would have done if someone had entered a password like rm -Rf /,e #.
A direct replacement with GNU awk (it has to be gawk because RT is GNU-specific) could be
DB_PASS="$DB_PASS" gawk -v RS='</Context>' '{ printf "%s", $0; if(RT == RS) { print "<Resource name=\"jdbc/KARDB\" password=\"" ENVIRON["DB_PASS"] "\"/>" } printf "%s", RT }'
The usual -v dbpass="$DB_PASS" trick does not work here because you don't even want escape sequences interpreted, so we forcefully add DB_PASS to awk's environment and take it from there. Add -i inplace to awk's options if you want the file to be changed in place and you have GNU awk 4.1.0 or later, although I usually use cp file file~; awk ... file~ > file to have a backup in case things go wrong.
There's a problem, however: how do you handle " in passwords? It might be a good idea to do something like dbpass = ENVIRON["DB_PASS"]; gsub(/"/, "\\&quot;", dbpass); and use dbpass in the output (the & has to be written as \\& because a bare & in gsub's replacement stands for the matched text). The same goes for all other critical characters, such as <, > and & itself.
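The escaping step can also be sketched as a small shell filter. This is only an illustration under the assumption that the password ends up in a double-quoted XML attribute; the function name xml_escape_attr is made up for this example. Here sed is safe to use because the password arrives as data on stdin rather than being spliced into the sed script.

```shell
# xml_escape_attr   (reads text on stdin)
# Escape the characters that are special inside a double-quoted XML attribute.
# & must be handled first so the entities produced later are not re-escaped.
# In each replacement, \& is a literal ampersand.
xml_escape_attr() {
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e 's/"/\&quot;/g'
}

DB_PASS='1!2#3#4$5%6^7&8*9(0)[{]}\|"<'
printf '%s\n' "$DB_PASS" | xml_escape_attr
# prints: 1!2#3#4$5%6^7&amp;8*9(0)[{]}\|&quot;&lt;
```

Backslashes and the other punctuation pass through untouched, since none of them appear in the sed expressions.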
Since you appear to be parsing XML data, it would be better to use an XML-based tool, so that you don't run into problems when the Context tag is empty and written as <Context/>, and to avoid the above problem (which could be a rather big one). For example, with xmlstarlet you could do something like this:
xmlstarlet ed -s '//Context' -t elem -n Resource -i '//Context/Resource[not(@name)]' -t attr -n name -v 'jdbc/KARDB' -i '//Context/Resource[@name="jdbc/KARDB"]' -t attr -n password -v "$DB_PASS"
This consists of three steps:
-s '//Context' -t elem -n Resource
inserts an empty Resource subnode under every Context node (this is fine if there's only one; otherwise you might want to specify it in more detail)
-i '//Context/Resource[not(@name)]' -t attr -n name -v 'jdbc/KARDB'
inserts an attribute name="jdbc/KARDB" in every Context/Resource node that has no name (which is hopefully only the one we just inserted), and
-i '//Context/Resource[@name="jdbc/KARDB"]' -t attr -n password -v "$DB_PASS"
inserts a password attribute with the value of $DB_PASS (properly quoted) in every Resource node under a Context node whose name attribute has the value jdbc/KARDB (again, this should be only the node we just inserted). For a more perfect solution that inserts the whole node in one go, you'll want to take a closer look at xslt, but if you were happy with the original sed approach, none of the shortcomings of this approach should be a problem.
I have a file that looks something like this:
# cat $file
...
ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 permit tcp 10.11.10.1 0.0.0.7 eq www 443 10.12.10.0 0.0.0.63
 deny ip any any log
ip access-list extended CAT-IN
 permit icmp 10.13.10.0 0.0.0.255 any
 permit ip 10.14.10.0 0.0.0.255 host 10.15.10.10
 permit tcp 10.16.10.0 0.0.0.255 host 10.17.10.10 eq smtp
...
I want to be able to search by name (using a script) to get 'section' output for independent access-lists. I want the output to look like this:
# grep -i dog $file | sed <options??>
ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 permit tcp 10.11.10.1 0.0.0.7 eq www 443 10.12.10.0 0.0.0.63
 deny ip any any log
...with no further output of inapplicable non-indented lines.
I have tried the following:
grep -A 10 DOG $file | sed -n '/^[[:space:]]\{1\}/p'
...Which only gives me the 10 lines after DOG which begin with a single space (including lines not applicable to the searched access-list).
sed -n '/DOG/,/^[[:space:]]\{1\}/p' $file
...Which gives me the line containing DOG, and the next line beginning with a single space. (Need all the applicable lines of the access-list...)
I want the line containing DOG, and all lines after DOG which begin with a single space, until the next un-indented line. There are too many variables in the content to depend on any patterns other than the leading space (there is not always a deny on the end, etc...).
Using GNU sed (Linux):
name='dog' # case-INsensitive name of section to extract
sed -n "/$name/I,/^[^[:space:]]/ { /$name/I {p;d}; /^[^[:space:]]/q; p }" file
To make matching case-sensitive, remove the I after both occurrences of /$name/ above.
-n suppresses default output so that output must be requested explicitly inside the script with commands such as p.
Note the use of double quotes ("...") around the sed script, so as to allow references to the shell variable $name: The double quotes ensure that the shell variable references are expanded BEFORE the script is handed to sed (sed itself has no access to shell variables).
Caveat: This technique is tricky, because (a) you must use shell escaping to escape shell metacharacters you want to pass through to sed, such as $ as \$, and (b) the shell-variable value must not contain sed metacharacters that could break the sed script; for generic escaping of shell-variable values for use in sed scripts, see this answer of mine, or use my awk-based answer.
/$name/I,/^[^[:space:]]/ uses a range to match the line of interest (/$name/I; the trailing I is GNU sed's case-insensitivity matching option) through the start of the next section (/^[^[:space:]]/ - i.e., the next line that does NOT start with whitespace); since sed ranges are always inclusive, the challenge is to selectively remove the last line of the range, IF it is the start of the next section - note that this will NOT be the case if the section of interest is the LAST one in the file.
Note that the commands inside { ... } are only executed for each line in the range.
/$name/I {p;d}; unconditionally prints the 1st line of the range: d deletes the line (which has already been printed) and starts the next cycle (proceeds to the next input line).
/^[^[:space:]]/q matches the last line in the range, IF it is the next section's first line, and quits processing altogether (q), without printing the line.
p is then only reached for section-interior lines and prints them.
Note:
The assumption is that header lines can be identified by NOT starting with a whitespace char., and that any other lines are non-header lines - if more sophisticated matching is required, see my awk-based answer.
This solution has the slight disadvantage that the range regexes must be duplicated, although you could mitigate that with shell variables.
FreeBSD/macOS sed can almost do the same, except that it lacks the case-insensitivity option, I.
name='DOG' # case-SENSITIVE name of section to extract
sed -n -e "/$name/,/^[^[:space:]]/ { /$name/ {p;d;}; /^[^[:space:]]/q; p; }" file
Note that FreeBSD/macOS sed generally has stricter syntax requirements, such as requiring the ; after a command even when it is followed by }.
If you do need case-insensitivity, see my awk-based answer.
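awk -v found=0 '
/DOG/ {
    found = 1    # header of the wanted section: start printing
    print
    next
}
/^[[:space:]]/ {
    if (found) print    # indented line: print only inside the wanted section
    next
}
{ found = 0 }    # any other unindented line starts a different section
' "$file"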
You can substitute any ERE in place of /DOG/, such as /(DOG)|(CAT)/, and the rest of the script will do the work. You can condense it if you like of course.
Note that just because a line begins with a space, that doesn't mean there is only one space. /^[[:space:]]{1}/ will match the leading space, even in a string like
   nonspace
meaning it is equivalent to /^[[:space:]]/. If your format is so rigid that there must always be exactly one space, use /^[[:space:]][^[:space:]]/ instead. Lines like the one with "nonspace" above will then not be matched.
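The difference between the three patterns is easy to check with grep -c, which counts matching lines. The sample text below is made up for this illustration. Note that grep uses basic regular expressions by default, so the interval is written \{1\} there.

```shell
# Three lines: one leading space, three leading spaces, no leading space.
sample=' one leading space
   three leading spaces
no leading space'

# Any line that starts with whitespace:
printf '%s\n' "$sample" | grep -c '^[[:space:]]'               # prints 2

# The {1} interval changes nothing, because the pattern is not anchored
# at the end and more whitespace may follow:
printf '%s\n' "$sample" | grep -c '^[[:space:]]\{1\}'          # prints 2

# Exactly one whitespace character, then a non-whitespace character:
printf '%s\n' "$sample" | grep -c '^[[:space:]][^[:space:]]'   # prints 1
```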
I added a second answer because mklement0 pointed out a flaw in my logic.
This is another very simple way to do it in Perl:
perl -ne ' /^\w+/ && {$p=0}; /DOG/ && {$p=1}; $p && {print}'
EXAMPLES:
cat /tmp/file | perl -ne ' /^\w+/ && {$p=0}; /DOG/ && {$p=1}; $p && {print}'
ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 permit tcp 10.11.10.1 0.0.0.7 eq www 443 10.12.10.0 0.0.0.63
 deny ip any any log
cat /tmp/file | perl -ne ' /^\w+/ && {$p=0}; /CAT/ && {$p=1}; $p && {print}'
ip access-list extended CAT-IN
 permit icmp 10.13.10.0 0.0.0.255 any
 permit ip 10.14.10.0 0.0.0.255 host 10.15.10.10
 permit tcp 10.16.10.0 0.0.0.255 host 10.17.10.10 eq smtp
EXPLANATION:
If the line starts with a word character ([A-Za-z0-9_]), $p is set to false.
If the line contains PATTERN (DOG in this case), $p is set to true.
If $p is true, the line is printed.
@mklement0 squeezed my already-inscrutable sed down to this:
sed '/^ip/!{H;$!d};x; /DOG/I!d'
which swaps accumulated multiline groups into the pattern buffer for processing -- the main logic (/DOG/I!d here) operates on whole groups.
The /^ip/! identifies continuation lines by the absence of a first-line marker and accumulates them, so the x only runs when an entire group has been accumulated.
Some corner cases don't apply here:
The first x swaps in a phantom empty group at the start. If that doesn't get dropped during ordinary processing, adding a 1d fixes that.
The last x also swaps out the last line of the file. That's usually just the last line of the last group, already accumulated by the H, but if some command might produce one-line groups you need to supply a fake one at the end, e.g. with echo "header phantom" | sed '/^header/!{H;$!d};x' realdata.txt - or { showgroups; echo header phantom; } | sed '/^header/!{H;$!d};x'.
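To watch the group accumulation at work, here is a small run on made-up sample data. It is a case-sensitive variant of the command above (the I flag is GNU-only and is dropped here), with a ; added before the closing } so that stricter sed implementations should accept it too.

```shell
sample='ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 deny ip any any log
ip access-list extended CAT-IN
 permit icmp 10.13.10.0 0.0.0.255 any'

# Continuation lines are appended to the hold space (H) and dropped (d),
# except on the last line, where processing falls through to x.
# On each header line, and on the last line, x swaps the accumulated group
# into the pattern space; groups that do not contain DOG are deleted.
printf '%s\n' "$sample" | sed '/^ip/!{H;$!d;};x;/DOG/!d'
```

This prints the DOG-IN header and its two indented lines. The phantom empty group swapped in by the first x does not contain DOG, so it is dropped by the same !d.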
A shorter, POSIX-compliant awk solution, which is a generalized and optimized translation of @Tiago's excellent Perl-based answer.
One advantage of these answers over the sed solutions is that they use literal substring matching rather than regular expressions, which allows passing in arbitrary search strings, without needing to worry about escaping. That said, if you did want regex matching, use the ~ operator rather than the index() function; e.g., index($0, name) would become $0 ~ name. You then have to make sure that the value passed for name either contains no accidental regex metacharacters meant to be treated as literals or is an intentionally crafted regex.
name='DOG' # Case-sensitive name to search for.
awk -v name="$name" '/^[^[:space:]]/ {if (p) exit; if (index($0,name)) {p=1}} p' file
Option -v name="$name" defines awk variable name based on the value of shell variable $name (awk has no direct access to shell variables).
Variable p is used as a flag to indicate whether the current line should be printed, i.e., whether it is part of the section of interest; as long as p is not initialized, it is treated as 0 (false) in a Boolean context.
Pattern /^[^[:space:]]/ matches only header lines (lines that start with a non-whitespace character), and the associated action ({...}) is only processed for them:
if (p) exit exits processing altogether, if p is already set, because that implies that the next section has been reached. Exiting right away has the benefit of not having to process the remainder of the file.
if (index($0, name)) looks for the name of interest as a literal substring in the header line at hand, and, if found (in which case index() returns the 1-based position at which the substring was found, which is interpreted as true in a Boolean context), sets flag p to 1 ({p=1}).
p simply prints the current line, if p is 1, and does nothing otherwise. That is, once the section header of interest has been found, it and subsequent lines are printed (up until the next section or the end of the input file).
Note that this is an example of a pattern-only command: only a pattern (condition) is specified, without an associated action ({...}), in which case the default action is to print the current line, if the pattern evaluates to true. (That technique is used in the common shorthand 1 to simply unconditionally print the current record.)
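To see the flag logic end to end, the command can be wrapped in a function and fed some sample data. The function name print_section and the shortened sample are made up for this illustration. One deliberate deviation: the header pattern is written as /^[^ \t]/ instead of /^[^[:space:]]/, on the assumption that some older awk builds do not understand POSIX character classes.

```shell
# print_section NAME   (reads the config on stdin)
# Print the first section whose unindented header line contains NAME
# as a literal substring. (Hypothetical wrapper around the awk command above.)
print_section() {
    awk -v name="$1" '
        # Header line: stop if a section is already being printed,
        # otherwise raise the flag when the name occurs in the header.
        /^[^ \t]/ { if (p) exit; if (index($0, name)) p = 1 }
        p   # pattern-only rule: print the line while the flag is set
    '
}

sample='ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 deny ip any any log
ip access-list extended CAT-IN
 permit icmp 10.13.10.0 0.0.0.255 any'

printf '%s\n' "$sample" | print_section DOG
```

With DOG this prints the first three lines and exits as soon as the CAT-IN header is read; with CAT it prints the last two.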
If case-INsensitivity is needed:
name='dog' # Case-INsensitive name to search for.
awk -v name="$name" \
'/^[^[:space:]]/ {if(p) exit; if(index(tolower($0),tolower(name))) {p=1}} p' file
Caveat: The BSD-based awk that comes with macOS (still applies as of 10.12.1) is not UTF-8-aware: the case-insensitive matching won't work with non-ASCII letters such as ü.
GNU awk alternative, using the special IGNORECASE variable:
awk -v name="$name" -v IGNORECASE=1 \
'/^[^[:space:]]/ {if(p) exit; if(index($0,name)) {p=1}} p' file
Another POSIX-compliant awk solution:
name='dog' # Case-insensitive name of section to extract.
awk -v name="$name" '
index(tolower($0),tolower(name)) {inBlock=1; print; next} # 1st section line found.
inBlock && !/^[[:space:]]/ {exit} # Exit at start of next section.
inBlock # Print 2nd, 3rd, ... section line.
' file
Note:
next skips the remaining pattern-action pairs and proceeds to the next line.
/^[[:space:]]/ matches lines that start with at least one whitespace char. As @Chrono Kitsune explains in his answer, if you wanted to match lines that start with exactly one whitespace char., use /^[[:space:]][^[:space:]]/. Also note that, despite its name, character class [:space:] matches ANY form of whitespace, not just spaces - see man isspace.
There's no need to initialize flag variable inBlock, as it defaults to 0 in numeric/Boolean contexts.
If you have GNU awk, you can more easily achieve case-insensitive matching by setting the IGNORECASE variable to a nonzero value (-v IGNORECASE=1) and simply using index($0, name) inside the program.
A GNU awk solution, IF you can assume that all section header lines start with 'ip' (so as to break the input into sections that way, rather than looking for leading whitespace):
awk -v RS='(^|\n)ip' -F'\n' -v name="$name" -v IGNORECASE=1 '
index($1, name) { sub(/\n$/, ""); print "ip" $0; exit }
' file
-v RS='(^|\n)ip' breaks the input into records by lines that fall between line-starting instances of string 'ip'.
-F'\n' then breaks each record into fields ($1, ...) by lines.
index($1, name) looks for the name on the current record's first line - case-INsensitively, thanks to -v IGNORECASE=1.
sub(/\n$/, "") removes any trailing \n, which can stem from the section of interest being the last in the input file.
print "ip" $0 prints the matching record, comprising the entire section of interest; since, however, the record doesn't include the separator, 'ip', it is prepended.
The simplest way I can think of is: sed '/DOG/, /^ip/ !d' | sed '$d'
cat file | sed '/DOG/, /^ip/ !d' | sed '$d'
ip access-list extended DOG-IN
 permit icmp 10.10.10.1 0.0.0.7 any
 permit tcp 10.11.10.1 0.0.0.7 eq www 443 10.12.10.0 0.0.0.63
 deny ip any any log
Explanation:
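The first sed command keeps the lines from the one containing DOG through the next line starting with ip.
The second sed command deletes the last line (which is the line starting with ip). Note that if the DOG section is the last one in the file, there is no following ip line, so this instead deletes the final line of the section itself.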