Both of the regexes below work In my case.
grep \s
grep ^[[:space:]]
However all those below fail. I tried both in git bash and putty.
grep ^\s
grep ^\s*
grep -E ^\s
grep -P ^\s
grep ^[\s]
grep ^(\s)
The last one even produces a syntax error.
If I try ^\s in debuggex it works.
Debuggex Demo
How do I find lines starting with whitespace characters with grep ? Do I have to use [[:space:]] ?
grep \s works for you because your input contains s. Here, you escape s and it matches the s, since it is not parsed as a whitespace matching regex escape. If you use grep ^\\s, you will match a string starting with whitespace since the \\ will be parsed as a literal \ char.
A better idea is to enable POSIX ERE syntax with -E and quote the pattern:
grep -E '^\s' <<< "$s"
See the online demo:
s=' word'
grep ^\\s <<< "$s"
# => word
grep -E '^\s' <<< "$s"
# => word
Related
I have the following strings:
text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1
with a regular expression, I want to extract:
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
I have tried with regex101.com the following expression: ([^:]+)(?::[^:]+){1}$
and it worked (only for the first string)
But if I try in bash, it does not
echo "text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1" | sed -n "/([^:]+)(?::[^:]+){1}$/p"
It would be much easier with cut without any regex:
cut -d: -f3- file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
Non capture groups (?: are not supported in sed and you have to escape the \( \) \{ \} and \+
You can repeat 2 occurrences of : from the start of the string and replace that with an empty string.
sed 's/^\([^:]\+:\)\{2\}//' file
Or using sed -E for extended regexp:
sed -E 's/^([^:]+:){2}//' file
Output
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
Using sed
$ sed s'|\([^:]*:\)\{2\}\(.*\)$|\2|' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
or
$ sed s'|\([^:]*:\)\{2\}||' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
There's no reason to drag sed or other external programs into this; just use bash's built in regular expression matching:
#!/usr/bin/env bash
strings=(text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1)
for s in "${strings[#]}"; do
[[ $s =~ ^([^:]*:){2}(.*) ]] && printf "%s\n" "${BASH_REMATCH[2]}"
done
Heck, you don't need regular expressions in bash:
printf "%s\n" "${s#*:*:}"
awk
string='ext/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'
awk -vFS=: -vOFS=: '{$1=$2="";gsub(/^::/,"")}1' <<<"$string"
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
absolutely no need to use anything that requires regex-backreferences, since the regex anchoring is right at the line head anyway :
mawk ++NF OFS= FS='^[^:]*:[^:]*:'
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
According to https://regex101.com/r/NLSymf/3, the following regex:
\[\[(foo)([^\]]+)\]\]
(full) matches the string [[foo>test1|test2]], but this seems to not be understood by sed, since:
echo "[[foo>test1|test2]]" | sed -E -e '/\[\[(foo)([^\]]+)\]\]/d'
(which should return an empty string) returns:
[[foo>test1|test2]]
What is the regex that matches [[foo>test1|test2]] from sed's point of view?
The backslash character loses its escaping capability within a bracket expression. And stray closing brackets in a RE need not be escaped, that's why grep doesn't fail the first pipeline below. See RE Bracket Expression for reference.
$ echo 'a]' | grep -Eo '[^\]]'
a]
$ echo 'a]' | grep -Eo '[^]]'
a
The correct regex would be:
\[\[(foo)([^]]+)]]
I have working example of substitution in online regex tester https://regex101.com/r/3FKdLL/1 and I want to use it as a substitution in sed editor.
echo "repo-2019-12-31-14-30-11.gz" | sed -r 's/^([\w-]+)-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}.gz$.*/\1/p'
It always prints whole string: repo-2019-12-31-14-30-11.gz, but not matched group [\w-]+.
I expect to get only text from group which is repo string in this example.
Try this:
echo "repo-2019-12-31-14-30-11.gz" |
sed -rn 's/^([A-Za-z]+)-[[:alnum:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}.gz.*$/\1/p'
Explanations:
\w will work (not [\w] wich matches either backslash or w), but you should use [[:alnum:]] which is POSIX
For sed, \d isn't a regex class, but an escaped character representing a non-printable character
Add -n to mute sed, with /p to explicitly print matched lines
Additionaly, you could refactor your regex by removing duplication:
echo "repo-2019-12-31-14-30-11.gz" |
sed -rn 's/^([[:alnum:]]+)-[[:digit:]]{4}(-[[:digit:]]{2}){5}.gz.*$/\1/p'
Looks like a job for GNU grep :
echo "repo-2019-12-31-14-30-11.gz" | grep -oP '^\K[[:alpha:]-]+'
Displays :
repo-
On this example :
echo "repo-repo-2019-12-31-14-30-11.gz" | grep -oP '^\K[[:alpha:]-]+'
Displays :
repo-repo-
Which I think is what you want because you tried with [\w-]+ on your regex.
If I'm wrong, just replace the grep command with : grep -oP '^\K\w+'
I'd like to use grep to match all characters before the first whitespace.
grep "^[^\s]*" filename.txt
did not work. Instead, all characters before the first s are matched. Is there no \s available in grep?
You can also try with perl regex flag P and o flag to show only matched part in the output:
grep -oP "^\S+" filename.txt
With a POSIX character class:
grep -o '^[^[:blank:]]*' filename.txt
As for where \s is available:
POSIX grep supports only Basic Regular Expressions or, when called grep -E, Extended Regular Expressions, both of which have no \s
GNU grep supports \s as a synonym for [[:space:]]
BSD grep doesn't seem to support \s
Alternatively, you could use awk with the field separator explicitly set to a single space so leading blanks aren't ignored:
awk -F ' ' '{ print $1 }'
Assuming a simple text file:
123.123.123.123
I would like to replace the IP inside of it with 222.222.222.222. I have tried the below but nothing changes, however the same regex seems to work in this Regexr
sed -i '' 's/(\d{1,3}\.){3}\d{1,3}/222.222.222.222/' file.txt
Am I missing something?
Two problems here:
sed doesn't like PCRE digit property \d, use range: [0-9] or POSIX [[:digit:]]
You need to use -r flag for extended regex as well.
This should work:
s='123.123.123.123'
sed -r 's/([0-9]{1,3}\.){3}[0-9]{1,3}/222.222.222.222/' <<< "$s"
222.222.222.222
Better would be to use anchors to avoid matching unexpected input:
sed -r 's/^([0-9]{1,3}\.){3}[0-9]{1,3}$/222.222.222.222/' <<< "$s"
PS: On OSX use -E instead of -r:
sed -E 's/^([0-9]{1,3}\.){3}[0-9]{1,3}$/222.222.222.222/' <<< "$s"
222.222.222.222
You'd better use -r, as indicated by anubhava.
But in case you don't have it, you have to escape every single (, ), { and }. And also, use [0-9] instead of \d:
$ sed 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/222.222.222.222/' <<< "123.123.123.123"
222.222.222.222