Regular Expression for "D210" for Linux?

The title says it all. Right now I'm using:
grep "^D[\d][\d][\d]" file.txt
to no avail.

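\d is not recognized unless the -P (--perl-regexp) option is given (assuming GNU grep). Putting it in brackets does not help: without -P, the backslash is literal inside a bracket expression, so [\d] matches a backslash or the letter d.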
$ echo D210 | grep '^D\d\d\d'
$ echo D210 | grep -P '^D\d\d\d'
D210
$ echo D210 | grep -P '^D\d{3}'
D210
If your grep does not accept -P, use [0-9] or [[:digit:]]:
$ echo D210 | grep '^D[0-9][0-9][0-9]'
D210
$ echo D210 | grep '^D[[:digit:]][[:digit:]][[:digit:]]'
D210
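If the pattern is needed in more than one place, it can be wrapped in a small function. This is only a sketch: the name starts_with_d3 is made up for illustration, and it assumes a grep that supports -E and -q, as POSIX requires. The interval {3} replaces the three repeated [0-9]:

```shell
# Hypothetical helper: succeeds if the argument starts with "D"
# followed by three digits.
starts_with_d3() {
    printf '%s\n' "$1" | grep -Eq '^D[0-9]{3}'
}

starts_with_d3 D210 && echo "D210 matches"
starts_with_d3 D21x || echo "D21x does not match"
```

The -q option suppresses output, so only the exit status is used.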

Related

Parse Args with R.E

Can you help me?
I want to match each of these: {'$', '$0', '$qwerty', '$123'} (a $ followed by any characters, repeated zero or more times).
In a shell script:
echo "$" | grep '^\$.*$'
$
This works.
echo "$1" | grep '^\$.*$'
echo "$hello" | grep '^\$.*$'
echo "$Qwerty123" | grep '^\$.*$'
These do not work.
Thanks for any reply.

Use single quotes, not double quotes; inside double quotes the shell expands $1 before echo runs, so grep never sees the dollar sign. Like this:
echo '$1' | grep '^\$.*$'
$1
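To see the difference between the quoting styles side by side, here is a small sketch. The helper starts_with_dollar and the variable hello are made up for illustration:

```shell
# Hypothetical helper: prints "yes" if its argument starts with a literal $.
starts_with_dollar() {
    if printf '%s\n' "$1" | grep -q '^\$'; then
        echo yes
    else
        echo no
    fi
}

hello=world                      # illustrative variable, not from the question
starts_with_dollar "$hello"      # double quotes: grep sees "world"
starts_with_dollar '$hello'      # single quotes: grep sees "$hello"
starts_with_dollar "\$hello"     # an escaped $ also stays literal
```

The first call prints no and the other two print yes.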

sed regular expression extraction

I have a range of strings which conform to one of the following two patterns:
("string with spaces",4)
or
(string_without_spaces,4)
I need to extract the "string" via a bash command, and so far have found a pattern that works for each, but not for both.
echo "(\"string with spaces\",4)" | sed -n 's/("\(.*\)",.*)/\1/ip'
output:string with spaces
echo "(string_without_spaces,4)" | sed -n 's/(\(.*\),.*)/\1/ip'
output:string_without_spaces
I have tried using "\?, but the closing " is still captured when it is present:
echo "(SIM,0)" | sed -n 's/("\?\(.*\)"\?,.*)/\1/ip'
output: SIM
echo "(\"SIM\",0)" | sed -n 's/("\?\(.*\)"\?,.*)/\1/ip'
output: SIM"
Can anyone suggest a pattern that would extract the string in both scenarios? I am not tied to sed, but would prefer not to install perl in this environment.

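How about using [^"] instead of . so that the capture group cannot match a "? With .*, the greedy group swallows the closing quote and the optional "\? after it matches nothing.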
$ echo '("string with spaces",4)' | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
string with spaces
$ echo "(string_without_spaces,4)" | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
string_without_spaces
$ echo "(SIM,0)" | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
SIM
$ echo '("SIM",0)' | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
SIM
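Note that \? is a GNU extension to basic regular expressions. A rough equivalent with extended regular expressions, assuming your sed accepts -E (GNU and BSD sed do), might look like this. The function name extract_name is made up for illustration:

```shell
# Hypothetical helper: print the first field of ("name",n) or (name,n).
# With -E, \( and \) are literal parentheses and ( ) form the capture group.
extract_name() {
    printf '%s\n' "$1" | sed -n -E 's/^\("?([^"]*)"?,.*\)$/\1/p'
}

extract_name '("string with spaces",4)'
extract_name '(string_without_spaces,4)'
extract_name '("SIM",0)'
```

Because of -n and the p flag, input that does not fit either shape produces no output at all.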

Grep and sed returning only first match

I am trying to extract the title and description from an RSS feed. I have written the following script to return all the titles in the feed, but it returns only the first title from the XML:
curl "http://www.dailystar.com.lb/RSS.aspx?id=113" 2>/dev/null | grep -E -o "<title>(.*)</title>" |sed -e 's,.*<title>\(.*\)</title>.*,\1,g' | less
How can I also find the description?

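You can use grep -P. \K drops everything matched so far from the output, and the lookahead (?=</title>) requires the closing tag without including it: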
curl "http://www.dailystar.com.lb/RSS.aspx?id=113" 2>/dev/null |\
grep -oP "<title>\K[\s\S]*?(?=</title>)"
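The same pattern should work for the description if the tag name is made a parameter. A minimal sketch, assuming GNU grep built with PCRE support; the function extract_tag and the sample feed below are made up for illustration and are not the real feed:

```shell
# Made-up sample standing in for the downloaded feed.
feed='<item><title>First</title><description>One</description></item><item><title>Second</title><description>Two</description></item>'

# Hypothetical helper: print the text inside every <tag>...</tag> on stdin.
# $1 is the tag name.
extract_tag() {
    grep -oP "<$1>\K.*?(?=</$1>)"
}

printf '%s\n' "$feed" | extract_tag title
printf '%s\n' "$feed" | extract_tag description
```

grep works line by line, so this only finds elements that open and close on the same line.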

First put each title and description on its own line. Here is an example:
curl "http://www.dailystar.com.lb/RSS.aspx?id=113" 2>/dev/null | \
grep -E -o "<title>(.*)</title>" | \
sed -e 's,<\(title\|description\)>,\n<\1>,g' |
sed -n 's,.*<title>\(.*\)</title>.*,\1,gp'
For the description:
curl "http://www.dailystar.com.lb/RSS.aspx?id=113" 2>/dev/null | \
grep -E -o "<title>(.*)</title>" | \
sed -e 's,<\(title\|description\)>,\n<\1>,g' | \
sed 's,<title>\([^<]*\)</title>,T:\1,' | \
sed 's,<description>\([^<]*\)</description>,D:\1,' | \
sed -n 's/[DT]://p'

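You should use a non-greedy match (.*?) instead of the greedy (.*) to get all the titles. Non-greedy quantifiers are a PCRE feature, so this needs grep -P rather than grep -E; sed has no non-greedy quantifier, so use [^<]* there:
curl "http://www.dailystar.com.lb/RSS.aspx?id=113" 2>/dev/null | grep -oP "<title>(.*?)</title>" | sed -e 's,.*<title>\([^<]*\)</title>.*,\1,g' | less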
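Non-greedy quantifiers are not part of POSIX regular expressions. Without PCRE, a negated character class does a similar job, as long as the title text contains no < (so not for CDATA sections). A sketch that assumes only grep -o, which GNU and BSD grep provide; the helper name titles and the sample line are made up for illustration:

```shell
# Hypothetical helper: print the text of every <title> element on stdin.
titles() {
    # [^<]* cannot run past the next tag, so each title matches separately;
    # the sed step then strips the surrounding tags.
    grep -o '<title>[^<]*</title>' | sed -e 's/<[^>]*>//g'
}

sample='<title>First</title><link>x</link><title>Second</title>'
printf '%s\n' "$sample" | titles
```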

Can not extract the capture group with either sed or grep

I want to extract the value from a key=value pair, but I cannot.
Example I tried:
echo employee_id=1234 | sed 's/employee_id=\([0-9]+\)/\1/g'
But this gives employee_id=1234 and not 1234 which is actually the capture group.
What am I doing wrong here? I also tried:
echo employee_id=1234| egrep -o employee_id=([0-9]+)
but no success.

1. Using grep -Eo (egrep is deprecated):
echo 'employee_id=1234' | grep -Eo '[0-9]+'
1234
2. Using grep -oP (PCRE):
echo 'employee_id=1234' | grep -oP 'employee_id=\K([0-9]+)'
1234
3. Using sed:
echo 'employee_id=1234' | sed 's/^.*employee_id=\([0-9][0-9]*\).*$/\1/'
1234

To expand on anubhava's answer number 2, the general pattern to have grep return only the capture group is:
$ regex="$precedes_regex\K($capture_regex)(?=$follows_regex)"
$ echo "$some_string" | grep -oP "$regex"
so
# matches and returns b
$ echo "abc" | grep -oP "a\K(b)(?=c)"
b
# no match
$ echo "abc" | grep -oP "z\K(b)(?=c)"
# no match
$ echo "abc" | grep -oP "a\K(b)(?=d)"

Using awk
echo 'employee_id=1234' | awk -F= '{print $2}'
1234

Use sed -E for extended regex, where + works without a backslash:
echo employee_id=1234 | sed -E 's/employee_id=([0-9]+)/\1/g'
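One thing to watch with the sed versions: lines that do not match are printed unchanged. A sketch that prints only the captured IDs by combining -n with the p flag; get_employee_id is a made-up name, and -E is assumed to be supported by your sed:

```shell
# Hypothetical helper: print the number after employee_id= for each
# matching line on stdin, and nothing for lines without one.
get_employee_id() {
    sed -n -E 's/^.*employee_id=([0-9]+).*$/\1/p'
}

printf '%s\n' 'employee_id=1234' 'no id here' 'x employee_id=42 y' | get_employee_id
```

Only the first and third input lines produce output.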

You are specifically asking for sed, but in case you may use something else - any POSIX-compliant shell can do parameter expansion which doesn't require a fork/subshell:
foo='employee_id=1234'
var=${foo%%=*}
value=${foo#*=}
 
$ echo "var=${var} value=${value}"
var=employee_id value=1234
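If the input might not contain an = at all, the two expansions can be guarded with a case pattern. A sketch only: split_pair and the variables key and value are names made up for illustration:

```shell
# Hypothetical helper: split "key=value" into $key and $value using only
# shell built-ins. Returns 1 if the argument contains no "=".
split_pair() {
    case $1 in
        *=*) key=${1%%=*}; value=${1#*=} ;;
        *)   return 1 ;;
    esac
}

if split_pair 'employee_id=1234'; then
    echo "key=${key} value=${value}"
fi
```

%%=* removes the longest suffix starting at an =, and #*= removes the shortest prefix ending at one, so everything after the first = ends up in value.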

Replace string if first letter is uppercase using sed

I tried to write a sed answer to the question "Edit a file using sed/awk" using:
sed -e 's/^[A-Z]/$:$&/' file.txt
but the result is:
wednesday
$:$Weekday
$:$thursday
$:$Weekday
$:$friday
$:$Weekday
$:$saturday
$:$MaybeNot
$:$sunday
$:$MaybeNot
$:$monday
$:$Weekday
$:$tuesday
$:$Weekday
Why does it replace when the first character is lowercase?

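This is a "feature" according to this bug report, caused by unexpected character ordering in the locale, further explained here and here. In this locale the collation order interleaves the cases (a, A, b, B, ..., z, Z), so the range [A-Z] covers every letter except a, and [a-z] covers every letter except Z. Setting LC_ALL=C or using [[:upper:]] and [[:lower:]] avoids the problem: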
$ locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_ALL=
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[A-Z]/./g'
..........................a.........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[a-z]/./g'
.........................Z..........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | LC_ALL=C sed -e 's/[A-Z]/./g'
..........................abcdefghijklmnopqrstuvwxyz
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | LC_ALL=C sed -e 's/[a-z]/./g'
ABCDEFGHIJKLMNOPQRSTUVWXYZ..........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[[:upper:]]/./g'
..........................abcdefghijklmnopqrstuvwxyz
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[[:lower:]]/./g'
ABCDEFGHIJKLMNOPQRSTUVWXYZ..........................
$ sed --version
GNU sed version 4.2.1
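Applied to the command from the question, the fix might look like this. A sketch only: the function name tag_capitalized and the two sample lines are made up for illustration:

```shell
# Hypothetical helper: prefix lines that start with an uppercase letter
# with "$:$". [[:upper:]] does not depend on the locale's collation order.
tag_capitalized() {
    sed -e 's/^[[:upper:]]/$:$&/'
}

printf '%s\n' wednesday Weekday | tag_capitalized
```

In the replacement, & stands for the matched character and $ is literal, so only Weekday gains the prefix.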
