Just give me the words between the "" [duplicate] - regex

This question already has answers here:
Getting values between quotes
(2 answers)
Closed 9 years ago.
I have text lines like this
blahblah"word1"blahblah"word2"blahblah"word3"
I only want the text between the quotes, without the quotes themselves. I could use awk with " as the field separator and then take every second field. However, is there a way to have awk (or another command) return just the words between each pair of quotes, so that I get back word1, word2, word3?
Thanks,

Here you go:
echo 'blahblah"word1"blahblah"word2"blahblah"word3"' | perl -ne 'print map("$_\n", m/"([^"]*)"/g)'

Depends which language you're using, but the regular expression to do this would be:
(?<=^(("[^"]*){2})*")[^"]+(?=")
That example matches everything between "s. If you want it to match only words between "s, use:
(?<=^(("[^"]*){2})*")\w+(?=")
The main difference is that the second example does not allow spaces or most special characters between the "s. The first example allows every character except ", including newlines. Both patterns rely on variable-length lookbehind, which many regex engines do not support.
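Where the regex engine lacks variable-length lookbehind, a similar result can be had by matching the quoted spans first and stripping the quotes afterwards. A minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep do); the sample line is made up for illustration:

```shell
# Hypothetical sample line; note the space inside the first quoted span.
line='blahblah"word 1"blahblah"word2"'

# grep -o prints every match of "..." on its own line; tr then drops the quotes.
words=$(printf '%s\n' "$line" | grep -o '"[^"]*"' | tr -d '"')
printf '%s\n' "$words"
```

Because the pattern uses [^"]*, an empty pair of quotes produces an empty output line.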

Non-robust, but fun:
sed -E 's/(^|")[^"]*("|$)/ /g'
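The awk idea from the question, splitting on " and keeping every second field, can be sketched as follows. It assumes the quotes are balanced on each line; the variable names are only for illustration:

```shell
line='blahblah"word1"blahblah"word2"blahblah"word3"'

# With " as the field separator, the even-numbered fields are the quoted words.
words=$(printf '%s\n' "$line" | awk -F'"' '{ for (i = 2; i <= NF; i += 2) print $i }')
printf '%s\n' "$words"
```

With an unbalanced quote, the text after the last " would be printed as if it were quoted.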

Regexp delete using sed works on regex101 but not with sed [duplicate]

This question already has answers here:
sed multiline delete with pattern
(2 answers)
Closed 2 years ago.
I need to remove strings that match a regex pattern from a text file. On regex101 my pattern matches fine, but when I run it with sed nothing gets deleted:
https://regex101.com/r/oLNrDB/1/
I need to remove all occurrences of all text including newlines between the following 2 strings:
DELIMITER ;;
some text with newlines
DELIMITER ;
The sed command I am using is:
sed '/DELIMITER ;;[\S\s]*?DELIMITER ;/d' myfile.sql;
But the output is identical to the input file. What am I doing wrong?
The problem here is that sed reads files line-by-line and applies the pattern to each line separately. In your case, this means that the one pattern can't match both the starting and finishing delimiter because no one line contains them both.
The sed way of doing this is to use a range address with the delete command, /start pattern/,/end pattern/d, which deletes every line from the one matching the start pattern through the one matching the end pattern, inclusive. For example:
$ cat foo.txt
some text before
DELIMITER ;;
some text with newlines
DELIMITER ;
some text after
$ sed '/DELIMITER ;;/,/DELIMITER ;/d' foo.txt
some text before
some text after
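The same range also handles several blocks in one input. A small self-contained check with made-up sample text; anchoring the patterns with ^ and $ is an added assumption that each delimiter sits alone on its line:

```shell
# Invented sample with two DELIMITER blocks.
input='keep 1
DELIMITER ;;
drop a
DELIMITER ;
keep 2
DELIMITER ;;
drop b
DELIMITER ;
keep 3'

# Delete every line from a "DELIMITER ;;" line through the next "DELIMITER ;" line.
result=$(printf '%s\n' "$input" | sed '/^DELIMITER ;;$/,/^DELIMITER ;$/d')
printf '%s\n' "$result"
```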

Matching negative and positive pattern in one sed [duplicate]

This question already has answers here:
Replace All Lines That Do Not Contain Matched String
(4 answers)
Closed 3 years ago.
I am having trouble writing a sed command that rewrites lines where =sometext= occurs into another pattern, but leaves a line alone when https occurs in it. I have no idea how I should change this command:
sed -i 's/=\([^=]*\)=/{{\1}}/g'
You'll want to read the sed manual about matching lines: https://www.gnu.org/software/sed/manual/sed.html chapter 4:
The following command replaces the word ‘hello’ with ‘world’ only in lines not containing the word ‘apple’:
sed '/apple/!s/hello/world/' input.txt > output.txt
Use multiple blocks, e.g.:
sed '/=sometext=/ { /https/b; s/.../.../; }'
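Applied to the command from the question, the ! form might look like the sketch below. The sample input is invented, and the substitution simply runs on every line that does not contain https:

```shell
# Invented sample: the second line contains https and should stay untouched.
input='see =foo= and =bar=
https://example.com/?a=1&b=2'

# /https/! restricts the s command to lines NOT matching https.
result=$(printf '%s\n' "$input" | sed '/https/!s/=\([^=]*\)=/{{\1}}/g')
printf '%s\n' "$result"
```

Without the /https/! address, the =1&b= part of the URL would be rewritten too.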

Replace string variable with string variable using Sed [duplicate]

This question already has answers here:
"sed" special characters handling
(3 answers)
Is it possible to escape regex metacharacters reliably with sed
(4 answers)
Escape a string for a sed replace pattern
(17 answers)
Closed 5 years ago.
I have a file called ethernet containing multiple lines. I have saved one of these lines as a variable called old_line. The contents of this variable looks like this:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="2r:11:89:89:9g:ah", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
I have created a second variable called new_line that is similar to old_line but with some modifications in the text.
I want to substitute the contents of old_line with the contents of new_line using sed. So far I have the following, but it doesn't work:
sed -i "s/${old_line}/${new_line}/g" ethernet
You need to escape old_line so that none of its characters are treated as regex metacharacters. Luckily, this can be done with sed:
old_line=$(echo "${old_line}" | sed -e 's/[]$.*[\^]/\\&/g' )
sed -i -e "s/${old_line}/${new_line}/g" ethernet
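To see what the escaping step changes, the sketch below runs the substitution on a shortened version of the sample line, once with the raw pattern and once with the escaped one. The shortened strings are assumptions for illustration. Only the pattern is escaped here: a new_line containing /, & or \ would need its own escaping.

```shell
# Shortened, hypothetical versions of the two lines.
old_line='DRIVERS=="?*", KERNEL=="eth*", NAME="eth1"'
new_line='DRIVERS=="?*", KERNEL=="eth*", NAME="eth0"'

# Raw pattern: * is read as a repetition operator, so nothing matches.
raw_result=$(printf '%s\n' "$old_line" | sed -e "s/${old_line}/${new_line}/g")

# Escaped pattern: each metacharacter gets a backslash and is matched literally.
escaped=$(printf '%s\n' "$old_line" | sed -e 's/[]$.*[\^]/\\&/g')
escaped_result=$(printf '%s\n' "$old_line" | sed -e "s/${escaped}/${new_line}/g")

printf '%s\n' "$raw_result" "$escaped_result"
```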
${old_line} contains regex metacharacters such as *, so your sed fails to match.
Use this awk command instead, which uses no regex:
awk -v old="$old_line" -v new="$new_line" 'p = index($0, old) {
# replace the literal match at position p; the trailing 1 prints every line
$0 = substr($0, 1, p-1) new substr($0, p+length(old)) } 1' ethernet
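A variant of the same idea passes the strings through the environment rather than -v, because awk processes backslash escape sequences in -v values. This is a sketch with invented sample data; OLD and NEW are hypothetical variable names:

```shell
old='KERNEL=="eth*", NAME="eth1"'
new='KERNEL=="eth*", NAME="eth0"'

# Invented three-line input; only the middle line contains the old string.
input='first line
KERNEL=="eth*", NAME="eth1"
last line'

result=$(printf '%s\n' "$input" | OLD="$old" NEW="$new" awk '
  {
    p = index($0, ENVIRON["OLD"])   # literal search, no regex involved
    if (p)
      $0 = substr($0, 1, p - 1) ENVIRON["NEW"] substr($0, p + length(ENVIRON["OLD"]))
    print                           # every line is printed, changed or not
  }')
printf '%s\n' "$result"
```

Like the command above, this replaces only the first occurrence on each line.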

Matching pattern containing parentheses with sed [duplicate]

This question already has answers here:
Whether to escape ( and ) in regex using GNU sed
(4 answers)
Closed 4 years ago.
I need to insert '--' at the beginning of a line if the line contains VARCHAR(1000).
A sample of my file:
TRIM(CAST("AP_RQ_MSG_TYPE_ID" AS NVARCHAR(1000))) AP_RQ_MSG_TYPE_ID,
TRIM(CAST("AP_RQ_PROCESSING_CD" AS NVARCHAR(1000)))
AP_RQ_PROCESSING_CD, TRIM(CAST("AP_RQ_ACQ_INST_ID" AS NVARCHAR(11)))
AP_RQ_ACQ_INST_ID, TRIM(CAST("AP_RQ_LOCAL_TXN_TIME" AS NVARCHAR(10)))
AP_RQ_LOCAL_TXN_TIME, TRIM(CAST("AP_RQ_LOCAL_TXN_DATE" AS
NVARCHAR(10))) AP_RQ_LOCAL_TXN_DATE, TRIM(CAST("AP_RQ_RETAILER" AS
NVARCHAR(11))) AP_RQ_RETAILER,
I used this command
sed 's/\(^.*VARCHAR\(1000\).*$\)/--\1/I' *.sql
But the result is not as expected.
Does anyone have an idea what I am doing wrong?
This should do:
sed 's/.*VARCHAR(1000).*/--&/' file
The problem in your sed command is in the regex. By default sed uses BRE, in which ( and ) (the ones wrapping 1000) are literal parentheses. You should not escape them; escaping gives them their special meaning, regex grouping.
You were right to escape the first and last (..), since that is what lets you reference the group later as \1. So your problem comes down to when to escape and when not to. :)
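The difference between the two syntaxes can be checked side by side. A minimal sketch, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed do); the sample line is shortened from the question:

```shell
line='TRIM(CAST("X" AS NVARCHAR(1000))) X,'

# BRE (default): bare parentheses are literal.
bre=$(printf '%s\n' "$line" | sed 's/.*VARCHAR(1000).*/--&/')

# ERE (-E): parentheses group, so literal ones need a backslash.
ere=$(printf '%s\n' "$line" | sed -E 's/.*VARCHAR\(1000\).*/--&/')

# In BRE, \(1000\) is a group, so this pattern matches "VARCHAR1000" instead.
grouped=$(printf 'VARCHAR1000\n' | sed 's/.*VARCHAR\(1000\).*/--&/')

printf '%s\n' "$bre" "$ere" "$grouped"
```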
Use the following sed command:
sed '/VARCHAR(1000)/ s/.*/--\0/' *.sql
The s command applies to all lines containing VARCHAR(1000). It replaces the whole line (.*) with itself (\0) prefixed by --. GNU sed accepts \0 here; & is the portable way to refer to the whole match.
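Since the address already selects the lines, the back-reference can be avoided by substituting at the start of the line. A short sketch with made-up input:

```shell
# Invented two-line sample; only the first line contains VARCHAR(1000).
input='A NVARCHAR(1000) x
B NVARCHAR(11) y'

# On matching lines, replace the empty string at the line start with --.
result=$(printf '%s\n' "$input" | sed '/VARCHAR(1000)/s/^/--/')
printf '%s\n' "$result"
```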
Through awk,
awk '/VARCHAR\(1000\)/ {sub (/^/,"--")}1' infile > outfile

Matching zero or more characters in sed

I was practicing some commands using sed when I was confused by the output of the following command:
echo 'first:second' | sed 's_[^:]*_(&)_g'
My question is: Why would this command only wrap the string "first" and "second" in parentheses?
Shouldn't the colon be wrapped too since I specified "zero or more non-colons" in my regex condition?
Please clarify.
You use
[^:]
which matches any single character except :.
Since the colon is excluded, [^:]* can never include it in a match, so what you see is the normal behavior.
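The effect is easy to compare against a pattern that does include the colon. A small sketch; the result of the first command is the one described in the question:

```shell
# The colon is never part of a match, so only the runs around it are wrapped.
wrapped=$(echo 'first:second' | sed 's_[^:]*_(&)_g')

# For contrast, . matches any character, the colon included.
whole=$(echo 'first:second' | sed 's_.*_(&)_')

printf '%s\n' "$wrapped" "$whole"
```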