Sed replace string with multi-lines - regex

I have a simple sed command:
#!/bin/bash
COMMAND=$1
sed -e "s#COMMAND#$COMMAND#"
The value of COMMAND should hold several commands, one per line, but I cannot figure out how to pass them to sed so that sed puts every command on a new line. What I have tried is:
./script 'ls\n date\n uname\n'
Regards!

If I'm understanding your question, you are looking to replace a representation of newlines within a string (i.e. a backslash character, followed by an 'n') as actual printed newlines.
The following script takes a single quoted string (the input as shown in your question) containing the literals '\n' and converts those into actual new lines.
#!/bin/bash
echo -n "$1" | sed -e 's#\\n#\n#g'
Example usage:
[user@localhost ~]$ ./simple_sed.sh 'ls\ndate\nuname\n'
ls
date
uname
The changes needed to your script are to:
echo the argument, since otherwise sed has no input of its own and just waits for data on standard input;
match the \\n and replace it with a newline; and
add a 'g' flag at the end, which makes sed keep searching within a line after a replacement has occurred (read: multiple \n are substituted in a single line).
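The three changes above can be collected into a small function. This is only a sketch: the name expand_newlines is made up for illustration, and \n in the replacement is assumed to be understood by your sed (GNU sed does; other implementations may not).

```shell
#!/bin/bash
# expand_newlines is a hypothetical helper name, not part of the original script.
expand_newlines() {
    # printf '%s' prints the argument verbatim, without a trailing newline.
    # s#\\n#\n#g turns every literal backslash-n into a real newline
    # (assumes GNU sed, which understands \n in the replacement).
    printf '%s' "$1" | sed -e 's#\\n#\n#g'
}

expand_newlines 'ls\ndate\nuname\n'
```

Quoting "$1" matters here: without the quotes the shell would split the argument on whitespace before sed ever sees it.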

Related

sed regex command to output lines that end with html?

I need a sed regex command that will output every line in a file that ends with 'html', and does NOT start with 'a'.
Would my current code work?
sed 's/\[^a]\*\.\(html)\/p' text.txt
The sed command would be
sed -n '/^[^a].*html$/p'
But the canonical command to print matching lines is grep:
grep '^[^a].*html$'
sed just overcomplicates things here; you can use grep to handle that easily!
egrep "^[^a].+\.html$" sourcefile > text.txt
//egrep opens the file itself
egrep "^[^a].+\.html$" < sourcefile > text.txt
//the input redirect connects sourcefile to the stdin file descriptor
//for this stage of the pipeline
These two are functionally equivalent.
Or, when the lines come from another command:
some_command | egrep "^[^a].+\.html$" > text.txt
//egrep reads the lines from the pipe on stdin; xargs is not needed here, because xargs would hand each line to egrep as a file name instead of as text to search
//^ anchors the match at the start of the line
//. means any one character
//+ means the previous expression (which can be a
//(pattern group), a [range], etc.) one or more times
//\. escapes the dot so that it matches a literal period
//$ anchors the match at the end of the line
//egrep is short for extended regular expression grep (the same as grep -E), which is really
//nice
(assuming you aren't using a pipe or cat, etc)
You can convert a newline-delimited file into a single input line with this command:
tr '\n' ' ' < file
//It converts every newline into a space!
Anyway, get creative with simple utilities and you can do a lot:
xargs, grep and tr are a good combo that is easy to learn, without the sedness of it all.
Don't do this with sed. Do it with two separate calls to grep:
grep -v '^a' file.txt | grep 'html$'
The first grep gets all the lines that do not start with "a", and sends the output from that into the second grep, which pulls out all the lines that end with "html".
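To compare the single-regex approach with the two-grep pipeline, here is a small sketch. The sample file and the function names one_pass and two_pass are made up for illustration.

```shell
# Hypothetical sample file; its contents are invented for this comparison.
sample=$(mktemp)
printf '%s\n' 'index.html' 'about.html' 'notes.txt' 'html' > "$sample"

# One regex: the first character must exist and must not be "a".
one_pass() { grep '^[^a].*html$' "$1"; }

# Two greps: drop lines starting with "a", then keep lines ending in "html".
two_pass() { grep -v '^a' "$1" | grep 'html$'; }

one_pass "$sample"
two_pass "$sample"
rm -f "$sample"
```

The two are not quite identical: the single regex needs at least one character in front of "html", so a line consisting of exactly "html" is printed by two_pass but not by one_pass. For typical file names that difference should not matter.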

sed - get last value in file

I am writing a script that collects a value from an external file. Along the way, I ran into trouble with the following sed command when trying to limit the result to a single line.
The following command finds every line containing "value=" and prints the text that follows it, ignoring rows that contain "#":
NUM=$(sed -n -e '/#/!s/^.*value=//p' "$LOGFILE")
I found other variations of this command, but none of them allowed ignoring lines by a word, as this command line does.
Can anyone help me capture only the final matching line while still ignoring lines that contain "#"?
Optional: can you adapt this command to capture only the number, ignoring the rest of the words on the line?
Here are 3 ways:
if you need just sed:
sed -n '/value=/h; ${g; s/value=//p}' file
if you can use other tools:
tac file | sed -n '/value=/{s///p;q}'
or, this is quite readable:
awk -F= '$1 == "value" {value = $2} END {print value}' file
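If the "#" filter and the numbers-only variant from the question are both needed, one possible sketch keeps the question's own sed command and lets tail pick the last match. The function names and the sample log are hypothetical.

```shell
# Hypothetical log file for illustration.
log=$(mktemp)
printf '%s\n' 'value=10' '# value=99' 'foo value=42 units' '#value=7' > "$log"

# Text after the last "value=" on the final line that contains no "#".
last_value() {
    sed -n -e '/#/!s/^.*value=//p' "$1" | tail -n 1
}

# Same, but keep only the digits that directly follow "value=".
last_number() {
    sed -n -e '/#/!s/^.*value=\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

last_value "$log"
last_number "$log"
rm -f "$log"
```

This reads the whole file once, which should be fine for ordinary log sizes; the tac variant above stops earlier on very large files.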

How to get only lines with a single quote using GNU sed in Bash shell?

I'm writing a script to parse a text file (multiple lines). I need to print only lines matching the following pattern:
First character of the line is an Uppercase letter
Second character of the line is a lowercase letter OR a single quote
Third character of the line is a lowercase letter OR a space
Examples of "valid" lines
Abcd
A'cd
Ab c
Attempts with GNU sed 4.2.2 on Linux
I] First attempt (escaping)
$ html2text foo.html | sed -r "/^([A-Z][a-z\'])/!d"
Produces the following error message:
html2text foo.html | sed -r "/^([A-Z][a-z\'])/date"
sed: -e expression n°1, character 19: extra characters after command
II] Second attempt (no escaping)
$ html2text foo.html | sed -r "/^([A-Z][a-z'])/!d"
Produces the following error message:
html2text foo.html | sed -r "/^([A-Z][a-z'])/date"
sed: -e expression n°1, character 18: extra characters after command
I'm not quite sure how to deal with a single quote "'" inside a bracket expression. I know that escaping a single quote within a single-quoted sed expression is not supported at all, but here both sed expressions are double-quoted.
The weird thing is that both error messages echo ".../date" (the first line of each error message), which appears to be a bug or a parsing issue (the "/!d" command is misinterpreted)...
Note: html2text converts 'foo.html' to a text file. The sed -r option stands for extended regular expressions. "[A-Z]" matches a range of characters (the square brackets are not literals here).
Thanks for your help
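The "/date" in your error messages comes from bash history expansion: at an interactive prompt, !d inside double quotes is replaced by the most recent command starting with "d" (here, date) before sed ever sees it. Printing with -n and p instead of deleting with !d avoids the problem.
As pointed out by casimir-et-hippolyte, using grep is simpler here: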
grep "^[A-Z][a-z'][a-z ]"
or using sed:
sed -n "/^[A-Z][a-z'][a-z ]/p"
If you need a single-quoted script for some reason, the single quote inside it can be escaped like this:
sed -n '/^[A-Z][a-z'"'"'][a-z ]/p'
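As a quick check of the pattern against the examples from the question, here is a sketch using the shorter '\'' form of the same quoting trick. The function name filter_lines is made up, and LC_ALL=C is an assumption added so that [A-Z] and [a-z] mean plain ASCII ranges regardless of the locale.

```shell
filter_lines() {
    # '\'' closes the single-quoted script, adds one escaped quote, and reopens it.
    # LC_ALL=C (an assumption, not from the question) pins the ranges to ASCII.
    LC_ALL=C sed -n '/^[A-Z][a-z'\''][a-z ]/p'
}

# The first three lines are the "valid" examples; the rest should be dropped.
printf '%s\n' 'Abcd' "A'cd" 'Ab c' 'abcd' 'ABcd' 'Ab1d' | filter_lines
```

Because the script is single-quoted, the shell never performs history expansion on it, so the same quoting would also keep a !d command intact.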

replace \n\t pattern in a file

OK, I have a recordset that is pipe-delimited.
I am checking the number of delimiters on each line, because the sender has started including | in the data (and we cannot change the incoming file).
While using a great awk script to parse the bad records out into a bad file for processing, we discovered that some data contains a newline character (\n) followed by a tab (\t).
I have tried sed to replace \n\t with just \t, but it always either changes the \n\t into \r\n or replaces all the \n (the file uses \r\n as its line ending).
Yes, to answer some questions below:
files can be large (200+ MB)
the line feed appears in the data spuriously (not in every row, but often enough to be a pain)
I have tried
sed ':a;N;$!ba;s/\n\t/\t/g' Clicks.txt >test2.txt
sed 's/\n\t/\t/g' Clicks.txt >test1.txt
sample record
12345|876|testdata\n
\t\t\t\tsome text|6209\r\n
would like
12345|876|testdata\t\t\t\tsome text|6209\r\n
Please help!
NOTE: it must run in KSH (MKS KSH to be specific).
I don't care whether it is sed or not; I just need to correct the issue.
Several of the solutions below work on small data or do part of the job.
As an aside, I have started playing with removing all line feeds and then replacing each carriage return with carriage return + line feed, but I can't quite get that to work either.
I have tried tr, but since it works on single characters it only solves part of the issue:
tr -d '\n' < test.txt
leaves me with a \r-ended file.
I need to get it to \r\n (and neither dos2unix nor unix2dos exists on this system).
If the input file is small (and you therefore don't mind processing it twice), you can use
cat input.txt | tr -d "\n" | sed 's/\r/\r\n/g'
Edit:
As I should have known by now, you can avoid using cat almost everywhere.
I had reviewed my old answers on SO for UUOC (useless use of cat) and carefully checked for a possible filename in the tr usage. As Ed pointed out in his comment, cat can be avoided here as well:
The command above can be improved to
tr -d "\n" < input.txt | sed 's/\r/\r\n/g'
It's unclear what you are trying to do but given this input file:
$ cat -v file
12345|876|testdata
some text|6209^M
Is this what you're trying to do:
$ gawk 'BEGIN{RS=ORS="\r\n"} {gsub(/\n/,"")} 1' file | cat -v
12345|876|testdata some text|6209^M
The above uses GNU awk for multi-char RS. Alternatively with any awk:
$ awk '{rec = rec $0} /\r$/{print rec; rec=""}' file | cat -v
12345|876|testdata some text|6209^M
The cat -vs above are just there to show where the \rs (^Ms) are.
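To reproduce the portable awk variant on data that really contains the tabs and the carriage return, a sketch like the following may help. The function name join_records and the sample file are made up; the sample record is the one from the question.

```shell
# Accumulate physical lines until one ends in \r, then print the whole record.
join_records() {
    awk '{rec = rec $0} /\r$/{print rec; rec=""}' "$1"
}

# Hypothetical sample: one record broken by a newline followed by tabs.
sample=$(mktemp)
printf '12345|876|testdata\n\t\t\t\tsome text|6209\r\n' > "$sample"

# od -c makes the tabs, \r and \n visible.
join_records "$sample" | od -c
rm -f "$sample"
```

Since awk only holds one record at a time, this should also cope with the 200+ MB files mentioned in the question.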
Note that the solution below reads the input file as a whole into memory, which won't work for large files.
Generally, Ed Morton's awk solution is better.
Here's a POSIX-compliant sed solution:
tab=$(printf '\t')
sed -e ':a' -e '$!{N;ba' -e '}' -e "s/\n${tab}/${tab}/g" Clicks.txt
Keys to making this POSIX-compliant:
POSIX sed doesn't recognize \t as an escape sequence, so a literal tab - via variable $tab, created with tab=$(printf '\t') - must be used in the script.
POSIX sed - or at least BSD sed - requires label names (such as :a and the a in ba above) - whether implied or explicit - to be terminated with an actual newline, or, alternatively, terminated implicitly by continuing the script in the next -e option, which is the approach chosen here.
-e ':a' -e '$!{N;ba' -e '}' is an established Sed idiom that simply "slurps" the entire input file (uses a loop to read all lines into its buffer first). This is the prerequisite for enabling subsequent string substitution across input lines.
Note how the option-argument for the last -e option is a double-quoted string so that the references to shell variable $tab are expanded to actual tabs before Sed sees them. By contrast, \n is the one escape sequence recognized by POSIX sed itself (in the regex part, not the replacement-string part).
Alternatively, if your shell supports ANSI C-quoted strings ($'...'), you can use them directly to produce the desired control characters:
sed -e ':a' -e '$!{N;ba' -e '}' -e $'s/\\n\t/\t/g' Clicks.txt
Note how the option-argument for the last -e option is an ANSI C-quoted string, and how literal \n (which is the one escape sequence that is recognized by POSIX Sed) must then be represented as \\n. By contrast, $'...' expands \t to an actual tab before Sed sees it.
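The variable-based form can be wrapped in a function and tried on a tiny file. This is a sketch only: join_tab_lines and the sample data are hypothetical, and, as noted above, the whole input is read into memory.

```shell
join_tab_lines() {
    # A literal tab, because POSIX sed does not understand \t.
    tab=$(printf '\t')
    # Slurp all lines into the pattern space, then drop each newline
    # that is directly followed by a tab.
    sed -e ':a' -e '$!{N;ba' -e '}' -e "s/\n${tab}/${tab}/g" "$1"
}

# Hypothetical sample: the first record is split by a newline plus tab.
sample=$(mktemp)
printf 'a|b\n\tc|d\r\ne|f\r\n' > "$sample"
join_tab_lines "$sample" | od -c
rm -f "$sample"
```

Inside the double-quoted script the shell leaves \n alone, so sed still receives it as its own escape sequence, while ${tab} is already expanded to a real tab.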
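Thanks everyone for all your suggestions. After looking at all the answers, none quite did the trick. After some thought, I came up with
tr -d '\n' <Clicks.txt | tr '\r' '\n' | sed 's/$/\r/' >test.txt
Delete all newlines
Translate every carriage return into a newline
Use sed to append a carriage return to every line, so that each line ends in \r\n again (sed never sees a line's own trailing newline, so s/\n/\r\n/g would not match anything; \r in the replacement needs a sed that understands it, such as GNU sed)
This works in seconds on a 32 MB file.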
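The three steps can be sketched as a function. This version is an adaptation rather than the exact command above: it builds the carriage return with printf and appends it at the end of each line, on the assumption that your sed may not understand \r and given that sed's pattern space never contains the line's own trailing newline. The name rejoin_crlf and the sample data are hypothetical.

```shell
rejoin_crlf() {
    # A literal carriage return, so no sed escape support is needed.
    cr=$(printf '\r')
    # 1. delete every newline, 2. turn each \r into a newline,
    # 3. put a \r back in front of each newline.
    tr -d '\n' < "$1" | tr '\r' '\n' | sed "s/\$/${cr}/"
}

# Hypothetical sample: the first record is split by a newline plus tab.
sample=$(mktemp)
printf 'a|b\n\tc|d\r\ne|f\r\n' > "$sample"
rejoin_crlf "$sample" | od -c
rm -f "$sample"
```

This relies on \r appearing only at the real record ends; a stray carriage return inside the data would split a record.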

Convert line feeds to an actual string

I have a file like this:
#!/bin/bash
echo $(date "+%F %R:%S") ":: yum update"
/usr/bin/yum update -y
I want to convert that to exactly this quoted string:
"#!/bin/bash\necho $(date \"+%F %R:%S\") \":: yum update\"\n/usr/bin/yum update -y\n"
Any method I have used is matching the line feeds but converts them to a line feed instead of to "\n". So these examples:
sed 's/\n/\n/g' file
sed 's/\n/\\\n/g' file
tr '\n' '\n' <file
tr '\n' "\n" <file
all result in exactly the same output as the file itself. So how do I match the line feed character and replace it with the actual string "\n" and not something that will itself be recognized as a line feed?
Here, you want two characters, a backslash and an n, to replace a single newline. Consequently, tr is not a good choice.
It wasn't clear to me whether you wanted actual backslashes to precede your double quotes or not. The answers below assume that you do. If you don't, it is simple enough to remove those substitutions.
Using awk
$ awk '{gsub(/"/, "\\\""); printf "%s\\n",$0}' file
#!/bin/bash\necho $(date \"+%F %R:%S\") \":: yum update\"\n/usr/bin/yum update -y\n
Using sed
$ sed ':again; N; $!b again; s/"/\\"/g; s/\n/\\n/g; s/$/\\n/' file
#!/bin/bash\necho $(date \"+%F %R:%S\") \":: yum update\"\n/usr/bin/yum update -y\n
Perl:
perl -0777 -pe 's/\n/\\n/g; s/"/\\"/g' file
tr cannot do this job because it cannot translate one character (a newline) into a two-character string (a backslash followed by n).
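None of the commands above print the surrounding double quotes shown in the question. A sketch that adds them could look like this; to_quoted_string is a made-up name, and splitting the work between sed and awk is just one possible arrangement, not the method of any answer above.

```shell
to_quoted_string() {
    printf '"'                      # opening quote
    # sed escapes each double quote; awk prints every line followed by
    # a literal backslash-n instead of a real newline.
    sed 's/"/\\"/g' "$1" | awk '{ printf "%s\\n", $0 }'
    printf '"\n'                    # closing quote and a final real newline
}

# Hypothetical copy of the file from the question.
script=$(mktemp)
printf '%s\n' '#!/bin/bash' 'echo $(date "+%F %R:%S") ":: yum update"' '/usr/bin/yum update -y' > "$script"
to_quoted_string "$script"
rm -f "$script"
```

Backslashes already present in the input are left untouched here; if the target format also requires them to be doubled, an extra substitution would be needed before the quotes are escaped.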