How to convert NULL string to \N using sed - replace

I have NULL strings (the literal text, not NUL characters) in a file. I want to convert all of them to the string \N. This is what I tried, which is not working:
sed "s/NULL/\\N/g" data/fixed/3.csv > data/fixed/4.csv
PS: This change is to support MySQL NULL imports (not to add a newline).
http://dev.mysql.com/doc/refman/5.7/en/null-values.html

Use single quotes, so that the shell passes both backslashes through to sed unchanged:
$ echo 'abc NULL foo NULL' | sed 's/NULL/\\N/g'
abc \N foo \N
From http://mywiki.wooledge.org/Quotes
Single quotes: '...' removes the special meaning of every character
between the quotes. Everything inside single quotes becomes a literal
string. The only character that you can't safely enclose in single
quotes is a single quote.
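To see why the double-quoted attempt fails, it can help to print the sed script exactly as the shell hands it over. The following is only a sketch; the helper name null_to_mysql is made up for this example and is not part of the original answer.

```shell
# printf '%s' prints its argument without interpreting backslashes,
# so this shows the script text that sed would actually receive.
printf '%s\n' "s/NULL/\\N/g"   # double quotes: the shell collapses \\ to \
printf '%s\n' 's/NULL/\\N/g'   # single quotes: passed through untouched

# Hypothetical helper wrapping the single-quoted command from the answer.
null_to_mysql() {
    sed 's/NULL/\\N/g'
}

printf '%s\n' 'abc NULL foo NULL' | null_to_mysql
```

Note that the pattern matches NULL anywhere in a line, including inside a longer word, so a tighter pattern may be needed if the data can contain such words.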

Related

Add backslash before single and double quote

I am trying to add a backslash before each single and double quote. The problem is that I want to leave triple quotes untouched.
What I have so far:
For single quotes:
sed -e s/\'/\\\\\'/g test.txt > test1.txt
For double quotes:
sed -e s/\"/\\\\\"/g test.txt > test1.txt
I have text like:
1,"""Some text XM'SD12X""","""Some text XM'SD12X""","""Auto " Moto " Some text"Some text"""
What I want is:
1,"""Some text XM\'SD12X""","""Some text XM\'SD12X""","""Auto \" Moto \" Some text\"Some text"""
If perl is okay:
perl -pe 's/"{3}(*SKIP)(*F)|[\x27"]/\\$&/g'
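"{3}(*SKIP)(*F) matches triple double quotes and then skips them, so they are left unchanged
use (\x27{3}|"{3})(*SKIP)(*F) instead if both triple single quotes and triple double quotes must be left unchanged
|[\x27"] otherwise matches a single or double quote (\x27 is the single quote character)
\\$& replaces the match with a backslash followed by the matched text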
With sed, you can replace the triple quotes with newline character (since newline character cannot be present in pattern space for default line-by-line usage), then replace the single/double quote characters and then change newline characters back to triple quotes.
# assuming only triple double quotes are present
sed 's/"""/\n/g; s/[\x27"]/\\&/g; s/\n/"""/g'
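The placeholder approach above can be tried on the sample line from the question. This is a sketch that assumes GNU sed (the \n in the replacement and the \x27 escape are GNU extensions); the function name escape_quotes is hypothetical.

```shell
# Assumes GNU sed. Steps:
#   1. park every triple double quote as a newline placeholder
#   2. prefix the remaining single and double quotes with a backslash
#   3. turn the placeholders back into triple double quotes
escape_quotes() {
    sed 's/"""/\n/g; s/[\x27"]/\\&/g; s/\n/"""/g'
}

printf '%s\n' '1,"""Some text XM'\''SD12X""","""Auto " Moto " Some text"Some text"""' | escape_quotes
```

Because sed reads one line at a time, a newline never occurs in the input being processed, which is what makes it a safe placeholder here.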

Remove special character at the beginning of words in Unix

I need help removing special characters from the beginning of the word in a Unix shell.
For example I have the list of words like this,
'aaa
'bbb
'ccc
'ddd
I want to remove the quotes and get output like this,
aaa
bbb
ccc
ddd
How can I remove only the quote at the beginning of each word?
You will need to match at a word boundary, which is written \b (supported by GNU sed).
So for example, if you were using sed and wanted to remove a single quote ' at the beginning of any word, you would use
sed "s/'\b//g"
Which means "replace any single quote immediately before a word boundary with an empty string".
Additionally, if the quote is always at the beginning of the line, you can use the anchor ^, which matches the start of a line.
sed "s/^'//g"
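Alternatively, try tr, which deletes every single quote in the input (not only those at the start of a word). The whole argument to echo has to be in double quotes, otherwise the shell strips the single quotes before tr ever sees them:
echo "'aaa 'bbb 'ccc 'ddd" | tr -d "'"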
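The difference between the approaches shows up once the input contains an apostrophe inside a word. Below is a sketch that avoids \b and instead treats "start of a word" as "start of the line or right after whitespace", which is an assumption about what counts as a word; the function name strip_leading_quotes is hypothetical.

```shell
# -E (extended regex) is accepted by both GNU and BSD sed.
# The group captures the line start or the whitespace before the quote,
# and \1 puts it back so only the quote itself is dropped.
strip_leading_quotes() {
    sed -E "s/(^|[[:space:]])'/\1/g"
}

printf '%s\n' "'aaa 'bbb don't 'ccc" | strip_leading_quotes

# For comparison: tr removes every single quote, including the one in "don't".
printf '%s\n' "'aaa 'bbb don't 'ccc" | tr -d "'"
```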

Deleting everything after a pattern in Unix

I have the string "replenishment_category string,Date string". I want to delete everything starting with Date (including Date itself), and also the comma before it if one is present.
I have the string to be replaced stored in a variable:
PARTITION_COLUMN='Date'
I tried sed to replace everything after the variable PARTITION_COLUMN
echo "replenishment_category string,Date string" | sed "s/"$PARTITION_COLUMN".* //g"
but the output still has the string that follows the date:
replenishment_category string,string
How do I remove the string part and also the comma preceding the Date?
Try this:
echo "replenishment_category string,Date string" | sed "s/$PARTITION_COLUMN.*//"
Notice that the space after .* has been removed and that the double quotes now surround the entire sed expression. This version leaves the comma before Date in place.
You could do this with shell parameter expansion alone, assuming you have extended globs enabled (shopt -s extglob):
$ var='replenishment_category string,Date string'
$ part_column=Date
$ echo "${var%%?(,)"$part_column"*}"
replenishment_category string
The ${word%%pattern} expansion works without extended globs and removes the longest match of pattern from the end of $word.
The ?(pattern) extended pattern matches zero or one occurrences of pattern and is used to remove the comma if present.
"$part_column"* matches any string that begins with the expansion of $part_column. Quoting it is not required in the example, but good practice to prevent glob characters from expanding.
Since you want to remove the variable's value as well as the comma before it, the following sed command may help:
echo "replenishment_category string,Date string" | sed "s/,$PARTITION_COLUMN.*//g"
sed would obviously work here:
echo "replenishment_category string,Date string" | sed "s/\b,$PARTITION_COLUMN.*//"
Output:
replenishment_category string
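The sed answers above either keep the comma or require it. A sketch that makes the comma optional, as the question asks, could look like this. It assumes the column name contains no regex metacharacters or slashes; the function name drop_from_column is hypothetical.

```shell
# $1 is the column name; ",?" matches the preceding comma only if it is there.
# -E (extended regex) is accepted by both GNU and BSD sed.
drop_from_column() {
    sed -E "s/,?$1.*//"
}

printf '%s\n' 'replenishment_category string,Date string' | drop_from_column Date
printf '%s\n' 'Date string' | drop_from_column Date
```

The first call prints the text before the comma; the second prints an empty line, because the whole input starts with the column name.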

REGEX - How to get rid of quotation marks at the start and end of a string

I have a bunch of strings:
"pipe 1/4" square"
"3" bar"
"3/16" spanner
2" nozzle
spare tyre
I want to get rid of " marks from the start of the string and the end of the string with RegEx.
I've been trying on a simulator with the aid of some references but cannot seem to do it right.
Q: What is the RegEx that will do this with BASH?
Use the regex ^"|"$ to match a double quote at the start or end of a line, then replace the match with an empty string.
Using sed:
sed 's/^"\|"$//g' <<<"$var"
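As a quick check, the same regex can be run over the sample strings from the question. This sketch uses -E so the alternation is written as a plain |, which both GNU and BSD sed accept; the function name trim_outer_quotes is hypothetical.

```shell
# Removes one double quote at the start and one at the end of each line.
# Quotes in the middle of a line are left alone.
trim_outer_quotes() {
    sed -E 's/^"|"$//g'
}

printf '%s\n' '"pipe 1/4" square"' '"3" bar"' '"3/16" spanner' '2" nozzle' 'spare tyre' | trim_outer_quotes
```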
Try the following command:
echo "$var" | sed 's/^\(.*\)"$/\1/'
This pipes the value of $var into sed. The pattern captures everything before a trailing double quote in the group \(...\) (in sed's default basic regex syntax the parentheses must be escaped to form a group), and the replacement \1 writes back only the captured text. So sed prints the input string minus the final quotation mark. A leading quotation mark is not removed by this command.
Using Bash parameter expansion:
a="\"pipe 1/4\" square\""
a="${a/#\"/}" && a="${a/%\"/}"
echo "$a"
Output:
pipe 1/4" square
Explanation:
${var/old/new} replaces old with new in $var.
A # before old makes it match only at the beginning of $var.
A % before old makes it match only at the end of $var.
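A related variant, shown here as a sketch rather than as part of the answer above, uses the prefix and suffix removal forms ${s#pattern} and ${s%pattern}. These are a different expansion from ${var/old/new}, and they also work in a plain POSIX sh. The function name strip_outer_quotes is hypothetical.

```shell
# Prints $1 with at most one leading and one trailing double quote removed.
strip_outer_quotes() {
    s=$1
    s=${s#\"}   # drop one leading double quote, if present
    s=${s%\"}   # drop one trailing double quote, if present
    printf '%s\n' "$s"
}

strip_outer_quotes '"pipe 1/4" square"'
strip_outer_quotes '2" nozzle'
```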

String replace on a very large file

I have a giant text file that is JSON. You can see it here: http://api.mtgdb.info/cards/. I have saved this JSON to a file called cards.json.
In cards.json, I need to escape every single quote ' with a backslash \.
So I need to replace ' with \'.
Usually this is trivial in any editor, however the file is too large. How can I escape all single quotes in this string?
What I've tried:
I tried using sed. My command was sed s/\'/\\\'/ cards.json > cards_cleaned.json. However, the cards_cleaned.json file did not have any escaped ', it was just an exact copy of cards.json. Sed works when I do sed s/\'/foobar/ cards.json > cards_cleaned.json, so I'm assuming something is wrong with how I escape the backslashes.
I tried using vim. I opened cards.json in vim $ vi cards.json. Then I tried a global string replace using :%s/'/\'/g. This did not change anything in the file.
While @anubhava's and @gboffi's answers work, they produce INVALID JSON.
JSON allows only a few characters after a backslash:
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
For example, from this part of the original (valid) JSON
[
{
"description" : "Whenever a land enters the battlefield, Ankh of Mishra deals 2 damage to that land's controller.",
"rarity" : "Rare",
"name" : "Ankh of Mishra"
}
]
you want to get
[
{
"description" : "Whenever a land enters the battlefield, Ankh of Mishra deals 2 damage to that land\'s controller.",
"rarity" : "Rare",
"name" : "Ankh of Mishra"
}
]
That is, instead of land's you want land\'s.
But this is INVALID JSON.
So, if you (for some strange reason) want to have the backslash, you need to use a double \\, such as:
[
{
"description" : "Whenever a land enters the battlefield, Ankh of Mishra deals 2 damage to that land\\'s controller.",
"rarity" : "Rare",
"name" : "Ankh of Mishra"
}
]
Solution (for both cases)
with perl:
perl -pE "s/'/\\\'/g" < mtg_cards.json > cards.malformed.json
# changes "land's" to the invalid "land\'s"
and
perl -pE "s/'/\\\\\\\\'/g" < mtg_cards.json > card_with_double_BS.json
# eight backslashes: the shell halves them to four, and perl halves those to two
# changes "land's" to "land\\'s"
PS: Because your file is one single long (30MB) line, vim has some problems with it. You can pretty-print (fold and indent) the JSON before editing. There are many tools for this; I'm using the json_xs command from the JSON::XS perl package. After pretty-printing, you can use vim safely.
You need to use double quotes in the shell so that the single quote character does not need quoting itself, but then you have to be careful, because inside a double-quoted string the shell uses the backslash as a quoting character:
$ echo "eoieriou'iouou'oiuiouiuo"|sed "s/'/\\'/g"
eoieriou'iouou'oiuiouiuo
and the command that sed is trying to execute is s/'/\'/g but sed quoting character is the backslash, so that you substitute each single quote with a single quote...
We have to quote the backslash also when it arrives to sed, so let's try
$ echo "eoieriou'iouou'oiuiouiuo"|sed "s/'/\\\\'/g" # Four (4) backslashes in a row
eoieriou\'iouou\'oiuiouiuo
$
That's OK, isn't it? because sed is instructed to do s/'/\\'/g so that the quoted character, from the POV of sed, is the backslash itself...
Please note that the quotes, single or double, are not special characters from the POV of sed, they're special only in the context of the shell.
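The two layers of unquoting described above (first the shell, then sed) can be made visible by printing the script text before running it. This is only an illustrative sketch; the function name escape_single_quotes is hypothetical.

```shell
# printf '%s' shows the argument exactly as the shell passes it on.
printf '%s\n' "s/'/\\'/g"       # two backslashes typed, sed receives one
printf '%s\n' "s/'/\\\\'/g"     # four backslashes typed, sed receives two

# With two backslashes left, sed's replacement is a literal backslash plus the quote.
escape_single_quotes() {
    sed "s/'/\\\\'/g"
}

printf '%s\n' "land's controller" | escape_single_quotes
```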
In Vi you will need to escape the \ character.
Try using
:%s/'/\\'/g
For me it worked.
Test.txt
\'\'\' \'\'\'
You need to double-escape the backslash, so use:
sed -i.bak "s/'/\\\\'/g" cards.json
You can use like this, in vim.
:%s/'/\\\'/g
In sed,
sed "s/'/\\\'/g" filename
Here is an awk version:
cat file
hi'more data here'
awk '{gsub(g,"\\"g)}1' g="'" file
hi\'more data here\'
Or if you need double backslash:
awk '{gsub(g,"\\\\"g)}1' g="'" file
hi\\'more data here\\'
sed "s/'/\\\\&/g" cards.json > cards_cleaned.json
There is no need to escape the ' in the search pattern, as you did with \'.
Surround the expression with double quotes (single quotes would do if the single quote were not the character being changed), and double the backslashes, because the shell treats the backslash as an escape character inside double quotes.