Linux console perl replace not working on large file - regex

On Ubuntu I'm running a Perl one-liner from the console to do a replacement in a CSV file of ~500MB. This is the call:
perl -i -pe 's/AS100\n/AS100/g' test.csv
Before running it on the complete file, I extracted a ~30MB subset and ran this script on it successfully.
When running on the full file, no substitution is done, and no error or message is shown.
I've tried also with sed, but the behavior is the same.
How can I solve this issue?
Thank you

If you have room, try this instead, so you can watch the substitution as it is written to another file:
perl -pe 's/AS100\n/AS100/g' test.csv | tee test2.csv
My question, though: is it only the rows ending with AS100 that need the newline removed?

After trying everything, I found out that in the original file the pattern was:
AS100\r
and that the \n was a conversion done by Sublime Text when saving the test file.
So the correct code to do the trick was:
perl -i -pe 's/AS100\r/AS100/g' test.csv
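Since the fix hinged on which terminator actually follows AS100, it can help to count the raw line-ending bytes in the real file before writing the substitution. The following is only a minimal sketch, assuming a POSIX shell with tr and wc; the function names count_cr and count_lf are made up for illustration and are not from the original post:

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not from the original post).

# count_cr FILE: print how many carriage-return (\r) bytes FILE contains.
count_cr() {
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

# count_lf FILE: print how many line-feed (\n) bytes FILE contains.
count_lf() {
    tr -cd '\n' < "$1" | wc -c | tr -d ' '
}
```

For example, if count_cr test.csv prints a non-zero number while a \n-based pattern never matches, the file probably uses \r (or \r\n) endings, and an editor may have converted them in the smaller test copy.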

Related

Bash on macOS: How replace a path in a file with another string?

For integration tests, I have output that contains full file paths. I want to have my test script replace the user-specific start of the file path (e.g. /Users/uli/) with a generic word (USER_DIR) so that I can compare the files.
The problem, of course, is the slashes in the path. I tried the solutions given here and here, but they don't work for me:
#!/bin/bash
old_path="/Users/uli/"
new_path="USERDIR"
sed -i "s#$old_path#$new_path#g" /Users/uli/Desktop/replacetarget.txt
I get the error
sed: 1: "/Users/uli/Desktop/repl ...": invalid command code u
This is the version of sed that comes with macOS 10.14.6 (it has no --version option and is installed in /usr/bin/, so no idea what exact version).
Update:
I also tried
#!/bin/bash
old_path="/Users/uli/"
old_path=${old_path//\//\\\/}
new_path="USERDIR"
regex="s/$old_path/$new_path/g"
echo $old_path
echo $regex
sed -i $regex /Users/uli/Desktop/replacetarget.txt
But I get the same error. What am I doing wrong?
BSD sed requires an argument following -i (the empty string '' indicates no backup, similar to argumentless -i in GNU sed). As a result, your script is being treated as the backup-file extension, and your input file as the script.
old_path="/Users/uli/"
new_path="USERDIR"
sed -i '' "s#$old_path#$new_path#g" /Users/uli/Desktop/replacetarget.txt
However, sed is a stream editor, based on the file editor ed, so using -i is an indication you are using the wrong tool to begin with. Just use ed.
old_path="/Users/uli/"
new_path="USERDIR"
printf ',s#%s#%s#g\nwq\n' "$old_path" "$new_path" | ed /Users/uli/Desktop/replacetarget.txt
Obligatory warning: neither editor is parameterized as such; you are simply generating the script dynamically, which means it's your responsibility to ensure that the resulting script is valid. (For example, if either parameter contains the # delimiter, or the replacement contains & or a backslash, it had better be escaped so that (s)ed does not treat it as part of the command syntax.)
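One way to take on that responsibility is to escape both strings before splicing them into the sed script. This is a rough sketch rather than a complete solution: it assumes # is the delimiter, that neither string contains a newline, and that basic (not extended) regular expressions are in use. The function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names).

# escape_sed_pattern STRING: backslash-escape the characters that are special
# in a basic regular expression ([ \ . * ^ $) plus the '#' delimiter.
escape_sed_pattern() {
    printf '%s' "$1" | sed 's/[[\\.*^$#]/\\&/g'
}

# escape_sed_replacement STRING: backslash-escape the characters that are
# special on the replacement side (\ and &) plus the '#' delimiter.
escape_sed_replacement() {
    printf '%s' "$1" | sed 's/[\\&#]/\\&/g'
}
```

With these, the call from the answer would become something like sed -i '' "s#$(escape_sed_pattern "$old_path")#$(escape_sed_replacement "$new_path")#g" followed by the file name.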

Using grep to match one digit with TCL

The file that I want to grep contains many lines.
I want to grep lines which contain only 1 digit: "0" or "1".
I used this command:
exec grep -e "^\[0-1\]{1}$" file
But I got:
child process exited abnormally
What's wrong with RegExp of grep?
The most common issue when running grep as a Tcl subprocess is that it exits with a non-zero error code when it doesn't find anything at all. This always causes Tcl to throw an exception. The simplest workaround is perhaps this:
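exec /bin/sh -c {grep -E '^[0-1]{1}$'; true} < file
Note that we are feeding in the file using a redirection here; this means that it is not necessary to strip the name of the file from the results. Note also the -E: in grep's default basic syntax, {1} matches the literal characters {1}, so the interval form needs -E (or \{1\}).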
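The "; true" trick also hides genuine failures, because grep exits with 1 for "no match" but with a status greater than 1 for real errors such as an unreadable file. A small wrapper can keep the two cases apart. This is a sketch only; grep_or_empty is an invented name, and it assumes a grep that supports -E:

```shell
#!/bin/sh
# Hypothetical wrapper (illustrative name).
# grep_or_empty PATTERN FILE: print the lines of FILE that match the extended
# regex PATTERN. Returns 0 when nothing matches, but passes through any real
# grep error (exit status greater than 1).
grep_or_empty() {
    grep -E -e "$1" -- "$2"
    rc=$?
    if [ "$rc" -le 1 ]; then
        return 0
    fi
    return "$rc"
}
```

Saved in a script, a function like this could be called from Tcl's exec in place of the inline "grep ...; true", so that a missing file still raises an error.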

sed substitution works in terminal but not in SAS pipe

If I run the command echo abc | sed 's/b/\'$'\n'/ at a terminal, I get the output:
a
c
But if I run the following SAS code on the same server, reading the output of exactly the same command,
filename cmd pipe "echo abc | sed 's/b/\'$'\n'/";
data _null_;
infile cmd;
input;
put _infile_;
run;
I get this in the log instead:
a$nc
How can I make the SAS pipe output match what I get in the terminal?
Further details
I'm using SAS 9.1.3 on Solaris 10 / SunOS 5.10. The Solaris version of sed I'm using does not support \n escape sequences for newlines in regexes, hence the shell substitution in the command above. If I run
echo abc | sed s/b/\n/
I get the following output (in a terminal):
anc
I do not have the option of using any other version of sed.
I already have a tr-based solution, but I would like to find a way of making this work in sed if possible, so that I can replace longer strings with newlines.
I don't know SAS, but this problem is common to many languages: they run external commands either without a shell or through a shell other than bash (typically /bin/sh), so bash-specific syntax is not interpreted. Your output a$nc suggests that the pipe itself ran but the $'\n' quoting was not understood. Try changing your command to bash -c "echo abc | sed 's/b/\'$'\n'/". You will need to deal with the nested quotes, of course.
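If invoking bash is not an option, another approach is to build the backslash-newline replacement with printf, which avoids the $'\n' quoting altogether. The sketch below assumes a POSIX-style shell that supports $(...) command substitution; it has not been tried on Solaris /bin/sh or inside a SAS pipe, and the function name is invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper (illustrative name).
# replace_b_with_newline: read stdin and replace the first "b" on each line
# with a newline, without relying on bash's $'\n' quoting.
replace_b_with_newline() {
    # In the printf format, \\ becomes one backslash and \n a real newline,
    # so the sed script is: s/b/\<newline>/
    script=$(printf 's/b/\\\n/')
    sed "$script"
}
```

Here echo abc | replace_b_with_newline prints a and c on separate lines. The trailing / in the format matters: command substitution strips trailing newlines, so the newline must not be the last character.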

use sed replace ""string"\1\"787" to "string"\1\"787" in cygwin

I am trying to search string and replace string in a file. I used the below code:
sed -e 's/{"AP_SESSION_ID"\1\"787"}/{"AP_SESSION_ID"\1\"800"}/g' FILE|tee FILE
but it is not working and the output is like this:
sed: number in \[0-9] invalid
My environment is CYGWIN.
sample file is:
DP_SESSION_ID is a sting for values
DP_SESSION_ID is aplicat
"DP_S42SETTACC_TYPE"\1\"02"
"DP_SAP_CLIENT"\1\"460"
"DP_SAP_COMM_CONNECTION"\1\"JAVA_COMM_TOOL_ANALYZER"
"DP_SAP_CONNECTION"\1\"JAVA_TOOL_ANALYZER"
"DP_SAP_TOOLBI_CONNECTION"\1\"JAVA_TOOLBI_ANALYZER"
"DP_SESSION_ID"\1\"808"
I want to search for the string "DP_SESSION_ID"\1\" and replace the corresponding number (e.g. 808) in the file permanently (Windows environment). I want a single-line command, either a Windows batch command or a Perl command; I don't want a script or program.
I have also installed Cygwin on my server, so a Unix command is fine too, as long as it is a single line.
server: windows 2008,cygwin x
using tool : datastage server jobs
perl -pi -e 's{" "DP_SESSION_ID"\1\"808 '"}{' "DP_SESSION_ID"\1\"900 '"'"}g' " file name
This code is not working.
Please suggest a good solution.
You need to "escape" the backslashes by using two in a row:
sed -e 's/{"AP_SESSION_ID"\\1\\"787"}/{"AP_SESSION_ID"\\1\\"800"}/g' FILE|tee FILE
Otherwise the \1 is treated as a backreference, and you have no subgroups (parenthesized expressions) to reference.
Apart from the backslashes, IMO you also need to escape the quotes.
sed -e 's/{\"AP_SESSION_ID\"\\1\\\"787\"}/{\"AP_SESSION_ID\"\\1\\\"800\"}/g' FILE|tee FILE
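To change whatever number currently follows "DP_SESSION_ID"\1\" (as in the sample file), the prefix can be captured in a group and put back with \1 in the replacement. The sketch below also writes to a temporary file instead of piping into tee FILE, since reading and writing the same file in one pipeline can truncate it. The function name and argument order are hypothetical, and the new ID is assumed to contain digits only:

```shell
#!/bin/sh
# Hypothetical helper (illustrative name and argument order).
# set_session_id FILE NEW_ID: rewrite the number on the "DP_SESSION_ID" line.
# Assumes NEW_ID consists of digits only.
set_session_id() {
    tmp=$(mktemp) || return 1
    # In the pattern, \\1 and \\" match the literal \1 and \" in the data.
    # The \( \) group captures the prefix; \1 in the replacement restores it.
    if sed 's/^\("DP_SESSION_ID"\\1\\"\)[0-9]*"/\1'"$2"'"/' "$1" > "$tmp"; then
        cat "$tmp" > "$1"   # overwrite contents, keeping the file's permissions
    fi
    rm -f "$tmp"
}
```

Calling set_session_id FILE 900 on the sample file would turn "DP_SESSION_ID"\1\"808" into "DP_SESSION_ID"\1\"900" and leave the other lines untouched.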

How to go from a multiple line sed command in command line to single line in script

I have sed running with the following argument fine if I copy and paste this into an open shell:
cat test.txt | sed '/[,0-9]\{0,\}[0-9]\{1,\}[acd][0-9]\{1,\}[,0-9]\{0,\}/{N
s/[,0-9]\{0,\}[0-9]\{1,\}[acd][0-9]\{1,\}[,0-9]\{0,\}\n\-\-\-//}'
The problem is that when I try to move this into a KornShell (ksh) script, the ksh throws errors because of what I think is that new line character. Can anyone give me a hand with this? FYI: the regular expression is supposed to be a multiple line replacement.
Thank you!
This: \{0,\} can be replaced by this: *
This: \{1,\} can be replaced by this: \+ (a GNU sed extension; keep \{1,\} if the script must run on other seds)
It's not necessary to escape hyphens.
The newline can be replaced by -e (or by a semicolon)
The cat can be replaced by using the filename as an argument to sed
The result:
sed -e '/[,0-9]*[0-9]\+[acd][0-9]\+[,0-9]*/{N' -e 's/[,0-9]*[0-9]\+[acd][0-9]\+[,0-9]*\n---//}' test.txt
or
sed '/[,0-9]*[0-9]\+[acd][0-9]\+[,0-9]*/{N;s/[,0-9]*[0-9]\+[acd][0-9]\+[,0-9]*\n---//}' test.txt
(untested)
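For comparison, here is a sketch of the original multi-line form wrapped in a shell function, which Bourne-family shells (sh, ksh, bash) should accept as written because a newline inside single quotes is just part of the string. It keeps the portable \{1,\} intervals and puts the closing brace on its own line. The function name is made up for illustration:

```shell
#!/bin/sh
# Hypothetical wrapper (illustrative name).
# strip_change_markers [FILE...]: for each line that looks like a diff change
# command (e.g. 2c2 or 1,3d4), append the next line with N and delete the
# command together with a following "---" line.
strip_change_markers() {
    sed '/[,0-9]*[0-9]\{1,\}[acd][0-9]\{1,\}[,0-9]*/{
N
s/[,0-9]*[0-9]\{1,\}[acd][0-9]\{1,\}[,0-9]*\n---//
}' "$@"
}
```

The function reads from the named files or, with no arguments, from standard input, so both strip_change_markers test.txt and a pipeline into strip_change_markers work. Note that the matched text is replaced by nothing, so an empty line remains where the two lines were.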
Can you try putting your sed script in a file and calling sed with the -f option?
cat test.txt | sed -f file.sed
Can you try to replace the new line character with `echo -e \\r`
The Korn Shell - unlike the C Shell - has no problem with newlines in strings. The newline is very unlikely to be your problem, therefore. The same comments apply to Bourne and POSIX shells, and to Bash. I've copied your example and run it on Linux under both Bash and Korn shell without any problem.
If you use C Shell for your work, are you sure you're running 'ksh ./script' and not './script'?
Otherwise, there is some other problem - an unbalanced quote somewhere, perhaps.
Check out the '-v' and '-n' options as well as the '-x' option to the Korn Shell. That may tell you more about where the problem is.