Sed isn't replacing all occurrences of string in file - regex

EDIT: I am using Cygwin. I am unsure whether this is of relevance and it was a detail I missed during writing this question.
EDIT2: Have tried replacing the "TAB" char people pointed out with the RegEx \s which covers spacing chars (spaces and tabs primarily) and this did not affect the expression at all, meaning that it is not the tabs causing the issue, especially since the expression runs once without errors anyway.
So far this script has been causing me a ton of trouble.
I DID have an issue before but I resolved that while I was writing a question here (lucky imo) but this one I've been stuck on for at least an hour now and I've tried varying solutions, none of which actually work or told me something I didn't already try.
I have a rather cool seeming FTP log fetcher script and part of this script replaces the 600MB of errors in this logfile to nothing, essentially removing them. Unfortunately this script also gets rid of parts of other errors too, so I've had to edit it. This is where I'm getting stuck.
Through base research I managed to find out that sed could do what I want, and through three hours of playing so far it does most of what I tell it to, minus one thing. One, and ONLY one, of the sed statements I have built only replaces the first instance of the string I've given it despite having the g modifier attached to the end.
I am working with a test script right now so as to avoid potential permanent damage to my original FTP script, and the test script copies over an example file with a few of the errors I need replacing.
Walkthrough of the script's INTENDED behaviour before showing:
1. Sets a prefix which happens on ALL lines in the file, pretty important part of the script.
2. Copies the example file to a file named test2.log
3. Replace all instances of the UNIX newline char \n with [loll] (first thing that came to my mind)
4. Remove all instances of battle error type 1 and 2.
5. Return all [loll] strings with the UNIX \n for newlines, therefore returning the logfile to its original state minus the errors.
Script:
#DTP="\[([0-9]+-[0-9]+-[0-9]+-[0-9]+|latest)\.log\] \[[0-9]+:[0-9]+:[0-9]+\] \[Server thread/(INFO|WARN)\]: "
echo "${DTP}"
DTP1="\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
DTP="\[loll\]\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
echo "${DTP}"
echo "1"
cp test.log test2.log
#cat test.log >test2.log
sed -i ':a;N;$!ba;s/\n/\[loll\]/g' test2.log #| egrep -i "" >test2.log
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.useAttack(PixelmonWrapper.java:173)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.takeTurn(PixelmonWrapper.java:330)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.takeTurn(BattleControllerBase.java:276)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.update(BattleControllerBase.java:157)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleRegistry.updateBattles(BattleRegistry.java:63)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleTickHandler.tickStart(BattleTickHandler.java:12)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler_20_BattleTickHandler_tickStart_WorldTickEvent.invoke(.dynamic)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler.invoke(ASMEventHandler.java:51)'${DTP}' at cpw.mods.fml.common.eventhandler.EventBus.post(EventBus.java:122)'${DTP}' at cpw.mods.fml.common.FMLCommonHandler.onPostWorldTick(FMLCommonHandler.java:255)'${DTP}' at net.minecraft.server.MinecraftServer.func_71190_q(MinecraftServer.java:929)'${DTP}' at net.minecraft.server.dedicated.DedicatedServer.func_71190_q(DedicatedServer.java:429)'${DTP}' at net.minecraft.server.MinecraftServer.func_71217_p(MinecraftServer.java:776)'${DTP}' at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:639)'${DTP}' at java.lang.Thread.run(Thread.java:745)//gI' test2.log
echo "2"
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException\[loll\]//gI' test2.log
echo "3"
sed -i 's/\[loll\]/\n/g' test2.log
I've set them to also run case insensitive checks on the provided strings as sometimes I write with all lower case, however for most of this I copied and pasted it directly.
Sample input:
http://pastebin.com/3KPB33X2
Outputs:
Expected:
meow
Test message
WOOF MEOWLOL
Actual: http://pastebin.com/pnvDwkxz
It's been killing my mind for a while now because I had this issue even before the other one, except I barely noticed it. I can't find any predictable behaviour in the script, and as far as I am aware it SHOULD be working perfectly fine and giving me the output I expect.
Any help would be appreciated, because as soon as I can get this bug sorted out I'll be able to enter in the rest of the script and replace this with the existing battle-error replacement script in my log-fetcher.
Knowing me it's something small and stupid but I've tried literally everything I came across, including adding the :a;N;$!ba; to the start of the bit which isn't working properly (and realising that failed horribly).
Thanks.
~BAI1

Are you looking for something like this:
sed -n ':a;s/\[.*Server thread\/\(INFO\|WARN\).*//i;/^$/!p;n;b a' battle.log
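To see what the command above does, here is a minimal sketch that runs it on a made-up sample (the real log is only available through the pastebin link, so the lines below are an assumption about its shape). It relies on GNU sed for \| and the i flag. Note that it blanks every line carrying the Server thread marker, not only the battle errors, so it would also drop ordinary INFO/WARN lines.

```shell
#!/bin/sh
# Hypothetical sample: plain lines mixed with prefixed server-thread lines.
sample='meow
[12:00:01] [Server thread/WARN]: Caught error in battle. Continuing...
[12:00:01] [Server thread/WARN]: java.lang.NullPointerException
Test message
[12:00:02] [Server thread/INFO]:  at java.lang.Thread.run(Thread.java:745)
WOOF MEOWLOL'

# For each line: delete everything from the first "[" through the end of the
# line if the marker is present, print the line only if something is left,
# then read the next line and loop.
filtered=$(printf '%s\n' "$sample" |
  sed -n ':a;s/\[.*Server thread\/\(INFO\|WARN\).*//i;/^$/!p;n;b a')

printf '%s\n' "$filtered"
```

Because the filter works line by line, the newline-to-[loll] round trip is not needed for this approach.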

Related

Why is this vim regex so expensive: s/\n/\\n/g

Attempting this on a sufficiently large file (say 80,000+ lines and about 500k+) will crash things or stall eventually both on my server and on my local Mac.
I've tried this at the command line as well, with the same result:
vim -es -c '%s/\n/\\n/g' -c wq $file
Also, the problem appears to be with the selection (\n) and not the replacement (\\n).
For my larger files I can of course split them and cat them back when finished, but the split points cannot be arbitrary in my case and must be adjusted manually for each and every split.
I appreciate that there are other ways to do this -- sed, etc. -- but I have similar and additional problems there, and I would like to be able to do this with vim.
I'm adding my comment as an answer:
Text editors usually don't like 'gigantic' lines (which is what you'll get with that replacement).
To test whether this is due to the 'big line' and not the substitution itself, I did this test:
I created a simple ~500KB file with a script. No new line characters, just a single line. Then I tried to load the file with vim. Result? I had to kill it :-).
However, if on the same script I write some new lines every now and then, I have no problems opening the file.
Also, one thing you could try is the following: in vim, replace \n with two line breaks (written \r\r in the replacement, since \n there inserts a NUL rather than a line break). If that is fast, it should also confirm the 'big line' issue.
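The answer does not show the script it used, so the following is only a sketch of how two comparable test files could be produced: one with roughly 500 KB on a single line, and one with the same content broken into short lines. The file names and the 80-column width are arbitrary choices.

```shell
#!/bin/sh
# Work in a scratch directory (remove it yourself when done).
dir=$(mktemp -d)

# ~500 KB of "x" with no newline character at all: one gigantic line.
head -c 500000 /dev/zero | tr '\0' 'x' > "$dir/oneline.txt"

# The same content, wrapped so that no line is longer than 80 characters.
fold -w 80 "$dir/oneline.txt" > "$dir/wrapped.txt"

# Compare line and byte counts of the two files.
wc -lc "$dir/oneline.txt" "$dir/wrapped.txt"
```

Opening each file in vim should then show whether line length alone is enough to cause the stall.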

egrep not behaving as expected when looking for one of three desired values

I'm writing a function that is supposed to automatically delete directories whose names meet a few prerequisites. One of these prerequisites is that directories with a datestamp of today, yesterday or the day before yesterday are not deleted even if they otherwise fulfill the conditions. To that end, I fill three variables with the datestamps of today, yesterday and the day before that and plan to use them with "egrep -v" to exclude them from the for loop that is going to delete the directories.
The directory I am using as a testing directory contains the following files:
FFFA72U_20160513
FFFF11F_20160404
FFFF12F
FFFF13F
FFFF17F
FFFF21F_20130230
FFFF99F_20160511
I've tried a lot of different combinations, but I can't seem to get the egrep part right. My code currently looks like this:
currentDate=`date +%Y%m%d`
yesterday=`date --date yesterday +%Y%m%d`
bYesterday=`date --date="2 days ago" +%Y%m%d`
for i in `ls ./*targetdir* | egrep -i "^[A-Z]{4}[0-9]{2}[A-Z]_[0-9]{8}$" | egrep -iv "(${currenDate}|${yesterday}|${bYesterday})$"`
do
*actions here*
done
When the above executes, I expect it to return the two files that have the 20160404 and the 20130230 datestamps, but I get no matches whatsoever.
Removing the double-quotes around the egrep string gives me an error that the ( is unexpected, so that does not help. Replacing the double-quotes with single quotes also generates no output.
When I prefix both parentheses with a / or a \, it returns all four directories with a datestamp while I expect it to exclude the ones with a May 2016 datestamp.
I've tried many more small tweaks (e.g., escaping the pipes) that I can't perfectly recall/repeat here, but it boils down to the fact that I have no clue whatsoever why it is not generating the desired output.
At this point I'm a bit flabbergasted by it all and I'd really appreciate any pointers because even after all my attempts and reading several topics on this matter I don't really see a simple way to get the script to do what I want.
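Typo: you wrote ${currenDate} instead of ${currentDate}. The misspelled variable is unset and expands to nothing, which leaves an empty alternative in the pattern. An empty alternative matches every line, so the inverted match (-v) returns none.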
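The effect of the typo can be reproduced with a small sketch. The fixed dates below are assumed stand-ins for the three date calls, and grep -E is used in place of egrep; the behaviour shown for the empty alternative is that of GNU grep.

```shell
#!/bin/sh
# Directory names taken from the question.
names='FFFA72U_20160513
FFFF11F_20160404
FFFF21F_20130230
FFFF99F_20160511'

# Assumed values standing in for the output of the date commands.
currentDate=20160513
yesterday=20160512
bYesterday=20160511

# Misspelled variable: ${currenDate} is unset, so the pattern becomes
# "(|20160512|20160511)$". The empty branch matches every line, and -v
# then discards every line.
broken=$(printf '%s\n' "$names" |
  grep -Ev "(${currenDate}|${yesterday}|${bYesterday})$")

# Correct spelling: only the three recent datestamps are excluded.
fixed=$(printf '%s\n' "$names" |
  grep -Ev "(${currentDate}|${yesterday}|${bYesterday})$")

printf 'broken: [%s]\n' "$broken"
printf 'fixed:\n%s\n' "$fixed"
```

Running the script with set -u would make the shell stop at the unset variable instead of silently expanding it to nothing.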

What is the difference between the two sed commands below?

Information about the environment I am working in:
$ uname -a
AIX prd231 1 6 00C6B1F74C00
$ oslevel -s
6100-03-10-1119
Code Block A
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \
/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleA.txt
Code Block B
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \n/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleB.txt
I have two code blocks (shown above) that make use of sed to trim the input down to 6 digits, but one command is behaving differently than I expected.
Sample of input for the two code blocks
Mar 25 14:06:16 prd231 ajbtux[33423660]: 20160325140616:~schd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT:~schdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341~ SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210~
I get the following output when the sample input above goes through the two code blocks.
cycleA.txt content
389210
cycleB.txt content
25140616231334236602016032514061610200705008341389210
I understand that my last piped sed command (sed 's/[^0-9]*//g') is deleting all characters other than numbers, so I omitted it from the code blocks and placed the output in two additional files. I get the following output.
cycleA1.txt content
SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210
cycleB1.txt content
Mar 25 15:27:58 prd231 ajbtux[33423660]: 20160325152758: nschd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT: nschdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341 n SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210 n
I can see that the first code block is removing everything other than (SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210) and is using the tilde, but the second code block is just replacing the tildes with the character n. I can also see that the first code block needs a line break after this (sed 's/[~]/ ), and that is why I thought having \n would simulate a line break, but that is not the case. I think my different output results are because of the way regular expressions are being used. I have tried to look into regular expressions and searched about them on stackoverflow but did not find what I was looking for. Could someone explain how I can achieve the same result from code block B as code block A without having part of my code be on a second line?
Thank you in advance
This is an example of the XY problem (http://xyproblem.info/). You're asking for help to implement something that is the wrong solution to your problem. Why are you changing ~s to newlines, etc when all you need given your posted sample input and expected output is:
$ sed -n 's/.*schdCycCleanup.* \([0-9]*\).*/\1/p' file
389210
or:
$ awk -F'[ ~]' '/schdCycCleanup/{print $(NF-1)}' file
389210
If that's not all you need then please edit your question to clarify your requirements for WHAT you are trying to do (as opposed to HOW you are trying to do it) as your current approach is just wrong.
Etan Reisner's helpful answer explains the problem and offers a single-line solution based on an ANSI C-quoted string ($'...'), which is appropriate, given that you originally tagged your question bash.
(Ed Morton's helpful answer shows you how to bypass your problem altogether with a different approach that is both simpler and more efficient.)
However, it sounds like your shell is actually something different - presumably ksh88, an older version of the Korn shell that is the default sh on AIX 6.1 - in which such strings are not supported[1]
(ANSI C-quoted strings were introduced in ksh93, and are also supported not only in bash, but in zsh as well).
Thus, you have the following options:
1. With your current shell, you must stick with a two-line solution that contains an (\-escaped) actual newline, as in your code block A. Note that $(printf '\n') to create a newline does not work, because command substitutions invariably trim all trailing newlines, resulting in the empty string in this case.
2. Use a more modern shell that supports ANSI C-quoted strings, and use Etan's answer. http://www.ibm.com/support/knowledgecenter/ssw_aix_61/com.ibm.aix.cmds3/ksh.htm tells me that ksh93 is available as an alternative shell on AIX 6.1, as /usr/bin/ksh93.
3. If feasible: install GNU sed, which natively understands escape sequences such as \n in replacement strings.
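The point about $(printf '\n') can be checked directly. The sketch below also shows one commonly used workaround that is not part of the answer above: append a guard character inside the command substitution and strip it afterwards. It is written with POSIX-style constructs only, but it has not been tried on ksh88, so treat that part as an assumption.

```shell
#!/bin/sh
# Command substitution strips all trailing newlines, so this is empty.
empty=$(printf '\n')

# A guard character after the newline protects it; the guard is then removed.
nl=$(printf '\nx')
nl=${nl%x}

printf 'length of empty: %s\n' "${#empty}"
printf 'length of nl:    %s\n' "${#nl}"

# "\\$nl" expands to a backslash followed by a real newline, which is the
# same thing sed receives in the two-line form of code block A.
result=$(printf 'foo~bar~baz\n' | sed "s/[~]/\\$nl/g")
printf '%s\n' "$result"
```

This keeps the sed command on one source line while still handing sed an escaped literal newline.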
[1] As for what actually happens when you try echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g' in a POSIX-like shell that does not support $'...': the $ is left as-is, because what follows is not a valid variable name, and sed ends up seeing literal $s/[~]/\\\n/g, where the $ is interpreted as a context address applying to the last input line - which doesn't make a difference here, because there is only one line. \\ is interpreted as plain \, and \n as plain n, effectively replacing ~ instances with literal \n sequences.
GNU sed handles \n in the replacement the way you expect.
OS X (and presumably BSD) sed does not. It treats it as a normal escaped character and just unescapes it to n. (Though I don't see this in the manual anywhere at the moment.)
You can use $'' quoting to use \n as a literal newline if you want though.
echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g'

Change WiFi WPA2 passkey from a script

I'm using Raspbian Wheezy, but this is not a Raspberry Pi specific question.
I am developing a C application, which allows the user to change their WiFi Password.
I did not find a ready script/command for this, so I'm trying to use sed.
I pass the SSID name and new key to a bash script, and the key is replaced for that SSID's block within /etc/wpa_supplicant/wpa_supplicant.conf.
My application runs as root.
A sample block is shown below.
network={
ssid="MY_SSID"
scan_ssid=1
psk="my_ssid_psk"
}
So far I've tried the following (I've copied wpa_supplicant.conf to wpa.txt for testing):
(1) This tries to do the replacement within a range that starts when my SSID is detected and ends at the closing brace followed by a newline.
SSID="TRIMURTI"
PSK="12345678"
sed -n "1 !H;1 h;$ {x;/ssid=\"${SSID}\"/,/}\n/ s/[[:space:]]*psk=.*\n/\n psk=\"${PSK}\"\n/p;}" wpa.txt
and
(2) This tries to 'remember' the matched pattern, and reproduce it in the output, but with the new key.
SSID="TRIMURTI"
PSK="12345678"
sed -n "1 !H; 1 h;$ {x;s/\(ssid=\"${SSID}\".*psk=\).*\n/\1\"${PSK}\"/p;}" wpa.txt
I have used hold & pattern buffers as the pattern can span multiple lines.
Above, the first example seems to ignore the range & replaces the 1st instance, and then truncates the rest of the file.
The second example replaces the last found psk value & truncates the file thereafter.
So I need help in correcting the above code, or trying a different solution.
If we can assume the fields will always be in a strict order where the ssid= goes before psk=, all you really need is
sed "/^[[:space:]]*ssid=\"$SSID\"[[:space:]]*$/,/}/s/^\([[:space:]]*psk=\"\)[^\"]*/\1$PSK/" wpa.txt
This is fairly brittle, though. If the input is malformed, or if the ssid goes after the psk in your block, it will break. The proper solution (which however is severe overkill in this case) is to have a proper parser for the input format; while that is in theory possible in sed, it would be much simpler if you were to switch to a higher-level language like Python or Perl, or even Awk.
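As a rough illustration of the Awk route, here is a minimal sketch rather than a real parser. It buffers each network block and rewrites the psk line only when the block's ssid matches, so the order of the two fields no longer matters. The helper name update_psk and the environment variables SSID_ARG and PSK_ARG are made up for this example, and it assumes every block starts with network={ and ends with a line beginning with }.

```shell
#!/bin/sh
# Hypothetical helper: update_psk SSID NEW_PSK CONFIG_FILE
# Prints the rewritten config to stdout; the input file is not modified.
update_psk() {
  SSID_ARG=$1 PSK_ARG=$2 awk '
    # Print the buffered block, replacing the psk line if the ssid matched.
    function emit_block(   i) {
      for (i = 1; i <= n; i++) {
        if (hit && buf[i] ~ /^[ \t]*psk=/) {
          match(buf[i], /^[ \t]*/)   # keep the original indentation
          buf[i] = substr(buf[i], 1, RLENGTH) "psk=\"" ENVIRON["PSK_ARG"] "\""
        }
        print buf[i]
      }
      n = 0; hit = 0; inblock = 0
    }
    /^[ \t]*network=[{]/ { inblock = 1 }
    inblock {
      buf[++n] = $0
      if ($0 ~ /^[ \t]*ssid=/ && index($0, "ssid=\"" ENVIRON["SSID_ARG"] "\""))
        hit = 1
      if ($0 ~ /^[ \t]*[}]/) emit_block()
      next
    }
    { print }                        # lines outside any block pass through
    END { if (n) emit_block() }      # flush an unterminated block unchanged in order
  ' "$3"
}

# Made-up sample config: the first block has psk before ssid on purpose.
conf=$(mktemp)
cat > "$conf" <<'EOF'
network={
    psk="old_home_key"
    ssid="HOME"
}
network={
    ssid="OTHER"
    psk="old_other_key"
}
EOF

updated=$(update_psk HOME 'new&key' "$conf")
printf '%s\n' "$updated"
rm -f "$conf"
```

The values are passed through the environment and spliced in by string concatenation so that characters such as & or \ in a passkey are not treated specially, which they would be in a sed or sub() replacement.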
The most practical way to update a password or other value in the configuration is to use wpa_cli. E.g.:
wpa_cli -i "wlan0" set_network "0" psk "\"Some5Strong1Pass"\"
wpa_cli -i "wlan0" save_config
The save_config command is required to write the change to the configuration file: /etc/wpa_supplicant/wpa_supplicant.conf

Bash quote behavior and sed

I wrote a short bash script that is supposed to strip the leading tabs/spaces from a string:
#!/bin/bash
RGX='s/^[ \t]*//'
SED="sed '$RGX'"
echo " string" | $SED
It works from the command line, but the script gets this error:
sed: -e expression #1, char 1: unknown command: `''
My guess is that something is wrong with the quotes, but I'm not sure what.
Putting commands into variables and getting them back out intact is hard, because quoting doesn't work the way you expect (see BashFAQ #050, "I'm trying to put a command in a variable, but the complex cases always fail!"). There are several ways to deal with this:
1) Don't do it unless you really need to. Seriously, unless you have a good reason to put your command in a variable first, just execute it and don't deal with this messiness.
2) Don't use eval unless you really really really need to. eval has a well-deserved reputation as a source of nasty and obscure bugs. They can be avoided if you understand them well enough and take the necessary precautions to avert them, but this should really be a last resort.
3) If you really must define a command at one point and use it later, either define it as a function or an array. Here's how to do it with a function:
RGX='s/^[ \t]*//'
SEDCMD() { sed "$RGX"; }
echo " string" | SEDCMD
Here's the array version:
RGX='s/^[ \t]*//'
SEDCMD=(sed "$RGX")
echo " string" | "${SEDCMD[@]}"
The idiom "${SEDCMD[@]}" lets you expand an array, keeping each element a separate word, without any of the problems you're having.
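To make the failure visible, the following sketch counts the words the shell produces from the string variable in the question. It uses the positional parameters as a plain-sh stand-in for the array, which is an assumption made for portability; the variable names split_count, first_arg and kept_count are invented for the example.

```shell
#!/bin/sh
RGX='s/^[ \t]*//'
SED="sed '$RGX'"

set -f           # turn off globbing so only word splitting is demonstrated
set -- $SED      # unquoted expansion, as in: echo " string" | $SED
set +f
split_count=$#
first_arg=$2     # the first argument sed itself would receive
printf 'unquoted string variable: %s words\n' "$split_count"
printf '<%s>\n' "$@"

# Storing the command as separate words keeps the sed script in one piece.
set -- sed "$RGX"
kept_count=$#
printf 'separate words: %s words\n' "$kept_count"
printf '<%s>\n' "$@"
```

The quote characters stored inside the variable are ordinary data after expansion, and the space inside the bracket expression splits the sed script in two, which is why sed complains about an unknown command at character 1.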
It does. Try:
#!/bin/bash
RGX='s/^[ \t]*//'
#SED='$RGX'
echo " string" | sed "$RGX"
This works.
The issue you have is with quotes and spaces. Double quoted strings are passed as single arguments.
Add set -x to your script. You'll see that variables within a single-quote mark are not expanded.
To expand on my comment above:
#!/bin/bash
RGX='s/^[[:space:]]+//'
SED="sed -r '$RGX'"
eval "printf \" \tstring\n\" | $SED"
Note that this also makes your regex an extended one, for no particular reason. :-)