Recreate output of tail -n to text files

Recreate output of tail -n to text files - regex

I had a bunch of bash scripts in a directory that I "backed up" doing $ tail -n +1 -- *.sh
The output of that tail is something like:
==> do_stuff.sh <==
#! /bin/bash
cd ~/my_dir
source ~/my_dir/bin/activate
python scripts/do_stuff.py
==> do_more_stuff.sh <==
#! /bin/bash
cd ~/my_dir
python scripts/do_more_stuff.py
These are all fairly simple scripts with 2-10 lines.
Given the output of that tail, I want to recreate all of the above files with the same content.
That is, I'm looking for a command that can ingest the above text and create do_stuff.sh and do_more_stuff.sh with the appropriate content.
This is more of a one-off task, so I don't really need anything robust, and I believe there are no big edge cases given that the files are simple (e.g. none of the files actually contain ==> in them).
I started by trying to come up with a matching regex, and it will probably look something like (==>.*\.sh <==)(.*)(==>.*\.sh <==), but I'm stuck on actually getting it to capture the filename and the content and write them to a file.
Any ideas?

Assuming your backup file is named backup.txt, on Windows (cmd.exe quoting):
perl -ne "if (/==> (\S+) <==/){open OUT,'>',$1;next}print OUT $_" backup.txt
On *nix, swap the quotes so that the shell does not expand $1 and $_:
perl -ne 'if (/==> (\S+) <==/){open OUT,">",$1;next}print OUT $_' backup.txt

#!/bin/bash
# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal
while IFS= read -r line; do
    if [[ $line =~ ^==\>[[:space:]](.*)[[:space:]]\<==$ ]]; then
        out="${BASH_REMATCH[1]}"
        continue
    fi
    printf "%s\n" "$line" >> "$out"
done < backup.txt
Drawback: tail prints a blank separator line before every header except the first, so each created file except the last one ends with an extra blank line.
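An awk sketch of the same idea that also drops that extra blank line. It is only a sketch under assumptions: the backup is named backup.txt, header lines look exactly like ==> name <==, and the sample content written by printf is made up for the demo. Blank lines are held back and written only if more content follows in the same file, so genuine trailing blank lines are lost too, and a file that was empty is not recreated.

```shell
# Work in a scratch directory with a made-up two-file backup.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' \
    '==> do_stuff.sh <==' '#! /bin/bash' 'cd ~/my_dir' 'python scripts/do_stuff.py' '' \
    '==> do_more_stuff.sh <==' '#! /bin/bash' 'python scripts/do_more_stuff.py' > backup.txt

awk '
/^==> .* <==$/ {                 # header line: switch to a new output file
    if (out != "") close(out)
    out = substr($0, 5, length($0) - 8)   # strip "==> " and " <=="
    blank = 0                    # discard blank lines held back at the end of the previous file
    next
}
out == "" { next }               # ignore anything before the first header
/^$/ { blank++; next }           # hold back blank lines for now
{
    while (blank > 0) {          # blank lines inside a file are kept
        print "" > out
        blank--
    }
    print > out
}
' backup.txt

ls
```

Each output file is truncated the first time it is written, so running the command twice does not duplicate content the way >> would.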

Related

SED: How to search for word "tokens" on consecutive lines (Windows)?

I have EDI files I need to find, by using SED to search for some anomalies.
The anomaly is when I search for a "token" called SGP, and where they are on multiple consecutive lines — so one SGP on one line and another SGP on another line — regardless of what's after the token:
SGP+SEGU1037087'
SGP+DFSU1143210'
SGP+SEGU1166926'
SGP+TGHU1203545'
But I don't want to find files where there are other segment lines between each SGP line:
SGP+TGHU1643436'
GID+2+3:BAG'
FTX+AAA+++sdfjkhsdfjkhsdfjkh'
MEA+AAE+AAB+KGM:20000.0000'
MEA+AAE+AAW+MTQ:.0000'
SGP+HCIU2090577'
So I've tried this:
sed 'SGP.*\n.*SGP' < *.txt
And as probably expected, I get nothing.
Any ideas on how to feed into SED a list of files in DOS, and get a list of files that meet the above criteria?
UPDATE
I think I have the "feed the files" bit here. But I am still stuck on how to use SED properly.
for i in *.txt; do
sed -i '<<WHAT DO I PLACE HERE?>>' $i
done
UPDATE 2
Please no Unix/Bash/etc solutions.. I am in Windows only! Thank you
UPDATE 3
Tried a DOS equivalent of @tshiono's answer but I get nothing.
for %%f in (*.txt) do (
sed -ne ':l;N;$!b l;/SGP[^\n]\+\nSGP/p' %%f
}
UPDATE 4
@tshiono - I want the script to find files that have this pattern...
SGP+SEGU1037087'
SGP+DFSU1143210'
SGP+SEGU1166926'
SGP+TGHU1203545'
Not this pattern ...
SGP+SEGU1037087'
FTT+asdjkfhsdkf hsdjkfh sdfjkh sdf
FTX+f sdfjsdfkljsdkfljsdklfj
GID+sdfjkhsdjkfhsdjkfsdf
SGP+DFSU1143210'
FTT+asdjkfhsdkf hsdjkfh sdfjkh sdf
FTX+f sdfjsdfkljsdkfljsdklfj
GID+sdfjkhsdjkfhsdjkfsdf
SGP+SEGU1166926'
FTT+asdjkfhsdkf hsdjkfh sdfjkh sdf
FTX+f sdfjsdfkljsdkfljsdklfj
GID+sdfjkhsdjkfhsdjkfsdf
SGP+TGHU1203545'
Again - only lines with SGP as a token on every NEWLINE

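Could you please try the following? It reports each file in which every line starts with SGP.
awk '
FNR==1{                      # first line of each input file
    if(count){
        if(fnr==count){      # every line of the previous file matched
            print prev_file " has all lines of SGP."
        }
    }
    prev_file=FILENAME
    count=fnr=""
}
/^SGP/{
    ++count                  # lines starting with SGP
}
{
    fnr++                    # all lines in the current file
}
END{                         # check the last file
    if(fnr==count){
        print prev_file " has all lines of SGP."
    }
}
' *.txt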

The requirement is to detect which files contain consecutive lines that both start with SGP.
Using standard (POSIX) sed, there's no way to get sed to print the file name. You can use this combination of shell script and sed, though, to detect which files contain consecutive lines starting with SGP:
for file in *.txt;
do
if [ -n "$(sed -n -e '/^SGP/{N;/^SGP.*\nSGP/{p;q;}}' "$file")" ]
then echo "$file"
fi
done
The shell test [ … ] checks whether the output of $(sed …) is a non-empty string, and reports the name of the file if it is. Note that the script is more flexible if, instead of using the glob *.txt, it uses "$@" (the list of arguments, preserving spaces etc.). You can then write:
sh find-consecutive-SGP.sh *.txt
or use other more fanciful ways of specifying the file names as arguments.
The sed command doesn't print by default (-n). It looks for a line starting SGP and appends the next line into the 'pattern space'. It then looks to see if the result has two lots of SGP in it; one at the start (we know that will be there) and one after a newline. If that's found, it prints both lines (the pattern space) and quits because its job is done; it has found two consecutive lines both starting SGP. If the pattern space doesn't match, it is not printed (because of the -n) and more data is read. Any lines that don't start SGP are ignored and not printed.
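To see the sed command in action, here is a small self-contained run. The two sample files and their names (consecutive.txt, separated.txt) are made up for the demo; the sed script itself is the one from the loop above.

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' "SGP+SEGU1037087'" "SGP+DFSU1143210'" > consecutive.txt
printf '%s\n' "SGP+TGHU1643436'" "GID+2+3:BAG'" "SGP+HCIU2090577'" > separated.txt

# Collect the names of files that contain two consecutive SGP lines.
matches=$(
    for file in *.txt; do
        if [ -n "$(sed -n -e '/^SGP/{N;/^SGP.*\nSGP/{p;q;}}' "$file")" ]; then
            echo "$file"
        fi
    done
)
echo "$matches"
```

Only consecutive.txt is reported: in separated.txt every SGP line is followed by a line that does not start with SGP (or by end of file), so sed never prints anything for it.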
With GNU sed, the F command prints the file name and a newline, so you could use:
for file in *.txt;
do
sed -n -e '/^SGP/{N;/^SGP.*\nSGP/{F;q;}}' "$file"
done
AFAICT from the GNU sed manual, there's no way to 'skip to the start of the next file' so you have to test each file separately as shown, rather than trying sed -n -e '…' *.txt — that will only report the first file that breaches the condition, not all the files.

If your objective is to get the list of filenames which meet the criteria, how about:
for i in *.txt; do
[[ -n $(sed -ne ':l;N;$!b l;/SGP[^\n]\+\nSGP/p' "$i") ]] && echo "$i"
done
The sed commands :l;N;$!b l form a loop that slurps the whole file into the pattern space, newlines included.
The pattern then matches two consecutive lines that both contain SGP.
If the sed output is non-empty, the loop prints the current filename.
[Update]
If you need to run this on Windows (cmd.exe batch file), please try this instead:
setlocal EnableDelayedExpansion
for %%f in (text*.txt) do (
set result=
for /f "usebackq tokens=*" %%a in (`sed.exe -ne ":l;N;$!b l;/SGP.\+\nSGP.\+/p" %%f`) do set result=!result!%%a
if "!result!" neq "" (
echo %%f
)
)
I've tested with Windows10 and sed-4.2.1.

Substring removal in bash

I'm currently trying to get into bash regular expressions to change multiple filenames at the same time. Here are the file names:
a_001_D_xy_S37_L003_R1_001.txt
a_001_D_xy_S37_L003_R2_001.txt
a_002_D_xy_S37_L006_R1_001.txt
a_002_D_xy_S37_L006_R2_001.txt
a_003_D_xy_S23_L003_R1_001.txt
a_003_D_xy_S23_L003_R2_001.txt
I want this as my result:
a_002_D_xy_R1.txt
a_002_D_xy_R2.txt
...
I only want to change those with *001.txt at the end. First I want to remove the _S.._L00. in the filenames and the 001 in the end. I split this procedure in two parts:
for file in *001.txt;
do
echo ${file#_S.._L..6}
done
This loop already does not work. As a second alternative I tried:
for file in *001.fastq.gz;
do
echo ${file/_S.._L00./}
done
but the filenames are again unchanged. (I just use echo here to see the results. If it works I will replace it with mv ${file} ${regularexpression})
Thanks for help!

Considering that you need lots of different fields it is possibly better to just split the filename and then reconstruct it as you wish.
I suggest using an array built by splitting the original filename with _. Then you just reconstruct the new name by using the fields that you wish.
for file in *001.txt; do
    echo "FILE: $file"
    # Split the name on "_" into an array
    IFS='_' read -r -a fileFields <<< "$file"
    echo "FILE FIELDS: "
    for index in "${!fileFields[@]}"; do
        echo "- $index ${fileFields[index]}"
    done
    fileName="${fileFields[0]}_${fileFields[1]}_${fileFields[2]}_${fileFields[3]}_${fileFields[-2]}.txt"
    echo "NEW FILE NAME: $fileName"
    # mv -- "$file" "$fileName"
done
The echo commands are just for debugging; you can remove them all once you understand the code.
However, if you really need to split the string using BASH expressions you can check this post:
Extracting part of a string to a variable in bash or take a look at this BASH cheat sheet.
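For reference, the two attempts in the question fail because shell parameter expansion uses glob patterns, not regular expressions: a single arbitrary character is ?, while . only matches a literal dot. In addition, ${file#pattern} only strips a match at the very start of the string. A minimal sketch with plain POSIX expansions follows; the helper name new_name is made up for illustration, and the file names are the ones from the question.

```shell
# new_name is a hypothetical helper: it prints the shortened name for one file.
new_name() {
    file=$1
    prefix=${file%%_S??_L00?_*}      # everything before _Sxx_L00x_
    rest=${file#*_S??_L00?_}         # everything after  _Sxx_L00x_
    printf '%s\n' "${prefix}_${rest%_001.txt}.txt"
}

for file in a_001_D_xy_S37_L003_R1_001.txt a_002_D_xy_S37_L006_R2_001.txt; do
    echo "$file -> $(new_name "$file")"
    # mv -- "$file" "$(new_name "$file")"   # uncomment once the output looks right
done
```

In bash, the substitution form from the question should also work once the dots are replaced by question marks, that is ${file/_S??_L00?/}.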

Try wrapping this in a function; you first have to determine the number (n) of files.
n=$(ls *_001.txt | wc -l)
functionRename(){
for(( i=1; i <=n; i++))
do
file=$(ls *_001.txt | head -n $i | tail -n 1)
mv "${file}" "${file%_S??_*}${file#???????????????????}"
file2=$(ls *_001.txt | head -n $i | tail -n 1)
mv "${file2}" "${file2%_001*}.txt"
done
}
functionRename

Using a variable in sed search pattern when the value of the variable contains square brackets

What I'm trying to do is check that a file has been created. The best way I can think of to do this is to list the files beforehand, list them afterwards, delete the before list from the after list, and then see whether the after list is non-empty. I ran into trouble deleting the before list from the after list: filenames with square brackets were not being deleted from the list.
while read -r LINE
do
sed -i -- "/$LINE/d" listfilesafter.swp #without the -- I get 'sed: 1: "listfilesafter.swp": extra characters at the end of l command'
rm listfilesafter.swp--
done < listfilesbefore.swp
If I use '' then the variable doesn't get called, and the -r option on read doesn't seem to make it work like I expected. If anyone has any suggestions on alternative ways of doing this, do contribute, but I would still like to know how to use a variable in the search pattern when the value of the variable contains metacharacters. If anyone can help remove the code smell of "rm listfilesafter.swp--" then that would also be appreciated. Full code below:
cd ~/Desktop
ls >listfilesbefore.swp
#echo "balh blah" >SomeNonZeroFile.txt #comment or uncomment to test the if then statement
ls >listfilesafter.swp
sed -i -- '/listfilesafter.swp/d' listfilesafter.swp #deletes listfilesafter.swp from the list of files create after the event on line 3
while read -r LINE
do
sed -i -- "/$LINE/d" listfilesafter.swp #without the -- I get 'sed: 1: "listfilesafter.swp": extra characters at the end of l command'
rm listfilesafter.swp--
done < listfilesbefore.swp
cat listfilesafter.swp
echo "check listfiles. Enter to continue."
read dummy_variable
if [ -s listfilesafter.swp ]
then
rm listfilesbefore.swp
rm listfilesafter.swp
echo "success, the file was created"
else
rm listfilesbefore.swp
rm listfilesafter.swp
echo "failure, the file was not created"
fi

Given that you have two lists of files in sorted order (since ls lists the files in sorted order), you should probably be using a command like diff or, in this case, comm to find the differences between the two lists of files.
If you want to know which file(s) were created, then that's the list of files (lines) in the second file that are not in the first. With no options, comm lists the lines it reads in 3 columns:
lines in the first file not in the second
lines in the second file not in the first
lines in both files
You only need the lines (file names) in the second column, and therefore you want to suppress the list of files in the first and third columns, so you'll use comm -13 to do that:
before=$(mktemp ${TMPDIR:-/tmp}/files.XXXXXX)
after=$(mktemp ${TMPDIR:-/tmp}/files.XXXXXX)
trap "rm -f $before $after; exit 1" 0 1 2 3 13 15
ls > $before
…execute command that creates file(s)…
ls > $after
comm -13 $before $after
rm -f $before $after
trap 0
Obviously, you could capture the list of files from comm in a variable for further analysis, etc.
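A small self-contained run of the comm -13 idea, capturing the result in a variable. The scratch directories and file names are invented for the demo, and LC_ALL=C is set as an assumption so that ls and comm agree on the sort order.

```shell
export LC_ALL=C                      # same collation for ls and comm
work=$(mktemp -d)                    # directory being watched (hypothetical)
lists=$(mktemp -d)                   # keep the lists outside the watched directory
cd "$work" || exit 1

touch 'a.txt' 'b[1].txt'
ls > "$lists/before"

touch 'new[123]file.txt'             # stands in for the command that creates a file
ls > "$lists/after"

# Column 2 of comm: lines only in the second list, i.e. newly created files.
created=$(comm -13 "$lists/before" "$lists/after")

if [ -n "$created" ]; then
    echo "success, created: $created"
else
    echo "failure, no file was created"
fi

cd / && rm -rf "$work" "$lists"
```

Because comm compares whole lines literally, square brackets and other metacharacters in the names need no escaping.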
Making sed work when the search strings contain metacharacters
I'm still confused about sed. How do I use a variable in the search pattern of sed if the value contains metacharacters? Or in this case would I be better off using something other than sed?
In the scenario you have, you're far better off not using sed, and in any case your technique is horrendously slow if there are hundreds or thousands of files in the directory (running sed once per file name is not going to be fast).
However, supposing that it was necessary to use sed and that you wanted to deal with metacharacters in the file names in the list, then you would have to escape the metacharacters (with a backslash in front). I'd probably do something like this:
sed 's/[][\/*.]/\\&/g; s%.*%/^&$/d%' listfilesbefore.swp > script.sed
sed -f script.sed listfilesafter.swp
The first script takes any metacharacters in the line (file name) and replaces it with backslash-metacharacter. In the first substitute, the [][\/*.] character class matches square brackets, two types of slashes, stars and dots. Depending on the predilections of the variant of sed you're using, you might need to protect (){} with backslashes too, but in POSIX standard sed, the {} gain metacharacter meaning when prefixed with a backslash, so they're not modified by default. The second substitute takes the possibly modified line and converts it into a 'match and delete' command. The output, therefore, is a sed script that will delete the file names found in listfilesbefore.swp. The second command applies that script to listfilesafter.swp, doing in one sed command what your outline code does with one run of sed per file name.
Using sed to generate a sed script is a powerful technique. It isn't always appropriate, but when it is, it is very useful.
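A reduced run of the two sed commands may make the effect of the escaping easier to see. The list contents are made up for the demo; note b1.txt, which an unescaped /b[1].txt/ pattern would wrongly match.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical before/after listings.
printf '%s\n' 'a.txt' 'b[1].txt' 'c*.txt' > listfilesbefore.swp
printf '%s\n' 'a.txt' 'b1.txt' 'b[1].txt' 'c*.txt' 'new[123]file.txt' > listfilesafter.swp

# Step 1: escape metacharacters, then wrap each name as /^name$/d.
sed 's/[][\/*.]/\\&/g; s%.*%/^&$/d%' listfilesbefore.swp > script.sed
cat script.sed

# Step 2: apply the generated script to the second list.
remaining=$(sed -f script.sed listfilesafter.swp)
printf '%s\n' "$remaining"
```

Anchoring each generated pattern with ^ and $ makes it match whole names only, so b1.txt and new[123]file.txt survive while the three names from the first list are deleted.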
Shell script demo.sh
echo "Pre-populate the directory with some random file names"
for file in $(random -n 20 -T '%W%V%C-%w%v%c%v%c-%04[0000:9999]d.txt')
do
cp /dev/null $file
done
for template in '%w%v%w(%03[000:999]d)%w%v%w.txt' \
'%w%v%w[123]%w%v%we.txt' \
'%w%v%wfile*%03[0:999]d*.txt' \
'%w%v%w%v%c\\\%d.txt' \
'%w%v%w-{%04X}-{%04X}.txt'
do
for file in $(random -n 2 -T "$template")
do
cp /dev/null "$file"
done
done
ls > listfilesbefore.swp
ls
echo
echo "Create some new files with metacharacters in the names"
for file in 'new(123)file.txt' 'new[123]file.txt' 'newfile*321*.txt' \
'newfile\\\.txt' 'newfile-{A39F}-{B77D}.txt'
do
cp /dev/null "$file"
done
ls
ls > listfilesafter.swp
echo
echo "Create sed script"
sed 's/[][\/*.]/\\&/g; s%.*%/^&$/d%' listfilesbefore.swp > script.sed
echo
cat script.sed
echo
echo "Apply it"
sed -f script.sed listfilesafter.swp
The random command I'm using is of my own devising, but it is convenient for demonstrations such as this.
Example run
Pre-populate the directory with some random file names
AIG-taral-3486.txt
COV-oipuc-9088.txt
CUG-vowan-5758.txt
FEH-ieqek-0603.txt
IUS-aaduw-7080.txt
KER-jazuc-4824.txt
MIZ-iezec-8255.txt
NIT-kupib-6873.txt
PUX-oocov-2216.txt
QAW-xonod-3937.txt
QES-wawok-4790.txt
RON-difag-1986.txt
SAD-gesug-5706.txt
SAJ-luqoj-4311.txt
TUZ-wapaw-8547.txt
VAL-zutap-8054.txt
YIP-xudeb-7397.txt
YUP-uudiv-8848.txt
ZIB-jurax-2903.txt
ZUR-xonik-8800.txt
aavfile*147*.txt
demo.sh
diman\\\7115.txt
ganur\\\8732.txt
gud-{7049}-{3103}.txt
listfilesbefore.swp
lur[123]maee.txt
rivfile*065*.txt
ueo(417)yea.txt
uoi(751)qio.txt
woi-{37E8}-{009C}.txt
xof[123]hoxe.txt
Create some new files with metacharacters in the names
AIG-taral-3486.txt
COV-oipuc-9088.txt
CUG-vowan-5758.txt
FEH-ieqek-0603.txt
IUS-aaduw-7080.txt
KER-jazuc-4824.txt
MIZ-iezec-8255.txt
NIT-kupib-6873.txt
PUX-oocov-2216.txt
QAW-xonod-3937.txt
QES-wawok-4790.txt
RON-difag-1986.txt
SAD-gesug-5706.txt
SAJ-luqoj-4311.txt
TUZ-wapaw-8547.txt
VAL-zutap-8054.txt
YIP-xudeb-7397.txt
YUP-uudiv-8848.txt
ZIB-jurax-2903.txt
ZUR-xonik-8800.txt
aavfile*147*.txt
demo.sh
diman\\\7115.txt
ganur\\\8732.txt
gud-{7049}-{3103}.txt
listfilesbefore.swp
lur[123]maee.txt
new(123)file.txt
new[123]file.txt
newfile*321*.txt
newfile-{A39F}-{B77D}.txt
newfile\\\.txt
rivfile*065*.txt
ueo(417)yea.txt
uoi(751)qio.txt
woi-{37E8}-{009C}.txt
xof[123]hoxe.txt
Create sed script
/^AIG-taral-3486\.txt$/d
/^COV-oipuc-9088\.txt$/d
/^CUG-vowan-5758\.txt$/d
/^FEH-ieqek-0603\.txt$/d
/^IUS-aaduw-7080\.txt$/d
/^KER-jazuc-4824\.txt$/d
/^MIZ-iezec-8255\.txt$/d
/^NIT-kupib-6873\.txt$/d
/^PUX-oocov-2216\.txt$/d
/^QAW-xonod-3937\.txt$/d
/^QES-wawok-4790\.txt$/d
/^RON-difag-1986\.txt$/d
/^SAD-gesug-5706\.txt$/d
/^SAJ-luqoj-4311\.txt$/d
/^TUZ-wapaw-8547\.txt$/d
/^VAL-zutap-8054\.txt$/d
/^YIP-xudeb-7397\.txt$/d
/^YUP-uudiv-8848\.txt$/d
/^ZIB-jurax-2903\.txt$/d
/^ZUR-xonik-8800\.txt$/d
/^aavfile\*147\*\.txt$/d
/^demo\.sh$/d
/^diman\\\\\\7115\.txt$/d
/^ganur\\\\\\8732\.txt$/d
/^gud-{7049}-{3103}\.txt$/d
/^listfilesbefore\.swp$/d
/^lur\[123\]maee\.txt$/d
/^rivfile\*065\*\.txt$/d
/^ueo(417)yea\.txt$/d
/^uoi(751)qio\.txt$/d
/^woi-{37E8}-{009C}\.txt$/d
/^xof\[123\]hoxe\.txt$/d
Apply it
listfilesafter.swp
new(123)file.txt
new[123]file.txt
newfile*321*.txt
newfile-{A39F}-{B77D}.txt
newfile\\\.txt

Regex match for file and rename + overwrite old file

I'm trying to write a bash script that renames files matching my regex: if a file matches, I want to rename it using the regex and overwrite the old existing file.
I want to do this because I have a file on computer 1 and change it on computer 2. When I later go back to computer 1, the sync reports a conflict and saves both copies.
Example file:
acl_cam.MYI
Example file after conflict:
acl_cam (Example conflit with .... on 2015-08-20).MYI
I tried a lot of things, like rename, mv and a couple of other scripts, but none of them worked.
The regex I think I should use:
(.*)/s\(.*\)\.(.*)
Then rename the file to value1.value2, replacing the old file (acl_cam.MYI), and do this for all files and directories below the starting directory.
Can you help me with this one?

The issue you have, if I understand your question correctly, has two parts: (1) what is the correct regex that will match the error string and produce a filename? and (2) how do you use the returned filename to move or remove the offending file?
If the string at issue is:
acl_cam (Example conflit with .... on 2015-08-20).MYI
and you need to return the MySQL file name, then a regex similar to the following will work:
[ ][(].*[)]
The stream editor sed is about as good as anything else to return the filename from your string. Example:
$ printf "acl_cam (Example conflit with .... on 2015-08-20).MYI\n" | \
sed -e 's/[ ][(].*[)]//'
acl_cam.MYI
(shown with line continuation above)
Then it is up to you how you move or delete the file. The remaining question is where is the information (the error string) currently stored and how do you have access to it? If you have a file full of these errors, then you could do something like the following:
while read -r line; do
victim=$( printf "%s\n" "$line" | sed -e 's/[ ][(].*[)]//' )
## to move the file to /path/to/old
[ -e "$victim" ] && mv "$victim" /path/to/old
done <$myerrorfilename
(you could also feed the string to sed as a here-string, but omitted for simplicity)
You could also just delete the file if that suits your purpose. However, more information is needed to clarify how/where that information is stored and what exactly you want to do with it to provide any more specifics. Let me know if you have further questions.

Final solution for this question for people who are interested:
for i in *; do
    # Wildcard check: does the current file name contain "(Exemplaar"?
    if [[ $i == *"(Exemplaar"* ]]
    then
        # Derive the original name (without the Exemplaar conflict suffix)
        NewFileName=$(echo "$i" | sed -E -e 's/[ ][(].*[)]//')
        # Remove the original file
        rm "$NewFileName";
        # Copy the conflict file to the original file name
        cp -a "$i" "$NewFileName";
        # Delete the conflict file
        rm "$i";
        echo "Removed file: $NewFileName with: $i";
    fi
done
I used this code to replace my database conflict files created by Dropbox sync between different computers.
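The remove, copy and delete steps can probably be collapsed into a single mv -f, which overwrites the target. The following is a hedged sketch, not a drop-in replacement: it runs in a scratch directory, the file contents are made up, and the glob assumes the conflict marker is a parenthesised suffix preceded by a space, as in the example file name from the question.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

# Made-up contents so the overwrite is visible.
echo old > 'acl_cam.MYI'
echo new > 'acl_cam (Example conflit with .... on 2015-08-20).MYI'

for f in *' ('*')'*; do
    [ -e "$f" ] || continue          # the glob matched nothing
    # Strip " (...)" to recover the original name.
    orig=$(printf '%s\n' "$f" | sed -e 's/[ ][(].*[)]//')
    mv -f -- "$f" "$orig"            # overwrite the original with the conflict copy
    echo "Replaced $orig with: $f"
done

cat acl_cam.MYI
```

Quoting "$f" and "$orig" matters here because the conflict names contain spaces and parentheses.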

Regex to remove lines in file(s) that ending with same or defined letters

I need a bash script for Mac OS X that works like this:
./script.sh * folder/to/files/
#
# or #
#
./script.sh xx folder/to/files/
This script should:
read a list of files
open each file and read each line
if a line ends with a repeated letter ('*' mode) or with the given letters ('xx'), remove the line and re-save the file
back up the original file
My first approach:
#!/bin/bash
# ck init params
if [ $# -le 0 ]
then
echo "Usage: $0 <letters>"
exit 0
fi
# list files in current dir
list=`ls BRUTE*`
for i in $list
do
# prepare regex
case $1 in
"*") REGEXP="^.*(.)\1+$";;
*) REGEXP="^.*[$1]$";;
esac
FILE=$i
# backup file
cp $FILE $FILE.bak
# removing line with same letters
sed -Ee "s/$REGEXP//g" -i '' $FILE
cat $FILE | grep -v "^$"
done
exit 0
But it doesn't work as I want.
What's wrong?
How can I fix this script?
Example:
$cat BRUTE02.dat BRUTE03.dat
aa
ab
ac
ad
ee
ef
ff
hhh
$
If I use '*', I want every line that ends with a repeated letter removed from all the files.
If I use 'ff', I want every line that ends with 'ff' removed from all the files.
Ah, it's on Mac OS X. Remember that its sed is a little different from the classic Linux sed.
man sed
sed [-Ealn] command [file ...]
sed [-Ealn] [-e command] [-f command_file] [-i extension] [file ...]
DESCRIPTION
The sed utility reads the specified files, or the standard input if no files are specified, modifying the input as specified by a list of commands. The input is then written to the standard output.
A single command may be specified as the first argument to sed. Multiple commands may be specified by using the -e or -f options. All commands are applied to the input in the order they are specified regardless of their origin.
The following options are available:
-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
-a The files listed as parameters for the ``w'' functions are created (or truncated) before any processing begins, by default. The -a option causes sed to delay opening each file until a command containing the related ``w'' function is applied to a line of input.
-e command
Append the editing commands specified by the command argument to the list of commands.
-f command_file
Append the editing commands found in the file command_file to the list of commands. The editing commands should each be listed on a separate line.
-i extension
Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved. It is not recommended to give a zero-length extension when in-place editing files, as you risk corruption or partial content in situations where disk space is exhausted, etc.
-l Make output line buffered.
-n By default, each line of input is echoed to the standard output after all of the commands have been applied to it. The -n option suppresses this behavior.
The form of a sed command is as follows:
[address[,address]]function[arguments]
Whitespace may be inserted before the first address and the function portions of the command.
Normally, sed cyclically copies a line of input, not including its terminating newline character, into a pattern space, (unless there is something left after a ``D'' function), applies all of the commands with addresses that select that pattern space, copies the pattern space to the standard output, appending a newline, and deletes the pattern space.
Some of the functions use a hold space to save all or part of the pattern space for subsequent retrieval.
Anything else?
Is my problem clear?
Thanks.

I don't know bash shell too well so I can't evaluate what the failure is.
This is just an observation of the regex as understood (this may be wrong).
The * mode regex looks ok:
^.*(.)\1+$ that ended with same letters..
But the literal mode might not do what you think.
current: ^.*[$1]$ that ended with 'literal string'
This shouldn't use a character class.
Change it to: ^.*$1$
Realize, though, that the string in $1 should be escaped before it goes into the regex, in case it contains any regex metacharacters.
Otherwise, do you intend to have a character class?
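Both points (no character class, and escaping the literal) can be sketched in shell with grep -v and a basic regular expression. This is an illustration under assumptions rather than a fix of the original script: the function name remove_lines and the sample file names are invented, and the set of characters being escaped is the one that is special in a BRE.

```shell
# remove_lines MODE FILE  (hypothetical helper)
#   MODE '*'   : drop lines ending in a doubled character
#   MODE other : drop lines ending in that literal string
remove_lines() {
    mode=$1
    file=$2
    case $mode in
        '*') pattern='\(.\)\1$' ;;
        *)   # backslash-escape BRE metacharacters in the literal
             escaped=$(printf '%s\n' "$mode" | sed 's/[][\.*^$]/\\&/g')
             pattern="${escaped}\$" ;;
    esac
    cp "$file" "$file.bak" &&                     # back up the original
        grep -v -- "$pattern" "$file.bak" > "$file"
}

dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' aa ab ac ad ee ef ff hhh > star.dat
printf '%s\n' aa ab ac ad ee ef ff hhh > literal.dat

remove_lines '*' star.dat
remove_lines 'ff' literal.dat

star_result=$(cat star.dat)
literal_result=$(cat literal.dat)
printf 'star mode:\n%s\nliteral mode:\n%s\n' "$star_result" "$literal_result"
```

The mode has to be quoted on the command line ('*'), otherwise the shell expands it to file names before the function ever sees it.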

perl -ne '
BEGIN {$arg = shift; $re = $arg eq "*" ? qr/([[:alpha:]])\1$/ : qr/$arg$/}
/$re/ && next || print
'
Example:
echo "aa
ab
ac
ad
ee
ef
ff" | perl -ne '
BEGIN {$arg = shift; $re = $arg eq "*" ? qr/([[:alpha:]])\1$/ : qr/$arg$/}
/$re/ && next || print
' '*'
produces
ab
ac
ad
ef

A possible issue:
When you put an unquoted * on the command line, the shell replaces it with the names of all the files in your directory, so your $1 will never equal *. Quote it ('*') to pass a literal asterisk.
And some tips:
You can replace:
This:
# list files in current dir
list=`ls BRUTE*`
for i in $list
With:
for i in BRUTE*
And:
This:
cat $FILE | grep -v "^$"
With:
grep -v "^$" $FILE
Besides the possible issue, I can't see anything jumping out at me. What do you mean clean? Can you give an example of what a file should look like before and after and what the command would look like?
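The glob expansion issue is easy to reproduce. In this small sketch the helper first_arg and the scratch directory are made up for the demo; the file names are the ones from the question.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
touch BRUTE02.dat BRUTE03.dat

# first_arg is a hypothetical helper that prints only its first argument.
first_arg() { printf '%s\n' "$1"; }

unquoted=$(first_arg *)      # the shell expands * to the file names first
quoted=$(first_arg '*')      # the quotes pass a literal asterisk

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With the two files present, the unquoted call receives BRUTE02.dat as $1, while the quoted call receives *.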

This is the problem!
grep '\(.\)\1[^\r\n]$' *
on MAC OSX, ( ) { }, etc... must be quoted!!!
Solved, thanks.
