Linux: rename files containing ASCII-Code for capital letters

I have a collection of files where the capital letters are replaced by their ASCII-code (example ;065 for A). How can I most effectively recursively rename them from the command line?
Since I don't want to make the mess worse, I would like to test any command first, but unfortunately I don't know how...
For me it would be no problem to modify the command for each letter.

Many Linux distributions ship some variant or another of the Perl rename script, sometimes as prename, sometimes as rename. Any variant will do, but not the Linux rename utility that isn't written in Perl (run it with no argument and see if the help text mentions perl anywhere). This script runs Perl code on file names, typically a regex replacement.
prename -n 's/;(03[2-9]|0[4-9][0-9]|1[01][0-9]|12[0-6])/chr($1)/eg' *
I made a regular expression that matches three-digit numbers that are the character code of a printable ASCII character. You may need to adjust it depending on exactly what can follow a semicolon. The * at the end says to rename all files in the current directory; it's just a normal shell wildcard. It's OK to include files that don't contain anything to rename: prename will just skip them.
The -n option says to show what would be done, but don't actually rename any file. Review the output. If you're happy with it, run the command again without -n to actually rename the files.
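The question asks for a recursive rename, and the command above only covers the current directory. As a minimal sketch that is not part of the original answer, the same decoding can be done in plain bash. The function names decode_name and rename_tree are made up for illustration. It assumes bash, a find that supports -mindepth and -print0 (GNU and BSD find do), and the same code range 032 to 126 as the regular expression above. Unless the second argument is "apply", it only prints what it would do.

```shell
#!/usr/bin/env bash

# decode_name NAME: print NAME with every ";NNN" (NNN = 032..126)
# replaced by the ASCII character that has this decimal code.
decode_name() {
    local name=$1 suffix='' char
    local re='^(.*);(03[2-9]|0[4-9][0-9]|1[01][0-9]|12[0-6])(.*)$'
    # The greedy (.*) finds the LAST code first, so the name is consumed
    # from right to left and already decoded characters are never rescanned.
    while [[ $name =~ $re ]]; do
        # 10# forces base 10; a bare 065 would be read as octal.
        printf -v char '%b' "\\0$(printf '%o' "$((10#${BASH_REMATCH[2]}))")"
        suffix="${char}${BASH_REMATCH[3]}${suffix}"
        name=${BASH_REMATCH[1]}
    done
    printf '%s\n' "${name}${suffix}"
}

# rename_tree [ROOT] [apply]: walk ROOT depth-first (children before their
# parent directory) and rename every entry whose name contains a code.
rename_tree() {
    local root=${1:-.} mode=${2:-dry-run} path dir base new
    find "$root" -mindepth 1 -depth -name '*;[01][0-9][0-9]*' -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}
        base=${path##*/}
        new=$(decode_name "$base")
        # Skip unchanged names and names that would contain a "/" (code 047).
        [[ $new == "$base" || $new == */* ]] && continue
        if [[ $mode == apply ]]; then
            mv -n -- "$path" "$dir/$new"   # -n: never overwrite an existing file
        else
            printf 'would rename: %s -> %s\n' "$path" "$dir/$new"
        fi
    done
}
```

Running rename_tree . prints the planned renames; rename_tree . apply performs them. Trying it on a copy of the directory first is the safest way to test.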

Related

Sed - How to read a file line by line and go the path mentioned in the file then replace string?

I am on a new project where I need to add some strings to all the API names, which are exported
Someone hinted this can be done with simple sed commands.
What is really needed, by example:
My project has, say, 100 files, and many of them contain something like the pattern below:
in file1 it's mentioned at some line : export(xyz);
in file2 its mentioned at some line : export (abc);
What is needed here is to replace the
xyz with xyz_temp and
abc with abc_temp.
Now the problem is these APIs are in different folders and different files.
Fortunately, I learned that the result of the cscope tool can be redirected to a file with the matching patterns.
So I redirected the result of a search for the "export" string to a file, export_api.txt, which looks like this:
/path1/file1.txt export(xyz);
/path2/file2.txt export(abc);
Now, I am not sure how to use sed to automate the following:
Reading this export_api.txt
Reading each line
Replacing the string as above.
Any direction would be highly appreciated.
Thanks in advance.

If you have a list of files which need to be changed and your replacement only needs to append _tmp, then this can be accomplished with a single sed call:
sed -i 's/export(\(abc\|xyz\));/export(\1_tmp);/' files...
-i will modify the files in-place, overwriting them.
If you don't care for what you are going to replace, but append a postfix to all export expressions, match any identifier. Here is one such example:
export(\([^)]*\))
Depending on your expressions and valid identifier names, you might want to or need to change this to one of:
export(\(.*\))
export(\([_a-zA-Z][_a-zA-Z0-9]*\))
export(\([_a-zA-Z"'][_a-zA-Z0-9"']*\))
export(\([_a-zA-Z]*\))
…
Another option would be to only match lines containing "export(" and then replace the closing parenthesis (given that your input lines contain the token ");" only once):
sed -i '/export(/s/);/_tmp);/' files...
# or reusing the complete match:
sed -i '/export(/s/);/_tmp&/' files...
This avoids the backreference and makes the regular expression simpler, because it no longer has to match the variable-length identifier.

You can use the read builtin to parse the line in your export_api.txt file, then call sed on each file. Pattern match the export snippet to choose the correct sed invocation. The way read is invoked here assumes that your path and snippet are delimited by IFS and that path does not contain any whitespace or separators:
while read -r path snippet; do
case "$snippet" in
*abc*) sed -i 's/export(abc);/export(abc_tmp);/' "$path" ;;
*xyz*) sed -i 's/export(xyz);/export(xyz_tmp);/' "$path" ;;
esac
done < export_api.txt
NOTE: this will change/overwrite any of your files. Your files might be left in a broken state.
PS I wonder why you cannot use your IDE to search/replace those occurrences?
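The two answers above can be combined so that the identifiers do not have to be listed one by one. The following is a minimal sketch under stated assumptions, not a tested production script: the function name append_suffix is made up, paths in export_api.txt are assumed to contain no whitespace, identifiers are assumed to look like C identifiers, and the suffix must not contain characters that are special to sed (such as / or &). sed -i.bak keeps a backup copy of every file it edits.

```shell
#!/usr/bin/env bash

# append_suffix LIST [SUFFIX]: for every file named in column 1 of LIST,
# append SUFFIX (default _tmp) to the identifier inside export(...);
append_suffix() {
    local list=$1 suffix=${2:-_tmp} path
    # sort -u: a file listed several times must be edited only once,
    # otherwise the suffix would be appended once per listing.
    awk '{ print $1 }' "$list" | sort -u |
    while IFS= read -r path; do
        [ -f "$path" ] || continue
        # \1 = "export(" with optional spaces, \2 = the identifier
        sed -i.bak \
            "s/\(export *(\)\([_a-zA-Z][_a-zA-Z0-9]*\));/\1\2${suffix});/" \
            "$path"
    done
}
```

Called as append_suffix export_api.txt, this edits each listed file once and leaves file.bak copies behind. Running it a second time would append the suffix again, so it should be run exactly once.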

grep and regex stored in string

my question is quite short:
a="'[0-9]*'"
grep -E '[0-9]*' #for example, line containing 000 will be recognized and printed
but
grep -E $a #line containing 000 WILL NOT be printed, why is that?
Does substitution for grep regex change the command's behaviour or have I missed something from a syntactic point of view? In other words, how do I make it so that grep accepts regex from a string stored in a variable.
Thank you in advance.

Quotes go around data, not in data. That means, when you store data (in this case, a regular expression) in a variable, don't embed quotes in the variable; instead, put double-quotes around the variable when you use it:
a="[0-9]*"
grep -E "$a"
You can sometimes get away with leaving the double-quotes off when using variables (as in Avinash Raj's comment), but it's not generally safe. In this case, it'll work fine provided there are no files or subdirectories in the current working directory with names that happen to start with a digit. You see, without double-quotes around $a, the shell will take its value, try to split it into multiple words (not a problem here), try to expand each word that contains shell wildcards into a list of matching files (potential problem here), and pass that to the command (grep) as its list of arguments. That means that if you happen to have files that start with digits in the current directory, grep thinks you ran a command like this:
grep -E 1file.txt 2file.jpg 3file.etc
... and it treats the first filename as the pattern to search for, and any other filenames as files to be searched. And you'll be scratching your head wondering why your script works or fails depending on which directory you happen to be in.
Note: the pattern [0-9]* is a valid regular expression, and a valid shell glob (wildcard) pattern, but it means very different things in the two contexts. As a regex, it means 0 or more digits in a row. As a shell glob, it means something that starts with a digit. Speaking of which, grep -E '[0-9]*' is not actually going to be very useful, since everything contains strings of 0 or more digits, so it'll match every line of every file you feed it.
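The effect of embedded quotes can be seen in a small experiment. This is only an illustrative sketch: the sample lines and variable names are made up, and it uses [0-9]+ instead of [0-9]* so that the pattern does not match every line.

```shell
#!/usr/bin/env bash

sample=$(mktemp)
printf '%s\n' "id 000" "no digits here" "quoted '42'" > "$sample"

with_quotes="'[0-9]+'"    # the single quotes are part of the pattern
without_quotes="[0-9]+"   # only the regular expression

# "|| true" keeps grep's no-match exit status from aborting under set -e
matches_with=$(grep -E -- "$with_quotes" "$sample" || true)
matches_without=$(grep -E -- "$without_quotes" "$sample" || true)

printf 'pattern %s matched:\n%s\n' "$with_quotes" "$matches_with"
printf 'pattern %s matched:\n%s\n' "$without_quotes" "$matches_without"
rm -f "$sample"
```

With the quotes stored in the variable, grep looks for digits that are literally surrounded by single quotes, so the line containing 000 is not found. Without them, every line that contains a digit is printed.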

How to rename a file using regex capture group in Linux?

I want to rename a_1.0.tgz to b_1.0.tgz, since 1.0 may be changed to any version number, how can I achieve that?
For example, I can use mv a*.tgz b.tgz if I don't need to keep the version number.

zsh comes with the utility zmv, which is intended for exactly that. While zmv does not support regex, it does provide capture groups for filename generation patterns (aka globbing).
First, you might need to enable zmv. This can be done by adding the following to your ~/.zshrc:
autoload -Uz zmv
You can then use it like this:
zmv 'a_(*)' 'b_$1'
This will rename any file matching a_* so that a_ is replaced by b_. If you want to be less general, you can of course adjust the pattern:
to rename only .tgz files:
zmv 'a_(*.tgz)' 'b_$1'
to rename only .tgz files while changing the extension to .tar.gz
zmv 'a_(*).tgz' 'b_$1.tar.gz'
to only rename a_1.0.tgz:
zmv 'a_(1.0.tgz)' 'b_$1'
To be on the safe side, you can run zmv with the option -n first. This will only print what would happen, without actually changing anything. For more information, have a look at man zshcontrib.

I'm not too familiar with zsh so I don't know if it supports regular expressions but I don't think you really need them here.
You can match the file using a glob and use a substitution:
for file in a_[0-9].[0-9].tgz; do
echo "$file" "${file/a/b}"
done
In the glob pattern, [0-9] matches any single digit from 0 to 9. ${file/a/b} substitutes the first occurrence of a with b.
Change the echo to mv if you're happy with the result.

Assuming you would like to replace the first character in all files matching a*.tgz with the letter b:
for f in a*.tgz; do
echo mv "$f" "b${f:1}"
done
Remove the echo when you are certain that this does what you want it to do.
The ${f:1} uses the ${name:offset} parameter expansion. From the zshexpn manual (on OS X):
If offset is non-negative, then if the variable name is a
scalar substitute the contents starting offset characters
from the first character of the string, [...]
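If a real regex capture group is wanted, as in the question title, bash can provide one through [[ ... =~ ... ]] and the BASH_REMATCH array. This is a minimal sketch rather than one of the original answers; the function name rename_versioned is made up, and the version pattern (digits separated by dots) is an assumption about what the version numbers look like.

```shell
#!/usr/bin/env bash

# rename_versioned [apply]: rename a_<version>.tgz to b_<version>.tgz in the
# current directory. Without "apply" it only prints the mv commands.
rename_versioned() {
    local mode=${1:-dry-run} f
    # Capture group 1 is the version, e.g. 1.0 or 2.10.3
    local re='^a_([0-9]+(\.[0-9]+)*)\.tgz$'
    for f in a_*.tgz; do
        # Also skips the literal pattern when no file matches the glob.
        [[ $f =~ $re ]] || continue
        if [[ $mode == apply ]]; then
            mv -n -- "$f" "b_${BASH_REMATCH[1]}.tgz"
        else
            printf 'mv %s %s\n' "$f" "b_${BASH_REMATCH[1]}.tgz"
        fi
    done
}
```

Files such as a_notes.tgz, which match the glob but not the regular expression, are left alone.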

Rename Files Mac Command Line

I have a bunch of files in a directory that were produced with rather unfortunate names. I want to change two of the characters in the name.
For example I have:
>ch:sdsn-sdfs.txt
and I want to remove the ">" and change the ":" to a "_".
Resulting in
ch_sdsn-sdfs.txt
I tried to just say mv \\>ch\:* ch_* but that didn't work.
Is there a simple solution to this?

For a command-line script to do the renaming, this Stack Overflow question has good answers.
On a Mac, the Finder GUI comes with bulk rename capabilities. If the file names share a pattern to find and replace, this comes in very handy.
Select all the files that need to be renamed, right-click and select Rename.
In the rename dialog, enter the find and replace strings.
Other options in the rename dialog can sequence the file names or add a prefix or suffix.

First, I should say that the easiest way to do this is to use the
prename or rename commands.
Homebrew package rename, MacPorts package renameutils :
rename s/0000/000/ F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful,
replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
Therefore, \(.\) matches the first character, which can be referenced by \1.
Then . matches the next character, which is always 0.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original
filename) and \1\2 which is every part of the filename except the 2nd
character, which was a 0.
This is a pretty cryptic way to do this, IMHO. If for
some reason the rename command was not available and you wanted to use
sed to do the rename (or perhaps you were doing something too complex
for rename?), being more explicit in your regex would make it much
more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also it won't keep
sucking characters out of your filename if you accidentally run it
twice or something.
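For the file names in the question itself (a leading ">" to drop and a ":" to turn into "_"), bash parameter expansion is enough and also works with the older bash shipped on macOS. This is a minimal sketch, not taken from the answers above; the function names clean_name and rename_unfortunate are made up, and it replaces every ":" in the name, not just the first one.

```shell
#!/usr/bin/env bash

# clean_name NAME: print NAME without a leading ">" and with ":" as "_".
clean_name() {
    local name=$1
    name=${name#'>'}     # drop one leading ">"
    name=${name//:/_}    # replace every ":" with "_"
    printf '%s\n' "$name"
}

# rename_unfortunate [apply]: handle all files in the current directory whose
# names start with ">". Without "apply" it only prints the mv commands.
rename_unfortunate() {
    local mode=${1:-dry-run} f new
    for f in ./\>*; do
        [ -e "$f" ] || continue          # no match: the glob stays literal
        new=./$(clean_name "${f#./}")
        if [ "$mode" = apply ]; then
            mv -n -- "$f" "$new"
        else
            printf 'mv %s %s\n' "$f" "$new"
        fi
    done
}
```

The ./ prefix keeps mv from ever mistaking an odd file name for an option.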

Replace and add leading zeros when renaming files

Please be patient, this post will be somewhat long...
I have a bunch of files, some of them with a simple and clean name (e.g. 1E01.txt) and some with a lot of extras:
Sample2_Name_E01_-co_032.txt
Sample2_Name_E02_-co_035.txt
...
Sample12_Name_E01_-co_061.txt
and so on. What is important here is the number after "Sample" and the letter+number after "Name"; the rest is disposable. If I get rid of the non-important parts, the filename reduces to the same pattern as the "clean" filenames (2E01.txt, 2E02.txt, ..., 12E01.txt). I've managed to rename the files with the following expression (I came up with this one myself; I don't know if it is very elegant, but it works fine):
rename -v 's/Sample([0-9]+)_Name_([A-Z][0-9]+).*/$1$2\.txt/' *.txt
Now, the second part is adding a leading zero to filenames that start with just one digit, so that 1E01.txt turns into 01E01.txt. I've managed to do this with (found and modified this from another StackExchange post):
rename -v 'unless (/^[0-9]{2}.*\.txt/) {s/^([0-9]{1}.*\.txt)$/0$1/;s/0*([0-9]{2}\..*)/$1/}' *.txt
So I finally got to my question: is there a way to merge both expressions in just one rename command? I know I could do a bash script to automate the process, but what I want is to find a one-pass renaming solution.
thanks

You can try this command to rename 1-file.txt to 0001-file.txt
# fill zeros
$ rename 's/\d+/sprintf("%04d",$&)/e' *.txt
You can change the command a little to meet your need.

Well, if that is your "parsing" regex, then you are limiting the files that the script can act on to those matching that pattern. Thus, a sprintf using the same literal strings is not a more specialized case, and you could just do this:
s{Sample(\d+)_Name_(\p{IsUpper})(\d+)}
{sprintf "Sample%02d_Name_%s%03d", $1, $2, $3}e
;
Here, you are using the same known features again and simply formatting the accompanying numbers.
The /e switch is for 'eval' and it evaluates the replacement as Perl for each match.
I changed some of your character classes to more standard symbols: [A-Z] becomes the property class \p{IsUpper}, and [0-9] becomes the digit code \d (\p{IsDigit} is also possible).
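The one-pass idea (capture the parts, then format the number with a fixed width) can also be sketched without the Perl rename script. This is only an illustration in plain bash: the function name short_name is made up, and it assumes the two file name shapes shown in the question, producing the short form the question asks for (for example 02E01.txt).

```shell
#!/usr/bin/env bash

# short_name NAME: print the short, zero-padded form of NAME.
# Names that fit neither shape are printed unchanged.
short_name() {
    local name=$1
    local long='^Sample([0-9]+)_Name_([A-Z][0-9]+).*\.txt$'
    local clean='^([0-9]+)([A-Z][0-9]+)\.txt$'
    # BASH_REMATCH holds the groups of whichever pattern matched.
    if [[ $name =~ $long || $name =~ $clean ]]; then
        # 10# forces base 10 so that 08 or 09 is not rejected as octal.
        printf '%02d%s.txt\n' "$((10#${BASH_REMATCH[1]}))" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$name"
    fi
}
```

A loop such as for f in *.txt; do echo mv -- "$f" "$(short_name "$f")"; done previews the renames; removing the echo applies them.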
