Match the string using regex - regex

I have a few files like 123.iso, 234.isoaa, 456.isoab, sajdhsjf.isoaf.
I want to extract all the files except those that end with exactly .iso.
For example, I should have 234.isoaa, 456.isoab, sajdhsjf.isoaf.

Assuming you meant "all the files with suffix beginning with .iso except those...", this works:
ls -1 | grep -E '\.iso.+$'

Try this :
\b[\w](?!.iso\b).[\w]\b

As Tim Pietzker noted, you didn't say for which shell you need a solution, but in zsh you could do
setopt local_options extended_glob
echo *^*.iso(N)
If you are happy to get only files which have .isoX at the end (with any X), this should work in bash, zsh and ksh:
echo *.iso?*
Note that this second solution - different to the first one - would not list files such as abc.txt.
Of course you can use ls -1 instead of echo. This depends on what you want to do with the result.
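To see what the *.iso?* glob picks up, it can be tried in a scratch directory. This is only a sketch: the helper name list_iso_parts and the sample file names are made up for illustration. Note that the glob matches any name with at least one character after ".iso", so something like x.isoaa.txt would be listed too.

```shell
#!/usr/bin/env bash
# list_iso_parts DIR: print the names in DIR that have at least one
# character after ".iso" (hypothetical helper, not from the answers above).
list_iso_parts() {
    local dir=$1 f
    for f in "$dir"/*.iso?*; do
        # When nothing matches, the pattern is left as a literal string; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "${f##*/}"
    done
}

# Demo in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/123.iso" "$demo_dir/234.isoaa" "$demo_dir/456.isoab" "$demo_dir/sajdhsjf.isoaf"
list_iso_parts "$demo_dir"
rm -rf "$demo_dir"
```

The plain 123.iso is not printed, because ? requires one more character after ".iso".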

Related

Batch rename files with regex not working

I've got many files on a linux server which have this format
text_text_mixturelettersnumbers.file, for example Hesperocyparis_goveniana_E00196073A.bam.bai or Hesperocyparis_forbesii_RBGEH19_bwa_out.txt. I would like to change the first underscore to a hyphen and leave everything else, so it looks like this: text-text_mixturelettersnumbers.file.
I have tried rename -n 's/(\w+)_(\w+_.)/$1-$2/' * and many different versions thereof but nothing is happening. Could someone please point out what I've got wrong?
Thanks
Markus
The util-linux rename does not have an option to display the results only. It is very basic.
If you want to list the files that contain two underscores before an extension, use
for f in *_*_*.*; do
echo "$f => ${f/_/-}";
done
To actually rename, use mv:
for f in *_*_*.*; do
mv -- "$f" "${f/_/-}";
done
The "${f/_/-}" replaces the first _ with - in variable f.
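The ${f/_/-} form is an extension found in bash, zsh and ksh. In a plain POSIX sh, the same "first underscore only" replacement can be sketched with prefix and suffix removal. The function name first_underscore_to_hyphen is hypothetical, and the loop only prints the mapping without renaming anything.

```shell
# first_underscore_to_hyphen NAME: print NAME with only its first "_"
# replaced by "-". Hypothetical helper using POSIX parameter expansion only.
first_underscore_to_hyphen() {
    case $1 in
        # ${1%%_*} is the text before the first "_", ${1#*_} the text after it.
        *_*) printf '%s\n' "${1%%_*}-${1#*_}" ;;
        # No underscore at all: leave the name unchanged.
        *)   printf '%s\n' "$1" ;;
    esac
}

# Dry run on the two sample names from the question.
for f in Hesperocyparis_goveniana_E00196073A.bam.bai \
         Hesperocyparis_forbesii_RBGEH19_bwa_out.txt; do
    echo "$f => $(first_underscore_to_hyphen "$f")"
done
```

Once the printed mapping looks right, the echo line can be swapped for mv -- "$f" "$(first_underscore_to_hyphen "$f")".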

Extracting group from regex in shell script using grep

I want to extract the output of a command run through shell script in a variable but I am not able to do it. I am using grep command for the same. Please help me in getting the desired output in a variable.
x=$(pwd)
pw=$(grep '\(.*\)/bin' $x)
echo "extracted is:"
echo $pw
The output of the pwd command is /root/abc/bin/ and I want only the /root/abc part of it. Thanks in advance.
Use dirname to get the path and not the last segment of the path.
You can use:
x=$(pwd)
pw=$(dirname "$x")
echo "$pw"
Or simply:
pw=$(dirname "$(pwd)")
echo "$pw"
All of what you're doing can be done in a single echo:
echo "${PWD%/*}"
The $PWD variable holds the current directory, and %/* removes the last / and everything after it.
For your case it will output: /root/abc
The second (and any subsequent) argument to grep is the name of a file to search, not a string to perform matching against.
Furthermore, grep prints the matching line or (with -o) the matching string, not whatever the parentheses captured. For that, you want a different tool.
Minimally fixing your code would be
x=$(pwd)
pw=$(printf '%s\n' "$x" | sed 's%\(.*\)/bin.*%\1%')
(If you only care about Bash, not other shells, you could do sed ... <<<"$x" without the explicit pipe; the syntax is also somewhat more satisfying.)
But of course, the shell has basic string manipulation functions built in.
pw=${x%/bin*}
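If a real regex capture group is wanted, bash can provide one without grep or sed: the [[ string =~ regex ]] test stores the groups in the BASH_REMATCH array. A minimal sketch, bash only; the function name parent_of_bin is made up for illustration.

```shell
#!/usr/bin/env bash
# parent_of_bin PATH: print the part of PATH before its last "/bin".
# Hypothetical helper; returns 1 when PATH contains no "/bin".
parent_of_bin() {
    # Keep the regex in a variable so the shell does not try to parse it.
    local re='^(.*)/bin'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[0] is the whole match, [1] the first parenthesized group.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

pw=$(parent_of_bin "/root/abc/bin/")
echo "extracted is: $pw"
```

Like the sed version, this matches "/bin" as a plain substring, so a path such as /root/abc/binary would also yield /root/abc.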

Linux Command Line Zip with Regex

I have thousands of jpg files that are all called 1.jpg, 2.jpg, 3.jpg and so on. I need to zip up a range of them and I thought I could do this with regex, but so far haven't had any luck.
Here is the command
zip images.zip '[66895-105515]'.jpg
Does anyone have any ideas?
I am very sure that it is not possible to match number ranges like this with regular expressions (digit ranges, yes, but not whole multi-digit numbers), as regular expressions work on the character level. However, you can use the "seq" command to generate the list of files and use "xargs" to pass them to "zip":
seq --format %g.jpg 66895 105515 | xargs zip images.zip
I tested the command with a bunch of dummy files under Linux and it works fine.
Use ls together with bash's brace-expansion range ({m..n}) and let zip read the file names from standard input with -@:
ls {66895..105515}".jpg" 2>/dev/null | zip jpegs -@
You need to pipe some stuff - list the files, filter by the regex, zip up each listed file.
ls | grep [66895-10551] | xargs zip images.zip
Edit: Whoops, didn't test with multi-digit numbers. As denisw mentions, this method won't work.
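Since the selection is numeric rather than textual, a plain counting loop is another way to build the file list. The sketch below prints only the names that actually exist, so gaps in the numbering are skipped silently. The helper name list_jpg_range is hypothetical.

```shell
# list_jpg_range FIRST LAST: print "N.jpg" for every N in FIRST..LAST
# for which that file exists in the current directory (hypothetical helper).
list_jpg_range() {
    n=$1
    while [ "$n" -le "$2" ]; do
        if [ -e "$n.jpg" ]; then
            printf '%s\n' "$n.jpg"
        fi
        n=$((n + 1))
    done
    return 0
}

# Usage, assuming Info-ZIP's zip, whose -@ option reads names from stdin:
#   list_jpg_range 66895 105515 | zip images.zip -@
```

Passing the names over a pipe also avoids a very long command line, which can become a concern with tens of thousands of arguments.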

Apply regular expression substitution globally to many files with a script

I want to apply a certain regular expression substitution globally to about 40 Javascript files in and under a directory. I'm a vim user, but doing this by hand can be tedious and error-prone, so I'd like to automate it with a script.
I tried sed, but handling more than one line at a time is awkward, especially if there is no limit to how many lines the pattern might match.
I also tried this script (on a single file, for testing):
ex $1 <<EOF
gs/,\(\_\s*[\]})]\)/\1/
EOF
The pattern will eliminate a trailing comma in any Perl/Ruby-style list, so that "[a, b, c,]" will come out as "[a, b, c]" in order to satisfy Internet Explorer, which alone among browsers, chokes on such lists.
The pattern works beautifully in vim but does nothing if I run it in ex, as per the above script.
Can anyone see what I might be missing?
You asked for a script, but you mentioned that you are vim user. I tend to do project-wide find and replace inside of vim, like so:
:args **/*.js | argdo %s/,\(\_\s*[\]})]\)/\1/ge | update
This is very similar to the :bufdo solution mentioned by another commenter, but it will use your args list rather than your buflist (and thus doesn't require a brand new vim session nor for you to be careful about closing buffers you don't want touched).
:args **/*.js - sets your arglist to contain all .js files in this directory and subdirectories
| - pipe is vim's command separator, letting us have multiple commands on one line
:argdo - run the following command(s) on all arguments. it will "swallow" subsequent pipes
% - a range representing the whole file
:s - substitute command, which you already know about
:s_flags, ge - global (substitute as many times per line as possible) and suppress errors (i.e. "No match")
| - this pipe is "swallowed" by the :argdo, so the following command also operates once per argument
:update - like :write but only when the buffer has been modified
This pattern will obviously work for any vim command which you want to run on multiple files, so it's a handy one to keep in mind. For example, I like to use it to remove trailing whitespace (%s/\s\+$//), set uniform line-endings (set ff=unix) or file encoding (set fileencoding=utf8), and retab my files.
1) Open all the files with vim:
bash$ vim $(find . -name '*.js')
2) Apply substitute command to all files:
:bufdo %s/,\(\_\s*[\]})]\)/\1/ge
3) Save all the files and quit:
:wall
:q
I think you'll need to recheck your search pattern, it doesn't look right. I think where you have \_\s* you should have \_s* instead.
Edit: You should also use the /ge options for the :s... command (I've added these above).
You can automate the actions of both vi and ex by passing the argument +'command' from the command line, which enables them to be used as text filters.
In your situation, the following command should work fine:
find /path/to/dir -name '*.js' | xargs ex +'%s/,\(\_\s*[\]})]\)/\1/g' +'wq!'
You can use a combination of the find command and sed. Note that this only handles a comma and its closing bracket on the same line:
find /path -type f -iname "*.js" -exec sed -i.bak 's/,[[:space:]]*]/]/' "{}" +
If you are on Windows, Notepad++ allows you to run simple regexes on all opened files.
Search for ,\s*\] and replace with ]
should work for the type of lists you describe.

Extracting username from UNIX path using Regex

I need to get a username from an Unix path with this format:
/home/users/myusername/project/number/files
I just want "myusername". I've been trying for almost an hour and I'm completely clueless.
Any idea?
Thanks!
Maybe just /home/users/([a-zA-Z0-9_\-]*)/.*?
Note that the critical part [a-zA-Z0-9_\-]* has to contain all valid characters for Unix usernames. I took from here that a username should only contain digits, letters, dashes and underscores.
Also note that the extracted username is not the whole matching, but the first group (indicated by (...)).
The best answer to this depends on what you are trying to achieve. If you want to know the user who owns that file, you can use the stat command. Unfortunately its syntax differs slightly depending on the operating system, but the following two commands work:
Mac OS X
stat -f '%Su' /home/users/myusername/project/number/files
Red Hat/Fedora/CentOS
stat -c '%U' /home/users/myusername/project/number/files
If you really do want the string following /home/users, then either of the regexes provided above will do that. You could use one in a bash script as follows (Mac OS X):
USERNAME=$(echo '/home/users/myusername/project/number/files' | \
sed -E -e 's!^/home/users/([^/]+)/.*$!\1!g')
Check http://rubular.com/r/84zwJmV62G. The first capture group, not the entire match, is the username.
In a Bourne shell, something like:
string="/home/users/STRINGWEWANT/some/subdir/here"
echo "$string" | awk -F/ '{print $4}'
would be one option, assuming it's always the third element of the path (that is field 4 for awk, because the leading slash makes field 1 empty). There are more lightweight options that use only the shell builtins:
echo ${x#*users/}
will strip out everything up to and including 'users/'
echo ${y%%/*}
Will strip out the remainder.
So to put it all together :
path="/home/users/STRINGWEWANT/some/other/dirs"
y=${path#*users/} && echo "${y%%/*}"
STRINGWEWANT
Also check out the bash manpage and search for "Parameter Expansion".
(\/home\/users\/)([^\/]+)
The 2nd capture group (index 1) will be myusername
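When the /home/users/ prefix is fixed, the two parameter expansions can be combined into a small function that needs no regex engine at all. A minimal sketch for any POSIX-style shell; the name username_from_path is hypothetical.

```shell
# username_from_path PATH: print the path component that follows /home/users/.
# Hypothetical helper; returns 1 if PATH does not start with that prefix.
username_from_path() {
    case $1 in
        /home/users/?*) ;;
        *) return 1 ;;
    esac
    rest=${1#/home/users/}        # drop the fixed prefix
    printf '%s\n' "${rest%%/*}"   # drop everything from the next slash on
}

username_from_path /home/users/myusername/project/number/files
```

The case guard keeps the function from echoing an unrelated path back when the prefix is missing.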