Regular expression based searching for Mercurial changeset - regex

I would like to be able to perform regular expression-type searches on Mercurial changesets and display results using log.
I've come up with the following function, which seems to work, but has a number of possible bugs (e.g. $1 is in line of text containing the word changeset).
function hgs { hg log `hg log | grep changeset | grep "$1" \
| sed 's/changeset: *//g' | sed 's/:.*$//g' | \
awk '{print " -r " $0}'`; }
export -f hgs
Am I trying to recreate something here that already exists as a well-tested solution elsewhere?

It pretty much looks like a combination of hg grep, revsets, and templated output could help you (check hg help revsets, hg help templates, hg help grep and possibly also hg help fileset).
E.g. to find all changes to config.lib or where the commit message contains 'pkgconfig' which were made after 2010:
hg log -r"(file('config.lib') or desc('pkgconfig')) and date('>2010')"
revsets are very powerful. You can also sort, limit to a certain number of changesets, combine different requirements...
The --template argument to hg log can be used to format the output in any pattern you desire.
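As a rough sketch of how the original hgs function could be rebuilt on top of revsets and templates: the grep(regex) revset predicate matches a regular expression against the commit message, user name, and names of changed files. The helper name hgs_revset and the particular template are illustrative assumptions, not part of the answer above.

```shell
#!/bin/bash
# hgs_revset PATTERN: build a revset expression for a regex search.
# Assumption: PATTERN contains no single quotes (they would end the r'...' literal).
hgs_revset() {
    printf "grep(r'%s')" "$1"
}

# hgs PATTERN: list matching changesets, one per line.
# The template (rev, short hash, first line of the message) is just one possible layout.
hgs() {
    hg log -r "$(hgs_revset "$1")" \
        --template '{rev}:{node|short} {desc|firstline}\n'
}
```

Calling hgs 'issue[0-9]+' inside a repository would then replace the grep/sed/awk pipeline, and the match no longer depends on the layout of the default log output.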

Related

Using grep for listing files by owner/read perms

The rest of my bash script works, just having trouble using grep. On each file I am using the following command:
ls -l $filepath | grep "^.r..r..r.*${2}$"
How can I properly use the second argument in the regular expression? What I am trying to do is print the file if it can be read by anyone and the owner is who is passed by the second argument.
Using:
ls -l $filepath | grep "^.r..r..r"
Will print the information successfully based on the read permissions. What I am trying to do is print based on... [read permission][any characters in between][ending with the owner's name]
The immediate problem with your attempt is the final $ which anchors the search to the end of the line, which is the end of the file name, not the owner field. A better solution would replace grep with Awk instead, which has built-in support for examining only specific fields. But actually don't use ls for this, or really in scripts at all.
Unfortunately, the stat command's options are not entirely portable, but for Linux, try
case $(stat -c %a:%U "$filepath") in
[4-7][4-7][4-7]:"$2") ls -l "$filepath";;
esac
or maybe more portably
find "$filepath" -user "$2" -perm /444 -ls
Sadly, the -perm /444 predicate is not entirely portable, either.
Paradoxically, the de facto most portable replacement for stat to get a file's permissions might actually be
perl -le '@s = stat($ARGV[0]); printf "%03o\n", $s[2] & 07777' "$filepath"
The stat call returns a list of fields; if you want the owner, too, the numeric UID is in $s[4] and getpwuid($s[4]) gets the user name.
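The case pattern from the stat example can be tried out without touching the filesystem. The sketch below wraps it in a function that takes a MODE:OWNER string; the function name is made up for illustration, and it assumes the owner is passed as a user name (hence %U rather than the numeric %u) and that the mode has exactly three octal digits.

```shell
#!/bin/bash
# world_readable_and_owned_by "MODE:OWNER" OWNER
# Succeeds when user, group and other all have the read bit (each digit 4-7)
# and the owner part equals OWNER. Modes with a fourth digit (setuid etc.)
# are deliberately not matched in this sketch.
world_readable_and_owned_by() {
    case $1 in
        [4-7][4-7][4-7]:"$2") return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use with GNU stat (not executed here):
#   world_readable_and_owned_by "$(stat -c %a:%U "$filepath")" "$2" && ls -l "$filepath"
```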

Drop commits by commit message in `git rebase`

I would like to do a git rebase and drop commits whose commit messages match a certain regular expression. For example, it might work like
git rebase --drop="deleteme|temporary" master
And this would do a rebase over master while dropping all commits containing the string deleteme or temporary.
Is it possible to do this with the standard Git tool? If not, is it possible with a third-party Git tool? In particular, I want it to be a single, noninteractive command.
This can be accomplished using the same method as I used in this answer.
First, we need to find the relevant commits. You can do that with something like:
git log --format=format:"%H %s" master..HEAD | grep -E "deleteme|temporary"
This will give you a list of commits with commit messages containing deleteme or temporary that are between master and your current branch. These are the commits that need to be dropped.
Save this bash script somewhere you can access it:
#!/bin/bash
for sha in $(git log --format=format:"%H %s" master..HEAD | grep -E "deleteme|temporary" | cut -d " " -f 1)
do
sha=${sha:0:7}
sed -i "s/pick $sha/drop $sha/" "$@"
done
Then run the rebase as:
GIT_SEQUENCE_EDITOR=/path/to/script.sh git rebase -i
This will automatically drop all commits that contain deleteme or temporary in their commit message.
As mentioned in my other answer:
[This script won't allow] you to customize what command is run to calculate which commits to use, but if this is an issue, you could probably pass in an environment variable to allow such customization.
Obligatory warning: Since a rebase rewrites history, this can be dangerous / disruptive for anyone else working on this branch. Be sure you clearly communicate what you have done with anyone you are collaborating with.
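The environment-variable idea mentioned above could look roughly like the sketch below. Note that it swaps in a different technique from the script in this answer: instead of calling git log, it matches the pattern against the subject text already present in the todo file. The variable name DROP_PATTERN and the function name are assumptions, as is the todo line layout (pick, abbreviated hash, optional "# ", then the subject).

```shell
#!/bin/bash
# drop_matching TODO_FILE PATTERN
# Turn "pick" into "drop" on every todo line whose subject matches PATTERN
# (an extended regular expression; ^ anchors at the start of the subject).
drop_matching() {
    tmp=$(mktemp) || return 1
    awk -v re="$2" '
        /^pick / {
            subject = $0
            # Strip "pick <hash> " (and an optional "# ") to isolate the subject.
            sub(/^pick [^ ]+ (# )?/, "", subject)
            if (subject ~ re) sub(/^pick/, "drop")
        }
        { print }
    ' "$1" > "$tmp" && mv "$tmp" "$1"
}

# When used as GIT_SEQUENCE_EDITOR, git passes the todo file as the first argument.
if [ -n "${1:-}" ]; then
    drop_matching "$1" "${DROP_PATTERN:-deleteme|temporary}"
fi
```

A possible invocation would be DROP_PATTERN='^(deleteme|temporary)' GIT_SEQUENCE_EDITOR=/path/to/script.sh git rebase -i master. Because the subject is isolated before matching, a pattern anchored with ^ applies to the start of the commit message rather than to the hash.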
You could, for example, use an interactive rebase. Run git rebase -i <first commit that should not be touched>, and then in vim, where you have the list of commits, run :%s/^[^ ]* \([^ ]* issue\)/d \1/g to switch every commit whose message starts with issue to drop. Be aware that git rebase does not work optimally with merge commits: by default they are skipped and the history is flattened, but you can try to keep them with additional parameters.
@Scott Weldon's answer works great for this use case. However, if the regex matches from the start of the message, for example (^(deleteme)|^(temporary)), it won't work, because each line that grep sees starts with the commit hash. In that case you can use this instead:
#!/bin/bash
for sha in $(git log --format=format:"%s %H" master..HEAD | grep -E "^(deleteme)|^(temporary)" | awk '{print $NF}')
do
sha=${sha:0:7}
sed -i "s/pick $sha/drop $sha/" "$@"
done
The core difference is that %s and %H swapped places, so the hash is now the last field of each line instead of the first, and we extract it by piping to awk '{print $NF}'.
Also worth noting that this is called the same way as in Scott Weldon's answer:
GIT_SEQUENCE_EDITOR=/path/to/script.sh git rebase -i master

batch renaming of files with perl expressions

This should be a basic question for a lot of people, but I am a biologist with no programming background, so please excuse my question.
What I am trying to do is rename about 100,000 gzipped data files that have existing name of a code (example: XG453834.fasta.gz). I'd like to name them to something easily readable and parseable by me (example: Xanthomonas_galactus_str_453.fasta.gz).
I've tried to use sed, rename, and mmv, to no avail. If I use any of those commands on a one-off script then they work fine, it's just when I try to incorporate variables into a shell script do I run into problems. I'm not getting any errors, just no names are changed, so I suspect it's an I/O error.
Here's what my files look like:
#! /bin/bash
# change a bunch of file names
file=names.txt
while IFS=' ' read -r r1 r2;
do
mmv ''$r1'.fasta.gz' ''$r2'.fasta.gz'
# or I tried many versions of: sed -i 's/"$r1"/"$r2"/' *.gz
# and I tried many versions of: rename -i 's/$r1/$r2/' *.gz
done < "$file"
...and here's the first lines of my txt file with single space delimiter:
cat names.txt
#find #replace
code1 name1
code2 name2
code3 name3
I know I can do this with python or perl, but since I'm stuck here working on this particular script I want to find a simple solution to fixing this bash script and figure out what I am doing wrong. Thanks so much for any help possible.
Also, I tried to cat the names file (see comment from Ashoka Lella below) and then use awk to move/rename. Some of the files have variable names (but will always start with the code), so I am looking for a find & replace option to just replace the "code" with the "name" and preserve the file name structure.
I suspect I am not escaping the variable within the single tick of the perl expression, but I have pored over a lot of manuals and I can't find the way to do this.
If you're absolutely sure that the filenames don't contain spaces or tabs, you can try the following:
xargs -n2 < names.txt echo mv
This is a dry run (it only prints what it would do). If you are satisfied with the result, remove the echo.
If you want to check for the existence of the target, use
xargs -n2 < names.txt echo mv -i
If you want to NEVER allow overwriting of the target, use
xargs -n2 < names.txt echo mv -n
Again, remove the echo if you're satisfied.
I don't think that you need to be using mmv, a simple mv will do. Also, there's no need to specify the IFS, the default will work for you:
while read -r src dest; do mv "$src" "$dest"; done < names.txt
I have double quoted the variable names as it is generally considered good practice but in this case, a space in either of the filenames will result in read not working as you expect.
You can put an echo before the mv inside the loop to ensure that the correct command will be executed.
Note that in your file names.txt, the .fasta.gz suffix is already included, so you shouldn't be adding it inside the loop as well. Perhaps that was your problem?
This should rename all files in column 1 to the names in column 2 of names.txt, provided they are in the same folder as names.txt:
cat names.txt| awk '{print "mv "$1" "$2}'|sh
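For the find-and-replace variant the question asks about (replace only the leading code and keep the rest of the file name), a minimal sketch might look like this. The function name is made up, and it assumes that the codes and names contain no whitespace and that lines starting with # in names.txt are headers to skip.

```shell
#!/bin/bash
# rename_by_prefix MAP_FILE DIR
# For every "code name" pair in MAP_FILE, rename each file in DIR whose name
# starts with code so that it starts with name instead; the rest is preserved.
# Caveat: a code that is a prefix of another code (code1 vs code10) would
# also match the longer one.
rename_by_prefix() {
    while read -r code name; do
        # Skip blank lines and header/comment lines such as "#find #replace".
        case $code in ''|'#'*) continue ;; esac
        for f in "$2/$code"*; do
            [ -e "$f" ] || continue        # the glob matched nothing
            base=${f##*/}                  # file name without the directory
            mv -n -- "$f" "$2/$name${base#"$code"}"
        done
    done < "$1"
}
```

Here mv -n refuses to overwrite an existing target. Putting echo in front of mv gives a dry run, as suggested in the answers above.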

Complex changes to a URL with sed

I am trying to parse an RSS feed on the Linux command line which involves formatting the raw output from the feed with sed.
I currently use this command:
feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}" | sed 's/^\(.\{3\}\)\(.\{13\}\)\(.\{6\}\)\(.\{3\}\)\(.*\)/\1\3\5/'
This gives me a number of feed items per line that look like this:
Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom
Notice the long URL at the end. I want to shorten this to better fit on the command line. Therefore, I want to change my sed command to produce the following:
Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/-2121664
That means cutting everything out of the URL except a dash and that seven digit number preceding the ".html/blablabla" bit.
Currently my sed command only changes stuff in the date bit. It would have to leave the title and start of the URL alone and then cut stuff out of it until it reaches the seven digit number. It needs to preserve that and then cut everything after it out. Oh yeah, and we need to leave a dash right in front of that number too.
I have no idea how to do that and can't find the answer after hours of googling. Help?
EDIT:
This is the raw output of a line of feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}", in case it helps:
Sat, 22 Feb 2014 20:33:00 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom
EDIT 2:
It seems I can only pipe that output into one command. Piping it through multiple ones seems to break things. I don't understand why ATM.
Unfortunately (for me), I could only think of solving this with extended regexp syntax (either -E or -r flag on different systems):
... | sed -E 's|(://[^/]+/).*(-[0-9]+)\.html/.*|\1\2|'
UPDATE: In basic regexp syntax, the best I can do is
... | sed 's|\(://[^/]*/\).*\(-[0-9][0-9]*\)\.html/.*|\1\2|'
The key to writing this sort of regular expression is to be very careful about the boundaries of what you expect to match, so that the random gunk you want to get rid of doesn't cause you problems. Also, bear in mind that you can use characters other than / as the delimiter of an s command.
sed 's!\(http://www\.heise\.de/\)newsticker/meldung/[^./]*\(-[0-9][0-9]*\)\.html[^ ]*!\1\2!'
Be aware that getting the RE right can be quite tricky; assume you'll need to test it! (This is a key part of the “now you have two problems” quote; REs very easily become horrendous.)
Something like this maybe?
... | awk -F'[^0-9]*' '{print "http://www.heise.de/-"$2}'
This might work for you (GNU sed):
sed 's|\(//[^/]*/\).*\(-[0-9]\{7\}\).*|\1\2|' file
You can place the first sed command so:
feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}" |
sed 's/^\(.\{3\}\)\(.\{13\}\)\(.\{6\}\)\(.\{3\}\)\(.*\)/\1\3\5/;s|\(//[^/]*/\).*\(-[0-9]\{7\}\).*|\1\2|'
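To check the URL-shortening expression in isolation, it can be run against the sample line from the question. This sketch uses the basic-regexp variant from the first answer; the function name shorten_url is only for illustration.

```shell
#!/bin/bash
# Keep "://host/" and the final "-<digits>" before ".html/", drop the rest of the URL.
shorten_url() {
    sed 's|\(://[^/]*/\).*\(-[0-9][0-9]*\)\.html/.*|\1\2|'
}

line='Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom'

printf '%s\n' "$line" | shorten_url
# prints: Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/-2121664
```

Regarding EDIT 2, one possible explanation (an assumption, not confirmed in the thread) is output buffering: when sed writes to another pipe instead of the terminal, its output may be held back until a buffer fills, which looks like breakage with a slow feed. Combining both substitutions in one sed call, as shown above, avoids the extra pipe; GNU sed also has a -u (unbuffered) option.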

SVN tag version number increment from the command line

Well I wanted to know if I could get the latest tag from subversion, increment it and create the new tag all in one command? Currently I get the latest tag like this:
svn ls http://svn/path/to/tags | tail -n 1
Which gives me something like this:
1.2.34/
then I will create a new tag with the version number of 1.2.35 as I've incremented the version number like this:
svn copy http://svn/path/to/trunk http://svn/path/to/tags/1.2.35
from here I just do a switch to point production code to the latest tag.
I know I could write a script to take care of this but I wanted to know if I could do this just from the command line with one command (chaining the commands). Where I'm stuck is, how do I increment the tag name to the next version number (e.g., from 1.2.34 to 1.2.35)? Version number ranges should follow x.[0-99].[0-99]. Any ideas, help would be great.
Related:
http://www.commandlinefu.com/commands/browse
The "one liner" to get the next tag would be something like this:
svn ls http://svn/path/to/tags | \
sort -t '.' -k 1,1n -k 2,2n -k 3,3n | \
tail -1|sed 's:/$::' | \
awk 'BEGIN{FS="."}{print $1 "." $2 "." $3+1}'
... but you should probably just write a script so that you can actually test it. (And yes, I'm aware that the sort and tail and sed and awk could probably all collapse under its own weight into a bit of perl, but you'll need all those "parts" in there somewhere.)
Something like
svn copy http://svn/path/to/trunk http://svn/path/to/tags/`svn ls http://svn/path/to/tags | some-script-for-getting number`
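The "some script" placeholder above could be filled in along these lines. This is a sketch that reads the tag listing on standard input, so it can be tested without a repository. The function name next_tag is hypothetical, and the carry at 99 is one reading of the x.[0-99].[0-99] requirement in the question, not something the answers specify.

```shell
#!/bin/bash
# next_tag: read tag names such as "1.2.34/" on stdin, print the next version.
next_tag() {
    sed 's:/$::' |                              # drop the trailing slash from svn ls
    sort -t . -k 1,1n -k 2,2n -k 3,3n |         # numeric sort on each component
    tail -n 1 |                                 # highest existing version
    awk -F . '{
        major = $1; minor = $2; patch = $3 + 1
        # Assumed rollover rule: components stay within 0-99.
        if (patch > 99) { patch = 0; minor++ }
        if (minor > 99) { minor = 0; major++ }
        print major "." minor "." patch
    }'
}

# Possible use (not executed here; the URLs are the placeholders from the question):
#   svn copy http://svn/path/to/trunk \
#       "http://svn/path/to/tags/$(svn ls http://svn/path/to/tags | next_tag)"
```

The numeric sort matters because a plain tail -n 1 on the raw listing can pick 1.2.9 over 1.2.10.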