Copy File with Regex Replacement

I'm trying to find a simple/elegant command-line solution for a process that is often used in scripts. Something like: (Fictional example)
CopyWithReplace <SourceFile> <DestFile> -m <match regular expression> -r <replacement regular expression>
It would copy the text file with the matched text replaced as specified. Ideally, the find/replace would happen in the pipeline, rather than as a secondary step. (Destinations quite often are remote locations, and long distance WAN links are often not as fast and reliable as desired.)
What would be the simplest** way to achieve this scriptable functionality in a Windows environment?
** Simplest = easy to write batch code, fewest 3rd party tools, etc. Bonus points for a reasonably standard Regex implementation.

This can be achieved with sed.
The basic usage pattern, for a substitution as you described, is:
sed 's/regexp/replacement/g' inputFileName > outputFileName
sed is a Unix utility, but there are several ways of using it in Windows if you wish. This StackOverflow post lists the various options available.
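For comparison, if a Python interpreter happens to be available on the Windows machine, the same copy-with-substitution can be sketched in a few lines. The function name copy_with_replace and its parameters are made up for illustration (they mirror the fictional CopyWithReplace from the question), and the sketch assumes a UTF-8 text file and a pattern that does not need to match across line boundaries.

```python
import re
import tempfile
from pathlib import Path


def copy_with_replace(src, dst, pattern, replacement, encoding="utf-8"):
    """Copy src to dst, applying a regex substitution to each line in transit."""
    regex = re.compile(pattern)
    # newline="" leaves the original line endings (such as CRLF) untouched.
    with open(src, encoding=encoding, newline="") as fin, \
            open(dst, "w", encoding=encoding, newline="") as fout:
        for line in fin:
            fout.write(regex.sub(replacement, line))


if __name__ == "__main__":
    # Demo on throwaway files; the names and contents here are hypothetical.
    workdir = Path(tempfile.mkdtemp())
    (workdir / "in.txt").write_text("server=dev01\nport=80\n", encoding="utf-8")
    copy_with_replace(workdir / "in.txt", workdir / "out.txt", r"dev(\d+)", r"prod\1")
    print((workdir / "out.txt").read_text(encoding="utf-8"))
```

Because the text is rewritten line by line while it is being copied, only the final content is written to the destination, which matches the "replace in the pipeline" requirement.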

Related

Using regex with `rename` version from `util-linux`

I’m using a GNU/Linux distribution where the utility rename comes from util-linux and I want to make full use of regular (Perl or POSIX) expressions with it.
There are two versions of rename :
The “Perl” version, with syntax rename 's/^fgh/jkl/' fgh*
The util-linux version, with syntax rename fgh jkl fgh*
The use of regexes is pretty obvious with the first one, to which I have no easy access. However, I’m confused about the second one: I could not find any relevant documentation or examples on whether regular expressions can be used with it and, if so, in what format.
Let’s take, to make a simple example, a directory containing:
foo_a1.ext
foo_a32.ext
foo_c18.ext
foo_h12.ext
I want to use a syntax like one of these two lines:
rename "foo_[a-z]([0-9]{1,2}).ext" "foo_\1.ext" *
rename "foo_[:alpha:]([:digit:]{1,2}).ext" "foo_\1.ext" *
for which the expected output would be:
foo_1.ext
foo_32.ext
foo_18.ext
foo_12.ext
Of course this does not work! Either I’m missing something obvious, or there is
no implemented way to use actual regular expressions with this tool.
(Please note that I am aware of the other possibilities for renaming files with regular expressions in a shell interpreter; this question aims at a specific version of the rename tool.)
Here is the manual page: http://linux.die.net/man/1/rename. It is pretty straightforward:
rename from to file...
rename will rename the specified files by replacing the first
occurrence of from in their name by to.
I believe there are no regexes; it is just a plain substring match.
The following command gives the expected result with your input files, but it uses the Perl version:
rename 's/foo_\D+(\d+)/foo_$1/' *.ext
You can test the command first with the -n option of rename, which only shows what would be renamed.
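If neither version of rename fits, the same regex-based renaming can be sketched in Python. This is only an illustration, not a feature of either rename tool: plan_renames is a hypothetical helper that computes the new names without touching the file system, much like a dry run.

```python
import re


def plan_renames(names, pattern, replacement):
    """Return (old, new) pairs for the names that the substitution would change."""
    regex = re.compile(pattern)
    plan = []
    for old in names:
        # count=1 replaces only the first match in each name.
        new = regex.sub(replacement, old, count=1)
        if new != old:
            plan.append((old, new))
    return plan


names = ["foo_a1.ext", "foo_a32.ext", "foo_c18.ext", "foo_h12.ext"]
for old, new in plan_renames(names, r"foo_[a-z](\d{1,2})\.ext", r"foo_\1.ext"):
    print(old, "->", new)
```

To actually rename, each pair could be passed to os.rename(old, new) once the printed plan looks right.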

Ignore multiline comments git diff

I'm trying to find the significant differences in C/C++ source code, that is, the ones in which the code itself changes. I know you can use git diff -G<regex>, but it seems very limiting in the kind of regexes that can be run. For example, it doesn't seem to offer a way to ignore multiline comments in C/C++.
Is there any way in git or preferably libgit2 to ignore comments (including multiline), whitespaces, etc. before a diff is run? Or a way of determining if a line from the diff output is a comment or not?
Use git diff -w to ignore whitespace differences.
You cannot ignore multiline comments because git is a versioning tool, not a language dependent interpreter. It doesn't know your code is C++. It does not parse files for semantics, so it cannot interpret what is comment and what isn't. In particular, it relies on diff (or a configured difftool) to compare text files and it expects a line-by-line comparison.
I agree with @andrew-c that what you are really asking is to compare the two pieces of code without comments. More specifically, you are asking to compare the lines of code where all multiline comments have been turned into empty lines. You keep the blank lines there so that the line numbers still match those of a normal copy.
So you could manually convert the two code states to blank out multiline comments... or you might look at building your own diff wrapper that did the stripping for you. But the latter is not likely to be worth the effort.
You can achieve this using git attributes and diff filters, as described in "Viewing git filters output when using meld as a diff tool", to call a sed script. That script, however, is pretty complex on its own if you want it to handle all cases, such as comment delimiters inside string literals.
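The "turn multiline comments into empty lines" idea from the answers above can be sketched in Python as a preprocessing step before diffing. This is a simplified illustration under stated assumptions: blank_block_comments is a hypothetical helper, it only handles /* ... */ comments, and it does not recognise comment delimiters inside string literals or // comments.

```python
import re

# Matches /* ... */ comments, including ones that span several lines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def blank_block_comments(source):
    """Replace each /* */ comment with as many newlines as it contained."""
    return BLOCK_COMMENT.sub(lambda m: "\n" * m.group(0).count("\n"), source)


code = "int a; /* first\n second */\nint b;\n"
print(blank_block_comments(code))
```

Since every removed comment leaves its newlines behind, line numbers in the stripped text still correspond to the original file.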

Global substitution for latex commands in vim

I am writing a long document and I am frequently formatting some terms in italics. After some time I realized that maybe that is not what I want, so I would like to remove all the LaTeX commands that format text in italics.
Example:
\textit{Vim} is undoubtedly one of the best editors ever made. \textit{LaTeX} is an extremely powerful, intelligent typesetter. \textbd{Vim-LaTeX} aims at bringing together the best of both these worlds
How can I run a substitution command that recognizes all the instances of \textit{whatever} and changes them to just whatever without affecting different commands such as \textbd{Vim-LaTeX} in this example?
EDIT: As the answer that technically helps is the one from Igor, I will mark that one as the correct one. Nevertheless, Konrad's answer should be taken into account, as it shows the proper LaTeX strategy to follow.
You shouldn’t use formatting commands at all in your text.
LaTeX is built around the idea of semantic markup. So instead of saying “this text should be italic” you should mark up the text using its function. For instance:
\product{Vim} is undoubtedly one of the best editors ever made. \product{LaTeX}
is an extremely powerful, intelligent typesetter. \product{Vim-LaTeX} aims at
bringing together the best of both these worlds
… and then, in your preamble, a package, or a document class, you (re-)define a macro \product to set the formatting you want. That way, you can adapt the macro whenever you deem necessary without having to change the code.
Or, if you want to remove the formatting completely, just make the macro display its bare argument:
\newcommand*\product[1]{#1}
Use this substitution command:
%s/\\textit{\([^}]*\)}/\1/g
If \textit can span multiple lines:
%! perl -e 'local $/; $_=<>; s/\\textit\{([^}]*)\}/$1/g; print;'
And you can do this without perl also:
%s/\\textit{\(\_.\{-}\)}/\1/g
Here:
\_. -- any character, including a newline
\{-} -- the non-greedy counterpart of * (matches as few characters as possible).
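For readers more familiar with other regex flavours, the same non-greedy, newline-spanning substitution can be sketched in Python. The helper name strip_textit is made up for illustration; re.DOTALL plays the role of \_. and .*? the role of \{-}. Like the Vim command, it assumes the argument of \textit contains no nested braces.

```python
import re


def strip_textit(text):
    """Replace every \\textit{...} with its bare argument, across line breaks."""
    return re.sub(r"\\textit\{(.*?)\}", r"\1", text, flags=re.DOTALL)


sample = "\\textit{Vim} is an editor. \\textit{La\nTeX} is a typesetter. \\textbd{Vim-LaTeX} stays."
print(strip_textit(sample))
```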

Compounding switch regexes in Vim

I'm working on refactoring a bunch of PHP code for an instructor. The first thing I've decided to do is to update all the SQL files to be written in Drupal SQL coding conventions, i.e., to have all-uppercase keywords. I've written a few regular expressions:
:%s/create table/CREATE TABLE/gi
:%s/create database/CREATE DATABASE/gi
:%s/primary key/PRIMARY KEY/gi
:%s/auto_increment/AUTO_INCREMENT/gi
:%s/not null/NOT NULL/gi
Okay, that's a start. Now I just open every SQL file in Vim, run all five regular expressions, and save. This feels like five times the work it should be. Can they be compounded into one obnoxiously long but easily copy-pastable regex?
Why do you have to do it in Vim? How about sed or awk?
For example, with GNU sed (\U is a GNU extension):
sed -e 's/create table/\U&/g' -e 's/not null/\U&/g' -e 's/.../\U&/' *.sql
By the way, in Vim you can use
:%s/create table/\U&/g
to change the case, which saves some typing.
Update:
If you really want one long command to execute in Vim, you could try:
:%s/create table\|create database\|foo\|bar\|blah/\U&/g
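The same "one alternation, uppercase whatever matched" idea can be sketched outside Vim as well. The following Python snippet is only an illustration: the keyword list is taken from the five substitutions in the question, and uppercase_keywords is a hypothetical helper name.

```python
import re

KEYWORDS = ["create table", "create database", "primary key",
            "auto_increment", "not null"]
# One case-insensitive alternation, equivalent to joining the patterns with \|.
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def uppercase_keywords(sql):
    """Uppercase every keyword match, like \\U& in the substitution above."""
    return KEYWORD_RE.sub(lambda m: m.group(0).upper(), sql)


print(uppercase_keywords("create table t (id int not null auto_increment, primary key (id));"))
```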
Open the file containing those substitution commands.
Copy its contents (to the unnamed register, by default):
:%y
If there is only one file where the substitutions should be
performed, open it as usual and run the contents of that register
as Ex commands:
:@"
If there are several files to edit automatically, open those
files as arguments:
:args *.sql
Execute the yanked substitutions for each file in the argument list:
:argdo @"|up
(The :update command, which runs after the substitutions, writes
the buffer to file if it has been changed.)
While sed can handle what you want (however, it can be interactive as you requested by flag 'i'), Vim is still much more powerful. Once I needed to change the last argument of a certain function call in a 1M SLOC code base. The arguments could be on one line or spread over several lines. In Vim I achieved it pretty easily.
You can open all php files in vim at once:
vim *.php
After that run in ex mode:
:bufdo! %s/create table/CREATE TABLE/gi
Repeat this for the rest of the commands. At the end, save all the files and exit Vim:
:xall

grep replacement with extensive regular expression implementation

I have been using grepWin for general searching of files, and wingrep when I want to do replacements or what-have-you.
GrepWin has an extensive implementation of regular expressions, but it doesn't do replacements (as mentioned above).
Wingrep does replacements, but its regular expression implementation is severely limited.
Does anyone know of any (preferably free) grep tools for Windows that do replacements AND have a reasonable implementation of regular expressions?
Thanks in advance.
I think Perl at the command line is the answer you are looking for. It is widely portable and has powerful regex support.
Let's say that you have the following file:
foo
bar
baz
quux
you can use
perl -pne 's/quux/splat!/' -i /tmp/foo
to produce
foo
bar
baz
splat!
The magic is in Perl's command-line switches:
-e: execute the next argument as a Perl command.
-n: execute the command on every line of the input.
-p: like -n, but also print each line after the command has run, without
an explicit 'print' statement (when both are given, -p takes precedence).
-i: make substitutions in place, overwriting the document with the
output of your command. Use with caution.
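For readers who want to see what the -p and -i switches amount to, here is a rough Python equivalent. It is a sketch, not a replacement for the Perl one-liner: replace_in_place is a hypothetical helper, and it relies on the standard fileinput module, whose inplace mode redirects print() into the file being read.

```python
import fileinput
import re
import tempfile
from pathlib import Path


def replace_in_place(path, pattern, replacement):
    """Rewrite the file at path, substituting on every line (like perl -p -i -e)."""
    regex = re.compile(pattern)
    # With inplace=True, anything printed inside the loop goes into the file.
    with fileinput.input(files=[str(path)], inplace=True) as lines:
        for line in lines:
            print(regex.sub(replacement, line), end="")


if __name__ == "__main__":
    # Demo on a throwaway file with the same contents as the example above.
    demo = Path(tempfile.mkdtemp()) / "foo"
    demo.write_text("foo\nbar\nbaz\nquux\n")
    replace_in_place(demo, r"quux", "splat!")
    print(demo.read_text())
```

As with -i, the original file is overwritten, so the same caution applies.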
I use Cygwin quite a lot for this sort of task.
Unfortunately it has the world's most unintuitive installer, but once it's installed correctly it's very usable... well apart from a few minor issues with copy and paste and the odd issue with line-endings.
The good thing is that all the tools work like on a real GNU system, so if you're already familiar with Linux or similar, you don't have to learn anything new (apart from how to use that crazy installer).
Overall I think the advantages make up for the few usability issues.
If you are on Windows, you can use VBScript, which requires no downloads and comes with regex support. For example, to change "one" to "ONE":
' Usage: cscript //nologo test.vbs <file>
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objArgs = WScript.Arguments
strFile = objArgs(0)
' Read the whole file into memory
Set objFile = objFS.OpenTextFile(strFile)
strFileContents = objFile.ReadAll
objFile.Close
' Global, case-sensitive regular expression
Set objRE = New RegExp
objRE.Global = True
objRE.IgnoreCase = False
objRE.Pattern = "one"
strFileContents = objRE.Replace(strFileContents, "ONE") ' simple replacement
WScript.Echo strFileContents
Output:
C:\test>type file
one
two one two
three
C:\test>cscript //nologo test.vbs file
ONE
two ONE two
three
You can read the VBScript documentation to learn more about using regular expressions.