Global substitution for LaTeX commands in Vim - regex

I am writing a long document and I frequently format some terms in italics. After some time I realized that maybe that is not what I want, so I would like to remove all the LaTeX commands that format text as italics.
Example:
\textit{Vim} is undoubtedly one of the best editors ever made. \textit{LaTeX} is an extremely powerful, intelligent typesetter. \textbd{Vim-LaTeX} aims at bringing together the best of both these worlds
How can I run a substitution command that recognizes all the instances of \textit{whatever} and changes them to just whatever without affecting different commands such as \textbd{Vim-LaTeX} in this example?
EDIT: Since the answer that technically solves the problem is Igor's, I will mark that one as correct. Nevertheless, Konrad's answer should be taken into account, as it shows the proper LaTeX strategy to follow.

You shouldn’t use formatting commands at all in your text.
LaTeX is built around the idea of semantic markup. So instead of saying “this text should be italic” you should mark up the text using its function. For instance:
\product{Vim} is undoubtedly one of the best editors ever made. \product{LaTeX}
is an extremely powerful, intelligent typesetter. \product{Vim-LaTeX} aims at
bringing together the best of both these worlds
… and then, in your preamble, a package, or a document class, you (re-)define a macro \product to set the formatting you want. That way, you can adapt the macro whenever you deem necessary without having to change the code.
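For example, keeping the italics for now (note that \product is not a predefined LaTeX command, so you have to supply some definition for it yourself):
\newcommand*\product[1]{\textit{#1}}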
Or, if you want to remove the formatting completely, just make the macro display its bare argument:
\newcommand*\product[1]{#1}

Use this substitution command:
%s/\\textit{\([^}]*\)}/\1/g
If \textit{...} can span multiple lines:
%! perl -e 'local $/; $_=<>; s/\\textit{([^}]*)}/$1/g; print;'
You can also do this without Perl:
%s/\\textit{\(\_.\{-}\)}/\1/g
Here:
\_. -- matches any character, including a newline
\{-} -- matches as few characters as possible (the non-greedy version of *)
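For example, running the first command over the sample text from the question turns

\textit{Vim} is undoubtedly one of the best editors ever made.

into

Vim is undoubtedly one of the best editors ever made.

while leaving \textbd{Vim-LaTeX} untouched, since the pattern only matches the literal \textit.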


Copy File with Regex Replacement

I'm trying to find a simple/elegant command-line solution for a process that is often used in scripts. Something like: (Fictional example)
CopyWithReplace <SourceFile> <DestFile> -m <match regular expression> -r <replacement regular expression>
It would copy the text file with the matched text replaced as specified. Ideally, the find/replace would happen in the pipeline, rather than as a secondary step. (Destinations quite often are remote locations, and long distance WAN links are often not as fast and reliable as desired.)
What would be the simplest** way to achieve this scriptable functionality in a Windows environment?
** Simplest = easy to write batch code, fewest third-party tools, etc. Bonus points for a reasonably standard regex implementation.
This can be achieved with sed.
The basic usage pattern, for a substitution as you described, is:
sed 's/regexp/replacement/g' inputFileName > outputFileName
sed is a Unix utility, but there are several ways of using it in Windows if you wish. This StackOverflow post lists the various options available.
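On Windows (cmd.exe), use double quotes instead of single quotes around the script; the fictional example then maps directly onto:
sed "s/<match regular expression>/<replacement regular expression>/g" <SourceFile> > <DestFile>
The replacement happens in a single pass as the output is written to its destination, which also covers the "in the pipeline" requirement.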

Finding and modifying function definitions (C++) via bash script

Currently I am working on a fairly large project. In order to increase the quality of our code, we decided to enforce the treatment of return values (error codes) for every function. GCC supports a warning about unused return values; however, the function definition has to include the following attribute:
static __attribute__((warn_unused_result)) ErrorCode test() { /* code goes here */ }
I want to implement a bash script that parses the entire source code and issues a warning in case the
__attribute__((warn_unused_result))
is missing.
Note that all functions that require this kind of modification return a type called ErrorCode.
Do you think this is possible via a bash script?
Maybe you can use sed with regular expressions. The following worked for me on a couple of test files I tried:
sed -r "s/ErrorCode\s+\w+\s*\(.*\)\s*\{/__attribute__((warn_unused_result)) &/g" test.cpp
If you're not familiar with regex, the pattern basically translates into:
ErrorCode, some whitespace, some alphanumerics (function name), maybe some whitespace, open parenthesis, anything (arguments), close parenthesis, maybe some whitespace, open curly brace.
If this pattern is found, it is prefixed with __attribute__((warn_unused_result)). Note that this only works if you always put the opening curly brace on the same line as the arguments and you don't have line breaks in your function definitions.
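Since the question asks for a warning rather than an in-place rewrite, a minimal sketch along the same lines (assuming the attribute, when present, sits on the same line as the definition, and using a hypothetical src/ directory) could be:
# print every ErrorCode function header that lacks the attribute
grep -rnE 'ErrorCode\s+\w+\s*\(' src/ | grep -v 'warn_unused_result'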
An easy way I could imagine is via ctags. You create a tag file over all your source code, and then parse the tags file. However, I'm not quite sure about the format of the tags file. The variant I'm using here (Exuberant Ctags 5.8) seems to put an "f" in the fourth column, if the tag represents a function. So in this case I would use awk to filter all tags that represent functions, and then grep to throw away all lines without __attribute__((warn_unused_result)).
So, in a nutshell, first you do
$ ctags **/*.c
This creates a file called "tags" in the current directory. The command might also be ctags-exuberant, depending on your variant. The **/*.c is a glob pattern that might work in your shell - if it doesn't, you have to supply your source files in another way (look at the ctags options).
Then you filter the functions:
$ cat tags | awk -F '\t' '$4 == "f" {print $0}' | grep -v "__attribute__((warn_unused_result))"
No, it is not possible in the general case. The C++ grammar is the most complex of all the languages I know of, and C++ is not parsable via regular expressions in the general case. You might succeed if you limit yourself to a very narrow set of uses, but I am not sure how feasible it is in your case.
I also do not think the exercise is worth the effort, since sometimes ignoring the result of a function is a perfectly OK thing to do.

Ignore multiline comments in git diff

I'm trying to find the significant differences in C/C++ source code, i.e. changes to the actual source code only. I know you can use git diff -G<regex>, but it seems very limiting in the kind of regexes that can be run. For example, it doesn't seem to offer a way to ignore multiline comments in C/C++.
Is there any way in git or preferably libgit2 to ignore comments (including multiline), whitespaces, etc. before a diff is run? Or a way of determining if a line from the diff output is a comment or not?
Use git diff -w to ignore whitespace differences.
You cannot ignore multiline comments because git is a versioning tool, not a language dependent interpreter. It doesn't know your code is C++. It does not parse files for semantics, so it cannot interpret what is comment and what isn't. In particular, it relies on diff (or a configured difftool) to compare text files and it expects a line-by-line comparison.
I agree with @andrew-c that what you are really asking is to compare the two pieces of code without comments. More specifically, you are asking to compare the lines of code where all multiline comments have been turned into blank lines. You keep the blank lines there so you have the correct line numbers to reference against a normal copy.
So you could manually convert the two code states to blank out multiline comments... or you might look at building your own diff wrapper that did the stripping for you. But the latter is not likely to be worth the effort.
You can achieve this using git attributes and diff filters, as described in Viewing git filters output when using meld as a diff tool, to call a sed script. That script, however, is pretty complex on its own if you want it to handle all cases, like comment delimiters inside string literals etc.
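As a rough sketch of the mechanism (the filter name and script path are made up here, and the sed expression only strips // comments and block comments that open and close on the same line, ignoring string literals entirely), you point a textconv driver at a small stripping script:

# .gitattributes
*.cpp diff=nocomment
*.h diff=nocomment

# .gitconfig (or .git/config)
[diff "nocomment"]
    textconv = /path/to/strip-comments.sh

# /path/to/strip-comments.sh
#!/bin/sh
# git passes the path of the file to convert as the first argument
sed -e 's|/\*[^*]*\*/||g' -e 's|//.*$||' "$1"

Note that textconv only affects what git diff displays; the stored blobs are untouched.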

When and why did the output of qr() change?

The output of Perl's qr// operator has changed, apparently sometime between versions 5.10.1 and 5.14.2, and the change is not documented, at least not fully.
To demonstrate the change, execute the following one-liner on each version:
perl -e 'print qr(foo)is."\n"'
Output from perl 5.10.1-17squeeze6 (Debian squeeze):
(?-xism:foo)
Output from perl 5.14.2-21+deb7u1 (Debian wheezy):
(?^:foo)
The perl documentation (perldoc perlop) says:
$rex = qr/my.STRING/is;
print $rex; # prints (?si-xm:my.STRING)
s/$rex/foo/;
which appears to no longer be true:
$ perl -e 'print qr/my.STRING/is."\n"'
(?^si:my.STRING)
I would like to know when this change occurred (which version of Perl, or supporting library or whatever).
Some background, in case it's relevant:
This change has caused a bunch of unit tests to fail. I need to decide if I should simply update the unit tests to reflect the new format, or make the tests dynamic enough to support both formats, etc. To make an informed decision, I would like to understand why the change took place. Knowing when and where it took place seems like the best place to start in that investigation.
It's documented in perl5140delta:
Regular Expressions
(?^...) construct signifies default modifiers
[...] Stringification of regular expressions now uses this notation. [...]
This change is likely to break code that compares stringified regular expressions with fixed strings containing ?-xism.
The function regexp_pattern can be used to parse the modifiers for normalisation purposes.
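For example (a small sketch; the exact modifier string returned may differ between Perl versions, so treat the comments as illustrative), a test can normalise like this instead of comparing stringified regexes:

use re qw(regexp_pattern);

my ($pat, $mods) = regexp_pattern(qr/my.STRING/is);
print "$pat\n";   # my.STRING
print "$mods\n";  # the bare modifier letters, e.g. "si", regardless of
                  # whether stringification yields (?si-xm:...) or (?^si:...)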
Part of the reason this was added was that regular expressions were gaining quite a few new modifiers.
Your example would actually have produced something like this if that change hadn't happened:
(?d-xismpaul:foo)
Even that doesn't really express the modifiers properly:
d/u/l can only be added to a regex, not subtracted the way i can. They are also mutually exclusive.
a/aa: there are actually two levels to this modifier.
While work was underway adding these modifiers, it was determined that this would break quite a few tests in CPAN modules.
Since the tests were going to break anyway, it was agreed that there should be a way of saying "just use the defaults" ((?^:…)).
That way, the tests wouldn't have to be updated every time a new modifier was added.
To obtain the stringified form of a regexp you can use Regexp::Parser and its qr method. Using this module you can not only test the representation of a regexp, but also walk its parse tree.

Highlighting Javascript Inline Block Comments in Vim

Background
I use JScript (Microsoft's ECMAScript implementation) for a lot of Windows system administration needs. This means I use a lot of ActiveX (Automated COM) objects. The methods of these objects often expect Number or Boolean arguments. For example:
var fso = new ActiveXObject("Scripting.FileSystemObject");
var a = fso.CreateTextFile("c:\\testfile.txt", true);
a.WriteLine("This is a test.");
a.Close();
(CreateTextFile Method on MSDN)
On the second line you can see the kind of argument I'm talking about: a Boolean of "true" doesn't really describe how the method's behavior will change. This isn't a problem for me, but my automation-shy coworkers are easily spooked. Not knowing what an argument does spooks them. Unfortunately, a long list of constants (not real constants, of course, since current JScript versions don't support them) would also spook them. So I've taken to documenting some of these basic function calls with inline block comments. The second line in the above example would then be written as such:
var a = fso.CreateTextFile("c:\\testfile.txt", /*overwrite*/ true, /*unicode*/ false);
That ends up with a small syntax highlighting dilemma for me, though. I like my comments highlighted vibrantly; both block and line comments. These tiny inline block comments mean little to me, personally, however. I'd like to highlight those particular comments in a more muted fashion (light gray on white, for example). Which brings me to my dilemma.
Dilemma
I'd like to override the default syntax highlighting for block comments when both the beginning and end marks are on the same line. Ideally this is done solely in my vimrc file, and not in a superseding personal copy of the javascript.vim syntax. My initial attempt is pathetic:
hi inlineComment guifg=#bbbbbb
match inlineComment "\/\*.*\*\/"
Straight away you can see the first problem with this regular expression pattern is that it's a greedy search. It's going to match from the first "/*" to the last "*/" on the line, meaning everything between two inline block comments will get this highlight style as well. I can fix that, but I'm really not sure how to deal with my second concern.
Comment delimiters inside String literals don't start comments in ECMAScript, but this match will override String highlighting anyway. I've never had a problem with this in system administration scripts, but it does often bite me when I'm examining the source of many javascript libraries intended for browsers (less.js for example).
What regex pattern, syntax definition, or other solution would the amazing StackOverflow community recommend to restore my vimrc zen?
I'm not sure, but from your description it sounds like you don't need a new syntax definition. Vim syntax files usually let you override a particular syntax item with your own choice of highlighting. In this case, the item you want is called javaScriptComment, so a command like this will set its highlighting:-
hi javaScriptComment guifg=#bbbbbb
but you have to do this in your .vimrc file (or somewhere that's sourced from there), so it's evaluated before the syntax file. The syntax file uses the highlight default command, so the syntax file's choice of highlighting only affects syntax items with no highlighting set. See :help :hi-default for more details on that. BTW, it only works on Vim 5.8 and later.
The above command will change all inline /* */ comments, and leave // line comments with their default setting, because line comments are a different syntax item (javaScriptLineComment). You can find the names of all these groups by looking at the javascript.vim file. (The easiest way to do this is :e $VIMRUNTIME/syntax/javascript.vim .)
If you only want to change some inline comments, it's a little more complicated, but still easy to see what to do by looking at javascript.vim . If you do that, you can see that block comments are defined like this:-
syn region javaScriptComment start="/\*" end="\*/" contains=@Spell,javaScriptCommentTodo
See that you can use separate regexes for begin and end markers: you don't need to worry about matching the stuff in between with non-greedy quantifiers, or anything like that. To have a syntax item that works similarly but only on one line, try adding the oneline option (:h :syn-oneline for more details):-
syn region myOnelineComment start="/\*" end="\*/" oneline
I've removed the two contains groups because (1) if you're only using it for parameter names, you probably don't want spell-checking turned on inside these comments, and (2) contained sections that aren't oneline override the oneline in the container region, so you would still match all TODO comments with this region.
You can define this new kind of comment region in your .vimrc, and set the highlighting how you like: it looks like you already know how to do that, so I won't go into more details on that. I haven't tried out this particular example, so you may still need a bit of fiddling to make it work. Give it a try and let me know how it goes.
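Putting the pieces together, one possible .vimrc arrangement (an untested sketch; the group names here are arbitrary, and it should come after your :syntax on line so it runs after javascript.vim has been sourced) would be:

" define a muted inline-comment region whenever javascript syntax is set up
augroup InlineBlockComment
  autocmd!
  autocmd Syntax javascript
        \ syn region myOnelineComment start="/\*" end="\*/" oneline |
        \ hi myOnelineComment guifg=#bbbbbb
augroup END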
Why don't you simply add a comment line above the call?
I think that
// fso.CreateTextFile(filename:String, overwrite:Boolean, unicode:Boolean)
var a = fso.CreateTextFile("c:\\testfile.txt", true, false);
is a lot more readable and informative than
var a = fso.CreateTextFile("c:\\testfile.txt", /*overwrite*/ true, /*unicode*/ false);