Mass replace deprecated functions in entire projects - regex

I have a bunch of PHP coded websites that have been recently moved to a PHP 5.4 server and they're throwing deprecation warnings and errors.
Is there a way to mass find & replace function names with the proper ones? For example, I would like to be able to replace all instances of session_unregister('someVar') with unset($_SESSION['someVar'])...
Should i use regex or is there an other way?

For this particular example you could use sed like this:
echo "session_unregister('someVar')" | sed 's/session_unregister(/unset\($_SESSION[/;s/)/])/'
A bit more flexible would be to use the C preprocessor. Assume your php source file name is my.php. Add extension .h so it becomes my.php.h. At the beginning of the file, insert:
#define session_unregister(X) unset($_SESSION[X])
Assume the file contains lines like in your example: session_unregister('someVar')
Run the preprocessor like this:
cc -E my.php.h
Now you should instead see unset($_SESSION['someVar'])
(plus some extra garbage you don't want).
Note that this just answers your particular question, but I wouldn't recommend it without more detailed testing.

Related

How does Omnisharp use wildcards in its config files?

Background
This relates to an older stackoverflow question. I was hoping to ask for more details but haven't got the Reputation to write comments yet.
Circumstances are the same: I'm adding codecheck warnings that I want to ignore, by editing the "IgnoredCodeIssues" section of Omnisharp's config.json file.
The question
What wildcard/regexp characters work here and how? Is it perhaps a known standard with its own docs somewhere that I can read?
Example
If I enter an issue warning verbatim it works, but it would be way more efficient to use wildcards. For example this warning:
Method 'Update' has the same with 'Start'
is a warning I don't care about and it's going to pop up a lot. A good solution would be to configure it to work for all instances of this issue, i.e. to use wildcards at the 'Update' and 'Start' parts.
Using a typical regexp it would look like this:
/(Method)\s'\w+'\shas the same with\s'\w+'/g
but that's obviously not the syntax here and would just break the config file. So I'm hoping to understand the particular syntax of wildcards in this particular file.
More details
I use Omnisharp-sublime and Sublime Text 3.
I've read the docs and rummaged around the GitHub page (no links as my reputation is too low for 2+ links) but the only relevant information is an example config file with a couple of ignored issues:
"IgnoredCodeIssues": [
"^Keyword 'private' is redundant. This is the default modifier.$",
".* should not separate words with an underscore.*"
],
EDIT:
Using
"Method '.*.' has the same with.*",
(note the .*.) makes the warnings go away but I have no idea if there are side-effects or other consequences like hiding warnings that I might do want to be notified of. Is there anyone who has seen wildcard expansions like that before? Would be great to be able to look it up and study it before adding more to my config.json
Based on the examples in the config file, you should just use standard regex inside double quotes. Don't use the Perl style of /regex/replace/options. config.json is read by the OmniSharp server, which is written in C#, so if you're looking to do anything fancy with your regexes, make sure they're C#-compatible.
So, for instance, your example regex would look like this:
"(Method)\s'\w+'\shas the same with\s'\w+'"

Finding and modifying function definitions (C++) via bash-script

Currently I am working on a fairly large project. In order to increase the quality of our code, we decided to enforce the treatement of return values (Error Codes) for every function. GCC supports a warning concerning the return value of a function, however the function definition has to be preceeded by the following flag.
static __attribute__((warn_unused_result)) ErrorCode test() { /* code goes here */ }
I want to implement a bashscript that parses the entire source code and issues a warning in case the
__attribute__((warn_unused_result))
is missing.
Note that all functions that require this kind of modification return a type called ErrorCode.
Do you think this is possible via a bash script ?
Maybe you can use sed with regular expressions. The following worked for me on a couple of test files I tried:
sed -r "s/ErrorCode\s+\w+\s*(.*)\s*\{/__attribute__((warn_unused_result)) \0/g" test.cpp
If you're not familiar with regex, the pattern basically translates into:
ErrorCode, some whitespace, some alphanumerics (function name), maybe some whitespace, open parenthesis, anything (arguments), close parenthesis, maybe some whitespace, open curly brace.
If this pattern is found, it is prefixed by __attribute__((warn_unused_result)). Note that this only works if you are putting the open curly brace always in the same line as the arguments and you don't have line breaks in your function declarations.
An easy way I could imagine is via ctags. You create a tag file over all your source code, and then parse the tags file. However, I'm not quite sure about the format of the tags file. The variant I'm using here (Exuberant Ctags 5.8) seems to put an "f" in the fourth column, if the tag represents a function. So in this case I would use awk to filter all tags that represent functions, and then grep to throw away all lines without __attribute__((warn_unused_result)).
So, in a nutshell, first you do
$ ctags **/*.c
This creates a file called "tags" in the current directory. The command might also be ctags-exuberant, depending on your variant. The **/*.c is a glob pattern that might work in your shell - if it doesn't, you have to supply your source files in another way (look at the ctagsoptions).
Then you filter the funktions:
$ cat tags | awk -F '\t' '$4 == "f" {print $0}' | grep -v "__attribute__((warn_unused_result))"
No, it is not possible in the general case. The C++ grammar is the most complex of all the languages I know of, and C++ is not parsable via regular expressions in the general case. You might succeed if you limit yourself to a very narrow set of uses, but I am not sure how feasible it is in your case.
I also do not think the excersise is worth the effort, since sometimes ignoring the result of the function is an OK thing.

Using regex with `rename` version from `util-linux`

I’m using a GNU/Linux distribution where the utility rename comes from util-linux and I want to make full use of regular (Perl or POSIX) expressions with it.
There are two versions of rename :
The “Perl” version, with syntax rename 's/^fgh/jkl/' fgh*
The util-linux version, with syntax rename fgh jkl fgh*
If the use of regexes seems pretty obvious with the first one, to which I have no easy access. However, I’m confused about the second one: I could not find any relevant documentation or examples on the possible use, and in that case the format, of the regular expressions to use.
Let’s take, to make a simple example, a directory containing:
foo_a1.ext
foo_a32.ext
foo_c18.ext
foo_h12.ext
I want to use a syntax like one of these two lines:
rename "foo_[a-z]([0-9]{1,2}).ext" "foo_\1.ext" *
rename "foo_[:alpha:]([:digit:]{1,2}).ext" "foo_\1.ext" *
for which the expected output would be:
foo_1.ext
foo_32.ext
foo_18.ext
foo_12.ext
Of course this does not work! Either I’m missing something obvious, or there is
no implemented way to use actual regular expressions with this tool.
(Please note that I am aware of the other possibilities for renaming files with regular expressions in a shell interpreter; this question aims at a specific version of the rename tool.)
Here is the manual page: http://linux.die.net/man/1/rename. It is pretty straightforward:
rename from to file...
rename will rename the specified files by replacing the first
occurrence of from in their name by to.
I believe there are no regexes, it is just plain substring match.
The following command gives the expected result with your input file but using the perl version:
rename 's/foo_\D+(\d+)/foo_$1/' *.ext
You can test the command using -n option to rename

Bash script - mass modify files sed regular expression

I have a set of .csv files (all in one folder) with the format shown below:
170;151;104;137;190;125;170;108
195;192;164;195;171;121;133;104
... (a lot more rows) ...
The thing is I screwed up a bit and it should look like this
170;151;104.137;190.125;170;108
195;192;164.195;171.121;133;104
In case the difference is too subtle to notice:
I need to write a script that changes every third and fifth semicolon into a period in every row in efery file in that folder.
My research indicate that I have to devise some clever sed s/ command in my script. The problem is I'm not very good with regular expressions. From reading the tutorial it's probably gonna involve something with /3 and /5.
Here's a really short way to do it:
sed 's/;/./3;s/;/./4' -iBAK *
It replaces the 3rd and then the 5th (which is now the 4th) instances of the ; with ..
I tested it on your sample (saved as sample.txt):
$ sed 's/;/./3;s/;/./4' <sample.txt
170;151;104.137;190.125;170;108
195;192;164.195;171.121;133;104
For safety, I have made my example back up your originals as <WHATEVER>.BAK. To prevent this, change -iBAK to -i.
This script may not be totally portable but I've tested it on Mac 10.8 with BSD sed (no idea what version) and Linux with sed (gsed) 4.1.4 (2003). #JonathanLeffler notes that it's standard POSIX sed as of 2008. I also just found it and like it a lot.
Golf tip: If you run the command from bash, you can use brace expansion to achieve a supremely short version:
sed -es/\;/./{3,4} -i *
Here's one way:
sed -i 's/^\([^;]*;[^;]*;[^;]*\);\([^;]*;[^;]*\);/\1.\2./' foldername/*
(Disclaimer: I did test this, but some details of sed are not fully portable. I don't think there's anything non-portable in the above, so it should be fine, but please make a backup copy of your folder first, before running the above. Just in case.)

Hints and tools for finding unmatched braces / preprocessor directives

This is one of my most dreaded C/C++ compiler errors:
file.cpp(3124) : fatal error C1004: unexpected end-of-file found
file.cpp includes almost a hundred header files, which in turn include other header files. It's over 3000 lines. The code should be modularized and structured, the source files smaller. We should refactor it. As a programmer there's always a wish list for improving things.
But right now, the code is a mess and the deadline is around the corner. Somewhere among all these lines—quite possibly in one of the included header files, not in the source file itself—there's apparently an unmatched brace, unmatched #ifdef or similar.
The problem is that when something is missing, the compiler can't really tell me where it is missing. It just knows that when it reached end of the file it wasn't in the right parser state.
Can you offer some tools or other hints / methodologies to help me find the cause for the error?
If the #includes are all in one place in the source file, you could try putting a stray closing brace in between the #includes. If you get an 'unmatched closing brace' error when you compile, you know it all balances up to that point. It's a slow method, but it might help you pinpoint the problem.
One approach: if you have Notepad++, open all of the files (no problem opening 100 files), search for { and } (Search -> Find -> Find in All Open Documents), note the difference in count (should be 1). Randomly close 10 files and see if the difference in count is still 1, if so continue, else the problem is in one of those files. Repeat.
Handy tip:
For each header file, auto-generate a source file which includes it, then optionally contains an empty main method, and does nothing else. Compile all of these files as test cases, although there's no point running them.
Provided that each header includes its own dependencies (which is a big "provided"), this should give you a better idea which header is causing the problem.
This tip is adapted from Google's published C++ style guide, which says that each component's source files should include the interface header for that component before any other header. This has the same effect, of ensuring that there is at least one source file which will fail to compile, and implicate that header, if there's anything wrong with it.
Of course it won't catch unmatched braces in macros, so if you use much in the way of macros, you may have to resort to running the preprocessor over your source file, and examining the result by hand and IDE.
Another handy tip, which is too late now:
Check in more often (on a private branch to keep unfinished code out of everyone else's way). That way, when things get nasty you can diff against the last thing that compiled, and focus on the changed lines.
Hints:
make small changes and recompile after each small change
for unmatched braces, use an editor/IDE that supports brace match hilighting
if all else failds, Ye Olde method of commenting out blocks of code using a binary chop approach works for me
Very late to the party, but on linux you can use fgrep -o
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
So if you fgrep -o { then you'll get a list of all the opening braces in your file.
You can then pipe that to wc -l and that will give you the number of opening braces in your file.
With a bit of bash arithmetic, you can do the same with closing braces, and then print the difference.
In the below example I'm using git status -s to get the short-format output of all modified files from my git repo, and then I'm iterating over them to find which files may have mismatched braces
for i in $(git status -s | awk '{print $2}'); do
open=$(fgrep -o { $i | wc -l); # count number of opening braces
close=$(fgrep -o } $i | wc -l); # count number of closing braces
diff=$((open-close)); # find difference
echo "$diff : $i"; # print difference for filename
done
I think using some editor with brace highlighting will help. There should also be some tools around that do automatic indention on code.
EDIT: Does this vim script help? It seems to do #ifdef highlighting.
Can you create a script that actually does all the including, and then write the whole stuff to a temporary file? Or try the compiler for helping you with that? Then you can use the bracket highlighting features of various editors to find the problem and you'll probably be able to identify the file. (An extra help can be to have your script add a comment around every included file).
This might not be relevant to your problem, but I was getting an "Unexpected #else" error when framing some header files in an #if/#else/#endif block.
I found that if I set the problem modules to not use pre-compiled headers, the problem went away. Something to do with the "#pragma hdrstop" should not be within an #if/#endif.
Have a look at this question (Highlighting unmatched brackets in vim)
Pre-compile your code first, this will create a big chunk with the include files stuffed into the same file. Then you can use those brace-matching scripts.
I recently ran into this situation while refactoring some code and I'm not a fan of any of the answers above. The problem is that they neglect to incorporate a pretty basic assumption:
Any given file (most likely) will have matching braces and if/endif macros.
While it is true one can have a C++ source file that opens a bracket or an if block and includes another module that closes it, I have never seen this in practice. Something like the following:
foo.cpp
namespace Blah{
#include "bar.h"
bar.h:
}; /// namespace Blah
So with the assumption that any given file / module contains matching sets of braces and preprocessor directives, this problem is now trivial to solve. We just need a script to read the files / count the brackets & directives that it finds and report mismatches.
I've implemented one in Ruby here, feel free to take and adapt to your needs:
https://gist.github.com/movitto/6c6d187f7a350c2d71480834892552ab
I just spent an hour with a problem like this. It was very hard to spot, but in the end, I had typed a double # on one line:
##ifdef FEATURE_X
...
That broke the world.
Easy one to search for though.
The simplest and fastest solution is to comment file from the end by consistent blocks. If the error still exists, then you not yet comment the open brace.