I am working on creating a package with two new commands, say foo and bar.
For example, if foo.ado contains:
program define foo
...
rex
end
program define rex
...
end
But my other command, bar.ado, also needs to call rex. Where should I put rex?
I see the following few options:
Create a rex.ado file as well.
Create a rex.do file and include it from within both foo.ado and bar.ado using include "`c(sysdir_plus)'r/rex.do" at the bottom of each file.
Copy the code into both foo.ado and bar.ado, which seems ugly because now the code must be maintained in two places.
What is best practice for organizing subroutines that are needed by both foo and bar?
Also, should the subroutine be called rex, _rex, or something else — maybe _foobar_rex — to indicate it is actually a sub-command that foo and bar depend on to work correctly rather than a separate command intended to stand on its own?
Create a rex.ado file as well
Your question is a bit broad. Personally, I would go with the first option to be safe, although it really depends on the structure of your project. Sometimes including rex inside a single ado-file may be enough; this will be the case, for example, if foo is a wrapper command. However, for most other use cases, including two commands sharing a common program, I strongly believe you will need a separate ado-file.
The second option is clearly unnecessary, since the first does the same thing without having to load the program from a do-file every single time you call it. The third option is probably the worst in a programming context, as it may create conflicts and will be difficult to maintain down the road.
With regard to naming conventions, I would recommend using something like _rex only if you include the program as a subroutine within an ado-file. Otherwise, rex will do just fine and will also indicate that the program has a wider scope within your project. It is also better, in my opinion, to give a more elaborate explanation of the intended use of rex in a comment at the start of the ado-file, rather than trying to encode this in the name.
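For what it's worth, a minimal sketch of the first option (the version statement and the body are placeholders): save the common program as rex.ado on the adopath, alongside foo.ado and bar.ado:

*! rex.ado -- shared subroutine used by foo and bar
program define rex
    version 14
    // common work goes here
    display "rex was called"
end

Both foo.ado and bar.ado can then simply call rex, and Stata will load it automatically the first time it is needed.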
Related
struct Bar {}  // assume Bar is defined somewhere

struct Foo {
    Bar get() {
        return Bar();
    }
}

void main() {
    auto f = Foo();
    f.get();
}
For example, you decide that get was a very poor choice of name, but you have already used it in many different files, and manually changing every occurrence is very annoying.
You also can't really make a global substitution because other types may also have a method called get.
Is there anything for D to help refactor names for types, functions, variables etc?
Here's how I do it:
Change the name in the definition
Recompile
Go to the first error line reported and replace old with new
Goto 2
That's semi-manual, but I find it pretty easy, and it goes quickly because the compiler error messages bring you right to where you need to be. Most editors can read those error messages well enough to drop you on the correct line; then it is a simple matter of telling the editor to repeat the last replacement. (In my vim setup with my hotkeys, I hit F4 for the next error message, then dot to repeat the last change, until it is done. Even a function with a hundred uses can be changed reliably in a couple of minutes.)
You could probably write a script that handles 90% of cases automatically too, by looking for ": Error: " in the compiler's output, extracting the file and line number, and running a plain text replace there. If the word shows up only once on that line and outside a string literal, you can replace it automatically; if not, ask the user to handle the remaining 10% of cases manually.
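A rough, untested sketch of such a script in D (it assumes dmd-style messages of the form file.d(42): Error: ... on standard input, and it blindly replaces every occurrence of the old name on each reported line; invoke it as rename-helper oldName newName < compiler-output):

import std.array : join, replace;
import std.conv : to;
import std.file : readText, write;
import std.regex : matchFirst;
import std.stdio : stdin;
import std.string : splitLines;

void main(string[] args)
{
    auto oldName = args[1], newName = args[2];
    foreach (line; stdin.byLineCopy)
    {
        // e.g. "src/foo.d(42): Error: undefined identifier `get`"
        auto m = line.matchFirst(`^(.+)\((\d+)\): Error: `);
        if (m.empty) continue;
        auto lines = readText(m[1]).splitLines();
        auto idx = m[2].to!size_t - 1;
        lines[idx] = lines[idx].replace(oldName, newName);
        write(m[1], lines.join("\n") ~ "\n");
    }
}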
But I think it is easy enough to do with my editor hotkeys that I've never bothered trying to script it.
The one case this doesn't catch is if there's another function with the same name that might still compile. That should never happen if you do this change in isolation, because an ambiguous name wouldn't compile without it.
In that case, you could probably do a three-step compiler-assisted change:
Make sure your code compiles beforehand. Then add @disable to the thing you want to rename.
Compile. Everywhere the compiler complains about the symbol being unusable because it is disabled, do the find/replace.
Remove @disable and rename the definition. Recompile again to make sure there's nothing you missed, like child classes (the compiler will then complain "method foo does not override any function", so they stand right out too).
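For step 1, the temporary marking would look something like this (a sketch based on the question's Foo/get; fetch is just a stand-in for whatever new name you pick):

struct Foo {
    // step 1: disable the old symbol so every remaining use becomes a compile error
    @disable Bar get();

    // step 3: remove @disable and rename the definition
    Bar fetch() {
        return Bar();
    }
}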
So yeah, it isn't fully automated, but just changing it and having the compiler errors help find what's left is good enough for me.
Some limited refactoring support can be found in major IDE plugins like Mono-D or VisualD. I remember that Brian Schott had plans to add similar functionality to his dfix tool by adding a dependency on dsymbol, but it doesn't seem to be implemented yet.
Note, however, that all such options are quite limited in robustness right now. This is because figuring out the fully qualified name of any given symbol is a very complex task in D, one that requires full semantic analysis to be done 100% correctly. Think about local imports, templates, function overloading, mixins, and how they all affect identifying the symbol.
In the long run, it is quite certain that we will need to wait until the reference D compiler front end becomes available as a library before such a refactoring tool can be implemented in a clean and truly reliable way.
A good find-all feature can be better than a bad refactoring which, as mentioned previously, requires semantic analysis.
Personally, I have a find-all feature in Coedit that displays the context of each match and works on all the project sources. It makes processing the results fast.
I'm working on a huge code base written many years ago. We're trying to implement multi-threading, and I'm in charge of cleaning up the global variables (sigh!).
My strategy is to move all global variables into a class; individual threads will then use instances of that class, and the globals will be accessed through a class instance and the -> operator.
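Roughly like this (a sketch; the names are hypothetical):

#include <string>

// before: free globals shared by everything
// int g_counter;
// std::string g_log_path;

// after: the globals gathered into one class, one instance per thread
struct GlobalState {
    int counter = 0;
    std::string log_path;
};

void worker(GlobalState* globals) {
    // formerly: g_counter++;
    globals->counter++;
}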
As a first pass, I've compiled a list of global variables using nm by looking for symbols of type B and D. The list is not complete, and in the case of static variables I don't get file and line number information.
The second stage is even messier: I have to replace all globals in the code base with the classinstance->global_name pattern. I'm using cscope's "Change text string" for this. The problem is that some globals have names that are also used for locals inside functions, so cscope replaces those as well.
Any other way to go about it? Any strategies, or help please!
Just some suggestions, from my experience:
Use Eclipse: the C++ indexer is very good, and when dealing with a large project I find it very useful for tracking variables. Shift+Ctrl+G (I have forgotten how to reach it from the menus!) lets you search all references, and Ctrl+Alt+H (Open Call Hierarchy) shows the caller/callee trees.
Use Eclipse: it has good refactoring tools and is able to rename a variable without touching same-name-different-scope variables. (It often fails when templates are involved; I still find it good, better than the Visual Studio 2008 counterpart.)
Use Eclipse: I know, it takes some time to get started with it, but once you do, it's very powerful. It can deal easily with an existing makefile-based project (File -> New -> Project -> Makefile Project with Existing Code).
I would consider not exposing the members directly but using accessors instead: it's possible that some of them will be shared among threads and will need some locking in order to be used properly. So I would prefer classinstance->get_global_name(), roughly as sketched below.
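Something along these lines (a sketch; the names and the locking policy are placeholders):

#include <mutex>

class GlobalState {
public:
    int get_global_name() {
        std::lock_guard<std::mutex> lock(mutex_);
        return global_name_;
    }
    void set_global_name(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        global_name_ = value;
    }
private:
    std::mutex mutex_;
    int global_name_ = 0;
};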
As a final note, I don't know whether using the Eclipse indexer from the command line would be helpful for your task; you can find some examples by googling for it.
This question/answer can give you some more hints: any C/C++ refactoring tool based on libclang? (even simplest "toy example"). In particular, I quote: "...C++ is a bitch of a language to transform".
Halfway there: if a function uses a local name that hides the global name, the object file won't contain an undefined symbol for it. nm can show you those undefined symbols, so you know in which files you must replace at least some instances of that name.
However, you still have a problem in the rare case where a file uses the global name in one function and, in another function, a local name that hides it. I'm not sure whether this can be resolved with -ffunction-sections, but I think so: nm can show the section, so you'll see the undefined symbols used in foo() appear against section .text.foo.
I'm using VS2010
I have a project with several headers and one file with the main() function.
For testing purposes I'd like to be able to easily switch to another main() function that would instantiate different things than my original main.
Is there an easy way to define 2 "main" functions and easily switch between them?
The best would be to compile 2 binaries, one that starts at main1() and the other at main2(), but a solution that requires recompiling some files would also be fine.
You are almost always better off using a separate compiled binary with a separate main.
First, "for testing purposes" might include code that should never be in the real binary -- such as test library code. That requires a second binary.
Second, even if there is nothing that should be omitted, you still have the issue that anyone can supply the right argument, or copy and rename the binary so that argv[0] matches, and thereby trigger this functionality.
I know it might be difficult to architect your project files to create separate real and test programs, but in most cases, you will have a much better result.
In the linker options you can set the entry point name. This way you can have main1() and main2():
http://msdn.microsoft.com/en-us/library/f9t8842e(v=vs.80).aspx
"There can be only one" What you need to do is create a set of sub functions that main invokes biased upon conditions or though conditional compilation statements.
#ifdef TESTING
int main() {
    /* whatever */
}
#else
int main() {
    /* whatever else */
}
#endif
An application can only have one main. If you want to run two things, you need to do so in main, selecting between them via one of the following (a small sketch follows this list):
The name of the executable that was run (hint: argv[0] is the name of the executable)
Further command line parameters (program -thingone)
Lazily commenting out calls to functions which do something.
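A minimal sketch of that dispatch (main_one, main_two, and the flag name are placeholders):

#include <cstring>
#include <iostream>

int main_one() { std::cout << "test build\n"; return 0; }
int main_two() { std::cout << "normal build\n"; return 0; }

int main(int argc, char** argv) {
    // dispatch on a command-line flag; argv[0] could be tested the same way
    if (argc > 1 && std::strcmp(argv[1], "-thingone") == 0)
        return main_one();
    return main_two();
}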
Besides specifying different entry points in the linker, or having a real main() that calls whichever lower-level function you want to pretend is a top-level function, you could add a project for each main() you want.
This can be somewhat annoying in VS because separate projects aren't set up by default to share source code. Some other IDEs make it easier to build different executables (or other build products) from different subsets of a shared set of source code, but I've never found that easy using VS's defaults.
I'm using a tool chain that is more like a web: there are lots of alternate start points, all resulting in a single final output.
I typically use make or scons - actually, I prefer scons, but my team highly prefers make. I'm open to other build tools.
E.g. final_result depends on penultimate
final_result: penultimate
penultimate may be made in any of several different ways:
If starting from file1, then
penultimate: file1 ; rule1
If starting from file2, then
penultimate: file2 ; rule2
Q: how do I specify to start with file2, not file1?
I suppose that I could use a command line switch and ifdeffing, but I would prefer to have make or scons figure out "Hey, there's a file2 around, so I should use rule2, not file1/rule1". In part because the web is much more complex than this...
Worse, sometimes an intermediate on one path may be a start on another. Let's see:
A .s produces a .diag
foo.diag: foo.s
But sometimes there is no .s, and I just have a .diag that somebody else gave me already built.
A .diag produces a .heximg, and a .hwresult
foo.hwresult: hwsim foo.heximg
foo.heximg: foo.diag
But sometimes I am given a .img directly
Etc.
I just want to write the overall dependency graph, and say "OK, now here's what I have been given - now how do you get to the final result?"
With what I have now, when I am given, say, a foo.img, I get told (by make in this case) "foo.s not found", because make wants to go all the way back in the dependency graph to decide whether foo.img is out of date, whereas I want to say "assume foo.img is up to date, and work forwards to the stuff that depends on foo.img, instead of going back for the stuff that foo.img depends on".
You have to do it all with pattern rules (implicit rules). If you specify an explicit rule then make considers that a hard dependency and if some portion of the dependency is not met, make will fail.
If you use an implicit rule then make will consider that a POSSIBLE way to build the target. If that way doesn't work (because some prerequisite does not exist and make doesn't know how to build it) make will try another way. If no way works, and the target already exists, make will just use that target without having to update it.
Also, you say "a .diag produces a .heximg and a .hwresult", then give a strange example makefile syntax that I didn't recognize; but FYI, with pattern rules you can specify that a single command generates multiple outputs (you can't do this with explicit rules):
%.heximg %.hwresult: %.diag
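Put together, a sketch of the chain using only pattern rules might look like this (the commands are placeholders, and recipe lines must be indented with a tab):

# foo.s -> foo.diag, only attempted when a .s exists or can be made
%.diag: %.s
	gen-diag $< -o $@

# one invocation produces both the .heximg and the .hwresult
%.heximg %.hwresult: %.diag
	hwsim $< --img $*.heximg --result $*.hwresult

final_result: foo.hwresult
	collect $^ -o $@

If you are handed a prebuilt foo.diag and there is no foo.s, the %.diag rule simply doesn't apply, and make uses the existing foo.diag as-is.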
Here's the bad news: the only way to define an implicit rule in GNU make is if there is a common "stem" in the filename. That is, you can write an implicit rule that converts foo.diag to foo.heximg by writing a pattern rule "%.heximg: %.diag", because they have a common stem "foo", but there's no way to create a pattern rule for a compilation from "foo1" to "penultimate", because they don't share a common stem.
I'm not sure, but probably you're looking for Double-Colon Rules:
Double-colon rules are explicit rules written with :: instead of : after the target names. They are handled differently from ordinary rules when the same target appears in more than one rule.
Double-colon rules are somewhat obscure and not often very useful; they provide a mechanism for cases in which the method used to update a target differs depending on which prerequisite files caused the update, and such cases are rare.
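For the penultimate example from the question, the syntax would look something like this (rule1 and rule2 stand in for whatever commands actually build it; each :: rule is checked independently, and its recipe runs when its own prerequisite is newer than the target):

penultimate:: file1
	rule1 file1 -o penultimate

penultimate:: file2
	rule2 file2 -o penultimate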
I would like to write a small tool that takes a C++ program (a single .cpp file), finds the "main" function, and adds 2 function calls to it, one at the beginning and one at the end.
How can this be done? Can I use g++'s parsing mechanism (or any other parser)?
If you want to make it solid, use clang's libraries.
As suggested by some commenters, let me put forward my idea as an answer:
So basically, the idea is:
... original .cpp file ...
#include <yourHeader>
namespace {
    SpecialClass specialClassInstance;
}
Where SpecialClass is something like:
class SpecialClass {
public:
    SpecialClass() {
        firstFunction();
    }
    ~SpecialClass() {
        secondFunction();
    }
};
This way, you don't need to parse the C++ file. Since you are declaring a global, its constructor will run before main starts and its destructor will run after main returns.
The downside is that you don't get to know the relative order in which your global is constructed compared to other globals. So if you need to guarantee that firstFunction is called before any other constructor elsewhere in the entire program, you're out of luck.
I've heard the GCC parser is both hard to use and even harder to get at without invoking the whole toolchain. I would try the Clang C/C++ parser (libparse) and the tutorials linked in this question.
Adding a function call at the beginning of main() and another at the end of main() is a bad idea. What if someone returns in the middle?
A better idea is to instantiate a class at the beginning of main() and let that class's destructor call the function you want called at the end. This ensures that the function always gets called.
If you have control of your main program, you can hack together a script to do this, and that's by far the easiest way. Simply make sure the insertion points are obvious (odd comments, required placement of tokens, you choose) and unique (including outlawing general coding practices if you have to, to ensure the uniqueness you need is real). Then a dumb string-hacking tool that reads the source, finds the unique markers, and inserts your desired calls will work fine.
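For instance, a naive sketch of such a string-hacking tool (the marker comments, the file name, and the inserted calls are all placeholders):

#include <fstream>
#include <sstream>
#include <string>

int main() {
    std::ifstream in("main.cpp");   // file to patch
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        out << line << '\n';
        // after each unique marker, splice in the desired call
        if (line.find("// TOOL: PROLOGUE") != std::string::npos)
            out << "    firstFunction();\n";
        if (line.find("// TOOL: EPILOGUE") != std::string::npos)
            out << "    secondFunction();\n";
    }
    in.close();
    std::ofstream("main.cpp") << out.str();
}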
If the source of the main program comes from other sources and you don't have control, then to do this well you need a full C++ program transformation engine. You don't want to build this yourself, as just the C++ parser is an enormous effort to get right. Others here have mentioned Clang and GCC as answers.
An alternative is our DMS Software Reengineering Toolkit with its C++ front end. DMS, using its C++ front end, can parse code (for a variety of C++ dialects), build ASTs, and carry out full name/type resolution to determine the meaning, definition, and use of all symbols. It provides procedural and source-to-source transformations to enable changes to the AST, and can regenerate compilable source code complete with the original comments.