How to conveniently refactor OCaml project?

How to conveniently refactor OCaml project? - ocaml

I am using Emacs + Tuareg mode to do my OCaml project.
It is working fine and I get used to it.
However, along with my project source base getting bigger and bigger, I find managing the project is getting harder and harder.
Especially for refactoring. If I change a module name or function name, I have to search everywhere for the part that need to changed accordingly or I just constantly compile again and again to let compiler tell me where I should go.
It is not convenient.
Anyone can suggest a good way for source base management?
thanks

A good option is TypeRex. This is an alternative Emacs mode created by OCamlPro that has a bunch of OCaml-aware features including proper support for refactoring (like renaming identifiers).
It also has a bunch of other nice features like good auto-complete, semantic grep and so on.
Unfortunately, this involves changing your build process to use some wrapper programs. These generate the additional information the mode needs to function. However, once you get the build set up, it's a really awesome editing environment.

Related

What generic template processor should I use?

This is a potentially dangerous question because interdisciplinary questions and answers will be biased, but I'll have a stab at it anyway. All in good spirit!
So, here we go. I'm writing a major editing mode for Emacs for the language that it has almost no support for yet. And I'm at the point, where I have to decide on a way to generate project files. Below is the syllabus of the task ahead:
The templates have to represent project directory tree, not only single files.
The resulting files are of various formats, potentially including SGML-like languages, but not limited to this variety. They also have to generate C-like source code and, eLisp source code and plain text files, like README, for example.
The templates must be processed in a batch upon user-initiated action (as in user wants to create a project - several files must be created in the user-appointed directory). It may be beneficial to have an ability to supervise the creation, but this is less important then the ability to run the process entirely automatically.
Bonus features:
The template language has already a user base (with a potential of reuse of existing templates).
The templates can be used for code snippets (contain blanks which are filled interactively once the user invokes code-generating routine while editing the file).
Obvious things like cross-platform-ness, ease of use both through graphical interface and command line.
I made a research, but I won't share my results (yet) so I won't bias the answers. The problem with answering this question is not that the answer is hard to find, but that it is hard to chose one from many.

I'm developing a system based on Mustache for exactly the use case that you've described. The template language itself is a very simple extension of Mustache called Groome.
I also released a command-line tool called Molt that renders Groome templates. I'd be curious to know if it does everything that you need. I'm still adding features to the tool and haven't yet announced it. Thanks.

I went to solve a similar problem several years aback, where I wanted to use Emacs to generate code out of a UML diagram (cogre), and also generate Makefiles from project specifications. I first tried to use Tempo, but when I tried to get the templates to nest, I ran into problems. I also looked into skeleton, but that didn't quite fit the plan either.
I ended up using Google Templates for a little bit, and liked the syntax, and developed SRecode instead, and just borrowed the good bits from Google templates. SRecode was written specifically for machine-generated code. The interaction for template insertion (aka - what tempo was written for) isn't first class in SRecode. For generating code from a data structure, however, it is very robust, and has a lot of features, and automatically filled variables. It works closely with your major mode, and allows many nested templates, with control over the nested dictionary values. There is a subsystem that will use Semantic tags and generate code from them for a couple languages. That means you can parse code in one language with Semantic, and generate code in another language with SReocde using those tags. Nifty! Many parts of CEDET Reference manuals were built that way.
The templates themselves allow looping, if statements, and include statements. There are a couple examples in SRecode for making an 'application', such as the comment writer, and EDE uses it to create Makefiles, which is almost exactly what you are trying to do.

Another option is Generator, which offers “language-agnostic project bootstrapping with an emphasis on simplicity”. Installation requires Node.js and npm.
Generator’s emphasis on simplicity means it is very easy to learn how to make a template. Generator also saves you from having to reference templates by file paths – it looks for templates in ~/.generator.
However, there is no way to write README or LICENSE files for the template itself without those files being copied to the generated project. Also, post-generation commands written in the Makefile will be copied to the generated Makefile, even after they are no longer of use. Finally, the ad-hoc templating language doesn’t provide a way to escape its __lowercasevariables__ – though I can’t think of a language where that limitation would be a problem.

As of 2011: Netbeans 7 or Eclipse Indigo for C++?

This is basically a duplicate of:
Netbeans or Eclipse for C++?
But, that question as 3+ years old, and a lot has changed since then.
I have a large code base with a custom (but Makefile based) build system. The areas I am specifically wondering about include:
Syntax highlighting
Code navigation.
Code hints.
"ReSharper style" code helpers.
Documentation integration.
Debugger UI and features.
Has anyone had the chance to evaluate both Netbeans and Eclipse?
EDIT: As a followup question, are any of the Netbeans users here concerned with its future given Oracle's recent bad history with "open" efforts? (Open Solaris, MySQL, Open Office)
Thank you

I cannot comment on eclipse, but on netbeans 7 I will say things that are very important for me and that work fine so far:
code completion, go to declarations
pkg-config automatic include management for parsing
stuff that sometimes works and sometimes don't
find usages, sometimes it might fail to find usages in other open projects
debugger sometimes gets confused with unittest-cpp macros and it will not go on the appropiate line
stuff that are not yet working and i care deeply:
C++0x syntax highlighting (auto, lambdas, enum class, variadic templates, none of them are recognized by the built-in parser)
stuff that is not quite working but i could not care less:
git integration. I enjoy using git from command-line so this is a non-issue
in all, the IDE is very usable. I hope to have a chance to try out latest cdt on Indigo Eclipse, but so far i haven't that much of a real reason to investigate

I cannot comment on Netbeans, but I can offer you information on Eclipse. I work with C++ on UNIX systems, and I have started to use Eclipse when exploring large code bases that I know little about. I don't use it to build, but it would be easy to integrate our build system with it as one only needs commands.
Eclipse has most of what you are looking for: (I'm speaking of Eclipse/CDT)
Not only can you completely customize your syntax highlighting, you can also have it format the code with templates. My company has a code standard for spacing, tabs and formatting of functions and conditional code, and with little effort I was able to modify an existing template to meet our code standards.
The navigation is not bad, if you highlight and hover over a variable, it shows you the definition in a small pop-up bubble. If you do the same for a type, it will you show you where the type is defined. For functions, it will show the first few lines of the implementation of the function, with an option to expand it and see the whole function. I find all of these nice for code discovery and navigation. You can also highlight a variable, and use a right-click menu option to jump to its declaration.
I suppose by code hints you are referring to something like intellisense? This is the main reason why I use Eclipse when looking over a large code base. Just hit the '.' or '->' and a second later you get your options.
The debugger UI is quite capable. You can launch gdb within the tool and it allows you to graphically move through your code just as you would in a tool like ddd or Visual C++. It offers standard features like viewing registers, memory, watching variables, etc.
That being said, I have found some weaknesses. The first is that it doesn't really strongly support revision control systems outside of CVS and SVN very easily (integrated into the GUI). I found a plug-in for the system we use at my company, but it spews XML and Unicode garbage. It was easier to just use the revision control on the command line. I suspect this is the plug-in's issue and not Eclipse. I wish there were better tool integration though.
The second complaint is that for each project I have to manually setup the include directories and library paths. Perhaps with an environment variable this could be circumvented? Or I may just do not know how to set things up correctly. Then again if it is not obvious to a developer how to do this, I consider that a weakness of the tool.
All in all I like working with Eclipse. It is not my main editing environment, but I appreciate it for working on large code bases.

I'm a huge fan of Netbeans. I am in a similar situation to yours, but creating the project was very easy. Just point Netbeans at where the code is checked out and it figures out most things for itself. I rarely have to do any configuration. One thing to note though, if your tree is very large, it can take some time to fully index - and while it does, memory and cpu will be hosed on the box.
The integration with cvs is awesome, and the Hudson integration is very cool for CB. I've not used Git myself, though I should imagine it's a no-brainer.
One thing that does irritate me no end is that it does not behave very well with code relying heavily on templates. i.e. shows lots of warnings and errors about types not being found etc.
I have not used the latest version of Eclipse, I tried the major release before the current one and gave up because it did not have the same smooth project integration with the makefiles etc. I find it's not as nice if you don't want to use it's make system - though I could be wrong.
I don't use any of the code formatting provided, I instead prefer something like AStyle instead. I know that NetBeans does a good job with Java - but have not used it for C++. CDT I seem to remember doing some odd stuff with indentation when formatting C++ code - esp. if templates are involved - but that was atleast two years ago.
Hope some of it helps - the best way to do this is to download and try for yourself and see what works for you. Anything we tell you is purely subjective.

I used to work with Netbeans with MinGW, I Just tried 7.0.1.
I currently use Eclipse Indigo with CDT and MinGW - It's better performance wise (less CPU & Memory).
Netbeans creates a makefile to compile all the time,
In Eclipse you can build directly with the CDT-Toolchain or use Makefile - Eclipse is more flexible.
Debugging: Netbeans might be better in Solaris/Linux.
I Personally rather eclipse over Netbeans, I think eclipse is more professional.

One particular issue that causes me quite a lot of grief with Netbeans 7.0 is that it tends to want to work with utf8 files, and not all of out c++ projects are utf8. It will issue a warning about opening such a file, and if you do open it, will corrupt said file, which is a pain.
I've not found out how to properly make netbeans handle this. Apparently the encoding can be changed, but for the entire project. So presumably changing it to us-acii would stop this problem, although non ascii characters wouldn't display properly.

How do you handle command line options and config files?

What packages do you use to handle command line options, settings and config files?
I'm looking for something that reads user-defined options from the command line and/or from config files.
The options (settings) should be dividable into different groups, so that I can pass different (subsets of) options to different objects in my code.
I know of boost::program_options, but I can't quite get used to the API. Are there light-weight alternatives?
(BTW, do you ever use a global options object in your code that can be read from anywhere? Or would you consider that evil?)

At Google, we use gflags. It doesn't do configuration files, but for flags, it's a lot less painful than using getopt.
#include <gflags/gflags.h>
DEFINE_string(server, "foo", "What server to connect to");
int main(int argc, char* argv[]) {
google::ParseCommandLineFlags(&argc, &argv, true);
if (!server.empty()) {
Connect(server);
}
}
You put the DEFINE_foo at the top of the file that needs to know the value of the flag. If other files also need to know the value, you use DECLARE_foo in them. There's also pretty good support for testing, so unit tests can set different flags independently.

For command lines and C++, I've been a fan of TCLAP: Templatized Command Line Argument Parser.
http://sourceforge.net/projects/tclap/

Well, you're not going to like my answer. I use boost::program_options. The interface takes some getting used to, but once you have it down, it's amazing. Just make sure to do boatloads of unit testing, because if you get the syntax wrong you will get runtime errors.
And, yes, I store them in a singleton object (read-only). I don't think it's evil in that case. It's one of the few cases I can think of where a singleton is acceptable.

If Boost is overkill for you, GNU Gengetopt is probably, too, but IMHO, it's a fun tool to mess around with.
And, I try to stay away from global options objects, I prefer to have each class read its own config. Besides the whole "Globals are evil" philosophy, it tends to end up becoming an ever-growing mess to have all of your configuration in one place, and also it's harder to tell what configuration variables are being used where. If you keep the configuration closer to where it's being used, it's more obvious what each one is for, and easier to keep clean.
(As to what I use, personally, for everything recently it's been a proprietary command line parsing library that somebody else at my company wrote, but that doesn't help you much, unfortunately)

I've been using TCLAP for a year or two now, but randomly I stumbled across ezOptionParser. ezOptionParser doesn't suffer from "it shouldn't have to be this complex"-syndrome the same way that other option parsers do.
I'm pretty impressed so far and I'll likely be using it going forward, specifically because it supports config files. TCLAP is a more sophisticated library, but the simplicity and extra features from ezOptionParser is very compelling.
Other perks from its website include (as of 0.2.0):
Pretty printing of parsed inputs for debugging.
Auto usage message creation in three layouts (aligned, interleaved or staggered).
Single header file implementation.
Dependent only on STL.
Arbitrary short and long option names (dash '-' or plus '+' prefixes not required).
Arbitrary argument list delimiters.
Multiple flag instances allowed.
Validation of required options, number of expected arguments per flag, datatype ranges, user defined ranges, membership in lists and case for string lists.
Validation criteria definable by strings or constants.
Multiple file import with comments.
Exports to file, either set options or all options including defaults when available.
Option parse index for order dependent contexts.

GNU getopt is pretty nice. If you want a C++ feel, consider getoptpp which is a wrapper around the native getopt.
As far as configuration file is concerned, you should try to make it as stupid as possible so that parsing is easy. If you are bit considerate, you might want to use yaac&lex but that would be really a big bucks for small apps.
I also would like to suggest that you should support both config files and command line options in your application. Config files are better for those options which are to be changed less frequently. Command-line options are good when you want to pass the immediate changing arguments (typically when you are creating a app, which would be called by some other program.)

If you are working with Visual Studio 2005 on x86 and x64 Windows there is some good Command Line Parsing utilities in the SimpleLibPlus library. I have used it and found it very useful.

Not sure about command line argument parsing. I have not needed very rich capabilities in that area and have generally rolled my own to save adding more dependencies to my software. Depending upon what your needs are you may or may not want to try this route. The C++ programs I have written are generally not invoked from the command line.
On the other hand, for a config file you really can't beat an XML based format. It's readable, extensible, structured, etc... :) Plus there are lots of XML parsers out there. Despite the fact it is a C library, I tend to use libxml2 from xmlsoft.org.

Try Apache Ant. Its primary usage is Java projects, but there isn't anything Java about it, and its usable for almost anything.
Usage is fairly simple and you've got a lot of community support too. It's really good at doing things the way you're asking.
As for global options in code, I think they're quite necessary and useful. Don't misuse them, though.

Reading/Understanding third-party code

When you get a third-party library (c, c++), open-source (LGPL say), that does not have good documentation, what is the best way to go about understanding it to be able to integrate into your application?
The library usually has some example programs and I end up walking through the code using gdb. Any other suggestions/best-practicies?
For an example, I just picked one from sourceforge.net, but it's just a broad engineering/programming question:
http://sourceforge.net/projects/aftp/

I frequently use a couple of tools to help me with this:
GNU Global. It generates cross-referencing databases and can produce hyperlinked HTML from source code. Clicking function calls will take you to their definitions, and you can see lists of all references to a function. Only works for C and perhaps C++.
Doxygen. It generates documentation from Javadoc-style comments. If you tell it to generate documentation for undocumented methods, it will give you nice summaries. It can also produce hyperlinked source code listings (and can link into the listings provided by htags).
These two tools, along with just reading code in Emacs and doing some searches with recursive grep, are how I do most of my source reverse-engineering.

One of the better ways to understand it is to attempt to document it yourself. By going and trying to document it yourself, it forces you to really dive in and test and test and test and make sure you know what each statement is doing at what times. Then you can really start to understand what the previous developer may have been thinking (or not thinking for that matter).

Great question. I think that this should be addressed thoroughly, so I'm going to try to make my answer as thorough as possible.
One thing that I do when approaching large projects that I've either inherited or contributing to is automatically generate their sources, UML diagrams, and anything that can ease the various amounts of A.D.D. encountered when learning a new project:)
I believe someone here already mentioned Doxygen, that's a great tool! You should look into it and write a small bash script that will automatically generate sources for the application you're developing in some tree structure you've setup.
One thing that I've haven't seen people mention is BOUML! It's fantastic and free! It automatically generates reverse UML diagrams from existing sources and it supports a variety of languages. I use this as a way to really capture the big picture of what's going on in terms of architecture and design before I start reading code.
If you've got the money to spare, look into Understand for %language-here%. It's absolutely great and has helped me in many ways when inheriting legacy code.
EDIT:
Try out ack (betterthangrep.com), it is a pretty convenient script for searching source trees:)

Familiarize yourself with the information available in the headers. The functions you call will be declared there. Then try to identify the valid arguments and pre-/post-conditions of the functions, as those are your primary guidance (even if they are not documented!). The example programs are your next bet.

If you have code completion/intellisense I like opening up the library and going '.' or 'namespace::' and seeing what comes up. I always find it helpful, you can navigate through the objects/namespaces and see what functionality they have. This is of course assuming its an OOP library with relatively good naming of functions/objects.

There really isn't a silver bullet other than just rolling up your sleeves and digging into the code.
This is where we earn our money.

Three things;
(1) try to run the test or example apps available, set low debug levels, and walk through logs.
(2) use source navigator tool / cscope ( available both on windows and linux) and browse the code to understand the flow.
(3) also in parallel use gdb to walk into code while running test/example apps.

Autocompletion in Vim

In a nutshell, I'm searching for a working autocompletion feature for the Vim editor. I've argued before that Vim completely replaces an IDE under Linux and while that's certainly true, it lacks one important feature: autocompletion.
I know about Ctrl+N, Exuberant Ctags integration, Taglist, cppcomplete and OmniCppComplete. Alas, none of these fits my description of “working autocompletion:”
Ctrl+N works nicely (only) if you've forgotton how to spell class, or while. Oh well.
Ctags gives you the rudiments but has a lot of drawbacks.
Taglist is just a Ctags wrapper and as such, inherits most of its drawbacks (although it works well for listing declarations).
cppcomplete simply doesn't work as promised, and I can't figure out what I did wrong, or if it's “working” correctly and the limitations are by design.
OmniCppComplete seems to have the same problems as cppcomplete, i.e. auto-completion doesn't work properly. Additionally, the tags file once again needs to be updated manually.
I'm aware of the fact that not even modern, full-blown IDEs offer good C++ code completion. That's why I've accepted Vim's lack in this area until now. But I think a fundamental level of code completion isn't too much to ask, and is in fact required for productive usage. So I'm searching for something that can accomplish at least the following things.
Syntax awareness. cppcomplete promises (but doesn't deliver for me), correct, scope-aware auto-completion of the following:
variableName.abc
variableName->abc
typeName::abc
And really, anything else is completely useless.
Configurability. I need to specify (easily) where the source files are, and hence where the script gets its auto-completion information from. In fact, I've got a Makefile in my directory which specifies the required include paths. Eclipse can interpret the information found therein, why not a Vim script as well?
Up-to-dateness. As soon as I change something in my file, I want the auto-completion to reflect this. I do not want to manually trigger ctags (or something comparable). Also, changes should be incremental, i.e. when I've changed just one file it's completely unacceptable for ctags to re-parse the whole directory tree (which may be huge).
Did I forget anything? Feel free to update.
I'm comfortable with quite a lot of configuration and/or tinkering but I don't want to program a solution from scratch, and I'm not good at debugging Vim scripts.
A final note, I'd really like something similar for Java and C# but I guess that's too much to hope for: ctags only parses code files and both Java and C# have huge, precompiled frameworks that would need to be indexed. Unfortunately, developing .NET without an IDE is even more of a PITA than C++.

Try YouCompleteMe. It uses Clang through the libclang interface, offering semantic C/C++/Objective-C completion. It's much like clang_complete, but substantially faster and with fuzzy-matching.
In addition to the above, YCM also provides semantic completion for C#, Python, Go, TypeScript etc. It also provides non-semantic, identifier-based completion for languages for which it doesn't have semantic support.

There’s also clang_complete which uses the clang compiler to provide code completion for C++ projects. There’s another question with troubleshooting hints for this plugin.
The plugin seems to work fairly well as long as the project compiles, but is prohibitively slow for large projects (since it attempts a full compilation to generate the tags list).

as per requested, here is the comment I gave earlier:
have a look at this:
Vim integration to MonoDevelop
for .net stuff at least..
OmniCompletion
this link should help you if you want to use monodevelop on a MacOSX
Good luck and happy coding.

I've just found the project Eclim linked in another question. This looks quite promising, at least for Java integration.

I'm a bit late to the party but autocomplpop might be helpful.

is what you are looking for something like intellisense?
insevim seems to address the issue.
link to screenshots here

Did someone mention code_complete?
http://www.vim.org/scripts/script.php?script_id=1764
But you did not like ctags, so this is probably not what you are looking for...

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js