Understanding C++ codebase with UML-tools - c++

I am trying to understand a C++ codebase. I have used some free tools that will scan the code and produce diagrams, but they are not so easy to understand.
What I think would be useful is to manually construct something assisted by the UML tool.
What i need is to create something that looks like the data structure at run-time. Ideally by pulling objects from the UML and arranging them. Also I would like to organise the classes in sub-packages - like those close to the DB, or towards the branches of the datastructures.
(I am partly doing this now with Folders in the Visual Studio Solution explorer)
This is a LINUX project with many Makesfiles, but many tools like Visual Studio, "understands" the code when I just create projects with the files in the main directory of the exe I am working on

Most tools will only give you a structural view (classes and packages), which honestly doesn't tell you that much of what goes on in runtime.
Enterprise Architect from Sparx Systems incorporates a Visual Execution Analyzer, which can generate sequence diagrams from a debug session. It supports C++ but only on Windows so you'd have to rebuild, but if I understand you correctly you've already got your code running in Visual Studio.
Here's a brief demo (in this case the code is in C#, but they do claim to support C++ as well). This isn't a full-roundtrip, write-the-code-in-UML kind of thing, but personally I think that's a pipe dream anyway. Use UML to document, use a programming language to code.

Honestly, you might have substantial problems getting anything useful out of a UML tool with regard to reverse generating code.
If the code is unusually clean and had a decent OO design, good containment, inheritance, and few associations theres a chance that it might look alright... but with most real projects, when you reverse generate into UML, the resulting diagrams are a spidery mess that will probably do more harm than good.
If you are going to stick with the reverse-model genreation though, try and configure the UML utility to show less items - keep it to the key relationships like aggregation and inheritance. When you start showing uses relationships rather than contains/aggregation relationships then everything tends to connect to everything unless the project was very well written, and that will just lead to more confusion and micsonceptions.
My best recommendation would be - if the tool makes it look incomprehensible, save yourself some time and do it yourself.

UML and C++ code that wasn't generated from UML don't play together that nicely. Especially as soon as templates are involved you might get into more trouble then you would actually like and you need to ask yourself if the time spent into getting those models isn't actually better invested in good old-fashioned source code reading. You will have to read the code eventually to understand what it does so just get started now.

Didn't understand if it's linux or windows (you say linux, but then say you're using visual studio).
If you want to understand the code, Source Insight is a good tool for that. Not necessarily UMLs, but produces nice graphs in real time.

Related

C++ Programming tools

My teacher recommended us to use notepad++ and cygwin for our programming needs. Are there any better solutions anyone can recommend out there to program and compile?
Myriad of various IDE's.... Eclipse CDT, Visual Studio Express, Code::Blocks, DevCPP....
And yes, Notepad++ and Cygwin with gcc would be a very viable option if you only need to compile single files for your homework.
Use a IDE
An integrated development environment (IDE) (also known as integrated design environment, integrated debugging environment or interactive development environment) is a software application that provides comprehensive facilities to computer programmers for software development. An IDE normally consists of:
a source code editor
a compiler and/or an interpreter
build automation tools
a debugger
A few of them to choose from
http://netbeans.org/index.html
http://www.codeblocks.org/
http://www.eclipse.org/cdt/
In my opinion, a very important tool for beginners is a debugger. A lot of question can be answered by yourself if you have a look into the debugger. You can use the gdb but it is hard to use and understand for beginners. So I would recommend to use Visual C++ 2010 Express which has an excellent and easy to use debugger.
Disclaimer:
The following are personal opinions, related to my personal taste on
the subject. Anyone in the programmer community has its own taste and
preferences an can agree or not. Here I just want to tell you about
some rationals. Consider products and related names as "examples."
My Answer
There are mainly three ways to write code:
The manual one
The assisted one
The automated one.
Think to them as:
Driving your car alone
Driving with a navigator
Driving with an autopilot.
Here "driving alone" means "use a generic text editor, a command-line based compiler and a command-line based debugger. The editor may eventually have a clue about the language syntax (thus differentiating different language structural elements, like keyword, literals, operators etc.) but knowing really nothing about what you are coding.
This is what notepad++ does. It makes coding harder, but for very simple things makes you really learn how to "drive".
A "navigator" is a basic IDE like Devc++, or like CodeBlocks: they have the notion of "project", manage the relation between files and manage the invocation of the compiler and debugger, managing the mapping on their output respect to your sources.
You write your own code, but the "road to compile" is told by the "navigator" you have to trust.
An "autopilot" is a more complex IDE (like VisualStudio, Netbeans, Eclipse ...) that can also "manage the code" providing code analysis for either syntax and semantics, context sensitive auto-completion, code generation for common tasks.
They can give you some code you have to complete and connect together.
They make you faster in producing code, testing it, debugging it, but you must have more trust in them or know how they "suggest".
They can be productive, but you have to "configure" them to suite your needs.
Now: since everything is a matter of "trust", and you cannot trust what you don't know yet, and is a matter of "knowing yuur needs" (but a learner may not yet have an idea abut them)
starting with "beasts" like VisualStudio (that mess arout 50% of your computer registry, pretend you to download GIGABYTES from the Internet and installs GIGABYTES of whatever MS library) is clueless: before you will start using all of that, will take years, and VS itself will be changed 2 or three times) or Eclipse (that has the more powerful syntax and semantic analizer, but requires lot of "arcane configuration" you don't even know since you didn't make the first step in programming) may be an excess. At least until your programs will stay in a couple of pages.
starting win notepad++ and GCC (or Mingw) is just a matter of dowload few megabytes, set a PATH, and you go. Fastest way to turn the key on.
when things become more complex, and require some help in organize them, simple IDE like CodeBloks or Codelite are more than effective at "to the point". I will avoid Dev-C++: it's OLD, and doesn't support the "state of art of the C++ language". You an live with them for all your scolarity
when going to more professional kind of projects, and your experience in "using tools" is better, things like Eclipse, or NetBeans may become more "effective". I will in any case avoid VisualStudio: it's not that "effective". But it is the best to develop in Microsoft environments producing MS oriented applications, especially in the ".Net" world. Something you will not see before 2/3 years of experience.
If you're learning you can download VStudio Express. I believe it's free. Easier to use than notepad and cygwin. This isn't a biased opinion. I'm a Linux C++ developer most days but acknowledge the fact that it might be easier to learn using VStudio.
If you are using linux, you can use kate and g++ for editing and compiling c++ files.
If you are using windows, I think your teacher's recommendations are good. Althought there are various IDE's for C++, it is better to use a simple editor that doesn't have code completion and compilation feature while learning a programming language for the first time. IDE's are nice but not good for learners I think.
It's probably a good idea to go with your teacher's suggestion, since you might also need some help in the future, either from him or your colleagues. Another advantage is that, being in school, you'll probably develop using more than one programming language. Notepad++ has support for almost everything you can think of, so you can use it not only for this course. That way you'll have an advantage because you'll learn shortcuts, etc...
If you plan on doing a lot of programming in the future, I highly recommend putting the effort into learning VIM. Nothing else can touch it in terms of speed and power. It has built-in shell access and it is programmable. It is like having God in your text editor. The major down-side is the steep learning curve.
Also, you want to use Git in-case you screw-up and want to go back to a previous point. It lets you periodically check-point your code so you can always go back. For example, maybe you delete something, then later on decide you want to use that code after all. If you've been check-pointing with Git, you can get it back.
Graphical differs sometimes come in handy too.
I started coding in C++ using Turbo C++(the default program available on college computers, I told them it was prehistoric), but then I found Visual Studio Express and never looked back since that day.
Also since i could not install Visual Studio on College computers, I put a portable version of DevC++ on my pen drive to use there.
Eventually I got the College to install Visual Studio Express editions on all Lab Computers (Once I managed to convince them that it was free with no Licensing issues)
For a beginner, go with a text editor and a compiler. Helps you in understanding what actually goes on.
You could use Dev-C++, which is a good compiler for C and C++, if you want lightweight.
Otherwise Visual Studio probably.

Doxygen integration with (unmanaged) Visual C++ 2005

We are slowly moving towards better-standardized commenting in a large C++ project, introducing Doxygen. I personally find it a pain typing in comments, especially since Java IDEs are so good at automating this.
So I wondered what tools there might be? A search turned up DoxyComment which looks quite nice, is this the best/standard tool or are there others worth a look too?
Atomineer is a tool that I and a few others have been using for documenting unmanaged C++ code with Doxygen markup. It's not free, but it is cheap, and may be worth a try:
http://www.atomineerutils.com/products.php
If typing the meta-comments which are instructions to doxygen is a significant part of your comment-writing effort, you're doing it wrong.
Comment should not include things which can be automatically determined by a tool, any programmer will determine just as much (or more) information from e.g. parameter names than any tool.
Another way to look at this is that doxygen already does an excellent job of presenting what can be automatically determined. You don't need to write: "B::B constructs a B object", since doxygen is going to sort it into the constructors section of the documentation automatically.
Focus on what's non-obvious, and take time to think about what you're writing.
Normally many functions and variables will have no need for an individual comment, since either the name is descriptive enough, or they are better explained in a class-level comment describing how multiple members interact.

C++ Professional Code Analysis Tools

I would like to ask about the available (free or not) Static and Dynamic code analysis tools that can be used to C++ applications ESPECIALLY COM and ActiveX.
I am currently using Visual Studio's /analyze compiler option, which is good and all but I still feel there is lots of analysis to be done.
I'm talking about a C++ application where memory management and code security is of utmost importance.
I'm trying to check for problems relating to security such as memory management, input validation, buffer overflows, exception handling... etc
I'm not interested in, say, inheritance depth or lines of executable code.
Without a doubt you want to use Axman. This is by far the best ActiveX/Com security testing tool available, and its open source. This was one of the leading tools used in the Month Of Browser Bugs by H.D. Moore, who is also the creator of Metasploit. I I have personally used Axman to find vulnerabilities and write exploit code.
Axman uses TypeLib to identify all of the components that makeup a COM . This is a type relfection, and it means that Source code is not required. Axman uses reflection to automatically generate fuzz test cases against a COM.
There is a security tool category called the fuzzers which were used in the recent Pwn2Own 2010 contest in Vancouver. The winning guy said that he's not going to tell software makers which bug he found but instead how to create a good fuzzer that will allow them to find the bugs. This was covered by computerworld.
Basically, it finds every place that the software can take input and tries to inject random data until the application crashes. Starting from there, the user attempts to understand what went wrong and develops an effective attack.
I don't know any particular fuzzers but there are many kinds of them for various uses (buffer overflows vs sql injections, 2 very different problems, 2 different fuzzers)
We use Coverity Prevent which is a very sophisticated static analysis tool that stores defects in a database that has a web interface. It works for C, C++, and Java.
We also use open source tools like Valgrind.
Start here are work your way you http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
I don't mean to be rude when I say "google it". Personally (ymmv) I learn much more along the way from googling than just having someone give me "the" answer.
Also, when I look for tools, I will go to to SourceForge and look for - in this case - "static code analysis"
Btw, check out Valgrind

Are there any free tools to help with automatic code generation?

A few semesters back I had a class where we wrote a very rudimentary scheme parser and eventually an interpreter. After the class, I converted my parser into a C++ parser that did a reasonably good job of parsing C++ as long as I didn't do anything fancy with the preprocessor or macros. I could use it to read over my classes and functions and do neat things like automatically generate class readers or writers or set up function callbacks from a text file.
However, my program is pretty limited. I'm sure I could spend some time to make it more robust and do more neat things, but I don't want to spend the time and effort if there are already more robust tools available that do the same thing. I figure there has to be something like this out there since parsers are an essential part of compilers, but I haven't seen tools specifically for automatic code generation that make it easy to go through and play with data structures that represent classes, functions and variables for C++ specifically. Are there tools that do this?
Edit:
Hopefully this will clarify a little bit of what I'm looking for. The program I have runs as a prebuild step in visual studio. It reads over my source files, makes a list of classes, their members, their functions, etc. which is then used to generate new code. Currently I just use it to make it easy to read and write my data structures to a plain text file, but I could do other things as well. The file readers and writers are output into plain .cpp and .h files which I include in the rest of my project just as I would any other file. What I'm looking for are tools that do similar things so I can decide if I should continue to use my own or switch to a some better solution. I'm not looking for anything that generates machine code or edits code that I've written.
A complete parser-building tool like ANTLR or YACC is necessary if you want to parse C++ from scratch, but it's overkill for your purposes.
It reads over my source files, makes a list of classes, their members, their functions, etc. which is then used to generate new code.
Two main options:
GCC-XML can generate a list of classes, members, and functions. The distribution version on their web site is quite old; try the CVS version instead. I don't know about the availability of a Windows port.
Doxygen is designed for producing documentation, but it can also produce an XML output, which you should be able to use to do what you want.
Currently I just use it to make it easy to read and write my data structures to a plain text file...
This is known as serialization. Try Boost.Serialization or maybe libs11n or Google Protocol Buffers. Stack Overflow has further discussion.
...but I could do other things as well.
Other cool applications of this kind of automatic code generation include reflection (inspecting your objects' members at runtime, using duck typing with C++, etc.) and generating wrappers for calling C++ from scripting languages. For a C++ reflection library, see Reflex. For an example of generating wrappers for scripting languages, see Boost.Python or SWIG.
The C++ FAQ Lite has references to YACC grammars for C++. YACC is an old-school parser that was used to generate parser output, clumsy and difficult to learn but very powerful. Nowadays, you'd use Gnu Bison instead of YACC.
Don't forget about Cog. It requires you to know Python. In essence it embeds the output of Python scripts into your code. It's absurdly easy to use, but it takes a totally different approach from things like ANTLR and its purpose is somewhat different.
Maybe Boost::Serialize or ANTLR?
I answered a similar question (re splitting source files into separate header and cpp files) by suggesting the use of lzz.
lzz has a very powerful C++ parser that builds a representation for everything except the bodies of functions. As long as you don't need the contents of the function bodies you you could modify 'lzz' so that it performs the generation step you want.
If you want tools that can parse production C++ code, and carry out arbitrary analyses and transformations, see our DMS Software Reengineering Toolkit and its C++ front end.
It would be straightforward to use the information DMS can provide about C++ code, its structures, types, instances, to generate such access functions. If you wanted to generate access functions in another language, DMS provides means to code transformations from the input language (in this case, C++) to that target language.
Mozilla developed Pork for this kind of thing. I can't say it's easy to use (or even to build), but it is in production.
I've already used professionally the Nvelocity engine combined with C# as a prevoius step to coding, with very good results.

Why is Visual C++ lacking refactor functionality?

When programming in C++ in Visual Studio 2008, why is there no functionality like that seen in the refactor menu when using C#?
I use Rename constantly and you really miss it when it's not there. I'm sure you can get plugins that offer this, but why isn't it integrated in to the IDE when using C++? Is this due to some gotcha in the way that C++ must be parsed?
The syntax and semantics of C++ make it incredibly difficult to correctly implement refactoring functionality. It's possible to implement something relatively simple to cover 90% of the cases, but in the remaining 10% of cases that simple solution will horribly break your code by changing things you never wanted to change.
Read http://yosefk.com/c++fqa/defective.html#defect-8 for a brief discussion of the difficulties that any refactoring code in C++ has to deal with.
Microsoft has evidently decided to punt on this particular feature for C++, leaving it up to third-party developers to do what they can.
I'm not sure why it is like this, but third-party tools exist that help. For example, right now I'm evaluating Visual Assist X (by Whole Tomato). We're also using Visual Studio 2005.
devexpress provides Add-in Refactor! for C++ for VS2005 and VS2008.
Don't feel hard-done-by, it isn't available in VB.Net either :)
C++ is a HARD language to parse compared with C# (VB too unless you've "Option Explicit" and "Option Strict" switched on, it's difficult to tell exactly what any line of code is doing out of a MUCH larger context).
At a guess it could have something to do with the "difficulty" of providing it.
P.S. I marked my answer as community wiki because I know it's not providing any useful information.
Eclipse does few c++ refactorings including 'rename'. Check out this question here on StackOverflow.
It is also possible to use Microsoft compiler with Eclipse. Check out here.
Try Eclipse and see if it fits for you.
There is a lot of fud and confusion around this issue. This amazing youtube video should clear up why C++ refactoring is hard: https://www.youtube.com/watch?v=mVbDzTM21BQ
tl;dr Google refactors their entire 100 million line C++ codebase by using a compiler (Clang + LLVM) that allows access to its intermediate format.
Bottom line, third parties are screwed here, there is no realistic way for them to refactor VS C++ unless MS outputs intermediate results the same way. If you think of it from the programming problem perspective this is obvious: in order to refactor VS C++ you have to be able to compile C++ the exact same way VS does with the same bugs, limitations, flaws, hacks, shortcuts, workarounds, etc. The usual suspects like Coderush and Resharper do not have the budget for that kind of insanity although apparently they are trying but it has been years...
http://www.jetbrains.com/resharper-cpp/
Update 2016: Resharper now does a decent job at C++ refactor. Limitations are purely for large / gigantic projects.
MS has finally done this: https://channel9.msdn.com/Shows/C9-GoingNative/GoingNative-33-C-Refactoring-in-Visual-Studio-2015#time=04m37s
They have started doing this about 10 years ago, I remember watching ms channel9 long ago.
I've been using Visual Assist X with visual studio for about one year and a half. It's an incredible tool that helps you a lot with ordinary C++ code, but it doesn't perform very well on templated code. For instance, you if have a sophisticated policy-based template design, it won't know how to rename your variables, and the project won't compile anymore.
Install plugin which enables you that functionality: https://visualstudiogallery.msdn.microsoft.com/164904b2-3b47-417f-9b6b-fdd35757d194
I'd like to point out that Qt Creator (a C++ IDE which is compatible with VC++ libraries and build system) provides symbol renaming that works very well:
You can rename symbols in all files in a project. When you rename a class, you can also change filenames that match the class name.
Qt Creator - Refactoring: Renaming Symbols
Qt Creator's rename functionality gives you a list of the symbol references it found and an opportunity to exclude any of them before performing the replace. So if it gets a symbol reference wrong, you can exclude it.
So C++ symbol renaming is possible. Coming to VS from Qt Creator I feel your pain, to the point where I've considered converting preexisting VS projects of considerable size to use Qt Creator instead.
I don't buy the argument that this is specifically hard in C++. In addition to the fact that it already works very well in Qt Creator, there's the fact that the compiler and linker can find and match symbols: If that wasn't possible you couldn't build your application.
In fact, languages like Python that are dynamically typed also have renaming tools. If you can create such a tool for a language where there are no explicit references to variable type you can definitely do it for C++.
Case in point:
... Rope, a python refactoring library... I tried it for a few renames, and that definitely worked as expected.
Stack Overflow - What refactoring tools do you use for Python?
Well in spite of comments by all you experts I totally disagree that refactoring support issue has something to do with C++ language semantics or any language semantics for that matter. Except the compiler builder themselves don't choose to implement one in first case due to their own reasons or constraints whatsoever they maybe.
And offense not to be taken but I am sorry to say Mr jsb the above link you provided to support your case (i.e of yosefk) about C++ defect is totally out of question. Its more like you providing direction to "Los angeles" when someone asked for of "San Franisco".
In my opinion raising refactoring difficulty issue for certain language is more like raising a finger on language integrity itself. Especially for languages which is sometimes just pain.... when it comes to their variable declaration and use. :) Okay! tell me how come you loose track of some node within a node tree ... eh? So what it is do with any language be it as simple as machine level code. You know you VS compiler can easily detect if some variable or routine is dead code. Got my point?
About developing third party tool. I think compiler vendors can implement it far more easily and effectively if they ever wanted to then a third party tool which will have to duplicate all the parsing database to handle it. Nowadays compiler can optimize code very efficiently at machine code level and I am hearing here that its difficult to tell how some variable is used previously. You haven't paid any real attention to inner working of compiler I suppose. What database it keep within.
And sure its the almost same database that IDE use for all such similar purposes. In previous time compiler were just a separate entity and IDE just a Text Editor with some specialization but as times goes by the gap between compiler and IDE Editor become less and its directly started working on similar parsed database. Which makes it possible to handle all those intellisense and refactoring or other syntax related issues more effectively. With all precompile things and JIT compiling this gap is almost negligent. So it almost make sense to use same database for both purpose or else your memory demand go higher due to duplication.
You all are programmers - I am not! And you guys seems to be having difficulty visualizing how refactoring can be implemented for C++ or any language that I can't comprehend. Its just all about for something you have to put more effort for some less depending on how heavy is a person you trying push.
Anyway way VS a nice IDE especially when it comes to C#.