I'm working with legacy C code which I need to document in UML. There's no immediate requirement to use those UML diagrams for synthesis, but there is a desire to move in that direction in the future.
Now, the code is riddled with features which can be enabled or disabled at compile time:
#if(FEATURE_X == ON)
deal_with_x();
#endif
Since there's no way to distinguish between compile-time and run-time conditions in UML (is there?), I end up using the same decision block for both, which means my diagrams really represent the following code:
if(FEATURE_X == ON) {
deal_with_x();
}
While I expect the compiler to eliminate the call when feature X is disabled, this is not quite the same code for at least two reasons:
deal_with_x() has to be defined even if feature X is disabled
static code analysis will complain about dead code
What is the right way to deal with this situation? Is there a UML feature I'm not aware of that could help? Should I create separate activity diagrams for different configurations (quite a lot of work)? Or should I rely on the compiler to eliminate unnecessary calls and avoid preprocessor directives altogether?
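For reference, that last option would look something like this; a minimal sketch, assuming ON and OFF are plain integer macros (your build system may define them differently):
#define OFF 0
#define ON  1
#define FEATURE_X ON    /* set per configuration */
void deal_with_x(void); /* must now exist even when the feature is off */
void process(void) {
    if (FEATURE_X == ON) {   /* ordinary run-time 'if' on a compile-time constant */
        deal_with_x();       /* an optimizing compiler drops this call when FEATURE_X is OFF */
    }
}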
While my question is about C code and preprocessor directives, I can see the same problem arising with C++ templates, especially if static if gets introduced into the language.
Simply go for tagged values to describe that.
This has nothing to do with any activity diagram. This is a purely static deployment concern, so you may create components which use different tagged values for different compile-time configurations.
Compile-time conditions tell you how to generate your code, so they belong in the deployment part of your model. Activity diagrams refer to the behavior of your system. If you have a target component that is compiled one way or the other and you show its different usages in activity diagrams, you can signal this in various ways. One is a naming convention described elsewhere in the model or in accompanying documentation. Another is to create requirements stating that a common source shall be used, and to link those requirements to the activities and, later, the components.
As a (personal) side note: use (especially over-use) of compile-time options makes your code hard to read, up to unreadable. Indeed, each use of a compile-time option turns the same source into a completely different thing. So rather than starting with "I want to have the same source for this function and therefore tend to use compiler flags", go the other way: concentrate on the function, and when it comes to deployment, only then think about "optimizing" towards compiler flags. In other words, leave such thoughts out of activity diagrams entirely.
Related
Imagine a project with development stretched over a 10+ year timespan. Some parts are in C, some are in C++, and all of the code uses global functions and global variables. The architecture was designed inherently single-threaded and kept growing that way. But now we are considering utilizing many-core architectures.
Now one idea being evaluated is to refactor a part of the code into a library, to make it possible to create more than one instance, so that they can run in separate threads and don’t interfere with each other.
The proposal that gains the most traction at this point is to wrap all the library files into namespaces with macro defines, like:
namespace VARIANT {
// all the code
}
Then define VARIANT in a header or at the project level. This will make it possible to have different contexts within different namespaces. And the selling point is that this approach requires minimal code change and carries a low risk of introducing any regression.
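Concretely, the proposal amounts to something like this (the file and variant names are hypothetical):
/* variant1_config.h -- or pass -DVARIANT=Variant1 on the compiler command line */
#define VARIANT Variant1
/* in each library file, wrapped as proposed: */
namespace VARIANT {
    /* all the code */
}
/* after preprocessing this becomes: namespace Variant1 { ... } */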
But if at some point we need to make the behavior of Variant1 different from Variant2, things will get tricky, since there is no way to compare the value of a macro definition with a string in a preprocessor conditional.
Is there a more elegant way to achieve this?
Another approach might be spotting all global variables and making them thread_local. This requires either C++11 or compiler extensions providing the same thing (__thread with older GCC).
If I read this question right, you don't even need to convert your C files into C++ files (which your approach requires, as C does not support namespaces...), but you do need C11 for it (_Thread_local).
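A minimal sketch of that change (the variable name is hypothetical):
/* before: a file-scope global shared by everything */
/* int request_count; */
thread_local int request_count;          /* C++11: each thread gets its own copy */
/* __thread int request_count; */        /* older GCC extension, same effect */
/* _Thread_local int request_count; */   /* C11 equivalent for the C files */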
Refactoring an old project to make it multithreaded is not so simple. First of all, you have a mixture of C and C++ code, and you cannot blindly follow a C++ approach here. Instead of namespaces, you need to think about the areas below:
Find all the code blocks and containers (lists, large arrays of objects, etc.) which need synchronization.
Find the interdependencies of threads and how you will control them. For example, one thread may generate a report and insert it into a table, while a second one needs that information to generate its final report; you need to find these kinds of dependencies among the threads in your code base and work out their control mechanisms.
Old-style multithreading in C++ was very tricky, so you should migrate your code to C++11, where the implementation of multithreading is much easier.
As you said, your current project has lots of global variables; you need to think carefully about how you are going to share these variables among different threads and how you will synchronize access to them (a minimal sketch follows at the end of this answer).
These are just some hints; you need to consider many areas in advance before starting the refactoring, or else all your efforts may end in smoke.
GOOD LUCK with your plan.
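To illustrate the point about synchronizing access to globals, a minimal sketch using C++11 primitives (the names are hypothetical):
#include <mutex>
int shared_total = 0;     // legacy global, now accessed from several threads
std::mutex total_mutex;   // guards every access to shared_total
void add_to_total(int value) {
    std::lock_guard<std::mutex> lock(total_mutex);  // released automatically on scope exit
    shared_total += value;
}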
Just do it in steps, testing each time:
1) typedef a struct with all the globals in it, malloc one, and edit the existing code to reference it (a rough sketch follows after these steps). Test - should work exactly the same as with the globals.
2) Create one thread to run one instance of the code. Test - should work exactly the same as with the globals.
3) Try multiple threads.
One step at a time...
Please try very hard to not attempt any bodges!
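A rough sketch of step 1, with hypothetical globals folded into a context struct:
#include <stdlib.h>
typedef struct Context {
    int  counter;       /* formerly a file-scope global */
    char buffer[256];   /* formerly a file-scope global */
} Context;
Context *ctx_create(void) {
    /* calloc zeroes the struct, matching the zero-initialization the statics had */
    return (Context *)calloc(1, sizeof(Context));
}
/* existing code is edited to reference the struct:
   old: counter++;   new: ctx->counter++;  */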
This is a potentially dangerous question because interdisciplinary questions and answers will be biased, but I'll have a stab at it anyway. All in good spirit!
So, here we go. I'm writing a major editing mode for Emacs for a language that it has almost no support for yet, and I'm at the point where I have to decide on a way to generate project files. Below is an outline of the task ahead:
The templates have to represent a project directory tree, not only single files.
The resulting files are of various formats, potentially including SGML-like languages, but not limited to that variety. They also have to generate C-like source code, eLisp source code, and plain-text files such as a README.
The templates must be processed in a batch upon a user-initiated action (as in: the user wants to create a project, and several files must be created in the user-appointed directory). It may be beneficial to be able to supervise the creation, but this is less important than the ability to run the process entirely automatically.
Bonus features:
The template language already has a user base (with the potential for reuse of existing templates).
The templates can be used for code snippets (they contain blanks which are filled in interactively once the user invokes a code-generating routine while editing a file).
Obvious things like cross-platform-ness, ease of use both through graphical interface and command line.
I did some research, but I won't share my results (yet) so as not to bias the answers. The problem with answering this question is not that the answer is hard to find, but that it is hard to choose one from many.
I'm developing a system based on Mustache for exactly the use case that you've described. The template language itself is a very simple extension of Mustache called Groome.
I also released a command-line tool called Molt that renders Groome templates. I'd be curious to know if it does everything that you need. I'm still adding features to the tool and haven't yet announced it. Thanks.
I set out to solve a similar problem several years back, when I wanted to use Emacs to generate code out of a UML diagram (cogre) and also to generate Makefiles from project specifications. I first tried to use Tempo, but when I tried to get the templates to nest, I ran into problems. I also looked into skeleton, but that didn't quite fit the plan either.
I ended up using Google Templates for a little bit, liked the syntax, and developed SRecode instead, borrowing the good bits from Google templates. SRecode was written specifically for machine-generated code. The interaction for template insertion (aka what Tempo was written for) isn't first class in SRecode. For generating code from a data structure, however, it is very robust, has a lot of features, and automatically fills in variables. It works closely with your major mode and allows many nested templates, with control over the nested dictionary values. There is a subsystem that will take Semantic tags and generate code from them for a couple of languages. That means you can parse code in one language with Semantic and generate code in another language with SRecode using those tags. Nifty! Many parts of the CEDET reference manuals were built that way.
The templates themselves allow looping, if statements, and include statements. There are a couple of examples in SRecode of making an 'application', such as the comment writer; and EDE uses SRecode to create Makefiles, which is almost exactly what you are trying to do.
Another option is Generator, which offers “language-agnostic project bootstrapping with an emphasis on simplicity”. Installation requires Node.js and npm.
Generator’s emphasis on simplicity means it is very easy to learn how to make a template. Generator also saves you from having to reference templates by file paths – it looks for templates in ~/.generator.
However, there is no way to write README or LICENSE files for the template itself without those files being copied to the generated project. Also, post-generation commands written in the Makefile will be copied to the generated Makefile, even after they are no longer of use. Finally, the ad-hoc templating language doesn’t provide a way to escape its __lowercasevariables__ – though I can’t think of a language where that limitation would be a problem.
Rationale: In my day-to-day C++ development, I frequently need to answer basic questions such as who calls what in a very large C++ code base that is frequently changing. But I also need some automated way to identify exactly what the code is doing around a particular area. "grep" tools such as Cscope are useful (and I use them heavily already), but are not C++-language-aware: they give no way to identify the types and kinds of lexical environment of a given use of a type or function in a way that is conducive to automation (even if said automation is limited to "read-only" operations such as code browsing and navigation, but I'm asking for much more than that below).
Question: Does there already exist an open-source C/C++-based library (native, not managed, not Microsoft- or Linux-specific) that can statically scan or analyze a large tree of C++ code, and can produce result sets that answer detailed questions such as:
What functions are called by some supplied function?
What functions make use of this supplied type?
Ditto the above questions if C++ classes or class templates are involved.
The result set should provide some sort of "handle". I should be able to feed that handle back to the library to perform the following types of introspection:
What is the byte offset into the file where the reference was made?
What is the reference into the abstract syntax tree (AST) of that reference, so that I can inspect surrounding code constructs? Each AST entity would also have file path, byte-offset, and type-info data associated with it, so that I could recursively walk up the graph of callers or referrers to do useful operations.
The answer should meet the following requirements:
API: The API exposed must be one of the following:
C or C++, probably "C handle"- or C++-class-instance-based (and if it is, it must be generic C or C++ code and not Microsoft- or Linux-specific code constructs, unless that is needed to meet the specifics of the given platform), or
command-line standard input and standard output based.
C++ aware: Is not limited to C code, but understands C++ language constructs in minute detail, including awareness of inter-class inheritance relationships and C++ templates.
Fast: Should scan large code bases significantly faster than compiling the entire code base from scratch. This probably needs to be relaxed, but only if the "Incremental result retrieval" and "Resilient to small code changes" requirements below are fully met.
Provide Result counts: I should be able to ask "How many results would you provide for some request (and no, don't send me all of the results)?" and get an answer in less than about 3 seconds, rather than having to retrieve all results for any given question. If it takes too long to get that answer, it wastes development time. This is coupled with the next requirement.
Incremental result retrieval: I should then be able to ask "Give me just the next N results of this request" and get back a handle to the result set, so that I can ask the question repeatedly and incrementally pull out the results in stages. This means I should not have to wait for the entire result set before seeing some subset of it, and that I can cancel the operation safely once I have seen enough results. Reason: I need to answer the question "What is the build or development impact of changing some particular function signature?"
Resilient to small code changes: If I change a header or source file, I should not have to wait for the entire code base to be rescanned; only that header or source file should be rescanned, and the rescanning should be quick. E.g., don't do what cscope requires, which is to rescan the entire code base for small changes. It is understood that if you change a header, scanning can take longer, since other files that include that header would have to be rescanned.
IDE Agnostic: Is text-editor agnostic (don't make me use a specific text editor; I've made my choice already, thank you!).
Platform Agnostic: Is platform-agnostic (don't make me use it only on Linux or only on Windows; I have to use both of those platforms in my daily grind, and I have code sandboxes on both, so the tool needs to be useful on both).
Non-binary: Should not cost me anything other than the time to download and compile the library and all of its dependencies.
Not trial-ware.
Actively Supported: Help requests sent to mailing lists or associated forums are likely to get a response in less than 2 days.
Network agnostic: Databases the library builds should be usable directly over a network from 32-bit and 64-bit systems, both Linux and Windows, interchangeably and at the same time, and should not embed hardcoded filesystem paths that would otherwise "root" the database to a particular network location.
Build environment agnostic: Does not require intimate knowledge of my build environment, with the notable exception of possibly requiring knowledge of compiler-supplied CPP macro definitions (e.g. -Dmacro=value).
I would say that CLang Index is a close fit. However, I don't think it stores data in a database.
Anyway, the CLang framework offers what you actually need to build a tool tailored to your needs, if only because of its C, C++, and Objective-C parsing / indexing capabilities. And since it's provided as a set of reusable libraries... it was crafted to be built upon!
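To give a flavor of building on those libraries, here is a minimal libclang sketch that prints every cursor name in a translation unit (treat the build details as assumptions about your setup; link with -lclang):
#include <clang-c/Index.h>
#include <stdio.h>
static enum CXChildVisitResult visit(CXCursor c, CXCursor parent, CXClientData data) {
    CXString name = clang_getCursorSpelling(c);
    printf("%s\n", clang_getCString(name));   /* every declaration/reference in the TU */
    clang_disposeString(name);
    return CXChildVisit_Recurse;              /* walk the whole AST */
}
int main(int argc, char **argv) {
    CXIndex index = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(
        index, argv[1], NULL, 0, NULL, 0, CXTranslationUnit_None);
    if (tu) {
        clang_visitChildren(clang_getTranslationUnitCursor(tu), visit, NULL);
        clang_disposeTranslationUnit(tu);
    }
    clang_disposeIndex(index);
    return 0;
}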
I have to admit that I haven't used either, because I work with a lot of Microsoft-specific code that uses Microsoft compiler extensions that I don't expect them to understand, but the two open-source analyzers I'm aware of are Mozilla Pork and the Clang Analyzer.
If you are looking for the results of code analysis (metrics, graphs, ...), why not use a tool (instead of an API) to do that? If you can, I suggest you take a look at Understand.
It's not free (there's a trial version) but I found it very useful.
Maybe Doxygen with GraphViz could answer some of your constraints, but not all; for example, Doxygen's analysis is not incremental.
Lately as part of my day job I've been learning IBM Rhapsody and using it to generate code in C++ from the UML.
Yesterday it struck me that it might be cool to think about adding state machine support to my C++ compiler, so I jotted a few notes here: http://ellcc.org/wiki/index.php/State_machines_and_Active_Classes
My motivations for doing this are:
It seems like a cool idea.
The compiler could do much better semantic checking (with better error checking) than the current Rhapsody/normal C++ compiler.
There are many optimization possibilities available when the compiler itself understands the state machine structure.
I may try to extend my grammar to accept something like the proposal to see how well it works.
What is your opinion of the proposal? Does it seem readable? Does it seem worthwhile?
Edit:
Thanks for the answers recommending specific libraries to do state machines, but that wasn't my question. I've implemented many state machines using both libraries and code that I've written.
I was really looking for ideas, criticism, etc. about the design of a state machine extension to a C++-like language, not whether this change would be appropriate for addition to standard C++. Think of it as a domain-specific extension, where my domain is real-time control applications.
I've started implementation of the extension in my compiler as described here: http://ellcc.org/wiki/index.php/State%5Fmachines%5Fand%5FActive%5FClasses
So far the concept hasn't had to change much going from proposal to implementation but there have been a few changes in details and I'm refining my understanding of the semantics of the problem.
Time will tell whether the whole concept has any value, however. ;-)
With a few exceptions, C++ has traditionally been extended using class libraries, not new keywords. State machines can easily be implemented using such libraries, so I don't think your proposal has much of a chance.
One problem I see in your proposal is the use of 'goto' to go to another state. What happens if I want to use goto in my own code within a state transition?
Excellent work developing what you've done. Something like what you've done probably is possible, but I'm doubtful it would ever get into C++. Most changes that make it into the language itself are included only to allow people to write more useful and powerful libraries.
There's a library here that provides support for state machines. I haven't tried it, but it might interest you, and you may be able to combine your ideas with a library like this to allow other people to make use of it.
http://www.boost.org/doc/libs/1_34_1/libs/statechart/doc/index.html
Or, you could develop your own extension as you suggest, and it would at least be useful for you. Microsoft implements some extension keywords, so there is no reason why you couldn't create your own extended version of C++.
Keep the new ideas coming.
You should take a look at how another smart developer added State machine support to a C-like language: UnrealScript Language Reference. See the section named "States".
UnrealScript supports states at the language level. In UnrealScript, each actor in the world is always in one and only one state. Its state reflects the action it wants to perform. For example, moving brushes have several states like "StandOpenTimed" and "BumpOpenTimed". Pawns have several states such as "Dying", "Attacking", and "Wandering". In UnrealScript, you can write functions and code which exist in a particular state. These functions are only called when the actor is in that state.
It's an interesting idea, but I think you'd actually have better luck creating your own domain-specific language for state machines than officially extending C++. C++ is designed as a very general-purpose programming language. I think Boost has proven that C++ is flexible enough that most features can be implemented nicely using libraries. It also evolves very slowly, to the extent that standard C++ still doesn't even have built-in threading support as of 2009, (it is planned in 0x). So it's unlikely the committee would consider this addition for some time.
Your solution does not look like it has any advantages over a template- or preprocessor-macro-based solution.
I am also not sure how you can provide better semantic checking, and I doubt that you can apply many useful code optimizations.
However, to allow better optimizations and semantic checking, you should also replace "goto" with a new keyword (e.g. __change__ newState), and disallow goto for state changes! Allow goto for local jumps as usual.
Then the compiler can extract a list of possible transitions.
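For illustration, a minimal plain-C++ sketch of such an explicit transition list (all names hypothetical); this is exactly the kind of table a compiler or checker could extract:
enum class State { Idle, Running, Done };
enum class Event { Start, Finish };
struct Transition { State from; Event ev; State to; };
constexpr Transition table[] = {
    { State::Idle,    Event::Start,  State::Running },
    { State::Running, Event::Finish, State::Done    },
};
State step(State s, Event e) {
    for (const Transition &t : table)
        if (t.from == s && t.ev == e)
            return t.to;
    return s;   // no matching transition: stay in the current state
}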
I read your proposal and have the following comments:
There's actually no keyword to declare and define an actual state machine! Do you assume a single global state machine (and thus a single global state)? How does that relate to __active__?
The closest comparable construct in C++ is actually the enum. Why not extend it?
There seems to be some connection between defined events and states, but I fail to see how it's implemented.
Why are threads and timers needed at all? Some use cases of state machines may benefit from them, but a good proposal should keep these separate. Most importantly, this should allow the use of standard C++0x threads.
Personally, I would extend the enum syntax:
enum Foo {
red, blue, green; /* Standard C++ so far - defines states. State list ends with a ; not a , */
Foo() { *this = red; } // Reuse ctor syntax, instead of __initial__
~Foo() { } // reuse dtor syntax, instead of __onexit__
void Bar() {/**/} // Defines an event, no return value. Doesn't need keyword __event__
};
It follows naturally that you can now declare your events in a header and define them in a .cpp file. I don't even need to suggest the syntax here; any C++ programmer can guess it at this point. Add a bit of inheritance syntax for combined states:
enum DrawingObject : public Shape, public Color { /* ... */ }; // allows (red && circle)
and you're pretty much at the point where your proposal is, without any new keywords, all by reusing an already familiar syntax.
I'm just starting to explore C++, so forgive the newbiness of this question. I also beg your indulgence on how open ended this question is. I think it could be broken down, but I think that this information belongs in the same place.
(FYI -- I am working predominantly with the QT SDK and mingw32-make right now and I seem to have configured them correctly for my machine.)
I knew that there was a lot in the language which is compiler-driven -- I've heard about preprocessor directives, but it seems like someone could write books on the different C++ compilers and their respective parameters. In addition, there are commands which apparently precede make (like qmake, for example; is this something only in Qt?).
I would like to know if there is any place that gives an overview of what compilers are out there and what their different options are. I'd also like to know how each of them views makefiles (it seems there are differences in syntax between them?).
If there is no website covering "Everything you need to know about C++ compilers but were afraid to ask," what would be the best way to go about learning the answers to these questions?
Concerning the "numerous options of the various compilers"
A piece of good news: you needn't worry about the detail of most of these options. You will, in due time, delve into this, only for the very compiler you use, and maybe only for the options that pertain to a particular set of features. But as a novice, generally trust the default options or the ones supplied with the make files.
The broad categories of these features (and I may be missing a few) are:
pre-processor defines (now, you may need a few of these)
code generation (target CPU, FPU usage...)
optimization (hints for the compiler to favor speed over size and such)
inclusion of debug info (which is extra data left in the object/binary and which enables the debugger to know where each line of code starts, what the variables names are etc.)
directives for the linker
output type (exe, library, memory maps...)
C/C++ language compliance and warnings (compatibility with previous version of the compiler, compliance to current and past C Standards, warning about common possible bug-indicative patterns...)
compile-time verbosity and help
Concerning an inventory of compilers with their options and features
I know of no such list, but I'm sure one probably exists on the web. However, I suggest that, as a novice, you worry little about these "details" and use whatever free compiler you can find (gcc is certainly a great choice), building experience with the language and the build process. C professionals may argue, with good reason and at length, on the merits of various compilers and their associated runtimes, but for generic purposes -and then some- the free stuff is all that is needed.
Concerning the build process
The most trivial applications, such as those made of a single compilation unit (read: a single C/C++ source file), can be built with a simple batch file where the various compiler and linker options are hardcoded and the name of the file is specified on the command line.
For all other cases, it is very important to codify the build process so that it can be done
a) automatically and
b) reliably, i.e. with repeatability.
The "recipe" associated with this build process is often encapsulated in a make file or as the complexity grows, possibly several make files, possibly "bundled together in a script/bat file.
This (make file syntax) you need to get familiar with, even if you use alternatives to make/nmake, such as Apache Ant; the reason is that many (most?) source code packages include a make file.
In a nutshell, make files are text files that allow defining targets and the associated commands to build each target. Each target is associated with its dependencies, which allows the make logic to decide which targets are out of date and should be rebuilt, and, before rebuilding them, which dependencies should in turn be rebuilt. That way, when you modify, say, an include file (and if the make file is properly configured), any C file that uses this header will be recompiled, and any binary which links with the corresponding obj file will be rebuilt as well. make also includes options to force all targets to be rebuilt, which is sometimes handy to be sure that you truly have a current build (for example, in case some dependencies of a given object are not declared in the make file).
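For instance, a minimal hypothetical make file for a two-file project, showing targets and dependencies (note that each command line must be indented with a tab):
CXX = g++
CXXFLAGS = -Wall -O2

app: main.o util.o
	$(CXX) $(CXXFLAGS) -o app main.o util.o

main.o: main.cpp util.h
	$(CXX) $(CXXFLAGS) -c main.cpp

util.o: util.cpp util.h
	$(CXX) $(CXXFLAGS) -c util.cpp

clean:
	rm -f app *.o
Touching util.h causes make to rebuild both object files and relink app, which is exactly the dependency logic described above.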
On the Pre-processor:
The pre-processor is the first step toward compiling, although it is technically not part of the compilation. The purposes of this step are:
to remove comments and extraneous whitespace
to substitute any macro reference with the relevant C/C++ syntax. Some macros, for example, are used to define constant values, such as, say, some email address used in the program; during pre-processing, any reference to this constant value (by convention such constants are named with ALL_CAPS_AND_UNDERSCORES) is replaced by the actual C string literal containing the email address.
to exclude all conditional compilation branches that are not relevant (#ifdef and the like)
What's important to know about the pre-processor is that its directives are NOT part of the C language proper; they serve several important functions, such as the conditional compilation mentioned earlier (used, for example, to have multiple versions of the program, say for different operating systems, or indeed for different compilers).
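A tiny illustration of the macro substitution and conditional compilation just described (the macro name and email address are made up):
#define SUPPORT_EMAIL "support@example.com"   /* ALL_CAPS constant, substituted textually */
#ifdef _WIN32
/* this branch survives pre-processing only in Windows builds */
#endif
const char *contact = SUPPORT_EMAIL;   /* after pre-processing: = "support@example.com" */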
Taking it from there...
After this manifesto of mine... I encourage you to read just a little more, and then to dive into programming and building binaries. It is a very good idea to try and get a broad picture of the framework etc., but this can be overdone, a bit akin to the exchange student who stays in his/her room reading the Webster dictionary to be "prepared" for meeting native speakers, rather than just "doing it!".
Ideally you shouldn't need to care which C++ compiler you are using. Compatibility with the standard has gotten much better in recent years (even from Microsoft).
Compiler flags obviously differ, but the same features are generally available; e.g., setting the warning level is just a differently named option on GCC and MS cl.
The build system is independent of the compiler; you can use any make with any compiler.
That is a lot of questions in one.
C++ compilers are a lot like hammers: They come in all sizes and shapes, with different abilities and features, intended for different types of users, and at different price points; ultimately they all are for doing the same basic task as the others.
Some are intended for highly specialized applications, like high-performance graphics, and have numerous extensions and libraries to assist the engineer with those types of problems. Others are meant for general purpose use, and aren't necessarily always the greatest for extreme work.
The technique for using each type of hammer varies from model to model—and version to version—but they all have a lot in common. The macro preprocessor is a standard part of C and C++ compilers.
A brief comparison of many C++ compilers is here. Also check out the list of C compilers, since many programs don't use any C++ features and can be compiled with an ordinary C compiler.
C++ compilers don't "view" makefiles. The rules of a makefile may invoke a C++ compiler, but also may "compile" assembly language modules (assembling), process other languages, build libraries, link modules, and/or post-process object modules. Makefiles often contain rules for cleaning up intermediate files, establishing debug environments, obtaining source code, etc., etc. Compilation is one link in a long chain of steps to develop software.
Also, many development environments abstract the makefile into a "project file" which is used by an integrated development environment (IDE) in an attempt to simplify or automate many programming tasks. See a comparison here.
As for learning: choose a specific problem to solve and dive in. The target platform (Linux/Windows/etc.) and problem space will narrow the choices pretty well. Which you choose is often linked to other considerations, such as working for a particular company, or being part of a team. C++ has something like 95% commonality among all its flavors. Learn any one of them well, and learning the next is a piece of cake.