Why do compilers only look backward for type and function declarations? - c++

This is purely to satisfy my own curiosity, but why are functions and types only resolved against those defined earlier in the code, rather than anywhere within the same scope?
Sometimes it shows up when functions need to call each other:
void foo(int depth) { bar(depth + 2); }
void bar(int depth) { if (depth > 0) foo(depth - 3); }
Where you either need to move bar to before foo, or declare bar beforehand:
void bar(int);
void foo(int depth) { bar(depth + 2); }
void bar(int depth) { if (depth > 0) foo(depth - 3); }
In more complex examples you may need to review the #include tree and #ifndef guards to figure out why your code does not recognize a function or type defined in some other file, and that's if you already know you're supposed to check for that.
And then of course there's this classic:
typedef struct {
    Node *next;   /* error: the name 'Node' doesn't exist yet at this point */
} Node;
Where you need to know that this is possible:
typedef struct Node {
    struct Node *next;   /* OK: the tag 'struct Node' is already visible here */
} Node;
...but apparently many people don't, so they just use 'void *' or 'struct Node' everywhere.
So, is there a reason why these compilers don't resolve forward too? I can understand the preprocessor only checking backward since things can be #undefined, but once a type or function is declared it's there for good and can't be overloaded by a later definition.
Is it for historical reasons, with the limited technology that was available when they were first developed? Or is there some logical ambiguity I'm missing?

The answer to your question is simply that it would be much harder to write a compiler otherwise: the language specification says it has to be this way, but the reason the specification is worded this way is that "it makes it easier to write a compiler that way". To some degree, it's probably also that in the old days, compilers would generate code "as they went along", rather than first reading all the code in a translation unit (source file) and then processing it.
Bear in mind that C and C++ compilers are still used on machines that don't have huge amounts of memory and very fast processors. So, if you try to compile LARGE amounts of code on a small-capacity machine, then the "we don't want to read ALL of the source first, then parse it" approach makes more sense than it does on a 16GB, quad-core desktop machine. I expect you could load the entire source code for a fairly large project into memory all at once (for example, all of the files in LLVM + Clang are around 450MB, so they could easily fit in memory on a modern desktop/laptop).
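To make the one-pass constraint concrete, here is a small sketch (hypothetical function names): when a one-pass compiler reaches a call, it must emit code for it right then, using whatever it currently knows about the callee.
double half(int x);          /* without this declaration, a C89 compiler
                                would assume "int half()" and emit a call
                                that misreads the returned double as an int */

double twice_half(void)
{
    return half(3) * 2.0;    /* the code for this call is generated right
                                here, in one pass, so the declaration must
                                already have been seen */
}

double half(int x) { return x / 2.0; }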
Edit: It should be noted that "interpreted languages", such as PHP, JavaScript and Basic, typically don't have this requirement, but other compiled languages typically do - for example, Pascal has a special keyword forward to tell the compiler "this function exists, but I'm telling you what it contains later".
Both Pascal and C (and C++, because it's based on C in this respect) allow pointers to structures that aren't complete yet. Just this simple "you don't have all the information yet" case means that the compiler has to build up the type information and then "go back and fix it up" [obviously only as required]. But it allows us to do:
struct X
{
    struct X* next;
    ...
};
or in C++:
struct X
{
    X* next;
    ...
};
Edit2: This blog post by Jan Hubicka, a GCC developer, explains some of the problems with "dealing with all the code at once". Of course, most of us aren't compiling Firefox or similar-sized projects, but large projects, where you have to deal with ALL of the code at once, do cause "not enough memory available" problems even on modern machines if the developers don't "put the compiler on a diet from time to time".
http://hubicka.blogspot.co.uk/2014/04/linktime-optimization-in-gcc-1-brief.html

The reason is that requiring all necessary information about an entity to be present when it is needed allows the compiler to translate a source file in a single pass. Cf. http://blogs.msdn.com/b/ericlippert/archive/2010/02/04/how-many-passes.aspx for a comparison with C# (which doesn't require previous declarations; but then everything is in a class anyway).
C shares this feature with Pascal and a few other languages. Writing such a compiler is easier and, as others pointed out, it tends to use less memory (but potentially increases compilation time, paradoxically, because the declarations in headers are parsed/compiled over and over again for each translation unit).

Because any kind of logical deduction requires some kind of "logical order of things".
You deliberately ignore the order of things to make your point here. OK, but that ignores a lot.
In C and C++ you have an order that tries to work around some typical problems: declaration and definition. In C/C++ you can refer to things that are at least declared, even if they have not yet been defined.
But this separation of "declared" and "defined" is just a relaxation of logical order, which --- in the end --- is just a simplification of things.
Imagine a pure plain-language sequence of statements describing your program (in contrast to a programming language, which tries to express the same thing but also to "compile" it into a practical computer program): A is B, but B depends on C, but B is like C only if A is to B what D may be to A, what C may be to A, what D is to C.
WTF???
If you can deduce a real-world solution to a dedicated real-world problem in a way that does not depend on any logical order, then you will know your answer, and you may become very rich just by knowing it.

Related

Recover C++ class declaration from ELF binary generated by GCC 2.95

Is it possible (that is, could be easily accomplished with commodity tools) to reconstruct a C++ class declaration from a .so file (non-debug, non-x86) — to the point that member functions could be called successfully as if the original .h file was available for that class?
For example, by trial and error I found that this code works when 64K are allocated for instance-local storage. Allocating 1K or 8K leads to SEGFAULT, although I never saw offsets higher than 0x0650 in disassembly and it is highly unlikely that this class actually stores much data by itself.
class TS57 {
private:
    char abLocalStorage[65536]; // Large enough to fit all possible instance data.
public:
    TS57(int iSomething);
    ~TS57(void);
    int SaveAsMap(long (*)(long, long, long, long, long), char const*, char const*, char const*);
};
And what if I needed more complex classes and usage scenarios? What if allocating 64K per instance would be too wasteful? Are there simple tools (like nm or objdump) that may give insight on a .so library's type information? I found some papers on SecondWrite — an “executable analysis and rewriting framework” which does “recovery of object oriented features from C++ binaries”, — but, despite several references in newsgroups, could not even conclude whether this is a production-ready software, or just a proof of concept, or a private tool not for general use.
FYI. I am asking this totally out of curiosity. I already found a C-style wrapper for that function that not only instantiates the object and calls its method, but also conveniently does additional housekeeping. Moreover, I am not enthusiastic to write for G++ 2.95 — as this was the version that library was compiled by, and I could not find a switch to get the same name mangling scheme in my usual G++ 3.3. Therefore, I am not interested in workarounds: I wonder whether a direct solution exists — in case it is needed in the future.
The short answer is "no, a machine can't do that".
The longer answer is still roughly the same, but I'll try to explain why:
Some information about functions would be available from nm libmystuff.so | c++filt, which will demangle the names. But that will only show functions that have a public name, and it will most likely still be a bit ambiguous as to what the data types actually mean.
During compilation, a lot of "semantic information"[1] is lost. Functions are inlined; loops are transformed (loops written with for, do-while and while, even goto in some cases, are made to look almost identical); conditions are compiled out or rearranged; variable names are lost; much of the type information and the enum names are completely lost; and private and public fields of classes are lost.
The compiler will also do "clever" transformations on the code to replace complex instructions with less complex ones (int x; ... x = x * 5 may become lea eax, [eax*4 + eax] or similar). [This one is pretty simple - try figuring out "backwards" how the compiler implemented a population count (number of bits set in a binary number) or cosine once it has been inlined...]
A human who knows what the code is MEANT to do, and who has good knowledge of the machine code of the target processor, MAY be able to reverse-engineer the code and break out functions that have been inlined. But it's still hard to tell the difference between:
void Foo::func()
{
    this->x++;
}
and
void func(Foo* p)
{
    p->x++;
}
These two functions should compile to exactly identical machine code, and if the function does not have a name in the symbol table, there is no way to tell which it is.
[1] Information about the "meaning" of the code.

Using void in functions without parameter?

In C++ using void in a function with no parameter, for example:
class WinMessage
{
public:
    BOOL Translate(void);
};
is redundant; you might as well just write Translate();.
I myself generally include it, since it's a bit helpful when code-completion IDEs display the void: it assures me that the function definitely takes no parameters.
My question is: is adding void to parameter-less functions a good practice? Should it be encouraged in modern code?
In C++
void f(void);
is identical to:
void f();
The fact that the first style can still be legally written can be attributed to C.
n3290 § C.1.7 (C++ and ISO C compatibility) states:
Change: In C++, a function declared with an empty parameter list takes no arguments.
In C, an empty parameter list means that the number and type of the function arguments are unknown.
Example:
int f();   // means int f(void) in C++
           // int f( unknown ) in C
In C, it makes sense to avoid that undesirable "unknown" meaning. In C++, it's superfluous.
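A small sketch of that difference (hypothetical functions f and g):
int f();        /* C: number and type of arguments are unknown */
int g(void);    /* C: takes no arguments, same as C++ */

int main(void)
{
    f(1, 2, 3); /* a C89 compiler accepts this silently; behavior is undefined */
    /* g(1); */ /* error in C: too many arguments */
    return 0;
}

int f() { return 0; }
int g(void) { return 0; }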
Short answer: in C++ it's a hangover from too much C programming. That puts it in the "don't do it unless you really have to" bracket for C++ in my view.
I see absolutely no reason for this. IDEs will just complete the function call with an empty argument list anyway, and it's four characters less to type.
Personally I believe this is making the already verbose C++ even more verbose. There's no version of the language I'm aware of that requires the use of void here.
I think it only helps with backward compatibility with older C code; otherwise it is redundant.
I feel like no. Reasons:
A lot more code out there has the BOOL Translate() form, so others reading your code will be more comfortable and productive with it.
Having less on the screen (especially something redundant like this) means less thinking for somebody reading your code.
Sometimes people who didn't program in C in 1988 ask "What does foo(void) mean?"
Just as a side note: another reason for not including the void is that software like StarUML, which can read code and generate class diagrams, reads the void as a parameter. Even though this may be a flaw in the UML-generating software, it is still annoying to have to go back and remove the "void"s if you want clean diagrams.

Could C++ or C99 theoretically be compiled to equally-portable C90?

This is a big question, so let me get a few things out of the way:
Let's ignore the fact that some C++ features cannot be implemented in C (for example, supporting pre-main initialization for any global static object that is linked in).
This is a thought experiment about what is theoretically possible. Please do not write to say how hard this would be (I know), or that I should do X instead. It's not a practical question, it's a fun theoretical one. :)
The question is: is it theoretically possible to compile C++ or C99 to C89 that is as portable as the original source code?
Cfront and Comeau C/C++ do compile C++ to C already. But for Comeau the C they produce is not portable, according to Comeau's sales staff. I have not used the Comeau compiler myself, but I speculate that the reasons for this are:
Macros such as INT_MAX, offsetof(), etc. have already been expanded, and their expansion is platform-specific.
Conditional compilation such as #ifdef has already been resolved.
My question is whether these problems could possibly be surmounted in a robust way. In other words, could a perfect C++ to C compiler be written (modulo the unsupportable C++ features)?
The trick is that you have to expand macros enough to do a robust parse, but then fold them back into their unexpanded forms (so they are again portable and platform-independent). But are there cases where this is fundamentally impossible?
It would be very difficult for anyone to categorically say "yes, this is possible" but I'm very interested in seeing any specific counterexamples: code snippets that could not be compiled in this way for some deep reason. I'm interested in both C++ and C99 counterexamples.
I'll start out with a rough example just to give a flavor of what I think a counterexample might look like.
#ifdef __SSE__
#define OP <
#else
#define OP >
#endif

class Foo {
public:
    bool operator <(const Foo& other) { return true; }
    bool operator >(const Foo& other) { return false; }
};

bool f() { return Foo() OP Foo(); }
This is tricky because the value of OP and therefore the method call that is generated here is platform-specific. But it seems like it would be possible for the compiler to recognize that the statement's parse tree is dependent on a macro's value, and expand the possibilities of the macro into something like:
bool f() {
#if __SSE__
    return Foo_operator_lessthan(...);
#else
    return Foo_operator_greaterthan(...);
#endif
}
It is not only theoretically possible, but also practically trivial - use LLVM with a cbe target.
Theoretically all Turing-complete languages are equivalent.
You can compile C++ to an object code, and then decompile it to plain C or use an interpreter written in plain C.
In theory of course anything could be compiled to C first, but it is not practical to do so, specifically for C++.
For Foo operator< in your example it could be converted to:
bool isLess(const struct Foo * left, const struct Foo * right );
as a function signature. (If C90 doesn't allow bool, then return int or char; and for old C versions that don't allow const, just don't use it.)
Virtual functions are more tricky, you need function pointers.
// C++:
struct A
{
    virtual int method( const std::string & str );
};
a.method( "Hello" );

// becomes, in C:
struct A
{
    int (*method)( struct A*, const struct string *);
};
a.method( &a, create_String( "hello" ) );
// and take care of the pointer returned by create_String
There are a number of subtle differences. For example, consider the line:
int i = UINT_MAX;
IIRC, in C++ this assigns an implementation-defined value. In C99 and C89, it assigns an implementation-defined value, or raises an implementation-defined signal. So if you see this line in C++, you can't just pass it through to a C89 compiler unmodified unless you make the non-portable assumption that it won't raise a signal.
Btw, if I've remembered wrong, think of your own example of differences in the standards relating to relatively simple expressions...
So, as "grep" says, you can do it because C89 is a rich enough language to express general computation. On the same grounds, you could write a C++ compiler that emits Perl source.
By the sound of your question, though, you're imagining that the compiler would make a set of defined modifications to the original code to make it compile as C89. In fact, even for simple expressions in C++ or C99, the C89 emitted might not look very much like the original source at all.
Also, I've ignored that there may be some parts of the standard libraries you just can't implement, because C89 doesn't offer the capabilities, so you'd end up with a "compiler" but not a complete implementation. I'm not sure. And as dribeas points out, low-level features like VLAs present problems - basically you can't portably use the C89 "stack" as your C99 "stack". Instead you'd have to dynamically allocate memory from C89 to use for automatic variables required in the C99 source.
One big problem is exceptions. It might be possible to emulate them using setjmp, longjmp etc., but this would always be extremely inefficient compared to a real device-aware unwind engine.
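For a flavor of what such an emulation might look like, here is a minimal sketch built on setjmp/longjmp (not how any real unwinder works; a real translator would also need a stack of handlers plus cleanup lists so that destructors run during unwinding):
#include <setjmp.h>
#include <stdio.h>

static jmp_buf current_handler;  /* one handler; a real scheme keeps a stack */

static void might_throw(int bad)
{
    if (bad)
        longjmp(current_handler, 42);     /* plays the role of "throw 42;" */
}

int main(void)
{
    switch (setjmp(current_handler)) {    /* plays the role of "try {" */
    case 0:
        might_throw(1);
        puts("not reached");
        break;
    default:                              /* "} catch (...) {" */
        puts("caught the emulated exception");
        break;
    }
    return 0;
}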
http://www.comeaucomputing.com
There's no better proof of feasibility than a working example. Comeau is one of the most conforming C++03 compilers, and it has support for many features of the upcoming standard, but it does not really generate binary code. It just translates your C++ code into C code that can be compiled with different C backends.
As for portability, I would assume it is not possible. There are some features that cannot be implemented without compiler-specific extensions. The first example that comes to mind is C99 variable-length arrays: int n; int array[n]; which cannot be implemented in pure C89 (AFAIK) but can be implemented on top of extensions like alloca.
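A translator could lower such an array to heap allocation instead, roughly like this (a sketch; a real compiler would also have to cope with goto and longjmp jumping past the allocation):
#include <stdlib.h>

/* C99 source:    int array[n];
   a possible C89 lowering: */
void demo(int n)
{
    int *array = (int *) malloc(n * sizeof *array);
    if (array == NULL)
        return;          /* a C99 VLA has no failure path, so the
                            translator must invent an out-of-memory policy */
    array[0] = 1;
    free(array);         /* freed wherever the VLA's scope would end */
}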

Moving from C++ to C

After a few years coding in C++, I was recently offered a job coding in C, in the embedded field.
Putting aside the question of whether it's right or wrong to dismiss C++ in the embedded field, there are some features/idioms in C++ I would miss a lot. Just to name a few:
Generic, type-safe data structures (using templates).
RAII. Especially in functions with multiple return points, e.g. not having to remember to release the mutex on each return point.
Destructors in general. I.e. you write a d'tor once for MyClass, then if a MyClass instance is a member of MyOtherClass, MyOtherClass doesn't have to explicitly deinitialize the MyClass instance - its d'tor is called automatically.
Namespaces.
What are your experiences moving from C++ to C?
What C substitutes did you find for your favorite C++ features/idioms? Did you discover any C features you wish C++ had?
Working on an embedded project, I tried working in all C once, and just couldn't stand it. It was just so verbose that it made it hard to read anything. Also, I liked the optimized-for-embedded containers I had written, which had to turn into much less safe and harder-to-fix #define blocks.
Code that in C++ looked like:
if(uart[0]->Send(pktQueue.Top(), sizeof(Packet)))
    pktQueue.Dequeue(1);
turns into:
if(UART_uchar_SendBlock(uart[0], Queue_Packet_Top(pktQueue), sizeof(Packet)))
    Queue_Packet_Dequeue(pktQueue, 1);
which many people will probably say is fine but gets ridiculous if you have to do more than a couple "method" calls in a line. Two lines of C++ would turn into five of C (due to 80-char line length limits). Both would generate the same code, so it's not like the target processor cared!
One time (back in 1995), I tried writing a lot of C for a multiprocessor data-processing program. The kind where each processor has its own memory and program. The vendor-supplied compiler was a C compiler (some kind of HighC derivative), their libraries were closed source so I couldn't use GCC to build, and their APIs were designed with the mindset that your programs would primarily be the initialize/process/terminate variety, so inter-processor communication was rudimentary at best.
I got about a month in before I gave up, found a copy of cfront, and hacked it into the makefiles so I could use C++. Cfront didn't even support templates, but the C++ code was much, much clearer.
Generic, type-safe data structures (using templates).
The closest thing C has to templates is to declare a header file with a lot of code that looks like:
/* Schematic: '##' only pastes tokens inside a macro definition, so a real
   version routes these names through helper macros (e.g. #define CAT(a,b) a##b). */
TYPE * Queue_##TYPE##_Top(Queue_##TYPE##* const this)
{ /* ... */ }
then pull it in with something like:
#define TYPE Packet
#include "Queue.h"
#undef TYPE
Note that this won't work for compound types (e.g. no queues of unsigned char) unless you make a typedef first.
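For example (following the same hypothetical Queue.h convention):
/* "unsigned char" won't survive the name pasting, so alias it first: */
typedef unsigned char uchar;
#define TYPE uchar
#include "Queue.h"
#undef TYPE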
Oh, and remember, if this code isn't actually used anywhere, then you don't even know if it's syntactically correct.
EDIT: One more thing: you'll need to manually manage instantiation of code. If your "template" code isn't all inline functions, then you'll have to put in some control to make sure that things get instantiated only once, so your linker doesn't spit out a pile of "multiple definition of Foo" errors.
To do this, you'll have to put the non-inlined stuff in an "implementation" section in your header file:
#ifdef implementation_##TYPE
/* Non-inlines, "static members", global definitions, etc. go here. */
#endif
And then, in one place in all your code per template variant, you have to:
#define TYPE Packet
#define implementation_Packet
#include "Queue.h"
#undef TYPE
Also, this implementation section needs to be outside the standard #ifndef/#define/#endif litany, because you may include the template header file in another header file, but need to instantiate afterward in a .c file.
Yep, it gets ugly fast. Which is why most C programmers don't even try.
RAII.
Especially in functions with multiple return points, e.g. not having to remember to release the mutex on each return point.
Well, forget your pretty code and get used to all your return points (except the end of the function) being gotos:
TYPE * Queue_##TYPE##_Top(Queue_##TYPE##* const this)
{
    TYPE * result;
    Mutex_Lock(this->lock);
    if(this->head == this->tail)
    {
        result = 0;
        goto Queue_##TYPE##_Top_exit;
    }
    /* Figure out `result` for real, then fall through to... */
Queue_##TYPE##_Top_exit:
    Mutex_Unlock(this->lock);
    return result;
}
Destructors in general.
I.e. you write a d'tor once for MyClass, then if a MyClass instance is a member of MyOtherClass, MyOtherClass doesn't have to explicitly deinitialize the MyClass instance - its d'tor is called automatically.
Object construction has to be explicitly handled the same way.
Namespaces.
That's actually a simple one to fix: just tack a prefix onto every symbol. This is the primary cause of the source bloat I talked about earlier (since classes are implicit namespaces). The C folks have been living with this, well, forever, and probably won't see what the big deal is.
YMMV
I moved from C++ to C for a different reason (some sort of allergic reaction ;) and there are only a few things that I miss and some things that I gained. If you stick to C99, if you may, there are constructs that let you program quite nicely and safely, in particular:
designated initializers (possibly combined with macros) make initialization of simple classes as painless as constructors (see the sketch after this list)
compound literals for temporary variables
for-scope variables may help you do scope-bound resource management, in particular to ensure that mutexes are unlocked or arrays freed, even under early function returns
__VA_ARGS__ macros can be used to provide default arguments to functions and to do code unrolling
inline functions and macros that combine well to replace (sort of) overloaded functions
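Here is a minimal sketch of the first two points, with hypothetical names (Point, POINT_INIT):
typedef struct {
    double x, y;
    const char *name;
} Point;

/* Designated initializers plus a variadic macro as a poor man's constructor;
   unmentioned fields become zero, and later designators override earlier ones. */
#define POINT_INIT(...) ((Point){ .name = "unnamed", __VA_ARGS__ })

void demo(void)
{
    Point a = POINT_INIT(.x = 1.0);               /* y == 0.0, default name */
    Point b = POINT_INIT(.x = 2.0, .name = "b");  /* overrides the default  */

    /* a compound literal as a temporary: */
    Point *p = &(Point){ .y = 3.0, .name = "tmp" };

    (void)a; (void)b; (void)p;
}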
The difference between C and C++ is the predictability of the code's behavior.
It is easier to predict with great accuracy what your code will do in C; in C++ it can be a bit more difficult to come up with an exact prediction.
The predictability in C gives you better control of what your code is doing, but it also means you have to do more stuff yourself.
In C++ you can write less code to get the same thing done, but (at least for me) I occasionally have trouble knowing how the object code is laid out in memory and what its expected behavior is.
Nothing like the STL exists for C.
There are libs available which provide similar functionality, but nothing comes built in.
I think that would be one of my biggest problems... knowing which tool I could solve the problem with, but not having that tool available in the language I have to use.
In my line of work - which is embedded, by the way - I am constantly switching back & forth between C and C++.
When I'm in C, I miss from C++:
templates (including but not limited to STL containers). I use them for things like special counters, buffer pools, etc. (built up my own library of class templates & function templates that I use in different embedded projects)
very powerful standard library
destructors, which of course make RAII possible (mutexes, interrupt disable, tracing, etc.)
access specifiers, to better enforce who can use (not see) what
I use inheritance on larger projects, and C++'s built-in support for it is much cleaner & nicer than the C "hack" of embedding the base class as the first member (not to mention automatic invocation of constructors, init. lists, etc.) but the items listed above are the ones I miss the most.
Also, probably only about a third of the embedded C++ projects I work on use exceptions, so I've become accustomed to living without them, so I don't miss them too much when I move back to C.
On the flip side, when I move back to a C project with a significant number of developers, there are whole classes of C++ problems that I'm used to explaining to people which go away. Mostly problems due to the complexity of C++, and people who think they know what's going on, but they're really at the "C with classes" part of the C++ confidence curve.
Given the choice, I'd prefer using C++ on a project, but only if the team is pretty solid on the language. Also of course assuming it's not an 8K μC project where I'm effectively writing "C" anyway.
Couple of observations
Unless you plan to use your C++ compiler to build your C (which is possible if you stick to a well-defined subset of C++), you will soon discover things that your compiler allows in C that would be a compile error in C++.
No more cryptic template errors (yay!)
No (language supported) object oriented programming
Pretty much the same reasons I have for using C++ or a mix of C/C++ rather than pure C. I can live without namespaces, but I use them all the time if the code standard allows it. The reason is that you can write much more compact code in C++. This is very useful for me; I write servers in C++ which tend to crash now and then. At that point it helps a lot if the code you are looking at is short and concise. For example, consider the following code:
uint32_t
ScoreList::FindHighScore(
    uint32_t p_PlayerId)
{
    MutexLock lock(m_Lock);
    uint32_t highScore = 0;
    for(int i = 0; i < m_Players.Size(); i++)
    {
        Player& player = m_Players[i];
        if(player.m_Score > highScore)
            highScore = player.m_Score;
    }
    return highScore;
}
In C that looks like:
uint32_t
ScoreList_getHighScore(
    ScoreList* p_ScoreList)
{
    uint32_t highScore = 0;
    Mutex_Lock(p_ScoreList->m_Lock);
    for(int i = 0; i < Array_GetSize(p_ScoreList->m_Players); i++)
    {
        Player* player = p_ScoreList->m_Players[i];
        if(player->m_Score > highScore)
            highScore = player->m_Score;
    }
    Mutex_UnLock(p_ScoreList->m_Lock);
    return highScore;
}
Not a world of difference: one more line of code, but that tends to add up. Normally you try your best to keep it clean and lean, but sometimes you have to do something more complex. And in those situations you value your line count. One more line is one more thing to look at when you try to figure out why your broadcast network suddenly stops delivering messages.
Anyway I find that C++ allows me to do more complex things in a safe fashion.
Yes! I have experience with both of these languages, and what I found is that C++ is the friendlier language. It provides more features. It is fair to say that C++ is a superset of C, since it provides additional features like polymorphism, inheritance, operator and function overloading, and user-defined data types, which are not really supported in C. Thousands of lines of code can be reduced to a few with the help of object-oriented programming; that's the main reason for moving from C to C++.
I think the main reason C++ has a harder time being accepted in the embedded environment is the lack of engineers who understand how to use C++ properly.
Yes, the same reasoning can be applied to C as well, but luckily there aren't that many pitfalls in C that let you shoot yourself in the foot. With C++, on the other hand, you need to know when not to use certain features.
All in all, I like C++. I use it on the O/S services layer, drivers, management code, etc.
But if your team doesn't have enough experience with it, it's going to be a tough challenge.
I had experience with both. When the rest of the team wasn't ready for it, it was a total disaster. On the other hand, it was good experience.
Certainly, the desire to escape complex/messy syntax is understandable. Sometimes C can appear to be the solution. However, C++ is where the industry support is, including tooling and libraries, so that is hard to work around.
C++ has so many features today including lambdas.
A good approach is to leverage C++ itself to make your code simpler. Objects are good for isolating things under the hood so that at a higher level, the code is simpler. The core guidelines recommend concrete (simple) objects, so that approach can help.
The level of complexity is under the engineer's control. If multiple inheritance (MI) is useful in a scenario and one prefers that option, then one may use MI.
Alternatively, one can define interfaces, inherit from those interfaces, contain implementing objects (composition/aggregation), and expose the contained objects through the interface using inline wrappers. The inline wrappers compile down to nothing, i.e., to direct use of the internal (contained) object, yet the containing object appears to have that functionality as if multiple inheritance were used.
C++ also has namespaces, so one should leverage namespaces even if coding in a C-like style.
One can use the language itself to create simpler patterns and the STL is full of examples: array, vector, map, queue, string, unique_ptr,... And one can control (to a reasonable extent) how complex their code is.
So, going back to C is not the way, nor is it necessary. One may use C++ in a C-like way, or use C++ multiple inheritance, or use any option in-between.

Why doesn't anyone upgrade their C compiler with advanced features?

struct elem
{
    int i;
    char k;
};
elem user;        // compile error!
struct elem user; // this is correct
In the above piece of code we are getting an error for the first declaration. But this error doesn't occur with a C++ compiler. In C++ we don't need to use the keyword struct again and again.
So why doesn't anyone update their C compiler so that we can use structures without the keyword, as in C++?
Why don't the C compiler developers remove some of the glitches of C, like the one above, and add some advanced features, without damaging the original concept of C?
Why is it the same old compiler, not updated since the 1970s?
Look at Visual Studio etc.: it is frequently updated with new releases, and with every new release we have to learn some new function usage (even though that is a problem, we can cope with it). We would also get updated to the new compiler if there were one.
Don't take this as a silly question. Why is it not possible? It could be developed without any incompatibility issues (without affecting the code that was developed on the present/old compiler).
OK, let's develop a new C language, C+, which is in between C and C++: it removes all the glitches of C and adds some advanced features from C++, while keeping it useful for specific applications like system-level applications, embedded systems, etc.
Because it takes years for a new Standard to evolve.
They are working on a new C++ standard (C++0x), and also on a new C standard (C1x), but if you remember that it usually takes between 5 and 10 years for each iteration, I don't expect to see it before 2010 or so.
Also, just like in any democracy, there are compromises in a standard. You've got the hardliners who say "If you want all that fancy syntactic sugar, go for a toy language like Java or C# that takes you by the hand and even buys you a lollipop", whereas others say "The language needs to be easier and less error-prone to survive in these days of rapidly shrinking development cycles".
Both sides are partially right, so standardization is a very long battle that takes years and will lead to many compromises. That applies to everything where multiple big parties are involved, it's not just limited to C/C++.
typedef struct
{
    int i;
    char k;
} elem;
elem user;
will work nicely. As others said, it's about the standard -- if this were implemented in VS2008, you couldn't use it in GCC, and if it were implemented even in GCC, it would certainly not compile on something else. The method above will work everywhere.
On the other side -- now that we have the C99 standard with a bool type, declarations in a for() loop, and declarations in the middle of blocks -- why not this feature as well?
First and foremost, compilers need to support the standard. That's true even if the standard seems awkward in hindsight. Second, compiler vendors do add extensions. For example, many compilers support this:
(char *) p += 100;
to move a pointer by 100 bytes instead of 100 of whatever type p is a pointer to. Strictly speaking that's non-standard because the cast removes the lvalue-ness of p.
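The standard-conforming spelling takes an explicit reassignment instead; for example, for a double*:
void advance_demo(void)
{
    double buf[100];
    double *p = buf;

    /* non-standard extension:  (char *) p += 100;  */
    /* portable equivalent: */
    p = (double *) ((char *) p + 100);
    (void)p;
}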
The problem with non-standard extensions is that you can't count on them. That's a big problem if you ever want to switch compilers, make your code portable, or use third-party tools.
C is largely a victim of its own success. One of the main reasons to use C is portability. There are C compilers for virtually every hardware platform and OS in existence. If you want to be able to run your code anywhere, you write it in C. This creates enormous inertia. It's almost impossible to change anything without sacrificing one of the best things about using the language in the first place.
The result for software developers is that you may need to write to the lowest common denominator, typically ANSI C (C89). For example: Parrot, the virtual machine that will run the next version of Perl, is being written in ANSI C. Perl6 will have an enormously powerful and expressive syntax with some mind-bending concepts baked right into the language. The implementation, though, is being built using a language that is almost the complete opposite. The reason is that this will make it possible for perl to run anywhere: PCs, Macs, Windows, Linux, Unix, VAX, BSD...
This "feature" will never be adopted by future C standards for one reason only: it would badly break backward compatibility. In C, struct tags have separate namespaces to normal identifiers, and this may or may not be considered a feature. Thus, this fragment:
struct elem
{
    int foo;
};
int elem;
Is perfectly fine in C, because these two elems are in separate namespaces. If a future standard allowed you to declare a struct elem without a struct qualifier or appropriate typedef, the above program would fail because elem is being used as an identifier for an int.
An example where a future C standard does in fact break backward compatibility is C99's disallowing of functions without an explicit return type, i.e.:
foo(void); /* declare a function foo that takes no parameters and returns an int */
This is illegal in C99. However, it is trivial to make this C99 compliant just by adding an int return type. It is not so trivial to "fix" C programs if suddenly struct tags didn't have a separate namespace.
I've found that when I've implemented non-standard extensions to C and C++, even when people request them, they do not get used. The C and C++ world definitely revolves around strict standard compliance. Many of these extensions and improvements have found fertile ground in the D programming language.
Walter Bright, Digital Mars
Most people still using C use it because they're either:
Targeting a very specific platform (ie, embedded) and therefore must use the compiler provided by that platform vendor
Concerned about portability, in which case a non-standard compiler would defeat the purpose
Very comfortable with plain C and see no reason to change, in which case they just don't want to.
As already mentioned, C has a standard that needs to be adhered to. But can't you just write your code using slightly modified C syntax, but use a C++ compiler so that things like
struct elem
{
    int i;
    char k;
};
elem user;
will compile?
Actually, many C compilers do add features - doesn't pretty much every C compiler support C++ style // comments?
Most of the features added to updates of the C standard (C99 being the most recent) come from extensions that 'caught on'.
For example, even though the compiler I'm using right now on an embedded platform does not claim to conform to the C99 standard (and it is missing quite a bit of it), it does add the following extensions (all of which are borrowed from C++ or C99) to its 'C90' support:
declarations mixed with statements
anonymous structs and unions (see the sketch after this list)
inline
declaration in the for loop initialization expression
and, of course, C++ style // comments
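For a flavor of two of those extensions, here is a small sketch (hypothetical struct Message); both only later became standard, anonymous unions in C11 and mixed declarations in C99:
struct Message
{
    int tag;
    union              /* anonymous union: members are accessed directly */
    {
        long  ival;
        float fval;
    };
};

void demo(void)
{
    struct Message m;
    m.tag  = 1;
    m.ival = 42;                 /* no intermediate union member name needed */

    int copy = (int) m.ival;     /* a declaration mixed in with statements */
    (void)copy;
}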
The problem I run into with this is that when I try to compile those files using MSVC (either for testing or because the code is useful on more than just the embedded platform), it'll choke on most of them (I'm honestly not sure about anonymous structs/unions).
So, extensions do get added to C compilers, it's just that they're done at different rates and in different ways (so code using them becomes more difficult to port) and the process of moving them into a standard occurs at a near glacial pace.
We have a typedef for exactly this purpose.
And please do not change the standard; we have enough compatibility problems already....
@Manoj Doubts' comment:
I have no problem with you or somebody else defining C+ or C- or C-whatever, as long as you don't touch C :)
I still need a language capable of completing my task: having the same piece of code (not a small one) run on tens of operating systems, compiled by a significant number of different compilers, on tens of different hardware platforms. At the moment there is only one language that allows me to complete my task, and I prefer not to experiment with that ability :) Especially for the reason you provided. Do you really think that the ability to write
foo test;
instead of
struct foo test;
will make your code better from any point of view?
The following program outputs "1" when compiled as standard C, but something else, probably "2", when compiled as C++ or with your suggested syntax. That's why the C language can't make this change: it would give new meaning to existing code. And that's bad!
#include <stdio.h>

typedef struct
{
    int a;
    int b;
} X;

int main(void)
{
    union X
    {
        int a;
        int b;
    };
    X x;
    x.a = 1;
    x.b = 2;
    printf("%d\n", x.a);
    return 0;
}
Because C is standardized. A compiler could offer that feature, and some do, but using it means that the source code doesn't follow the standard and could only be compiled with that vendor's compiler.
Well,
1 - None of the compilers in use today are from the 70s...
2 - There are standards for both the C and C++ languages, and compilers are developed according to those standards. They can't just change some behaviour!
3 - What happens if you develop on VS2008 and then try to compile that code with another compiler whose last version was released 10 years ago?
4 - What happens when you play with the options on the C/C++ / Language tab?
5 - Why don't Microsoft compilers target all possible processors? They only target x86, x86_64 and Itanium, that's all...
6 - Believe me, this is not even considered a problem!!!
You don't need to develop a new language if you want to use C with C++ typedefs and the like (but without classes, templates etc).
Just write your C-like code and use the C++ compiler.
As far as new functionality in new releases goes, Visual C++ is not completely standard-conforming (see http://msdn.microsoft.com/en-us/library/x84h5b78.aspx). By the time Visual Studio 2010 is out, the next C++ standard will likely have been approved, giving the VC++ team more functionality to change.
There are also changes to the Microsoft libraries (which have little or nothing to do with the standard), and to what the compiler puts out (C++/CLI). There's plenty of room for changes without trying to deviate from the standard.
Nor do you need anything like C+. Just write in C, use whatever C++ features you like, and compile as C++. One of Bjarne Stroustrup's original design goals for C++ was to make it unnecessary to write anything in C. It should compile perfectly efficiently provided you limit the C++ features you use (and even then it will compile very efficiently; modern C++ compilers do a very good job).
And the unanswered question: Why would you want to use non-standard C, when you could write standard C or standard C++ with almost equal facility?
This sounds like the embrace and extend concept.
Life under your scenario.
I develop code using a C compiler that has the C "glitches" removed.
I move to a different platform with another C compiler that has the C "glitches" removed, but in a slightly different way.
My code doesn't compile, or runs differently, on the new platform, and I waste time "porting" my code to the new platform.
Some vendors actually like to fix "glitches" because this tends to lock people into a single platform.
If you want to write in standard C, follow the standards. That's it.
If you want more freedom use C# or C++.NET or anything else your hardware supports.