undefined reference to `_memset64' when creating a struct array under DMD with -betterC

I'm quite new to D, and am trying to create an array of structs with -betterC active, but keep running into this error:
/home/xander/Documents/lithium/kernel/kernel.main.d:17: undefined reference to `_memset64'
when I try to link it. Here is the offending line:
idt_entry[256] idtr;
I did not get this error with gdc, but I require access to inline assembly for my project, so switching back is not an option.
Repo link:
Omega0x013/lithium

I think defining the function will solve your problem. source from here
extern (C) long *_memset64(long *p, long value, size_t count)
{
    long *pstart = p;
    long *ptop;

    for (ptop = &p[count]; p < ptop; p++)
        *p = value;
    return pstart;
}

D compilers, like many other compilers (GCC and Clang come to mind), detect certain code patterns and emit calls to standard library functions that implement them, since those functions are often heavily optimized and sometimes use machine-specific hardware extensions such as AVX.
In your case this is freestanding code (a fact you did not conceal very well, given that this is an x86 IDT structure :D), so the standard library is not available and you can't link against it. The compiler does not know that, though, so it still generates the call; it is therefore your job as the programmer to supply the function!
This answer provides the drop-in function to implement, but I think some reasoning was due: it's never good to copy code without understanding its purpose!
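The same applies to the other helpers a compiler may emit calls to, such as memset and memcpy, so a freestanding kernel usually supplies those too. A minimal sketch in C (the functions are prefixed my_ here only so the example can be built in a hosted environment; in a real freestanding build you would define them under the exact names the linker reports as missing):

```c
#include <stddef.h>

/* Byte-wise fallback implementations of the helpers a compiler may
   generate calls to in freestanding code. */
void *my_memset(void *dest, int value, size_t count)
{
    unsigned char *d = dest;

    while (count--)
        *d++ = (unsigned char)value;
    return dest;
}

void *my_memcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (count--)
        *d++ = *s++;
    return dest;
}
```

Which helpers actually get called depends on the compiler and target, so the reliable approach is to add them one by one as the linker complains.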

Related

Passing structs by-value while conforming to the C calling convention in LLVM IR

I would like to pass structs by-value between C++ and a JIT'd LLVM program. I've seen a lot of discussion about this and even a few questions on SO. I've read that I need to do something called "argument coercion" if I want my program to really pass-by-value. Using byval and sret looks like the easy cross-platform solution. It's still a bit of a pain and the C++ code has to remember to pass pointers instead of values (although, the calling code is C++ so I could do some templating magic).
The more I read about this problem, the less I seem to understand it. Calling convention is a platform-specific issue that should be dealt with by the code generator, right? I don't understand why the platform-specific code generator doesn't just deal with the platform-specific way of handling structs (while conforming to the platform's C ABI). The front-end should be platform-agnostic!
Is there a pass that does argument coercion for me? A pass that visits every function declaration and every function call and transforms all of the structs so that they are compatible with the platform's C ABI? I feel like that's something that all frontends would be using if it existed and Clang doesn't use it so maybe it's not possible. Why isn't this a viable solution? If a pass can just deal with this then I would expect it to be part of LLVM.
I don't understand why every frontend has to do argument coercion. I don't even understand how to do argument coercion. I've seen a few instances of people taking the Clang code generation code and factoring out the part that does argument coercion. Unfortunately, this seems like the best solution if I want real C ABI compatibility. The fact that it's even possible to reuse part of another frontend for a completely different language makes me continue to wonder why this has to be done in the frontend?
Something has to be done about this! We can't just keep writing the same C ABI compatibility code in every frontend. It's ridiculous! Maybe I simply don't understand.
Could someone clear this up for me? I'm thinking about using byval and sret simply because it's easier than modifying the clang code generator. Is there an easier way?
When passing around structs by value in LLVM IR, you have to make up your own rules. I chose the simplest set of rules I could.
Let's say I have a program like this:
struct MyStruct {
    int a;
    char b, c, d, e;
};

MyStruct identityImpl(MyStruct s) {
    return s;
}

MyStruct identity(MyStruct s) {
    return identityImpl(s);
}
The LLVM IR for this program is equivalent to this:
void identityImpl(MyStruct *ret, const MyStruct *s) {
    MyStruct localS = *s;
    *ret = localS;
}

void identity(MyStruct *ret, const MyStruct *s) {
    MyStruct localS = *s;
    MyStruct localRet;
    identityImpl(&localRet, &localS);
    *ret = localRet;
}
It's not the most efficient way of passing the struct because MyStruct can fit in a 64-bit register. However, the optimizer can remove localS and use s directly if it can prove that localS is never written to. Both of those functions optimize down to a single call to memcpy.
This only took half a day. Going the Clang route probably would have taken at least a week. I still think it's rather unfortunate that I had to do this but I understand the problem now. The passing of structs is not specified by the platform's C ABI.
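The lowered form described above can be exercised directly as plain C, which makes it easy to convince yourself the convention is self-consistent (a sketch; the struct layout matches the example):

```c
/* sret-style lowering modeled in C: the struct return value becomes a
   pointer out-parameter, and the by-value argument becomes a const
   pointer plus a local copy in the callee. */
struct MyStruct {
    int a;
    char b, c, d, e;
};

static void identityImpl(struct MyStruct *ret, const struct MyStruct *s)
{
    struct MyStruct localS = *s; /* callee works on its own copy */

    *ret = localS;
}

static void identity(struct MyStruct *ret, const struct MyStruct *s)
{
    struct MyStruct localS = *s;
    struct MyStruct localRet;

    identityImpl(&localRet, &localS);
    *ret = localRet;
}
```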

Trigger a compiler error when trying to add char* to int?

C++ compilers happily compile this code, with no warning:
int ival = 8;
const char *strval = "x";
const char *badresult = ival + strval;
Here we add a char* pointer value (strval) to an int value (ival) and store the result in a char* pointer (badresult). Of course, the content of badresult will be total garbage, and the app might crash on this line, or later, when it tries to use badresult elsewhere.
The problem is that it is very easy to make such mistakes in real life. The one I caught in my code looked like this:
message += port + "\n";
(where message is a string type handling the result with its operator += function; port is an int and \n is obviously a const char pointer).
Is there any way to disable this kind of behavior and trigger an error at compile time?
I don't see any normal use case for adding char* to int and I would like a solution to prevent this kind of mistakes in my large code base.
When using classes, we can create private operators and use the explicit keyword to disable unneeded conversions/casts, however now we are talking about basic types (char* and int).
One solution is to use Clang, which has a flag that enables a warning for exactly this.
However, I can't use Clang, so I am looking for a solution that triggers a compiler error (some kind of operator overloading, or tricks with defines to prevent such constructs, or any other idea).
Is there any way to disable this kind of behavior and trigger an error at compile time?
Not in general, because your code is very similar to the following, legitimate, code:
int ival = 3;
const char *strval = "abcd";
const char *goodresult = ival + strval;
Here goodresult is pointing to the last letter d of strval.
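This can be checked directly: int + char* is plain address arithmetic (nth_char is a helper made up for this illustration):

```c
/* n + s is identical to s + n, i.e. &s[n]: the "bad" and "good" cases
   in the question differ only in whether the offset stays inside the
   array. */
const char *nth_char(const char *s, int n)
{
    return n + s;
}
```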
BTW, on Linux, getpid(2) is known to return a positive integer. So you could imagine:
int ival = (getpid()>=0)?3:1000;
const char *strval = "abcd";
const char *goodresult = ival + strval;
which is morally the same as the previous example (so we humans know that ival is always 3). But teaching the compiler that getpid() does not return a negative value is tricky in practice (the return type pid_t of getpid is some signed integer, and it has to be signed to be usable by fork(2), which can return -1). And you could imagine weirder examples!
You want compile-time detection of buffer overflow (or more generally of undefined behavior), and in general that is equivalent to the halting problem (it is an unsolvable problem). So it is impossible in general.
Of course, one could claim that a clever compiler should warn in your particular case, but then there is the question of which cases are worth warning about.
You might try some static source analysis tools, perhaps the Clang static analyzer or Frama-C (with its recent Frama-C++ variant), or some costly proprietary tools like Coverity and many others. These tools don't detect all errors statically, and they take much more time to run than an optimizing compiler.
You could (for example) write your own GCC plugin to detect such mistakes (that means developing your own static source code analyzer). You'll spend months writing it. Are you sure it is worth the effort?
However I can't use clang,
Why? You could ask permission to use the clang static analyzer (or some other one), during development (not for production). If your manager refuses that, it becomes a management problem, not a technical one.
I don't see any normal use case for adding char* to int
You need more imagination. Think of something like
puts("non-empty" + (isempty(x)?4:0));
Ok that is not very readable code, but it is legitimate. In the previous century, when memory was costly, some people used to code that way.
Today you'll code perhaps
if (isempty(x))
    puts("empty");
else
    puts("non-empty");
and the cute thing is that a clever compiler could probably optimize the latter into the equivalent of the former (according to the as-if rule).
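The old-school spelling really does work; a minimal runnable sketch (isempty is modeled here as a plain int flag):

```c
/* "non-empty" + 4 skips the leading "non-", yielding the string
   "empty"; with offset 0 the full string is returned. */
static const char *empty_label(int isempty)
{
    return "non-empty" + (isempty ? 4 : 0);
}
```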
No way. It is valid syntax, and very useful in many cases.
Just think: if you meant to write int b = a + 10 but typed int b = a + 00 by mistake, the compiler cannot know it is an error.
However, you can consider using C++ classes. Most well-designed C++ classes prevent such obvious mistakes.
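The class-based route can be sketched like this (Port is a hypothetical wrapper invented for this example, not part of any library): once the value has its own class type, the built-in int + char* operator no longer applies, and the tempting overload can be deleted outright.

```cpp
#include <string>

// Hypothetical strong-typed wrapper for a port number.
struct Port {
    int value;
    explicit Port(int v) : value(v) {}
};

// Selected by `port + "..."`; being deleted, it turns the mistake into
// a compile-time error instead of silent pointer arithmetic.
std::string operator+(Port, const char *) = delete;

// The sanctioned way to put a port into a message.
inline std::string to_text(const Port &p) { return std::to_string(p.value); }

// message += Port(8080) + "\n";            // would fail to compile
// message += to_text(Port(8080)) + "\n";   // OK
```

This only helps for values you wrap, of course; it cannot retroactively fix bare int + char* elsewhere in the code base.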
In the first example in your question, really, compilers should issue a warning. Compilers can trivially see that the addition resolves to 8 + "x" and clang does indeed optimise it to a constant. I see the fact it doesn't warn about this as a compiler bug. Although compilers are not required to warn about this, clang goes through great efforts to provide useful diagnostics, and it would be an improvement to diagnose this as well.
In the second example, as Matteo Italia pointed out, clang does already provide a warning option for this, enabled by default: -Wstring-plus-int. You can turn specific warnings into errors by using -Werror=<warning-option>, so in this case -Werror=string-plus-int.

Recover C++ class declaration from ELF binary generated by GCC 2.95

Is it possible (that is, could be easily accomplished with commodity tools) to reconstruct a C++ class declaration from a .so file (non-debug, non-x86) — to the point that member functions could be called successfully as if the original .h file was available for that class?
For example, by trial and error I found that this code works when 64K are allocated for instance-local storage. Allocating 1K or 8K leads to SEGFAULT, although I never saw offsets higher than 0x0650 in disassembly and it is highly unlikely that this class actually stores much data by itself.
class TS57 {
private:
    char abLocalStorage[65536]; // Large enough to fit all possible instance data.
public:
    TS57(int iSomething);
    ~TS57(void);
    int SaveAsMap(long (*)(long, long, long, long, long), char const *, char const *, char const *);
};
And what if I needed more complex classes and usage scenarios? What if allocating 64K per instance were too wasteful? Are there simple tools (like nm or objdump) that may give insight into a .so library's type information? I found some papers on SecondWrite — an “executable analysis and rewriting framework” which does “recovery of object oriented features from C++ binaries” — but, despite several references in newsgroups, I could not even conclude whether it is production-ready software, a proof of concept, or a private tool not for general use.
FYI, I am asking this purely out of curiosity. I already found a C-style wrapper for that function that not only instantiates the object and calls its method, but also conveniently does additional housekeeping. Moreover, I am not enthusiastic about writing for G++ 2.95, as that was the version the library was compiled with, and I could not find a switch to get the same name-mangling scheme in my usual G++ 3.3. Therefore, I am not interested in workarounds: I wonder whether a direct solution exists, in case it is needed in the future.
The short answer is "no, a machine can't do that".
The longer answer is still roughly the same, but I'll try to explain why:
Some information about functions would be available from nm libmystuff.so|c++filt, which will demangle the names. But that will only show functions that have a public name, and it will most likely still be a bit ambiguous as to what the data-types actually mean.
During compilation, a lot of "semantic information"[1] is lost. Functions are inlined; loops are transformed (loops written with for, do-while, and while, and even goto in some cases, end up looking almost identical); conditions are compiled out or rearranged; variable names are lost; much of the type information and the enum names are completely lost; and the private/public distinction between class fields disappears.
The compiler will also apply "clever" transformations to replace complex instructions with simpler ones (int x; ... x = x * 5 may become lea eax, [eax*4 + eax] or similar). This one is pretty simple; try figuring out "backwards" how the compiler computed a population count (the number of bits set in a binary number) or a cosine once it has been inlined...
A human, that knows what the code is MEANT to do, and good knowledge of the machine code of the target processor, MAY be able to reverse engineer the code and break out functions that have been inlined. But it's still hard to tell the difference between:
void Foo::func()
{
    this->x++;
}

and

void func(Foo *p)
{
    p->x++;
}
These two functions should compile to exactly identical machine code, and if the function does not have a name in the symbol table, there is no way to tell which one it was.
[1] Information about the "meaning" of the code.
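The two shapes can be put side by side to see why they are indistinguishable after compilation (a sketch; Foo is assumed here to have a public int member x):

```cpp
struct Foo {
    int x = 0;

    void func() { this->x++; }  // member version: `this` is an implicit pointer
};

void func(Foo *p) { p->x++; }   // free-function version: the pointer is explicit

// Both receive the same pointer, load x, increment it, and store it
// back; with the symbol name stripped, the machine code gives no hint
// which form the source used.
```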

GCC pragma to add/remove compiler options in a source file

I have developed a cross-platform library which makes fair use of type-punning in socket communications. This library is already being used in a number of projects, some of which I may not be aware of.
Using this library incorrectly can result in dangerously Undefined Behavior. I would like to ensure to the best of my ability that this library is being used properly.
Aside from documentation, of course, under G++ the best way I'm aware of to do that is to use the -fstrict-aliasing and -Wstrict-aliasing options.
Is there a way under GCC to apply these options at a source file level?
In other words, I'd like to write something like the following:
MyFancyLib.h
#ifndef MY_FANCY_LIB_H
#define MY_FANCY_LIB_H
#pragma (something that pushes the current compiler options)
#pragma (something to set -fstrict_aliasing and -Wstrict-aliasing)
// ... my stuff ...
#pragma (something to pop the compiler options)
#endif
Is there a way?
I rather dislike nay-sayers. You can see an excellent post at this page: https://www.codingame.com/playgrounds/58302/using-pragma-for-compile-optimization
All the other answers clearly have nothing to do with the question so here is the actual documentation for GCC:
https://gcc.gnu.org/onlinedocs/gcc/Pragmas.html
Other compilers will have their own methods so you will need to look those up and create some macros to handle this.
Best of luck. Sorry that it took you 10 years to get any relevant answer.
Let's start with what I think is a false premise:
Using this library incorrectly can result in dangerously Undefined Behavior. I would like to ensure to the best of my ability that this library is being used properly.
If your library does type punning in a way that -fstrict-aliasing breaks, then it has undefined behavior according to the C++ standard regardless of what compiler flags are passed. The fact that the program seems to work on certain compilers when compiled with certain flags (in particular, -fno-strict-aliasing) does not change that.
Therefore, the best solution is to do what Florian said: change the code so it conforms to the C++ language specification. Until you do that, you're perpetually on thin ice.
"Yes, yes", you say, "but until then, what can I do to mitigate the problem?"
I recommend including a run-time check, used during library initialization, to detect the condition of having been compiled in a way that will cause it to misbehave. For example:
// Given two pointers to the *same* address, return 1 if the compiler
// is behaving as if -fstrict-aliasing is specified, and 0 if not.
//
// Based on https://blog.regehr.org/archives/959 .
static int sae_helper(int *h, long *k)
{
    // Write a 1.
    *h = 1;

    // Overwrite it with all zeroes using a pointer with a different type.
    // With naive semantics, '*h' is now 0. But when -fstrict-aliasing is
    // enabled, the compiler will think 'h' and 'k' point to different
    // memory locations ...
    *k = 0;

    // ... and therefore will optimize this read as 1.
    return *h;
}

int strict_aliasing_enabled()
{
    long k = 0;

    // Undefined behavior! But we're only doing this because other
    // code in the library also has undefined behavior, and we want
    // to predict how that code will behave.
    return sae_helper((int *)&k, &k);
}
(The above is C rather than C++ just to ease use in both languages.)
Now in your initialization routine, call strict_aliasing_enabled(), and if it returns 1, bail out immediately with an error message saying the library has been compiled incorrectly. This will help protect end users from misbehavior and alert the developers of the client programs that they need to fix their build.
I have tested this code with gcc-5.4.0 and clang-8.0.1. When -O2 is passed, strict_aliasing_enabled() returns 1. When -O2 -fno-strict-aliasing is passed, that function returns 0.
But let me emphasize again: my code has undefined behavior! There is (can be) no guarantee it will work. A standard-conforming C++ compiler could compile it into code that returns 0, crashes, or that initiates Global Thermonuclear War! Which is also true of the code you're presumably already using elsewhere in the library if you need -fno-strict-aliasing for it to behave as intended.
You can try the Diagnostic pragmas and change the level in error for your warnings. More details here:
http://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
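A minimal sketch of that approach: diagnostic pragmas can escalate one specific warning to an error for a header's own contents and then restore the user's settings. Note that they control diagnostics only; there is no pragma equivalent of -fno-strict-aliasing itself.

```c
/* Escalate -Wstrict-aliasing to an error for this region only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic error "-Wstrict-aliasing"

static inline int read_int(const int *p)
{
    return *p; /* aliasing-suspect code here would be a hard error */
}

#pragma GCC diagnostic pop
```

Clang also honors the `#pragma GCC diagnostic` family, so the same header works under both compilers.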
If your library is a header-only library, I think the only way to deal with this is to fix the strict aliasing violations. If the violations occur between types you define, you can use the usual tricks involving unions, or the may_alias type attribute. If your library uses the predefined sockaddr types, this could be difficult.
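The union trick mentioned above looks like this (a sketch; reading a different union member than was last written is defined behavior in C99/C11, and GCC and Clang document it as supported in C++ as well):

```c
#include <stdint.h>

/* Reinterpret the bytes of a float as a 32-bit integer without
   violating strict aliasing. */
static inline uint32_t float_bits(float f)
{
    union {
        float f;
        uint32_t u;
    } pun;

    pun.f = f;
    return pun.u; /* same bytes, read through the other member */
}
```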

Why doesn't anyone upgrade their C compiler with advanced features?

struct elem
{
    int i;
    char k;
};

elem user;        // compile error!
struct elem user; // this is correct
In the above piece of code we are getting an error for the first declaration. But this error doesn't occur with a C++ compiler. In C++ we don't need to use the keyword struct again and again.
So why doesn't anyone update their C compiler so that we can use structures without the keyword, as in C++?
Why don't C compiler developers remove some of the glitches of C, like the one above, and add some advanced features, without damaging the original concept of C?
Why is it the same old compiler, essentially unchanged since the 1970s?
Look at Visual Studio: it is frequently updated with new releases, and with every new release we have to learn some new function usage (even though that is a burden, we can cope with it). We would likewise get used to an updated compiler if there were one.
Don't take this as a silly question. Why is it not possible? It could be done without any incompatibility issues (without affecting code developed on the present/old compilers).
OK, let's develop a new C language, C+, somewhere between C and C++, which removes all the glitches of C and adds some advanced features from C++, while keeping it useful for specific applications like system-level programming, embedded systems, etc.
Because it takes years for a new Standard to evolve.
They are working on a new C++ standard (C++0x) and also on a new C standard (C1x), but if you remember that each iteration usually takes between 5 and 10 years, I don't expect to see it before 2010 or so.
Also, just like in any democracy, a standard involves compromises. You have the hardliners who say, "If you want all that fancy syntactic sugar, go for a toy language like Java or C# that takes you by the hand and even buys you a lollipop," whereas others say, "The language needs to be easier and less error-prone to survive in these days of rapidly shrinking development cycles."
Both sides are partially right, so standardization is a very long battle that takes years and leads to many compromises. That applies to everything where multiple big parties are involved; it's not limited to C/C++.
typedef struct
{
    int i;
    char k;
} elem;

elem user;
will work nicely. As others have said, it's about the standard: if you implemented this in VS2008, you couldn't use it with GCC, and if you implemented it even in GCC, it certainly wouldn't compile anywhere else. The method above works everywhere.
On the other hand, we have the C99 standard with a bool type, declarations inside a for() loop, and declarations in the middle of blocks, so why not this feature as well?
First and foremost, compilers need to support the standard. That's true even if the standard seems awkward in hindsight. Second, compiler vendors do add extensions. For example, many compilers support this:
(char *) p += 100;
to move a pointer by 100 bytes instead of 100 of whatever type p points to. Strictly speaking that's non-standard, because the result of the cast is not an lvalue.
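For comparison, the same byte-wise pointer bump can be spelled in standard-conforming C by doing the arithmetic on a char* value and assigning the result, instead of assigning through a cast (advance_bytes is a name made up for this sketch):

```c
#include <stddef.h>

/* Move an int* forward by n bytes, portably. */
static int *advance_bytes(int *p, size_t n)
{
    return (int *)((char *)p + n);
}
```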
The problem with non-standard extensions is that you can't count on them. That's a big problem if you ever want to switch compilers, make your code portable, or use third-party tools.
C is largely a victim of its own success. One of the main reasons to use C is portability. There are C compilers for virtually every hardware platform and OS in existence. If you want to be able to run your code anywhere you write it in C. This creates enormous inertia. It's almost impossible to change anything without sacrificing one of the best things about using the language in the first place.
The result for software developers is that you may need to write to the lowest common denominator, typically ANSI C (C89). For example: Parrot, the virtual machine that will run the next version of Perl, is being written in ANSI C. Perl6 will have an enormously powerful and expressive syntax with some mind-bending concepts baked right into the language. The implementation, though, is being built using a language that is almost the complete opposite. The reason is that this will make it possible for perl to run anywhere: PCs, Macs, Windows, Linux, Unix, VAX, BSD...
This "feature" will never be adopted by future C standards for one reason only: it would badly break backward compatibility. In C, struct tags live in a separate namespace from ordinary identifiers, and this may or may not be considered a feature. Thus, this fragment:
struct elem
{
    int foo;
};

int elem;
Is perfectly fine in C, because these two elems are in separate namespaces. If a future standard allowed you to declare a struct elem without a struct qualifier or appropriate typedef, the above program would fail because elem is being used as an identifier for an int.
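The separate-namespace rule is easy to demonstrate with a program that uses both elems at once (a sketch; read_both is a helper invented for the illustration):

```c
/* Struct tags and ordinary identifiers live in separate namespaces,
   so both declarations of 'elem' coexist. */
struct elem {
    int foo;
};

int elem = 5; /* no clash with the struct tag */

static int read_both(void)
{
    struct elem e = { 7 }; /* the 'struct' keyword picks the tag namespace */

    return e.foo + elem;
}
```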
An example where a later C standard did in fact break backward compatibility is C99 disallowing functions without an explicit return type, i.e.:
foo(void); /* declare a function foo that takes no parameters and returns an int */
This is illegal in C99. However, it is trivial to make this C99 compliant just by adding an int return type. It is not so trivial to "fix" C programs if suddenly struct tags didn't have a separate namespace.
I've found that when I've implemented non-standard extensions to C and C++, even when people request them, they do not get used. The C and C++ world definitely revolves around strict standard compliance. Many of these extensions and improvements have found fertile ground in the D programming language.
Walter Bright, Digital Mars
Most people still using C use it because they're either:
Targeting a very specific platform (i.e., embedded) and therefore must use the compiler provided by that platform vendor
Concerned about portability, in which case a non-standard compiler would defeat the purpose
Very comfortable with plain C and see no reason to change, in which case they just don't want to.
As already mentioned, C has a standard that needs to be adhered to. But couldn't you just write your code in this slightly modified C syntax and use a C++ compiler, so that things like
struct elem
{
    int i;
    char k;
};

elem user;
will compile?
Actually, many C compilers do add features - doesn't pretty much every C compiler support C++ style // comments?
Most of the features added to updates of the C standard (C99 being the most recent) come from extensions that 'caught on'.
For example, even though the compiler I'm using right now on an embedded platform does not claim to conform to the C99 standard (and it is missing quite a bit of it), it does add the following extensions (all borrowed from C++ or C99) to its 'C90' support:
declarations mixed with statements
anonymous structs and unions
inline
declaration in the for loop initialization expression
and, of course, C++ style // comments
The problem I run into with this is that when I try to compile those files using MSVC (either for testing or because the code is useful on more than just the embedded platform), it'll choke on most of them (I'm honestly not sure about anonymous structs/unions).
So, extensions do get added to C compilers, it's just that they're done at different rates and in different ways (so code using them becomes more difficult to port) and the process of moving them into a standard occurs at a near glacial pace.
We have a typedef for exactly this purpose.
And please do not change the standard; we have enough compatibility problems already...
Re: Manoj Doubts' comment
I have no problem with you or anybody else defining C+ or C- or C-whatever, as long as you don't touch C :)
I still need a language capable of completing my task: having the same piece of code (not a small one) run on tens of operating systems, compiled by a significant number of different compilers, on tens of different hardware platforms. At the moment there is only one language that lets me do that, and I prefer not to experiment with that ability :) Especially for the reason you provided. Do you really think that the ability to write
foo test;
instead
struct foo test;
will make your code better from any point of view?
The following program outputs "1" when compiled as standard C, but probably "2" when compiled as C++ (or under your suggested syntax). That's why the C language can't make this change: it would give new meaning to existing code, and that's bad!
#include <stdio.h>

typedef struct
{
    int a;
    int b;
} X;

int main(void)
{
    union X
    {
        int a;
        int b;
    };

    X x;
    x.a = 1;
    x.b = 2;
    printf("%d\n", x.a);
    return 0;
}
Because C is standardized. A compiler could offer that feature, and some do, but using it would mean the source code no longer follows the standard and could only be compiled with that vendor's compiler.
Well,
1 - None of the compilers in use today are from the 70s...
2 - There are standards for both the C and C++ languages, and compilers are developed according to those standards. They can't just change some behaviour!
3 - What happens if you develop on VS2008 and then try to compile that code with another compiler whose last version was released 10 years ago?
4 - What happens when you play with the options on the C/C++ / Language tab?
5 - Why don't Microsoft compilers target all possible processors? They only target x86, x86_64 and Itanium, that's all...
6 - Believe me, this is not even considered a problem!!!
You don't need to develop a new language if you want to use C with C++ typedefs and the like (but without classes, templates etc).
Just write your C-like code and use the C++ compiler.
As far as new functionality in new releases goes, Visual C++ is not completely standard-conforming (see http://msdn.microsoft.com/en-us/library/x84h5b78.aspx). By the time Visual Studio 2010 is out, the next C++ standard will likely have been approved, giving the VC++ team more functionality to implement.
There are also changes to the Microsoft libraries (which have little or nothing to do with the standard), and to what the compiler puts out (C++/CLI). There's plenty of room for changes without trying to deviate from the standard.
Nor do you need anything like C+. Just write in C, use whatever C++ features you like, and compile as C++. One of Bjarne Stroustrup's original design goals for C++ was to make it unnecessary to write anything in C. Provided you limit the C++ features you use, it will compile perfectly efficiently (and even the full language compiles very efficiently; modern C++ compilers do a very good job).
And the unanswered question: Why would you want to use non-standard C, when you could write standard C or standard C++ with almost equal facility?
This sounds like the embrace and extend concept.
Life under your scenario.
I develop code using a C compiler that has the C "glitches" removed.
I move to a different platform with another C compiler that also has the "glitches" removed, but in a slightly different way.
My code doesn't compile, or runs differently, on the new platform, and I waste time "porting" it.
Some vendors actually like to fix "glitches" because this tends to lock people into a single platform.
If you want to write in standard C, follow the standards. That's it.
If you want more freedom use C# or C++.NET or anything else your hardware supports.