Here's what I did:
I changed a .h file from
SomeObj* getCacheObj( int i = 0 );
to
SomeObj* getCacheObj( int i );
SomeObj* getCacheObj();
I recompiled the code without problems, and the changes went into somelib.so (one of many .so files). I then replaced the old .so on the equipment with the new one and got the following error when loading it:
undefined symbol: _ZN13KeypathHelper11getCacheObjEv
Now the strange part is that I've been told this class is only used in this so file (How can I make sure?). I am not that experienced and not sure how to investigate. Any suggestions are welcome.
Update
This particular problem was caused by another .so file that was also using the KeypathHelper class, while I had only replaced the one containing it. I found out which other .so needed to be updated by grepping all the .so files for KeypathHelper.
The _ZN13KeypathHelper11getCacheObjEv symbol is the mangled name for KeypathHelper::getCacheObj() (you can easily translate it using c++filt, for example). Since you have only added a method and whatever is loading the shared object cannot find it, I think that you either haven't updated the shared object or forgot to provide a definition for KeypathHelper::getCacheObj() (in other words, to implement the method).
In order to investigate, you have to see what is failing to resolve the symbol. Usually, developers have a sense for it. Say, if a binary XXX cannot load library YYY due to an unresolved symbol, then XXX is using the symbol and it does not appear to be in YYY (or anywhere else for that matter). If you have no sense for that, you can resort to reading the ld.so(8) manual page and debugging the dynamic linker with the available means, such as defining LD_DEBUG.
Also, @PlasmaHH has asked a very good question. If the only change you made was to the header file, keep in mind that a single function/method with a default value for a parameter is not the same as two functions/methods where one has a parameter and one does not: the second form declares an additional function, getCacheObj(), which needs its own definition.
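To make the difference concrete, here is a minimal sketch. The names OldHelper, NewHelper and callNoArg, and the int return type, are made up for illustration (the real method returns SomeObj*). With the default argument there is only one function and the 0 is filled in by the caller; with two declarations there are two separate functions, each with its own mangled name.

```cpp
// Hypothetical stand-in for the OLD header: one function only.
// The default value is inserted at each call site, so every caller
// references the symbol for getCacheObj(int).
struct OldHelper {
    int getCacheObj(int i = 0) { return i; }
};

// Hypothetical stand-in for the NEW header: two distinct functions with
// two mangled names (...getCacheObjEi and ...getCacheObjEv under the
// Itanium C++ ABI used by GCC).
struct NewHelper {
    int getCacheObj(int i) { return i; }
    int getCacheObj() { return getCacheObj(0); }  // needs its own definition
};

// Only the two-overload version has a member that takes no arguments,
// so only there can a pointer to such a member be formed.
int callNoArg(NewHelper& helper) {
    int (NewHelper::*noArg)() = &NewHelper::getCacheObj;  // picks the () overload
    return (helper.*noArg)();
}
```

Any code recompiled against the new header and calling getCacheObj() without arguments now references the ...Ev symbol, so every shared object involved has to be rebuilt and deployed together.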
As for your second question about how to make sure that symbol in a shared object is not being used outside — you have to change the symbol visibility so that nobody from the outside is able to link/resolve/use the symbol. For example, see GCC Visibility.
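As a rough sketch of what that looks like with GCC (or Clang): the macro names LIB_LOCAL and LIB_PUBLIC and the class and function names below are hypothetical; only the visibility attribute itself comes from the compiler.

```cpp
// GCC/Clang-specific attribute wrapped in hypothetical macro names.
#define LIB_LOCAL  __attribute__((visibility("hidden")))
#define LIB_PUBLIC __attribute__((visibility("default")))

// Hidden: the class's symbols are not exported from the shared object,
// so no other .so can bind to them.
class LIB_LOCAL InternalHelper {
public:
    int cachedValue() const { return 42; }
};

// Exported entry point that uses the hidden class internally.
LIB_PUBLIC int publicEntryPoint() {
    InternalHelper helper;
    return helper.cachedValue();
}
```

Building the library with -fvisibility=hidden makes hidden the default, so only the symbols explicitly marked as public are exported. If another library did depend on a hidden class, it would then fail when it is linked rather than when it is loaded on the equipment.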
Hope it helps. Good Luck!
struct Foo {
    Bar get() {
    }
}

auto f = Foo();
f.get();
For example, you decide that get was a very poor choice of name, but you have already used it in many different files, and manually changing every occurrence is very annoying.
You also can't really make a global substitution because other types may also have a method called get.
Is there anything for D to help refactor names for types, functions, variables etc?
Here's how I do it:
1. Change the name in the definition
2. Recompile
3. Go to the first error line reported and replace old with new
4. Go to 2
That's semi-manual, but I find it to be pretty easy and it goes quickly because the compiler error message will bring you right to where you need to be, and most editors can read those error messages well enough to dump you on the correct line, then it is a simple matter of telling it to repeat the last replacement again. (In my vim setup with my hotkeys, I hit F4 for next error message, then dot for repeat last change until it is done. Even a function with a hundred uses can be changed reliably* in a couple minutes.)
You could probably write a script that handles 90% of cases automatically too by just looking for ": Error: " in the compiler's output, extracting the file/line number, and running a plain text replace there. If the word shows up only once and outside a string literal, you can automatically replace it, and if not, ask the user to handle the remaining 10% of cases manually.
But I think it is easy enough to do with my editor hotkeys that I've never bothered trying to script it.
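For what it's worth, the parsing half of such a script might look roughly like the sketch below. It is written in C++ only as an illustration, and it assumes diagnostics shaped like "path/file.d(123): Error: message"; check what your compiler actually prints. The names ErrorLocation, parseErrorLine and countWord are made up, and the string-literal check mentioned above is not implemented.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Location of one compiler diagnostic.
struct ErrorLocation {
    std::string file;
    int line = 0;
};

// Assumed line shape: "path/to/file.d(123): Error: some message".
// Returns false (and leaves `out` untouched) for anything else.
bool parseErrorLine(const std::string& text, ErrorLocation& out) {
    static const std::regex pattern(R"(^(.+)\((\d+)(?:,\d+)?\): Error: )");
    std::smatch match;
    if (!std::regex_search(text, match, pattern)) {
        return false;
    }
    out.file = match[1].str();
    out.line = std::stoi(match[2].str());
    return true;
}

// Counts whole-word occurrences of `name` in one source line. `name` is
// assumed to be a plain identifier (no regex metacharacters). A script
// would only replace automatically when this returns exactly 1.
int countWord(const std::string& sourceLine, const std::string& name) {
    const std::regex word("\\b" + name + "\\b");
    auto first = std::sregex_iterator(sourceLine.begin(), sourceLine.end(), word);
    return static_cast<int>(std::distance(first, std::sregex_iterator()));
}
```

Lines where countWord returns anything other than 1 would be the ones handed back to the user.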
The one case this doesn't catch is if there's another function with the same name that might still compile. That should never happen if you do this change in isolation, because an ambiguous name wouldn't compile without it.
In that case, you could probably do a three-step compiler-assisted change:
1. Make sure your code compiles before you start. Then add @disable to the thing you want to rename.
2. Compile. Every place where the compiler complains that the thing is unusable because it is disabled, do the find/replace.
3. Remove @disable and rename the definition. Recompile again to make sure there's nothing you missed, like child classes (the compiler will then complain "method foo does not override any function", so they stand right out too).
So yeah, it isn't fully automated, but just changing it and having the compiler errors help find what's left is good enough for me.
Some limited refactoring support can be found in major IDE plugins like Mono-D or VisualD. I remember that Brian Schott had plans to add similar functionality to his dfix tool by adding dependency on dsymbol but it doesn't seem implemented yet.
Note, however, that all such options are of very limited robustness right now. This is because figuring out the fully qualified name of any given symbol is a very complex task in D, one that requires full semantic analysis to be done 100% correctly. Think about local imports, templates, function overloading, mixins and how it all affects identifying the symbol.
In the long run, it is quite certain that we need to wait until the reference D compiler frontend becomes available as a library before such a refactoring tool can be implemented in a clean and truly reliable way.
A good "find all" feature can be better than a bad refactoring tool, which, as mentioned previously, requires semantic analysis.
Personally I have a find all feature in Coedit which displays the context of a match and works on all the project sources.
It's fast to process the results.
I'm working on a huge code base written many years ago. We're trying to implement multi-threading and I'm in charge of cleaning up global variables (sigh!)
My strategy is to move all global variables to a class, and then individual threads will use instances of that class and the globals will be accessed through class instance and -> operator.
As a first step, I've compiled a list of global variables using nm by finding the object names in the B and D groups. The list is not complete, and in the case of static variables I don't get file and line number info.
The second stage is even messier: I have to replace all globals in the code base with the classinstance->global_name pattern. I'm using cscope's "Change text string" for this. The problem is that the names of some globals are also used locally inside functions, and thus cscope replaces those as well.
Any other way to go about it? Any strategies, or help please!
Just some suggestions, from my experience:
Use Eclipse: the C++ indexer is very good, and when dealing with a large project I find it very useful for tracking variables. Shift+Ctrl+G (I have forgotten how to reach it from the menus!) lets you search for all references, and Ctrl+Alt+H (Open Call Hierarchy) shows the caller-callee trees.
Use Eclipse: it has good refactoring tools that can rename a variable without touching same-name, different-scope variables. (It often fails when templates are involved, but I find it good, better than its Visual Studio 2008 counterpart.)
Use Eclipse: I know, it takes some time to get started with it, but once you get it, it's very powerful. It can easily deal with an existing makefile-based project (File -> New -> Project -> Makefile Project with Existing Code).
I would consider using accessors rather than class members: it's possible that some of the globals will be shared among threads and need some locking in order to be used properly. So I would prefer: classinstance->get_global_name()
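A minimal sketch of that accessor idea, assuming one former global (a counter) that is shared between threads. The names Globals, request_count and handle_request are invented for the example; a real code base would have one accessor pair per former global.

```cpp
#include <mutex>

// Hypothetical container for what used to be global variables.
class Globals {
public:
    // Accessor for a former global that several threads share:
    // every read and write goes through the lock.
    int get_request_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return request_count_;
    }
    void increment_request_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++request_count_;
    }

private:
    mutable std::mutex mutex_;  // mutable so const accessors can lock it
    int request_count_ = 0;
};

// Worker code receives the instance instead of touching a global.
void handle_request(Globals* classinstance) {
    classinstance->increment_request_count();
}
```

With plain public members, adding the lock later would mean touching every use site again; with accessors it is a change in one place.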
As a final note, I don't know whether using the eclipse indexer at command-line would be helpful for your task. You can find some examples googling for it.
This question/answer can give you some more hints: any C/C++ refactoring tool based on libclang? (even simplest "toy example" ). In particular I do quote "...C++ is a bitch of a language to transform"
Halfway there: if a function uses a local name that hides the global name, the object file won't have an undefined symbol. nm can show you those undefined symbols, and then you know in which files you must replace at least some instances of that name.
However, you still have a problem in the rare cases where a file uses the global name in one function and hides it in another. I'm not sure whether this can be resolved with -ffunction-sections, but I think so: nm can show the section, and thus you'll see the undefined symbols used in foo() appear in section .text.foo.
Allow me to preface this question with 2 comments:
1) I'm a C# developer, so I don't have much practice dealing with linker errors in C++ and some standard C++ syntax is a bit unfamiliar to me. I suspect this will be an easy question to the C++ gurus out there.
2) I'm not sure how to ask this question in a way that will be relevant to the masses but I'm open to suggestions/corrections from the community. The problem with lnk2019 errors is that it seems pretty individualized as to what the problem actually is. MSDN has an article that deals with the error generally and Stack Overflow already has a slew of questions with that tag and yet I still can't seem to solve my problem.
On to the details...
I was given an old (VS2005) C++ solution with 42 projects and was asked to try and get it to build. After doing quite a bit of twiddling, I've gotten it down to just 3 projects that won't build. I'd like to focus on just one of them because I think if we can figure that one out, I can do the same things to the other 2 projects to fix them.
Let's start with the error. As you can see, the project in question is named "HttpWire".
Deleting intermediate and output files for project 'Http Wire', configuration 'Release|x64'
Compiling...
HttpWire.cpp
Compiling resources...
Linking...
Creating library Release\AMD64\HttpWire.lib and object Release\AMD64\HttpWire.exp
HttpWire.obj : error LNK2019: unresolved external symbol "public: __cdecl THttpWire::THttpWire(char const *)" (??0THttpWire@@QEAA@PEBD@Z) referenced in function CreateConnectionWire
Release\AMD64\HttpWire.dll : fatal error LNK1120: 1 unresolved externals
Looks like the linker is upset because the function "CreateConnectionWire" is calling "THttpWire" but for some reason the linker is unable to find it. There is only 1 .cpp file in the project (HttpWire.cpp) and here it is:
#include "THttpWire.h"
BOOL WINAPI DllMain(HINSTANCE hDllInst, DWORD reason, LPVOID reserved)
{
    return TRUE;
}
__declspec(dllexport) TConnectionWire *CreateConnectionWire(LPCTSTR connectionString)
{
    return new THttpWire(connectionString);
}
__declspec(dllexport) void DeleteConnectionWire(TConnectionWire *connectionWire)
{
    delete connectionWire;
}
The #include file, "THttpWire.h" lives in another project called "AirTime Core". It includes several other things and then has the following:
class THttpWire : public TConnectionWire
{
public:
    THttpWire(LPCTSTR connectionString);
    virtual ~THttpWire();
    ... (lots of other stuff) ...
};
And then, finally, we have THttpWire.cpp:
#include "THttpWire.h"
...
THttpWire::THttpWire(LPCTSTR connectionString) :
    TConnectionWire(connectionString),
    hWinHttp(NULL), hSession(NULL), hRequest(NULL),
    opTimedOut(FALSE), asyncError(0),
    headers(NULL), headersOffset(0), headersLength(0),
    availData(0)
{
    requestSent = new TSyncEvent(TRUE);
    updateToString();
}
This syntax is a bit weird to me... what are we doing here? I mean, I realize this is a constructor, and since THttpWire appears to inherit from TConnectionWire (according to the .h), the ": TConnectionWire(connectionString)" makes sense (I'm assuming this is like appending ": base()" in C# to constructors of objects that inherit from other objects). But then what is all the other stuff between that and the opening brace? (Note that TConnectionWire does not appear to inherit from anything else.)
SO...
After doing some searching on MSDN and SO, I've learned the following (please correct me if I'm wrong)
CreateConnectionWire is prefaced by __declspec(dllexport) which simply makes it available to other projects consuming this .dll (as discussed here)
LPCTSTR is a const char* (see MSDN). Note that my projects are set with "Treat wchar_t as Built-in Type: No (/Zc:wchar_t-)" in the property pages. (see the bottom of this article and also this article)
Right now, my primary suspicion is with LPCTSTR. Perhaps it is not defined the same in both projects, which would yield different method signatures... but I don't know how to check for this or fix it if that is the case. Or, perhaps the "/Zc:wchar_t-" thing is affecting it adversely?
My next suspicion is that there is something in the string of methods listed in the constructor (with the syntax that I don't understand) that is causing some sort of problem and making the "THttpWire" constructor not available, generally.
What do you think? I'd be happy to share any other bits that you think would be useful.
Other information that may or may not be helpful (I'll let you decide)
When I first started with this project, there were several .lib and .h files missing and I've had to go around trying to find them (examples were opends60.lib, mssoap30.lib, WinLUA.h, etc.). It is quite possible I don't have the same version the solution was originally built against.
The projects were all built with "_WIN32_WINNT=0x0400" defined, which appears to mean it was meant to be built against the Windows 2000 SDK (see MSDN). I found something that I thought was the Win 2000 SDK (the oldest one on here), but when I link to that, I get many more errors. Instead, I'm linking to the SDK version 6.1. HOWEVER, this causes WinHttp not to compile because "SOCKADDR_STORAGE" isn't defined for anything "_WIN32_WINNT<0x0501" (Windows XP). THUS, I've redefined "_WIN32_WINNT=0x0501" for all of the projects that appear to be related to HttpWire. It is possible I missed one or two.
There is only 1 .cpp file in the project (HttpWire.cpp)
Well, that's a problem because clearly you need more than 1. You also need THttpWire.cpp since it contains the constructor code. The one that the linker cannot find.
Keep the C++ build model in mind, it is very different from C#. Source code files are separately compiled. And then the linker glues all the bits of code together to make the program. Those bits may come from an .obj file created from a .cpp file. Or they could come from a .lib file, a "container" of bits of code.
The .lib case is the likely explanation here, since you mentioned an "AirTime Core" project. Check the Project + Properties, Linker, Input, Additional Dependencies setting. You need to add the output of the "AirTime Core" project, whatever it is named.
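To picture the build model, here is a small sketch that squeezes three hypothetical files into one listing. Widget and CreateWidget are made-up names standing in for THttpWire and CreateConnectionWire; the comments mark which file each piece would live in.

```cpp
#include <string>

// ---- Widget.h (hypothetical header): declaration only ----
class Widget {
public:
    explicit Widget(const char* name);  // declared here, defined elsewhere
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// ---- Factory.cpp: compiles fine with nothing but the header ----
Widget* CreateWidget(const char* name) {
    // The compiler only records a reference to Widget::Widget(const char*).
    return new Widget(name);
}

// ---- Widget.cpp: the definition the linker has to be given ----
// If the .obj or .lib built from this file is missing from the link
// step, the reference above stays unresolved and the linker reports it.
Widget::Widget(const char* name) : name_(name) {}
```

In a single file like this everything resolves. Split across projects, the part marked Widget.cpp has to reach the linker through an .obj or a .lib, which is what the Additional Dependencies setting is for.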
Our project (C++, Linux, gcc, PowerPC) consists of several shared libraries. When releasing a new version of the package, only those libs should change whose source code was actually affected. With "change" I mean absolute binary identity (the checksum over the file is compared. Different checksum -> different version according to the policy). (I should mention that the whole project is always built at once, no matter if any code has changed or not per library).
Usually this can be achieved by hiding the private parts of the included header files and not changing the public ones.
However, there was a case where a mere delete was added to the destructor of a class TableManager (in the TableManager.cpp file!) of library libTableManager.so, and yet the binary/checksum of library libB.so (which uses class TableManager ) has changed.
TableManager.h:
class TableManager
{
public:
    TableManager();
    ~TableManager();
private:
    int* myPtr;
};
TableManager.cpp:
TableManager::~TableManager()
{
    doSomeCleanup();
    delete myPtr; // this delete has been added
}
Inspecting libB.so with readelf --all libB.so and looking at the .dynsym section, it turned out that the lengths of all functions, even those used dynamically from other libraries, are stored in libB! It looks like this (the length is the 668 in the 3rd column):
527: 00000000 668 FUNC GLOBAL DEFAULT UND _ZN12TableManagerD1Ev
So my questions are:
Why is the length of a function actually stored in the client lib? Wouldn't a start address be sufficient?
Can this be suppressed somehow when compiling/linking of libB.so (kind of "stripping")? We would really like to reduce this degree of dependency...
Bingo. It is actually kind of a "bug" in binutils which they found and fixed in 2008. The size information is actually not useful!
What Simon Baldwin wrote on the binutils mailing list describes exactly this problem (emphasis mine):
Currently, the size of an undefined ELF symbol is copied out of the
object file or DSO that supplies the symbol, on linking. This size is
unreliable, for example in the case of two DSOs, one linking to the
other. The lower-level DSO could make an ABI-preserving change that
alters the symbol size, with no hard requirement to rebuild the
higher-level DSO. And if the higher-level DSO is rebuilt, tools that
monitor file checksums will register a change due to the altered size
of the undefined symbol, even though nothing else about the
higher-level DSO has altered. This can lead to unnecessary and
undesirable rebuild and change cascades in checksum-based systems.
We have the problem with an older system (binutils 2.16). I compared it with version 2.20 on the desktop system and - voilà - the lengths of shared global symbols were 0:
157: 00000000 0 FUNC GLOBAL DEFAULT UND _ZN12TableManagerD1Ev
158: 00000000 0 FUNC GLOBAL DEFAULT UND _ZNSs6assignERKSs@GLIBCXX_3.4 (2)
159: 00000000 0 FUNC GLOBAL DEFAULT UND sleep@GLIBC_2.0 (6)
160: 00000000 0 FUNC GLOBAL DEFAULT UND _ZN4Gpio11setErrorLEDENS_
So I compared both binutils source codes, and - voilà again - there is the fix which Alan suggested in the mailing list:
Maybe we will just apply the patch and recompile binutils, since we need to stay with the older platform. Thanks for your patience.
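For anyone wondering why a copied size disturbs the checksum at all, here is a rough sketch. Sym32 repeats the field layout of a 32-bit ELF symbol table entry (Elf32_Sym) so the example does not need <elf.h>. The checksum function is a toy stand-in, not the one our release policy uses, and the value 672 is invented.

```cpp
#include <cstdint>
#include <cstring>

// Field layout of a 32-bit ELF symbol table entry (Elf32_Sym).
struct Sym32 {
    std::uint32_t st_name;   // offset of the name in the string table
    std::uint32_t st_value;  // address; 0 for an undefined symbol
    std::uint32_t st_size;   // size of the function or object
    unsigned char st_info;   // binding and type
    unsigned char st_other;  // visibility
    std::uint16_t st_shndx;  // section index; 0 (SHN_UNDEF) if undefined
};

// Toy checksum over the raw bytes of one entry (illustration only).
std::uint32_t checksum(const Sym32& sym) {
    unsigned char bytes[sizeof(Sym32)];
    std::memcpy(bytes, &sym, sizeof(Sym32));
    std::uint32_t sum = 0;
    for (unsigned char b : bytes) {
        sum = sum * 31u + b;
    }
    return sum;
}
```

Two otherwise identical undefined-symbol entries that differ only in st_size (say 668 before and 672 after the delete was added) hash differently, and so does the file containing them. With st_size written as 0, as the newer binutils does, the entry in libB.so no longer depends on the size of the destructor.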
You'd need to read through the code for the loader to be sure, but I think in this case we can make a fairly reasonable guess about what that length field is intended to accomplish.
The loader needs to take all the functions that are going to be put into the process, and map them to memory addresses. So, it gives the first function an address. Then, the second comes after the end of the first -- but to know "the end of the first", it needs to know how long the first function is.
I can see two ways for it to get that length: it can either have it encoded in the file (as you've seen it is in ELF), or it can open the file that contains the function and get the length from there.
The latter seems (to me) to have two fairly obvious disadvantages. The first is speed -- opening all those extra files, parsing their headers, etc., just to get the lengths of the functions is almost certainly slower than reading an extra four bytes for each function from the current file. The second is convenience: as long as you don't call any of the functions in a file, you don't need that file to be present at all. If you read the lengths directly from the file (e.g., like Windows normally does with DLLs) you'd need that file to be present on the target system, even if it's never actually used.
Edit: Since some people apparently missed the (apparently too-) subtle implication of "intended to accomplish", let me be entirely clear: I'm reasonably certain this field is not (and never has been) actually used.
Anybody who thinks that makes this answer wrong, however, needs to go back to programming 101 and learn the difference between an interface and an implementation.
In this case, the file format defines an interface -- a set of capabilities that a loader can use. In the specific case of Linux, it appears that this field isn't ever used.
That, however, doesn't change the fact that the field still exists, nor that the OP asked about why it exists. Simply saying "it's not used", while true in itself, would/does not answer the question he asked.
I have this old C++ COM component. I took the latest code base, built it and found that one of the properties has become lower case. For example, in the pre-compiled DLL I have a property "Type", but when building from source it's called "type". The IDL shows that the property is called "Type". So what could possibly be happening here?
COM is case-insensitive, so there is only one entry in the library's symbol table for the symbol "type". The version which is put into the symbol table is the first one that the compiler encounters.
Microsoft's advice on the matter is simply:
Make sure that the same name is not already present in the IDL file when introducing a new identifier.
You should stick to either Type or type in the IDL, for consistent results.
You discovered a quirk in the OS stock implementation of ICreateTypeLib, used by practically all tool chains on Windows that can create a type library. It uses a rather crude way to deal with possible problems caused by languages that are not case-sensitive, VB/A being a prominent example.
At issue is the definition of an identifier with one casing, being referenced elsewhere in the type library with another casing. Not a problem at all in, say, VB, big problem when the client programmer uses a case-sensitive language like C# or C++.
The "fix" it uses is to force the casing to be consistent everywhere in the library. Unfortunately it is not very sophisticated about it. Best example is a method declaration earlier in the type library that takes an argument named type. Any identifier named Type in the rest of the type library will now get case-converted to type.
Repairing this problem is easy enough, just change the name of the identifier so it no longer matches. You'll have to find it, not so easy, best to use Oleview.exe, File > View Typelib command. Copy/paste the decompiled IDL into a text editor and use its Search command.
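The "first spelling wins" behaviour can be pictured with a toy model. This is NOT the real ICreateTypeLib code; NameTable and intern are invented names, and the sketch only mimics the effect described above: one spelling is kept per case-insensitive name, namely the first one encountered.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Toy model only: keeps one spelling per case-insensitive name.
class NameTable {
public:
    // Returns the spelling that would end up in the "type library".
    std::string intern(const std::string& name) {
        std::string key = name;
        std::transform(key.begin(), key.end(), key.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        // emplace() leaves an existing entry untouched, so the first
        // spelling seen for this key is the one that sticks.
        return spellings_.emplace(key, name).first->second;
    }

private:
    std::map<std::string, std::string> spellings_;
};
```

In this model, an argument named type that is seen first turns every later Type into type, and the reverse holds if Type is seen first.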
I had the same problem almost 10 years after this question was asked and I would like to share my solution (thanks for the help in understanding the problem).
First I would like to say that I had several names whose casing was changed by tlbimp and changing all the instances of these names to my expected casing in the IDL fixed all but one. I'm assuming that that name (Text) came from a different IDL I imported. I was also not happy with the solution of changing the names of parameters and the like since in the future someone else may change them.
The solution I found was to introduce a dummy interface with the casing I wanted. I did this before all other imports and then referenced it in the library section of the IDL. Note that both these details are required. If you don't put it in the library section it's ignored and if I defined it at the beginning of the library section after the imports it's too late.
import "oaidl.idl";
import "ocidl.idl";
[
uuid(4EA92D5A-BF84-46C4-AA38-0F7DEADC69B),
helpstring("Ensure that names used in interop have correct casing")
]
interface IAmHack : IUnknown
{
HRESULT Space();
HRESULT The();
HRESULT Final();
HRESULT Frontier();
};
// ...
library MyLib
{
interface IAmHack;
importlib("stdole2.tlb");