I developed a special business algorithm and packaged it as a static library, and my other developers write non-critical code that links against this static library at compile time. I want to restrict things so that only my company's computers (Linux) can link against the library, to prevent it from being stolen and abused.
If this isn't a good approach, any other suggestions are appreciated!
Thank you very much!
It's very difficult to do this in a way that isn't easily bypassed by someone with some skill at such things. A simple solution would be to check the network card's MAC address and refuse to run if it's "wrong" - but of course, anyone wanting to use your code could just patch the binary so the check accepts their MAC address. And you would have to recompile the library whenever you move it to another computer.
Edit based on comment: No, the linker won't check the MAC address, but the library can have code in it that checks the MAC address at run time, prints a message, and refuses to continue.
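To make the idea concrete, here is a minimal sketch of what such a check inside the library's initialisation code might look like on Linux. The interface name "eth0", the mylib namespace, and the hard-coded address are assumptions for illustration only; a real check would enumerate interfaces and handle failure more gracefully.

    // Sketch only: reads the MAC address of a fixed interface and bails out
    // if it does not match a value baked in at build time.
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <string>

    namespace mylib {

    static const std::string kAllowedMac = "aa:bb:cc:dd:ee:ff"; // placeholder

    void init()
    {
        std::ifstream mac_file("/sys/class/net/eth0/address");
        std::string mac;
        std::getline(mac_file, mac);

        if (!mac_file || mac != kAllowedMac) {
            std::cerr << "mylib: this copy is not licensed for this machine\n";
            std::exit(EXIT_FAILURE); // refuse to run, as described above
        }
        // ... normal initialisation ...
    }

    } // namespace mylib

As noted above, anyone determined enough can simply patch this comparison out of the binary, and the library has to be rebuilt for each allowed machine.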
A solution more likely to work is to request permission from a server somewhere. If you also use some cryptography when contacting the server, along with a two-part key, it can be pretty difficult to break. There are commercial products based on this approach, and if it didn't work they wouldn't exist - but I'm sure there are people who can bypass/fake those too.
As always, it comes down to the compromise between "how important is it to protect?" and "how hard is it to break the protection?". Games companies spend tons of money making their games hard to cheat in and hard to copy, yet within days there are people who have bypassed it. And that's not even state secrets - government agencies (CIA, FBI, KGB, MI5, etc.) that can put 100 really bright people in a room to break something will almost certainly get into whatever it is; it just isn't worth the effort unless it's something REALLY important [and of course, such things are also well protected by both physical and logical protection mechanisms - you don't just log onto an FBI server from the internet without some extra security, for example].
Now that I know some of the basics of C++, I must admit that I still find it very hard to deal with code that others have written in C++. This may be inherent to the language, as C++ allows for complex object hierarchies that are, at least to me, very hard to grasp if one is just handed a C++ project without any further comments or instructions.
So my question is more a question to the more experienced C++ programmers among you: how can someone understand a large C++ project written by others?
I easily lose my way and can be lost for weeks if I try to understand how a large project of, for example, 10,000 lines of code is written. Functions of classes are pointers to functions of different classes that may or may not be overloaded and may or may not be inherited by other classes, and so on, without end.
Are there any practical tips that may speed up my ability to read and understand large C++ projects? Is there perhaps a tutorial with such tips? Please, elaborate! :)
I've been programming professionally for some time now, and as such I have repeatedly been handed down codebases written by others before me. Understanding is never easy, especially when the code is inconsistent.
The first thing to realize, though, is that learning your way around a new codebase is not so different from re-discovering a codebase you had not touched for a while. Thus, whether it was written by your old self or by others does not matter much; and since you probably manage to cope with re-discovering codebases you had worked on before, you should be able to discover new codebases as well. Don't lose hope.
The second thing to realize is that understanding is a vague term, and there are certainly different degrees. Often, nobody asks you to understand the ins and outs completely; more likely you will be asked to understand a portion of the codebase in which either there is a bug or some new functionality should be developed. Therefore, as time passes, you will gradually gain an understanding of various portions, and you will inevitably have a deeper knowledge of the portions you worked on the most, whilst others remain relatively abstract or even completely obscure. It's okay; it's been a long time since human beings stopped trying to learn everything there was to learn.
With that said, there are several axes of understanding you can try:
you should look for architecture: a good start is to trace the library dependencies (the Makefile/project files should help here); this will give you the coarse technical blocks out of which the application is built. Executables are normally leaves of the dependency tree.
you should look for data flow: what is the trigger of the application (called directly or as a callback)? What are the steps followed by this data (roughly, just a sketch)? Do not hesitate to focus on a specific narrow use case and use the debugger to trace things, and do not try to dig too deep at first; just get a feel for things.
There are also other axes that may help you gain some understanding of the domain the application has been written for. An understanding of the domain is useful because it gives you key insight into what should happen, and it also helps you decipher the comments/function names.
user documentation: what is this used for? If you can arrange for a demo it is generally very helpful; otherwise maybe you can try playing with it yourself (in a test environment).
tests: what is tested? What is exposed to the user?
persistent data: what is serialized? What is saved in a database? Persistent data is accessed at some point, so it helps if you understand when it is read/written.
If it is a working product (that runs) and you can "debug" it, start by looking at just one particular feature.
Learn how it is working from the user's point of view (UI, behaviour, inputs, outputs, ...).
Once you know the feature from the outside, look for the code for that feature (only that feature); the starting point might be a handler for a menu item, a dialog, or a mouse/pointer event.
From there, manually trace the code for one action or sub-feature; skip deep internal libraries (treat them as black boxes for now) and learn how it works.
Once you know that section of code, dig deeper into the library APIs that were called from the upper-level code.
Take your time.
Do not try to understand everything at once.
Draw up a schematic (pen and paper) of the dependencies (stay high level; no class dependencies at the beginning).
Good luck.
The problem that you are mentioning does not have a clear and simple answer. Nevertheless, here are some tips:
At the beginning, try to remember everything, even at random: names of directories, classes, template parameters, etc. As much as you can. This sounds pointless but still makes sense.
While working with the code, always ask yourself "Have I looked at this function/parameter/etc. before?" If the answer is yes, spend more time with that piece of code. If not, just get a basic grasp and move on.
As time goes on, you will find that more and more of it sounds clear and is easier to grasp.
It is impossible to give exact estimates, because the size and complexity of projects vary greatly. Do not expect simple and immediate results.
Other points:
You definitely need a source code browser. Spend time learning how to use it. A good example is http://sourceinsight.com/ (this is not my site; I do have my own site, but I won't mention it here).
If you see a function that is called 500 times, it is 500 times more likely that knowledge of that function will be useful, compared with a function that is called only once.
The best thing is to grasp the architecture of the project. While trying to do this, remember that the project may have no architecture at all.
While studying the code, keep your task in mind. A typical situation: you need to modify something or fix a bug. If so, look for the relevant part of the code and focus your effort on it.
I will soon be shipping a paid-for static library, and I am wondering if it is possible to build in any form of copy protection to prevent developers copying the library.
Ideally, I would like to prevent the library being linked into an executable at all, if (and only if!) the library has been illegitimately copied onto the developer's machine. Is this possible?
Alternatively, it might be acceptable if applications linked to an illegitimate copy of the library simply didn't work; however, it is very important that this places no burden on the users of these applications (such as inputting a license key, using a dongle, or even requiring an Internet connection).
The library is written in C++ and targets a number of platforms including Windows and Mac.
Do I have any options?
I agree with other answers that a fool-proof protection is simply impossible. However, as a gentle nudge...
If your library is precompiled, you could discourage excessive illegitimate use by requiring custom license info in the API.
Change a function like:
jeastsy_lib::init()
to:
jeastsy_lib::init( "Licenced to Foobar Industries", "(hex string here)" );
Where the first parameter identifies the customer, and the second parameter is an MD5 or other hash of the first parameter with a salt.
When your library is purchased, you would supply both of those parameters to the customer.
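To illustrate, here is a rough sketch of what the validation inside init() might look like. This is not the answer's actual code: the salt is a placeholder, the std::string parameters are assumed, and std::hash merely stands in for a real digest such as the MD5 the answer suggests.

    // Sketch only: std::hash is not cryptographic; a real library would use
    // MD5, SHA-256 or similar, and a less guessable salt.
    #include <functional>
    #include <sstream>
    #include <stdexcept>
    #include <string>

    namespace jeastsy_lib {

    static const std::string kSalt = "vendor-secret-salt"; // placeholder

    static std::string expected_key(const std::string& customer)
    {
        std::ostringstream out;
        out << std::hex << std::hash<std::string>{}(customer + kSalt);
        return out.str();
    }

    void init(const std::string& customer, const std::string& key)
    {
        if (key != expected_key(customer))
            throw std::runtime_error("jeastsy_lib: invalid licence for " + customer);
        // ... normal initialisation continues here ...
    }

    } // namespace jeastsy_lib

The vendor runs the same expected_key() computation offline when the library is purchased and ships the resulting string together with the customer name.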
To be clear, this is an easily-averted protection for someone smart and ambitious enough. Consider it a speed bump on the path to piracy. It may convince potential customers that purchasing your software is the easiest path forward.
A C++ static library is a terribly bad redistributable.
It's a bit tangential, but IMO it should be mentioned here. There are many compiler options that need to match the caller's:
Ansi/Unicode,
static/dynamic CRT linking,
exception handling enabled/disabled,
representation of member function pointers
LTCG
Debug/Release
That's up to 64 configurations!
Also, they are not portable across platforms even if your C++ code is platform-independent - they might not even work with a future compiler version on the same platform! LTCG creates huge .lib files. So even if you can omit some of the choices, you have a huge build and distribution size, and a general PITA for the user.
That's the main reason I wouldn't consider buying anything that comes with static libraries only, much less something that adds copy protection of any sort.
Implementation ideas
I can't think of any better fundamental mechanism than Shmoopty's suggestion.
You can additionally "watermark" your builds, so that if you detect a library "in the wild" you can determine whom you sold that copy to. (However, what are you going to do? Write angry e-mails to a potentially innocent customer?) Also, this requires some effort; using an easily locatable sequence of bytes that doesn't affect execution won't help much.
You need to protect yourself against LIB "unpacker" tools. However, the linker should still be able to remove unused functions.
General thoughts
Implementing a decent protection mechanism takes great care and some creativity, and I haven't yet seen a single one that does not create additional support costs and require tough social decisions. Every hour spent on copy protection is an hour not spent improving your product. The market for C++ code isn't exactly huge, and I see a lot of work here that your customers would have to pay for.
When I buy code, I happily pay for documentation, support, source code and other signs of "future proofness". Not so much for licencing.
Ideally, I would like to prevent the library being linked into an executable at all, if (and only if!) the library has been illegitimately copied onto the developer's machine. Is this possible?
How would you determine whether your library has been "illegitimately copied" at link time?
Remembering that none of your code is running when the linker does its work.
So, given that none of your code is running, we can't do anything at compile or link time. That leaves trying to determine whether the library was illegitimately copied onto the linking machine, from a completely unrelated target machine. And I'm still not seeing any way of making the two situations distinguishable, even if you were willing to impose burdens like "requires internet access" on the end-user.
My conclusion is that fuzzy lollipop's suggestion of "make something so useful that people want to buy it" is the best way to "copy-protect" your code library.
Copy protection - and, in this case, execution protection - by definition "places a burden on the user". There is no way to get around that. The best form of copy protection is to write something so useful that people feel compelled to buy it.
You can't do what you want (perfect copy protection that places no burden on anyone except the people illegally copying the work).
There's no way for you to run code at link time with the standard linkers, so there's no way to determine at that point whether you're OK or not.
That leaves run-time, and that would mean requiring the end-users to validate somehow, which you've already determined is a non-starter.
Your only options are: ship it as-is and hope developers don't copy it too much, OR write your own linker and try to get people to use that (just in case it isn't obvious: That's not going to work. No developer in their right mind is going to buy a library that requires a special linker).
If you are planning to publish an expensive framework, you might look into using FLEXlm.
I'm not associated with them, but I have seen it in various expensive frameworks, often ones targeting Silicon Graphics hardware.
A couple of ideas... (these have some major drawbacks, though, which should be obvious)
For compile time: put the library file on a network share, and grant file permissions only to the developers you've sold it to.
For run time: compile the library to work only on certain machines, e.g. check the machine UIDs or MAC addresses or something similar (a rough sketch follows below).
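As a concrete (and, again, easily bypassed) illustration of the run-time idea, the library could compare the machine's identity against a list baked in when the customer's copy is built. Reading /etc/machine-id is a Linux-specific assumption, and the IDs below are placeholders:

    // Sketch only: a per-customer build would bake in that customer's machine IDs.
    #include <algorithm>
    #include <array>
    #include <fstream>
    #include <string>

    static bool machine_is_licensed()
    {
        static const std::array<std::string, 2> allowed = {
            "0123456789abcdef0123456789abcdef", // placeholder machine IDs
            "fedcba9876543210fedcba9876543210"
        };

        std::ifstream id_file("/etc/machine-id");
        std::string id;
        std::getline(id_file, id);

        return id_file && std::find(allowed.begin(), allowed.end(), id) != allowed.end();
    }

The library's entry points would call this and refuse to work if it returns false; anyone with a debugger can patch that check out, and you need a separate build per customer, which are the obvious drawbacks.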
I will soon be shipping a paid-for static library
The correct answer to your question is: don't bother with copy protection until you prove that you need it.
You say that you are "soon to be shipping a paid-for static library." Unless you have proven that you have people who are willing to steal your technology, implementing copy protection is irrelevant. An uneasy feeling that "there are people out there who will steal it" is not proof it will be stolen.
The hardest part of starting up a business is creating a product people will pay for. You have not yet proven that you have done that; ergo copy protection is irrelevant.
I'm not saying that your product has no value. I am saying that until you try to sell it, you will not know whether it has value or not.
And then, even if you do sell it, you will not know whether people steal it or not.
This is the difference between being a good programmer and being a good business owner.
First, prove that someone wants to steal your product. Then, if someone wants to steal it, add copy protection and keep improving your product.
I have only done this once. This was the method I used. It is far from foolproof, but I felt it was a good compromise. It is similar to the answer of Drew Dorman.
I would suggest providing an initialisation routine that requires the user to provide their email and a key linked to that email. Then have a way that anyone using the product can view the email information.
I used this method on a library that I use when writing plugins for AfterEffects. The initialisation routine builds the message shown in the "About" dialog for the plugin, and I made this message display the given email.
The advantages of this method in my eyes are:
A client is unlikely to pass on their email and key because they don't want their email associated with products they didn't write.
They could circumvent this by signing up with a burner email, but then they don't get their email associated with products they do write, so again this seems unlikely.
If a version with a burner email gets distributed, people might try it, decide they want to use it, but need a version associated with their own email - so they might buy a copy. Free advertising. You may even wish to do this yourself.
I also wanted to ensure that when I provide plugins to a company, they can't give my library to their internal programmers to write plugins themselves, based on my years of expertise. To do this I also linked the plugin name to the key. So a key will only work for a specific plugin name and developer email.
To expand on Drew's answer: to do this, you take the user's email when they sign up, tag a secret set of characters on the end, and then hash it. You give the user the hash. The secret set of characters is the same for all users and is known to your library, but the email makes the hash unique. When a user initialises the library with their email and the hash, your library appends the characters, hashes the result, and checks it against the hash the user provided. This way you do not need a custom build for every user.
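A tiny sketch of both halves of that scheme, purely for illustration; the secret suffix is a placeholder and std::hash stands in for whatever real hash you would actually use:

    // Sketch only: kSecret is a placeholder and std::hash is not a
    // cryptographic hash; the point is that the vendor and the library
    // run the same computation.
    #include <functional>
    #include <sstream>
    #include <string>

    static const std::string kSecret = "vendor-secret-suffix"; // same for all users

    // Vendor side: run once per customer and send them the resulting key.
    std::string make_key(const std::string& email)
    {
        std::ostringstream out;
        out << std::hex << std::hash<std::string>{}(email + kSecret);
        return out.str();
    }

    // Library side: called from the initialisation routine with the user's
    // email and the key they were given; no per-customer build is required.
    bool key_is_valid(const std::string& email, const std::string& key)
    {
        return key == make_key(email);
    }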
In the end I felt anything more complex than this would be futile, as someone who really wanted to crack my library would probably be better at it than I would be at defending it. This method just stops a casual pirate from easily taking my library.
I've seen a lot of discussion on here about copy protection. I am more interested in anti-reversing and IP protection.
There are solutions such as Safenet and HASP that claim to encrypt the binary, but are these protected from reversing when used with a valid key?
What kinds of strategies can be used to obfuscate code and throw off reversers? Are there any decent commercial implementations out there?
I know most protection schemes can be cracked, but the goal here is to delay the ability to reverse the software in question, and make it much more blatant if another company tries to implement these methods.
There are solutions such as Safenet and HASP that claim to encrypt the binary, but are these protected from reversing when used with a valid key?
No. A dedicated reverse engineer can decrypt it, because the operating system has to be able to decrypt it in order to run it.
Personally, I wouldn't worry. Admittedly I don't know anything about your business, but it seems to me that reverse engineering C++ is relatively difficult compared to languages like Java or .NET. That will be enough protection to see off all but the most determined attackers.
However, a determined attacker will always be able to get past whatever you implement, because at some point it has to be turned into a bunch of CPU instructions and executed. You can't prevent them from reading that.
But that's a lot of effort for a non-trivial program. It seems a lot more likely to me that somebody might just create a competitor after seeing your program in action (or even just from your marketing material). It's probably easier than trying to reverse engineer yours, and avoids any potential legal issues. That isn't something you can (or should) prevent.
Hire some of the people I've worked with over the years - they will completely obfuscate the source code!
Read this
http://discuss.joelonsoftware.com/default.asp?joel.3.598266.61
There are two main areas on this:
Obfuscation - Often means renaming and stripping symbols. Some may also rearrange code by equivalent code transformations. Executable packers also typically employ anti-debugging logic.
Lower level protection - This means kernel or hardware level programming. Seen in rootkits like Sony's, nProtect, and CD/DVD copy protection.
It's almost impossible to truly obfuscate code in such a way that it will be totally impossible to reverse engineer.
If it were possible, computer viruses would be absolutely unstoppable; no one would be able to know how they work or what they do. Until we are able to run encrypted code directly, the code is at some point decrypted and "readable" (to someone who can read machine code) before it can be executed by the CPU.
Now with that in mind, you can safely assume that cheap protection will fend off cheap hackers. Read "cheap" as in "not good"; it is totally unrelated to the price you pay. Great protection will fend off great hackers, but ultimate protection doesn't exist.
Usually, the more commercial your solution is, the more "well-known" the attack vectors are.
Also, please realise that things such as encrypted applications imply extra overhead and annoy users. USB dongles also annoy users, because they have to carry them around and they cost a fortune to replace. So it also becomes a trade-off between you being happy that you're protected against a handful of hackers, and all of your customers having to carry the hindrances your protection method imposes.
Sure, you can go to all sorts of clever lengths to attempt to defeat/delay debuggers and reverse-engineering. As others have said, you will not stop a determined attacker, period...and once your app is hacked you can expect it to be available for free online.
You state two goals of your desired protection scheme:
1) Make it hard to reverse engineer.
2) Make it blatant when somebody is ripping you off.
For #1, any obfuscator/debugging-detector/etc scheme will have at least some impact. Frankly, however, the shrinking % of engineers who have ever delved into compiler output means that compiled C/C++ code IS obfuscated code to many.
For #2, unless you have a specific and legally protected algorithm/process which you're trying to protect, once the app is reverse engineered you're sunk. If it IS legally protected you've already published the protected details, so what are you trying to gain?
In general, I think this is a hard way to "win", and you're better off fixing this on the business side - that is, make your app a subscription, or charge for maintenance/support... but the specifics are obviously dependent on your circumstances.
You need to set a limit on how far you will go to protect your code. Look at the market and what you are charging for your solution. You will never secure your product 100%, so you should evaluate which method will give you the best protection. In most cases, a simple license key and no obfuscation will suffice.
Delaying reverse engineering will only 'delay' the inevitable. What you need to focus on is deterring the initial attempt to breach copyright/IP. A good legal Terms and Conditions notice on the About page, or a bold copyright notice warning that any attempts to reverse engineer the code will result in a pick-axe through the spinal column...
Most people will back off attempting to rip something off if there is a chance they will be served some legal action.
We use SafeNet and our clients see it as 'official' protection. That in itself is a good deterrent.
Last term (August - December 2008) some classmates and I wrote an application in C++. Nothing spectacular: it is an ORM for SQLite3. We implemented some things like reflection to make it work and to free the end user from the ugly stuff. Personally, I think we did a nice job, and our ORM could actually be useful for someone (even though it's written specifically for SQLite3, it's easily adaptable to other databases).
Consequently, I've come to the conclusion that it should be published somewhere (SourceForge most likely) as an open source project. But, as it was a term project, there are some things that need to be addressed before doing that. Namely, it has some memory leaks that should be fixed, and some parts of the code could be refactored to make everyone's life easier in the future.
I would like to know more experienced C++ programmers' opinions on some issues:
Is it worth rewriting some parts to apply new technologies (for example, boost)?
Should our ORM be adapted to the latest C++ standard? Is there any benefit in doing this?
How will we know when our code is ready for release?
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e. is it worth publishing beyond personal pride as a programmer?)
Right now I can't think of many more questions, but I would like to read about similar experiences.
EDIT: I should probably translate my code + comments to English, right? (a question to myself)
Thanks in advance.
I guess I am "more experienced" with regard to your particular question. I co-developed an open source web application language & template system a lot like ColdFusion back in the early days of web design before Java or ASP were around. You can still see it at http://www.steelblue.com/ if you are interested. It's still used at the company I was at when it was developed, but I don't think anywhere else.
What I found is that unless you are already well connected and people are watching what you are doing, getting people to use your open source code is just about as hard as selling someone your closed source program. You really need to advocate for your project, and it should have some kind of unique selling proposition that distinguishes it from the competition.
So, that's the unsolicited advice. Here are some specific answers to the questions you had...all purely my opinion, of course.
I wouldn't rewrite any code unless you have a feature you want to put in. That feature might be compatibility with specific platforms or compilers. It might be support for a new DB datatype, or smarter indices, or whatever. If you are going to put some more serious work into the application, think about a roadmap of what you can realistically accomplish in the next iteration and which choices will make the app the "most better" at the end of your cycle.
Release the code as soon as it is usable for a specific purpose, any purpose. Two reasons. First off, there might be someone who wants it for that purpose right now. If it's not available, they will use something else. Also, if it's open source, they might contribute back to the project. Second, the sooner you find out how much people want to use the code, the better. Either it will be more popular than you expect and you can get excited about continuing the development....or....you will find that no one is even visiting your web page to see what you've got. In either case, better to know sooner than later what people really want from your project so you can take that into account when planning new releases.
About the "forgotten into the mists." I think most projects are. I don't want to be a downer, but looking at Wikipedia, there were 5 C++ ORM tools popular enough to get mention and they were all open source. As I said above, unless you can sell your idea to people, they are going to go with another proven open source solution. For someone to choose you over them, three things have to happen: 1. They need a feature you have that the others don't. 2. They find your project web site and it demonstrates the superiority of your code. 3. They trust your code enough to give it a shot.
On the other hand, if you are in this for the long haul and want to continue development, things get easier over time. Eventually the project will get all the basics covered and you can start developing those new features that aren't in the other solutions. Also, the longer you are in active development, the more trustworthy the project will seem. Finally, you will get more experience in the niche. Two years from now you will be better positioned to say where your effort will have the most impact on bettering the project.
A final thought: If you are enjoying it, learning from it, and it's not getting in the way of you keeping food on the table, it's a good use of your time.
Good luck!
-Al
Regarding the open source part:
If you really want to make it an open source project, you really should publish it regardless of its current state - fully working and debugged, or half working and full of memory leaks.
Just, if its state is bad, make sure to document that, and give it a suitable version number (less than one?). Then others may view your code, suggest improvements, join your team, etc...
My--rather random--thoughts on the matter (in the order I think is most important):
How will we know when our code is ready for release?
Like Liran Orevi said: if you're going open source, release early. Document it reasonably well, and take the time to provide a road map of planned or hoped-for future improvements (these are an invitation for people to help you, so note which ones have no one working on them).
Is it worth rewriting some parts to apply new technologies (for example, boost).
Should our ORM be adapted to latest C++ standard? Is there any benefit in doing this?
SQLite relies on a fairly limited base. Maybe you don't want your tool to demand a much heavier environment. If the code is not currently a tangled and unmaintainable mess, you might want to avoid boost and the newest frills. Once you have a stable release (1.0 at least) you can start thinking about the improvements that can be made for version 2.
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e is it worth publishing it beyond personal pride as a programmer?)
Most things end up in the big /dev/null in the sky, and there is only one way to find out... If it goes anywhere at all, you win. If it doesn't, it was a modest investment, and maybe you learned something while you were at it.