How to prove ownership of sourcecode? - source-code-protection

How to prove ownership of sourcecode? - source-code-protection

I am looking for some online tools to help me preventing my digital products from cloning/copying under DMCA legally.
I am a PHP/WordPress developer, selling some premium plugins on my site, I found a man who is selling the items matching with a couple of items I coded originally but I am confused how I will verify my ownership if I proceed legally?
I apologize if I posted this question on wrong place, but any help regarding my question/request will be highly appreciated.
Thanks.

Watermarking is what you need. This introduces small 'features' into the source code which you can later prove that it's yours. For example you can obviously choose distinctive variable names but these can be renamed. You can use distinctive layout i.e. whitespace characters. You can add redundant code that doesn't do anything but looks as though it does. If the copier retains this then it's pretty clear where it came from.
Additionally, you could obfuscate your code so the copier has a hard time understanding and therefore changing it. If your obfuscation is good enough then you could add code to nodelock it i.e. tie it to your domain and so then copying it to another domain will break it.

Related

How do you understand a large chunk of code?

I am a fresh college grad student that just started my job. In my ramp up period, I need to learn a lot of product code. There are some design docs but they do not help much.
Can you provide some general techniques to browse and understand huge product code (specifically C++)?

Run it through doxygen. This will generate html documentation which will be helpful even if the code does not have proper doxygen-style comments.
Another good advice is to look through the unit tests, if there are any. If there are no unit tests, a good way to understand the code is to write your own unit tests. The effort to do this will pay for itself many times over.

Use every method available to you (in no particular priority):
Use the product itself and understand what it does
Talk to the devs that maintained it or have worked with it previously
Debug through it and see how data flows and how classes interact ("when I click this button, what exactly happens, who is responsible?")
Look at architecture, UML, or class diagrams
One of my favorites: create your own diagrams of class hierarchies, class interactions, general control flow, high-level components, process/DLL interactions, object lifetimes and management
If they're not totally out-of-date, read the dev/test/user specs (goes well with #1)
Read the documentation on it
Most of all: be tenacious and persistent. If you don't put in the work, don't expect to understand it. If you don't understand something, dig and dig until you do. Software is not magic, it's just hard work :)

Some people will tell you to start with the data structures, but in a large system even that's not terribly helpful much of the time. I can think of four major points:
Take your time. Often, it's more like a whole series of gestalt shifts than it is a single, linear, gradual understanding. So be patient.
No matter how big it is, you should be able to put a breakpoint in and walk it in a debugger. Even in a large, complicated, multi-threaded system, you should be able walk through and see what's happening.
Ask for bugs, and start fixing them, no matter how crazy they seem. It's akin to dropping yourself into a foreign country; you'll pickup the language eventually.
Find a mentor. A jungle guide is invaluable.

I think there have been a few good responses already. My 2c worth...
Not sure what you class as huge (10 KLOC, 1000 KLOC, 10000 KLOC, etc), but one would hope that this is broken down in some way and is not a monolithic single program. Perhaps your management has some guidance on which 'module(s)' you are most likely to be spending time in at the moment. Hopefully this can help break down the problem scope.
Firstly, before you try to understand the code try to understand the product. What does it do? Then how does it do it? What does it interact with? Then how does it interact? etc...
When getting to the code try to understand the high level design and philosophy first, and work on the breadth before the depth. I agree with some of the above re fixing some bugs, but I also strongly suggest you continue to get a handle on the high level even if you need to get into the details to fix some bugs.
I also agree with the above in terms of generating some diagrams for yourself if you can't find any already in existence. And then share them, perhaps a team/product wiki? I'm curious as to why the existing doco does not help very much. Typically this is because this type of doco was generated from the early concepts and the product no longer bears any similarity, but if this is not the case then what can you contribute to this issue. One assumes that where you are today someone else will be in short enough order, and you are in an ideal position to know what essential doco is missing!
If the product is actually 'huge' then you have to accept that you will never be able to hold all of it in your head, so the best thing you can do is be familiar enough to know where to start looking (comes back to understanding the product, and approaching code breadth first).

This is obviously a pretty common question, and it's similar to this one (and the questions related to it): How to understand the design and code flow of any product quickly?
Dig through some of those answers / comments, for starters. Else, we'll just end up repeating them. :)

Advice for starting own wiki?

My friends and I were thinking of starting our own wiki. Given how widespread they have become recently, we heard it isn't that hard. We want to keep the site as simple as possible - we have some experience with web design, but not a whole lot with system administration. What are some things that we should keep in mind going forward (such as, which wikifarms may be useful, or what caveats should we keep in mind)?

I'm guessing from your question that you mean for personal, instead of business, use.
As Bayard implies, the key to success is the social side. For the technical side you'll need to have a server (or someone prepared to host it) and good wiki software. The most obvious choice here is MediaWiki which is well developed (features), well tested, well known (through Wikipedia) and completely free. Furthermore, it can easily be extended with a variety of new features (extensions).
Take your time making the choice of software because it is hard to change later. WikiMatrix may help here (to compare software).
However, the social side is also important. What is your topic? Why is it necessary? Could you accomplish the same with Google Docs (if it is just for friends) or do you want a wider involvement?
If you want a wider involvement (e.g. allow the public to contribute), then decide whether you will permit anonymous edits.
Now the most important: moderation. This means (1) you need clear rules (like who can delete pages and what the process is) and (2) someone (or, better, a group) to enforce those rules (the moderators). You will need to create the right balance for you in terms of being strict with the rules (encourages quality) and being flexible (encourages participation).
You will also need someone to take a lead - to encourage, support and manage the moderators and processes. This person is often called a wiki champion. Here's a good link explaining more about this role.
Final tips: be clear what should go onto the wiki and what not, stay close to your users (customers) by encouraging feedback and keep it fun for everyone!
Later addition: check out these Stack Overflow questions and answers:
Getting developers to use a wiki
Getting started with a personal wiki and moinmoin
Does it make sense to set up a wiki at the workplace?
What’s the best open source wiki platform?
Another edition: make sure the moderators create and maintain great "how to" pages for the wiki. Often they are not intuitive (especially for people used to Word). You might want to start with a "What is a wiki?" page - and then, after a brief introduction, link to a Wikipedia page all about wikis.

MindTouch has a free, open source wiki (http://www.mindtouch.com/downloads) that sounds like it would be perfect for what you're trying to do. I've used it in the past and it's super easy to get up and running and very flexible. Watch one of their demos before you make any decisions though (http://www.mindtouch.com/support_and_services/demo_videos).

The most difficult part of implementing a successful wiki tends to be social, rather than technical. Wikipatterns is a good resource which describes the challenges you're likely to encounter.

Tagging unit tests with owner considered a good idea?

I would like to know your opinion on whether it is a good idea to have developers put their name or signature on top of every test they write and why (not)?

I would say no. If you really need to know who wrote the test, you should be able to look back in your version control to see who's to blame. When people start putting their name on things, you lose the sense of collective ownership/blame, and people start to get more concerned with "their" code rather than the system as a whole.

I upvoted Andy's but I'd also add that putting the name in the code also is then something else that must be maintained. eg. Joe creates the test, but Jane changes it, is it Joe's test or Jane's test? And if Jane doesn't change the comment, you'll now go and talk to Joe about the code that Jane wrote... All too confusing. Use Blame and be done with it.

What would you do with the information?
There's no use case for having the author's name.
Generally, the information has one of two meanings.
The person's gone (gone from the company, gone from the project, or a contractor and someone who'll never be found again.)
The person's still around.
In the second case, you already knew that. Having their name in a source code file doesn't clarify the fact that they worked on this code, are still with the company and still on the project.
So, author's name has no use cases.

I favour self-explanatory test cases rather than signed tests.
Even if you know who wrote the test, and he's still working here, and he's available, you cannot be certain he remembers the reasons why he wrote this test.
Make sure the names of the test case are explicit enough. Add comments if necessary, reference bug ID, User Story, Customer ...

I think it depends on the attitude that already exists. If there are many conflicts, then removing all the names is useful, because the code stands for itself. However, if the names are put on the tests (as with code) then the developer is taking ownership.
Taking ownership is always a good thing because it encourages the developer to make it as perfect as possible. It also helps when you need to ask a question about the test, or if the test is failing, and you can't figure out why, you'll be able to ask the expert on the subject.
However, if there is a darker atmosphere, more about developers who are defensive, and are trying to undermine each other, then the names will cause them to focus on 'who made this code wrong' or 'this test failed because X coded it badly' rather than focusing on the error that the test might be detecting.
So there's always a balance when explicitly attaching names to tests like that.
And as Andy mentioned, there's always source control if you REALLY need to know who wrote something.

I think it really depends on what the rest of your culture is around code ownership. If your teams culture is that all code is owned by someone, and only that person can touch that code, then labeling whose code is whose might make sense.
However, I prefer to work on teams where there's collective ownership of code. Sometimes it's nice to have an original author on a file, just to see whose original design it was, but beyond that, I don't think tagging specific tests is useful.
As other people mentioned, if you really need to figure out who made a particular change, you can figure that out from version control. In general though, I think tests should be owned and maintained by the whole team.

Refactoring ColdFusion 5 tag-based code into CFCs

I feel the need to refactor my old CF5 based code into CFC's. We already have some code in ColdSpring and Transfer but feel a large rewrite to ColdSpring and Transfer is pointless.
What tips, approaches and gotchas will I hit.
How can I make this easy?
I don't mind keeping ColdSpring in the mix but Transfer is the bit I'm scared of with the size of the project.
edit: my code base has been going for 7-8 years and is vast. To describe it would be difficult, however I'm looking for generic suggestions on approaches

Changing the whole code base just for the sake of it if it basically works would be introducing a lot of potential bugs into your system. I don’t think there is an easy way to do it.
If you look at the areas of your site which are 1: most likely to change and 2: executed the most you may be able to target some areas which could benefit from change and see how easily they would fit into a CFC based framework, and what benefits. But for most of the code if it is working OK, there may be no pressing need to change.
However whenever you need to do a major alteration to part of the system it may be worth looking at that from an OO perspective and moving the existing code over, where applicable.

In one of my ongoing projects (almost same situation, even more -- most of code is really bad) I am using technique I'd called "wave-style". General ideas I use are following:
Splitting processing from output. I can not implement true MVC here, but at least I can move view into separate templates (sometimes re-use them) and prepare all data in basic (model) templates.
Move all repeating code into components -- this is one of most important tips.
Group related functions into components. Say, all customer-related info grouped into CustomerManager.cfc, invoices into InvoiceManager.cfc etc.
Why "wave"? In a big project I can't just sit and rewrite all customer-related code. So I have make it step by step. For example, I have to work on customer signup, extend it with few attributes. I've created basic component, moved there methods to validate form (check login, email etc.) and add customer - so this page works in new style. Lated I will need to improve invoice page, where I need to get invoice owner details: I just add method into customer manager and get rid of direct queries. Later edit customer page... Also it can be called "on demand refactoring" or smth.
There can be additional stuff relying on your current project state. But it helped me a lot. Hope you'll find these tips useful.

Before you change anything: create a full set of regression tests!
When refactoring, the goal has to be preserve functionality first, so that you don't directly affect your clients.
I agree with Sergii's wave-style refactoring also - this allows you to break things into manageable chunks rather than doing everything in one go.
But whatever method you have, the more regression tests you can create, the better - it's really the only way you can confirm you haven't unintentionally changed something.

This is extremely hard (bordering on impossible) to answer without knowing any of your code.
The question is a bit like "I want to disassemble my old Volkswagen and build a new one from the parts, what should I consider?" :-)

My advice would be to start off by encapsulating your business logic into CFCs instead of worrying about the whole presentation layer of your site.
By just concentrating on the business logic, you'll be able to get the most important functionality into CFCs and ease the maintenance nightmare. It also won't be too hard to just "drop-in" these CFCs into your existing site.
After getting as much business logic into the CFCs as you can, you'll notice that the enormous monster has been cut down to size. At that point you can now decide on what you want to do with the presentation layer of your site. You're now free to pick from a multitude of frameworks available to use (CFWheels, FuseBox, ColdBox, Mode-Glue) to port over the presentation layer.
Or you could just say "the heck with it" and rewrite the whole thing in CFWheels from the start :)

If you are not using version control get that set up before you do anything else. Being able to back out of broken refactoring is a serious life saver. After that I agree with what has been posted. You will want to take on small chunks at a time - divide and conquer.

Improving and publishing an application. Need some advice

Last term (August - December 2008) me and some class mates wrote an application in C++. Nothing spectacular, it is an ORM for Sqlite3. We implemented some stuff like reflection to make it work and release the end user from the ugly stuff. Personally, i think we made a nice job, and that our ORM could actually be useful for someone (even though its writen specifically for Sqlite3, its easily adaptable for oter databases).
Consequently, i`ve come to the conclusion that it should be published somewhere (sourceforge most likely) as an open source project. But, as it was a term project, there are some things that need to be addresesed before doing that. Namely, it has some memory leaks that should be fixed, and some parts of the code could be refactored to make everyone´s life easier in the future.
I would like to know more experienced C++ programmers opinion on some issues:
Is it worth rewriting some parts to
apply new techonologies (for example,
boost).
Should our ORM be adapted to latest
C++ standard? Is there any benefit in
doing this?
How will we know when our code is
ready for release?
What are the chances that this ORM
will be forgotten into the mists of
the internet? (i.e is it worth
publishing it beyond personal pride
as a programmer?)
Right now i can`t think of many more questions, but i would like to read on similar experiences.
EDIT: I should probably translate my code + comments to english right? (self question)
Thanks in advance.

I guess I am "more experienced" with regard to your particular question. I co-developed an open source web application language & template system a lot like ColdFusion back in the early days of web design before Java or ASP were around. You can still see it at http://www.steelblue.com/ if you are interested. It's still used at the company I was at when it was developed, but I don't think anywhere else.
What I found is that unless you are already well connected and people are watching what you are doing, getting people to use your open source code is just about as hard as selling somone your closed source program. You really need to advocate for your project and it should have some kind of unique selling proposition that distinguishes it from the compitition.
So, that's the unsolicited advice. Here are some specific answers to the questions you had...all purely my opinion, of course.
I wouldn't rewrite any code unless you have a featuer you want to put in. That feature might be compatibility with a specific platforms or compilers. It might be to support a new db datatype or smarter indicies or whatever. If you are going to put some more serious work into the applicaiton, think about a roadmap of what you can realistically accomplish in the next iteration and what choices will make the app the "most better" at the end of your cycle.
Release the code as soon as it is usable for a specific purpose, any purpose. Two reasons. First off, there might be someone who wants it for that purpose right now. If it's not available, they will use something else. Also, if it's open source, they might contribute back to the project. Second, the sooner you find out how much people want to use the code, the better. Either it will be more popular than you expect and you can get excited about continuing the development....or....you will find that no one is even visiting your web page to see what you've got. In either case, better to know sooner than later what people really want from your project so you can take that into account when planning new releases.
About the "forgotten into the mists." I think most projects are. I don't want to be a downer, but looking at Wikipedia, there were 5 C++ ORM tools popular enough to get mention and they were all open source. As I said above, unless you can sell your idea to people, they are going to go with another proven open source solution. For someone to choose you over them, three things have to happen: 1. They need a feature you have that the others don't. 2. They find your project web site and it demonstrates the superiority of your code. 3. They trust your code enough to give it a shot.
On the other hand, if you are in this for the long haul and want to continue development thigns get easier over time. Eventually the project will get all the basics covered and you can start developing those new featuers that aren't in the other solutions. Also, the longer you are in active development the more trustworthy the project will seem. Finally, you will get more experience in the nitch. 2 years from now you will be better positioned to say where your effort will have the most impact on bettering the project.
A final thought: If you are enjoying it, learning from it, and it's not getting in the way of you keeping food on the table, it's a good use of your time.
Good luck!
-Al

Regarding the open source part:
If you really want to make it an open source project, you really should publish it regardless of it's current state - fully working and debugged - or half working and full of memory leaks.
Just, if it's state is bad, make sure to document it, and give it a suitable version number (less than one?). then others may view your code, suggest improving, join your team, etc...

My--rather random--thoughts on the matter (in the order I think is most important):
How will we know when our code is ready for release?
Like Liran Orevi said: if you're going open source release early. Document it reasonable well, and take the time to provide a road map of planned or hoped for future improvements (these are a invitation for people to help you, so note which ones have no one working on them).
Is it worth rewriting some parts to apply new technologies (for example, boost).
Should our ORM be adapted to latest C++ standard? Is there any benefit in doing this?
SQLite relies on a fairly limited base. Maybe you don't want your tool to demand a much heavier environment. If the code in not currently a tangled and unmaintainable mess, you might want to avoid boost and newest frills. Once you have a stable release (1.0 at least) you can starting thinking about the improvements that can be made for version 2.
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e is it worth publishing it beyond personal pride as a programmer?)
Most things end up in the big /dev/null in the sky, and there is only one way to find out... If it goes anywhere at all, you win. If it doesn't it was a modest investment, and maybe you learned something while you were at it.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js