Is it impossible to verify that a website is using its publicly released source code in production? - web-services

This is a bit of an oddball question, and I've seen nothing similar asked anywhere on the internet.
What I want to do
I want to release the source code of my website. Beyond that, the nature of the website (described below) is such that I not only want to release the source code, but I also want users of my website to be able to unequivocally verify that they're using the exact version of the website that is in the source code dump.
The part that makes it tough is that my good will cannot be trusted. (Obviously, it can since I'm going to this length, but from the user's perspective it cannot).
My attempts (mental exercises) to fix this
Attempt 1
So, the first thing I thought about was hashing the source code, or even hashing the entire docker container it's running in, and providing an endpoint that broadcasts that hash, so that it may be matched up against the public source code.
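For illustration, such an endpoint might look something like the following minimal sketch (Python standard library only; the /source-hash route and the SOURCE_DIR path are made up for the example):

```python
# Minimal sketch of Attempt 1: serve a deterministic hash of the
# deployed source tree from an HTTP endpoint.
import hashlib
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

SOURCE_DIR = "/app/src"  # hypothetical location of the deployed source tree

def hash_source_tree(root: str) -> str:
    """Hash every file under root in a deterministic order."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()

class HashEndpoint(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/source-hash":
            body = hash_source_tree(SOURCE_DIR).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8000), HashEndpoint).serve_forever()
```

Of course, nothing stops the server from computing this hash over a pristine copy of the public source while actually running something else, which is exactly the trust problem.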
Attempt 2
The second thing I thought about was providing users with a read-only shell login so that they would be able to hash the docker image that's being run. The problem here is that there's no way to verify that the docker image is what is actually running (to my knowledge). I could just build an image from the public source code and put it there for users to hash.
Also, because of the security concern, I really hate the idea of putting users that close to production.
Attempt 3
Finally, I wondered if it would solve the problem if I used some type of blockchain technology, like a distributed app. But that's so complex, and I don't think it provides extra trust.
Why I want to do this
I am building a website that will be handling incredibly personal data. It could destroy people's lives if any of it was leaked (and no, the TLD isn't .xxx or anything like that. In fact, nothing illegal is going on with this data). However, there is an intense social stigma associated with the type of data, and in some countries it is actually enough evidence to pursue the death penalty against a user, if any data is leaked.
So, in addition to having a very explicit (and secure) Privacy Policy, I want to make an open source promise to my users, so that problems can be hunted down by volunteers and quickly eliminated. Also so that they're able to verify that I'm not adding code into the running production version to enable extra spying on them.
Is this type of thing theoretically possible?

As long as the server receives any of this incredibly personal user data in a format that it (or you) can read, there is no perfect way. You would have to let the users encrypt the data before uploading it, with a key only they know.
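A rough sketch of that approach, assuming the third-party cryptography package (pip install cryptography); upload_to_server is a hypothetical placeholder:

```python
# Client-side encryption sketch: the key never leaves the user's machine.
from cryptography.fernet import Fernet

def upload_to_server(blob: bytes) -> None:
    """Hypothetical upload; only ciphertext ever leaves the client."""
    ...

# Generated and stored locally by the user; the server never sees it.
key = Fernet.generate_key()

plaintext = b"incredibly personal data"
ciphertext = Fernet(key).encrypt(plaintext)

# The server can store and return this blob, but cannot read it.
upload_to_server(ciphertext)
```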
If there is no data transfer involved, then the users can check the code in the browser and compare hashes manually. Perhaps there is an automatic way of doing this. In any case, all work on the data then has to be performed on the client side.
The core problem is that a user has to trust the complete environment, from the source code, through the compiler, to the executing OS and hardware.
You cannot cryptographically ensure that you do not, for example, intercept the running program at a low layer and read out data there, even if you had a way to ensure that the server runs the exact published code.

"Trust", you can build by legalities and Privacy Policies.
We all willingly keep our sensitive data online in many applications/systems and we never check to see if the source code or the architecture of the system is the same as what they promised at the beginning.
As #tystackoverflow said, building a backdoor in the code is not the only way you can ensure the users of a system that their data is not accessible to anyone else. In this case your system architecture should also support encryption of data at higher level that no one (unless accessed through the system) can have direct access to it.
I do understand the risk of this data leaking. It all depends on how you design the system to be tamper-proof and secure, and on conveying the ideas behind the security measures you have built into the application to its users.
Good luck with the project!

Related

c++ solution to locking a directory - linux

A simple question: is it possible to lock a directory under Linux? What I actually need is for only one application (which I wrote) to have access to a specified directory, which is created by that application. It is basically a cache directory for the app, and so far users have been messing with it, so I wish to prevent this in future.
Is it possible to do this, and how (language: C++)?
Not possible in standard C++ at all.
Under Linux, use setuid permissions on the executable, so that it runs in the context of its owner. Then you can lock down access permissions for the directory, so it is only accessible to the owner of the executable.
Of course, this doesn't stop users using your program from messing with your cache. You need to design your program so it prevents inappropriate actions by users. And make sure the owner account (which can be set up specifically for your application) does not have more privileges than it needs.
If you trust people who have "root" permission on the system(s) that this is installed on, you can rely on having a special user or group, and use setuid and/or setgid to prevent others from tampering with the file. This does, however, mean that the installer of the software will need to have root permission, so "any" user can't install the software, which in some circumstances may not be a good solution.
The better solution is really to either have a hash of the file stored in the file [plus some constant data or some such, so that the user can't just run sha1 modified-file and get the "right hash"], or encrypt the whole file.
The problem with BOTH of these methods is that you still can't rely on them 100% - someone with enough motivation and resources will figure out what the constant data is, and just calculate a new hash with it after modifying the data in your file. And similarly, assuming your application knows how to decrypt the file, the application can be reverse engineered to find the encryption key and encryption method.
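To make the "hash plus constant data" idea concrete, here is a sketch (in Python for brevity) using a keyed hash (HMAC); the embedded SECRET is hypothetical and, as noted above, can ultimately be extracted by reverse engineering:

```python
import hashlib
import hmac

# Hypothetical constant baked into the application binary;
# a determined attacker can still extract it.
SECRET = b"constant-baked-into-the-binary"

def tag_for(data: bytes) -> str:
    # The user can't reproduce this by running plain sha1 on the
    # modified file, because they don't know SECRET.
    return hmac.new(SECRET, data, hashlib.sha1).hexdigest()

def is_untampered(data: bytes, stored_tag: str) -> bool:
    return hmac.compare_digest(tag_for(data), stored_tag)
```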
So you are fighting a "code war" (pun on cold war) against your "users".
There are commercial solutions available for license management (the one I have used is FlexLM, but there are several others too). This is a little more work, and will probably cost a little bit in license fees, but you will save yourself a whole heap of potential headaches if you use a commercial product.

Prevent piracy of desktop application which doesn't need Internet connection?

Suppose an application will never have an internet connection during its lifetime; how can you prevent piracy of the software?
A product-key requirement during installation is not enough because, once it is installed legitimately, anybody can copy the installation and redistribute it.
So every time the application runs it should check for something and crash if the check fails.
Now what could it possibly check?
Initially I thought keeping an encrypted binary file would do the job but, as answered here, that seems to offer negligible protection.
Any hacker can modify the executable so that instead of crashing when the check fails it should continue running.
So no matter how difficult the check is, the cracked application will always run.
Now I cannot see any possible solution to this problem.
PS: I am a single independent developer who is developing productivity software for a very low price. Seeing this question, I believe I just have to let it go. Sigh....
EDIT: I would like to thank all the contributors in this discussion in letting me know the grim reality...
What I understand now is that you are indirectly shipping the source code of your application in the form of the target executable. That executable can be modified by anybody using a debugger, so ANY method of preventing piracy built into your application's own code is useless. The only possible solution to this problem is to keep your legitimate customers happy by providing them services (apart from the software) and keeping your price below their expectations.
I had been thinking about solving this problem for the past 3 days, and now it all seems futile, but I still learnt a lot in the process, which I wouldn't have otherwise...
The only standalone thing I've seen that is semi-effective is hardware keys that come with the boxed software. They used to attach to a parallel port or a serial port and get checked when you started the program.
AutoCAD and similar programs used to do this, but it is a BIG PAIN for your customers. Any time the key doesn't read, or a key goes bad, customer productivity suffers. It hurts your legitimate customers far more than those who end up pirating it anyway, and a sufficiently motivated pirate can make a VM that will overcome this. Modern versions of this use USB.
My recommendation is to trust people. Upon install, make them click a "I promise I paid for this" button and be done with it. If they click "I didn't pay for this" show them a small paragraph about how to help keep good software coming and prevent customer-harming DRM schemes by simply contributing to the success of good software authors.
You could generate a unique copy for each user, create a database, and check it against copies you find online, if you like playing the biggest game of whack-a-mole ever.
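That per-user watermarking might be as simple as the following sketch (Python; all names are made up): derive a stamp from a customer ID, embed it in the shipped copy, and record it in your database.

```python
import hashlib

def watermark(customer_id: str, build_secret: bytes) -> str:
    """Derive a per-customer stamp to embed in that customer's copy."""
    return hashlib.sha256(build_secret + customer_id.encode()).hexdigest()[:16]

# Record watermark("customer-42", b"hypothetical-secret") in your database,
# then search leaked copies found online for a matching stamp.
```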

Choice of storage and caching

I hope the title is chosen well enough to ask this question.
Feel free to edit if not and please accept my apologies.
I am currently laying out an application that is interacting with the web.
Explanation of the basic flow of the program:
The user enters a UserID into my program, which is then used to access multiple XML files over the web:
http://example.org/user/userid/?xml=1
This file contains several IDs of products the user owns in a DRM system. This list is then used to access stats and information about the user's interaction with each product:
http://example.org/user/appid/stats/?xml=1
This also contains links to various images which are specific to that application. Those may change at any time and need to be downloaded for display in the app.
This is where the horror starts, at least for me :D.
1.) How do I store that information on the PC of the user?
I thought about using a directory for the UserID, then subfolders with the AppID to cache images and the XML files, loading them on demand. I also thought about using a zip file with the same structure.
Or would one rather use a local db like sqlite for that?
The average number of applications might be around 100-300, with anywhere from roughly 5 to 700 stats and images per app.
2.) When should I refresh the content?
The bad thing is that the website this data is downloaded from (or rather the XMLs) does not contain any timestamp saying when it was last refreshed/changed. So I would need to hash all the files and compare them at the moment the user accesses the data, which can take an indefinite amount of time, because it is web-based. Okay, there are timeouts, but I would need to block access to the content until the data is either downloaded and processed or the timeout occurs. In both cases the application would be inaccessible for a short, or maybe even long, time, and I want to avoid that. I could let the user refresh manually when he needs it, but I hoped there are better methods.
Especially with the above mentioned numbers of apps and stuff.
Thanks for reading and all of that and please feel free to ask if I forgot to explain something.
It's probably worth using a DB since it saves you messing around with file formats for structured data. Remember to delete and rebuild it from time to time (or make sure old stuff is thoroughly removed and compact it from time to time, but it's probably easier to start again, since it's just a cache).
If the web service gives you no clues about when to reload, then you'll just have to decide for yourself, but do be sure to check the HTTP headers for any caching instructions as well as the XML data[*]. Decide on a reasonable staleness for the data (the amount of time a user spends staring at the results is an absolute minimum, since they'll see results that stale no matter what you do). Whenever you download anything, record what date/time you downloaded it. Flush old data from the cache.
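For instance, the record-the-download-time advice might look like this minimal sketch (Python; the table layout and MAX_AGE value are made up for the example):

```python
# Tiny sqlite-backed cache keyed by URL, with a staleness check.
import sqlite3
import time
import urllib.request

MAX_AGE = 15 * 60  # treat anything older than 15 minutes as stale

db = sqlite3.connect("cache.db")
db.execute("""CREATE TABLE IF NOT EXISTS cache (
                  url TEXT PRIMARY KEY,
                  body BLOB,
                  fetched_at REAL)""")

def get(url: str) -> bytes:
    row = db.execute(
        "SELECT body, fetched_at FROM cache WHERE url = ?", (url,)
    ).fetchone()
    if row and time.time() - row[1] < MAX_AGE:
        return row[0]  # fresh enough, serve from cache
    body = urllib.request.urlopen(url).read()
    db.execute("REPLACE INTO cache (url, body, fetched_at) VALUES (?, ?, ?)",
               (url, body, time.time()))
    db.commit()
    return body
```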
To prevent long delays refreshing data, you could:
visually indicate that the data is stale, but display it anyway and replace it once you've refreshed.
tolerate staler data when the user has a lot of stuff visible than when they're just looking at a small amount of stuff. So you'll "do nothing" while waiting for a small amount of stuff, but not while waiting for a large amount.
run a background task that does nothing other than expiring old stuff out of the cache and reloading it. The main app always displays the best available, however old that is.
Or some combination of tactics.
[*] Come to think of it, if the web server is providing reasonable caching instructions, then it might be simplest to forget about any sort of storage or caching in your app. Just grab the XML files and display them, but grab them via a caching web proxy that you've integrated into your app. I don't know what proxies make this easy - you can compile Squid yourself (of course), but I don't know whether you can link it into another app without modifying it yourself.

How should a Windows 8 Metro Application connect to a central database?

How should a Windows 8 Metro application connect to a central database?
I've read about local storage, but I haven't read anything about connecting to a central database.
Obviously, this architectural design decision needs to support the disconnected scenario.
WCF web services seem to make sense.
But even if they do make sense, should we really create separate methods for all read/write operations?
Or are OData WCF services the way to go?
It seems like tablet software architecture should be able to borrow a lot from smartphone software architecture (but I am new to both).
Has Microsoft made any recommendations in its app samples?
It appears that others are asking similar questions on the Microsoft Developer Forums.
Here is what I've found:
According to Tim Heuer:
...You cannot directly have a SQL db embedded in your app or use
something like ADO.NET. This is more of an async/services
infrastructure. So if your data was exposed via services, then of
course you could connect that way. There are some other light-weight
methods you could use for local storage as well using things like the
Windows.Storage namespace (which is similar to Isolated Storage in
.NET).
Morten Nielsen agrees:
You can use HttpClient to download pretty much anything from the web.
Why don't you configure your WCF service to return data as JSON, and
use the DataContractJsonSerializer to deserialize the results?
Also, Tim Heuer cautions:
...Please note that while awesome, the SQLWinRT project on codeplex is a
wrapper to communicate with the classic SQLite engine...which uses
APIs that would not pass store validation currently.
Generic Object Storage Helper for WinRT and WinRTFile Based Database seem to have some promise.
But Daniel Stolt raises some good points:
It's awesome that there is good support for building OData clients and
other REST clients - but this only addresses the online scenario. The
"structured" part of Windows.Storage is a very limited model,
essentially limited to name/value pairs, insufficient for all but the
most basic scenarios. Yes there is local file storage, which is great
of course. But forcing every app developer out there to build her own
DBMS on top of local file storage will simply not cut it, especially
with all of System.Data having been removed from the profile. If local
file storage was sufficient for most device apps, then things like
SQLCE would have no purpose today already. And SQLCE clearly has a
purpose, and has played a very important role for occasionally
connected device apps for a very long time. There is also a tremendous
need for synchronization with a server-side database such as SQL
Azure, mostly to be able to roam data between devices. Yes there is
the roaming storage model in WinRT, but it shares the same limitations
of local storage mentioned above, and on top of this is very limited
in capacity (currently 30KB if memory serves). It is simply
insufficient for all but the simplest roaming data needs. Again,
forcing every app developer to design and implement her own
synchronization solution is very bad. You can do much better to enable
developers.
Many people are disappointed that the System.Data namespace is not supported in WinRT.
Richard Bethell said:
I don't even have words for this. This is astonishing. Leave aside for
the moment they want to force you to abstract to middleware for
database connectivity - I don't agree, but I can quasi understand a
rationale for that. I can even see pathways for developing like that.
But no System.Data.... at all? Do you even understand what you've done
to us?
What System.Data can do, outside of just having providers for Sql,
OleDb and other custom providers like Oracle, is provide a rich
abstraction of XML datasets that allow you to very quickly build a
data oriented Service Oriented Architecture.
For instance, I can easily create a web service using SOAP or WCF that
returns DataSets or DataTables, and then consume those objects easily
and directly. Being able to do this allows very rapid construction of
n-tier architectures, even without direct data connections available.
Without System.Data, and the power of DataViews, DataTables, etc. this
gets a lot harder. Sure you can custom create structs, put data in
there, and serve up structs, and use Linq to do whatever sorting,
filtering, etc. you want to do.... but it ends up being twice the
work, and makes code reuse a lot harder. And it means using our
existing service oriented architecture is impossible (without a big
overhaul.)
The withdrawal of System.Data is as big a thing for developers to deal
with as the loss of the Printer object in VB6 to vb.net 1.0 was. What
is harder to understand in this case is why it is necessary -
re-enabling it in the Metro profile can't possibly be a technical
difficulty of the product, can it?
It is valuable enough that I would seriously consider including Mono's
System.Data classes as part of any app I create (which would obviously
have to be open source.)
I think that this is another of those "it depends" questions...
The first and most obvious issue is that whether your premise ("Obviously...support...disconnected") is actually true depends very much on the context in which the application runs - if the app is an internal corporate app, then quite possibly not; in that case, no db == not work.
Secondly you could look (hmm, rash... one assumes you could look, this could be a bad assumption) at database synchronisation between a local SQL database and the remote db and so on and so forth.
Taking a step back... yes - you're absolutely right: look at it as being the same as phone or Silverlight (although I don't know if there is RIA support yet) - but the thing is, at this point it's very hard to be prescriptive, because given a general-purpose platform one can write applications to suit all sorts of purposes.
Not a hugely helpful answer really - but a start.
Having read @Jim G's answer, it seems that I should probably withdraw mine?

Are there cross-platform tools to write XSS attacks directly to the database?

I've recently found this blog entry on a tool that writes XSS attacks directly to the database. It looks like a terribly good way to scan my applications for weaknesses.
I've tried to run it on Mono, since my development platform is Linux. Unfortunately it crashes with a System.ArgumentNullException deep inside Microsoft.Practices.EnterpriseLibrary and I seem to be unable to find sufficient information about the software (it seems to be a single-shot project, with no homepage and no further development).
Is anyone aware of a similar tool? Preferably it should be:
cross-platform (Java, Python, .NET/Mono, even cross-platform C is ok)
open source (I really like being able to audit my security tools)
able to talk to a wide range of DB products (the big ones are most important: MySQL, Oracle, SQL Server, ...)
Edit: I'd like to clarify my goal: I'd like a tool that directly writes the result of a successful XSS/SQL injection attack into the database. The idea is that I want to check that every place in my app does correct output encoding. Detecting and avoiding the data getting there in the first place is an entirely different thing (and might not be possible when I display data that's written to the DB by a third-party application).
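To make the goal concrete, the kind of check I have in mind looks roughly like this sketch (Python; the table, column, and URL are hypothetical):

```python
# Write an XSS payload straight into the database, then fetch the page
# that displays it and check whether it is rendered without encoding.
import sqlite3
import urllib.request

PAYLOAD = '<script>alert("xss-probe-42")</script>'

db = sqlite3.connect("app.db")
db.execute("UPDATE users SET display_name = ? WHERE id = 1", (PAYLOAD,))
db.commit()

page = urllib.request.urlopen("http://localhost:8000/profile/1").read()
if PAYLOAD.encode() in page:
    print("output encoding missing: payload rendered verbatim")
```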
Edit 2: Corneliu Tusnea, the author of the tool I linked to above, has since released the tool as free software on codeplex: http://xssattack.codeplex.com/
I think metasploit has most of the attributes you are looking for. It may even be the only one that has all of what you specify, since all the others I can think of are closed source. There are a few existing modules that deal with XSS and one in particular that you should take a peek at: HTTP Microsoft SQL Injection Table XSS Infection. From the sounds of that module it is capable of doing exactly what you are wanting to do.
The framework is written in Ruby, I believe, and is supposed to be easy to extend with your own modules, which you may need/want to do.
I hope that helps.
http://www.metasploit.com/
Not sure if this is what you're after; it's a parameter fuzzer for HTTP/HTTPS.
I haven't used it in a while, but IIRC it acts as a proxy between you and the web application in question - it will insert XSS/SQL injection attack strings into any input fields, then deem whether the response was "interesting" or not, and thus whether the application is vulnerable or not.
http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project
From your question I'm guessing it is a type of fuzzer you're looking for, and one specifically for XSS and web applications; if I'm right - then that might help you!
It's part of the Open Web Application Security Project (OWASP) that @jah has linked you to above.
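In essence, such a fuzzer does something like the following toy sketch (not WebScarab's actual code; the target URL is made up):

```python
# Substitute attack strings into each query parameter in turn and
# flag responses that reflect the payload back unencoded.
import urllib.parse
import urllib.request

ATTACKS = ['<script>alert(1)</script>', "' OR '1'='1"]

def fuzz(url: str) -> None:
    parts = urllib.parse.urlsplit(url)
    params = urllib.parse.parse_qsl(parts.query)
    for i, (name, _) in enumerate(params):
        for attack in ATTACKS:
            mutated = list(params)
            mutated[i] = (name, attack)
            probe = parts._replace(query=urllib.parse.urlencode(mutated))
            body = urllib.request.urlopen(urllib.parse.urlunsplit(probe)).read()
            if attack.encode() in body:
                print(f"parameter {name!r} reflects {attack!r}")

fuzz("http://localhost:8000/search?q=test&page=1")
```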
There are some Firefox plugins to do some XSS testing here:
http://labs.securitycompass.com/index.php/exploit-me/
A friend of mine keeps saying that php-ids is pretty good. I haven't tried it myself, but it sounds as if it could approximately match your description:
Open Source (LGPL),
Cross Platform - PHP is not in your list, but maybe it's ok?
Detects "all sorts of XSS, SQL Injection, header injection, directory traversal, RFE/LFI, DoS and LDAP attacks" (this is from the FAQ)
Logs to databases.
I don't think there is such a tool, other than the one you pointed us to. I think there's a good reason for that: It's probably not the best way to test that each and every output is properly encoded for the applicable context.
From reading about that tool, it seems the premise is to insert random XSS vectors into the database and then browse your application to see if any of those vectors succeed. This is a rather hit-and-miss methodology, to say the least.
A much better idea, I think, would be to perform code reviews.
You may find it helpful to have a look at some of the resources available at http://owasp.org - namely the Application Security Verification Standard (ASVS), the Testing Guide and the Code Review Guide.