This isn't really a technical question, but say you've developed an app for commercial use. If you get questions about the security of your app from a person who isn't technically well-versed, saying that you've taken standard security measures (encryption of passwords, protection of routes, secure database connections, etc.) won't mean much to someone who doesn't understand those terms. With that in mind, is there any way to show or prove more generally that your app is secure, e.g. a certification from AWS, that will show clients your app can be trusted?
For a security-aware client, the way to gain assurance that your software is reasonably secure is to present the secure development lifecycle (SDLC) that was in place during development and resulted in secure software, because that is really the only way to gain that assurance.
A secure SDLC includes elements like developer security awareness and education, so developers know about security issues and can avoid them. It includes feature reviews, security architecture and code reviews during development, static scanning (SAST), dynamic scanning (DAST), or more recently IAST; it also includes penetration testing and, in the case of SaaS, secure operations, configuration management, log management, and DevSecOps.
You simply cannot get this level of assurance afterwards.
You can have some elements of it, though. You can run a static scan, you can buy a penetration test, you can show how you deal with security issues, and so on. In many cases that's actually good enough, but be aware that really secure software takes more than this.
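For example, if the code base happened to be Python, the "static scan" element could be wired into CI with an off-the-shelf SAST tool such as Bandit. A minimal sketch, where the source directory and the fail-the-build policy are assumptions rather than part of any standard:

```python
# Minimal sketch: run a SAST tool (Bandit) over a Python project in CI
# and fail the build when it reports findings. Paths are illustrative.
import subprocess
import sys

def run_static_scan(source_dir: str = "src/") -> int:
    # "bandit -r <dir>" recursively scans Python sources for common issues
    result = subprocess.run(["bandit", "-r", source_dir])
    return result.returncode  # non-zero when issues are found

if __name__ == "__main__":
    sys.exit(run_static_scan())
```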
I'm struggling with the decision between a traditional backend (let's say a Django instance managing everything) and a service-oriented architecture for a web app resembling LinkedIn. What I mean by SOA is having a completely independent data access interface - let's say Ruby + Sinatra - that queries the database, an independent chat application - Twisted - which is used via its API, a Django web server that uses those APIs for serving the content, etc.
I can see the advantages of having everything in the project modularized and accessed only via APIs: scalability, testing, etc. However, wouldn't this undermine the site's performance? I imagine all modules would communicate via HTTP requests, so wouldn't this architecture add a lot of latency to basically everything in the site? Is there a better alternative than HTTP?
Secondly, regarding ease of development, would this really add much complexity for our developers, especially during the first phase until we get an MVP?
Edit: We're a small team of two devs and a designer, but we have no deadlines, so we can handle a bit of extra work if it brings more technical value.
Short answer, yes, SOA definitely trades encapsulation and reusability for latency. Long answer, it depends (as it always does) on how you do it.
How much latency affects your application depends directly on how fine-grained your services are. If you make very fine-grained services, you will have to make hundreds of sequential calls to assemble a user experience. If you make extremely coarse-grained services, you will not get any reusability out of them, as they will be too tightly coupled to your application.
There are alternatives to HTTP, but if you are going to use something customized, you need to ask yourself, why are you using services at all? Why don't you just use libraries, and avoid the network layer completely?
You are definitely adding costs and complexities to your project by starting with an API. This has to be balanced by the flexibility an API gives you. It might be a situation where you would benefit from structuring your code base internally around APIs, but invoking them as modules, or from building libraries instead of stand-alone APIs.
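A rough sketch of that middle ground in Python (all names here are hypothetical): define the service boundary as an interface, back it with an in-process implementation for the MVP, and swap in an HTTP client only if and when you actually split the deployment.

```python
# Sketch: keep the service *boundary* without paying the network cost yet.
# ProfileService is a hypothetical interface; both classes satisfy it.
from typing import Protocol

import requests  # only needed by the remote implementation


class ProfileService(Protocol):
    def get_profile(self, user_id: int) -> dict: ...


class LocalProfileService:
    """In-process implementation: a plain module call, no network hop."""

    def __init__(self, db):
        self.db = db

    def get_profile(self, user_id: int) -> dict:
        return self.db.fetch_profile(user_id)


class HttpProfileService:
    """Drop-in remote implementation, introduced only when services are split."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def get_profile(self, user_id: int) -> dict:
        resp = requests.get(f"{self.base_url}/profiles/{user_id}", timeout=2)
        resp.raise_for_status()
        return resp.json()
```

The rest of the application depends only on the interface, so the latency question becomes a deployment decision rather than a rewrite.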
A lot of this depends on how big your project is. Are you a team of 1-3 devs cranking to get out your MVP? Or are you an enterprise with 20-100 devs that all need to figure out a way to divide up a project without stepping on each other?
I am working on a web project and I want to (as far as possible) handle user data in a way that reduces damage to the users' privacy in case someone compromises our servers/databases.
Of course we only have user data that is needed for the website to do its job, but because of the nature of the project we have quite a bit of information on our users (part of the functionality is to apply for jobs and send your CV with it).
We thought about encrypting/decrypting sensitive data with a private/public keypair of which the private key is encrypted with the users password but found some security and implementation problems with that :P
The question is: how do you implement user privacy and protection against data theft on a centralised web server, using browser-compatible protocols, when the functionality requires that users can exchange sensitive data?
To give some additional insight: this project is not yet in production stage so there is still time to make things right.
We are already doing some basic stuff like:
serving https
enforcing https for sites that may handle sensitive data
hashing salted passwords
some hardening of our server and services on it
encrypted harddrives to prevent someone from reading all client information after stealing our servers / harddrives
But that's about it. Besides the password hashes, there is no mechanism that would stop, or at least hinder, someone who managed to get into (part of) the server from obtaining all data on all our users. Nor do we see a way to encrypt user data so that we ourselves cannot read it, since we need the data (we wouldn't have collected it otherwise) for parts of the website and the functionality we want it to provide.

Even if we somehow managed (maybe with some JavaScript) to have all data reach us encrypted by the client's browser, and we served the client his private key encrypted with some passphrase (for example his login password), we could not, for example, scan user-uploaded files for viruses and the like. On the other hand, client-side encryption, at least within the browser/webserver model as we imagine it, leaves some security issues (you are welcome to prove me wrong) and seems quite like reinventing the wheel. Since this project is not primarily about privacy (privacy is rather a preferable property), we might not want to reinvent the wheel for it.

I strongly believe I am not the first web developer thinking about this, am I? So what have other projects done? What have you done to try to protect your users' data?
If relevant: we are using Django and PostgreSQL for most things, and JavaScript for some UI.
The common way to deal with this issue is to split (partition) your data.
Keep minimal data on the Internet-facing web server and pass any sensitive data as quickly as possible to another server that is kept inside a second firewall. Often, data is pulled from the web server by the internal secure server, to further increase security. This is how banks and finance houses handle sensitive data from the internet (or at least how they should). There is even a set of standards (PCI DSS) covering the secure handling of credit card transactions that explains all of this in mind-numbing detail.
To further secure the internal server, you can put it on a separate network and secure physical access to it. You can also focus other security tools on it, such as Data Loss Prevention and Intrusion Prevention.
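As a very rough illustration of the pull direction (the endpoint names, authentication, and polling interval below are assumptions, not a prescription): the internal server initiates every connection, fetches pending sensitive records from the DMZ web server, and then tells it to purge them, so the DMZ host never needs a route into the internal network.

```python
# Sketch of the pull model: the *internal* server polls the DMZ web server
# for newly submitted sensitive records, then asks it to delete them.
# URLs, authentication, and payload shape are hypothetical.
import time

import requests

DMZ_BASE = "https://dmz-web.example.internal/api"  # reachable outbound only


def store_in_secure_db(record: dict) -> None:
    ...  # persist inside the secure zone, behind the second firewall


def pull_pending_records(session: requests.Session) -> None:
    resp = session.get(f"{DMZ_BASE}/pending-submissions", timeout=5)
    resp.raise_for_status()
    for record in resp.json():
        store_in_secure_db(record)
        session.delete(f"{DMZ_BASE}/pending-submissions/{record['id']}", timeout=5)


if __name__ == "__main__":
    session = requests.Session()
    while True:
        pull_pending_records(session)
        time.sleep(30)  # polling interval; a message queue would also work
```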
In addition, if you have any data that you don't need to see in the clear, use a client-side encryption library to encrypt it locally. There are still risks of course, since the user's workstation might be compromised by malware, but it removes risks during data transmission and in server storage. It also puts responsibility onto the user rather than only onto your central servers.
You already seem to be a long way ahead of most web developers in ensuring that your customers are kept safe and secure. One other small change worth considering would be to turn on enforced HTTPS for all transactions with your site. That way, there is very little chance of unexpected data leakage, such as data being unexpectedly cached.
UPDATE:
Client side encryption can help a lot since it puts the encryption responsibility on the user. Check out LastPass for example. Without doing the encryption client-side, you could never trust the service. Similarly with backup services where you set your key locally so that the backups can never be unlocked by someone on the server - they never have the key.
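For what it's worth, a minimal sketch of the scheme the question describes (deriving the encryption key from the user's password, which the server never stores) might look like the following. In a LastPass-style design this derivation runs on the client; the Python below, using the `cryptography` package, only illustrates the key-derivation and wrapping idea, and all parameter values are assumptions.

```python
# Sketch: derive an encryption key from the user's password and use it to
# encrypt sensitive data. In a LastPass-style design this runs client-side,
# so the server never sees the key; values here are purely illustrative.
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC


def derive_key(password: str, salt: bytes) -> bytes:
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=600_000)
    return base64.urlsafe_b64encode(kdf.derive(password.encode()))


salt = os.urandom(16)                      # stored alongside the ciphertext
key = derive_key("correct horse battery staple", salt)
token = Fernet(key).encrypt(b"applicant CV contents")
assert Fernet(key).decrypt(token) == b"applicant CV contents"
```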
Partitioning is one of the primary methods enterprises use to secure services that have Internet-facing components. As I said, typically the secure server PULLs data from the less secure one, so the less secure server can never have any access to anything more secure even if fully compromised. Indeed, there will be a firewall that prevents any traffic from the DMZ (where the less secure service is located) getting to the secure network. Only connections from the secure side are allowed through, and they will be tightly controlled by security processes. In a typical bank or other high-security setting, you may well find several layers like this, each with separate security controls, all partitioned from each other to enforce separation of data and security.
Hope that adds some clarity. Continue to ask if not!
UPDATE 2:
Even for simple, low-cost setups, I would still recommend partitioning. For a low-cost version, consider having two virtual servers, with the dedicated firewall replaced by careful control of the software firewall on the more secure server. Follow the same principles outlined above for everything else.
Sorry for the very involved question, but this is something I've been researching for a while now and it is really frustrating me. I feel like in today's age we have a million and one ways to implement services that are cross-platform (SOAP) and easy to build (thanks to .NET, Java, and other frameworks). However, these technologies have been in the community for 5-10 years, yet we are (or at least I am) constantly plagued with the same issues:
Identification (Tracking services) - UDDI; e.g., I've had to remind a co-worker three times this month where a service is, despite the fact that there is a wiki discussing the service and a PDF version of the same documentation in the repository where we keep our service docs.
Scalability - Out of the box clustering; As organizations, we spend a lot of money on paying our admins just to watch the utilization of our services and make decisions like, does this service need more RAM, more CPU, more interfaces? How do I load balance this?
Monitoring - error logging, etc; I can't count how many times I have to set up tracing on services in order to see why a bug is happening that only seems to affect one customer, or have to code logic into the service to serialize exceptions, log exceptions to dbs, fail gracefully, etc.
Deployment - easy to deploy; none of this deploying DLLs to 5 load balanced servers
Each one of these problems requires some type of custom solution implemented by the organization. Documentation and UDDIs for #1. Virtualization and load balancing hardware / software for #2. Tracing, writing exceptions to databases / logs, etc for #3. Custom deployment software for #4. I work for a mid-sized organization. I can't even imagine how a company the size of Sun, Google, or Microsoft would tackle these dilemmas.
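To make #3 concrete, the per-service plumbing I keep writing looks roughly like this (a sketch only; the function and field names are illustrative):

```python
# Sketch of the boilerplate behind point #3: log exceptions with enough
# context to trace a single customer's failure, then fail gracefully.
import functools
import logging

logger = logging.getLogger("service")


def service_call(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("call failed: op=%s kwargs=%r", func.__name__, kwargs)
            return {"status": "error", "detail": "temporarily unavailable"}
    return wrapper


@service_call
def get_order_history(customer_id: int):
    raise RuntimeError("backend timeout")  # simulate the one-customer bug


print(get_order_history(customer_id=42))  # logs the traceback, fails gracefully
```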
Maybe my vision is unrealistic, but I dream of having a Framework per se that lives on top of a server cluster that manages all of the above. I was ecstatic to read about Microsoft's AppFabric since it really seems to extend some of the functionality of BizTalk to WCF service implementors: Caching, Hosting, Monitoring, etc. However, from what I've seen, I still don't feel it lives up to my dream for an all-in-one solution that assists the developer and organization in writing services that are scaled across clusters easily, deployed into the cluster easily, and identifiable, possibly even version-able.
So, I don't mean this post to be about my dream. I do actually have a question. For starters, is my dream / want completely unrealistic? Furthermore, what solutions are there that attempt to solve these problems without confining us to a new and more proprietary way (BizTalk) of developing services? And lastly, with regard to a complete SOA / ESB solution, where do we see the most potential in the market right now or in the future?
I think that you are talking about different kinds of problems here.
1). Developers who don't read documentation. This is an endemic problem, not limited to SOA - just look at some questions on StackOverflow. At least the developer is asking you whether there is a service, rather than just duplicating logic in their own code. I don't see any technical solution to these kinds of problems; you've already provided good registries and documentation, but some developers prefer to talk to people. Maybe, even, this is actually a good thing - human interaction has value above the technical content of the interaction. Or maybe you're too nice: "No, I won't answer that question, look it up."
2). Scaling. There are technologies addressing this issue. (Disclaimer: I work for IBM, who sell some, so I'll reference those - I'm not intending to imply that IBM are the only vendor with solutions in this space.) There are products, such as the Tivoli stack, that can provision a new machine, install a software stack and add it to a cluster to address workload changes. Then, at a finer-grained level of control in the Java EE world, the application server can dynamically shape traffic and adjust clusters. See WebSphere Virtual Enterprise.
3). Monitoring. I don't "get" what you expect here. In all likelihood, such tricky bugs will require application-level trace. For some problems, such as finding memory leaks and performance bottlenecks, there are very good tools, at least in the Java EE world.
4). I can't speak to the .NET world, but I'd say that Java EE app servers do a reasonable job of deploying apps across clusters smoothly, and in the cases where we use JNI and need DLLs deployed, we can use products such as the Tivoli stack I mentioned to manage this.
So, in summary, I do think that vendors are trying to address these issues. And I don't think your life would be simpler without SOA. Imagine instead the same problems applied to myriad separate, independent applications.
Here's my two cents.
I've been a developer at a company that used SOA incorrectly. The worst solution they implemented was field-level validation of form elements on a desktop app using SOA. To perform acceptably, these require very low latency; a 2-4 second wait to change to a new field gets old fast. The service ran over the network on a BizTalk server. Everyone hated it.
If you're going to do this you really need to spend a lot of time dealing with network latency, service failure, timing, and timeout issues.
Don't get carried away and think SOA is the solution to every problem. Used at a high level it's great, used at a low level it makes your applications fragile, slow, and impossible to debug.
If you talk to IBM or one of the big SOA vendors, they have products that cover each scenario.
Identification (Tracking services) - UDDI; e.g., I've had to remind a co-worker three times this month where a service is, despite the fact that there is a wiki discussing the service and a PDF version of the same documentation in the repository where we keep our service docs.
Registry and Repository server. The nice thing is that it does governance (promotion, demotion, versioning, approval), and your ESB typically does a "lookup" for the latest and greatest against the registry server.
Scalability - Out of the box clustering; As organizations, we spend a lot of money on paying our admins just to watch the utilization of our services and make decisions like, does this service need more RAM, more CPU, more interfaces? How do I load balance this?
Transaction-monitoring software like IBM Tivoli Composite Application Manager for SOA. Basically, it tracks things from a horizontal point of view to see whether there is a service disruption from an end-user / end-app point of view.
As far as clustering goes... you have to pick good middleware and architecture. Personally speaking, get stuff that is "cloud" ready: app servers with NoSQL, connected by MOM.
Monitoring - error logging, etc; I can't count how many times I have to set up tracing on services in order to see why a bug is happening that only seems to affect one customer, or have to code logic into the service to serialize exceptions, log exceptions to dbs, fail gracefully, etc.
Enterprise standards for your developers and for your vendors, and integration of all business and system events into a single dashboard (most companies split them). This is already done at most enterprise shops.
Deployment - easy to deploy; none of this deploying DLLs to 5 load balanced servers
Ahh.. Microsoft IIS Web Deployment Tool 2.0. You can sync 100s of MS servers by just updating the master. It's really easy.
This question has been asked a few times on SO from what I found:
When should a web service not be used?
Web Service or DLL?
The answers helped but they were both a little pointed to a particular scenario. I wanted to get a more general thought on this.
When should a Web Service be considered over a Shared Library (DLL) and vice versa?
Library Advantages:
Native code = higher performance
Simplest thing that could possibly work
No risk of centralized service going down and impacting all consumers
Service Advantages:
Everyone gets upgrades immediately and transparently (unless a versioned API is offered)
Consumers cannot decompile the code
Can scale service hardware separately
Technology agnostic. With a shared library, consumers must utilize a compatible technology.
More secure. The UI tier can call the service which sits behind a firewall instead of directly accessing the DB.
My thought on this:
A web service was designed for machine interop and to reach an audience easily by using HTTP as the means of transport.
A strong point is that by publishing the service you are also opening its use to an audience that is potentially vast (over the web, or at least throughout the entire company) and/or largely outside of your control / influence / communication channel, and you either don't mind this or actively want it. Consuming the service is much easier: clients simply need an internet connection. Sharing a library is not always so easy (though it can be done). The usage of the service is largely open: you are making it available to whoever feels they could use it, however they feel like using it.
However, a web service is in general slower and is dependent on an internet connection.
It's in general harder to test than a code library.
It may be harder to maintain. Much of that depends on your maintenance and coding practices.
I would consider a web service if several of the above features are desired, or at least one of them is considered paramount and the downsides are acceptable or a necessary evil.
What about a Shared Library?
What if you are far more in "control" of your environment, or want to be? You know who will be using the code (the interface isn't a problem to maintain), and you don't have to worry about interop. You are in a situation where you can easily achieve sharing without a lot of work / hoops to jump through.
Examples in my mind of when to use:
You have many applications in your control all hosted on the same server or two that will use the library.
A less clear-cut example: you have many applications, but they are hosted across a dozen or so servers; a web service may be a better choice.
You are not sure who or how your code could be used but know it is of good value to many. Web Service.
You are writing something only used by a limited set of applications, perhaps some helper functions. Library.
You are writing something highly specialized that is not suited for consumption by many, such as an API for your line-of-business application that no one else will ever use. Library.
All things being equal, it would be easier to start with a shared library and turn it into a web service, but not so much vice versa.
There are many more but these are some of my thoughts on it...
Based on multiple sources...
Common Shared Library
Should provide a set of well-known operations that perform common tasks (e.g., String parsing, numerical manipulations, builders)
Should encapsulate common reusable code
Have minimal dependencies on other libraries
Provide stable interfaces
Services
Should provide reusable application-components
Provide common business services (e.g., rate-of-return calculations, performance reports, or transaction history services)
May be used to connect existing software from disparate systems or exchange data between applications
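A small illustration of that split, sketched in Python with hypothetical names: the library exposes a stable, dependency-free operation, while a service wraps comparable business logic behind a technology-agnostic interface so disparate systems can consume it.

```python
# Library side: a well-known, dependency-free operation with a stable interface.
def rate_of_return(start_value: float, end_value: float) -> float:
    """Simple holding-period return."""
    return (end_value - start_value) / start_value


# Service side: the same kind of business capability exposed over HTTP,
# so any technology stack (or external system) can consume it.
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def app(environ, start_response):
    qs = parse_qs(environ.get("QUERY_STRING", ""))
    result = rate_of_return(float(qs["start"][0]), float(qs["end"][0]))
    body = json.dumps({"rate_of_return": result}).encode()
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]


if __name__ == "__main__":
    # e.g. GET http://localhost:8000/?start=100&end=110
    make_server("", 8000, app).serve_forever()
```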
Here are 5 options and reasons to use them.
Service
has persistent state
you need to release updates often
solves major business problem and owns data related to it
need security: users can't see your code, users can't access your storage
need an agnostic interface like REST (you can auto-generate shallow REST clients for client languages easily)
need to scale separately
Library
you simply need a collection of reusable code
needs to run on client side
can't tolerate any downtime
can't tolerate even a few milliseconds of latency
simplest solution that could possibly work
need to ship code to data (high throughput or map-reduce)
First provide library. Then service if need arises.
agile approach: you start with the simplest solution, then expand
needs might evolve and become more like "Service" cases
Library that starts a local service.
many apps on the host need to connect to it and send some data to it
Neither
you can't seriously justify even the library case
business value is questionable
Ideally, if I want both sets of advantages, I'll need a portable library with the agnostic interface glue, automatically updated, with obfuscated (hard-to-decompile) code or a secure in-house environment.
Using both a web service and a library together might make that viable.
I just wrote one of my first web applications (Linux, Apache, MySQL, Django), and would like to launch it publicly. It's a webform-based task disguised as a game; I intend to eventually put it on Amazon Mechanical Turk and give small bonuses to people who achieve certain scores.
Even though this app does not have a tremendously high security risk, I need to safeguard it against manipulation and reverse engineering. However, I have little formal training in testing/security. Given that there are tangible prizes to be won, I know people will have an incentive to cheat, whether by altering POST data, pressing "back" and re-submitting data until they win, etc. So far, I have been dealing with these issues on an ad-hoc basis by putting in security tests as I think of possible exploits. However, I realize there are probably lots of forms of manipulation that I haven't thought of yet.
Can anybody recommend some reading materials from which I can learn how to protect my website against manipulation and reverse engineering?
A very good place to read up is OWASP; see http://www.owasp.org/index.php/Main_Page. They have extensive documentation regarding website security.
Edit: For a quick overview, check the "Top Ten."
The Google Browser Security Handbook has a lot of information about potential vulnerabilities in the web architecture, in particular the details that are affected by the behavior of web browsers (as opposed to server based vulnerabilities, like SQL injection attacks and the like). It is a good starting point for learning about how browsers work in ways that impact security, like how they handle cookies, cross domain requests, images and MIME types, etc.
SQL Injection
Prevent malicious users from altering SQL queries via URL query strings.
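Since the question mentions Django, a hedged sketch of the standard defence: let the driver parameterize queries (or use the ORM) instead of building SQL from user input. Table and column names here are purely illustrative.

```python
# Sketch: defend against SQL injection by parameterizing queries rather than
# interpolating user input into SQL strings. Schema names are illustrative.
from django.db import connection


def scores_for_player(player_name: str):
    # UNSAFE alternative: f"SELECT score FROM scores WHERE player = '{player_name}'"
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT score FROM scores WHERE player = %s",  # placeholder, not string formatting
            [player_name],
        )
        return cursor.fetchall()

# The Django ORM parameterizes for you, e.g.:
#   Score.objects.filter(player=player_name)
```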
DoS Attacks
Prevent users from the same IP address from accessing your site an excessive number of times in a small space of time.
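In production this is usually handled at the reverse proxy or with a shared store such as Redis, but a minimal in-process sketch of the idea (the window size and request limit are arbitrary) could look like this:

```python
# Minimal per-IP rate-limiting sketch (in-memory, single process only).
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 100
_hits = defaultdict(deque)


def allow_request(ip: str) -> bool:
    now = time.time()
    hits = _hits[ip]
    while hits and now - hits[0] > WINDOW_SECONDS:  # drop hits outside the window
        hits.popleft()
    if len(hits) >= MAX_REQUESTS:
        return False  # over the limit: reject, e.g. with HTTP 429
    hits.append(now)
    return True
```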
Password Strength
When allowing users to create their own passwords, show a password strength indicator which encourages users to enter stronger passwords.
Captcha
Stop non-human users from submitting to forms by presenting a captcha image. You may also want to use this if password authentication is failed multiple times, to prevent robots from guessing passwords.
One book I might recommend is "Security Engineering" by Ross Anderson. It's fairly detailed and it gives a good overview of many different topics relating to computer security, although not all of it is relevant for securing a website.