Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
For development, I use a local LAMP stack, for production I'm using MediaTemple's Django Container (which I'm loving BTW). MT's container uses lighthttpd. Honestly I've never had any other experience with it. I've always used Apache. I've been doing some reading:
Onlamp
TextDrive
Linux.com
Here's are questions:
What strengths does one have over the other?
Would it benefit me to use lighthttpd on my dev setup?
What's up with using both? The Linux.com article talks about using lighttpd with Apache.
The benefit of both: Apache is more powerful and extensible (useless if you don't need that power, but anyway...) and lighttpd is faster at static content. The idea is of splitting your site into static content (css, js, images, etc) and dynamic code that flows through Apache.
I'm not saying you can't do a lot with lighttpd on its own. You can and people do.
If you're using lighttpd exclusively on your production server, I would seriously consider mirroring that on your development and staging servers so you know exactly what to expect before you deploy.
For purely static web pages (.gif, .css, etc.) with n http requests from distinct ip addresses:
1. Apache: Runs n processes (with mod_perl, mod_php in memory)
2. lighttpd: Runs 1 process and 1 threads (You can assign m threads before launching it)
For purely dynamic web pages (.php, .pl) with n http requests from distinct ip addresses:
1. Apache: Runs n processes (with mod_perl, mod_php in memory)
2. lighttpd: Runs 1 lighttpd process thanks to async I/O, and runs m fast-cgi processes for each script language.
Lighttpd consumes much less memory. YouTube used to be a big user of lighttpd until it was acquired by Google. Go to its homepage for more info.
P.S. At my previous company, we used both with a load balancer to distribute the http traffic according to its url suffixes. Why not fully lighttpd? For legacy reasons.
The way you interface between the web server and Django might have even a greater impact on performance than the choice of web server software. For instance, mod_python is known to be heavy on RAM.
This question and its answers discuss other web server options as well.
I wouldn't be concerned on compatibility issues with client software (see MarkR's comment). I've had no such problems when serving Django using lighttpd and FastCGI. I'd like to see a diverse ecosystem of both server and client software. Having a good standard is better than a de facto product from a single vendor.
The answer depends on your projects goals. If it's going to be a large scale site where uptime is critical and load is hight go with lighttpd; it scales amazingly. The only downside is that you have to be more hands on initially. Most hosts won't support this and it really pays to know what you're doing with lighttpd.
If it's a site for your mother that'll get a few thousand visitors a month apache'll work better. She'll be able to move to a new host a lot easier and support is easier to find.
Use a standard web server. Apache is used by 50% of web sites (Netcraft), therefore, if you use Apache, peoples' web browsers, spiders, proxies etc, are pretty much guaranteed to work with your site (its web server anyway).
Lighthttpd is used by 1.5% of web sites (Netcraft), so it's far less likely that people will test their apps with it.
Any performance difference is likely not to matter in production; an Apache server can probably serve static requests at a much higher bandwidth than you have, on the slowest hardware you're likely to deploy in production.
Related
My friend recently asked me the following question: given that Django already has runserver, why didn't wasn't it extended to be a production-ready customer-facing HTTP server? What people do instead is set up an uwsgi server that speaks WSGI and exposes something that Nginx forwards traffic to by reverse proxying...
Based on what I know, many other languages use this pattern: there is a "simple" HTTP server meant for development, as well as an interface for *GI (ASGI/WSGI/FCGI/CGI) that web server is supposed to reverse proxy to. What is the main reason those web servers don't grow production-ready and instead assume presence of another web server?
Here are some of my theories, but I'm not sure if I'm missing something more significant:
History: dynamic websites date back to perl/PHP, both worked as a "dumb" CGI backend that was basically a filter that processed HTTP request (stdin) to a response (stdout). This architecture worked for some time and became a common pattern,
Performance: web applications are often written in languages that don't JIT and having a web server written in such a language would introduce extra overhead while milliseconds matter. Also, this lets us speed up static file serving,
Security: Django's runserver is clearly described as potentially insecure, according to this quote:
DO NOT USE THIS SERVER IN A PRODUCTION SETTING. It has not gone through security audits or performance tests. (And that’s how it’s gonna stay.
The last point seems to suggest that writing a production-ready HTTP server is too complex to fit within Django's goals, what kind of edge cases would need to be supported to get there?
Is any of the points actually valid, or am I missing the elephant in the room here?
Because they don't want to get into the web server business, and I think that's a wise decision.
Creating, developing and most importantly maintaining a web server is not a trivial thing. They couldn't simply write it once and then it's done (in fact, that's pretty much what they did and it's runserver).
Rather than re-invent the wheel, they've chosen to leave it to those who do it best. They're not likely to match the stability and functionality of a proper web server by doing it as a side-project to support running Django applications. They're better spending their time making Django better.
It's also consistent with the UNIX philosophy, but that's not necessary to get into here.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I had to "sporadically" work with Heimdal / MIT Gssapi for kerberos authentication over past couple of years. I had to build an application that was to be used as a web-service running on a Linux box, and serve client applications like browsers, running on Windows and/or Linux Desktops and Workstations. Surely not the easiest of beasts to tame. Eventually when summarizing my work, I could record that the difficulties emanated due to challenges in multiple dimensions. Getting started with gssapi programming is truly a challenge just because of poor documentation, and practically non-existant tutorials. Googling mostly results in either some theoretical discussion on what's kerberos, or leads to content written with presumption that you already know everything besides some particular semantic issue.
Some really good hacks around here contributed to help me, I therefore suppose it would be a good idea to summarize the stuff, from a developer's perspective, and share it here as some sort of a wiki, to give something back to this fantastic place, and fellow programmers.
Haven't really done a wiki like this before, and I am surely no authority on GSSAPI nor Kerberos, so please be kind, but more than that please contribute and correct my mistakes. Site Editors, I am counting on you to do your magic ;)
Getting your project completed successfully will require 3 specific things to be done correctly:
Setup of your test environment
Setup of your libraries
Your code
As I said already, such projects are beasts, just because all the three haven't been put together on the same page anywhere.
Ok So let's begin at the beginning.
Unavoidable theory for a newbie
GSSAPI helps a client application to provide credentials for a server to authoritatively identify the user. Extremely useful because the server applications can modulate their served responses if they wish to, as per the user. Very naturally therefore both - the client and the server applications must be kerberized, or as some would state kerberos-aware.
The kerberos based authentication, requires both the client and server applications, to be members of a Kerberos Realm. KDC (Kerberos Domain Controller) is the designated authority that rules the realm. Microsoft's AD servers are one of the most popularly experienced examples of a KDC, though you can of course be using a *NIX based KDC. But surely without a KDC there can be no Kerberos business at all. Desktops, Servers & workstations joined into the domain identify each other as long as all of them remain joined into the domain.
For your initial experiments, setup the client & server applications in the same realm.
Though Kerberos Authentication can surely be also used across realms by creating trusts between KDCs of these realms, or even merging keytabs from different KDCs that do not trust each other. Your code will not really need any change to accommodate such different and complex-sounding scenarios.
Kerberos Authentication basically works via "tickets (or tokens)". When a member joins the realm, the KDC "grants tokens" to each of them. These tokens are unique; time and FQDN are essential factors for these tickets.
Before you even think of the very first line of your code make sure you have got these two right:
Pitfall #1 When you setup your development and test environment, make sure everything is tested and addressed as FQDN. For example if you want to check connectivity, ping using FQDN, not IP. Needless to say therefore, they must necessarily have the same DNS service configuration.
Pitfall #2 Make sure all the host systems - that are running your KDC, client software, server software have the same time server. Time synchronization is something that one forgets, and realizes to be amiss after a lot of hair-splitting, and head-banging!
Both, the client and server applications NEED kerberos keytabs. So if your application is going to run in a *NIX host, and be a part of a Microsoft Domain, you have to get a kerberos keytab generated, before we start to look at the remaining preparatory steps for gss programming.
Step-by-Step Guide to Kerberos 5 (krb5 1.0) Interoperability at is an absolute must-read.
GSS-API Programming Guide is an excellent bookmark.
Depending upon your *NIX distribution you can install the headers & libraries for building your code. My suggestion however is to download the source and build it yourself. Yes, you might not get it right at one go, but it surely is worth the trouble.
Pitfall #3 Make sure that your application is running in an Kerberos aware environment.
I really learnt this the hard way, but maybe because I am not so smart. In my earliest stages of gssapi programming struggle, I had discovered that kerberos keytabs were absolutely necessary for making my application kerberos-aware. But I simply couldn't find anything about how to load these keytabs in my application. You know why?!! Because no such api exists!!!
Because: The application is to be run in an environment which is aware of the keytabs.
Ok, let me make this simple: Your application that is supposed to do the GSSAPI / Kerberos things has to run after you have set environment variable KRB5_KTNAME to the path where you have stored the keytabs. So either you do something like:
export KRB5_KTNAME=<path/to/your/keytab>
or make use of setenv to set KRB5_KTNAME in your application sufficiently before the very first line of your code that uses gssapi is run.
We are now ready to do the necessary things in the application's code.
I understand there are quite a few other aspects that must be reviewed by the application developer, to write and test an application. I know of a few environment variables, that can be important.
Can anybody please shed some more light upon that?
First of all please let me be clear that I am a windows user and very new to the web world. For the past months I have been learning both python and django, and it has been a great experience for me. Now I have somehow created a small project that I would like to deploy in the production server. Since django has its built-in development server there was no problem for me. But now that I have to deploy it to a production server I googled around and found Nginx + uWSGI or Nginx + Gunicorn as the best option for it. And as uWSGI and Gunicord are incompatible with Windows, I think I should adapt Ubuntu or other Unix system.
So my questions are:
Just to be clear, as I will have to work with one of the above, please explain to me why do I need two servers?
If I have to adapt the Ubuntu environment, do I have to learn Ubuntu shell scripting, SSH and other stuff? Or the hosting provider will help me do that?
Please let me be aware of what else do I need for the above concerned.
Thank you so much for your time and please pardon if my question was a lame question. Hoping for positive response answers.
A typical configuration involves two server processes (which can be run together on the same actual hardware or virtual server) so that the proxy server in front can buffer slow clients. For instance: a slow client will connect to nginx with a request. Nginx will pass the request on to Gunicorn and Gunicorn will respond. Nginx will then consume the Gunicorn response immediately, freeing up the Gunicorn resources right away. At that point, the slow client can take as much time as it wants to consume the response from Nginx without tying up much in the way of server resources. Alternatives to the two-server-process model are to use async workers with Gunicorn and put Gunicorn itself in front, or to use an async-sync combo like Waitress. Nginx in front has the added benefit of doubling as a ready-to-use statics server, though.
Note that "slow clients" can describe: mobile phones that lose their connection and leave the TCP socket hanging until timeout mid-request; mobile phones that are just slow; unreliable connections of all types; hostile denial-of-service clients who are deliberately trying to use server resources; sometimes any old connection that has a hiccup or malfunction for any reason. So this is a problem that will affect nearly any site.
You won't need shell scripting per se but getting used to Ubuntu will take some time. There is a lot to learn even outside of scripting, like how to use the package manager, how to configure packages once they're installed in ways that won't confound future updates, etc. And you will definitely have to learn to use SSH; it is one of the most fundamental server administration tools in the *nix world.
An alternative to learning to use Ubuntu or another server platform is to use a Platform-as-a-Service option like Heroku, as PaaS hosting providers really will take care of all of that stuff for you. I recommend this approach. That having been said, even though I think PaaS is a good option for people who want to focus on development and not server admin regardless of their level of skill, it's also true that a little bit of experience with Linux server platforms goes a long way in helping you to understand the environment that your code runs in. So even if you go with PaaS, you would still benefit from tinkering with Ubuntu a little (or a lot).
Another benefit from a PaaS is that normally their infrastructure handles the Nginx part of the deal (buffering of slow requests via proxy). This is the case with Heroku, for instance. So you won't have to worry about that part of the infrastructure at all.
This part of the question is too broad to answer, but let me know in the comments if you need clarification.
I'm doing it almoast like in this tutorial: http://michal.karzynski.pl/blog/2013/06/09/django-nginx-gunicorn-virtualenv-supervisor/
Nginx is my proxy to django app running on gunicorn and its serving statics, virtualenv for my python enviroment, supervisor to watch my app's running.
It's possible you will run in some error's if not using Postgresql, ask then I will help (used MySQL in the past now it's Postgresql)
Firstly, there's no need to use Ubuntu if you're happier with Windows. I don't know if nginx works on Windows, but I'd be very surprised if it doesn't (in fact, here are the nginx docs for installing on Windows). Apache, meanwhile, definitely does work on Windows. The Django documentation has a full explanation of how to set up Apache/mod_wsgi to serve Django.
You don't need two servers. I'm not sure why you think you do: the usual reason for that is to have the static assets on a separate server, but you don't mention that as a reason. Since you're only talking about a small site, though, you don't even need to do that. One server configured to serve both Django and the static assets will do fine. Again, the docs explain exactly how to do that.
What C++ software stack do developers use to create custom fast, responsive and not very resource hungry web services?
I'd recommend you to take a look on CppCMS:
http://cppcms.com
It exactly fits the situation you had described:
performance-oriented (preferably web service) software stack
for C++ web development.
It should have a low memory footprint
work on UNIX (FreeBSD) and Linux systems
perform well under high server load and be able to handle many requests with great efficiency
[as I plan to use it in a virtual environment] where resources will be to some extent limited.
So far I have only come across Staff WSF, Boost, Poco libraries. The latter two could be used to implement a custom web server...
The problem that web server is about 2% of web development there are so much stuff to handle:
web templates
sessions
cache
forms
security-security-security - which is far from being trivial
And much more, that is why you need web frameworks.
You could write an apache module, and put all your processing code in there.
Or there's CppCMS, or Treefrog or for writing web services (not web sites) use gSOAP or Apache Axis
But ultimately, there's no "easy to use framework" because C++ developers like to build apps from smaller components. There's no Ruby-style framework, but there is all manner of libraries for handling xml or whatever, and Apache offers the http protocol bits in the module spec so you can build up your app quite happily using whatever pieces make sense to you. Now whether there's a market for bundling this up to make something easier to use is another matter.
Personally, the best web app system I wrote (for a company) used a very think web layer in the web server (IIS and ASP, but this applies to any webserver, use php for example) that did nothing except act as a gateway to pass the data from the requests through to a C++ service. The C++ service could then be written completely as a normal C++ command line server with well-defined entry points, using as thin an RPC system as possible (shared memory, but you may want to check out ZeroMQ), which not only increased security but allowed us to easily scale by shifting the services to app servers and running the web servers on different hardware. It was also really easy to test.
I know that it's bad to use the Django web server in production. There's been at least one Stackoveflow question on this already.
But I'm wondering about where to draw the line between development and production? If I'm only allowing HTTP access to one (or a few) IP addresses, then I know I'm in development. What if I open it to all IP addresses, but only e-mail a couple friends to see what they think of what I've built?
As far as I can tell, the problems with using the Django server are:
It's single-threaded
Security
I don't think (1) is likely to be an issue if I'm only sharing it with a few people. For (2)--what's the worst-case scenario? Does it make a difference that I'm running on an Amazon EC2 server that I could very easily restart from a backup if something bad happened?
Well, the answer is very simple actually, you've left development when you have something you must protect: real user personal information, real data in your database that you'd be afraid to lose, etc.
Security isn't a concern until these things are present. The rule about not using the dev server in "production" is guidance, not mandatory. You can fire up the dev server in your production environment any time you want. However, you'd be silly to do so and then open up universal access to it, once your site is truly live and in use by the world.
Setting up mod_wsgi (or some other WSGI container) on a development machine takes all of 5 minutes, and can help you sort out deployment issues before you actually reach deployment. So really, why ever use the development server if you don't have to?