How can I scale up my Django app to utilize 4 cores and 8 threads?

I have a Django app. I am about to move it from a single-core system to a multi-core system (4 cores, 8 threads with hyperthreading).
I am trying to figure out the changes I need to make to utilize all of the cores and threads. I looked around and found a bunch of settings for Rails, but I cannot find similar settings for Django online.
I found an article that says I don't need to make any changes to Django itself; I should just use gunicorn to run 8 instances to utilize 8 threads, but this did not make sense to me.
What do I need to change to make sure my app utilizes 8 threads without wasting resources?

I should just use gunicorn to run 8 instances to utilize 8 threads, but this did not make sense to me.
This is the easiest way, and it definitely makes sense. You will have multiple instances of your app and the load will be balanced between them; the OS will utilize multiple cores to run the multiple processes of your app.
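As a rough sketch, gunicorn can do this with a small config file. The (2 x cores) + 1 worker formula is the starting point suggested in gunicorn's own docs, and myproject.wsgi below is a placeholder for your project:

```python
# gunicorn.conf.py -- size the worker pool from the machine's core count
import multiprocessing

# gunicorn's docs suggest (2 x num_cores) + 1 workers as a starting point,
# which works out to 9 workers on a 4-core machine
workers = multiprocessing.cpu_count() * 2 + 1
bind = "0.0.0.0:8000"
```

You would then start the app with `gunicorn -c gunicorn.conf.py myproject.wsgi:application` and benchmark from there; the right worker count ultimately depends on your workload.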

Related

Do schedulers slow down your web application?

I'm building a jewellery e-commerce store, and one feature I'm building incorporates the current market gold price into the final product price. I intend to update the gold price every 3 days by calling an API from a scheduler, so the full process is automated and requires minimal interaction with the system.
My concern is this: will a scheduler, with one task executed every 72 hrs, slow down my server (using a client-server model) and affect its performance?
The server-side application is built using Django, Django REST framework, and PostgreSQL.
The scheduler I have in mind is Advanced Python Scheduler.
As far as I can see from the docs of Advanced Python Scheduler, it does not provide a separate process to run the scheduled tasks; that is left up to you to figure out.
The docs recommend a BackgroundScheduler, which runs in a separate thread.
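For reference, a minimal sketch of that approach; fetch_gold_price is a hypothetical placeholder for the actual API call:

```python
from apscheduler.schedulers.background import BackgroundScheduler

def fetch_gold_price():
    # hypothetical placeholder: call the gold-price API and update the DB
    ...

scheduler = BackgroundScheduler()
scheduler.add_job(fetch_gold_price, "interval", days=3)
scheduler.start()  # the scheduler runs in a thread inside this same process
```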
Now there are multiple issues which could arise:
If you're running multiple Django instances (under gunicorn or uwsgi), the scheduler will run in each of those processes, so the job fires once per worker. This is a non-trivial problem to solve unless APScheduler has addressed it (you will have to check the docs).
BackgroundScheduler runs in a thread, but Python is limited by the GIL, so if your background tasks are CPU-intensive, your Django process will get slower at processing incoming requests.
Thread or not, if your background job is CPU-intensive and long-running, it can affect your server's performance.
APScheduler seems like a much lower-level library, and my recommendation would be to use something simpler:
Use a system cron job that runs every 3 days: create a Django management command and have cron execute it (see the sketch after this list).
Use Django-supported libraries like Celery, rq/rq-scheduler/django-rq, or django-background-tasks.
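A minimal sketch of the cron option; the command name update_gold_price is hypothetical:

```python
# myapp/management/commands/update_gold_price.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Fetch the current gold price and update product prices"

    def handle(self, *args, **options):
        # hypothetical placeholder: call the external API, save the new price
        self.stdout.write("gold price updated")
```

A crontab entry such as `0 4 */3 * * /path/to/venv/bin/python /path/to/manage.py update_gold_price` (the paths are placeholders) would then run it roughly every third day, entirely outside your web workers.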
I think it would be wise to take a look at https://github.com/arteria/django-background-tasks, as it is the simplest of all with the least amount of setup required (a minimal sketch follows). Once you get a bit familiar with it you can weigh the pros and cons of what is appropriate for your use case.
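A hedged sketch of what django-background-tasks usage looks like, assuming its documented @background decorator and repeat parameter; update_gold_price is again a placeholder:

```python
from background_task import background

@background(schedule=60)  # first run scheduled one minute from now
def update_gold_price():
    # hypothetical placeholder: fetch the price and update products
    ...

# Enqueue a repeating task; repeat is given in seconds (3 days here).
update_gold_price(repeat=60 * 60 * 24 * 3)
```

A separate worker started with `python manage.py process_tasks` picks the task up, so your web processes are not blocked by it.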
Once again, your server's performance depends on what your background task is doing and how long it runs.

How to start building a django website with postgres nginx and gunicorn on separate machines?

I searched all over the internet and I guess I'm lost. Please, I want a detailed answer here.
I know how to make all three of these things (gunicorn, Nginx, and Postgres) work on the same machine, but what if I want to have each of them on its own individual machine?
Let's say I have 3 behemoth, powerful machines capable of handling anything, but the real question in my mind is where do I start. How can I connect all three of these together? What if I want to add more storage? How do I connect multiple database machines together?
I'm sorry, but I'm pretty new to networking, and it's really crucial for me to understand this, because if I don't, my mind won't allow me to go ahead with the process.

Clojure background process

Let's say I'm making a crawler / scraper in Clojure, and I want it to run periodically (at predefined times of day).
I want to define my jobs with Quartz / Quartzite (at least that seems to be the most robust solution).
Now, to create a daemon process with Clojure, I tried the lein-daemon plugin, but it seems that it is a pretty risky endeavour, since the plugin seems more than a bit buggy (or I am making some heavy mistakes).
What is the best way for me to create this service?
I want it to be able to restart itself upon system reboot, but I want to use Clojure (Quartzite) for my jobs (loading them from a database, etc.).
What is the robust - but Clojure-y - way to create a long-running daemon process?
EDIT:
The deployment environment will be something like a single VPS or a dedicated server.
There may be a dozen jobs loading their parameters from some data store, running anywhere from 1 - 8 times a day (or maybe more).
The correct process depends a lot on your environment. I work on deployment systems for complex web/mobile infrastructure with many long-running Clojure processes. For this we use Pallet to create instances with the code checked out and configured, and then we have a function that generates init scripts to start the services at boot. This process is appropriate for environments where you need a repeatable build on a cloud provider, so it may be too heavy for your situation.
If you are looking for simple recurring jobs, you may want to look into Immutant, which is an application server for Clojure with good support for recurring jobs.

How to test the real-world performance of a social application without owning thousands of computers?

I'm managing a prototypal design for an open-world social game which is supposed to replace a server that is having scaling issues. How can I test its behavior in real-world situations - that is, hundreds of clients connected from different places - without having hundreds of clients to connect to it?
There are several answers for that.
Professional load testing applications can do that; see HP LoadRunner, for instance.
Open-source load testing applications - see JMeter.
Tweak other applications to do that. In my own case I am using Selenium Grid, where with just 10 computers I can simulate 100 different users.
The general answer is: if you do not have hundreds of real users to do the real clicking, you will have to script user behaviour somehow and run those scripts in parallel from a few computers, as in the sketch below.
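For illustration, a toy version of such a script; the target URL and the single-GET "behaviour" are placeholders for whatever your clients actually do:

```python
# Simulates N concurrent "users" hitting one endpoint and reports latencies.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET_URL = "http://your-game-server.example/api/ping"  # hypothetical endpoint
USERS = 100

def simulate_user(user_id):
    start = time.time()
    response = requests.get(TARGET_URL, timeout=10)
    return user_id, response.status_code, time.time() - start

with ThreadPoolExecutor(max_workers=USERS) as pool:
    for user_id, status, elapsed in pool.map(simulate_user, range(USERS)):
        print(f"user {user_id}: HTTP {status} in {elapsed:.2f}s")
```

Real tools like JMeter layer ramp-up schedules, assertions, and reporting on top of this same basic idea.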
In addition to (or combined with) the methods mentioned by @PavelJanicek, you can also manually limit the resources available to your application (memory, CPU time slots and priority) - this will allow the application to reach its limits with far fewer users.

Best practice: Multiple django applications on a single Amazon EC2 instance

I've been running a single Django application on Amazon EC2, using gunicorn to serve the Django portion and nginx for the static files.
I'm going to be starting a new project soon, and I'm wondering which of the following options would be better:
A larger Amazon EC2 instance (Medium) running multiple Django applications
Multiple smaller EC2 instances (Small/Micro), each running its own Django application?
Does anybody have any experience with this? What would be the relevant performance metrics I could measure to get a good cost-to-performance ratio?
The answer to this question really depends on your app, I'm afraid. You need to benchmark to be sure you are running on the right instance type. Some key metrics to watch are:
CPU
Memory usage
Requests per second, per instance size
App startup time
You will also need to tweak your nginx/gunicorn settings to make sure you are running with a configuration that is optimised for your instance size.
If costs are a factor for you, one interesting metric is "cost per ten thousand requests", i.e. how much are you paying per 10,000 requests on each instance type? A worked example follows.
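For illustration, here is how that metric falls out of hourly price and measured throughput; the prices and requests-per-second figures below are made up, not real EC2 numbers:

```python
# Toy comparison of "cost per 10,000 requests" across instance sizes.
# Hourly prices and throughput are illustrative placeholders only.
instances = {
    "small":  {"usd_per_hour": 0.023, "requests_per_sec": 40},
    "medium": {"usd_per_hour": 0.046, "requests_per_sec": 90},
}

for name, spec in instances.items():
    requests_per_hour = spec["requests_per_sec"] * 3600
    cost_per_10k = spec["usd_per_hour"] / requests_per_hour * 10_000
    print(f"{name}: ${cost_per_10k:.4f} per 10,000 requests")
```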
I agree with Mike Ryan's answer. I would add that you also have to evaluate whether your app needs a separate database. Sometimes it makes sense to isolate large or complex applications with their own database, which makes changes and maintenance easier; it also reduces your risk in case something goes wrong, since not all of your user base would be affected by an outage. You might want to create a separate instance for those applications. Note: Django supports multiple databases in one project, but, again, that increases the complexity of changes and maintenance.