Serve static files from Cloudflare's R2 - django

Architecture Description
I have a Django app hosted on an Azure App Service container and proxy through Cloudflare's DNS.
The app works great, and using WhiteNoise I am able to serve static files stored in the Azure App Service storage container that is provided with the App Service (10 GB). Thing is, the storage serves files used by the Web App only (files uploaded during build, there's no option to manually add other files), and it is limited to 100GB/month of egress bandwidth.
I would like to try and use Cloudflare's R2 storage, as it has unlimited bandwidth, and allows you to upload any kind of files. I'll mainly be using images.
Question
How can static files be serve from Cloudflare's R2 on a Django app?
EDIT:
I have successfully connected to my Cloudflare's R2 bucket using [Boto3][1] but still can't link to the Django app on Azure.

I don't know how well R2 will function as a CDN from a cost/latency perspective, but as long as you stay within the free limits it's probably mostly fine (TTFB latency is going to be the biggest issue as with any object store).
We are working on making it possible to put Cache in front of R2 so that will help on the performance & cost front once that becomes available.
Unfortunately, I think SO is going to be a poor medium to debug your issue as it's unclear. Perhaps get realtime help on the R2 discord and then come back here to post the answer once you figure out your issue?
Since this sounds like tricky work, if you have a working example, we're happy to host it on https://developers.cloudflare.com/r2/examples/.

Related

Django website running too slow on Google Cloud Platform(GCP)

I am planning to migrated from DigitalOcean(DO) to Google Cloud(GCP).
I have taken trial on GCP and hosted Django website on it, but it is running too slow (it take 12-15 seconds to open page). Same website running on DO is very fast(it take hardly 1 second to open page).
Website is hosted on Ubuntu 20 LTS (on DO & GCP) & Apache2 server
On GCP there is no user right now, for testing I am only one user and it is running too slow. I have 2CPU and 8GB memory on VM.
I am not able to find out the issue why it is running slow on GCP and running fast on DO?
Can someone please help to find out solution?
When comparing Google Cloud Platform performance with the local one you should keep in mind that deploying on GCP needs more time to import all the necessary libraries and set up the Django framework.
In general it doesn't make much sense to compare the performance on your local machine with the performance on GCE, as local machines are likely running a different OS than GCE
In addition to that there are various ways to optimize your application’s performance, as the following typical ones:
Scaling configuration, by setting up “min_idle_instances” to be kept running and ready to serve traffic.
Using Warm Up Requests to reduce request and response latency during when your app's code is being loaded to a newly created instance.
I came across PageSpeed Insights, which analyzes the content of a web page, then generates suggestions to make that page faster and could be handy

Why serving static file in production, using django it's discouraged?

I have developed a web application that uses (obviously) some static files, in order to deploy it, I've chosen to serve the files with the WSGI interpreter and use for the job gunicorn behind a firewall and a reverse proxy.
My application uses whitenoise to server staticfiles: Everything works fine and I don't have any issue regarding the performances...but, really, I can't understand WHY the practice to serve those static files using directly the WSGI interpreter it's discouraged (LINK), says:
This is not suitable for production use! For some common deployment strategies...
I mean, my service it's a collection of microservices: DB-Frontend-Services-Etc...If I need to scale them, I can do this without any problem and, in addition, using this philosophy, I'm not worried about the footprint of my microservices: for me, this seems logical, but maybe, for the rest of the world this is a completely out-of-mind strategy.
You've misinterpreted that documentation. It's fine to use Whitenoise to serve static files; that is entirely what it's for. What is not a good idea is to use that internal Django function to do so, since it is inefficient.
Three reasons why I personally serve static from CDN,
1- You are using up bandwidth from your app server and loosing time getting these static files instead of throwing the load to CDN to handle all that. (WhiteNoise should though eliminate that)
2- Some hosting services like AWS will charge you for extra traffic in/out, while you can use cheaper services like Cloudfront and a S3 bucket.
3- I like to keep my app servers for app purposes only, and utilize each service for its job only, this helps me in debugging and reducing my failure points.
On the other hand though, serving static from app server with something like WhiteNoise is much much easier than configuring your CDN.
Hope this helps!
It's quite ok when you use Whitenoise because:
Whitenoise is exactly made for this purpose and therefore efficient
It'll set the HTTP response headers correctly so clients cache the files.
But think of it this way: Instead of serving 1 or 2 requests per web page, you'll often get 10x more requests (usually web pages will request a bunch of images, one or more css files, a couple of js files...). Meaning you have to scale your application server to serve 10x more traffic on average than if you leave the job to a CDN.
By the way, I've written a tutorial on this topic which may help.

Media files on Heroku

If I host a small Django website on Heroku and I am using just one dyno, is it save to upload media files on that server, or should I necessarily use AWS S3 storage to store media files? What are other alternatives for media storage?
No, it is never safe to store things on the Heroku filesystem. Even though you only have one dyno, it is still ephemeral, and can be killed at any time; for example when you push new code.
Using S3 is the way to go (alternatives are the Azure and Google offerings). There are several other advantages for using S3, mostly ability to service files without stressing your small server.
While your site is small, a dyno is very small as well, so a major advantage of S3, if used correctly, is that you can have the backing of the AWS S3 infrastructure to service the files. By "used correctly", I mean that you want to upload and service files directly to/from S3 so your server is only used for signing the S3 urls, but the actual files never go through your server.
Check https://devcenter.heroku.com/articles/s3-upload-python and http://docs.fineuploader.com/quickstart/01-getting-started.html (I strongly recommend Fine-Uploader if you can use the free version or afford the small license fee.).
Obviously, you can also just implement S3 media files in django using django-storage-redux, but that that means your server will be busy uploading files. If that's ok for your small server, then it is ok too.

Configurating Django, Heroku, and a static file server

We used to use the following combination: Django framework with Heroku as the application server and Amazon S3 as the static file server.
But recently we need to build a system which handles a large amount of video data, with data transfer more than 10 TB per month. That means Amazon S3 is no longer an option because it's too expensive.
We opt to set up our own static file server, so it's gonna be Django, Heroku, and an on-premiss file server. We need some suggestions:
Is our decision good enough? Any other options?
Is Nginx a good choice for the file server in this application?
Are there good documentations about uploading large files from a Django+Heroku application to a Nginx server?
Thanks.
1) Yes, your decision is best possible one
2) Nginx is the very best solution. Cloudflare serves traffic with Nginx more than major web apps altogether. Netflix serves 33% all US media traffic with Nginx
3) S3 as an origin is not expensive but traffic costs a lot. That should help https://coderwall.com/p/rlguog/nginx-as-proxy-for-amazon-s3-public-private-files
Large files upload should bypass any kind of backend but saved on disk asynchronous followed by upload to any destination with s separate process. For big files upload you have be careful of authentication, normally authentication happens only after file is uploaded which can be dangerous. To solve that try https://coderwall.com/p/swgfvw/nginx-direct-file-upload-without-passing-them-through-backend

Deploying WordPress on Elastic Beanstalk?

Suppose I create a site in Wordpress, which is running on Elastic Beanstalk. Now, on the running app I will create posts /pages, upload images, etc. That is, some data, videos, files and records in a database will be added to the running application.
3 questions:
If WordPress is running on Elastic Beanstalk with multiple Amazon EC2 instances actually running my WordPress install, then will those files propagate automatically to all running instances? And will this also happen, if a new EC2 instance is fired up - for example, to handle increased load?
From what I see in AWS console, I can deploy different versions of an app-- but as per scenario above, if I deploy a new version, wont I lose all the files uploaded directly into running app (i.e. files and database records)? How do I keep those and at the same time deploy a new version of my app?
The WordPress team keeps issuing upgrades. Can I directly upgrade my running WordPress install, through the web interface? Or do I have to first upgrade my local version of WordPress, and then upload the new version of the app to Beanstalk? If I have to upgrade my local version and then upload, then again I am back to point 1, i.e. changes made by users directly to the older version of running app. How do I preserve those changes?
I've been working on this as well, and have learned a couple of things that are relevant here -- your question about uploads in particular has been on my mind:
(1) The best way to handle uploads, it seems to me, is to either go the NFS/NAS route like you suggest, but one better than that is to use an Amazon S3 plugin for WordPress, so that any uploads automatically copy up to S3 and the URLs in your WordPress media library reflect the FQDN of your bucket and not your specific site. That way you could have one or ten WP nodes in your Beanstalk and media/images are independent of any one of those servers.
(2) You should absolutely be using RDS here. Few things are easier to work with and as stress-free as a Multi-AZ, reserved MySQL RDS instance. Either that or your own EC2 running MySQL that is independent of the Beanstalk, but why run that when RDS is so much easier?
(3) Yes you definitely have to commit changes to your Git repository or local file first (new plugins, changes to themes, WP upgrades) and then upload/install as a revision to the Beanstalk code. Otherwise, all the changes you make via the web interface to one node will never be in the new load for a new node -- in fact you'll have an upgraded database but an older set of code in the Beanstalk application, so it's likely to create errors of some kind or another.
I took an AWS architecture course, and their advice for EC2 and the Beanstalk is to start to think about server instances as very disposable -- so you should try to think about easy ways for your boxes to provision themselves in the bootstrapping process and to take over work for one another without any precious resources on just one box. So losing an instance should never be a big deal. (This is definitely not how we thought in the world of physical servers, where we got everything tweaked 'just so'.)
Good luck!
Well, I'm no expert, but since no one has answered, I'll give it my best shot.
You are absolutely right--kind of. While each EC2 instance does have some local storage, it is destroyed and reset with each new instance. Because of this, Amazon has things like Elastic Block Storage and S3 for persistent files. I don't know how one would configure WP to utilize this, but that will likely be the solution.
I think this problem is solved by my answer to #1. As for the database, all of your EC2 instances should be pulling from the same RDS location. Again, while you could have MySQL running on each EC2 instance, in the interest of persistence, having a separate database makes more sense.
You, again, have most everything right. Local development should always precede live deployment. Upgrading locally then pushing to the live servers will make sure all of your instances remain identical.
Truth-be-told, I am still in the process of learning all of this too, as I said, I'm not an expert. Hopefully someone else will come along and give a more informed answer. However, the key conceptual hurdle here is the idea of elastic scalability--and the salient point of this idea is the separation of elements between what is elastic/scalable/disposable and what is persistent.
Hopefully that helps.
I have deployed a small Wordpress site on EB, S3 and RDS. S3 holds all static data, such as media uploads. This works through a plugin. RDS holds the database. EB holds the latest deployed application. The application is deployed from my dev environment, with a build script. This way, I just have to press one button and I redeploy.
I wrote an article about it here: http://www.cortexcode.com/wordpress-to-aws-code-example/
While it was at first annoying to work with, the speed of AWS is nice and now it's easier than ever. It used to be that I had to upload a bunch of files over FTP, this is way more efficient. :-)
As an addition to all the great answers already:
1) I can highly recommend EFS but also S3 for media files, so they are served from high availability regions in combination with cloudfront. For Wordpress there is one plugin that really speeds up this ( not affiliated to them, just really like the plugin ). There is also an assets plugin, if you'd like to serve JS, CSS files from S3. For the EFS solution, take a look at the AWSlabs docs on git, and specifically this file on how they mount the uploads file.
In general, EBS is really great for Wordpress, but you'll need to think in a different mindset as compared to other hosting solutions ( shared hosting, managed hosting ).
OK I researched a lot on this particular issue, and this is what I learned--
(1) If a wordpress user uploads some files, then his files will be uploaded only to the virtual machine that is actually serving his request at that time. Eg if currently the wordpress site is cloud-deployed and is using 5 virtual machines, now when user makes request he is directed to one virtual machine-- the one with the lowest load at that point... His uploads are stored only on that server. Current Platform-as-a-service solutions (like Amazon Elastic Beanstalk and App Fog) do not have the ability to propagate the changes to all the running instances. Either that (propagate changes to all servers) or use a common storage by all running instances-- these are the only 2 solutions to this problem. (Eg of common storage would be all 5 running virtual machines using Network-Attached-Storage (NAS)... )
(2) With ref to platforms available currently like Amazon Elastic Beanstalk and App Fog, for example, even if user made changes directly to running app- these platforms rely on the local version of code (which the admin deployed initially to cloud)- and there is no way to update the local version of code (on admin's PC) with the changes made by a user to running app-- hence these changes viz, files are lost-- Similarly, changes in database by user to running app are also lost-- unless the admin is using exactly the same database for his local app (that he deployed to cloud)
(3) Any changes to running apps first have to be made to the local app on admin's PC and then pushed to cloud.
I am working on a Cloud PaaS that addresses all these concerns-- viz updates can be made to running apps, code changes made to running app are also updated in code repository accessible by user...The Proof of concept is ready, hopefully it will be as good as I hope it should be :) -- currently the only thing that is actually there is the website (anyacloudpanel.com) -- design work is going on :)
If there is some rule that I should not mention my website( Anya Cloud Panel) -- then I am sorry -- pls feel free to edit and remove my website URL from my answer :)
Thanks,
Arvind.
Deploying WordPress to AWS Elastic Beanstalk does require some change to the normal WordPress deployment as mentioned here a few times. To answer your questions, here is a great tutorial explaining stateless applications and how to deploy to Elastic Beanstalk:
Deploying WordPress to Amazon Web Services AWS EC2 and RDS via ElasticBeanstalk
Be careful if you use a theme from themeforest for example. Some of them are incompatible with wordpress S3 plugin. Then you're screwed, you can not deploy your wordpress on the cloud.