I am making a website in Django, and I am trying my best to make sure it is horizontally scalable.
Because the application runs on multiple instances, I cannot save user-uploaded images locally in the media folder.
I was wondering how I could save the images that users upload in a way that keeps my application horizontally scalable.
I do have a MariaDB Galera Cluster that I use to store other data, but it seems like saving images in a shared database might not be the best idea due to performance reasons (Storing Images in DB - Yea or Nay?).
If I attempt to use the media folder, are there any solutions that could sync storage (folder) between different instances of the application?
In general, what are some good practices for serving (download/upload) static content like images for horizontally scalable websites, and does Django provide anything to assist with this out of the box?
I know that this question can be mostly answered generally for any Web App, but because I am specifically using Shiny I figured that your answers may be considerably more useful.
I have made a relatively complex app. The data is not complex, but the user interface is.
I am storing the data in S3 using the aws.s3 package, and have built my app using golem. Because most shiny apps are used to analyse or enter some data, they usually deal with a couple of datasets, and a relational database is very useful and fast for that type of app.
However, my app is quite UI/UX intensive. Users can have their own/shared whiteboard space(s) where they drag around items. The coordinates of the items are stored in .rds files in my S3 bucket, for each user. They can customise many aspects of the app just for themselves: font size, colours of various experimental groups (it's a research app), and experimental visits that store .pdf, .html and .rds files.
The stored .rds files can contain variables, lists, data.frames, reactiveValues, renderUI() objects, etc., so they vary widely.
As such, I have dozens of .rds files stored in a bucket, and every time the app loads, each of these .rds files needs to be read one by one to recreate the environment appropriate for each user. The number of files/folders in directories is queried to know how many divs need to be generated for the user to click inside their files, etc.
The range of objects stored is too wide for me to use a relational database, but my app is taking at least 40 seconds to load. It is also generally slow when submitting data, mostly because the data entered often modifies many UI elements that need to be pushed to S3 again. Because I have no background in proper web development, I have no idea what the best way is to store user-related UX/UI elements and how to retrieve them seamlessly.
Could anyone please recommend appropriate resources for learning more about this?
Am I doing it completely wrong? I honestly do not know how else to store and retrieve all these R objects.
Thank you in advance for your help with the above.
There are cases in a project where I'd like to store images on a model.
For example:
Company Logos
Profile Pictures
Programming Languages
Etc.
Recently I've been using AWS S3 for file storage (primarily hosting on Heroku) via ImageField uploads.
I feel like there's a better way to store files than what I've been doing.
For some things (like the examples above) I think it would make sense to just get an image URL from a more publicly available source rather than take up space in my own database.
For the experts in the Django community who have built and deployed really professional projects: do you typically store files directly in the Django media folder via ImageField?
Or do you normally use a URLField and then pull a URL from an API or an image link from the web (e.g., go to any Google image, right-click, copy, then paste the image URL)?
Bonus: What does your image storing setup look like?
Hope this makes sense.
Thanks in advance!
The standard approach is what you've described: use something like AWS S3 to store the actual image and keep the URL in your database. Here are a few reasons why:
It's cheap. Like, really cheap.
Instead of making your web server serve the files, you offload that work to S3: the client's browser fetches the file directly from S3.
If you're on a platform with an ephemeral filesystem (like Heroku), files written to local disk do not persist, so you need external storage such as S3.
Control. Sure, you can pull an image link from somewhere else that isn't managed by you. But this does not scale. What happens if that server goes offline? What if they take that image down? This way, you control what happens to the objects.
VSCO is an example of a decently large internet company that is not large enough to run its own infrastructure (unlike Facebook/Instagram, Google, etc.). They take a decent amount of photo uploads every day and handle them with AWS.
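To make the "file in S3, reference in the database" idea concrete, here is a minimal sketch of the two pieces you would otherwise let a storage backend (django-storages is a common choice) handle for you: generating an object key and turning it into a URL. The function names, the `uploads` prefix, and the bucket and region values are illustrative assumptions, not part of Django or AWS tooling. The URL format assumes a publicly readable object addressed in S3's virtual-hosted style.

```python
import uuid
from pathlib import PurePosixPath
from urllib.parse import quote


def object_key(filename: str, prefix: str = "uploads") -> str:
    """Build a collision-resistant object key that keeps the file extension.

    A random UUID replaces the user-supplied name, so two app instances
    handling uploads at the same time cannot overwrite each other's files.
    """
    ext = PurePosixPath(filename).suffix.lower()
    return f"{prefix}/{uuid.uuid4().hex}{ext}"


def public_url(key: str, bucket: str, region: str) -> str:
    """Return the virtual-hosted-style S3 URL for a key (public objects only)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


if __name__ == "__main__":
    key = object_key("Company Logo.PNG")
    # "example-bucket" and "eu-west-1" are placeholders for illustration.
    print(public_url(key, bucket="example-bucket", region="eu-west-1"))
```

In a Django model, a function like `object_key` could be adapted into an `upload_to` callable (which receives `instance` and `filename`), so only the short key ends up in the database row.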
The problem background
I am working on a Django project that uses PostgreSQL and is hosted on Heroku (also using heroku-postgres). Over time, the amount of data has grown very large, which slows down the application.
I tried replication in order to read from multiple databases. That helped reduce the queue and connection time, but the big table is still the problem.
I can split the data by user group; different groups do not need to interact with each other.
I have read about sharding, but since we use heroku-postgres, it's hard to customize the sharding.
So I have come up with the 2 ideas below.
1. The app cluster with multi-db (Not allowed to embed image yet)
Please see the design here
We can use middleware and database routers to achieve this
But I am not sure whether this is Django-friendly
2. The gateway app with multiple sub-apps (Not allowed to embed image yet)
Please see the design here
This requires less effort than the previous design
It also makes region-based sub-apps possible in the future
My question is: which of the two is more Django-friendly and better for scalability in the long run?
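For reference, idea 1 (middleware plus a database router) can be sketched roughly as below. This is only a sketch under assumptions: the `GROUP_TO_DB` mapping, the `shard_a`/`shard_b` aliases, and the `request.user_group` attribute are hypothetical, and it assumes synchronous request handling so that a thread-local can carry the current group. The router method names `db_for_read` and `db_for_write` are the ones Django calls on classes listed in `DATABASE_ROUTERS`.

```python
import threading

# Per-thread storage for the group of the request currently being handled.
_state = threading.local()

# Hypothetical mapping from user group to an alias defined in settings.DATABASES.
GROUP_TO_DB = {"group_a": "shard_a", "group_b": "shard_b"}


def set_current_group(group):
    """Remember (or clear, with None) the user group for this thread."""
    _state.group = group


class GroupRouter:
    """Database router that sends all queries to the current group's database."""

    def _alias(self):
        group = getattr(_state, "group", None)
        # Unknown or missing groups fall back to the default database.
        return GROUP_TO_DB.get(group, "default")

    def db_for_read(self, model, **hints):
        return self._alias()

    def db_for_write(self, model, **hints):
        return self._alias()


class GroupMiddleware:
    """Middleware that sets the group before the view runs and clears it after."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # `user_group` is an assumed attribute; derive it however the app identifies groups.
        set_current_group(getattr(request, "user_group", None))
        try:
            return self.get_response(request)
        finally:
            set_current_group(None)
```

The router would be listed in `DATABASE_ROUTERS` and the middleware in `MIDDLEWARE`. Clearing the group in `finally` matters because worker threads are reused across requests.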
I have my Media Library stored as physical files. When a Sitecore user publishes an item, the files are dispersed to a number of CD servers using WebDeploy.
I would like to switch to Database storage due to some performance issues with WebDeploy, but I'm concerned that it may be too late. I have hundreds of physical Media Library files already attached to items in Sitecore.
How will Sitecore react to switching storage after the fact? Can it handle the two modes simultaneously, or must I migrate all my files into the DB?
I would make the switch. Keeping media in the database causes fewer problems and leaves fewer things to keep track of when running in a multi-server environment.
See more pros and cons here
You can very easily convert all existing media items to database media.
I have used this tool for the migration:
https://marketplace.sitecore.net/en/Modules/Media_Conversion_Tool.aspx
We have a Django app on nginx where users upload media files. The files are huge, such as 30-minute TV and radio programs of 100-300 MB, and our shared hosting limits uploads to 30 MB.
How can we embed a smart uploader that sends chunks of 20-30 MB instead of trying to upload the large file in one go? We would prefer not to break our heavily customized forms, so if there is an easy way to insert such a tool as a bulletproof widget, you're awesome.
Links, snippets, and examples are highly appreciated, and any ideas are welcome. Thanks in advance.
You should consider alternative hosting (perhaps a virtual private server), as for any serious downloads you will quickly run into the limits of your shared hosting.
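If changing hosts is not an option, the chunking idea from the question can be sketched on the server side as follows. This is a minimal sketch, not a drop-in widget: the browser-side slicing and the Django view that receives each part are left out, and `iter_chunks`, `assemble`, and the 20 MB chunk size are illustrative names and values rather than part of any library. The checksum is one way to confirm that the reassembled file matches what the client sent.

```python
import hashlib
import io
from typing import BinaryIO, Iterable, Iterator

CHUNK_SIZE = 20 * 1024 * 1024  # 20 MB, below the 30 MB limit from the question


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive pieces of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def assemble(chunks: Iterable[bytes], out: BinaryIO) -> str:
    """Write chunks to `out` in order and return the SHA-256 hex digest of the data."""
    digest = hashlib.sha256()
    for chunk in chunks:
        out.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Tiny in-memory demonstration with a 20-byte chunk size.
    source = io.BytesIO(b"x" * 50)
    target = io.BytesIO()
    checksum = assemble(iter_chunks(source, chunk_size=20), target)
    print(len(target.getvalue()), checksum)
```

In a real setup each chunk would arrive in its own request, so the server also needs an upload identifier and a chunk index to store parts until the last one arrives.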