Storing large encrypted files on Heroku - facebook-graph-api

We have developed a Facebook application that runs on Heroku. The application generates encrypted text that needs to be stored quickly. Currently, the text is simply written to a text file on the Heroku server, which is not a scalable solution.
The data will eventually be downloaded to our local machines, but we need reliable intermediate storage between the app and the local machines because we cannot download the data quickly enough at our end.
Would you recommend S3 for this purpose? Any alternatives?

+1 to S3. Heroku's filesystem is read-only on older stacks and ephemeral on current ones, so you cannot rely on it for storage and will probably have to look for a third-party solution.

Yup, I would recommend S3. Very reliable.

Worth noting that S3 offers server-side encryption. In fact, the AWS SDK Ruby gem has built-in support for client-side encryption as well.
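For illustration, here is a minimal Python sketch of storing the already-encrypted text in S3 while also asking S3 to encrypt it at rest. It uses boto3 rather than the Ruby gem mentioned above, and the function name, bucket and key are made up for the example. The client is passed in, so the function itself does not need AWS credentials to be imported.

```python
def upload_encrypted_text(s3_client, bucket, key, ciphertext):
    """Store already-encrypted bytes in S3 and request server-side encryption.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    return s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=ciphertext,
        ServerSideEncryption="AES256",  # SSE-S3: S3 encrypts the object at rest
    )


# Hypothetical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   upload_encrypted_text(boto3.client("s3"), "my-app-dumps", "2024/dump-1.enc", data)
```

Because the application already encrypts the text itself, the server-side setting is an extra layer, not a replacement for it.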

Related

AWS Amplify Storage | Upload large file

Using AWS Amplify Storage, uploading a file to AWS S3 should be simple:
Storage.put(key, blob, options)
The above works without problems for smaller files (no larger than around 4 MB).
Uploading anything larger, e.g. a 25 MB video, does not work: Storage just freezes (the app does not freeze, only Storage). No error is returned.
Question: How can I upload larger files using AWS Amplify Storage?
Side note: Described behaviour appears both on Android and iOS.
Amplify now automatically segments large files into 5 MB chunks and uploads them using the Amazon S3 multipart upload process:
https://aws.amazon.com/about-aws/whats-new/2021/10/aws-amplify-javascript-file-uploads-storage/
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#mpu-process
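To make the chunking idea concrete, here is a rough Python sketch of how a file size can be split into numbered parts for a multipart upload. This is not Amplify's actual code; the function name is invented for the example. It only relies on the documented S3 rule that every part except the last must be at least 5 MiB.

```python
PART_SIZE = 5 * 1024 * 1024  # 5 MiB: S3's minimum size for every part except the last


def plan_parts(total_size, part_size=PART_SIZE):
    """Return a list of (part_number, offset, length) tuples covering total_size bytes."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)  # the last part may be shorter
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

Each tuple tells the uploader which byte range to read and which part number to send it under, so no more than one part has to be held in memory at a time.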
After updating to
"aws-amplify": "^4.3.11",
"aws-amplify-react-native": "^6.0.2"
uploads over 100 MB no longer freeze the UI, and we also migrated to resumable uploads. With the older version, "aws-amplify": "^3.1.1", the problems you mentioned were present.
Here is the pull request from December 2021 with the fixes mentioned:
https://github.com/aws-amplify/amplify-js/pull/8336
So the solution is really to upgrade the AWS Amplify library.
However, this approach works only on iOS.
Uploading big media files on Android results in a network error when calling fetch (a required step before calling the Storage.put method).
Although the same method works perfectly on the web, uploading big files in React Native was, and still is, not implemented optimally, bearing in mind that the whole file has to be loaded into memory using fetch().

Media files on Heroku

If I host a small Django website on Heroku and I am using just one dyno, is it safe to upload media files to that server, or do I have to use AWS S3 to store media files? What are the other alternatives for media storage?
No, it is never safe to store things on the Heroku filesystem. Even though you only have one dyno, it is still ephemeral, and can be killed at any time; for example when you push new code.
Using S3 is the way to go (alternatives are the Azure and Google offerings). There are several other advantages to using S3, mainly the ability to serve files without stressing your small server.
While your site is small, a dyno is very small as well, so a major advantage of S3, if used correctly, is that the AWS S3 infrastructure serves the files for you. By "used correctly", I mean that you upload and serve files directly to and from S3, so your server is only used for signing the S3 URLs and the actual files never go through your server.
Check https://devcenter.heroku.com/articles/s3-upload-python and http://docs.fineuploader.com/quickstart/01-getting-started.html (I strongly recommend Fine-Uploader if you can use the free version or afford the small license fee).
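As a rough idea of what "only signing the S3 URLs" can look like on the server, here is a small sketch built on boto3's generate_presigned_url. The helper name, bucket and key are hypothetical, and the client is passed in so that the snippet stays self-contained.

```python
def sign_upload_url(s3_client, bucket, key, expires_in=300):
    """Return a short-lived URL that the browser can PUT the file to directly.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


# Hypothetical usage inside a Django view (requires boto3 and AWS credentials):
#   import boto3
#   url = sign_upload_url(boto3.client("s3"), "my-media-bucket", "uploads/photo.jpg")
#   return JsonResponse({"upload_url": url})
```

The dyno only produces the URL; the file bytes travel straight from the browser to S3.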
Obviously, you can also just implement S3 media files in Django using django-storages-redux, but that means your server will be busy uploading files. If that's OK for your small server, then that works too.

Configuring Django, Heroku, and a static file server

We used to use the following combination: the Django framework with Heroku as the application server and Amazon S3 as the static file server.
But recently we needed to build a system that handles a large amount of video data, with data transfer of more than 10 TB per month. That means Amazon S3 is no longer an option because it's too expensive.
We have opted to set up our own static file server, so it's going to be Django, Heroku, and an on-premises file server. We need some suggestions:
Is our decision good enough? Any other options?
Is Nginx a good choice for the file server in this application?
Is there good documentation about uploading large files from a Django+Heroku application to an Nginx server?
Thanks.
1) Yes, your decision is the best possible one.
2) Nginx is the very best solution. Cloudflare serves more traffic with Nginx than the major web apps combined. Netflix serves 33% of all US media traffic with Nginx.
3) S3 as an origin is not expensive, but the traffic costs a lot. This should help: https://coderwall.com/p/rlguog/nginx-as-proxy-for-amazon-s3-public-private-files
Large file uploads should bypass any kind of backend and be saved to disk, then uploaded asynchronously to their destination by a separate process. With big file uploads you have to be careful about authentication: normally authentication happens only after the file has been uploaded, which can be dangerous. To solve that, try https://coderwall.com/p/swgfvw/nginx-direct-file-upload-without-passing-them-through-backend
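One possible way to authenticate before the bytes arrive is to have the app hand out a short-lived signed token that the upload endpoint checks first. The sketch below is only an assumption about how that could look, not what the linked article does; the secret, the token format and both function names are invented for the example.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical shared secret


def _sign(message, secret):
    """Return the hex HMAC-SHA256 of message under secret."""
    return hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()


def issue_upload_token(user_id, expires_at, secret=SECRET_KEY):
    """Issued by the app after login: 'user_id:expires_at:signature'."""
    return f"{user_id}:{expires_at}:{_sign(f'{user_id}:{expires_at}', secret)}"


def verify_upload_token(token, now=None, secret=SECRET_KEY):
    """Checked by the upload endpoint before it accepts any file data."""
    now = time.time() if now is None else now
    try:
        user_id, expires_at, signature = token.rsplit(":", 2)
        expires = int(expires_at)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{user_id}:{expires_at}", secret)
    # Constant-time comparison on bytes to avoid timing leaks.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now < expires
```

A request carrying a missing, expired or tampered token can then be rejected before the server commits disk space or bandwidth to a multi-gigabyte body.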

Using FFmpeg to encode audio once uploaded, using Django, VPS (Linux) and Amazon S3

My project will require users to upload uncompressed WAV audio files, and once they do, the server will need to encode them as MP3 to serve them on the site. I'm using Django for this project and it'll be hosted on a Linux VPS (from Linode). Due to space and bandwidth constraints, I want to use Amazon S3.
I'm not an expert at this stuff, and this project will cover many new things for me, so any guidance would be a great help.
I will most probably be using the django-storages app to talk to Amazon S3. But I'm not sure at what point I would run the server command for FFmpeg to do its conversions. If a user uploads an audio file, django-storages will place it on Amazon S3. But then where, and how, do I get FFmpeg to run its command line on the file just uploaded, encode it to MP3, and have my website serve and use that MP3 (which should at that point also be on Amazon S3)?
I'm a little confused about how to go about it. Like I say, I'm not an expert! Could anyone guide me on this?
You might consider writing a custom storage backend. This should be pluggable into django-storages, but I've never used the app and can't say for sure. You can find some guidance on writing custom storage backends here: http://docs.djangoproject.com/en/dev/howto/custom-file-storage/
In your backend, you can use Python's subprocess module to run ffmpeg to handle the MP3 conversion: http://docs.python.org/library/subprocess.html#subprocess.call
Maybe don't use django-storages for such files. You can convert the audio to a temporary MP3 file on the server (Linux VPS) and then upload the MP3 to S3 using boto, a command-line S3 tool, or some other way.
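A minimal sketch of the conversion step, assuming ffmpeg is installed on the VPS and built with the libmp3lame encoder. The function names and the 192k bitrate are choices made for the example. It uses subprocess.run instead of subprocess.call so that a failed conversion raises an exception.

```python
import subprocess


def build_ffmpeg_command(wav_path, mp3_path, bitrate="192k"):
    """Build the ffmpeg argument list for a WAV to MP3 conversion."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", wav_path,            # input file
        "-codec:a", "libmp3lame",  # encode audio with the LAME MP3 encoder
        "-b:a", bitrate,           # target audio bitrate
        mp3_path,
    ]


def convert_wav_to_mp3(wav_path, mp3_path, bitrate="192k"):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(wav_path, mp3_path, bitrate), check=True)
    return mp3_path
```

Passing the arguments as a list, without shell=True, means a user-supplied filename is never interpreted by a shell. Once the MP3 exists on local disk it can be uploaded to S3 and the temporary files removed.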

How can I encrypt my django code?

I have to upload my django project to a shared hosting provider.
How can I encrypt my code?
I want to hide my code on the server.
Thanks :)
You can't. You could upload .pyc files I suppose, but they are completely and utterly trivial to decompile.
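To see why, note that a .pyc file is essentially a short header followed by a marshalled code object, and that code object still carries the names and constants from the source. A small sketch (the "hunter2" value and the function names are made up for the demonstration):

```python
import marshal
import types

SOURCE = '''
def check(password):
    return password == "hunter2"
'''


def compile_to_bytes(source):
    """Compile source and serialise the code object, as a .pyc body would be."""
    return marshal.dumps(compile(source, "<hidden>", "exec"))


def recover_strings(blob):
    """Walk the code object (and nested ones) and collect every string constant."""
    found = []

    def walk(code):
        for const in code.co_consts:
            if isinstance(const, str):
                found.append(const)
            elif isinstance(const, types.CodeType):
                walk(const)  # functions and classes are nested code objects

    walk(marshal.loads(blob))
    return found


print(recover_strings(compile_to_bytes(SOURCE)))
```

The "secret" string is recovered without any decompiler at all, and the standard dis module will print the rest of the logic instruction by instruction.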
Who are you trying to conceal it from? If it's other users on the shared system, then make sure you have directory permissions properly restricted to your user. If it's the shared hosting provider itself, then there's not much you can do since obfuscation won't buy you all that much; spend some time to find a reputable hosting provider you can trust.
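For the other-users case, restricting permissions can be as simple as the sketch below, which assumes a POSIX host and a project directory that you own. The function name is hypothetical, and the modes (0700 for directories, 0600 for files) are one reasonable choice, not a requirement.

```python
import os
import stat


def restrict_to_owner(root):
    """Remove group/other access from every directory and file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        os.chmod(dirpath, 0o700)  # owner may list, enter and modify the directory
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), 0o600)  # owner may read and write


def mode_of(path):
    """Return just the permission bits of path, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Note that this also clears the execute bit on files, and that some shared hosts need the web server's user or group to read the code, in which case the modes have to be loosened accordingly.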
If you really want to hide your code, you have to build a custom Python interpreter that uses different opcodes (in the Python bytecode). Then the server only has your hacked binary and .pyc files that are not trivial to decode. You can add encryption on top of that, or at least sign your code, so that your binary is not that easy to investigate.
Another possibility is to never have your code on disk, only keep it in RAM. You could start your server process via e.g. execnet.