FFmpeg and AWS: What's the most efficient way to handle this?

I'm new to AWS. I originally built the FFmpeg functions into my Node.js API, but I realized this is the wrong way to do it in a real-world app: the video editing should live in separate AWS Lambda functions, away from the main server.
I'm mainly a front-end developer but I'm open to learning new things.
I basically have the following process in my app:
1. The user uploads a video.
2. I need to take that video and add a watermark to it.
3. I then need a copy of the watermarked video at a smaller resolution.
4. I then need a 6-second GIF of the smaller-resolution video.
5. Finally, I need to upload the three edited files (two .mp4s and one .gif) to S3 and remove the original, non-watermarked video. (A sketch of the editing steps follows below.)
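For concreteness, here is a rough sketch of those three editing steps as ffmpeg invocations from Node.js. The file names, the watermark overlay position, the 480p target, and the GIF frame rate and width are all assumptions on my part, not requirements from the question:

```js
const { execFile } = require("child_process");
const { promisify } = require("util");

const run = promisify(execFile);

// Step 1: burn a watermark into the original (the overlay position is arbitrary here).
async function watermark(input) {
  await run("ffmpeg", [
    "-i", input,
    "-i", "watermark.png",
    "-filter_complex", "overlay=10:10",
    "-c:a", "copy", // keep the audio as-is; only the video is re-encoded
    "watermarked.mp4",
  ]);
}

// Step 2: downscale the watermarked copy to 480p (width auto, kept even).
async function downscale() {
  await run("ffmpeg", [
    "-i", "watermarked.mp4",
    "-vf", "scale=-2:480",
    "-c:a", "copy",
    "watermarked-480p.mp4",
  ]);
}

// Step 3: turn the first 6 seconds of the small copy into a GIF.
async function makeGif() {
  await run("ffmpeg", [
    "-t", "6", // read only the first 6 seconds of the input
    "-i", "watermarked-480p.mp4",
    "-vf", "fps=10,scale=320:-1", // GIFs get huge; drop the frame rate and size
    "preview.gif",
  ]);
}
```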
To be clear, here are my questions:
1. Should I upload the original file to S3 or to the server? And why?
2. Is the process above doable in a single Lambda function, or do I need more Lambda functions?
3. How would you handle this problem, personally?
I originally built this by chaining one function to the next with promises, but AWS seems like a different world, and the way I built it would not work there.
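To make the Lambda shape concrete: the usual pattern is to let the S3 upload event trigger a handler that pulls the object into /tmp, runs the steps, and pushes the results back. A minimal sketch, assuming the ffmpeg helpers above and placeholder bucket names:

```js
const {
  S3Client,
  GetObjectCommand,
  PutObjectCommand,
  DeleteObjectCommand,
} = require("@aws-sdk/client-s3");
const { createWriteStream } = require("fs");
const { readFile } = require("fs/promises");
const { pipeline } = require("stream/promises");

const s3 = new S3Client({});

// Triggered by an s3:ObjectCreated:* notification on the upload bucket.
exports.handler = async (event) => {
  const record = event.Records[0].s3;
  // S3 event keys arrive URL-encoded (spaces become "+").
  const key = decodeURIComponent(record.object.key.replace(/\+/g, " "));

  // /tmp is the only writable path inside Lambda, so work there.
  process.chdir("/tmp");

  // Download the freshly uploaded original.
  const { Body } = await s3.send(
    new GetObjectCommand({ Bucket: record.bucket.name, Key: key })
  );
  await pipeline(Body, createWriteStream("original.mp4"));

  // The three editing steps (the ffmpeg helpers sketched earlier).
  await watermark("original.mp4");
  await downscale();
  await makeGif();

  // Upload the three results, then delete the non-watermarked original.
  for (const name of ["watermarked.mp4", "watermarked-480p.mp4", "preview.gif"]) {
    await s3.send(
      new PutObjectCommand({
        Bucket: "my-processed-videos", // placeholder bucket
        Key: `${key}/${name}`,
        Body: await readFile(name),
      })
    );
  }
  await s3.send(new DeleteObjectCommand({ Bucket: record.bucket.name, Key: key }));
};
```

One caveat that matters for the numbers in the update below: a single Lambda invocation is capped at 15 minutes, so the longer 1080p jobs would not fit in one function no matter how the code is arranged; that alone is a reason to split the work across functions or move the heavy transcoding elsewhere.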
Thanks a lot.
Update
Here are some tests I ran with a couple of videos:
| | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 |
| --- | --- | --- | --- | --- | --- |
| Original video resolution | 1080p | 1080p | 1080p | 1080p | 480p |
| Original video duration | 23 min | 15 min | 11 min | 3.5 min | 5 min |
| Step 1 duration (watermarking original video) | 30 min | 18 min | 14 min | 4 min | 2 min |
| Step 2 duration (watermarked lower-resolution copy) | 5 min | 3 min | 3 min | 1 min | skipped (already low-res) |
| Step 3 duration (6-second GIF creation) | 15 s | 10 s | 7 s | negligible | negligible |
| Total | ~35 min | ~21 min | ~17 min | ~5 min | ~2 min |

Related

C++ Video Capturing using Sink Writer - Memory consumption

I am writing a C++ program (Win64) using C++ Builder 11.1.5 that captures video from a webcam and stores the captured frames in a WMV file using the sink writer interface, as described in the following tutorial:
https://learn.microsoft.com/en-gb/windows/win32/medfound/tutorial--using-the-sink-writer-to-encode-video?redirectedfrom=MSDN
The video doesn't need to run at a real-time 30 frames per second, as the process being recorded is a slow one, so I have set the FPS to 5 (which is fine).
The recording needs to run for about 8-12 hours at a time, and using the algorithms in the sink writer tutorial, I have seen the program's memory consumption climb dramatically after 10 minutes of recording (in excess of 10 GB). I have also seen that the final WMV file only becomes populated when the Finalize routine is called. Because of the memory consumption, the program starts to slow down after a while.
First Question: Is it possible to flush the sink writer to free up ram while it is recording?
Second Question: Maybe it would be more efficient to save the video in pieces, finalizing the recording every 10 minutes or so and then starting another recording under a different file name, so that when the 8 hours are done the program could combine all the saved WMV files. How would one go about combining numerous WMV files into one large file? (A sketch of one possible approach follows below.)
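One common approach, assuming ffmpeg is available alongside the capture program: the concat demuxer can losslessly join segments that share identical encoding settings, which is exactly the situation here since every piece comes from the same sink writer configuration. A sketch with placeholder file names:

```js
const { execFile } = require("child_process");
const { promisify } = require("util");
const { writeFile } = require("fs/promises");

const run = promisify(execFile);

// Losslessly join same-codec WMV segments with ffmpeg's concat demuxer.
// This only works because every segment was produced with identical settings.
async function joinSegments(segments, output) {
  // The demuxer takes a list file with one `file '<path>'` line per segment.
  await writeFile("segments.txt", segments.map((p) => `file '${p}'`).join("\n"));
  await run("ffmpeg", [
    "-f", "concat",
    "-safe", "0", // allow arbitrary (e.g. absolute) paths in the list file
    "-i", "segments.txt",
    "-c", "copy", // stream copy: no re-encode, so it is fast and lossless
    output,
  ]);
}

// e.g. joinSegments(["part1.wmv", "part2.wmv", "part3.wmv"], "full.wmv");
```

Because `-c copy` only rewrites the container, joining even hours of footage takes seconds, which fits well with the finalize-every-10-minutes approach proposed in the question.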

CLIPS EnvAssertString API-function slow performance

I'm trying to develop an expert system capable of managing a real-time data flow. During coding I found a delay that varied from 3 to almost 20 milliseconds, which is totally inappropriate for the project. Profiling the application showed that the problem resided in the EnvAssertString function, whilst EnvRun did not produce any delay. I tried to temporarily disable garbage collection before EnvAssertString, but it didn't help. The function in question is called between 10 and 50 times in a row while processing a single block of data, and the blocks arrive at a rate of approximately 15 blocks per second.
How can I fix this? Is there any chance of speeding the process up? Is CLIPS at all suitable for a real-time response like this (several calls in a row to EnvAssertString shouldn't take longer than 1 ms)?

Concurrency context switching of 2+ youtube videos on a dual core machine

From what I know, concurrency involves context switching when the number of software threads exceeds the number of physical cores. So, for example, if there are 4 software threads running on 1 physical core, each software thread takes turns running, and no more than 1 software thread can be making progress at any instant.
I'm trying to apply this idea to YouTube videos on my MacBook Pro. I have 2 cores, and I started 4 YouTube videos (which I assume is basically 4 software threads) within a second of each other and played each to the 1-minute mark. Since I have 2 cores, I was under the impression that a maximum of 2 YouTube videos could be making progress simultaneously. My eyes and ears perceived all 4 videos making progress in parallel, and it didn't sound or look like any video was being paused, but I assumed this was because the context switching occurs at a frequency my senses can't detect.
So I then set a timer to see how long it takes the 4 videos to get to the 1-minute mark when I start all 4 at approximately the same time (within 1 second of each other). The 4 videos took approximately 1 minute TOTAL to get to the 1-minute mark. I'm confused why it doesn't take at least 2 minutes when I only have 2 cores, so that a maximum of 2 videos can be making progress simultaneously (and that's the best-case scenario, assuming my computer wasn't doing anything else but playing the YouTube videos).
It appears that I'm misunderstanding something about concurrency/context switching, because I don't see how all 4 videos can get to the 1-minute mark in a minute. Could someone explain?

Speed up ionic 2 live reload process

Is there any way to speed up the live reload process in Ionic 2 after a save? Every save takes 10 to 15 seconds to show the changes, which is close to unbearable. Surely it should not take that long?

Profiling Python on large dataset

I have a dataset with 3 million lines to process. The processing functions are cythonized. When I run the entire processing on a small subsample of 10,000 lines, processing time is about 1.5 minutes; a subsample of 30,000 lines gives a processing time of 3 minutes. However, when I process the whole dataset, only a quarter of it is done after 10 hours, although I expected a processing time of at most 5 hours. I'm running Ubuntu 14.04 64-bit and Anaconda 64-bit. RAM usage is at 50%. I deactivated the redirect to the login screen after a period of inactivity; performance stayed the same. Switching off the screen after inactivity didn't influence execution time either. What else could be the reason for this unexpectedly slow execution?