Update Wowza StreamPublisher schedule via REST API (or alternative) - amazon-web-services

Just getting started with Wowza Streaming Engine.
Objective:
Set up a streaming server which live streams existing video (from S3) at a pre-defined schedule (think of a tv channel that linearly streams - you're unable to seek through).
Create a separate admin app that manages that schedule and updates the streaming app accordingly.
Accomplish this with as little custom Java as possible.
Questions:
Is it possible to fetch / update streamingschedule.smil with the Wowza Streaming Engine REST API?
There are methods to retrieve and update specific SMIL files via the REST API, but they only seem to apply to SMIL files created through the manager; streamingschedule.smil, after all, has to be created by hand.
Alternatively, is it possible to reference a streamingschedule.smil that exists on an S3 bucket? (In a similar way footage can be linked from S3 buckets with the use of the MediaCache module)
A comment here (search for '3a') seems to indicate it's possible, but there's a lot of noise in that thread.
What I've done:
Set up Wowza Streaming Engine 4.4.1 on EC2
Enabled REST API documentation
Created a separate S3 bucket and filled it with pre-recorded footage
Enabled MediaCache on the server which points to the above S3 bucket
Created a customised VOD edge application, with AppType set to Live and StreamType set to live in order to be able to point to the above (as suggested here)
Created a StreamPublisher module with a streamingschedule.smil file
The above all works and I have a working schedule with linearly streaming content pulled from an S3 bucket. Just need to be able to easily manipulate that schedule without having to manually edit the file via SSH.
So close! TIA

To answer your questions:
No. However, you can update it by creating an HTTP provider and having it handle the modifications to that schedule. Should you want more flexibility here, you can even extend the scheduler module to not require that file at all.
Yes. You would have to modify the ServerListenerStreamPublisher solution to accomplish it. Currently it only looks at the local filesystem to read the streamingschedule.smil file.
Thanks,
Matt

Related

Google Cloud Run service deployment, is it the best direction in my situation?

I have some experience with Google Cloud Functions (CF). I tried to deploy a CF function recently with a Python app, but it uses an NLP model so the 8GB memory limit is exceeded when the model is triggered. The function is triggered when a JSON file is uploaded to a bucket.
So, I plan to try Google Cloud Run but I have no experience with it. Also, I am not completely sure if it is the best course of action.
If it is, what is the best way of implementing it, provided that the Run service will be triggered by a file being uploaded to a bucket? In CF you can select the triggering event; in Run I didn't see anything like that. I could use some starting points, as I couldn't find my case in the GCP documentation.
Any help will be appreciated.
You can use at least these two things:
The legacy one: create a GCS notification in Pub/Sub, then create a push subscription and set the Cloud Run URL as the HTTP push destination.
A more recent way is to use Eventarc to invoke a Cloud Run endpoint directly from an event (it roughly creates the same thing, a Pub/Sub topic and push subscription, but it's fully configured for you).
EDIT 1
When you use push notifications, you will receive a standard Pub/Sub message. The format is described in the documentation, both for the attributes and for the body content; keep in mind that the raw content is base64 encoded and you have to decode it to get the final format.
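For illustration, here is a minimal TypeScript/Express sketch of a Cloud Run endpoint handling such a push message; the route, port handling and processing are only assumptions, based on the standard Pub/Sub push format:

```typescript
// Minimal sketch of a Cloud Run endpoint receiving a GCS notification
// delivered by a Pub/Sub push subscription (assumes an Express app).
import express from "express";

const app = express();
app.use(express.json());

app.post("/", (req, res) => {
  const message = req.body?.message;
  if (!message) {
    res.status(400).send("No Pub/Sub message received");
    return;
  }

  // Attributes carry metadata such as eventType, bucketId and objectId.
  console.log("attributes:", message.attributes);

  // The body content is base64 encoded; decode it to get the JSON payload.
  const payload = message.data
    ? JSON.parse(Buffer.from(message.data, "base64").toString("utf8"))
    : {};
  console.log("object:", payload.bucket, payload.name);

  // Respond with a success status so Pub/Sub does not redeliver the message.
  res.status(204).send();
});

app.listen(Number(process.env.PORT) || 8080);
```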
I personally have a Cloud Run service that logs the contents of any request, so I can see in the logs all the data I need during development. When there is a new message format, I configure the push to point at that Cloud Run endpoint and automatically get the format.
For Eventarc, the format will be added to the UI soon (I saw that feature in preview, but it's not yet available). For now, the best solution is to log the content to see what you get and work out what to do with it!

Heroku doesn't update github file system when an image is uploaded from website

I ran into a problem where Heroku doesn't update my GitHub repository (or rather, its static filesystem) when a blog post (including pictures) is created from the website.
Other images survive, whilst the ones saved to my filesystem while the server is running on Heroku disappear.
I found this on their documentation.
The Heroku filesystem is ephemeral - that means that any changes to the filesystem whilst the dyno is running only last until that dyno is shut down or restarted.
I'm still confused why not all the pictures disappear and only those added later do.
Is AWS S3 a solution for this? If it is, how can I represent my filesystem using buckets?
Say, for the Blog Post 1 I have 2 picture resolutions, which means storing the files in different folders corresponding to those resolutions.
---1920x1920
-----picture.jpg
---800x800
-----picture.jpg
Does that mean I have to create 2 buckets named 1920x1920 and 800x800 or is there a better way of handling them?
Is AWS S3 a solution for this?
S3 is the recommended solution for this, and the configuration is documented in the Heroku Dev Centre with specific instructions for uploading from Python.
Note that these Python instructions use the Direct Upload approach: have the Flask app generate a pre-signed URL, which is then passed back to the client-side JavaScript code so that the user's browser can make the upload to S3 directly. The resulting S3 URL of the image is then put into a hidden element in the form, which is received by your app on form submit.
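As a rough illustration of the browser side of that Direct Upload flow, here is a TypeScript sketch; the /sign-s3 endpoint name and the { url, fields } response shape (a pre-signed POST) are assumptions, not the exact code from the Heroku article:

```typescript
// Browser-side sketch of the Direct Upload approach. Assumes a hypothetical
// /sign-s3 endpoint on the backend that returns a pre-signed POST as
// { url, fields } for the given file name and type.
async function uploadToS3(file: File): Promise<string> {
  // 1. Ask the backend to sign the upload for this file.
  const resp = await fetch(
    `/sign-s3?file_name=${encodeURIComponent(file.name)}&file_type=${encodeURIComponent(file.type)}`
  );
  const { url, fields } = await resp.json();

  // 2. Upload the file straight from the browser to S3.
  const form = new FormData();
  Object.entries(fields as Record<string, string>).forEach(([k, v]) => form.append(k, v));
  form.append("file", file);
  const upload = await fetch(url, { method: "POST", body: form });
  if (!upload.ok) throw new Error("Upload to S3 failed");

  // 3. Return the object's S3 URL so it can go into the hidden form element.
  return `${url}/${fields.key}`;
}
```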
The fact that you have separate image sizes suggests your app does some processing (maybe with PIL) to generate the thumbnails. In that case it may be easier to use the Pass-Through approach, where your app implements its own upload mechanism, does the processing, and then uploads the thumbnails to S3 (the upload-to-S3 part is well documented, for example in this SO thread).
The Pass-Through method carries the warning that it may block a single-threaded worker. If your site gets a volume of requests that makes this an issue, you may need to increase the number of gunicorn workers, or change to a worker type that supports concurrency (this GitHub post has some useful commands/info on concurrent worker types).
The best way to implement the whole thing (although the requirement for a redisgo dyno and a worker dyno may push you into the paid tier) may be with Background Tasks using rq. You use the Direct Upload approach above to upload the original image, then have a background job download it, do the resizing, and put the resulting thumbnails back onto S3.
Does that mean I have to create 2 buckets named 1920x1920 and 800x800 or is there a better way of handling them?
Have one Bucket for the entire app, and just include forward slashes in the object's key to mimic a subdirectory structure.
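For example, with the v3 JavaScript SDK that could look roughly like this (bucket name, keys and region are placeholders):

```typescript
// Sketch: one bucket, with "folders" expressed purely through key prefixes.
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { readFile } from "node:fs/promises";

const s3 = new S3Client({ region: "eu-west-1" });

async function uploadThumbnails(postSlug: string): Promise<void> {
  for (const size of ["1920x1920", "800x800"]) {
    const body = await readFile(`./thumbnails/${size}/picture.jpg`);
    await s3.send(
      new PutObjectCommand({
        Bucket: "my-blog-media",                // single bucket for the whole app
        Key: `${postSlug}/${size}/picture.jpg`, // slashes mimic subdirectories
        Body: body,
        ContentType: "image/jpeg",
      })
    );
  }
}
```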

What service should I use to process my files in a Cloud Storage bucket and upload the result?

I have a software that process some files. What I need is:
start a default image on Google Cloud (I think Docker should be a good solution) using an API or a run command
download files from Google Cloud Storage
process them: run my software using those downloaded files
upload the result to Google Cloud Storage
shut the image down, expecting not to be billed anymore
What I do know is how to create my image. But I can't find any info telling me which Google Cloud service I should use, or even whether I can do it the way I'm thinking. I think I'm not using the right keywords to find what I need.
I was looking at Kubernetes, but I couldn't figure out how to manage those instances to execute a one-time processing job.
[EDIT]
To explain the process better: I have an app that receives images and sends them to Google Cloud Storage. After that, I need to process those images (apply filters, georeference them, split them, etc.), so I want to start a Docker image to process them and upload the results back to Google Cloud Storage.
If you are using any of the runtimes supported by Google Cloud Functions, they are the easiest way to do this kind of operation (i.e. fetch something from Google Cloud Storage, perform some actions on those files, and upload them again). The Cloud Function will be triggered by an event of your choice, and after the job it will die.
The next option in terms of complexity would be to deploy a Google App Engine application in the standard environment. It allows you to deploy your own application written in any of the supported languages for this environment. While there is traffic in your application you will have instances serving, but the number of running instances can go down to 0 when they are not serving, which means less cost.
Another option would be Google App Engine in the flexible environment. This product allows you to deploy your application in any custom runtime. This option always has at least one instance running, so it would never shut down.
Lastly, you can use Google Compute Engine to "create and run virtual machines on Google infrastructure". Unlike GAE, this is not as managed by Google, which means that most of the configuration is up to you. In this case you would need to programmatically tell your VM to shut down after you have finished your operations.
Based on your edit where you stated that you already have an app that is inserting images into Google Cloud Storage, your easiest option would be to use Cloud Functions that are triggered by additions, changes, or deletions to objects in Cloud Storage buckets.
You can follow the Cloud Functions tutorial for Cloud Storage to get an idea of the generic process and then implement your own code that handles your specific tasks. There are other tutorials like the Imagemagick tutorial for Cloud Functions that might also be relevant to the type of processing you intend to do.
Cloud Functions is probably your lightest weight approach. You could of course do more full scale applications, but that is likely overkill, more expensive, and more complex. You can write your processing code in Node.js, Python, or Go.
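As a rough Node.js/TypeScript sketch (not a drop-in solution), a Cloud Function triggered by an object being finalized in a bucket might look like this; the output bucket name and the processing step are placeholders:

```typescript
// Sketch of a background Cloud Function triggered by
// google.storage.object.finalize (the trigger is configured at deploy time).
import { Storage } from "@google-cloud/storage";

const storage = new Storage();

export async function processImage(
  file: { bucket: string; name: string },
  context: { eventId: string; eventType: string }
): Promise<void> {
  console.log(`Processing ${file.name} from ${file.bucket} (${context.eventType})`);

  // Download the new object into the function's temp space.
  const tmpPath = `/tmp/${file.name.split("/").pop()}`;
  await storage.bucket(file.bucket).file(file.name).download({ destination: tmpPath });

  // ... run your own processing here (filters, georeferencing, splitting) ...

  // Upload the result to an output bucket (placeholder name).
  await storage.bucket("my-output-bucket").upload(tmpPath, {
    destination: `processed/${file.name}`,
  });
}
```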

AWS S3 C++: Should I use UploadFile() or PutObject() for uploading a file? Where are the differences? [duplicate]

In the aws-sdk's S3 class, what is the difference between upload() and putObject()? They seem to do the same thing. Why might I prefer one over the other?
The advantages of using the AWS SDK's upload() over putObject() are as follows:
If the reported MD5 upon upload completion does not match, it retries.
If the file size is large enough, it uses multipart upload to upload parts in parallel.
It retries based on the client's retry settings.
It can report progress.
It sets the ContentType based on the file extension if you do not provide it.
upload() allows you to control how your object is uploaded. For example you can define concurrency and part size.
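For example, with the v2 JavaScript SDK a tuned upload() call might look roughly like this (bucket, key and file names are placeholders):

```typescript
// Sketch with the v2 JavaScript SDK: upload() lets you tune part size and
// concurrency, and accepts a stream of unknown length.
import AWS from "aws-sdk";
import { createReadStream } from "node:fs";

const s3 = new AWS.S3();

const managedUpload = s3.upload(
  {
    Bucket: "my-bucket",
    Key: "videos/big-file.mp4",
    Body: createReadStream("./big-file.mp4"), // no ContentLength needed
  },
  {
    partSize: 10 * 1024 * 1024, // 10 MB parts
    queueSize: 4,               // upload up to 4 parts in parallel
  }
);

// Progress reporting comes for free.
managedUpload.on("httpUploadProgress", (p) => console.log(p.loaded, p.total));

managedUpload.send((err, data) => {
  if (err) throw err;
  console.log("Uploaded to", data.Location);
});
```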
From their docs:
Uploads an arbitrarily sized buffer, blob, or stream, using intelligent concurrent handling of parts if the payload is large enough.
One specific benefit I've discovered is that upload() will accept a stream without a content length defined whereas putObject() does not.
This was useful as I had an API endpoint that allowed users to upload a file. The framework delivered the file to my controller in the form of a readable stream without a content length. Instead of having to measure the file size, all I had to do was pass it straight through to the upload() call.
When looking for the same information, I came across: https://aws.amazon.com/blogs/developer/uploading-files-to-amazon-s3/
This source is a little dated (it references upload_file() and put() instead -- or maybe it's the Ruby SDK?), but it suggests that putObject() is intended for smaller objects than upload().
It recommends upload() and specifies why:
This is the recommended method of using the SDK to upload files to a bucket. Using this approach has the following benefits:
Manages multipart uploads for objects larger than 15MB.
Correctly opens files in binary mode to avoid encoding issues.
Uses multiple threads for uploading parts of large objects in parallel.
Then covers the putObject() operation:
For smaller objects, you may choose to use #put instead.
EDIT: I was having problems with the .abort() operation on my .upload() and found this helpful: abort/stop amazon aws s3 upload, aws sdk javascript
Now my various other events from https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Request.html are firing as well! With .upload() I only had 'httpUploadProgress'.
This question was asked almost six years ago and I stumbled across it while searching for information on the latest AWS Node.js SDK (V3). While V2 of the SDK supports the "upload" and "putObject" functions, the V3 SDK only supports "Put Object" functionality as "PutObjectCommand". The ability to upload in parts is supported as "UploadPartCommand" and "UploadPartCopyCommand" but the standalone "upload" function available in V2 is not and there is no "UploadCommand" function.
So if you migrate to the V3 SDK, you will need to migrate to Put Object. Get Object is also different in V3: a Buffer is no longer returned; instead you get a readable stream or a Blob. So if you previously got the data through "Body.toString()", you now have to implement a stream reader or handle Blobs.
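For instance, reading an object body as a string in V3 might look roughly like this; note that the transformToString() helper only exists in newer V3 releases, so a manual stream fallback is shown as well (bucket and key are just parameters here):

```typescript
// Sketch: in the v3 SDK, GetObject's Body is a stream (Node) or Blob (browser).
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

async function getObjectAsString(bucket: string, key: string): Promise<string> {
  const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  if (!Body) throw new Error("Empty response body");

  // Newer v3 releases add transformToString() directly on the returned stream.
  if (typeof (Body as any).transformToString === "function") {
    return (Body as any).transformToString("utf-8");
  }

  // Otherwise, collect the Node.js readable stream chunks manually.
  const chunks: Buffer[] = [];
  for await (const chunk of Body as AsyncIterable<Buffer>) {
    chunks.push(chunk);
  }
  return Buffer.concat(chunks).toString("utf-8");
}
```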
EDIT:
The upload command can be found in the AWS Node.js SDK (V3) under @aws-sdk/lib-storage. Here is a direct link: https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/modules/_aws_sdk_lib_storage.html
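A rough sketch of using that Upload class from @aws-sdk/lib-storage as the V3 counterpart to V2's upload() (bucket, key and file paths are placeholders):

```typescript
// Sketch: managed, multipart, parallel upload in the v3 SDK.
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import { createReadStream } from "node:fs";

async function uploadLargeFile(): Promise<void> {
  const parallelUpload = new Upload({
    client: new S3Client({ region: "us-east-1" }),
    params: {
      Bucket: "my-bucket",                      // placeholder bucket
      Key: "videos/big-file.mp4",               // placeholder key
      Body: createReadStream("./big-file.mp4"), // streams of unknown length are fine
    },
    partSize: 10 * 1024 * 1024, // 10 MB parts
    queueSize: 4,               // upload up to 4 parts in parallel
  });

  // Progress events, as with v2's upload().
  parallelUpload.on("httpUploadProgress", (p) => console.log(p.loaded, p.total));

  await parallelUpload.done();
}
```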
