I am trying to build a custom live streaming service as documented here:
https://aws.amazon.com/solutions/implementations/live-streaming-on-aws/
I used the provided CloudFormation template for "Live Streaming on AWS with MediaStore", which provisioned all the relevant resources for me. Next, I wanted to test my custom streamer.
I used OBS Studio to stream my webcam output to the MediaLivePushEndpoint that was created during CloudFormation provisioning. OBS indicates that it is already streaming the webcam feed to the AWS MediaLive RTMP endpoint.
Now, to confirm that I can watch the stream, I set the Input Network Stream in VLC player to the CloudFront endpoint that was created for me (which looks like this: https://aksj2arbacadabra.cloudfront.net/stream/index.m3u8), but VLC is unable to fetch the stream and fails with the following error message in the logs. What am I missing? Thanks!
...
...
...
http debug: outgoing request: GET /stream/index.m3u8 HTTP/1.1 Host: d2lasasasauyhk.cloudfront.net Accept: */* Accept-Language: en_US User-Agent: VLC/3.0.11 LibVLC/3.0.11 Range: bytes=0-
http debug: incoming response: HTTP/1.1 404 Not Found Content-Type: application/x-amz-json-1.1 Content-Length: 31 Connection: keep-alive x-amzn-RequestId: HRNVKYNLTdsadasdasasasasaPXAKWD7AQ55HLYBBXHPH6GIBH5WWY x-amzn-ErrorType: ObjectNotFoundException Date: Wed, 18 Nov 2020 04:08:53 GMT X-Cache: Error from cloudfront Via: 1.1 5085d90866d21sadasdasdad53213.cloudfront.net (CloudFront) X-Amz-Cf-Pop: EWR52-C4 X-Amz-Cf-Id: btASELasdasdtzaLkdbIu0hJ_asdasdasdbgiZ5hNn1-utWQ==
access error: HTTP 404 error
main debug: no access modules matched
main debug: dead input
qt debug: IM: Deleting the input
main debug: changing item without a request (current 2/3)
main debug: nothing to play
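The 404 is reproducible outside of VLC as well; here is a minimal check (a sketch in Python, using the placeholder CloudFront URL from above):

# sketch: fetch the HLS playlist directly to see what CloudFront returns
# (the URL is the placeholder from this question, not a real endpoint)
import urllib.request
import urllib.error

url = "https://aksj2arbacadabra.cloudfront.net/stream/index.m3u8"
try:
    with urllib.request.urlopen(url) as resp:
        print(resp.status, resp.read(200))
except urllib.error.HTTPError as e:
    print("HTTP error:", e.code, e.reason)  # 404 here, matching the VLC log above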
Updates based on Zach's response:
Here are the parameters I used while deploying the CloudFormation template for live streaming using MediaLive (notice that I am using RTMP_PUSH):
I am using MediaLive and not MediaPackage, so when I go to my channel in MediaLive, I see this:
Notice that it says it cannot find the "stream [stream]", but I confirmed that the RTMP endpoint I added to OBS is exactly the one that was created as an output of my CloudFormation stack:
Finally, when I go to MediaStore to see if there are any objects, it is completely empty:
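The container can also be checked programmatically; this is roughly the check (a sketch; the container name below is a placeholder, not the actual name from my stack):

# sketch: list what the MediaStore container actually holds
# (ContainerName is a placeholder; use the container created by the CloudFormation stack)
import boto3

container = boto3.client("mediastore").describe_container(ContainerName="LiveStreaming")
endpoint = container["Container"]["Endpoint"]

data = boto3.client("mediastore-data", endpoint_url=endpoint)
print(data.list_items().get("Items", []))  # empty here, i.e. no segments are reaching MediaStore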
Vader,
Thank you for the clarification here; I can see the issue is with your settings in OBS. When you set up your input for MediaLive, you created a unique Application Name and Instance, which are part of the URI: the Application Name is LiveStreamingwithMediaStore and the Instance is stream. In OBS, you want to remove stream from the end of the Server URI and place it in the Stream Key field, where you currently have a 1.
OBS Settings:
Server: rtmp://server_ip:1935/Application_Name/
Stream Key: Instance_Name
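With the values from this thread, that would look roughly like this (the server address is whatever your MediaLive input endpoint shows):
Server: rtmp://<MediaLive input address>:1935/LiveStreamingwithMediaStore/
Stream Key: stream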
Since you posted the screenshot here on an open forum (which really helped determine the issue, but also exposes settings that would allow someone else to send to your RTMP input), I would suggest that you change the Application Name and Instance.
Zach
We have a service that is used to download time-series data from InfluxDB. We are not manipulating the Influx response; after updating some meta information, we push the records as-is.
So there is no Content-Length attached to the response.
We want to expose this service via Amazon API Gateway. Is it possible to integrate such a service with API Gateway, and mainly, is there any limit on the response size? Our service does not wait for the whole query result before it starts responding, but will API Gateway do the same, or will it wait for all the data to be written to the output stream?
When I tried it, I observed a Content-Length header being added by API Gateway.
HTTP/1.1 200 OK
Date: Tue, 26 Apr 2022 06:03:31 GMT
Content-Type: application/json
Content-Length: 3024
Connection: close
x-amzn-RequestId: 41dfebb4-f63e-43bc-bed9-1bdac5759210
X-B3-SpanId: 8322f100475a424a
x-amzn-Remapped-Connection: keep-alive
x-amz-apigw-id: RLKwCFztliAFR2Q=
x-amzn-Remapped-Server: akka-http/10.1.8
X-B3-Sampled: 0
X-B3-ParentSpanId: 43e304282e2f64d1
X-B3-TraceId: d28a4653e7fca23d
x-amzn-Remapped-Date: Tue, 26 Apr 2022 06:03:31 GMT
Does this mean that API Gateway waits for the whole response/EOF from the integration?
If so, what is the maximum number of bytes the API Gateway buffer can hold?
Will API Gateway time out if the response from the integration is too large or does not end within the stipulated time?
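For completeness, this is roughly the kind of check behind the headers above (a sketch; the endpoint URL is a placeholder, not our real stage):

# sketch: call the API Gateway stage and inspect the response headers
# (the URL is a placeholder)
import urllib.request

with urllib.request.urlopen("https://example.execute-api.us-east-1.amazonaws.com/prod/timeseries") as resp:
    print(resp.status)
    print(resp.headers.get("Content-Length"))     # present -- added by API Gateway
    print(resp.headers.get("Transfer-Encoding"))  # None here -- the chunked backend response appears to be buffered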
I'd like to automate replicating a GitHub repo to an S3 bucket (for the sole reason that CloudFormation modules must reference templates in S3).
The quick start below looked like it could do it, but it doesn't succeed for me, even though GitHub reports that the webhook push for my repository succeeded.
https://aws-quickstart.github.io/quickstart-git2s3/
I configured the following parameters. I was not sure what to configure for the allowed IPs, so I tested fully open:
AllowedIps: 0.0.0.0/0
ApiSecret: ****
CustomDomainName: -
ExcludeGit: True
OutputBucketName: -
QSS3BucketName: aws-quickstart
QSS3BucketRegion: us-east-1
QSS3KeyPrefix: quickstart-git2s3/
ScmHostnameOverride: -
SubnetIds: subnet-124j124
VPCCidrRange: 172.31.0.0/16
VPCId: vpc-l1kj4lk2j1l2k4j
I tried manually executing the CodeBuild project as well, but got this error:
COMMAND_EXECUTION_ERROR: Error while executing command:
python3 - << "EOF"
from boto3 import client
import os
s3 = client('s3')
kms = client('kms')
enckey = s3.get_object(Bucket=os.getenv('KeyBucket'), Key=os.getenv('KeyObject'))['Body'].read()
privkey = kms.decrypt(CiphertextBlob=enckey)['Plaintext']
with open('enc_key.pem', 'w') as f:
    print(privkey.decode("utf-8"), file=f)
EOF
Reason: exit status 1
The GitHub webhook page reports this response:
Headers
Content-Length: 0
Content-Type: application/json
Date: Thu, 24 Jun 2021 21:33:47 GMT
Via: 1.1 9b097dfab92228268a37145aac5629c1.cloudfront.net (CloudFront)
X-Amz-Apigw-Id: 1l4kkn14l14n=
X-Amz-Cf-Id: 1l43k135ln13lj1n3l1kn414==
X-Amz-Cf-Pop: IAD89-C1
X-Amzn-Requestid: 32kjh235-d470-1l412-bafa-l144l1
X-Amzn-Trace-Id: Root=1-60d4fa3b-73d7403073276ca306853b49;Sampled=0
X-Cache: Miss from cloudfront
Body
{}
From the following link:
https://aws-quickstart.github.io/quickstart-git2s3/
You can see the following excerpts I have included:
Allowed IP addresses (AllowedIps)
18.205.93.0/25,18.234.32.128/25,13.52.5.0/25
Comma-separated list of allowed IP CIDR blocks. The default addresses listed are BitBucket Cloud IP ranges.
As such, since you said you're using GitHub, I believe you should use this URL to determine the IP range:
https://api.github.com/meta
As that API responds with JSON, you should look for the hooks attribute, since I believe that is the list of ranges used for webhooks.
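Something like this would pull that list out for you (a sketch):

# sketch: fetch GitHub's published CIDR ranges and print the ones used for webhooks
import json
import urllib.request

with urllib.request.urlopen("https://api.github.com/meta") as resp:
    meta = json.load(resp)

print(",".join(meta.get("hooks", [])))  # comma-separated, in the shape the AllowedIps parameter expects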
Why don't you copy/check out the file you want before you run your CloudFormation commands? No reason to get too fancy.
git checkout -- path/to/some/file
aws cloudformation ...
Otherwise, why not fork the repo and add your CloudFormation stuff, so that it's all there? You could also delete everything you don't need and merge/pull changes in the future. That way your deploys will be reproducible, and you can easily roll back from one commit to another.
I have been trying to read the AWS Lambda@Edge documentation, but I still cannot figure out if the following is possible.
Assume I have an object (image.jpg, 32922 bytes in size) and I have set up AWS static website hosting, so I can retrieve:
$ GET http://example.com/image.jpg
I would like to be able to also expose:
$ GET http://example.com/image
where the response body would be a multipart/related document (for example), something like this:
--myboundary
Content-Type: image/jpeg;
Content-Length: 32922
MIME-Version: 1.0
<actual binary jpeg data from 'image.jpg'>
--myboundary--
Is this something supported out of the box by the AWS Lambda@Edge API, or should I use another solution to create such a response? In particular, it seems that the response only deals with text or base64 (I would need binary in my case).
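For context, a generated response from a Lambda@Edge function takes roughly this shape (a sketch; the multipart body is only illustrative, and real parts would have to be read from S3), which is why it seems limited to text or base64:

# sketch: a Lambda@Edge handler generating a multipart/related response
# (boundary and body content are illustrative)
import base64

def handler(event, context):
    body = (
        b"--myboundary\r\n"
        b"Content-Type: image/jpeg\r\n"
        b"MIME-Version: 1.0\r\n"
        b"\r\n"
        b"<binary jpeg bytes would go here>\r\n"
        b"--myboundary--\r\n"
    )
    return {
        "status": "200",
        "statusDescription": "OK",
        "headers": {
            "content-type": [{"key": "Content-Type",
                              "value": "multipart/related; boundary=myboundary"}],
        },
        "body": base64.b64encode(body).decode("ascii"),
        "bodyEncoding": "base64",  # generated responses only accept 'text' or 'base64'
    }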
I was finally able to find complete documentation. I eventually stumbled upon:
API Gateway - POST multipart/form-data
which refers to:
Enabling binary support using the API Gateway console
The above documentation specifies the steps to handle binary data. Note that you need to base64-encode the response from Lambda to pass it back through API Gateway.
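In a Lambda proxy integration, that boils down to something like this sketch (with the response content type registered as a binary media type on the API, API Gateway decodes the base64 body back to bytes for the client):

# sketch: returning binary data from Lambda through API Gateway
# (the payload is illustrative; the content type must be registered as a binary media type)
import base64

def lambda_handler(event, context):
    data = b"<binary payload, e.g. the multipart body described above>"
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "multipart/related; boundary=myboundary"},
        "body": base64.b64encode(data).decode("ascii"),
        "isBase64Encoded": True,  # tells API Gateway the body is base64 and should be decoded
    }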
I encounter S3 SignatureDoesNotMatch while trying to write a DataFrame to S3 with Spark.
The symptoms / things I have tried:
The code fails sometimes but works sometimes;
The code can read from S3 without any problem and is able to write to S3 from time to time, which rules out wrong config settings like S3A / enableV4 / wrong key / region endpoint, etc.;
The S3A endpoint has been set according to the S3 docs (S3 Endpoint);
Made sure the AWS_SECRET_ACCESS_KEY does not contain any non-alphanumeric characters, as suggested here;
Made sure the server time is in sync by using NTP;
The following was tested on an EC2 m3.xlarge with spark-2.0.2-bin-hadoop2.7 running in local mode;
The issue goes away when the files are written to the local filesystem;
Right now the workaround is to mount the bucket with s3fs and write there; however, this is not ideal as s3fs dies quite often under the stress Spark puts on it.
The code can be boiled down to:
spark-submit \
--verbose \
--conf spark.hadoop.fs.s3n.impl=org.apache.hadoop.fs.s3native.NativeS3FileSystem \
--conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3.S3FileSystem \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--packages org.apache.hadoop:hadoop-aws:2.7.3 \
--driver-java-options '-Dcom.amazonaws.services.s3.enableV4' \
foobar.py
# foobar.py
from pyspark import SparkContext
from pyspark.sql import SparkSession

sc = SparkContext.getOrCreate()
# S3A credentials and endpoint
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", 'xxx')
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", 'xxx')
sc._jsc.hadoopConfiguration().set("fs.s3a.endpoint", 's3.dualstack.ap-southeast-2.amazonaws.com')

hc = SparkSession.builder.enableHiveSupport().getOrCreate()

# in_file_path and out_file_path are defined elsewhere
dataframe = hc.read.parquet(in_file_path)
dataframe.write.csv(
    path=out_file_path,
    mode='overwrite',
    compression='gzip',
    sep=',',
    quote='"',
    escape='\\',
    escapeQuotes='true',
)
Spark throws the following error.
Setting log4j to verbose, it appears the following happens:
Each individual partition is written to a staging location on S3, /_temporary/foorbar.part-xxx;
A PUT call then moves the partitions into the final location;
After a few successful PUT calls, all subsequent PUT calls fail with 403;
As the requests are made by the aws-java-sdk, I am not sure what to do at the application level;
-- The following logs are from another event with the exact same error;
>> PUT XXX/part-r-00025-ae3d5235-932f-4b7d-ae55-b159d1c1343d.gz.parquet HTTP/1.1
>> Host: XXX.s3-ap-southeast-2.amazonaws.com
>> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
>> X-Amz-Date: 20161104T005749Z
>> x-amz-metadata-directive: REPLACE
>> Connection: close
>> User-Agent: aws-sdk-java/1.10.11 Linux/3.13.0-100-generic OpenJDK_64-Bit_Server_VM/25.91-b14/1.8.0_91 com.amazonaws.services.s3.transfer.TransferManager/1.10.11
>> x-amz-server-side-encryption-aws-kms-key-id: 5f88a222-715c-4a46-a64c-9323d2d9418c
>> x-amz-server-side-encryption: aws:kms
>> x-amz-copy-source: /XXX/_temporary/0/task_201611040057_0001_m_000025/part-r-00025-ae3d5235-932f-4b7d-ae55-b159d1c1343d.gz.parquet
>> Accept-Ranges: bytes
>> Authorization: AWS4-HMAC-SHA256 Credential=AKIAJZCSOJPB5VX2B6NA/20161104/ap-southeast-2/s3/aws4_request, SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-server-side-encryption-aws-kms-key-id, Signature=48e5fe2f9e771dc07a9c98c7fd98972a99b53bfad3b653151f2fcba67cff2f8d
>> ETag: 31436915380783143f00299ca6c09253
>> Content-Type: application/octet-stream
>> Content-Length: 0
DEBUG wire: << "HTTP/1.1 403 Forbidden[\r][\n]"
DEBUG wire: << "x-amz-request-id: 849F990DDC1F3684[\r][\n]"
DEBUG wire: << "x-amz-id-2: 6y16TuQeV7CDrXs5s7eHwhrpa1Ymf5zX3IrSuogAqz9N+UN2XdYGL2FCmveqKM2jpGiaek5rUkM=[\r][\n]"
DEBUG wire: << "Content-Type: application/xml[\r][\n]"
DEBUG wire: << "Transfer-Encoding: chunked[\r][\n]"
DEBUG wire: << "Date: Fri, 04 Nov 2016 00:57:48 GMT[\r][\n]"
DEBUG wire: << "Server: AmazonS3[\r][\n]"
DEBUG wire: << "Connection: close[\r][\n]"
DEBUG wire: << "[\r][\n]"
DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 403 Forbidden
<< HTTP/1.1 403 Forbidden
<< x-amz-request-id: 849F990DDC1F3684
<< x-amz-id-2: 6y16TuQeV7CDrXs5s7eHwhrpa1Ymf5zX3IrSuogAqz9N+UN2XdYGL2FCmveqKM2jpGiaek5rUkM=
<< Content-Type: application/xml
<< Transfer-Encoding: chunked
<< Date: Fri, 04 Nov 2016 00:57:48 GMT
<< Server: AmazonS3
<< Connection: close
DEBUG requestId: x-amzn-RequestId: not available
I experienced exactly the same problem and found a solution with the help of this article (other resources are pointing in the same direction). After setting these configuration options, writing to S3 succeeded:
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
spark.speculation false
I am using Spark 2.1.1 with Hadoop 2.7. My final spark-submit command looked like this:
spark-submit \
--packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.3 \
--conf spark.hadoop.fs.s3a.endpoint=s3.eu-central-1.amazonaws.com \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.executor.extraJavaOptions=-Dcom.amazonaws.services.s3.enableV4=true \
--conf spark.driver.extraJavaOptions=-Dcom.amazonaws.services.s3.enableV4=true \
--conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 \
--conf spark.speculation=false \
...
Additionally, I defined these environment variables:
AWS_ACCESS_KEY_ID=****
AWS_SECRET_ACCESS_KEY=****
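The same two options can also be set from code when the session is built; a minimal sketch, equivalent to the two --conf flags above:

# sketch: set the committer algorithm and disable speculation when building the session
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
    .config("spark.speculation", "false")
    .getOrCreate()
)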
I had the same issue and resolved it by upgrading from aws-java-sdk:1.7.4 to aws-java-sdk:1.11.199 and hadoop-aws:2.7.7 to hadoop-aws:3.0.0.
However, to avoid dependency mismatches when interacting with AWS, I had to rebuild Spark and provide it with my own version of Hadoop 3.0.0.
I speculate that the root cause is the way the v4 signature algorithm takes in the current timestamp: all Spark executors use the same signature to authenticate their PUT requests, but if one slips outside the 'window' of time allowed by the algorithm, that request, and all further requests, fail, causing Spark to roll back the changes and error out. This explains why calling .coalesce(1) or .repartition(1) always works, while the failure rate climbs in proportion to the number of partitions being written.
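If that theory holds, the single-partition workaround mentioned above looks like this (a sketch reusing the names from the question, at the cost of writing everything through one task):

# sketch: collapse to a single partition so only one signed upload sequence is in flight
dataframe.coalesce(1).write.csv(
    path=out_file_path,
    mode='overwrite',
    compression='gzip',
)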
What do you mean "s3a" dies? I'm curious about that. If you have stack traces, file them on the Apache JIRA server, project HADOOP, component fs/s3.
s3n doesn't support the v4 API. It's not a matter of endpoint, but of the new signature mechanism. It's not going to have its jets3t library upgraded except for security reasons, so stop trying to work with it.
One problem that Spark is going to have with S3, irrespective of driver, is that it's an eventually consistent object store, where renames take O(bytes) to complete and the delayed consistency between PUT and LIST can break the commit. More succinctly: Spark assumes that after you write something to a filesystem, if you do an ls of the parent directory, you find the thing you just wrote. S3 doesn't offer that, hence the term "eventual consistency". Now, in HADOOP-13786 we are trying to do better, and in HADOOP-13345 to see whether we can use Amazon DynamoDB for a faster, consistent view of the world. But you will have to pay the DynamoDB premium for that feature.
Finally, everything currently known about s3a troubleshooting, including possible causes of 403 errors, is online. Hopefully it'll help, and if there's another cause you identify, patches are welcome.
I'm trying to create an AWS S3 bucket using libcurl, as follows:
Location endpoint:
curl_easy_setopt(curl, CURLOPT_URL, "http://s3-us-west-2.amazonaws.com/");
Assembled HTTP request headers:
PUT / HTTP/1.1
Date:Fri, 18 Apr 2014 19:01:15 GMT
x-amz-content-sha256:ce35ff89b32ad0b67e4638f40e1c31838b170bbfee9ed72597d92bda6d8d9620
host:tempviv.s3-us-west-2.amazonaws.com
x-amz-acl:private
content-type:text/plain
Authorization: AWS4-HMAC-SHA256 Credential=AKIAISN2EXAMPLE/20140418/us-west-2/s3/aws4_request, SignedHeaders=date;x-amz-content-sha256;host;x-amz-acl;content-type, Signature=e9868d1a3038d461ff3cfca5aa29fb5e4a4c9aa3764e7ff04d0c689d61e6f164
Content-Length: 163
The body contains the bucket configuration:
<CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><LocationConstraint>us-west-2</LocationConstraint></CreateBucketConfiguration>
I get the following exception back:
MalformedXML: The XML you provided was not well-formed or did not validate against our published schema
I've been able to carry out the same operation through the AWS CLI.
Things I've also tried:
1) In the XML, used \ to escape the quotes (i.e., xmlns=\"http:.../\").
2) Not providing a CreateBucketConfiguration at all (although the S3 documentation suggests this is not allowed when sending the request to a location endpoint).
3) A GET Service call to the same endpoint lists all the provisioned buckets correctly.
Please do let me know if there is anything else I might be missing here.
OK, the problem was that I was not transferring the entire XML across, as revealed by a Wireshark trace. Once I fixed that, the problem went away.
Btw, escaping the quotes with a \ works, but &quot; does not.
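For anyone hitting the same thing, a quick sanity check (sketched in Python here, although the original code uses libcurl) is to build the body separately and compare its byte length with the Content-Length you send:

# sketch: make sure the Content-Length header matches what is actually put on the wire
body = (
    '<CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    '<LocationConstraint>us-west-2</LocationConstraint>'
    '</CreateBucketConfiguration>'
)
print(len(body.encode("utf-8")))  # should equal the Content-Length set on the PUT request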