I need to create a Kinesis stream programmatically and then start performing operations on it. CreateStream is an async operation. What is the best way to establish when the stream is ready?
Is it polling for the stream status and waiting for a response with StreamStatus ACTIVE?
You can poll for the status, but otherwise, if you have the option, you can stream to an intermediate destination such as CloudWatch, and once the stream is active it will begin consuming those events.
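Polling is the usual approach. A minimal sketch in Scala against the v1 AWS SDK; the helper name and poll interval are my own:
import com.amazonaws.services.kinesis.AmazonKinesis

// Poll DescribeStream until the stream reports ACTIVE.
// DescribeStream calls are rate-limited, so keep the interval generous.
def waitForStreamActive(kinesis: AmazonKinesis, streamName: String): Unit = {
  var status = ""
  while (status != "ACTIVE") {
    status = kinesis.describeStream(streamName).getStreamDescription.getStreamStatus
    if (status != "ACTIVE") Thread.sleep(5000)
  }
}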
I am new to Event Hubs and am trying to integrate using .NET Core. I am able to read the incoming event data successfully, but for some reason I want to re-read the data. Is that possible?
Yes - assuming that the data hasn't passed the retention period.
Events are not removed from the stream when read; they remain available for any consumer who wishes to read them until they pass the configured retention period and age out of the stream.
When connecting, Event Hub consumers request a position in the stream to start reading from, which allows them to specify any event with a known offset or sequence number. Consumers may also begin reading at a specific point in time or request to begin at the first available event.
More information can be found in the Reading Events sample for the Azure.Messaging.EventHubs package.
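The question targets .NET, but the positioning concept is the same in every Event Hubs client. A rough sketch using the Java client (com.azure:azure-messaging-eventhubs) from Scala, to match the other code in this thread; the connection string, hub name, and partition id are placeholders:
import com.azure.messaging.eventhubs.EventHubClientBuilder
import com.azure.messaging.eventhubs.models.EventPosition
import java.time.Duration

val consumer = new EventHubClientBuilder()
  .connectionString("<connection-string>", "<event-hub-name>")
  .consumerGroup(EventHubClientBuilder.DEFAULT_CONSUMER_GROUP_NAME)
  .buildConsumerClient()

// EventPosition.earliest() re-reads from the first retained event;
// fromSequenceNumber or fromEnqueuedTime re-read from a known point instead.
val events = consumer.receiveFromPartition("0", 100, EventPosition.earliest(), Duration.ofSeconds(30))
events.forEach(e => println(e.getData.getBodyAsString))
consumer.close()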
I am using AWS IVS (Interactive Video Service) for live streaming. I need a notification when the stream starts and when it ends. In Amazon EventBridge I have created a rule with IVS as the source and a queue as the target, but I am not getting messages in the queue when the stream starts or ends; polling the queue returns nothing. I think the event pattern in the EventBridge rule is wrong. Can someone help me validate the event pattern below, or explain how to get a notification when a stream starts or ends on AWS IVS?
{
  "source": [
    "aws.ivs"
  ],
  "detail": {
    "stream_status": [
      "Stream End",
      "Stream Start",
      "Session Created"
    ]
  }
}
The EventBridge sample event had a bug: the field was shown improperly as eventName when it should be event_name. If you manually specify event_name, the events will fire properly and you should be good to use this rule for your needs.
Refer to the documentation here.
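For reference, a corrected pattern would look something like this; the detail-type value is my assumption from the IVS documentation:
{
  "source": ["aws.ivs"],
  "detail-type": ["IVS Stream State Change"],
  "detail": {
    "event_name": [
      "Stream End",
      "Stream Start",
      "Session Created"
    ]
  }
}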
IMO, you have to manage it yourself; AWS does not provide any automated messages when your IVS endpoint is ingesting data.
The best solution I can think of right now is an observer pattern using WebSockets.
The dirtier implementation would be to send a message over a WebSocket whenever your data source starts streaming. This means you have to trigger it somewhere in your interface if you're using another broadcasting service.
The better way would be a service that checks your stream health and sessions regularly, notifying your clients whenever you have a live session and providing info whenever your session health is dropping.
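A minimal sketch of such a poller, assuming the AWS SDK v2 IVS client; the channel ARN is a placeholder, and pushing the result out over your WebSocket layer is left out:
import software.amazon.awssdk.services.ivs.IvsClient
import software.amazon.awssdk.services.ivs.model.{ChannelNotBroadcastingException, GetStreamRequest}

val ivs = IvsClient.create()

// GetStream returns the running stream for a channel, including state and health;
// it raises ChannelNotBroadcasting when the channel is offline.
def isLive(channelArn: String): Boolean =
  try {
    val stream = ivs.getStream(GetStreamRequest.builder().channelArn(channelArn).build()).stream()
    stream.stateAsString == "LIVE" // stream.healthAsString reports HEALTHY / STARVING / UNKNOWN
  } catch {
    case _: ChannelNotBroadcastingException => false
  }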
I have a Kinesis producer which writes a single type of message to a stream. I want to process this stream in multiple, completely different consumer applications. So, a pub/sub with a single publisher for a given topic/stream. I also want to make use of checkpointing to ensure that each consumer processes every message written to the stream.
Initially, I was using the same App Name for all consumers and producers. However, I started getting the following error once I started more than one consumer:
com.amazonaws.services.kinesis.model.InvalidArgumentException: StartingSequenceNumber 49564236296344566565977952725717230439257668853369405442 used in GetShardIterator on shard shardId-000000000000 in stream PackageCreated under account ************ is invalid because it did not come from this stream. (Service: AmazonKinesis; Status Code: 400; Error Code: InvalidArgumentException; Request ID: ..)
This seems to be because consumers are clashing with their checkpointing as they are using the same App Name.
From reading the documentation, it seems the only way to do pub/sub with checkpointing is by having a stream per consumer application, which requires each producer to know about all possible consumers. This is more tightly coupled than I want; it's really just a queue.
It seems like Kafka supports what I want: arbitrary consumption of a given topic/partition, since consumers are completely in control of their own checkpointing. Is my only option to move to Kafka, or some other alternative, if I want pub/sub with checkpointing?
My RecordProcessor code, which is identical in each consumer:
import scala.collection.JavaConverters._ // getRecords returns a java.util.List

override def processRecords(processRecordsInput: ProcessRecordsInput): Unit = {
  log.trace("Received record(s) from kinesis")
  for {
    record <- processRecordsInput.getRecords.asScala // convert so the for-comprehension compiles
    json   <- jawn.parseByteBuffer(record.getData).toOption
    msg    <- decode[T](json.toString).toOption
  } yield subscriber ! msg
  // Checkpoint after the batch so these records are not redelivered to this application name.
  processRecordsInput.getCheckpointer.checkpoint()
}
The code parses the message and sends it on to the subscriber. For now, I'm simply marking all messages as successfully received. I can see messages being sent on the AWS Kinesis dashboard, but no reads happen, presumably because each application has its own App Name and doesn't see any other messages.
The pattern you want, that of one publisher to, and multiple consumers from, one Kinesis stream, is supported. You don't need a separate stream per consumer.
How do you do that? You need to give a different application name to every consumer. That way, the checkpointing info of one consumer won't collide with that of another.
Check the first response to this: https://forums.aws.amazon.com/message.jspa?messageID=554375
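A sketch of what that looks like with the v1 KCL; the record-processor factory is whatever your application already provides, and the names are illustrative:
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.kinesis.clientlibrary.interfaces.v2.IRecordProcessorFactory
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.{KinesisClientLibConfiguration, Worker}
import java.util.UUID

// Each consumer application passes its own application name, so the KCL keeps a
// separate DynamoDB lease/checkpoint table per application, and each application
// independently reads every record in the stream.
def startConsumer(applicationName: String, factory: IRecordProcessorFactory): Worker = {
  val config = new KinesisClientLibConfiguration(
    applicationName,                        // e.g. "email-consumer" vs "billing-consumer"
    "PackageCreated",                       // the stream name from the question
    new DefaultAWSCredentialsProviderChain(),
    UUID.randomUUID().toString              // unique worker id within this application
  )
  new Worker.Builder().recordProcessorFactory(factory).config(config).build()
}
// startConsumer("email-consumer", factory).run() blocks and begins consuming.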
I know that Kinesis's typical use case is event streaming; however, we'd like to use it to broadcast some information, to have it in near real time in some apps besides making it available for further stream processing. The KCL seems to be the only viable option, as the stream API is too low level.
As far as I understand, to use the KCL we'd have to generate a random applicationId so that all apps could receive all the data, but this means creating a new DynamoDB table each time an application starts (see the sketch after this question). Of course, we can perform clean-up when an application stops, but when an application doesn't stop gracefully, its DynamoDB table would be left hanging around.
Is there a way/pattern to use Kinesis streams in a broadcast fashion?
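For concreteness, the workaround described above amounts to something like this (the name prefix is illustrative):
import java.util.UUID

// A unique application name per app instance means every instance receives the
// full stream, but each name creates a fresh DynamoDB lease table, which is
// what leaves orphaned tables behind after an unclean shutdown.
val applicationName = s"broadcast-consumer-${UUID.randomUUID()}"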
I am trying to use an AWS Kinesis stream for one of our data streams. I would like to monitor the pending messages on my stream for ops purposes (scaling downstream according to backlog), but I am unable to find any API that gives the (approximate) number of pending messages in my stream.
This seems strange, as messages expire after 7 days; if the producers and consumers are isolated and can't communicate, how do you know messages are expiring? How do you handle this problem?
Thanks!
There is no such concept as a "pending" message in Kinesis. All incoming data is placed on a shard.
Your consumer application should be running all the time to keep track of changes in your stream. The application (with the help of the KCL) will keep polling the shard iterator in the background, so you will be notified about new data when it arrives.
Roughly, you can see Kinesis as a FIFO queue whose messages disappear after a while if you don't pop them.
If your application will only process a few messages an hour, you should think about changing your architecture; Kinesis is probably not the correct tool for you.
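That said, if you want a rough backlog signal, the GetRecords response reports how far a given iterator lags the tip of the shard. A sketch with the v1 SDK; the stream and shard names are placeholders, and in practice you would start from your own checkpointed position rather than TRIM_HORIZON:
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder
import com.amazonaws.services.kinesis.model.{GetRecordsRequest, GetShardIteratorRequest}

val kinesis = AmazonKinesisClientBuilder.defaultClient()

val iterator = kinesis.getShardIterator(
  new GetShardIteratorRequest()
    .withStreamName("my-stream")
    .withShardId("shardId-000000000000")
    .withShardIteratorType("TRIM_HORIZON") // oldest retained record
).getShardIterator

// MillisBehindLatest says how far this position is behind the newest record in the shard.
val result = kinesis.getRecords(new GetRecordsRequest().withShardIterator(iterator))
println(s"Behind tip by ${result.getMillisBehindLatest} ms")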