rdkafka consumer query about partition size - c++

Let's say I don't have access to the set of producers that commits on the partition of interest, but just have control over a bunch of C++ consumers.
Since I'm running benchmarks over a complex program, I'd like to know the spread between the offset my consumers are fetching and the total offset stored in the partition.
e.g., >> reading message #1234 of 5678 total in partition 0 of topic foo
I misunderstood the purpose of RdKafka::Consumer->outq_len() and RdKafka::Topic->OFFSET_END, because they seem always equal to 0 and -1, respectively.
How can I acquire the 5678 value of my example?

You need to subscribe to librdkafka's statistics to get an updated view of your consumer's lag.
Register an Event callback class and regularily call poll() on your handle, check for EVENT_STATS and then parse the corresponding JSON message and look for lo_offset, hi_offset and consumer_lag.

Related

Power Query Excel. Read and store variable from worksheet

I'm sorry for this poor title but can't formulate correct title at the time.
What is the process:
Get message id aka offset from a sheet.
Get list of updates from telegram based in this offset.
Store last message id which will be used as offset in p.1
Process updates.
Offset allows you to fetch updates only received after this offset.
E.g i get 10 messages 1st time and last message id aka offset is 100500. Before 2nd run bot received additional 10 message so there's 20 in total. To not load all of 20 messages (10 of which i already processed) i need to specify offset from 1st run, so API will return only last 10 messages.
PQ is run in Excel.
let
// Offset is number read from table in Excel. Let's say it's 10.
Offset = 10
// Returns list of messages as JSON objects.
Updates = GetUpdates(Offset),
// This update_id will be used as offset in next query run.
LastMessageId = List.Last(Updates)[update_id],
// Map Process function to each item in the list of update JSON objects.
Map = List.Transform(Updates, each Process(_))
in
Map
The issue is that i need to store/read this offset number each time query is executed AND process updates.
Because of lazy evaluation based on code below i can either output LastMessageId from the query or output result of a Map function.
The question is: how can i do both things store/load LastMessageId from Updates and process those updates.
Thank you.

TopologyTestDriver with streaming groupByKey.windowedBy.reduce not working like kafka server [duplicate]

I'm trying to play with Kafka Stream to aggregate some attribute of People.
I have a kafka stream test like this :
new ConsumerRecordFactory[Array[Byte], Character]("input", new ByteArraySerializer(), new CharacterSerializer())
var i = 0
while (i != 5) {
testDriver.pipeInput(
factory.create("input",
Character(123,12), 15*10000L))
i+=1;
}
val output = testDriver.readOutput....
I'm trying to group the value by key like this :
streamBuilder.stream[Array[Byte], Character](inputKafkaTopic)
.filter((key, _) => key == null )
.mapValues(character=> PersonInfos(character.id, character.id2, character.age) // case class
.groupBy((_, value) => CharacterInfos(value.id, value.id2) // case class)
.count().toStream.print(Printed.toSysOut[CharacterInfos, Long])
When i'm running the code, I got this :
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 1
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 2
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 3
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 4
[KTABLE-TOSTREAM-0000000012]: CharacterInfos(123,12), 5
Why i'm getting 5 rows instead of just one line with CharacterInfos and the count ?
Doesn't groupBy just change the key ?
If you use the TopologyTestDriver caching is effectively disabled and thus, every input record will always produce an output record. This is by design, because caching implies non-deterministic behavior what makes itsvery hard to write an actual unit test.
If you deploy the code in a real application, the behavior will be different and caching will reduce the output load -- which intermediate results you will get, is not defined (ie, non-deterministic); compare Michael Noll's answer.
For your unit test, it should actually not really matter, and you can either test for all output records (ie, all intermediate results), or put all output records into a key-value Map and only test for the last emitted record per key (if you don't care about the intermediate results) in the test.
Furthermore, you could use suppress() operator to get fine grained control over what output messages you get. suppress()—in contrast to caching—is fully deterministic and thus writing a unit test works well. However, note that suppress() is event-time driven, and thus, if you stop sending new records, time does not advance and suppress() does not emit data. For unit testing, this is important to consider, because you might need to send some additional "dummy" data to trigger the output you actually want to test for. For more details on suppress() check out this blog post: https://www.confluent.io/blog/kafka-streams-take-on-watermarks-and-triggers
Update: I didn't spot the line in the example code that refers to the TopologyTestDriver in Kafka Streams. My answer below is for the 'normal' KStreams application behavior, whereas the TopologyTestDriver behaves differently. See the answer by Matthias J. Sax for the latter.
This is expected behavior. Somewhat simplified, Kafka Streams emits by default a new output record as soon as a new input record was received.
When you are aggregating (here: counting) the input data, then the aggregation result will be updated (and thus a new output record produced) as soon as new input was received for the aggregation.
input record 1 ---> new output record with count=1
input record 2 ---> new output record with count=2
...
input record 5 ---> new output record with count=5
What to do about it: You can reduce the number of 'intermediate' outputs through configuring the size of the so-called record caches as well as the setting of the commit.interval.ms parameter. See Memory Management. However, how much reduction you will be seeing depends not only on these settings but also on the characteristics of your input data, and because of that the extent of the reduction may also vary over time (think: could be 90% in the first hour of data, 76% in the second hour of data, etc.). That is, the reduction process is deterministic but from the resulting reduction amount is difficult to predict from the outside.
Note: When doing windowed aggregations (like windowed counts) you can also use the Suppress() API so that the number of intermediate updates is not only reduced, but there will only ever be a single output per window. However, in your use case/code you the aggregation is not windowed, so cannot use the Suppress API.
To help you understand why the setup is this way: You must keep in mind that a streaming system generally operates on unbounded streams of data, which means the system doesn't know 'when it has received all the input data'. So even the term 'intermediate outputs' is actually misleading: at the time the second input record was received, for example, the system believes that the result of the (non-windowed) aggregation is '2' -- its the correct result to the best of its knowledge at this point in time. It cannot predict whether (or when) another input record might arrive.
For windowed aggregations (where Suppress is supported) this is a bit easier, because the window size defines a boundary for the input data of a given window. Here, the Suppress() API allows you to make a trade-off decision between better latency but with multiple outputs per window (default behavior, Suppress disabled) and longer latency but you'll get only a single output per window (Suppress enabled). In the latter case, if you have 1h windows, you will not see any output for a given window until 1h later, so to speak. For some use cases this is acceptable, for others it is not.

Watermark trigger in Onyx does not fire

I have an Onyx stream of segments that are messages with a timestamp (coming in in chronological order). Say, they look like this:
{:id 1 :timestamp "2018-09-04 13:15:42" :msg "Hello, World!"}
{:id 2 :timestamp "2018-09-04 21:32:03" :msg "Lorem ipsum"}
{:id 3 :timestamp "2018-09-05 03:01:52" :msg "Dolor sit amet"}
{:id 4 :timestamp "2018-09-05 09:28:16" :msg "Consetetur sadipscing"}
{:id 5 :timestamp "2018-09-05 12:45:33" :msg "Elitr sed diam"}
{:id 6 :timestamp "2018-09-06 08:14:29" :msg "Nonumy eirmod"}
...
For each time window (of one day) in the data, I want to run a computation on the set of all its segments. I.e., in the example, I would want to operate on the segments with ids 1 and 2 (for Sept 4th), next on the ids 3, 4 and 5 (for Sept 5th), and so on.
Onyx offers windows and triggers, and they should do what I want out of the box. If I use a window of :window/type :fixed and aggregate over :window/range [1 :day] with respect to :window/window-key :timestamp, I will aggregate all segments of each day.
To only trigger my computations when all segments of a day have arrived, Onyx offers the trigger behaviour :onyx.triggers/watermark. According to the documentation, it should fire
if the value of :window/window-key in the segment exceeds the upper-bound in the extent of an active window
However, the trigger does not fire, even though I can see that later segments are already coming in and several windows should be full. As a sanity check, I tried a simple :onyx.triggers/segment trigger, which worked as expected.
My failed attempt at creating a minimal example:
I modified the fixed windows toy job to test watermark triggering, and it worked there.
However, I found out that in this toy job, the reason the watermark trigger is firing might be:
Did it close the input channel? Maybe the job just completed which can trigger the watermark too.
Another aspect that interacts with watermark triggering is the distributed work on tasks by peers.
The comments to issue #839 (:trigger/emit not working with :onyx.triggers/watermark) in the Onyx repo pointed me to issue #840 (Watermark doesn't work with Kafka topic having > 1 partition), where I found this clue (emphasis mine):
The problem is that all of your data is ending up on one partition, and the watermarks always takes the minimum watermark over all of the input peers (and if using the native kafka watermarks, the minimum watermark for a given peer).
As you call g/send with small amounts of data, and auto partition assignment, all of your data is ending up on one partition, meaning that the other partition's peer continues emitting a watermark of 0.
I found out that:
It’s impossible to use it with the current watermark trigger, which relies on the input source. You could try to pull the previous watermark implementation [...]
In my task graph, however, the segments I want to aggregate in windows, are only created in some intermediate task, they don't originate from the input task as such. The input segments only provide information how to create/retrieve the content of the segments to that intermediate task.
Again, this constructs works fine in above mentioned toy job. The reason is that the input channel is closed at some point, which ends the job, which in turn triggers the watermark. So my toy example is actually not a good model, because it is not an open-ended stream.
If a job does get the segments in question from an actual input source, but without timestamps, Onyx seems to provide room to specify a assign-watermark-fn, which is an optional attribute of an input task. That function sets the watermark on each arrival of a new segment. In my case, this does not help, since the segments do not originate from an input task.
I came up with a work-around myself now. The documentation basically gives a clue how that can be done:
This is a shortcut function for a punctuation trigger that fires when any piece of data has a time-based window key that is above another extent, effectively declaring that no more data for earlier windows will be arriving.
So I changed the task that emits the segments so that for every segment there will be emitted another "sentinel" like segment as well:
[{:id 1 :timestamp "2018-09-04 13:15:42" :msg "Hello, World!"}
{:timestamp "2018-09-03 13:15:42" :over :out}]
Note that the :timestamp is predated by the window range (here, 1 day). So it will be sent to the previous window. Since my data comes in chronologically, a :punctuation trigger can tell from the presence of a "sentinel" segment (with keyword :over) that the window can be closed. Don't forget to evict (i.e., :trigger/post-evictor [:all]) and throw away the "sentinel" segment from the final window. Adding :onyx/max-peers 1 in the task map makes sure that a sentinel always arrives eventually, especially when using grouping.
Note that two assumptions go into this work-around:
The data comes in chronological
There are no windows without segments

Akka Streams `groupBy` capacity change on SubFlow completion?

When using groupBy in a stream flow definition with some max capacity of n:
source.groupBy(Int.MaxValue, _.key).to(Sink.actorRef)
If I hook up the subflows that result to say, an Actor sink, and purposefully cause the subflows to terminate on some message, will that free up the groupBy capacity? Will it go from n to n-1 back to n if a subflow is ended by the sink? Is this a viable way to set up a dynamic-ish graph?
Regarding how groupBy works in general: yes, the maxSubstreams capacity is dynamic, i.e. it represents the maximum number of active substreams.
The GroupBy stage keeps a reference of each subflow in its internal state, and this is removed whenever that specific subflow completes.
With regards to your specific example, I don't think there is a way to make sure that "a subflow is ended by the sink". This is because by using to(Sink.actorRef) after a groupBy all the flows are going to feed one single actor.

avoiding collisions when collapsing infinity lock-free buffer to circular-buffer

I'm solving two feeds arbitrate problem of FAST protocol.
Please don't worry if you not familar with it, my question is pretty general actually. But i'm adding problem description for those who interested (you can skip it).
Data in all UDP Feeds are disseminated in two identical feeds (A and B) on two different multicast IPs. It is strongly recommended that client receive and process both feeds because of possible UDP packet loss. Processing two identical feeds allows one to statistically decrease the probability of packet loss.
It is not specified in what particular feed (A or B) the message appears for the first time. To arbitrate these feeds one should use the message sequence number found in Preamble or in tag 34-MsgSeqNum. Utilization of the Preamble allows one to determine message sequence number without decoding of FAST message.
Processing messages from feeds A and B should be performed using the following algorithm:
Listen feeds A and B
Process messages according to their sequence numbers.
Ignore a message if one with the same sequence number was already processed before.
If the gap in sequence number appears, this indicates packet loss in both feeds (A and B). Client should initiate one of the Recovery process. But first of all client should wait a reasonable time, perhaps the lost packet will come a bit later due to packet reordering. UDP protocol can’t guarantee the delivery of packets in a sequence.
// tcp recover algorithm further
I wrote such very simple class. It preallocates all required classes and then first thread that receive particular seqNum can process it. Another thread will drop it later:
class MsgQueue
{
public:
MsgQueue();
~MsgQueue(void);
bool Lock(uint32_t msgSeqNum);
Msg& Get(uint32_t msgSeqNum);
void Commit(uint32_t msgSeqNum);
private:
void Process();
static const int QUEUE_LENGTH = 1000000;
// 0 - available for use; 1 - processing; 2 - ready
std::atomic<uint16_t> status[QUEUE_LENGTH];
Msg updates[QUEUE_LENGTH];
};
Implementation:
MsgQueue::MsgQueue()
{
memset(status, 0, sizeof(status));
}
MsgQueue::~MsgQueue(void)
{
}
// For the same msgSeqNum should return true to only one thread
bool MsgQueue::Lock(uint32_t msgSeqNum)
{
uint16_t expected = 0;
return status[msgSeqNum].compare_exchange_strong(expected, 1);
}
void MsgQueue::Commit(uint32_t msgSeqNum)
{
status[msgSeqNum] = 2;
Process();
}
// this method probably should be combined with "Lock" but please ignore! :)
Msg& MsgQueue::Get(uint32_t msgSeqNum)
{
return updates[msgSeqNum];
}
void MsgQueue::Process()
{
// ready packets must be processed,
}
Usage:
if (!msgQueue.Lock(seq)) {
return;
}
Msg msg = msgQueue.Get(seq);
msg.Ticker = "HP"
msg.Bid = 100;
msg.Offer = 101;
msgQueue.Commit(seq);
This works fine if we assume that QUEUE_LENGTH is infinity. Because in this case one msgSeqNum = one updates array item.
But I have to make buffer circular because it is not possible to store entire history (many millions of packets) and there are no reason to do so. Actually I need to buffer enough packets to reconstruct the session, and once session is reconstructed i can drop them.
But having circular buffer significantly complicates algorithm. For example assume that we have circular buffer of length 1000. And at the same time we try to process seqNum = 10 000 and seqNum = 11 000 (this is VERY unlikely but still possible). Both these packets will map to the array updates at index 0 and so collision occur. In such case buffer should 'drop' old packets and process new packets.
It's trivial to implement what I want using locks but writing lock-free code on circular-buffer that used from different threads is really complicated. So I welcome any suggestions and advice how to do that. Thanks!
I don't believe you can use a ring buffer. A hashed index can be used in the status[] array. Ie, hash = seq % 1000. The issue is that the sequence number is dictated by the network and you have no control over it's ordering. You wish to lock based on this sequence number. Your array doesn't need to be infinite, just the range of the sequence number; but that is probably larger than practical.
I am not sure what is happening when the sequence number is locked. Does this mean another thread is processing it? If so, you must maintain a sub-list for hash collisions to resolve the particular sequence number.
You may also consider an array size as a power of 2. For example, 1024 will allow hash = seq & 1023; which should be quite efficient.