Fluentd - Azure Event Hubs input plugin - azure-eventhub

Some of the Azure services we're using (e.g. APIM, Function Apps, etc) are sending logs to Azure Event Hubs.
How do I consume these logs in Fluentd?
From the Fluentd plugins page, I can't see any input plugin specifically for Azure Event Hubs. There is, however, a Kafka plugin that might work - I'm not sure.
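If the Kafka route is viable, I imagine the input config would look something like the sketch below, pointing fluent-plugin-kafka at the Event Hubs Kafka-compatible endpoint (available on Standard tier and above). This is untested - the namespace, hub name, and connection string are placeholders, and the parameter names are my best reading of the fluent-plugin-kafka docs:

<source>
  @type kafka_group
  brokers <namespace>.servicebus.windows.net:9093
  topics <event-hub-name>
  consumer_group fluentd
  username $ConnectionString
  password <event-hub-connection-string>
  ssl_ca_certs_from_system true
  format json
</source>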
There is also an Azure Event Hubs output plugin - see here - but I'm looking for an input plugin.
Logstash (an alternative log-forwarding solution) has an Azure Event Hubs input plugin - see here - but we're looking to use Fluentd for a few other reasons.
Has anyone done this before?

Related

Consume Azure Event Hub from Ruby

Is there a way of consuming events from Event Hubs from Ruby?
If not, how would one connect a Ruby application that needs to consume events published into Event Hubs?
e.g.: using SendGrid and a webhook? Are there more streamlined options?

Can KAFKA producer read log files?

Log files of my application keep accumulating on a server. I want to dump them into HDFS through Kafka. I want the Kafka producer to read the log files, send them to the Kafka broker, and then move those files to another folder. Can the Kafka producer read log files? Also, is it possible to have the copying logic in the Kafka producer?
Kafka maintains feeds of messages in categories called topics.
We'll call processes that publish messages to a Kafka topic producers.
We'll call processes that subscribe to topics and process the feed of published messages consumers.
Kafka is run as a cluster comprised of one or more servers, each of which is called a broker.
So, at a high level, producers send messages over the network to the Kafka cluster, which in turn serves them up to consumers.
So this is not directly suitable for your application, where you want to ingest log files. Instead you can try Flume.
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.
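For this specific use case (log files accumulating in a directory, destined for HDFS), a Flume agent with a spooling-directory source is the usual fit. A rough sketch of the agent configuration - the agent name, directory paths, and HDFS URL are placeholders, and note that the spooling-directory source expects files to be complete (no longer being written to) when they land in the spool directory:

agent.sources = applogs
agent.channels = memch
agent.sinks = tohdfs

# Watch a directory for finished log files
agent.sources.applogs.type = spooldir
agent.sources.applogs.spoolDir = /var/log/myapp/spool
agent.sources.applogs.channels = memch

agent.channels.memch.type = memory

# Write the events out to HDFS
agent.sinks.tohdfs.type = hdfs
agent.sinks.tohdfs.hdfs.path = hdfs://namenode:8020/logs/myapp
agent.sinks.tohdfs.channel = memch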
As you know, Apache Kafka is a publish-subscribe messaging system. You can send messages from your application using the Kafka clients or the Kafka REST API.
In short, your application can read the log files itself and send those log lines to Kafka topics.
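As a rough illustration, a minimal Java producer that reads a log file line by line and sends each line to a topic - the broker address, topic name, and file path are placeholders:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.stream.Stream;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogFileProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Send each line of the log file as one message to the "app-logs" topic
        try (Producer<String, String> producer = new KafkaProducer<>(props);
             Stream<String> lines = Files.lines(Paths.get("/var/log/myapp/app.log"))) {
            lines.forEach(line -> producer.send(new ProducerRecord<>("app-logs", line)));
        }

        // Moving the finished file to another folder afterwards is plain file I/O
        // (e.g. Files.move(...)), so the copying logic can live in the same program.
    }
}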
To handle these logs, you can use Apache Storm. You can find many integrated solutions for this purpose, and by using Storm you can add whatever logic you need to your stream processing.
You can read a lot of useful, detailed information about Storm-Kafka integration.
Also, to put your processed logs into HDFS, you can easily integrate Storm with Hadoop. You can check this repo for it.
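To make the Storm part concrete, here is a sketch of a topology that reads the log topic with the Kafka spout and writes the lines to HDFS with the storm-hdfs bolt. The class and package names follow the Storm 1.x storm-kafka and storm-hdfs modules and are from memory, so check them against the versions you use; hosts, paths, and names are placeholders:

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.hdfs.bolt.HdfsBolt;
import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.spout.SchemeAsMultiScheme;
import org.apache.storm.topology.TopologyBuilder;

public class LogsToHdfsTopology {
    public static void main(String[] args) throws Exception {
        // Consume the "app-logs" topic; the Zookeeper address and paths are placeholders
        SpoutConfig spoutConfig = new SpoutConfig(new ZkHosts("zk1:2181"), "app-logs", "/kafka-logs", "log-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        // Write each tuple to HDFS, rotating files at 64 MB and syncing every 1000 tuples
        HdfsBolt hdfsBolt = new HdfsBolt()
                .withFsUrl("hdfs://namenode:8020")
                .withFileNameFormat(new DefaultFileNameFormat().withPath("/logs/"))
                .withRecordFormat(new DelimitedRecordFormat())
                .withRotationPolicy(new FileSizeRotationPolicy(64.0f, Units.MB))
                .withSyncPolicy(new CountSyncPolicy(1000));

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("hdfs-bolt", hdfsBolt, 1).shuffleGrouping("kafka-spout");

        StormSubmitter.submitTopology("logs-to-hdfs", new Config(), builder.createTopology());
    }
}

Any extra processing logic would go into additional bolts between the spout and the HDFS bolt.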
Kafka was developed to support high-volume event streams such as real-time log aggregation. From the Kafka documentation:
Many people use Kafka as a replacement for a log aggregation solution. Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing. Kafka abstracts away the details of files and gives a cleaner abstraction of log or event data as a stream of messages. This allows for lower-latency processing and easier support for multiple data sources and distributed data consumption
I also got this little piece of information from this nice article, which is quite similar to your use case:
Today, Kafka has been used in production at LinkedIn for a number of projects. There are both offline and online usage. In the offline case, we use Kafka to feed all activity events to our data warehouse and Hadoop, from which we then run various batch analysis

How to configure OSB to consume messages from Amazon SQS

I'm a newbie to AWS and am trying to work with SQS for the first time. I have an Oracle Service Bus (OSB) in a non-cloud environment and would like to configure OSB to consume messages from Amazon SQS. The documentation mentions using the REST API and polling repeatedly for messages. I also read about the 'client library for JMS', so that OSB could treat SQS as a JMS provider. What is the best approach to achieve this? I'd appreciate your input.
The easiest (though not necessarily the purest) way would be to create a Java EE app that imports the SQS libraries, pulls messages from AWS, and puts them on a local queue for OSB to process. The example code snippets are in Java, so it should be relatively straightforward.
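A rough sketch of that polling loop with the AWS SDK for Java - the queue URL is a placeholder, and the hand-off to the local queue that OSB reads is left as a stub:

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class SqsToLocalQueue {
    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"; // placeholder

        while (true) {
            // Long polling (up to 20s) cuts down on empty responses
            ReceiveMessageRequest request = new ReceiveMessageRequest(queueUrl)
                    .withMaxNumberOfMessages(10)
                    .withWaitTimeSeconds(20);

            for (Message message : sqs.receiveMessage(request).getMessages()) {
                forwardToLocalQueue(message.getBody());                  // hand off to the queue OSB reads
                sqs.deleteMessage(queueUrl, message.getReceiptHandle()); // only delete once forwarded
            }
        }
    }

    private static void forwardToLocalQueue(String body) {
        // JMS publishing to the local queue omitted for brevity
    }
}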
The purest way would be to set it up as a remote JMS provider. However, how to set that up is not so clear - you may end up writing most of the code that went into option #1 above, but as a JMS client library instead of an MDB.
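For completeness, the standalone JMS side of that option, using the amazon-sqs-java-messaging-lib, looks roughly like the sketch below (the queue name is a placeholder); wiring it into OSB as a foreign JMS provider is the part that remains unclear:

import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import com.amazon.sqs.javamessaging.ProviderConfiguration;
import com.amazon.sqs.javamessaging.SQSConnection;
import com.amazon.sqs.javamessaging.SQSConnectionFactory;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;

public class SqsJmsConsumer {
    public static void main(String[] args) throws Exception {
        // SQS exposed through the JMS API via the Amazon SQS Java Messaging Library
        SQSConnectionFactory factory =
                new SQSConnectionFactory(new ProviderConfiguration(), AmazonSQSClientBuilder.defaultClient());

        SQSConnection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("my-queue")); // placeholder queue name
        connection.start();

        Message message = consumer.receive(10000); // wait up to 10 seconds
        if (message != null) {
            // process the payload, then acknowledge so SQS deletes it
            message.acknowledge();
        }
        connection.close();
    }
}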

Can BAM and CEP monitor requests from client like Zipkin

I am wondering if I can use BAM and CEP to monitor requests from clients, and even find the bottleneck of the service.
I found Zipkin, a project that can do this, but my application is based on WSO2 and I don't want to bring in another project from scratch.
Yes, you can use BAM/CEP for this. If you need real-time monitoring you can use CEP, and you can use BAM for batch processing. From BAM 2.4.0 onwards, CEP features have also been added inside BAM, so you can use BAM to do real-time analytics.
What type of services are involved in your scenario? Depending on this, you can use an already existing data publisher or write a new data publisher for BAM/CEP to publish your request details. For example, if you have a chain of Axis2 web service calls for a request from a client, and you want to monitor where the bottleneck is or where the most time is consumed, then you can use service statistics publishing and monitor the average time taken to process a message, which will help you see where the actual delay is being introduced. For this you can use the existing service statistics publisher feature. BAM also allows you to create your own dashboard for visualization, so you can customize the dashboard.
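If you do end up writing your own data publisher, its shape is roughly as below. This is from memory of the BAM 2.x samples, so verify the class names, stream definition format, and method signatures against the documentation; the receiver URL, credentials, and stream schema are placeholders:

import org.wso2.carbon.databridge.agent.thrift.DataPublisher;

public class RequestStatsPublisher {
    public static void main(String[] args) throws Exception {
        // Connect to the BAM thrift data receiver
        DataPublisher dataPublisher = new DataPublisher("tcp://bam.example.com:7611", "admin", "admin");

        // Define a stream for per-request timing data (schema is illustrative only)
        String streamId = dataPublisher.defineStream(
                "{ 'name':'request.stats', 'version':'1.0.0'," +
                "  'metaData':[{'name':'host','type':'STRING'}]," +
                "  'payloadData':[{'name':'service','type':'STRING'}," +
                "                 {'name':'responseTime','type':'LONG'}] }");

        // Publish one event: meta data, correlation data (none), payload data
        dataPublisher.publish(streamId, new Object[]{"node-1"}, null, new Object[]{"OrderService", 128L});

        dataPublisher.stop();
    }
}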
Also, with BAM 2.4.0 we have introduced a notifications feature, with which you can define a threshold value and configure a notification to be sent if that threshold is crossed.

Configuring WSO2 AS and BAM

I am using BAM 2.2.0. I configured the service statistics toolbox inside the BAM server based on the documentation and ran the example program "service-stats"; it is working fine.
But when I configure BAM inside my AS server using the service data publisher configuration and test the BAM server, the connection is established. However, when I click the dashboard in BAM, it shows an empty page with a message about how to configure AS.
Currently, among the scripts available in BAM I can see "service-stats-271". How can I visualize my AS service statistics in BAM? The two servers are running on two different machines.
Seeing "service-stats-271" is natural after installing the toolbox. But first you should see whether data has arrived to Cassandra from AS. For that you can login to Cassandra Explorer in BAM Management Console and see whether there is your column family existing in the EVENT_KS keyspace. If yes see whether there are any rows relevant to your events. If yes see whether the data has arrived to the H2 datbase during the execution of the Hive query.
If your data is not being published correctly to Cassandra, please read [1] again. Sometimes the following instruction mentioned there gets missed.
Go to <WSO2 Application Server home>/repository/conf/etc, open the bam.xml file, and enable ServiceDataPublishing as follows:
<BamConfig>
    <ServiceDataPublishing>enable</ServiceDataPublishing>
</BamConfig>
[1] http://docs.wso2.org/wiki/display/BAM220/Setting+up+Service+Statistics+Data+Agent