Kinesis agent not sending .log files through firehose

I've set up a Kinesis Firehose delivery stream and installed the Kinesis agent as described in the AWS docs. I can get test data through to the S3 bucket, but the Kinesis agent won't send any .log files through. I suspect a problem connecting the agent to the Firehose.
My /etc/aws-kinesis/agent.json file is below. I've also tried the "firehose.endpoint" without the https://, but I still can't get any data through.
I've verified that the aws-kinesis-agent service is running.
I'm not using the kinesis.endpoint/kinesisStream, but I've left the flow in the agent.json file. Could this be a problem?
What am I missing?
{
    "cloudwatch.emitMetrics": true,
    "kinesis.endpoint": "",
    "firehose.endpoint": "https://firehose.us-west-2.amazonaws.com",
    "flows": [
        {
            "filePattern": "/home/ec2-user/src/Fake-Apache-Log-Generator/*.log*",
            "kinesisStream": "yourkinesisstream",
            "partitionKeyOption": "RANDOM"
        },
        {
            "filePattern": "/home/ec2-user/src/Fake-Apache-Log-Generator/*.log*",
            "deliveryStream": "apachelogfilesdeliverystream"
        }
    ]
}
EDIT:
The log file at /var/log/aws-kinesis-agent/aws-kinesis-agent.log showed 0 records being parsed. The log message led me to this post, and I made the recommended fixes. In addition, I had to remove the kinesis flow from the /etc/aws-kinesis/agent.json file to avoid an exception that showed up in the log files.
Bottom line: the aws-kinesis-agent can't read files from /home/ec2-user/ or its subdirectories, and you have to fix up the agent.json file accordingly.

The Kinesis agent is not able to read logs from files under /home/ec2-user/ (e.g. /home/ec2-user/<any-file>) due to a permissions issue. Try changing your log location to /tmp/logs/<log-file>.

Add the Kinesis agent user (aws-kinesis-agent-user) to the sudo group:
sudo usermod -aG sudo aws-kinesis-agent-user
Another possibility is a data-flow issue; see this answer: https://stackoverflow.com/a/64610780/5697992
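If you want to rule out the delivery stream itself, you can push a record to it directly and check whether it lands in S3. A minimal boto3 sketch, using the delivery stream name and region from the agent.json above:

import boto3

# Region and stream name are taken from the agent.json in the question.
firehose = boto3.client("firehose", region_name="us-west-2")

# Send one test record straight to the delivery stream, bypassing the agent.
# If this arrives in S3 but the agent's records don't, the problem is on the
# agent side (e.g. file permissions), not the delivery stream.
firehose.put_record(
    DeliveryStreamName="apachelogfilesdeliverystream",
    Record={"Data": b"test record from boto3\n"},
)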

Related

How to redirect multiple ECS log streams into a single log stream in CloudWatch

I currently have my application running in ECS. I have enabled the awslogs log driver, specifying the log group and the region. Everything works great: logs are sent to the log group and a log stream is created. However, every time I restart the container, it creates a new log stream.
Is there a way for all the logs to go into a single log stream instead of a new one being created each time the container restarts?
I've been looking for a solution for a long time and I haven't found anything.
For example, instead of ending up with 2 log streams after a restart, there would only ever be 1.
The simplest way is to use the PutLogEvents API directly. Beyond that you can get as fancy as you want; you could use a FireLens sidecar container in your task to handle all events with a logging API that writes directly to CloudWatch.
For example, you can do this in Python with the boto3 CloudWatch Logs put_log_events call:
import time
import boto3

response = boto3.client("logs").put_log_events(
    logGroupName="your-log-group",
    logStreamName="your-log-stream",
    logEvents=[
        # timestamp is milliseconds since the Unix epoch
        {"timestamp": int(time.time() * 1000), "message": "log message"},
    ],
)
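To keep everything in one stream across restarts, one option is to always write to a fixed stream name and create it only if it doesn't exist yet. A minimal sketch (the group and stream names are placeholders; older setups may also need to pass the sequenceToken returned by the previous put_log_events call):

import time
import boto3

logs = boto3.client("logs")
LOG_GROUP = "your-log-group"        # placeholder
LOG_STREAM = "your-fixed-stream"    # placeholder: same name on every restart

# Create the stream only if it is missing, then keep appending to it.
try:
    logs.create_log_stream(logGroupName=LOG_GROUP, logStreamName=LOG_STREAM)
except logs.exceptions.ResourceAlreadyExistsException:
    pass

logs.put_log_events(
    logGroupName=LOG_GROUP,
    logStreamName=LOG_STREAM,
    logEvents=[
        {"timestamp": int(time.time() * 1000), "message": "container restarted"},
    ],
)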

How to get the stream URL and key from AWS MediaLive?

I need to design a solution where I take a Zoom live stream as input and save chunks of 10 seconds' duration in an S3 bucket. I need them in the bucket so I can run AWS Transcribe on them.
For live streaming to a custom client, Zoom takes a stream URL and a stream key. I first tried AWS IVS for streaming. IVS gives a stream URL and key, which I supplied to Zoom, but I didn't find a way to intercept the stream and store audio chunks in S3.
Next I found MediaLive, which seemed promising as it takes an input source and an output destination. I set the input type to RTMP (push), but I am not getting a stream URL or stream key that I can send to Zoom.
How can I get this stream URL and key? Or am I approaching it all wrong? Any help is appreciated.
Thanks for your message. The RTMP details belong to the MediaLive Input you defined, independent of whichever Channel the Input might be attached to. Have a look at the Inputs section in your Console.
Alternatively, you can run a command like this from the AWS CLI or a CloudShell prompt:
aws medialive describe-input --input-id 1493101
{
    "Arn": "arn:aws:medialive:us-west-2:123456123456:input:1493107",
    "AttachedChannels": [],
    "Destinations": [
        {
            "Ip": "44.222.111.85",
            "Port": "1935",
            "Url": "rtmp://44.222.111.85:1935/live/1"
        }
    ],
    "Id": "1493107",
    "InputClass": "SINGLE_PIPELINE",
    "InputDevices": [],
    "InputPartnerIds": [],
    "InputSourceType": "STATIC",
    "MediaConnectFlows": [],
    "Name": "RTMP-push-6",
    "SecurityGroups": [
        "313985"
    ],
    "Sources": [],
    "State": "DETACHED",
    "Tags": {},
    "Type": "RTMP_PUSH"
}
The two parameters after the ":1935/" in the URL are the App name and Instance name. They should be unique and not blank. You can use simple values as per my example. The stream key can be left blank on your transmitting device.
You can test the connectivity into the MediaLive Channel using an alternate source of RTMP to confirm the cloud side is listening correctly. There are various phone apps that will push RTMP; ffmpeg also works.
I suggest adding a VOD source to your MediaLive channel as its first source, in order to confirm the channel starts correctly and produces a short bit of good output to your intended destinations. All the metrics and alarms should be healthy. When that works as intended, switch to your intended RTMP input.
You can monitor network-in bytes and input video frame rate metrics from AWS CloudWatch. Channel event logs will also be logged to CloudWatch if you enable the Channel logging option on your MediaLive channel (recommended).
I hope this helps!
Input security groups can be created from the AWS MediaLive Console, via the AWS CloudShell CLI, or with your local aws-cli, using a command of the form 'aws medialive create-input-security-group'. Add the 'help' parameter for details on the syntax.
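If you'd rather script these steps, here is a rough boto3 equivalent (the input ID is the one from the CLI example above, and the wide-open CIDR is only for testing; narrow it to your encoder's address range):

import boto3

medialive = boto3.client("medialive", region_name="us-west-2")

# Create an input security group listing the addresses allowed to push RTMP to the input.
security_group = medialive.create_input_security_group(
    WhitelistRules=[{"Cidr": "0.0.0.0/0"}]  # testing only; restrict this in practice
)

# Look up the RTMP push URL(s) for an existing input.
response = medialive.describe_input(InputId="1493101")
for destination in response["Destinations"]:
    print(destination["Url"])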

How to disable JSON format and send only the log message to Cloudwatch with Fluentbit?

I am trying to set up FireLens for my Fargate tasks. I would like to send logs to multiple destinations: CloudWatch and Elasticsearch.
But for CloudWatch only, I want to disable the JSON wrapping and send just the log message as it is.
I have the configuration below for the CloudWatch output.
[OUTPUT]
    Name cloudwatch
    Match *
    auto_create_group true
    log_group_name /aws/ecs/containerinsights/$(ecs_cluster)/application
    log_stream_name $(ecs_task_id)
    region eu-west-1
Currently the logs arrive like this:
{
    "container_id": "1234567890",
    "container_name": "app",
    "log": "2021/08/10 18:42:49 [notice] 1#1: exit",
    "source": "stderr"
}
I want only the line
2021/08/10 18:42:49 [notice] 1#1: exit
in CloudWatch.
I had a similar issue using just CloudWatch where everything was wrapped in JSON - I imagine it'll be the same when using several targets.
The solution was to add the following to the output section:
log_key log
This tells Fluent Bit to only include the data in the log key when sending to CloudWatch.
The docs have since been updated to include that line by default in this PR.
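Applied to the output section from the question, the result would look roughly like this (only the last line is new):

[OUTPUT]
    Name cloudwatch
    Match *
    auto_create_group true
    log_group_name /aws/ecs/containerinsights/$(ecs_cluster)/application
    log_stream_name $(ecs_task_id)
    region eu-west-1
    log_key log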

Custom metrics (Disk-space-utilization) is missing on the AWS console

I want to create a CloudWatch alarm for disk space utilization.
I've followed the AWS doc below:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/mon-scripts.html
The cron job is set up on my instance, and I've checked my system log as well:
Sep 22 12:20:01 ip-#### CRON[13921]: (ubuntu) CMD
(~/volumeAlarm/aws-scripts-mon/mon-put-instance-data.pl
--disk-space-util --disk-space-avail --disk-space-used --disk-path=/ --from-cron)
Sep 22 12:20:13 ip-#### CRON[13920]: (ubuntu) MAIL (mailed 1 byte of output; but got status 0x004b, #012)
Also, manually running the command
./mon-put-instance-data.pl --disk-space-util --disk-space-avail --disk-space-used --disk-path=/
shows the result:
print() on closed filehandle MDATA at CloudWatchClient.pm line 167.
Successfully reported metrics to CloudWatch. Reference Id:####
But there are no metrics in the AWS console, so I can't set the alarm.
Please help if someone has solved this problem.
The CloudWatch scripts get the instance's metadata and write it to a local file, /var/tmp/aws-mon/instance-id. If that file or folder has incorrect permissions and the script cannot write to /var/tmp/aws-mon/instance-id, it may throw an error like "print() on closed filehandle MDATA at CloudWatchClient.pm line 167". A possible scenario: the root user ran the mon-get-instance-stats.pl or mon-put-instance-data.pl scripts initially and the scripts generated the file/folder in place; then the root user switched back to a different user and ran the CloudWatch scripts again, and this error shows up. To fix this, remove the folder /var/tmp/aws-mon/ and re-execute the CloudWatch scripts to regenerate the folder and files.
This is the answer I got from AWS support when I had the same issue; maybe it will help you too. Also check your AWSAccessKey for the EC2 instance.
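To double-check whether the data points are actually reaching CloudWatch, you can list the metric with boto3. This is just a quick sketch; it assumes the monitoring scripts' default System/Linux namespace and DiskSpaceUtilization metric name, and you'll need to set your instance's region:

import boto3

# Region is an assumption; namespace/metric name are the monitoring scripts' defaults.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

response = cloudwatch.list_metrics(
    Namespace="System/Linux",
    MetricName="DiskSpaceUtilization",
)
for metric in response["Metrics"]:
    print(metric["Dimensions"])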

Cloudwatch logs - No Event Data after time elapses

I've looked on the AWS forums and elsewhere but haven't found a solution. I have a Lambda function that, when invoked, creates a log stream which populates with log events. After about 12 hours or so, the log stream is still present, but when I open it, instead of my events I see a message with a link.
The link explains how to start sending event data, but I already have this set up, and I am sending event data; it just disappears after a certain time period.
I'm guessing there is some setting somewhere (either for max storage allowed or for whether logs get purged), but if there is, I haven't found it.
Another reason for missing data in the log stream might be a corrupted agent-state file. First, check your logs:
vim /var/log/awslogs.log
If you find something like "Caught exception: An error occurred (InvalidSequenceTokenException) when calling the PutLogEvents operation: The given sequenceToken is invalid. The next expected sequenceToken is:", you can regenerate the agent-state file as follows:
sudo rm /var/lib/awslogs/agent-state
sudo service awslogs stop
sudo service awslogs start
TL;DR: Just use the CLI. See Update 2 below.
This is really bizarre but I can replicate it...
I un-checked the "Expire Events After" box and, lo and behold, I was able to open older log streams. What seems really odd is that if I choose to display the "Stored Bytes" data, many of them are listed at 0 bytes even though they have log events.
Update 1:
This solution no longer works, as I can only view the log events in the first two log streams. What's more, the Stored Bytes column now displays different (and more accurate) data.
This leads me to believe that AWS made some kind of update.
Update 2:
Just use the CLI. I've verified that I can retrieve log events from the CLI that I cannot retrieve via the web console.
First install the CLI (if you haven't already), then use the following command (be sure to escape special characters such as /, [, $, etc.):
aws logs get-log-events --log-group-name NAME-OF-LOGGROUP --log-stream-name LOG-STREAM-NAME
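If you prefer to script it, a minimal boto3 sketch of the same call (group and stream names are placeholders, and no shell escaping is needed here):

import boto3

logs = boto3.client("logs")

# Read from the newest end of the stream; page with nextBackwardToken if you need older events.
response = logs.get_log_events(
    logGroupName="NAME-OF-LOGGROUP",
    logStreamName="LOG-STREAM-NAME",
    startFromHead=False,
)
for event in response["events"]:
    print(event["timestamp"], event["message"])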