aws cloudwatch log for elasticbeanstalk - amazon-web-services

From the webapp, the end user will specify a start time and an end time for fetching logs (either as a .zip file or just displayed in a new tab). I want to use CloudWatch for logging from Elastic Beanstalk. What are the available Java APIs for doing this, e.g. enabling CloudWatch logs in Elastic Beanstalk and creating log streams?

Why do you want to use the Java API? You can follow the steps below to install and configure the CloudWatch Logs agent in an EB environment.
Add a CloudWatch Logs policy to your Elastic Beanstalk EC2 instance role.
Write a config file in .ebextensions to install and configure the CloudWatch Logs agent on the EB instances.
Example config for installing and configuring the agent:
packages:
  yum:
    awslogs: []
container_commands:
  01_get_awslogs_conf_file:
    command: "cp .ebextensions/awslogs.conf /etc/awslogs/awslogs.conf"
  03_restart_awslogs:
    command: "sudo service awslogs restart"
  04_start_awslogs_at_system_boot:
    command: "sudo chkconfig awslogs on"
Your awslogs.conf should be available in the .ebextensions directory.
Example awslogs.conf file:
[general]
state_file = value
logging_config_file = value
use_gzip_http_content_encoding = [true | false]
[logstream1]
log_group_name = value
log_stream_name = value
datetime_format = value
time_zone = [LOCAL|UTC]
file = value
file_fingerprint_lines = integer | integer-integer
multi_line_start_pattern = regex | {datetime_format}
initial_position = [start_of_file | end_of_file]
encoding = [ascii|utf_8|..]
buffer_duration = integer
batch_count = integer
batch_size = integer
If you are not seeing logs in the CloudWatch Logs console, check the agent log on your server; the agent's default log path is /var/log/awslogs.log.
Hope this helps you set up CloudWatch Logs on EB.
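Coming back to the Java part of the question: once the agent is shipping logs, the webapp's fetch-by-time-range requirement can be served straight from the CloudWatch Logs API. Below is a minimal sketch using the AWS SDK for Java v1; the log group/stream names and the one-hour time window are placeholder assumptions, and the create calls are only needed if the group/stream do not already exist.

import com.amazonaws.services.logs.AWSLogs;
import com.amazonaws.services.logs.AWSLogsClientBuilder;
import com.amazonaws.services.logs.model.CreateLogGroupRequest;
import com.amazonaws.services.logs.model.CreateLogStreamRequest;
import com.amazonaws.services.logs.model.GetLogEventsRequest;
import com.amazonaws.services.logs.model.GetLogEventsResult;
import com.amazonaws.services.logs.model.OutputLogEvent;

public class FetchBeanstalkLogs {
    public static void main(String[] args) {
        AWSLogs logs = AWSLogsClientBuilder.defaultClient();

        // Placeholder names; only needed if the agent has not created them already.
        logs.createLogGroup(new CreateLogGroupRequest("my-eb-app-log-group"));
        logs.createLogStream(new CreateLogStreamRequest("my-eb-app-log-group", "my-log-stream"));

        // Fetch events between the start/end time chosen by the end user (epoch milliseconds).
        long endMillis = System.currentTimeMillis();
        long startMillis = endMillis - 3600_000L; // last hour, as an example
        GetLogEventsResult result = logs.getLogEvents(new GetLogEventsRequest()
                .withLogGroupName("my-eb-app-log-group")
                .withLogStreamName("my-log-stream")
                .withStartTime(startMillis)
                .withEndTime(endMillis)
                .withStartFromHead(true));

        // For longer ranges, keep paging with result.getNextForwardToken().
        for (OutputLogEvent event : result.getEvents()) {
            System.out.println(event.getTimestamp() + " " + event.getMessage());
        }
    }
}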

Related

Airflow 1.10.10 remote S3 logs

I am trying to enable remote Airflow logs; to do that I followed these steps:
apache-airflow install:
pip install apache-airflow[crypto,postgres,ssh,s3,log]==1.10.10
airflow.cfg file:
remote_logging = True
remote_base_log_folder = s3://bucket-name/logs
encrypt_s3_logs = False
logging_level = INFO
fab_logging_level = WARN
remote_log_conn_id = MyS3Conn
I have Airflow running in a Docker container in an AWS ECS blue/green deployment.
I read that if Airflow is hosted on an EC2 server, you should create the connection leaving everything blank in the configuration apart from the connection type, which should stay as S3.
The S3Hook will default to boto, and boto will default to the role of the EC2 server you are running Airflow on. Assuming this role has rights to S3, your task will be able to access the bucket.
So I applied this, but I don't know whether it works as intended when using Docker.
If I run a DAG I see the logs that are created in the /usr/local/airflow/logs folder in the Docker container, but there are no new files in the specified folder in S3.
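Since the open question is whether the container's default credential chain behaves the same way under ECS as it does on plain EC2, one quick check is to ask which identity the chain resolves to and whether that identity can write under the remote log prefix. A sketch of that check with the AWS SDK for Java v1 is below; the bucket name matches the remote_base_log_folder above, the test key is arbitrary, and the same probe could equally be done with boto3 inside the Airflow container.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.GetCallerIdentityRequest;

public class CheckTaskRoleS3Access {
    public static void main(String[] args) {
        // Which identity does the default credential chain resolve to inside the container?
        // On ECS this should be the task role (or the instance role if no task role is set).
        AWSSecurityTokenService sts = AWSSecurityTokenServiceClientBuilder.defaultClient();
        System.out.println(sts.getCallerIdentity(new GetCallerIdentityRequest()).getArn());

        // Can that identity write under the remote log prefix? "bucket-name" is the bucket
        // from remote_base_log_folder; "logs/connectivity-test.txt" is a throwaway test object.
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        s3.putObject("bucket-name", "logs/connectivity-test.txt", "ok");
    }
}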

How to access the EC2 instance metadata service from outside EC2?

With the below configuration for AWS CloudWatch Logs:
awslogs.conf
[/var/log/messages]
datetime_format = %b %d %H:%M:%S
file = /var/log/messages
buffer_duration = 2500
log_group_name = /var/log/messages
log_stream_name = {cluster}{instance_id}
and the below script is used with the --user-data option of the aws ec2 run-instances command:
userdata.sh
# The awslogs.conf file above is copied to /etc/awslogs/awslogs.conf on the AWS EC2 instance.
# Configure the CloudWatch Logs config file:
cat > /etc/cloudwatch-logs.ini <<EOF
[/var/log/messages]
datetime_format = %b %d %H:%M:%S
file = /var/log/messages
buffer_duration = 2500
log_stream_name = {cluster}{instance_id}
initial_position = start_of_file
log_group_name = /var/log/messages
EOF
An EC2 instance is launched from this script (running outside EC2):
spin_up_ec2.sh
# Using the AWS CLI, we spin up the EC2 instance with userdata.sh.
# Using the metadata service, how do we read the values behind the {cluster} and {instance_id} syntax shown above?
aws logs describe-log-streams --log-group-name /var/log/messages --log-stream-name-prefix <grab_cluster_name_value><grab_instance_id_value> --region us-east-1
spin_up_ec2.sh sits outside EC2, within the same VPC but in a different subnet, so I am not sure how to use the EC2 metadata service.
The EC2 instance is running in a private subnet.
The {cluster} value would be something like clust1.
The {instance_id} value would be something like i-1a52627268bc.
1) How can a shell script (spin_up_ec2.sh) client talk to the EC2 metadata service to retrieve the values of {cluster} and {instance_id}?
2) Does launching the EC2 instance in a public subnet help with talking to the metadata service?
The Amazon EC2 instance metadata is not available outside of an instance.
You could make API calls to AWS services to obtain similar information (e.g. retrieve the subnet in which an EC2 instance is located, or its instance ID), as in the sketch below.
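For example, the launching script (or application) can get the instance ID from the EC2 API and then query the log streams the same way the aws logs command above does. A sketch with the AWS SDK for Java v1; the tag filter used to find the instance and the hard-coded cluster name are assumptions for illustration (the instance ID is also returned directly by the RunInstances call that launched it).

import java.util.Arrays;

import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.DescribeInstancesRequest;
import com.amazonaws.services.ec2.model.Filter;
import com.amazonaws.services.ec2.model.Instance;
import com.amazonaws.services.ec2.model.Reservation;
import com.amazonaws.services.logs.AWSLogs;
import com.amazonaws.services.logs.AWSLogsClientBuilder;
import com.amazonaws.services.logs.model.DescribeLogStreamsRequest;
import com.amazonaws.services.logs.model.LogStream;

public class FindLogStreams {
    public static void main(String[] args) {
        // Find the launched instance by a Name tag; the tag value is an assumed convention.
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();
        DescribeInstancesRequest req = new DescribeInstancesRequest()
                .withFilters(new Filter("tag:Name", Arrays.asList("clust1-node")));

        AWSLogs logs = AWSLogsClientBuilder.defaultClient();
        for (Reservation reservation : ec2.describeInstances(req).getReservations()) {
            for (Instance instance : reservation.getInstances()) {
                String instanceId = instance.getInstanceId(); // e.g. i-1a52627268bc
                String cluster = "clust1";                    // placeholder: the cluster name you passed in user data

                // Equivalent of: aws logs describe-log-streams --log-group-name /var/log/messages
                //                --log-stream-name-prefix <cluster><instance_id>
                for (LogStream stream : logs.describeLogStreams(new DescribeLogStreamsRequest()
                        .withLogGroupName("/var/log/messages")
                        .withLogStreamNamePrefix(cluster + instanceId)).getLogStreams()) {
                    System.out.println(stream.getLogStreamName());
                }
            }
        }
    }
}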

AWS: Elastic Beanstalk and Auto Scaling (When Out of Memory)

I am using Elastic Beanstalk for my microservices architecture. I would like to set up a load balancer that can spin up another instance once memory is 100% exhausted, but I am unable to see any memory metrics.
Another question: how long does it take for a new instance to be spun up?
Let me know if there are other ways to solve this problem.
When you set up your Beanstalk environment you can choose it to be load balanced or single instance (you can also change this at any time). In the configuration of the environment, you can choose scale-up and scale-down metrics; however, you won't see any memory-related metrics in the list.
You can set up a CloudWatch alarm to trigger the scale-up and scale-down events, but as mentioned by @Stefan you will actually need to publish memory-related metrics yourself, as they don't exist by default.
You can do this with a config file in your .ebextensions folder within your deployment package which writes to CloudWatch every n minutes.
You have to give your Beanstalk IAM role permission to write to CloudWatch using an inline policy.
You will then start to see memory metrics in CloudWatch and can therefore use them to scale up and down; a sketch of wiring an alarm to those metrics programmatically follows the config below.
The YAML for the .config file will look like this (you may not want all of these metrics):
packages:
  yum:
    perl-DateTime: []
    perl-Sys-Syslog: []
    perl-LWP-Protocol-https: []
    perl-Switch: []
    perl-URI: []
    perl-Bundle-LWP: []
sources:
  /opt/cloudwatch: https://aws-cloudwatch.s3.amazonaws.com/downloads/CloudWatchMonitoringScripts-1.2.1.zip
container_commands:
  01-setupcron:
    command: |
      echo '*/5 * * * * root perl /opt/cloudwatch/aws-scripts-mon/mon-put-instance-data.pl `{"Fn::GetOptionSetting" : { "OptionName" : "CloudWatchMetrics", "DefaultValue" : "--mem-util --disk-space-util --disk-path=/" }}` >> /var/log/cwpump.log 2>&1' > /etc/cron.d/cwpump
  02-changeperm:
    command: chmod 644 /etc/cron.d/cwpump
  03-changeperm:
    command: chmod u+x /opt/cloudwatch/aws-scripts-mon/mon-put-instance-data.pl
option_settings:
  "aws:autoscaling:launchconfiguration":
    IamInstanceProfile: "aws-elasticbeanstalk-ec2-role"
  "aws:elasticbeanstalk:customoption":
    CloudWatchMetrics: "--mem-util --mem-used --mem-avail --disk-space-util --disk-space-used --disk-space-avail --disk-path=/ --auto-scaling"
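Once the monitoring script is publishing, the scale-up trigger itself can also be created from code. A hedged sketch with the AWS SDK for Java v1, alarming on the MemoryUtilization metric the monitoring scripts publish under the System/Linux namespace; the alarm name, Auto Scaling group name, threshold, and scaling policy ARN are all placeholders for your own environment.

import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.ComparisonOperator;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.PutMetricAlarmRequest;
import com.amazonaws.services.cloudwatch.model.Statistic;

public class MemoryScaleUpAlarm {
    public static void main(String[] args) {
        AmazonCloudWatch cloudWatch = AmazonCloudWatchClientBuilder.defaultClient();

        cloudWatch.putMetricAlarm(new PutMetricAlarmRequest()
                .withAlarmName("eb-memory-high")               // placeholder alarm name
                .withNamespace("System/Linux")                 // namespace used by mon-put-instance-data.pl
                .withMetricName("MemoryUtilization")
                .withDimensions(new Dimension()
                        .withName("AutoScalingGroupName")
                        .withValue("my-eb-asg-name"))          // placeholder: your environment's ASG
                .withStatistic(Statistic.Average)
                .withPeriod(300)
                .withEvaluationPeriods(2)
                .withThreshold(80.0)
                .withComparisonOperator(ComparisonOperator.GreaterThanOrEqualToThreshold)
                // Placeholder ARN of the scale-up policy to invoke when the alarm fires.
                .withAlarmActions("arn:aws:autoscaling:region:account-id:scalingPolicy:policy-id"));
    }
}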
Links:
Troubleshooting CPU and Memory Issues in Beanstalk
How to monitor memory on Beanstalk
How to add permissions to IAM role
Example .config file which includes setting up CloudWatch triggers

How to make CloudWatch logs agent running properly?

What I'm trying to do is monitor a log file through the CloudWatch Logs agent.
I have installed the CloudWatch Logs agent on my EC2 Linux instance (the EC2 instance has an instance profile and IAM role attached).
The installation was successful, but when I run sudo service awslogs status
I get the status message dead but pid file exists.
In my error log file (/var/log/awslogs.log) I have only this line, which repeats over and over again: 'AccessKeyId'.
How can I fix the CloudWatch Logs agent and make it work?
This means that your AWS Logs agent requires your AWS access key/secret. This can be provided in /etc/awslogs/awscli.conf in the following format:
[plugins]
cwlogs = cwlogs
[default]
region = YOUR_INSTANCE_REGION (e.g. us-east-1)
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
Restart the service after making this change:
sudo service awslogs restart
Hope this helps!!!

How to add connectors to presto on Amazon EMR

I've set up a small EMR cluster with Hive/Presto installed, I want to query files on S3 and import them to Postgres on RDS.
To run queries on S3 and save the results in a table in postgres I've done the following:
Started a 3 node EMR cluster from the AWS console.
Manually SSH into the Master node to create an EXTERNAL table in hive, looking at an S3 bucket.
Manually SSH into each of the 3 nodes and add a new catalog file:
/etc/presto/conf.dist/catalog/postgres.properties
with the following contents
connector.name=postgresql
connection-url=jdbc:postgresql://ip-to-postgres:5432/database
connection-user=<user>
connection-password=<pass>
and edited this file
/etc/presto/conf.dist/config.properties
adding
datasources=postgresql,hive
Restart presto by running the following manually on all 3 nodes
sudo restart presto-server
This setup seems to work well.
In my application, there are multiple databases created dynamically. It seems that those configuration/catalog changes need to be made for each database, and the server needs to be restarted to see the new config changes.
Is there a proper way for my application (using boto or other methods) to update configurations by:
Adding a new catalog file on all nodes in /etc/presto/conf.dist/catalog/ for each new database
Adding a new entry on all nodes in /etc/presto/conf.dist/config.properties
Gracefully restarting Presto across the whole cluster (ideally when it becomes idle, but that's not a major concern)?
I believe you can run a simple bash script to achieve what you want. There is no other way except creating a new cluster with the --configurations parameter, where you provide the desired configurations. You can run the script below from the master node.
#!/bin/bash
# Write "cluster_nodes.txt" with the private IP address of each node.
aws emr list-instances --cluster-id <cluster-id> --instance-states RUNNING | grep PrivateIpAddress | sed 's/"PrivateIpAddress"://' | sed 's/[",]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' > cluster_nodes.txt
# For each IP, connect with ssh and configure.
while IFS='' read -r line || [[ -n "$line" ]]; do
  echo "Connecting $line"
  scp -i <PEM file> postgres.properties hadoop@$line:/tmp
  ssh -i <PEM file> hadoop@$line "sudo mv /tmp/postgres.properties /etc/presto/conf/catalog; sudo chown presto:presto /etc/presto/conf/catalog/postgres.properties; sudo chmod 644 /etc/presto/conf/catalog/postgres.properties; sudo restart presto-server"
done < cluster_nodes.txt
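If the application drives this from code rather than the shell, the node IPs the script loops over can also be fetched with the SDK instead of parsing CLI output. A sketch with the AWS SDK for Java v1; the cluster ID is a placeholder, and the scp/ssh step itself would still be done with whatever SSH mechanism the application already uses.

import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.Instance;
import com.amazonaws.services.elasticmapreduce.model.InstanceState;
import com.amazonaws.services.elasticmapreduce.model.ListInstancesRequest;

public class ListPrestoNodes {
    public static void main(String[] args) {
        AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.defaultClient();

        // Equivalent of: aws emr list-instances --cluster-id <cluster-id> --instance-states RUNNING
        for (Instance instance : emr.listInstances(new ListInstancesRequest()
                .withClusterId("j-XXXXXXXXXXXXX")          // placeholder cluster ID
                .withInstanceStates(InstanceState.RUNNING))
                .getInstances()) {
            System.out.println(instance.getPrivateIpAddress());
        }
    }
}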
During provisioning of your cluster:
You can provide the configuration details at the time of provisioning.
Refer to Presto Connector Configuration for how to add this automatically during the provisioning of your cluster.
You can provide the configuration via the management console, or you can use the AWS CLI to pass those configurations as follows:
#!/bin/bash
JSON=`cat <<JSON
[
  {
    "Classification": "presto-connector-postgresql",
    "Properties": {
      "connection-url": "jdbc:postgresql://ip-to-postgres:5432/database",
      "connection-user": "<user>",
      "connection-password": "<password>"
    },
    "Configurations": []
  }
]
JSON`
aws emr create-cluster --configurations "$JSON" # ... rest of params
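The same presto-connector-postgresql classification can also be supplied when the cluster is created from application code instead of the CLI. A sketch with the AWS SDK for Java v1; everything apart from the Configuration block (cluster name, release label, instance types and counts, roles) is a placeholder to be replaced with your own cluster settings.

import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.Application;
import com.amazonaws.services.elasticmapreduce.model.Configuration;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

public class CreateClusterWithPrestoCatalog {
    public static void main(String[] args) {
        // Same properties as the --configurations JSON above.
        Map<String, String> postgresProps = new HashMap<>();
        postgresProps.put("connection-url", "jdbc:postgresql://ip-to-postgres:5432/database");
        postgresProps.put("connection-user", "<user>");
        postgresProps.put("connection-password", "<password>");

        RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("presto-cluster")                       // placeholder values below
                .withReleaseLabel("emr-5.x.x")
                .withApplications(new Application().withName("Hive"),
                                  new Application().withName("Presto"))
                .withServiceRole("EMR_DefaultRole")
                .withJobFlowRole("EMR_EC2_DefaultRole")
                .withInstances(new JobFlowInstancesConfig()
                        .withInstanceCount(3)
                        .withMasterInstanceType("m5.xlarge")
                        .withSlaveInstanceType("m5.xlarge")
                        .withKeepJobFlowAliveWhenNoSteps(true))
                .withConfigurations(new Configuration()
                        .withClassification("presto-connector-postgresql")
                        .withProperties(postgresProps));

        RunJobFlowResult result = AmazonElasticMapReduceClientBuilder.defaultClient().runJobFlow(request);
        System.out.println("Started cluster " + result.getJobFlowId());
    }
}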