AWS Systems Manager Event for Hybrid Activation

I am using AWS Systems Manager to manage some on-premises [Hybrid Activation] servers. When a server is registered with SSM and joins the fleet, I want some actions to be taken (tags added, SSM documents run, etc.). But I cannot find any documentation on events generated by SSM when this occurs. I am unsure whether "Association State Change" fits the bill or not.

Currently AWS Systems Manager does not generate any events for this occurrence. Hopefully they will add it in the future.
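In the meantime, a workaround is to poll SSM yourself for newly registered managed instances and react to them. Below is a minimal sketch of that idea using boto3; the polling loop, the tag key/value, and keeping already-seen instance IDs in memory are assumptions for illustration, not an official SSM feature.

import time
import boto3

ssm = boto3.client("ssm")
seen = set()  # instance IDs already processed (assumption: kept in memory for this sketch)

def tag_new_hybrid_instances():
    paginator = ssm.get_paginator("describe_instance_information")
    for page in paginator.paginate():
        for info in page["InstanceInformationList"]:
            instance_id = info["InstanceId"]  # hybrid-activated instances look like "mi-..."
            if instance_id.startswith("mi-") and instance_id not in seen:
                # Tag the newly registered managed instance; tag key/value are placeholders
                ssm.add_tags_to_resource(
                    ResourceType="ManagedInstance",
                    ResourceId=instance_id,
                    Tags=[{"Key": "Fleet", "Value": "on-prem"}],
                )
                seen.add(instance_id)

while True:
    tag_new_hybrid_instances()
    time.sleep(60)  # poll every minute (assumption)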

Related

How to design a server monitoring system running on AWS

I am building some form of a monitoring agent application that runs on AWS EC2 machines.
I need to be able to send commands to the agent running on a specific EC2 instance, and only the agent running on that instance should pick them up and act on them. New EC2 instances can come and go at any point in time.
I could use Kinesis and push all commands for all instances there, and agents could pick up the ones targeted at them. The problem with this is that agents would have to receive a lot of commands that are not for them and filter them out.
I could also use one SQS queue per instance, but that would require creating/deleting a queue every time a new instance is provisioned.
I would like to hear whether there are already proven solutions for a similar scenario.
AWS already provides a fully functional feature for this. I would rather use it than reinvent the wheel, as it is a robust, well-integrated, and proven solution leveraged by thousands of AWS customers to gain operational insight into their instance fleets:
AWS Systems Manager Agent (SSM Agent) is a piece of software that can be installed and configured on an EC2 instance (and it’s pre-installed on many of the default AMIs, including both versions of Amazon Linux, Ubuntu, and various versions of Windows Server). SSM Agent makes it possible to update, manage, and configure these resources. The agent processes requests from the Systems Manager service in the AWS Cloud, and then runs them as specified in the request. SSM Agent then sends status and execution information back to the Systems Manager service by using the Amazon Message Delivery Service.
You can learn more about AWS Systems Manager and the breadth and depth of functionality it provides here.
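For the specific requirement of sending a command to exactly one instance, Systems Manager Run Command covers it. A minimal boto3 sketch, assuming the target instance already has a registered SSM Agent; the instance ID and the shell command are placeholders:

import boto3

ssm = boto3.client("ssm")

# Send a shell command to exactly one instance; only that instance's agent executes it
response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],          # placeholder instance ID
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["systemctl restart my-agent"]},  # placeholder command
)

command_id = response["Command"]["CommandId"]

# Later, check the result for that instance
result = ssm.get_command_invocation(CommandId=command_id, InstanceId="i-0123456789abcdef0")
print(result["Status"], result["StandardOutputContent"])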
Have you considered using Simple Notification Service (SNS)? Each new EC2 instance could subscribe to the topic using e.g. HTTP, and previous subscribers could be removed.
That way the topic would stay constant regardless of EC2 rotation.
It might be worth noting that SNS supports subscription filter policies, so it can decide which messages to deliver to which endpoint.
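A rough sketch of that idea with boto3, where each instance subscribes with a filter policy keyed on its own instance ID; the topic ARN, the HTTP endpoint, and the "instance_id" attribute name are assumptions for illustration:

import json
import boto3

sns = boto3.client("sns")
topic_arn = "arn:aws:sns:us-east-1:123456789012:agent-commands"  # placeholder topic

# On the instance: subscribe its local HTTP endpoint and only accept its own messages
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="http",
    Endpoint="http://203.0.113.10:8080/commands",   # placeholder agent endpoint
    Attributes={"FilterPolicy": json.dumps({"instance_id": ["i-0123456789abcdef0"]})},
)

# On the sender side: publish a command addressed to one instance via a message attribute
sns.publish(
    TopicArn=topic_arn,
    Message="restart-agent",
    MessageAttributes={
        "instance_id": {"DataType": "String", "StringValue": "i-0123456789abcdef0"}
    },
)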
In my view, AWS SWF could be an option here, since Amazon SWF coordinates work across distributed application components and provides SDKs for various platforms. Refer to the official FAQs for a more in-depth understanding: https://aws.amazon.com/swf/faqs/
It is not entirely clear what the volume of the monitoring system's messages will be.
But the architecture requirements described sound to me as follows:
The agents on the EC2 instances are (constantly?) polling some centralized service, which is a poll-based architecture.
The messages being sent are addressed to a specific, predetermined EC2 instance, which is a push-based architecture.
To support both options without significant filtering of the messages, I suggest you try an intermediate pub/sub system such as Kafka, which can be managed on AWS with MSK.
Then, to differentiate between the instances, create a Kafka topic named after the EC2 instance ID.
This gives you a unique topic per instance, so each instance knows to read its own messages from the topic denoted by its own instance ID.
You can also push producer messages to a specific EC2 instance by sending them to the topic in the cluster named after its EC2 instance ID.
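Here is a minimal sketch of that per-instance topic pattern using the kafka-python client; the MSK bootstrap servers, the topic naming scheme, and reading the instance's own ID from instance metadata are assumptions:

import requests
from kafka import KafkaConsumer, KafkaProducer

BOOTSTRAP = ["b-1.my-msk-cluster.example.amazonaws.com:9092"]  # placeholder MSK brokers

def run_agent():
    # On the instance: discover its own ID from instance metadata and consume its own topic
    instance_id = requests.get(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ).text
    consumer = KafkaConsumer(instance_id, bootstrap_servers=BOOTSTRAP)
    for record in consumer:
        print("received command:", record.value)

def send_command(instance_id, command):
    # On the sender: push a command to one specific instance by producing to its topic
    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
    producer.send(instance_id, value=command.encode())
    producer.flush()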
Since there are many EC2 instances coming and going, you will end up with many topics. To handle the volume of topics, you can have CloudWatch capture each EC2 termination event and then check which EC2 instances were terminated and, consequently, which topics need deleting.
Alternatively, you can trigger a Lambda directly on the EC2 termination event and log it by writing a file named after the instance ID to an S3 bucket, which you can watch with an additional Lambda that deletes old EC2 instance topics from the Kafka cluster when their instance IDs appear there.

What is the difference between aws system manager and aws cloudwatch?

I am kind of confused with the difference between aws system manager and aws cloudwatch?
Could someone help me to get clear with the difference?
Thank you very much.
They have different purposes.
At its core, AWS Systems Manager allows you to manage a fleet of EC2 instances as well as on-premises servers. Using it you can update hundreds of instances with just a single command, execute custom scripts on all of them, monitor their patch compliance (i.e. whether all your instances of interest have the latest updates), and so on.
AWS CloudWatch is primarily used as a central location for storing a variety of logs from your applications (e.g. Lambda execution logs), AWS services, and so on. It also allows you to monitor performance metrics of your instances (e.g. CPU utilization) as well as other resources. It can additionally respond to live events from resources (e.g. execute a Lambda whenever an instance is terminated).
In short, AWS Systems Manager is a centralized tool to automate the management of AWS resources.
Whereas AWS CloudWatch is a centralized tool for monitoring AWS resources and their logs.
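To make the distinction concrete, a small boto3 sketch: Systems Manager is used to manage instances (here, checking patch compliance), while CloudWatch is used to monitor them (here, alarming on CPU). The instance ID and alarm name are placeholders:

import boto3

ssm = boto3.client("ssm")
cloudwatch = boto3.client("cloudwatch")

# Systems Manager: manage the fleet, e.g. check patch compliance of an instance
patch_state = ssm.describe_instance_patch_states(InstanceIds=["i-0123456789abcdef0"])
print(patch_state["InstancePatchStates"])

# CloudWatch: monitor the fleet, e.g. alarm when CPU is high on the same instance
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu-i-0123456789abcdef0",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
)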
These short video resources might help -
AWS System Manager -
https://www.youtube.com/watch?v=MK4ZoCs-muo&ab_channel=AmazonWebServices
AWS Cloudwatch -
https://www.youtube.com/watch?v=a4dhoTQCyRA&ab_channel=AmazonWebServices

Audit k8s cluster events on AWS

For an EKS cluster, CloudTrail logs cluster events such as create, update, and delete. However, we are using kubeadm to provision clusters. How do we log an audit trail of these cluster events? Thanks.
CloudTrail logs API events in AWS, so I don't think you can use it for K8S events. However, you can use log shippers to send custom metrics to CloudWatch. From there you can emit events and create dashboards.
For this you have a couple of options: you can use the CloudWatch agent, an Elastic Beat, Logstash, or maybe something like Splunk if you don't want to use CloudWatch.
From the K8S documentation, there's an Audit log (possibly at /var/log/kube-audit for your cluster) which...
Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of activities that have affected the system, by individual users, administrators or other components of the system. It allows the cluster administrator to answer questions such as what happened, when it happened, and who initiated it.
You can ship/parse this log with another service.
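As one deliberately simplified example of that, the sketch below tails the audit log file and ships each line to CloudWatch Logs with boto3; the file path, log group, and log stream names are assumptions and must exist beforehand:

import time
import boto3

logs = boto3.client("logs")
LOG_GROUP = "/kubeadm/kube-audit"      # placeholder log group (create it beforehand)
LOG_STREAM = "apiserver"               # placeholder log stream (create it beforehand)

def ship_audit_log(path="/var/log/kube-audit"):
    with open(path) as f:
        f.seek(0, 2)  # start at the end of the file, like `tail -f`
        while True:
            line = f.readline()
            if not line:
                time.sleep(1)
                continue
            logs.put_log_events(
                logGroupName=LOG_GROUP,
                logStreamName=LOG_STREAM,
                logEvents=[{"timestamp": int(time.time() * 1000), "message": line.strip()}],
            )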
If you need more control over the outcome, you can write a custom Beat, based on the libbeat specification. https://github.com/elastic/beats/tree/master/libbeat
Otherwise, I think a lot of people use Filebeat: https://github.com/elastic/beats/tree/master/deploy/kubernetes
K8S also supports custom Audit Policies for further control

Kubernetes: Get mail once deployment is done

Is there a way to get a post-deployment mail in Kubernetes on GCP/AWS?
It has become harder to maintain deployments on Kubernetes as the deployment team's size grows. Having a post-deployment mail service would ease the process, as it would also say who applied the deployment.
You could try to watch deployment events using https://github.com/bitnami-labs/kubewatch and a webhook handler.
Another option would be to implement a customized solution with the Kubernetes API, for instance in Python: https://github.com/kubernetes-client/python, and then run it as a separate notification pod in your cluster (see the sketch after this answer).
A third option is to have the deployment managed in a CI/CD pipeline where the actual deployment execution step is of the "approval" type; you would see the user who approved it, and the next step in the pipeline after approval could be the email notification.
Approval in circle ci: https://circleci.com/docs/2.0/workflows/#holding-a-workflow-for-a-manual-approval
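A minimal sketch of the second option with the official Python client, watching Deployments in one namespace and mailing when all replicas report ready; the namespace, SMTP host, and addresses are placeholders, and the readiness check is deliberately naive:

import smtplib
from email.message import EmailMessage
from kubernetes import client, config, watch

config.load_incluster_config()          # use load_kube_config() outside the cluster
apps = client.AppsV1Api()

def notify(deployment_name):
    msg = EmailMessage()
    msg["Subject"] = f"Deployment {deployment_name} rolled out"
    msg["From"] = "k8s-bot@example.com"              # placeholder addresses
    msg["To"] = "team@example.com"
    with smtplib.SMTP("smtp.example.com") as smtp:   # placeholder SMTP host
        smtp.send_message(msg)

w = watch.Watch()
for event in w.stream(apps.list_namespaced_deployment, namespace="default"):
    dep = event["object"]
    status = dep.status
    # Naive "done" check: all desired replicas report ready
    if status.ready_replicas and status.ready_replicas == dep.spec.replicas:
        notify(dep.metadata.name)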
I don't think such a feature is built into Kubernetes.
There is a watch mechanism, though, which you could use. Run the following GET query:
https://<api-server-url>/apis/apps/v1/namespaces/<namespace>/deployments?watch=true
The connection will not close and you'll get a "notification" about each deployment. Check the status fields. Then you can send the mail or do something else.
You'll need to pass an authorization token to gain access to the API server. If you have kubectl set up, you can run a local proxy, which then won't need the token: kubectl proxy.
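A small Python sketch of that raw watch approach, assuming kubectl proxy is already running locally on its default port 8001 so no token is needed; the namespace is a placeholder:

import json
import requests

# kubectl proxy exposes the API server on localhost without needing a bearer token
url = "http://127.0.0.1:8001/apis/apps/v1/namespaces/default/deployments"

with requests.get(url, params={"watch": "true"}, stream=True) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        event = json.loads(line)
        dep = event["object"]
        print(event["type"], dep["metadata"]["name"], dep.get("status", {}))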
You can attach handlers to container lifecycle events. Kubernetes supports preStop and postStart hooks. Kubernetes sends the postStart event immediately after the container is started. Here is a snippet of the pod manifest:
spec:
  containers:
  - name: <******>
    image: <******>
    lifecycle:
      postStart:
        exec:
          command: [********]
Considering GCP, one option could be to create a filter in Stackdriver Logging to get the info about your deployment's completion, and with that filter you can use the CREATE METRIC option, also in Stackdriver Logging.
With the metric created, use Stackdriver Monitoring to create an alert that sends e-mails. More details are in the official documentation.
It looks like no one has mentioned the "native tool" Kubernetes provides for this yet.
Please note that there is a concept of Audit in Kubernetes.
It provides a security-relevant, chronological set of records documenting the sequence of activities that have affected the system, by individual users, administrators or other components of the system.
Each request, at each stage of its execution, generates an event, which is then pre-processed according to a certain policy and processed by a certain backend.
That allows the cluster administrator to answer the following questions:
what happened?
when did it happen?
who initiated it?
on what did it happen?
where was it observed?
from where was it initiated?
to where was it going?
An administrator can specify what events should be recorded and what data they should include with the help of an audit policy.
There are a few backends that persist audit events to an external storage.
Log backend, which writes events to a disk
Webhook backend, which sends events to an external API
Dynamic backend, which configures webhook backends through an AuditSink API object.
If you use the log backend, it is possible to collect the data with tools such as Fluentd. With that data you can achieve more than just a post-deployment mail in Kubernetes.
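Tying this back to the original question, a rough sketch that reads events from a log-backend audit file, filters successful writes to Deployments, and mails out who did it; the file path, SMTP host, and addresses are placeholders:

import json
import smtplib
from email.message import EmailMessage

AUDIT_LOG = "/var/log/kubernetes/audit.log"   # placeholder path configured on the API server

def mail(user, deployment):
    msg = EmailMessage()
    msg["Subject"] = f"{user} deployed {deployment}"
    msg["From"] = "k8s-audit@example.com"           # placeholder addresses
    msg["To"] = "team@example.com"
    with smtplib.SMTP("smtp.example.com") as smtp:  # placeholder SMTP host
        smtp.send_message(msg)

with open(AUDIT_LOG) as f:
    for line in f:
        event = json.loads(line)
        ref = event.get("objectRef", {})
        # Only completed, successful create/update/patch calls on Deployments
        if (
            event.get("stage") == "ResponseComplete"
            and event.get("verb") in ("create", "update", "patch")
            and ref.get("resource") == "deployments"
            and event.get("responseStatus", {}).get("code", 500) < 300
        ):
            mail(event["user"]["username"], ref.get("name", "<unknown>"))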
Hope that helps!

AWS X-Ray Crossacount data collection

I have an application that is distributed over two AWS accounts.
One part of the application ingests data from one account into the other account.
The producer part is realised as Python Lambda microservices.
The consumer part is a Spring Boot app in Elastic Beanstalk plus additional Python Lambdas that further distribute data to external systems after it has been processed by the Spring Boot app in Elastic Beanstalk.
I don't have an explicit X-Ray daemon running anywhere.
I am wondering if it is possible to send the X-Ray traces from one account to the other account, so I can monitor my application in one place.
I could not find any hints in the documentation regarding cross-account usage. Is this even doable?
If you are running the X-Ray daemon, you can provide a RoleARN to the daemon, so it assumes the role and sends the data it receives from the X-Ray SDK from Account 1 to Account 2.
However, if you have enabled X-Ray on API Gateway or AWS Lambda, segments generated by these services are sent to the account they run in, and it's not possible to send data cross-account for these services.
Please let me know if you have questions. If yes, include the architecture flow and solution stack you are using to better guide you.
Thanks,
Yogi
It is possible, but you'd have to run your own xray-daemon as a service.
By default, Lambda uses its own X-Ray daemon process to send traces to the account it is running in. However, the X-Ray SDK supports environment variables which can be used to point it at a custom X-Ray daemon process instead. These environment variables are applicable even if the microservice is running inside a Lambda function.
Since your Lambdas are written in Python, you can refer to this AWS doc, which talks about the environment variable. You can set its value to the address of the custom X-Ray daemon service.
AWS_XRAY_DAEMON_ADDRESS = x.x.x.x:2000
Let's say you want to send traces from Account 1 to Account 2. You can do that by configuring your daemon to assume a role. This role must be present in Account 2 (where you want to send your traces). Then use this role's ARN by passing it in the options while running your X-Ray daemon service in Account 1 (from where you want the traces to be sent). The option to use is mentioned in this AWS doc.
--role-arn arn:aws:iam::123456789012:role/xray-cross-account
Make sure you attach permissions in Account 1 as well to send traces to Account 2.
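On the Python side, a minimal sketch of pointing the X-Ray SDK at such a custom daemon; the service name and daemon address are placeholders, and xray_recorder.configure(daemon_address=...) is the programmatic counterpart of setting AWS_XRAY_DAEMON_ADDRESS:

import boto3
from aws_xray_sdk.core import xray_recorder, patch_all

# Send segments to a self-managed daemon (which in turn assumes the cross-account role)
xray_recorder.configure(
    service="producer-lambda",              # placeholder service name
    daemon_address="10.0.0.12:2000",        # placeholder address of the custom daemon
)
patch_all()  # instrument boto3/requests calls so they appear as subsegments

def handler(event, context):
    # Traced work: downstream AWS calls are captured by the patched SDK
    s3 = boto3.client("s3")
    return s3.list_buckets()["Buckets"]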
This is now possible with this recent launch: https://aws.amazon.com/about-aws/whats-new/2022/11/amazon-cloudwatch-cross-account-observability-multiple-aws-accounts/.
You can link accounts to another account to share traces, metrics, and logs.