Store logs from applications in kubernetes - amazon-web-services

What is the recommended approach to store the logs of applications deployed on Kubernetes? I read about ELK stack, but not sure about the pros and cons. Needs recommendations.

If you ask specifically about storing application logs in kubernetes cluster, there are a few different approaches. First I would recommend you to familiarize with this article in the official kubernetes documentation.

As per my experience with the Kubernetes logging, I would suggest you go with EFK stack (Fluentd/flunetbit --> Kafka --> Logstash/flunetd --> Elasticserach --> kibana), this one has initial challenges during setup but once this is up and running, it will be like a super scalable system where you don't need to worry about volume of logs you are shipping.
Another approach you can take is shipping logs directly from fluentd/fluentbit/filebeat to Elasticsearch. The drawback of this approach is if ES has some issue then you may lose your logs.
I hope it helps.

I want to emphasize the response from #javajon. There is a KataCoda exercise specifically for logging at https://katacoda.com/javajon/courses/kubernetes-observability/efk.
Logging is a very large topic with lots of variables. In order to get any specific advice, you'll need to comment about your goals for logging. Is it related to performance, compliance, security, debugging, observability or something else?

Try to get some knowledge by yourself.
Every storage have some pros and cons according to requirement we use them.
Visit https://medium.com/volterra-io/kubernetes-storage-performance-comparison-9e993cb27271
and learn more.
I will surely somehow help.

Related

Deploy hyperledger on AWS - production setup

My company is currently evaluating hyperledger(fabric) and we're using it for our POC. It looks very promising and we're targeting rolling out to production in next few months.
We're targeting AWS as our production environment.
However, we're struggling to find good tutorial/practices/recommendations about operating hyperledger network in such environment.
I'm aware that Cello is aiming to solve/ease deploying/monitoring hyperledger network but i also read that its not production ready yet. Question is, should we even consider looking at Cello at this point?
If not, what are our alternatives? Docker swarm, kubernetes?
I also didn't find information about recommended instance types. I understand this is application and AWS specific but what are the minimal system requirements
(memory&CPU&network) for example for 'peer' node (our application is not network intensive, nor a lot of transactions will be submitted per hour/day, only few of them per day).
Another question is where to create those instances on AWS from geographical&decentralization point of view. Does it make sense all of them to be created in same region? Or, we must create instances running in different regions?
Tnx a lot.
Igor.
yes, look at Cello.. if nothing else it will help you see the aws deployment model.
really nothing special..
design the desired system, peers, orderer, gateways, etc..
then decide who many ec2 instance u need to support that.
as for WHERE (region).. depends on where the connecting application is and what kind of fault tolerance you need for your business model.
one of the businesses I am working with wants a minimum of 99.99999 % availability. so, multi-region is critical. its just another ec2 instance with sockets open from different hosts..
aws doesn't provide much in terms of support for hyperledger. they have some templates which allow you to setup the VMs initially, but that's stuff you can do yourself as well.
you are right, the documentation is very light and most of the time confusing. I got to the point where I can start from scratch with a brand new VM and got everything ready and deploy my own network definition and chaincode and have the scripts to do that.
IBM cloud has much better support for hyperledger however. you can design your network visually, you can download your connection profiles, deploy and instantiate chaincode, create and join channels, handle certificates, pretty much everything you need to run and support such a network. It's light years ahead of AWS. They even have a full CI / CD pipepline that you could replicate for your own project. if you look at their marbles demo, you'll see what i mean.
Cello is definitely worth looking at, with the caveat that it's incubation meaning, not real yet, not production ready and not really useful until it becomes a fully fledged product.

Best way to Setup Cloudwatch in 'n' number of servers?

We want to setup cloudwatch in more that 50 servers for which in general we will have to do it manually logging into each server.But we would like to reduce the manual work.
While browsing through we found below two ideas:
1)Opswork( aws internally uses chef)
2) Chef
Are the above approaches correct to achieve what intend to?
Which approach is best suitable?
Your suggestions will be of great help... Thank you
We performed this activity using chef.The process was simple.
There are a number of cookbooks already available on the chef supermarket which is of great help to beginners.
We did not try Opswork so i will not be able to comment which is a better approach.

Amazon inspector

I would like to use AWS tool, like in topic. To me it looks like there are two releases of this tool. One with AWS agent installed on EC2 instance, allows tracking security issues. New one with some benchmarking, and so on. So I'm interested in the new one.
I've red docs, set up sample, test env. but still it looks a bit unclear for me. I understand that they are using public database of vulnerabilities. As well as benchmarking, or testing against best practices.
The question is - how can I know that all of that is tested in lowest 15min. target? Or in the other words - if time is short - what is less tested?
Is anyone use this tool and would like to share knowledge, insights?
A report provided at the end of the testing gives you an overview of the scanning results. The results indicates which of your preselected resources has security issues.

How to convert a WAMP stacked app running on a VPS to a scalable AWS app?

I have a web app running on php, mysql, apache on a virtual windows server. I want to redesign it so it is scalable (for fun so I can learn new things) on AWS.
I can see how to setup an EC2 and dump it all in there but I want to make it scalable and take advantage of all the cool features on AWS.
I've tried googling but just can't find a simple guide (note - I have no command line experience of Linux)
Can anyone direct me to detailed resources that can lead me through the steps and teach me? Or alternatively, summarise the steps in an answer so I can research based on what you say.
Thanks
AWS is growing and changing all the time, so there aren't a lot of books to help. Amazon offers training that's excellent. I took their three day class on Architecting with AWS that seems to be just what you're looking for.
Of course, not everyone can afford to spend the travel time and money to attend a class. The AWS re:Invent conference in November 2012 had a lot of sessions related to what you want, and most (maybe all) of the sessions have videos available online for free. Building Web Scale Applications With AWS is probably relevant (slides and video available), as is Dissecting an Internet-Scale Application (slides and video available).
A great way to understand these options better is by fiddling with your existing application on AWS. It will be easy to just move it to an EC2 instance in AWS, then start taking more advantage of what's available. The first thing I'd do is get rid of the MySql server on your own machine and use one offered with RDS. Once that's stable, create one or more read replicas in RDS, and change your application to read from them for most operations, reading from the main (writable) database only when you need completely current results.
Does your application keep any data on the web server, other than in the database? If so, get rid of all local storage by moving that data off the EC2 instance. Some of it might go to the database, some (like big files) might be suitable for S3. DynamoDB is a good place for things like session data.
All of the above reduces the load on the web server to just your application code, which helps with scalability. And now that you keep no state on the web server, you can use ELB and Auto-scaling to automatically run multiple web servers (and even automatically launch more as needed) to handle greater load.
Does the application have any long running, intensive operations that you now perform on demand from a web request? Consider not performing the operation when asked, but instead queueing the request using SQS, and just telling the user you'll get to it. Now have long running processes (or cron jobs or scheduled tasks) check the queue regularly, run the requested operation, and email the result (using SES) back to the user. To really scale up, you can move those jobs off your web server to dedicated machines, and again use auto-scaling if needed.
Do you need bigger machines, or perhaps can live with smaller ones? CloudWatch metrics can show you how much IO, memory, and CPU are used over time. You can use provisioned IOPS with EC2 or RDS instances to improve performance (at a cost) as needed, and use difference size instances for more memory or CPU.
All this AWS setup and configuration can be done with the AWS web console, or command-line tools, or SDKs available in many languages (Python's boto library is great). After learning the basics, look into CloudFormation to automate it better (I've written a couple of posts about that so far).
That's a bit of the 10,000 foot high view of one approach. You'll need to discover the details of each AWS service when you try to use them. AWS has good documentation about all of them.
Depending on how you look at it, this is more of a comment than it is an answer, but it was too long to write as a comment.
What you're asking for really can't be answered on SO--it's a huge, complex question. You're basically asking is "How to I design a highly-scalable, durable application that can be deployed on a cloud-based platform?" The answer depends largely on:
The specifics of your application--what does it do and how does it work?
Your tolerance for downtime balanced against your budget
Your present development and deployment workflow
The resources/skill sets you have on-staff to support the application
What your launch time frame looks like.
I run a software consulting company that specializes in consulting on Amazon Web Services architecture. About 80% of our business is investigating and answering these questions for our clients. It's a multi-week long project each time.
However, to get you pointed in the right direction, I'd recommend that you look at Elastic Beanstalk. It's a PaaS-like service that abstracts away the underlying AWS resources, making AWS easier to use for developers who don't have a lot of sysadmin experience. Think of it as "training wheels" for designing an autoscaling application on AWS.

Should I use CouchDB or SimpleDB?

I'm creating an application that will be hosted on amazon EC2 and a lot of the data that'll be saved is more document oriented (as well as saving tweets and such related to those documents).
Right now I'm at a crossroads... should I use simpleDB or couchDB? Whats the pros/cons of using either? Should I just try both for a month and decide then?
You may find the the article Amazon SimpleDB and CouchDB Compared to be useful.
I've also found that MongoDB gives excellent performance.
Keep in mind that if your code lives in EC2, SimpleDB will be presumably hosted in the same data center that your code is, which would give SimpleDB a lower latency than CouchDB for requests from an EC2 server. Also, Amazon doesn't charge you bandwidth costs between EC2 and SimpleDB.
I would expect SimpleDB to be both faster and cheaper for code running in EC2, for those reasons.
SimpleDB is hosted and maintained by Amazon for you, CouchDB is all up to you. That's the big difference.
I would absolutely do some benchmark of the two solutions with your own use-case, if that's possible, i.e. if you can build a reasonable subset of your application to run on either databases (they have quite different APIs so this might not be easy).
If you develop in .Net environment there's an excellent lib for SimpleDB called Simple Savant which really eases the integration..
I've built some live solutions using SimpleDB and it works very well, especially with a caching layer in front of it (cf memcached et al). However I've recently started scoping out a new project and have decided to move to CouchDB for the primary reason of having control over the data.
As your commitment to SimpleDB grows, it gets harder and harder to migrate away to anything else (ah the joys of vendor lock in) and, frankly, that just isn't great for our business.
I remain a strong evangelist of cloud tech, and Amazon in particular, but I feel a lot better running couchdb on EC2 than I do with SimpleDB.
Roger