What is the best solution for Airflow deploying on AWS? [closed] - amazon-web-services

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
My team and I have been discussing which solution is best for deploying Apache Airflow on AWS in terms of cost and performance. Our research turned up several options, among them Kubernetes (EKS), a machine on EC2, and ECS (Fargate), but there isn't much detailed content about this online. We also made some cost estimates, though we're not confident in them. We're looking for a discussion of the trade-offs of each solution.
So, my question is: has anyone been through this, or is anyone going through it now? And which solution, if any, is the best?

In late 2020, AWS announced Amazon Managed Workflows for Apache Airflow (MWAA). It is a fully managed service that makes it easy to run open-source versions of Apache Airflow (including v2) on AWS.
I'd suggest having a read through the documentation to find out more and determine if it meets your requirements.
From my personal experience: I previously managed an Airflow stack using EC2 and ECS worker pools. Moving to MWAA has definitely been a better solution and provided a much better user experience.
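One practical point in MWAA's favor: the DAG code itself is plain Airflow, so it stays the same whichever deployment you pick. A minimal Airflow 2 DAG, for anyone evaluating a migration (assumes apache-airflow 2.x is installed; the IDs and schedule are illustrative):

```python
# Minimal Airflow 2 DAG -- deployable unchanged to MWAA, EKS, ECS, or EC2.
# Assumes apache-airflow 2.x; dag_id, task_id and schedule are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hello_mwaa",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    hello = BashOperator(
        task_id="say_hello",
        bash_command="echo 'Hello from Airflow'",
    )
```

With MWAA you upload a file like this to the environment's S3 DAGs folder; with EC2/ECS/EKS you ship it to wherever the scheduler reads DAGs from.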

Related

Replace cloud load balancer for Kubernetes in AWS [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
Closed 5 days ago.
Would it be possible to expose a Kubernetes cluster to the world without a cloud load balancer, e.g. an AWS Network Load Balancer?
I know of MetalLB for bare-metal Kubernetes installations, but I'm not sure whether it would be a solution here. Any advice would be appreciated.
Yes, and as you said, MetalLB would be one way. You can use it internally across your cluster using ARP, or set up a static route. Was there anything in particular keeping you from going with MetalLB?
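As a sketch of what "no cloud load balancer" can look like in practice: with MetalLB installed, a Service of `type: LoadBalancer` gets its external IP from MetalLB's address pool; without any load balancer at all, `type: NodePort` exposes the app on every node's IP directly. A minimal NodePort manifest, built here as a Python dict for illustration (names and ports are made up):

```python
import json

# A Service of type NodePort exposes the app on <node-ip>:30080 with no
# cloud load balancer at all. With MetalLB you would instead set
# type: LoadBalancer and MetalLB assigns the external IP from its pool.
# The name, selector and ports below are illustrative.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "type": "NodePort",
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080, "nodePort": 30080}],
    },
}

manifest = json.dumps(service, indent=2)
print(manifest)
```

The trade-off is that NodePort puts the burden of client-side addressing and failover on you, which is exactly what MetalLB (or a cloud NLB) handles.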

Amazon web services and tensorflow [closed]

Closed 4 years ago.
As someone whose laptop has insufficient processing power, I am having a hard time training my neural network models, so I thought of Amazon Web Services as a solution. But I have a few questions. First of all, as far as I know, Amazon SageMaker supports TensorFlow, but I could not work out whether the service is free. Some people say it is free for a specific time; others say it is free unless you exceed a limit. I would be more than happy if someone could clarify this or suggest other alternatives that would help me out.
Thanks a lot!
Google Cloud has similar options, and they give $300 in credit to new developers.
Since Google is the creator of TensorFlow, I would guess their cloud keeps it the most up to date. Try it out.
https://cloud.google.com/ml-engine/docs/pricing
They have a free tier, and this is all well documented at https://aws.amazon.com/sagemaker/pricing/
You should look into EC2 Spot Instances.
There is a market for AWS computing resources, with prices rising and falling with supply and demand. You can set a maximum price, as long as you are flexible about availability. When prices fall (usually at night), you can take advantage of (big data) computing resources at up to 90% off.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/how-spot-instances-work.html
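To make the savings concrete, here is a back-of-the-envelope comparison. The hourly price below is a placeholder, not a real quote; check the EC2 pricing pages for actual numbers:

```python
# Rough spot-vs-on-demand cost sketch. The hourly price is a placeholder
# value for illustration only -- check the EC2 pricing pages for real figures.
on_demand_per_hour = 0.40          # hypothetical on-demand price, USD/hour
spot_discount = 0.90               # the "up to 90% off" from the answer above
hours = 8 * 30                     # e.g. 8 hours of training a night for a month

spot_per_hour = on_demand_per_hour * (1 - spot_discount)
on_demand_cost = on_demand_per_hour * hours
spot_cost = spot_per_hour * hours

print(f"on-demand: ${on_demand_cost:.2f}, spot: ${spot_cost:.2f}")
```

The catch is that Spot Instances can be interrupted when capacity is reclaimed, so they suit training jobs that checkpoint regularly.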

AWS setup cloud infrastructure for Java Project [closed]

Closed 5 years ago.
I want to understand the basics of cloud infrastructure on Amazon's cloud. Can anyone help me move ahead with this, and advise what would be best for me?
My requirements are:
Project: Java EE based architecture
Deployment server: Tomcat
Database: MySQL
Instance: Amazon EC2 or AWS Elastic Beanstalk (I'm not sure which is better for a Java project)
Space: 100 GB for now, scalable on demand
Hosting server: Linux
I want to know everything that would make a good initial setup for my production server. I would also like to know which services I need to purchase for these requirements, and the best prices for each service.
Looking forward to hearing from you all. Have a nice time ahead!
Kuldeep
I would recommend starting with Amazon EC2. For MySQL, you can use Amazon RDS, as it handles all of the database maintenance activity. You can start with m3.xlarge machines initially and upgrade based on requirements.
As far as storage is concerned, start with a 100 GB SSD EBS volume first; you can attach more storage at any time.
Here are some useful links regarding machine types and costing:
https://aws.amazon.com/ec2/instance-types/
https://calculator.s3.amazonaws.com/index.html
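As a back-of-the-envelope check before opening the calculator, storage cost is just size times the per-GB-month rate. The price below is a placeholder assumption, not a quote:

```python
# Back-of-the-envelope EBS cost estimate. The $/GB-month figure is a
# placeholder assumption -- use the AWS pricing calculator for real numbers.
ebs_gb = 100                       # the 100 GB from the requirements above
price_per_gb_month = 0.10          # hypothetical SSD (gp2) price, USD/GB-month

monthly_storage_cost = ebs_gb * price_per_gb_month
print(f"estimated EBS cost: ${monthly_storage_cost:.2f}/month")
```

The EC2 instance itself will dominate the bill, so run the same arithmetic for the instance-hours before purchasing anything.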

GCP network usage monitoring [closed]

Closed 3 years ago.
Is there any network monitor for GCP VMs? I think my app is experiencing some network problem; it keeps hitting failed JSONP requests.
Thanks in advance.
There are many tools that can do network monitoring (and much more). For example, I use Pandora FMS, which allows me to monitor my cloud VMs and my physical hosts. Using network checks and running local commands on the VMs via the agents, I monitor everything I need, and of course it includes an alerting system. Take a look at their website; they have both open-source and enterprise versions.
https://pandorafms.com/monitoring-solutions/virtual-server-monitoring/
Hope this helps!
Stackdriver is best for GCP, as it has better visibility. It offers logging, monitoring, trace, debugger, and profiler.
Links:
https://cloud.google.com/logging | https://cloud.google.com/monitoring | https://cloud.google.com/trace | https://cloud.google.com/debugger | https://cloud.google.com/profiler
You can use Stackdriver; it exposes a few networking metrics, as described in its monitoring documentation.

Clusters available for using Hadoop/MapReduce framework [closed]

Closed 6 years ago.
Does anyone know of any freely accessible clusters that are open to the public and use a Hadoop/MapReduce framework? There are plenty of tutorials on how to use MapReduce, but is there a way to test the examples without installing the required framework on my local single machine?
Thanks!
Amazon EC2 has ready-to-use Hadoop clusters for rent by the hour, not very expensive even for play. Another way is to play with the Cloudera Hadoop VM (http://www.cloudera.com/downloads/virtual-machine/); you can run a cluster on several virtual machines.
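The MapReduce programming model itself can also be exercised with no cluster at all. A plain-Python word count that mimics the map, shuffle, and reduce phases (purely illustrative; a real framework distributes these steps across machines):

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # counts["the"] == 3
```

Once the logic works locally, the same mapper and reducer translate directly to Hadoop Streaming or a managed cluster.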
I will soon have a solution - it's not free, but it is VERY cheap.
I have built a small cluster for training and education (via web access) and will be live in May 2013.
I will rent out 4 node cluster for $2 a day or $10 a week.
Since the cluster is not very big, it will handle data sets of only 20-40 GB, but it will offer full web access to run MapReduce jobs and Pig scripts.
While I am asking for some money, it's not really a business; I'm just hoping to cover the power bills!
http://jyrocluster.com
Regards,
Serge
You could also use Apache Whirr to deploy your own test cluster on Amazon EC2. This gives you more control than Elastic Map Reduce. It should be cheap if you are using it only to test map reduce jobs for short periods of time.
You can give CloudxLab a try. Though it is not free, it is quite affordable. It provides a complete environment to practice Hadoop, Spark, Kafka, Hive, Pig, HBase, Oozie, Zookeeper, Flume, Sqoop, Mahout, R, Linux, Python, Scala, NumPy, Scipy, scikit-learn etc. You will not have to install or configure any software on your local machine to use CloudxLab. Many of the popular trainers are already using CloudxLab.