I would like to let my AWS EKS nodes communicate with AWS RDS. Both are in the same account and region, so there is no need to implement any sci-fi architecture; a simple one would be enough.
I started to investigate and found a couple of Stack Overflow threads.
The first idea is to implement Security Groups for Pods. This is not my case: I'm happy to share the RDS with all the nodes. Am I wrong?
The second idea (actually in the same thread) suggests putting the different resources (RDS and EKS) in the same VPC (shared?). Is that a good idea?
And finally, a VPC Peering Connection is suggested as a good solution. Is it really? I can see the announcement which states that "all data transfer over a VPC Peering connection that stays within an Availability Zone (AZ) is now free". That is good, but it looks like an enterprise solution for a simple problem.
Can you help me choose a solution that properly fits my scenario? Can I set up proper IAM roles instead?
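To be concrete, the kind of "simple" I'm hoping for would be a single security-group rule letting the node group reach the database. A sketch of what I mean (group IDs, port, and region are placeholders, and it assumes both resources already sit in the same VPC):

```python
# What I'd consider "simple": allow the EKS node security group to reach
# the RDS security group on the DB port (all IDs below are placeholders).
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # assumed region

ec2.authorize_security_group_ingress(
    GroupId="sg-0rds0000000000000",   # placeholder: the RDS instance's security group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5432,             # assuming PostgreSQL
        "ToPort": 5432,
        "UserIdGroupPairs": [
            {"GroupId": "sg-0eks0000000000000"}  # placeholder: EKS node group SG
        ],
    }],
)
```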
I was starting a Neptune database from this base stack:
https://s3.amazonaws.com/aws-neptune-customer-samples/v2/cloudformation-templates/neptune-base-stack.json
However, now I am wondering why a NAT Gateway and an Internet Gateway are created in this stack. Are they required for updates within Neptune? This seems like a huge security risk.
On top of that, these gateways are not cheap.
I would appreciate an explanation of this.
The answer is no, they are not required; AWS just sneaked some unnecessary, costly resources into the template.
Anyway, if you want to use the updated template without the NAT and Internet Gateways, use this one that I just created: https://neptune-stack-custom.s3.eu-central-1.amazonaws.com/base.json
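If you'd rather not trust a random third-party template, a rough sketch of stripping the gateway resources from a locally saved copy of the official template yourself (the filenames are placeholders, and note that anything still referencing the removed gateways must be cleaned up too):

```python
# strip_gateways.py - a rough sketch (not an official AWS tool): remove the
# NAT / Internet Gateway resources from a saved copy of the base template.
import json

GATEWAY_TYPES = {
    "AWS::EC2::NatGateway",
    "AWS::EC2::InternetGateway",
    "AWS::EC2::VPCGatewayAttachment",
}

with open("neptune-base-stack.json") as f:      # placeholder local filename
    template = json.load(f)

doomed = [name for name, res in template["Resources"].items()
          if res.get("Type") in GATEWAY_TYPES]
for name in doomed:
    del template["Resources"][name]
print("Removed:", doomed)

# Caveat: routes, EIPs and outputs that Ref/GetAtt the removed gateways
# must be deleted as well, or stack validation will fail.
with open("neptune-base-stack-private.json", "w") as f:
    json.dump(template, f, indent=2)
```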
I would like to set up a Ray cluster to use Ray Tune over 4 GPUs on AWS, but each GPU belongs to a different member of our team. I have scoured the available resources for an answer and found nothing. Help?
In order to start a Ray cluster using instances that span multiple AWS accounts, you'll need to make sure that the AWS instances can communicate with each other over the relevant ports. To enable that, you will need to modify the AWS security groups for the instances (though be sure not to open up the ports to the whole world).
You can choose which ports are needed via the arguments --redis-port, --redis-shard-ports, --object-manager-port, and --node-manager-port to ray start on the head node, and just --object-manager-port and --node-manager-port on the non-head nodes. See the relevant documentation.
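As a sketch of the security-group side (not official Ray documentation; the group IDs, account ID, ports, and region below are placeholders, and cross-account group references assume the VPCs are peered):

```python
# open_ray_ports.py - illustrative only: open the Ray ports to the peer
# account's security group instead of to the whole world.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")  # assumed region

RAY_PORTS = [6379, 8076, 8077]  # example: redis port + ports chosen via the flags above
PEER_ACCOUNT = "123456789012"   # hypothetical teammate's account ID
PEER_SG = "sg-0peer000000000000"  # hypothetical security group in that account
MY_SG = "sg-0mine000000000000"    # security group of the local Ray nodes

ec2.authorize_security_group_ingress(
    GroupId=MY_SG,
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": PEER_SG, "UserId": PEER_ACCOUNT}],
        }
        for port in RAY_PORTS
    ],
)
```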
However, what you're trying to do sounds somewhat complex. It'd be much easier to use a single account if possible, in which case you could use the Ray autoscaler.
We have a VPN set up with two static routes:
10.254.18.0/24
10.254.19.0/24
The problem is that, from AWS, we can only ever communicate with one of the above blocks at a time. Sometimes it is .18 and at other times it is .19; I cannot figure out what the trigger is.
I never have any problem communicating from either of my local subnets out to AWS at the same time.
Kinda stuck here. Any suggestions?
What have we tried? Well, the 'firewall' guys said they don't see anything being blocked. But I read another post here that stated the same thing, and the problem still ended up being the firewall.
Throughout the course of playing with this, the "good" subnet has flipped three times. Meaning:
Right now I can talk to .19 but not .18
10 min ago I could talk to .18 but not .19
It just keeps flipping.
We've been able to get this resolved. We changed the static routes configured in AWS from:
10.254.18.0/24
10.254.19.0/24
To use instead:
10.254.18.0/23
This /23 encompasses all the addresses we need and has resolved the issue.
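For the curious, the supernet math checks out; a quick sketch with Python's standard ipaddress module:

```python
# Verify that the single /23 route covers both original /24 routes.
import ipaddress

supernet = ipaddress.ip_network("10.254.18.0/23")  # spans 10.254.18.0 - 10.254.19.255
for route in ("10.254.18.0/24", "10.254.19.0/24"):
    print(route, "covered:", ipaddress.ip_network(route).subnet_of(supernet))
# Both print True: one route (and thus one SA pair) now serves both subnets.
```

Here was Amazon's response: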
Hello,

Thank you for contacting AWS support. I understand you have issues with reaching your two subnets, 10.254.18.0/24 and 10.254.19.0/24, at the same time from AWS.

I am pretty sure I know why this is happening. On AWS, we can accept only one SA (security association) pair. On your firewall, the "firewall" guys must have configured a policy-based VPN. In a policy/ACL-based VPN, you might create the following policies, for example:

1) source 10.254.18.0/24 and destination "VPC CIDR"
2) source 10.254.19.0/24 and destination "VPC CIDR"

OR

1) source "10.254.18.0/24, 10.254.19.0/24" and destination "VPC CIDR"

In both cases, you will form 2 SA pairs, as there are two different sources mentioned in the policy/ACL. You just have to use a source such as "ANY", "10.254.0.0/16", "10.254.0.0/25", etc. We would prefer that you use "ANY" as the source and then micro-manage the traffic using VPN filters if you are using a Cisco ASA device. How to use VPN filters is given in the configuration file for the Cisco ASA. If you are using some other device, you will have to find a solution accordingly. If your device supports route-based VPN, then I would advise you to configure a route-based VPN. Route-based VPNs always create only one SA pair.

Once you find a solution that creates only one ACL/policy on your firewall, you will be able to reach both networks at the same time. I can see multiple SAs forming on your VPN; this is the reason why you cannot reach both subnets at the same time.

If you have any additional questions, feel free to update the case and we will respond to them.
My question is two-fold:

**UPDATE**

I fixed number 1. I had to specify the region in the config; I guess this is because my keys are associated with us-east by default.
If anyone has an answer to number 2, that would be great.
1) I am ultimately trying to set up a 4-node cluster (2 in each region). In the main region (us-east-1) the nodes see each other perfectly fine, but in the west they don't seem to see each other. I'd like to make sure they can see each other before I try multi-region (which I'm not entirely sure how to do yet). I've installed the plugin.
Basically, why are the nodes in a different region not seeing each other when it's the same config? I can telnet to/from each server on ports 9200/9300.
Here is my config:
cloud:
    aws:
        access_key:
        secret_key:
discovery:
    type: ec2
    ec2:
        groups: ELASTIC-SEARCH
2) Is there a way to designate a specific node to "hold all the data" and then distribute it among them all?
While it's not the answer you want: Don't do that.
It'll be much easier to have two clusters in two regions, and keep them in sync on your application layer. Also, Elasticsearch has introduced the concept of a Tribe-node in 1.0 to make this a bit easier.
Elasticsearch, like any distributed database, is very sensitive to network issues. In this case you're relying on the Internet working reliably. It tends not to.
The setup you suggest will be quite prone to split brains or outages. If you configure minimum master nodes to be a quorum, which you always should, the cluster will go down whenever there's a connection problem between the regions.
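To make that concrete with the 4-node, 2-per-region layout from the question, a quick sketch of the quorum arithmetic:

```python
# minimum_master_nodes should be a quorum of the master-eligible nodes.
master_eligible = 4
quorum = master_eligible // 2 + 1      # 3
print("minimum_master_nodes:", quorum)

# If the inter-region link drops, each region sees only its own 2 nodes:
per_region = 2
print("either side keeps quorum:", per_region >= quorum)  # False: whole cluster down
```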
We've written two articles that go into much more depth on this topic, which you may want to look into:
Elasticsearch in Production has a section on networking related issues.
Elasticsearch Internals: Networking Introduction describes the network topology of Elasticsearch. Specifically, you'll see just how many connections Elasticsearch needs to have working reliably.
I am working on a project and am at a point where the POC is done, and now I want to move towards a real product. I am trying to understand the Amazon cloud offerings just to see if I need to be aware of them at development time. I have a bunch of questions that I cannot get answered from the Amazon site. It's probably because I am new to the whole web services thing and have never hosted a site before. I am hoping someone out here will explain this to me like I am a C programmer :)
I see Amazon has a bunch of offerings -
EC2
Elastic Block Store
SimpleDB
Auto Scaling
Elastic Load Balancing
I understand EC2 is virtual server instances that I can use and these could come pre-loaded with what I want (say Apache + python). I have the following questions -
If I want a custom instance of something (say, a custom Apache module I wrote for my project), can I create a server instance with the exact modules I need and make it the default the next time I create a new instance, or for Auto Scaling?
Do I get an IP address to access this? Can I set my own hostname for it? I mean, do I get a DNS record? Or is that what Elastic IP is?
How do I access it from the outside? SSH? Remote Desktop? Or is it entirely up to how I configure the instance?
What do they mean by Inter-Region or Intra-Region data transfer? What is data transfer to begin with? Is it just people using my instance? So if I go live, is that the cost I have to pay for people using it?
What is the difference between AutoScaling and Elastic Load Balancing?
What is Elastic Block Store? Is it storage? If so, do I have to worry about backups, or do they take care of that?
About SimpleDB -
It looks like the interface to use this is different from my regular SQL calls. Am I correct?
If so, the whole development needs to be tailored specifically for Amazon, which kind of sucks. Is there a better alternative?
Do I get data backups, or do I have to worry about that myself?
Will I be able to connect to the DB using regular tools to inspect it (during or after development), or do I get other tools made by Amazon for that?
What about security? The DB is obviously somewhere in the cloud, far away from the EC2 instance. My DB password is going over the wire, and so is all my data, totally unencrypted. Don't I have to worry about that? The question comes up only because I don't own any of the hardware.
I really hope someone points me in the right direction here.
Thanks for taking the time to read.
P
I just went through the questions, and here I have tried to answer a few of them:
1) AWS doesn't publish pre-configured EC2 instances itself; in fact, they are configured by developers and made publicly available so that other users can use them. You can pick any one of those images, or you can opt for whatever raw OS you want, provision it accordingly, and create a snapshot of it so that you can use it for auto scaling. That snapshot becomes your base AMI.
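As a rough illustration of baking a base AMI from an instance you've already provisioned (the instance ID and names are placeholders; in practice you might also use the console or a tool like Packer):

```python
# bake_ami.py - create an AMI from an already-provisioned instance, then
# point an Auto Scaling launch configuration at the resulting image ID.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

response = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # placeholder: your configured instance
    Name="my-apache-python-base-v1",    # hypothetical image name
    Description="Apache + custom module + Python, baked for Auto Scaling",
)
print("New base AMI:", response["ImageId"])
```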
2) Every instance you boot will have a public DNS name attached to it. You can use the public DNS name to connect to the instance using ssh if you are a Linux user, or using PuTTY if you are a Windows user. Apart from that, you can also attach an Elastic IP, which comes at a cost that is peanuts, and access your instance through it. You can map the Elastic IP or the public DNS name to a website by adding an A record or a CNAME, respectively.
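Illustratively (the instance ID is a placeholder), attaching an Elastic IP boils down to two calls:

```python
# attach_eip.py - allocate an Elastic IP and attach it to an instance.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

eip = ec2.allocate_address(Domain="vpc")
ec2.associate_address(
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    AllocationId=eip["AllocationId"],
)
print("Instance now reachable at", eip["PublicIp"])  # point your A record here
```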
3) AWS operates data centers in different parts of the world. You deploy your application depending on your customer base; for example, if your target customers are based in India, the nearest region available is Singapore, which AWS calls ap-southeast-1. Each region has multiple Availability Zones, e.g. ap-southeast-1a and ap-southeast-1b, which are two different data centers, geographically apart. Intra-region means from ap-southeast-1a to ap-southeast-1b; inter-region means from ap-southeast-1 to us-east-1, the Northern Virginia data center. AWS charges for incoming and outgoing bandwidth, and trust me, it's nothing.
They charge about 1/8th of a cent per GB; it's not even worth thinking about.
4) Elastic Load Balancing (ELB) divides the load equally across your instances in all Availability Zones (if you are running multi-AZ). The ELB sits in front of your EC2 instances, monitors instance health periodically, and works hand in hand with Auto Scaling.
5) To understand what Auto Scaling is, please go through this document: http://aws.amazon.com/autoscaling/
6) Elastic Block Store (EBS) is like a hard disk: persistent data storage that can be attached to your instance. Regarding backups: yes, it depends on your use case; I take EBS snapshots periodically.
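A periodic EBS backup can be as simple as a scheduled snapshot call (the volume ID below is a placeholder):

```python
# snapshot_ebs.py - take a point-in-time snapshot of an EBS volume;
# run this on a schedule (cron, etc.) for periodic backups.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",             # placeholder volume ID
    Description="nightly backup of the data volume",
)
print("Snapshot started:", snap["SnapshotId"])
```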
7) SimpleDB (these days largely superseded by DynamoDB) is a NoSQL DB, i.e. a non-RDBMS system. Please read some documentation to understand what a NoSQL DB is.
8) If you have a MySQL or Oracle DB, you can opt for RDS; please read the documentation.
9) I personally feel you are a newbie to the entire cloud ecosystem; you need to understand what exactly the cloud does first.
10) You don't have to make a large number of changes to your development as such; just make sure it works fine on your local box, and it can be deployed to the cloud without much ado.
11) You don't have to use any extra tools for that. Change the database endpoint to RDS (if you use it), or else install MySQL on your EC2 instance and connect to the local DB that resides there, which is as simple as your development setup.
12) You don't have to worry about security issues with AWS; it is secure. Don't follow the myths. I have been using AWS for 3 years, running I don't even remember how many applications (e-commerce, m-commerce, social media apps), and I have never faced any kind of security issue. AWS also allows you to set up your security however you want.
Go ahead, happy coding. Contact me if you have any problem.
The answer above is a good summary of AWS. I just wanted to add:
AWS offers a full data center, so it depends on what you are trying to achieve. For starters you will need:
EC2 - This is your server; it comes with instance storage, which is lost when the instance is stopped or terminated
EBS - Your mounted storage; the data is persisted across reboots
S3 - Provides object storage (RESTful APIs on top; the cost is usage-based rather than "provisioned" as with EBS)
Databases - You can start with Amazon RDS, which provides managed database services; you can choose between various database engines. You can also install your own database using EC2 + EBS, but then you will have to take care of managing it yourself.
Elastic IP - A public-facing IP address; you can point your DNS records at this.
One great tool to calculate pricing:
http://calculator.s3.amazonaws.com/calc5.html
Some other services to take into account are:
VPC (Virtual Private Cloud) - This is your own private network. You can define subnets, route tables and internet gateways there. I would strongly recommend using a VPC for any serious deployment of more than one instance.
Glacier - This can replace your tape library for storing backups.
CloudFormation - A great tool for deployment and automation of instances; see the sketch below.
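As a taste of what that automation looks like, a minimal sketch (the stack name, template URL, and parameters are placeholders):

```python
# launch_stack.py - create a CloudFormation stack from a template;
# everything here is a placeholder to show the shape of the call.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")  # assumed region

cfn.create_stack(
    StackName="my-first-stack",  # hypothetical stack name
    TemplateURL="https://s3.amazonaws.com/my-bucket/my-template.json",
    Parameters=[{"ParameterKey": "InstanceType", "ParameterValue": "t2.micro"}],
)

# Block until the stack is up (raises if creation fails).
cfn.get_waiter("stack_create_complete").wait(StackName="my-first-stack")
print("Stack created")
```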