Is AWS Glue Connection a Single Point of Failure? - amazon-web-services

To create a Glue Connection resource, we can pick only one AZ/subnet from the account VPC. Does that mean that if that AZ goes down, the Glue jobs using that connection will fail? If so, how can we make it multi-AZ to avoid a single point of failure?

There is no option to switch subnets at runtime. The elastic network interfaces with a glue prefix, as seen in CloudTrail, are attached to a specific subnet, VPC and security group, and these match the connection metadata you added in Glue. If a job fails in the unlikely event of a complete AZ outage, you can edit the connection and switch it to another subnet. Maybe the folks at AWS Support could take a feature request to enhance the product in the future.
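The manual "edit the connection and switch to another subnet" step could also be scripted. The following is a minimal sketch, assuming boto3 is available and that the connection's properties round-trip unchanged through `get_connection`; the function names `build_failover_input` and `switch_connection_subnet` are made up for illustration.

```python
from copy import deepcopy

# Keys accepted inside ConnectionInput. Read-only fields returned by
# get_connection (CreationTime, LastUpdatedTime, ...) are left out.
_INPUT_KEYS = (
    "Name", "Description", "ConnectionType", "MatchCriteria",
    "ConnectionProperties", "PhysicalConnectionRequirements",
)


def build_failover_input(connection, subnet_id, availability_zone):
    """Return a ConnectionInput that points the connection at another subnet."""
    conn_input = {k: deepcopy(connection[k]) for k in _INPUT_KEYS if k in connection}
    physical = conn_input.setdefault("PhysicalConnectionRequirements", {})
    physical["SubnetId"] = subnet_id
    physical["AvailabilityZone"] = availability_zone
    return conn_input


def switch_connection_subnet(name, subnet_id, availability_zone, glue=None):
    """Fetch the connection, rewrite its subnet/AZ, and save it back."""
    if glue is None:
        import boto3  # imported lazily so the helper above works without it
        glue = boto3.client("glue")
    current = glue.get_connection(Name=name)["Connection"]
    glue.update_connection(
        Name=name,
        ConnectionInput=build_failover_input(current, subnet_id, availability_zone),
    )
```

A call such as `switch_connection_subnet("my-connection", "subnet-0abc", "us-east-1b")` (hypothetical names) would repoint the connection; jobs started afterwards would pick up the new subnet, while a job already running would not move.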

Related

AWS EC2 VPC - How could one find the destination and data of high unexpected traffic from an EC2 instance?

An EC2 instance (Linux) experienced huge unexpected network traffic, in and out: from 500 MB/5min to 6 GB/5min for 6+ hours continuously. We do not have VPC flow logs enabled. We suspect a security breach with an unwanted transfer of data.
We would be interested in knowing where and what data was transferred.
Questions:
Since this happened in the past and we did not have VPC flow logs enabled, is there a way for AWS to determine where the data was transferred (IP, hostname)?
If it happens again in the future, I guess the solution is to enable VPC Flow Logs on the EC2 instance's network interface https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html#flow-logs-default and check dstaddr and pkt-dstaddr for outgoing traffic. Can you confirm?
I guess AWS cannot tell what data (which files) was transmitted, but which local (on EC2) solution would you advise? I am thinking of a CloudWatch alarm that alerts us when throughput reaches a set threshold, after which I can run a packet capture tool (tcpdump) on that interface (saving locally or to S3, depending on the size).
Apart from VPC Flow Logs, which incur additional cost, which local (on EC2, Linux) traffic monitoring tool would you recommend to run 24/7 and save logs?
Thank you.
AWS Shield is always running on all AWS accounts. If you have a Business or Enterprise support plan you could escalate to the AWS Shield Response Team (SRT) and they could assist you. AWS Support
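For the flow-log idea raised in the question, here is a minimal sketch of how outgoing volume per destination could be summarized once logs exist. It assumes records in the default flow log format (14 space-separated fields); as far as I know, pkt-dstaddr is only present in custom formats, so only dstaddr is used. The function name and the sample values are illustrative.

```python
from collections import Counter

# Field order of the default flow log record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def bytes_by_destination(lines, local_ip):
    """Sum bytes per dstaddr for records whose srcaddr is local_ip."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # blank line or a custom-format record
        record = dict(zip(FIELDS, parts))
        # NODATA / SKIPDATA records carry "-" instead of numbers.
        if record["srcaddr"] != local_ip or not record["bytes"].isdigit():
            continue
        totals[record["dstaddr"]] += int(record["bytes"])
    # Largest talkers first.
    return totals.most_common()
```

Feeding it the lines of a downloaded log file together with the instance's private IP gives a list of `(destination, total_bytes)` pairs, which is usually enough to spot one dominant remote address.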

How can I access aws resources in VPC from AWS glue?

I have a Glue job which calls an API hosted on an EC2 instance.
The problem is that the EC2 instance resides within a VPC that blocks all public access.
I tried creating an interface endpoint in my VPC but still can't access the REST API.
The host is always unreachable, but when I access the API from within the VPC it works fine.
The security group associated with the EC2 instance was used when creating the VPC endpoint.
Any help is appreciated
In the AWS Glue console, under Connections, create a connection. A dummy connection just points at a non-existent database or resource, for example: jdbc:mysql://some-fake-endpoint-here:3306/mydb. Then choose the correct VPC, subnet and security group. A test connection will not work in this context, but the connection gives you a way to pass your VPC, subnet and security group information to the job. To test connectivity, use a python-shell job, or launch an EC2 instance in the same VPC or subnet and run something like nc -vz endpoint port.
This connection metadata lets Glue launch elastic network interfaces in your account, which allow the Glue DPUs to communicate with your resource at runtime. More on how connections work in Glue is discussed here.
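The nc -vz check suggested above could also be run from inside a python-shell job attached to the connection. A minimal sketch using only the standard library; the function name and the host/port in the usage note are placeholders:

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

For example, `print(port_is_reachable("10.0.1.25", 8080))` with the private IP and port of the API would show whether the job's network path to the instance works at all, before debugging the API call itself.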

How to simulate AWS Dynamodb outage within one availability zone

I want to test my DynamoDB setup against the failure of one availability zone.
The basic idea that comes to mind is to find the EC2 instance(s) that DynamoDB runs on within an availability zone and stop or terminate them, or change the security group or NACL related to those instance(s).
But I cannot even find the EC2 instances running DynamoDB in my AWS account.
Any idea is welcome!!!
DynamoDB doesn't run on EC2 instances in your account. Like S3, for example, it's an AWS-managed service running on hardware outside of your account. Because of that, you can't test partial failures of DynamoDB: you have no way to induce such failures in DynamoDB.

How do I set the AWS peering connection DNS resolution options through CloudFormation?

I have two VPCs:
VPC1 which holds our RDS instance.
VPC2 which holds our cluster of EC2 instances.
We have successfully set up a VPC peering connection, routes and security groups to allow appropriate communication.
In order to resolve the RDS instance's AZ-appropriate local IP address from its hostname, we need to follow these instructions and set --requester-peering-connection-options AllowDnsResolutionFromRemoteVpc=true.
If I do this manually through the AWS Console or the AWS CLI it all works fine, however I'm creating the cluster of EC2 instances through CloudFormation and the option is missing from the CloudFormation documentation.
The effect of this is that my stack starts up and fails because the services themselves cannot connect to the database.
Am I doing something obviously wrong, or is this just Amazon being incomplete?
Thanks!
Due to the frequency of updates, an AWS feature is often not yet available in CloudFormation (ALB targeting Lambda used to be one), and you end up having to create a custom resource to manage it. It's not too bad; just make sure that your Lambda responds with success or failure in all scenarios, including exceptions, otherwise your stack will be 'in progress' for hours.
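A minimal sketch of such a custom-resource Lambda, assuming the peering connection ID is passed in a resource property named `PeeringConnectionId` (a hypothetical name chosen here). The `action` and `send` parameters exist only so the handler can be exercised without AWS; the point is that every code path, including exceptions, ends with a response to CloudFormation.

```python
import json
import urllib.request


def build_response(event, status, reason="", physical_id=None):
    """Build the JSON body CloudFormation expects from a custom resource."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch Logs for details",
        "PhysicalResourceId": physical_id
        or event.get("PhysicalResourceId")
        or event["LogicalResourceId"],
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
    }


def send_response(event, body):
    """PUT the response to the pre-signed URL supplied in the event."""
    request = urllib.request.Request(
        event["ResponseURL"],
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": ""},
    )
    urllib.request.urlopen(request)


def enable_dns_resolution(event):
    """Turn on requester-side DNS resolution for the peering connection."""
    import boto3  # present in the Lambda runtime; imported lazily here
    # "PeeringConnectionId" is a hypothetical property name for this sketch.
    peering_id = event["ResourceProperties"]["PeeringConnectionId"]
    boto3.client("ec2").modify_vpc_peering_connection_options(
        VpcPeeringConnectionId=peering_id,
        RequesterPeeringConnectionOptions={"AllowDnsResolutionFromRemoteVpc": True},
    )
    return peering_id


def handler(event, context, action=enable_dns_resolution, send=send_response):
    """Lambda entry point: always report an outcome, even when action raises."""
    try:
        physical_id = None
        if event["RequestType"] in ("Create", "Update"):
            physical_id = action(event)
        body = build_response(event, "SUCCESS", physical_id=physical_id)
    except Exception as exc:  # deliberately broad: CloudFormation must hear back
        body = build_response(event, "FAILED", reason=str(exc))
    send(event, body)
    return body
```

Delete requests are acknowledged without doing anything, which is a simplification; a fuller version might switch the option back off.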

What is the minimal set of outbound rules required of the master/slave security groups for an EMR cluster?

I'm trying to secure a pipeline for analyzing controlled-access genomic data with Amazon Elastic MapReduce (EMR), and it would help to know the minimal set of outbound rules required of the master and slave security groups of an EMR cluster. I'm sure it differs from region to region, and the IP ranges given at http://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html probably subsume them, but it would be great to know exactly which CIDR blocks we should worry about. It looks like EMR pokes just the right holes among the inbound rules for everything to work, but I've found the cluster gets stuck on provisioning if the outbound rules are anything other than "allow all traffic."
We had the identical problem. We addressed it by doing the following.
From ip-ranges.json, use the EC2 CIDR blocks and the AMAZON service CIDR blocks. You may subtract the CLOUDFRONT and ROUTE53 blocks.
The reason is that you need to be able to talk to the EMR web service endpoints, which are hosted outside your VPC. EMR uses a subset of EC2 instances to spin up the cluster.
If you have a support contract, ask Amazon to provide you with the CIDR block (we paid for a consulting engagement and this was one of the things they did).
Also, as the EMR webservice is on a public DNS endpoint (not 10.*), there should be a route to the internet gateway.
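The filtering described above could be sketched as follows, assuming the usual ip-ranges.json layout (a top-level "prefixes" list whose entries have "ip_prefix", "region" and "service"). "Subtract" is interpreted here as dropping prefixes that are also listed under CLOUDFRONT or ROUTE53 by exact match, which does not handle partially overlapping ranges; the function name is illustrative.

```python
import ipaddress


def emr_candidate_cidrs(ip_ranges, region):
    """EC2/AMAZON prefixes for a region, minus CLOUDFRONT and ROUTE53 ones."""
    prefixes = ip_ranges["prefixes"]
    # Prefixes tagged with an excluded service, in any region.
    excluded = {
        p["ip_prefix"] for p in prefixes
        if p["service"] in ("CLOUDFRONT", "ROUTE53")
    }
    # Prefixes tagged EC2 or AMAZON in the requested region.
    wanted = {
        p["ip_prefix"] for p in prefixes
        if p["region"] == region and p["service"] in ("EC2", "AMAZON")
    }
    # Sort numerically by network rather than as plain strings.
    return sorted(wanted - excluded, key=ipaddress.ip_network)
```

The input would be the parsed JSON, for example `json.load(open("ip-ranges.json"))` after downloading the file, and the resulting list could then be turned into outbound security group rules.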