Accessing AWS DocumentDB from a separate VPC using VPC Sharing?

The latest DocumentDB documentation states that a jump host is necessary for accessing the database from outside its native VPC:
By design, you access Amazon DocumentDB (with MongoDB compatibility) resources from an Amazon EC2 instance within the same Amazon VPC as the Amazon DocumentDB resources. However, suppose that your use case requires that you or your application access your Amazon DocumentDB resources from outside the cluster's Amazon VPC. In that case, you can use SSH tunneling (also known as "port forwarding") to access your Amazon DocumentDB resources.
However, VPC sharing seems to allow multiple accounts/VPCs to share the same resources.
Is it possible to use VPC sharing to access a DocumentDB resource in another VPC without having to use jump hosts?
Thank you in advance for your consideration and response.

Yes.
https://aws.amazon.com/documentdb/faqs/
Amazon DocumentDB clusters deployed within a VPC can be accessed directly by EC2 instances or other AWS services that are deployed in the same VPC. Additionally, Amazon DocumentDB can be accessed by EC2 instances or other AWS services in different VPCs in the same region or other regions via VPC peering.
We will get the documentation updated.
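To make this concrete, here is a minimal sketch (not from the original answer) of connecting to a DocumentDB cluster from an EC2 instance in a peered VPC using pymongo. It assumes the peering connection, route table entries, and a security group rule allowing port 27017 from the peer VPC's CIDR are already in place; the endpoint, username, and password are placeholders.

```python
# Minimal sketch: connect to an Amazon DocumentDB cluster from an EC2 instance
# in a peered VPC. Assumes peering, routes, and security group rules already
# allow port 27017 from the peer VPC. Endpoint and credentials are placeholders.
from pymongo import MongoClient

client = MongoClient(
    host="my-docdb-cluster.cluster-xxxxxxxx.us-east-1.docdb.amazonaws.com",  # hypothetical endpoint
    port=27017,
    username="dbuser",
    password="dbpassword",
    tls=True,
    tlsCAFile="global-bundle.pem",  # Amazon CA bundle downloaded locally
    retryWrites=False,              # DocumentDB does not support retryable writes
)

print(client.admin.command("ping"))  # simple connectivity check
```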

Related

Accessing S3 from inside EKS using boto3

I have a Python application deployed on EKS (Elastic Kubernetes Service). This application saves large files inside an S3 bucket using the AWS SDK for Python (boto3). Both the EKS cluster and the S3 bucket are in the same region.
My question is, how is communication between the two services (EKS and S3) handled by default?
Do both services communicate directly and internally through the Amazon network, or do they communicate externally via the Internet?
If they communicate via the internet, is there a step-by-step guide on how to establish a direct internal connection between the two services?
how is communication between the two services (EKS and S3) handled by default?
By default, the network topology of your EKS cluster provides a route to the public AWS S3 endpoints.
Do both services communicate directly and internally through the Amazon network, or do they communicate externally via the Internet?
Your cluster needs network access to those public S3 endpoints, for example via worker nodes running in a public subnet, or a NAT gateway for worker nodes in a private subnet.
...is there a step by step guide on how to establish a direct internal connection between both services?
You can create a VPC endpoint for S3 in the VPC where your EKS cluster runs so that network communication with S3 stays within the AWS network. VPC endpoints for S3 are available as both interface and gateway types. This article covers the basics of S3 endpoints; you can use the same method to create an endpoint in the VPC where your EKS cluster runs. Requests to S3 from your pods will then use the endpoint to reach S3 within the AWS network.
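As an illustration, a gateway-type endpoint for S3 can be created with a couple of boto3 calls; the region, VPC ID, and route table ID below are placeholders for the ones used by your EKS worker-node subnets.

```python
# Minimal sketch: create a gateway-type VPC endpoint for S3 in the VPC where the
# EKS worker nodes run. Region, VPC ID, and route table IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",              # VPC hosting the EKS nodes
    ServiceName="com.amazonaws.us-east-1.s3",   # S3 service name for the region
    RouteTableIds=["rtb-0123456789abcdef0"],    # route tables of the node subnets
)

print(response["VpcEndpoint"]["VpcEndpointId"])
```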
You can add S3 access to your EKS node IAM role; this link shows how to add ECR registry access to the EKS node IAM role, and the process is the same for S3.
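As a sketch, attaching an AWS managed S3 policy to the node role could look like the following; the role name is a placeholder, and in practice you would prefer a policy scoped to the specific bucket.

```python
# Minimal sketch: grant the EKS worker-node IAM role read access to S3 by
# attaching an AWS managed policy. The role name is hypothetical.
import boto3

iam = boto3.client("iam")

iam.attach_role_policy(
    RoleName="eks-worker-node-role",  # hypothetical node role
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
)
```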
The other way is to make credentials available as environment variables in your container (see this link), though I would recommend the first approach.

How to configure an Instance_1 in Region_1 to be able to use a custom VPC?

I have Lightsail instances in multiple regions.
I want to allow Instance_1 in Region_1 to communicate with a custom AWS VPC in that region.
I understand that each Lightsail instance is an independent VPS (virtual private server).
Is it correct to say that when VPC peering is enabled (under account settings), all the Lightsail instances in the region get access to the region's default VPC?
Is there any way to enable it for only one Lightsail instance?
Assuming a region has multiple VPCs (say, a default VPC and an additional VPC), is there any way to enable VPC peering to the non-default VPC?
No.
VPC Peering in Amazon Lightsail only permits connection to the Default VPC in a Region.
It also looks like all resources would be included in the peering relationship.
If you need better control, you would need to use Amazon EC2 instead of Amazon Lightsail.
(I suspect that these limitations are intentional, to encourage people with more requirements to use Amazon EC2. Amazon Lightsail is marketed as a 'starter' product with a lower price and therefore less functionality.)
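The Lightsail API reflects this limitation: the peering call takes no VPC or instance parameters, so peering is all-or-nothing for the region's default VPC. A quick sketch with boto3 (the region is a placeholder):

```python
# Minimal sketch: enable Lightsail VPC peering for a region with boto3.
# Note that peer_vpc() takes no arguments: there is no way to target a specific
# instance or a non-default VPC, which matches the limitation described above.
import boto3

lightsail = boto3.client("lightsail", region_name="us-east-1")

if not lightsail.is_vpc_peered()["isPeered"]:
    result = lightsail.peer_vpc()
    print(result["operation"]["status"])
```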

List of AWS services that don’t require a VPC to run

Google failed me again, or maybe I wasn't clear enough in my question.
Is there an easy way to determine, or rather how do we determine, which services are VPC-bound and which are non-VPC?
For example, EC2 and RDS require a VPC setup.
Lambda and S3 are publicly available services and don't need a VPC setup.
The basic services that require an Amazon VPC are all related to Amazon EC2 instances, such as:
Amazon RDS
Amazon EMR
Amazon Redshift
Amazon Elasticsearch
AWS Elastic Beanstalk
etc
These resources run "on top" of Amazon EC2 and therefore connect to a VPC.
There are also other services that use a VPC, but you would only use them if you are using some of the above services, such as:
Elastic Load Balancer
NAT Gateway
So, if you wish to run "completely non-vpc", then avoid services that are "deployed". It means you would use AWS Lambda for compute, probably DynamoDB for database, Amazon S3 for object storage, etc. This is otherwise referred to as going "serverless".
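To make the distinction concrete, here is a hedged sketch of a "completely non-VPC" setup: a Lambda handler that talks to DynamoDB and S3 over their public endpoints, with no VPC, subnet, or security-group configuration anywhere. The table and bucket names are placeholders.

```python
# Minimal sketch of a "serverless", non-VPC stack: a Lambda handler that writes
# to DynamoDB and S3. Neither the function nor the data stores need any VPC
# configuration. Table and bucket names are hypothetical.
import json
import boto3

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

def handler(event, context):
    table = dynamodb.Table("orders")  # hypothetical table
    table.put_item(Item={"order_id": event["order_id"], "payload": json.dumps(event)})

    s3.put_object(
        Bucket="my-archive-bucket",   # hypothetical bucket
        Key=f"orders/{event['order_id']}.json",
        Body=json.dumps(event).encode("utf-8"),
    )
    return {"statusCode": 200, "body": "stored"}
```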

AWS EMR on VPC with EC2 Instance

I have been reading about AWS EMR on a VPC, but it seems to be more of a design consideration for the AWS EMR service itself to access the EMR cluster for calls.
What I am trying to do is host a VPC with an ALB and an EC2 instance running an application as a service that accesses the EMR cluster.
VPC -> Internet Gateway -> Load Balancer -> EC2 (Application endpoints) -> EMR Cluster
I don't want the cluster to be accessible from outside except through the public IP of the Internet Gateway, and that public IP should only reach the EC2 instance hosting the application, which calls the EMR cluster in the same VPC.
Is this a recommended approach?
The design looks something like below.
Some challenges I am tackling: how to access S3 from EMR if it is on a VPC, whether the application running on EC2 can access the EMR cluster, and whether the EMR cluster would be publicly available.
Any guidance links or recommendations would be welcome.
EDIT:
Or, if I create EMR on a VPC, do I need to wrap it inside another VPC, something like below?
The simplest design is:
Put everything in a public subnet in a VPC
Use Security Groups to control access to the EMR cluster
If you are security-paranoid, then you could use:
Put publicly-accessible resources (eg EC2) in a public subnet
Put EMR in a private subnet
Use a NAT Gateway or VPC Endpoints to allow EMR to communicate with S3 (which is outside the VPC)
The first option is simpler and Security Groups act as firewalls that can fully protect the EMR cluster. You would create three security groups:
ELB-SG: Permit inbound access from the Internet on your desired ports. Associate the security group with your Load Balancer.
EC2-SG: Permit inbound access from ELB-SG (from the Security Group itself). Associate the security group with your EC2 instances.
EMR-SG: Permit inbound access from EC2-SG (from the Security Group itself). Associate EMR-SG with the EMR cluster.
This will permit only the Load Balancer to communicate with the EC2 instances and only the EC2 instances to communicate with the EMR cluster. The EMR cluster will be able to connect directly to the Internet to access Amazon S3 due to default rules permitting Outbound access.
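As an illustration of the chained security groups, here is a minimal boto3 sketch (not from the original answer): the VPC ID and ports are placeholders, and EMR-SG would be attached to the cluster as an additional security group.

```python
# Minimal sketch: three chained security groups (Internet -> ELB -> EC2 -> EMR)
# created with boto3. The VPC ID and ports are placeholders.
import boto3

ec2 = boto3.client("ec2")
vpc_id = "vpc-0123456789abcdef0"  # hypothetical VPC

def create_sg(name, description):
    return ec2.create_security_group(
        GroupName=name, Description=description, VpcId=vpc_id
    )["GroupId"]

elb_sg = create_sg("ELB-SG", "Load balancer: inbound from the Internet")
ec2_sg = create_sg("EC2-SG", "App instances: inbound from ELB-SG only")
emr_sg = create_sg("EMR-SG", "EMR cluster: inbound from EC2-SG only")

# Internet -> ELB-SG on port 443
ec2.authorize_security_group_ingress(
    GroupId=elb_sg,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)

# ELB-SG -> EC2-SG on the application port (example: 8080)
ec2.authorize_security_group_ingress(
    GroupId=ec2_sg,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
                    "UserIdGroupPairs": [{"GroupId": elb_sg}]}],
)

# EC2-SG -> EMR-SG on the cluster port your application uses (example: 8998)
ec2.authorize_security_group_ingress(
    GroupId=emr_sg,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 8998, "ToPort": 8998,
                    "UserIdGroupPairs": [{"GroupId": ec2_sg}]}],
)
```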

Connecting Kubernetes minions to classic (non-VPC) AWS resources

I'm looking to spin up a Kubernetes cluster on AWS that will access resources (e.g. RDS, ElastiCache) that are not on a VPC.
I was able to set up access to RDS by enabling ClassicLink on the kubernetes-vpc VPC, but this required commenting out the creation of one of Kubernetes' route tables (which conflicted with ClassicLink's route tables), which breaks some of Kubernetes networking. ElastiCache is more difficult, as it looks like its access is only grantable via classic EC2 security groups, which can't be associated with a VPC EC2 instance, AFAICT.
Is there a way to do this? I'd prefer not to use a NAT instance to provide access to ElastiCache.