I want to understand why there is a need to configure a VPC to access only Redis, and not other services (like DynamoDB), from a Lambda function.
Redis (either on an EC2 instance or via ElastiCache) runs inside your VPC, as do EC2, RDS, Redshift, and several other AWS services. If you want to access one of the services that runs inside your VPC, you have to configure your Lambda function with VPC access.
Other services, like DynamoDB, that don't exist inside the VPC naturally don't require VPC access.
As for why some services are inside the VPC and others are not: the services that exist in the VPC tend to be ones where you specify a specific server configuration (server size, number of servers, etc.) and run those servers inside your virtual network (VPC). On the other hand, there are services like DynamoDB where the actual back-end server size and quantity are not exposed to you. Amazon completely manages these servers and you don't even get to know how many of them there are. These servers do not exist inside your virtual network, and you only access these services via the AWS API.
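To make the distinction concrete, here is a minimal sketch (assuming Python with boto3; the function names and subnet/security-group IDs are placeholders) of how a Lambda function gets attached to a VPC, and how a DynamoDB-only function needs no VpcConfig at all:

```python
def lambda_update_params(function_name, subnet_ids=None, sg_ids=None):
    """Build parameters for lambda.update_function_configuration.

    A VpcConfig is only needed when the function must reach in-VPC
    resources (ElastiCache Redis, RDS, ...); a function that only talks
    to DynamoDB calls the public service API and needs no VpcConfig.
    """
    params = {"FunctionName": function_name}
    if subnet_ids and sg_ids:
        params["VpcConfig"] = {"SubnetIds": subnet_ids,
                               "SecurityGroupIds": sg_ids}
    return params

# Usage (requires boto3 and AWS credentials):
# import boto3
# aws_lambda = boto3.client("lambda")
# # Needs VPC access to reach ElastiCache Redis:
# aws_lambda.update_function_configuration(
#     **lambda_update_params("redis-reader", ["subnet-0abc"], ["sg-0def"]))
# # DynamoDB-only function: leave VpcConfig out entirely.
# aws_lambda.update_function_configuration(**lambda_update_params("dynamo-reader"))
```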
My company has an Azure function app that accesses a service on an AWS EC2 instance.
We use an AWS security group to allow access to the service's port only from the 7 possible outbound IP addresses used by the Azure function app, which are listed in the Azure portal.
However, when the function app "scales", it can apparently execute from any of roughly 600 CIDR ranges of possible Azure IP addresses (AzureCloud.eastus2 in my case).
In those cases the function app fails to reach the needed web service.
AWS security groups allow only 60 inbound rules, so I couldn't add 600 even if I wanted to.
Is there a better approach to opening an AWS instance's port to an Azure function app?
Just as you can use a NAT Gateway with AWS Lambda functions to provide a static outbound IP, you can do the same for Azure Functions with a Virtual Network NAT gateway. That appears to be your best option: the function app's outbound traffic then comes from a single static IP, which fits in one security group rule.
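To make that concrete on the AWS side: once the function app egresses through a single static NAT IP, the security group needs only one inbound rule instead of ~600. A minimal sketch, assuming Python with boto3; the IP, port, and group ID are placeholders:

```python
def ingress_rule_for_nat_ip(nat_ip, port):
    """One inbound rule permitting only the NAT gateway's static egress IP,
    in the IpPermissions shape used by authorize_security_group_ingress."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{
            "CidrIp": f"{nat_ip}/32",
            "Description": "Azure Functions NAT gateway egress IP",
        }],
    }

# Usage (requires boto3 and AWS credentials):
# import boto3
# ec2 = boto3.client("ec2")
# ec2.authorize_security_group_ingress(
#     GroupId="sg-0123456789abcdef0",
#     IpPermissions=[ingress_rule_for_nat_ip("203.0.113.10", 8443)])
```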
I can't figure out how to make them talk using API calls. Previously I used API Gateway to trigger Lambdas, and those Lambdas would interact with DynamoDB and other services and send me back a JSON response. Now I want to switch to EC2 instances and skip API Gateway entirely, letting a server I run on EC2 do the computation for me. Do I need to deploy a web service (Django REST) on the EC2 instance and then call it from my frontend? If so, I need a little guidance on how.
Also, suppose I want to access S3 storage from my Django REST service on EC2. Can I do it without entering an access key and ID, using roles instead, just like I would access S3 from the EC2 instance itself? Traditionally with the SDK we have to use an access key and secret key just to be authorized to use services, so I was wondering if there was a way around this since the program will be running on the EC2 instance itself. One really inefficient option would be to shell out to batch commands so the EC2 instance interacts with the services I need without the SDK, using roles instead, but that is too much work as far as I can see.
Since you are familiar with API Gateway, you can use it to connect to your EC2 instance through its private integration, with the use of VPC Links.
You can create an API Gateway API with private integration to provide your customers access to HTTP/HTTPS resources within your Amazon Virtual Private Cloud (Amazon VPC). Such VPC resources are HTTP/HTTPS endpoints on an EC2 instance behind a Network Load Balancer in the VPC.
You can go through this document for step-by-step integration.
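For orientation, the VPC Link itself is a single call against the API Gateway service API, pointing at a Network Load Balancer in your VPC. A hedged sketch, assuming Python with boto3; the link name and NLB ARN are placeholders:

```python
def vpc_link_params(name, nlb_arn):
    """Parameters for apigateway.create_vpc_link. For REST API private
    integrations the target must be a Network Load Balancer ARN; the
    method integration then uses connectionType="VPC_LINK" plus the
    link's id."""
    return {"name": name, "targetArns": [nlb_arn]}

# Usage (requires boto3 and AWS credentials):
# import boto3
# apigw = boto3.client("apigateway")
# link = apigw.create_vpc_link(**vpc_link_params(
#     "ec2-backend",
#     "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/my-nlb/0123abcd"))
```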
If you do not want to use API Gateway anymore, you can simply use Route 53 to route traffic to the EC2 instance; all you need is the IP address of the EC2 instance and a hosted zone created in Route 53.
Here is a tutorial for your reference.
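On the S3 part of the question: when the Django process runs on an EC2 instance with an IAM role (instance profile) attached, the SDK picks up credentials automatically from the instance metadata service, so no access key or secret ever appears in the code. A minimal sketch, assuming Python with boto3; the region and bucket names are placeholders:

```python
def s3_client_kwargs(region):
    """Arguments for boto3.client('s3') on an EC2 instance with a role:
    note there is no aws_access_key_id / aws_secret_access_key -- boto3's
    credential chain falls back to the instance profile automatically."""
    return {"service_name": "s3", "region_name": region}

# Usage (requires boto3, running on EC2 with an instance role that allows S3):
# import boto3
# s3 = boto3.client(**s3_client_kwargs("us-east-1"))
# s3.list_objects_v2(Bucket="my-bucket")  # authorized via the instance role
```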
I want to execute AWS CLI commands for RDS not via the internet but via a VPC network, mainly to create manual snapshots of RDS.
However, VPC endpoints support only the RDS Data API, according to the following document:
VPC endpoints - Amazon Virtual Private Cloud
Why? I need to execute the command within a closed network to comply with security rules.
Just to reiterate: you can still connect to your RDS database through the normal private network, using whichever library you choose, to perform any DDL, DML, DCL, and TCL commands. In your case, though, creating a snapshot goes via the service endpoint.
VPC endpoints are for connecting to the service APIs that power AWS (think of the interactions you perform in the console, SDK, or CLI). At the moment this means that for RDS, creating, modifying, or deleting resources requires using the API over the public internet (using HTTPS for encrypted traffic).
VPC endpoints are added over time; just because a specific API is not supported now does not mean it never will be. The team behind each AWS service has to carry out an integration to make VPC endpoints work.
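For completeness, this is what the snapshot call looks like today: a plain service-API (control-plane) call that travels over the public RDS endpoint via HTTPS, not through a VPC endpoint. A sketch assuming Python with boto3; the instance and snapshot identifiers are placeholders:

```python
def snapshot_params(instance_id, snapshot_id):
    """Parameters for rds.create_db_snapshot. This is a service-API
    (control-plane) call, so it goes to the public RDS endpoint over
    HTTPS rather than through the private network to the database."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBSnapshotIdentifier": snapshot_id,
    }

# Usage (requires boto3 and AWS credentials):
# import boto3
# rds = boto3.client("rds")
# rds.create_db_snapshot(**snapshot_params("prod-db", "prod-db-manual-snap"))
```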
I am trying to connect to services and databases running inside a VPC (private subnets) from an AWS Glue job. The private resources should not be exposed publicly (e.g., moving to a public subnet or setting up public load balancers).
Unfortunately, AWS Glue doesn't seem to support running inside user-defined VPCs. AWS does provide something called Glue Database Connections which, when used with the Glue SDK, magically set up elastic network interfaces inside the specified VPC for the Glue/Spark worker nodes. The network interfaces then tunnel traffic from Glue to a specific database inside the VPC. However, this requires the location and credentials of specific databases, and it is not clear if and when other traffic (e.g., a REST call to a service) is tunnelled through the VPC.
Is there a reliable way to setup a Glue -> VPC connection that will tunnel all traffic through a VPC?
You can create a database connection with the NETWORK connection type and use that connection in your Glue job. It will allow your job to call a REST API or reach any other resource within your VPC.
https://docs.aws.amazon.com/glue/latest/dg/connection-using.html
Network (designates a connection to a data source within an Amazon Virtual Private Cloud environment (Amazon VPC))
https://docs.aws.amazon.com/glue/latest/dg/connection-JDBC-VPC.html
To allow AWS Glue to communicate with its components, specify a security group with a self-referencing inbound rule for all TCP ports. By creating a self-referencing rule, you can restrict the source to the same security group in the VPC and not open it to all networks.
However, this requires the location and credentials of specific databases, and it is not clear if and when other traffic (e.g., a REST call to a service) is tunnelled through the VPC.
I agree the documentation is confusing, but according to this paragraph on the page you linked, it appears that all traffic is indeed tunneled through the VPC, since you have to have a NAT Gateway or VPC endpoints to allow Glue to access things outside the VPC once you have configured it with VPC access:
All JDBC data stores that are accessed by the job must be available from the VPC subnet. To access Amazon S3 from within your VPC, a VPC endpoint is required. If your job needs to access both VPC resources and the public internet, the VPC needs to have a Network Address Translation (NAT) gateway inside the VPC.
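To show what the NETWORK connection type looks like in practice: unlike a JDBC connection it carries no database URL or credentials, only the VPC placement for the worker network interfaces. A sketch assuming Python with boto3; all IDs are placeholders:

```python
def network_connection_input(name, subnet_id, sg_ids, az):
    """ConnectionInput for glue.create_connection with the NETWORK type:
    no JDBC URL or credentials, just where in the VPC to place the
    elastic network interfaces for the Glue workers."""
    return {
        "Name": name,
        "ConnectionType": "NETWORK",
        "ConnectionProperties": {},
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": sg_ids,
            "AvailabilityZone": az,
        },
    }

# Usage (requires boto3 and AWS credentials):
# import boto3
# glue = boto3.client("glue")
# glue.create_connection(ConnectionInput=network_connection_input(
#     "vpc-network-conn", "subnet-0abc", ["sg-0def"], "us-east-1a"))
```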
I'm creating a VPC to host a web app at AWS, and I want to use load balancers. Do I need to create an endpoint for ELB like I have to for S3?
Confusingly, AWS uses 'endpoint' to refer to a couple of different things. Judging by your question, are you referring to this: https://aws.amazon.com/blogs/aws/new-vpc-endpoint-for-amazon-s3/
Essentially, before VPC endpoints were introduced, the only way to access certain AWS services was via a public URL. That is fine unless you are working in a locked-down VPC where an instance might not have access to the public internet. With the introduction of VPC endpoints a few days ago, you can now access AWS services directly from a private instance.
As of right now S3 is the only service supported, but no doubt endpoints will be rolled out to similar services (DynamoDB, SQS, SNS, etc.) in the near future.
The exception to this is services that are able to live inside a VPC that you create, i.e. when creating them you tell them which VPC (and often which subnet) they should be created in. Examples of this are ELB, RDS, EC2, Redshift, etc. For these there is no need to create an endpoint; they already exist in your VPC and can be accessed directly.
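As a concrete illustration of the S3 case: a Gateway-type VPC endpoint is created via the EC2 API and attached to route tables, after which private instances can reach S3 without internet access. A sketch assuming Python with boto3; the VPC, region, and route-table IDs are placeholders:

```python
def s3_endpoint_params(vpc_id, region, route_table_ids):
    """Parameters for ec2.create_vpc_endpoint: a Gateway endpoint for S3.
    In-VPC services (ELB, RDS, EC2, Redshift, ...) need no endpoint --
    they already live inside the VPC and are reached directly."""
    return {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "VpcEndpointType": "Gateway",
        "RouteTableIds": route_table_ids,
    }

# Usage (requires boto3 and AWS credentials):
# import boto3
# ec2 = boto3.client("ec2")
# ec2.create_vpc_endpoint(**s3_endpoint_params("vpc-0abc", "us-east-1", ["rtb-0def"]))
```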