I have written a Python shell script in AWS Glue that uses the paramiko client to connect to an FTP server, but the connection is failing because the FTP server's owner has not granted access to our VPC.
Now I need to add a subnet for Glue, so that I can give that subnet's IP to the FTP owner and he can grant access to the Glue service. However, I cannot find out how to map or add that subnet to Glue, or how to create a subnet for Glue.
Can someone help me with this?
You can do the following to get an Elastic IP for Glue and then whitelist it in your FTP service:
1. Add a development endpoint.
2. Choose the proper VPC, subnet, and security group while creating the endpoint.
3. Create an Elastic IP and attach it to the Glue endpoint. Refer to "Access a development endpoint by attaching an Elastic IP address" in the AWS documentation.
4. Use this Elastic IP for whitelisting in the FTP service.
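As a sketch, the allocate-and-attach step can also be done with boto3. This is untested against a live account, and the network interface ID (`eni_id` below) is a placeholder: you would look up the development endpoint's ENI in the EC2 console first.

```python
def attach_eip_to_dev_endpoint(eni_id, region="us-east-1"):
    """Allocate an Elastic IP and attach it to the development
    endpoint's network interface. Sketch only: eni_id is the
    endpoint's ENI (e.g. found in the EC2 console), not verified here."""
    import boto3  # imported inside so the sketch stays self-contained

    ec2 = boto3.client("ec2", region_name=region)
    # Allocate a new Elastic IP in the VPC scope.
    alloc = ec2.allocate_address(Domain="vpc")
    # Associate it with the development endpoint's network interface.
    ec2.associate_address(
        AllocationId=alloc["AllocationId"],
        NetworkInterfaceId=eni_id,
    )
    # This is the address to hand to the FTP owner for whitelisting.
    return alloc["PublicIp"]
```

The returned public IP is the one you would pass to the FTP service owner.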
I am trying to connect to one of my EC2 from my local machine using AWS Client VPN Endpoint.
I have Landing Zone Setup.
The Transit Gateway and the AWS Client VPN Endpoint are created in the Shared Account, and the Transit Gateway is shared with the Application Account using AWS RAM.
A VPC is also created in the Shared Account. I am able to ping/connect to the instance launched in the Shared Account, but I am not able to ping/connect to the server launched in the Application Account.
I also tried to ping from an EC2 machine in the Shared Account to an EC2 machine in the Application Account; this did not work either, though ideally I would expect it to connect.
I have included most of the details and configuration in the following images. It would be great if someone could help me understand the root cause.
Note: I have not configured DNS Servers while creating AWS Client VPN Endpoint.
If you follow the routes in your picture, you want to connect from your machine to an IP address in the range 1.8.2.2/26.
This already fails at the start, since the Client VPN has no route configured for that range, only for 1.8.2.6/26. So your packet doesn't get past the Client VPN. Add a route at the Client VPN for 1.8.2.2/26 that targets subnet SA.
That should get you at least one step further :)
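The route check above can be sanity-checked locally with Python's standard `ipaddress` module. The CIDRs below are hypothetical placeholders (the real fix is still adding the route on the Client VPN endpoint's route table):

```python
import ipaddress

def route_covers(route_cidr, destination_ip):
    """Return True if destination_ip falls inside route_cidr.
    strict=False accepts CIDRs written with host bits set."""
    network = ipaddress.ip_network(route_cidr, strict=False)
    return ipaddress.ip_address(destination_ip) in network

# Hypothetical example: a route for one subnet does not cover a
# destination in another subnet, so the packet is dropped at the
# Client VPN before it ever reaches the Transit Gateway.
print(route_covers("10.0.1.0/24", "10.0.2.15"))  # False: no matching route
print(route_covers("10.0.2.0/24", "10.0.2.15"))  # True once the route exists
```

Running each of your Client VPN routes through a check like this against the target instance's IP quickly shows whether any route covers it.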
What am I trying to do?
I want to migrate and automatically replicate data from SQL Server on my on-premises Windows Server to a database in the AWS Cloud. I am using AWS DMS (Database Migration Service) for this.
What have I done/tried already?
I have set up a Site-to-Site VPN (between the on-premises network and the AWS VPC)
I am able to ping an EC2 instance in the VPC from the on-premises Windows Server
I am able to ping the on-premises Windows Server from the EC2 instance in the VPC
I have created a DMS replication instance. Its private IP is within the VPC CIDR already allowed in the VPN connection setup
I am able to ping the private IP of the DMS replication instance from the EC2 instance
However, I am NOT able to ping the private IP of the DMS replication instance from the on-premises Windows Server
I have set up a DB server on my on-premises Windows Server and added this DB as a DMS source endpoint. When I tried to test the connection, it failed with the following error message:
I have linked a security group to the DMS replication instance. This is the same security group I used in the VPN connection setup
My DMS source DB endpoint configuration is as follows:
What do I want to know?
Why am I not able to ping the private IP of the DMS replication instance, while I am able to ping an EC2 instance over the same VPN?
Why is the DMS endpoint test connection failing?
Could you help me in doing this DB migration please?
The following debugging method may help you.
Since you mentioned that you are able to ping the EC2 instance's private IP from your on-premises network, it is clear that the Site-to-Site VPN is working.
You did not mention whether you created the DMS instance in the same subnet as the Windows instance that you can ping from your on-premises network. If you created the DMS instance in a different subnet, please make sure the route table associated with that subnet has route propagation enabled. Then check that the security group's inbound rules allow the relevant port numbers and IP addresses. This way you can make sure everything is set up properly on the AWS side.
From your on-premises site, please run a telnet test with the following command.
Windows/Linux:
Open a command prompt in Windows or a terminal in Linux and try
telnet <<DMS IP>> <<Port Number>>
If it connects successfully, then you have connectivity from on-premises to the DMS host.
If it does not connect or times out, you need to contact your on-premises network manager (or whoever is in charge) and tell them that you have an issue connecting to the AWS subnet x.x.x.x/x CIDR from the on-premises network.
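If telnet is not installed, the same reachability test can be sketched in Python with the standard `socket` module. The IP and port in the comment are placeholders for your DMS replication instance's private IP and your database port:

```python
import socket

def check_tcp(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds within
    `timeout` seconds; False if it is refused or times out."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder usage; substitute your DMS instance's private IP and port:
#   check_tcp("10.0.0.10", 1433)   # True means on-premises can reach DMS
```

A `False` result corresponds to the timed-out telnet case above: the packet is being dropped somewhere between on-premises and the DMS subnet.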
If I have a service that requires IP whitelisting, how can I connect AWS Glue to it? I read that I can put AWS Glue in a private VPC and configure a NAT gateway, then allow the NAT IP to connect to the service. However, I cannot find any way to configure my Glue job to run inside a subnet/VPC. How do I do this?
The job will run in a VPC automatically if you attach a database connection to a resource that is inside the VPC. For example, I have a job that reads data from S3 and writes into an Aurora database in a private VPC using a Glue connection (configured as JDBC).
That job automatically has access to all the resources inside the VPC, as explained here. If the VPC has NAT enabled for external access, your job can take advantage of that as well.
Note that if you use a connection that requires a VPC and you also use S3, you will need to create an S3 endpoint in that VPC as well.
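As a sketch, such a VPC-pinned Glue connection can be described to the API as follows. Every identifier below (connection name, JDBC URL, credentials, subnet, security group, AZ) is a placeholder, not a real resource:

```python
# Sketch of the ConnectionInput you would pass to
# boto3.client("glue").create_connection(ConnectionInput=connection_input).
# All identifiers below are hypothetical placeholders.
connection_input = {
    "Name": "aurora-private-vpc",  # hypothetical connection name
    "ConnectionType": "JDBC",
    "ConnectionProperties": {
        "JDBC_CONNECTION_URL": "jdbc:mysql://example-host:3306/mydb",
        "USERNAME": "admin",
        "PASSWORD": "change-me",
    },
    # This section is what pins the job's elastic network interfaces
    # to your subnet, so outbound traffic flows through the VPC's
    # routing (e.g. the NAT gateway with the whitelisted IP).
    "PhysicalConnectionRequirements": {
        "SubnetId": "subnet-0123456789abcdef0",
        "SecurityGroupIdList": ["sg-0123456789abcdef0"],
        "AvailabilityZone": "us-east-1a",
    },
}
```

Attaching a connection shaped like this to the job is what makes it run inside the subnet.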
The answer to your question is here: https://stackoverflow.com/a/64414639. Note that Glue is a managed service, so it does not publish any list of IP addresses that can be whitelisted. As a workaround, you can use an EC2 instance to run your custom Python or PySpark script and whitelist that instance's IP address.
Hi, I am having trouble crawling Mongo data to S3 using a crawler from AWS Glue. MongoDB Atlas requires you to whitelist the IPs it expects connections from, and as AWS Glue is serverless I do not have any fixed IP. Please suggest any solutions for this.
According to the document Connecting to a JDBC Data Store in a VPC, AWS Glue jobs that belong to a specified VPC (where the VPC has a NAT gateway) should have a fixed IP address. For example, after a NAT gateway is configured for a VPC, HTTP requests from an EC2 server in that VPC come from a fixed IP address.
I've not tested this for Glue, but how about setting a VPC and NAT gateway for the Glue job?
I am new to the AWS environment. I have installed Apache Atlas on an EC2 instance, and from Lambda I am trying to get metadata from the Glue Data Catalog and post it to Apache Atlas (which uses REST endpoints) running on EC2. I am able to get the Glue Data Catalog metadata in the Lambda function.
How can I use a curl/HTTP GET call from the Lambda function to access the service running on port 21000 on localhost on my EC2 instance?
Update 1: Resolved by allowing all inbound traffic on the EC2 instance's private IP in the security group.
Update 2: Now I am able to access both the REST URL (via its private IP) and the Glue catalog within Lambda. What I did: I created a private and a public subnet, put my EC2 instance and Lambda on the same private subnet, and configured a NAT gateway on the public subnet.
Now my Lambda is working, but I am not able to SSH into my EC2 instance. Is there a way to get that working as well?
"localhost" is relative to each computer: what "localhost" means on your EC2 server is different from what it means in AWS Lambda. You need to stop trying to access "localhost" and use the server's IP address instead.
To access port 21000 on the EC2 server the Lambda function needs to be placed in the same VPC that the EC2 instance is in, and the EC2 server needs to be listening to external traffic on port 21000, not just localhost traffic. You would assign a security group to the Lambda function, and in the security group assigned to the EC2 server you would open port 21000 for traffic coming from the Lambda function's security group. Finally, the Lambda function would access the EC2 server by addressing it via the server's private IP.
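Inside the Lambda handler, the call can then be sketched with the standard library alone. The private IP below is a placeholder for your EC2 instance's address, and the path assumes the Atlas v2 REST API; check your Atlas version's documentation:

```python
import json
import urllib.request

ATLAS_HOST = "10.0.1.23"   # placeholder: the EC2 instance's private IP
ATLAS_PORT = 21000

def atlas_url(path):
    """Build a URL against the Atlas server by private IP, not localhost."""
    return f"http://{ATLAS_HOST}:{ATLAS_PORT}{path}"

def get_typedefs():
    """Fetch Atlas type definitions (assumes the Atlas v2 REST API path)."""
    req = urllib.request.Request(atlas_url("/api/atlas/v2/types/typedefs"))
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

The key point is that the URL targets the server's private IP, which only resolves once the Lambda function is attached to the same VPC as described above.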
I'm not familiar with Apache Atlas or whether it exposes its own HTTP endpoints to external clients. If it does not, you need a web server running on EC2 for that.
An EC2 server doesn't magically accept HTTP calls from external connections and route them to the local resources you want (in this case, Atlas). Install Apache HTTP Server, nginx, or any other web server on your EC2 instance, configure it properly, and write some code that takes the data POSTed by your Lambda and submits it to the local Apache Atlas API.
The following page contains some instructions in this direction; search the web if you need more help, as there are plenty of tutorials for this already: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Tutorials.WebServerDB.CreateWebServer.html