AWS Neptune Connection Failed - Loading Data Into Neptune Cluster from S3 Bucket using curl

I am trying to bulk load RDF N-triples data from an S3 bucket into Neptune using the Neptune loader. I have created an S3 bucket, IAM role, S3 VPC endpoint, and Neptune cluster as per the following guide: https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-data.html.
I am trying to execute a curl command locally from Windows using Command Prompt to load the data:
curl -X POST -H "Content-Type: application/json" https://<clusterEndpoint>:<clusterPort>/loader -d "{\"source\":\"s3://<bucketName>\",\"format\":\"ntriples\",\"iamRoleArn\":\"arn:aws:iam::<account-id>:role/<role-name>\",\"region\":\"<region>\",\"failOnError\":\"FALSE\",\"parallelism\":\"MEDIUM\",\"updateSingleCardinalityProperties\":\"FALSE\",\"queueRequest\":\"TRUE\"}"
On executing the above curl command I get the following error:
Failed to connect to <neptuneClusterEndpoint> port <portNumber>: Timed out
Also, when I tried to check the cluster status using the command curl http://<neptuneCluster>:<portNumber>/status, I got the same timeout error.
I am trying to run a Neptune bulk load using curl without creating an EC2 instance. Why am I getting the connection failed error? Is there a way to use curl to perform a Neptune load successfully?

The curl command needs network access to the Neptune VPC. That could be via an EC2 bastion host over an SSH tunnel, for example. As you are trying to avoid using EC2, you will need to set up an alternative way to reach Neptune, such as a load balancer. You could also use a Lambda function, so long as the function has access to the VPC. There are many other ways you could connect, but since Neptune does not expose a public IP address, you will need to configure some path from your curl command into that VPC. Also be aware that if you have IAM authentication enabled on the Neptune cluster, the request will have to be signed with SigV4 credentials.
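If you do stand up a bastion host, a minimal sketch of the SSH tunnel approach looks like this (the key path, bastion address, and SSH user are placeholders):
# Forward local port 8182 to the cluster endpoint through a bastion inside the Neptune VPC
ssh -i <path-to-key>.pem -N -L 8182:<clusterEndpoint>:8182 ec2-user@<bastionPublicIp>
# In a second shell, send the same loader request to the tunnel; -k is needed
# because the cluster's TLS certificate will not match "localhost"
curl -k -X POST -H "Content-Type: application/json" https://localhost:8182/loader -d "<same JSON body as above>"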
If you have a Neptune notebook configured, you can just use the %load command, which also handles any SigV4 signing for you.
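In a workbench cell the magic is a one-liner; running it brings up a form where you fill in the source S3 URI, format, and load role ARN:
%load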

Related

Hit AWS ECS load balancer endpoint using Java

I am trying to hit a load balancer endpoint using the AWS SDK for Java; however, I don't see any API for it in the AmazonECSClient class. I see the option to set the endpoint, region, credentials, etc.:
AmazonECSClient.builder()
.withCredentials(new DefaultAWSCredentialsProviderChain())
.withRegion(region)
.withEndPoint()
.build();
The endpoint was tested using curl and it works: curl http://elb-dummy-endpoint.us-east-1.elb.domain.com:80/invocations -d '{"query": "some query"}' -H 'Content-Type: application/json'
Do I have to do a regular API call?
I am trying to hit a load balancer endpoint using AWS SDK for java
You would not use the AWS SDK for hitting a load balancer endpoint. The AWS SDK is for interacting with the AWS API for doing things like creating a load balancer. The load balancer is serving your API, not the AWS API, so you would not use the AWS SDK to interact with the load balancer.
The AmazonECSClient class you are trying to use is for doing things like creating/updating/deleting ECS clusters, services and tasks. It is not a client for the application you have running on ECS.
The endpoint is tested using curl command and it works - curl http://elb-dummy-endpoint.us-east-1.elb.domain.com:80/invocations -d '{"query": "some query"}' -H 'Content-Type: application/json'
You are testing it here with curl to make a basic HTTP call; you are not using the AWS CLI tool. In Java you would do the same thing: make a basic HTTP call against the endpoint with an HTTP client, not with the AWS SDK.
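The same distinction is visible from the command line; a quick sketch, reusing the endpoint from the question:
# Talks to the AWS API (control plane) - this is what the AWS SDK/CLI is for
aws ecs list-clusters
# Talks to YOUR application behind the load balancer - plain HTTP, no AWS SDK involved
curl http://elb-dummy-endpoint.us-east-1.elb.domain.com:80/invocations -d '{"query": "some query"}' -H 'Content-Type: application/json'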

Downloading a file in AWS EC2 from a protected server accessed by VPN

I have a file on a server located at let's say: https://example.com/myfolder/file.gz
The domain example.com is protected by inbound rules such that I can only access it by connecting to a VPN.
So on my personal hardware, when I try to download the file using curl -O https://example.com/myfolder/file.gz, it gives me an "Unknown host error". However, when I connect to the VPN via Cisco AnyConnect and then run the curl command, file.gz gets downloaded.
Now what I want is to enable an AWS EC2 instance to download file.gz by executing the curl command. However, when I SSH into my EC2 instance and execute the curl command, it gives me the same "Unknown host error", as expected.
I have looked into VPC, ENI, OpenVPN, and other AWS offerings, but since I'm a beginner with AWS I didn't quite understand which of them could solve my problem.
FYI - one way to possibly resolve this would be to create an Elastic IP and whitelist it on the server I'm trying to access (in this case example.com), but I want to create more EC2 instances that do the same thing, so that would become tedious to manage.

Connecting to Postgres using private IP

When creating my Postgres Cloud SQL instance I specified that I would like to connect to it using private IP and chose my default network.
My VM sits in the same default network.
Now I follow the instructions described at https://cloud.google.com/sql/docs/postgres/connect-compute-engine
and try executing
psql -h [CLOUD_SQL_PRIVATE_IP_ADDR] -U postgres
from my VM, but get this error:
psql: could not connect to server: Connection timed out
Is the server running on host "CLOUD_SQL_PRIVATE_IP_ADDR" and accepting TCP/IP connections on port 5432?
Anything I am overlooking?
P.S. My Service Networking API (whatever that is) is enabled.
If you have SSH access to a VM in the same network, you can connect to Cloud SQL using the Cloud SQL proxy:
1. Open the SSH window (VM instances in Compute Engine, click on SSH), then download the proxy binary:
wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O cloud_sql_proxy
2. In the SSH shell, make it executable:
chmod +x cloud_sql_proxy
3. Create a service account with the Cloud SQL Client role and create a JSON key. Download the JSON key to your local computer.
4. In the SSH VM shell, click on the wheel icon, choose "Upload", and upload the key file.
5. Start the proxy:
./cloud_sql_proxy -instances=<Instance connection name>=tcp:5432 -credential_file=<name of the json file>
where "Instance connection name" can be found in SQL > Overview > Connect to this instance.
6. Finally, connect through the proxy:
psql "host=127.0.0.1 port=5432 sslmode=disable user=<your-user-name> dbname=<your-db-name>"
On the other hand, if you want to connect to Cloud SQL from your local computer and the Cloud SQL instance does not have a public IP, you have to connect through a bastion host configuration:
https://cloud.google.com/solutions/connecting-securely
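A minimal sketch of that bastion approach, assuming a VM in the same VPC as the Cloud SQL instance (the VM name, zone, and addresses are placeholders):
# From the local machine: tunnel local port 5432 to the Cloud SQL private IP via the bastion VM
gcloud compute ssh <bastion-vm> --zone=<zone> -- -N -L 5432:<CLOUD_SQL_PRIVATE_IP_ADDR>:5432
# Then, in another local shell:
psql "host=127.0.0.1 port=5432 user=<your-user-name> dbname=<your-db-name>"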
According to the documentation on connecting via private IP, you need to set up the following (a quick gcloud check is sketched after this list):
- You must have enabled the Service Networking API for your project. If you are using shared VPC, you also need to enable this API for the host project.
- Enabling APIs requires the servicemanagement.services.bind IAM permission.
- Establishing private services access requires the Network Administrator IAM role. After private services access is established for your network, you do not need the Network Administrator role to configure an instance to use private IP.
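Both prerequisites can be checked from the gcloud CLI; a sketch, with the project ID and network name as placeholders:
# Enable the Service Networking API for the project
gcloud services enable servicenetworking.googleapis.com --project=<project-id>
# Verify that private services access (the VPC peering to Service Networking) exists for the network
gcloud services vpc-peerings list --network=<network-name>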

AWS CLI S3 Access

I am trying to access an S3 bucket using the AWS CLI. I have installed and configured the AWS CLI referring to AmazonAWSCLI. So, when I try to list the bucket using aws s3 ls s3://xyz/ I get the following error:
Not supported proxy scheme asusproxy
I tried to set up the proxy using
export HTTP_PROXY=http://username:password@a.b.c.d:n
But I am still getting the same error. What might be the issue?
Note: the same setup works on an Amazon EC2 machine.
If you need to access AWS through proxy servers, you should configure the HTTP_PROXY and HTTPS_PROXY environment variables with the IP addresses for your proxy servers.
Try:
export HTTP_PROXY=http://username:password@a.b.c.d:n
export HTTPS_PROXY=http://username:password@a.b.c.d:n
The setup works on your EC2 instance because there is no proxy server there. The error "Not supported proxy scheme asusproxy" means the CLI is picking up a proxy URL whose scheme is not http or https, so also check that no stale proxy variable is still set in your environment.
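A quick way to verify what the CLI will actually use (the proxy host and port are placeholders):
# Check for stale proxy settings (such as the bad "asusproxy" scheme) in the environment
env | grep -i proxy
# Override with well-formed proxy URLs, then retry
export HTTP_PROXY=http://username:password@<proxy-host>:<port>
export HTTPS_PROXY=http://username:password@<proxy-host>:<port>
aws s3 ls s3://xyz/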

Connect with SSH to Amazon Elasticsearch Service

I want to run a Python script (which I have run on a Docker Ubuntu installation) on AWS. It sends data from Twitter to Elasticsearch. I want to run it on Amazon Elasticsearch Service. I have set up Amazon Elasticsearch Service on AWS, but I don't know how to get the script into the system and get it running.
What would the SSH command be to access the Elasticsearch server?
Once I am able to access it, where would I place the Python script in order to feed data into the Elasticsearch server?
I tried
PS C:\Users\hpBill> ssh root@search-wbcelastic-*******.us-east-1.es.amazonaws.com/
but just get this:
ssh.exe": search-wbcelastic-**********.us-east-1.es.amazonaws.com/: no address associated with name
I have this information:
Domain status: Active
Endpoint: search-wbcelastic-*********.us-east-1.es.amazonaws.com
Domain ARN: arn:aws:es:us-e******1:domain/wbcelastic
Kibana: search-wbcelastic-********.us-east-1.es.amazonaws.com/_plugin/kibana/
You cannot SSH directly into Amazon Elasticsearch Service, so that SSH command will never work. You have two options for running the Python script: either launch an EC2 instance and run it there with the AWS CLI, or store and run the script from your local machine with the AWS CLI. Here is the developer guide for the AWS command line tools for Cloud Search:
http://docs.aws.amazon.com/cloudsearch/latest/developerguide/using-cloudsearch-command-line-tools.html
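Since the domain only exposes an HTTPS endpoint, the script (or a quick test) just makes HTTP calls against it. A minimal sketch, assuming the domain's access policy allows the request without SigV4 signing (the index name is a placeholder; older Elasticsearch versions use an explicit type in place of _doc):
# Index a test document into the domain over HTTPS - no SSH involved
curl -X POST "https://search-wbcelastic-*********.us-east-1.es.amazonaws.com/<index>/_doc" -H "Content-Type: application/json" -d '{"user": "test", "text": "hello from the script"}'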