Accessing ElasticSearch API on AWS - amazon-web-services

AWS recommends using its SDKs (such as boto3) or command-line tools to configure an ElasticSearch cluster.
However, some ElasticSearch API endpoints are not exposed in AWS APIs (e.g. _cat/shards).
Even some AWS support documents (such as this one on cluster rebalancing) seem to make direct requests to the cluster API.
The trouble is: such requests need to be authenticated using AWS4Auth (only certain IAM roles have permissions to write to ElasticSearch, in my setup) – and even AWS recommends against manually creating signed HTTP requests because it's such a pain.
My question is: do I need to manually create signed HTTP requests against my ES cluster in order to manage it, or is there an easier way that I've missed?

Based on the comments, the proposed solution is to use the third-party aws-requests-auth package:
This package allows you to authenticate to AWS with Amazon's signature version 4 signing process with the python requests library.
An example of its use with Elasticsearch:
from aws_requests_auth.aws_auth import AWSRequestsAuth
from elasticsearch import Elasticsearch, RequestsHttpConnection

es_host = 'search-service-foobar.us-east-1.es.amazonaws.com'
auth = AWSRequestsAuth(aws_access_key='YOURKEY',
                       aws_secret_access_key='YOURSECRET',
                       aws_host=es_host,
                       aws_region='us-east-1',
                       aws_service='es')

# use the requests connection_class and pass in our custom auth class
es_client = Elasticsearch(host=es_host,
                          port=80,
                          connection_class=RequestsHttpConnection,
                          http_auth=auth)
print(es_client.info())
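For the _cat/shards endpoint mentioned in the question, the same client can be reused; a minimal sketch, continuing from the es_client created above (elasticsearch-py exposes the _cat endpoints through its cat API):

# Continuing from the es_client above: the cat API wraps the _cat/* endpoints,
# including the _cat/shards endpoint from the question.
print(es_client.cat.shards())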

I ended up finding this AWS doc on request signing for AWS ElasticSearch. It clearly shows that the intended approach is to use small scripts written with the HTTP client of your language of choice.
As Marcin mentioned in his answer, aws-requests-auth is one way to simplify this in Python.
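If you prefer to stay close to that AWS doc and sign the request yourself, botocore already ships the signing machinery, so the script stays short. A minimal sketch, assuming botocore and requests are installed and that the domain endpoint below is a placeholder:

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

es_endpoint = 'https://search-service-foobar.us-east-1.es.amazonaws.com'  # placeholder
url = es_endpoint + '/_cat/shards'

# Sign a GET request with the credentials of the current boto3 session
credentials = boto3.Session().get_credentials()
aws_request = AWSRequest(method='GET', url=url)
SigV4Auth(credentials, 'es', 'us-east-1').add_auth(aws_request)

# Send the request with the SigV4 headers that add_auth attached
response = requests.get(url, headers=dict(aws_request.headers))
print(response.status_code)
print(response.text)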

Related

AWS HTTP API Gateway as a proxy to private S3 bucket

I have a private S3 bucket with lots of small files. I'd like to expose the contents of the bucket (only read-only access) using AWS API Gateway as a proxy. Both S3 bucket and AWS API Gateway belong to the same AWS account and are in the same VPC and Availability Zone.
AWS API Gateway comes in two types: HTTP API and REST API. The configuration options of REST API are more advanced, and REST API supports many more AWS service integrations than HTTP API does. In fact, the use case I described above is fully covered in one of the documentation tabs for REST API. However, REST API has one huge disadvantage: it's about 70% more expensive than HTTP API. The higher price buys more configuration options, but for now I need only one of them, the integration with S3, which is why I believe REST API is not well suited to my use case. I started searching for a way to integrate HTTP API with S3, and so far I haven't found one.
I tried creating/editing service-linked roles associated with the HTTP API Gateway instance, but those roles can't be edited (only read-only access). As for now, I don't have any idea where I should search next, or if my goal is even achievable using HTTP API.
I am a fan of AWS's HTTP APIs.
I work daily with an API that serves a very similar purpose. The way I have done it is by using AWS Lambda functions integrated with the API's paths.
What works for me is this:
Define your API paths, and integrate them with AWS Lambda functions.
Have your integrated Lambda function return a signed URL for any objects you want to provide access to through API calls.
There are several different ways to pass the name of the object(s) you want to the Lambda function servicing the API call.
This is the short answer. I plan to give a longer answer at a later time. But this has worked for me.
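To make the second step concrete, here is a minimal sketch of such a Lambda handler behind an HTTP API route like GET /objects/{key}; the bucket name and route shape are assumptions for illustration, not a definitive implementation:

import json
import boto3

s3 = boto3.client('s3')
BUCKET = 'my-private-bucket'  # hypothetical bucket name

def handler(event, context):
    # With the HTTP API (payload format 2.0), path parameters arrive
    # under event['pathParameters']
    key = event['pathParameters']['key']

    # Create a short-lived, read-only URL for the requested object
    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': BUCKET, 'Key': key},
        ExpiresIn=300,
    )

    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'url': url}),
    }

The caller then follows the returned URL to fetch the object directly from S3, so the API never has to stream the file itself.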

How can I enable the API in AWS Managed Workflows for Apache Airflow?

I'm testing the waters for running Apache Airflow on AWS through the Managed Workflows for Apache Airflow (MWAA). The version of Airflow that AWS have deployed and are managing for me is 1.10.12.
When I try to access the v1 REST API at /api/experimental/test I get back status code 403 Forbidden.
Is it possible to enable the experimental API in MWAA? How?
I think MWAA provides a REST endpoint to use the CLI:
https://$WEB_SERVER_HOSTNAME/aws_mwaa/cli
It's quite confusing because you first need to create a CLI token using the AWS CLI and then hit the endpoint with that token. You will need a policy that allows your AWS CLI credentials to request that token.
Lastly, not every command is supported, only a subset.
Anyway, it's all explained in the user guide:
https://docs.aws.amazon.com/mwaa/latest/userguide/amazon-mwaa-user-guide.pdf
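As an illustration of that flow, a minimal sketch in Python, assuming boto3 and requests are installed and that the environment name below is a placeholder; the token is requested through the MWAA API and then used as a bearer token against /aws_mwaa/cli:

import base64
import boto3
import requests

env_name = 'my-mwaa-environment'  # placeholder environment name

# Ask the MWAA API for a short-lived CLI token and the web server hostname
mwaa = boto3.client('mwaa')
token = mwaa.create_cli_token(Name=env_name)

# POST the Airflow CLI command as the request body, using the token as a bearer token
response = requests.post(
    'https://{}/aws_mwaa/cli'.format(token['WebServerHostname']),
    headers={
        'Authorization': 'Bearer {}'.format(token['CliToken']),
        'Content-Type': 'text/plain',
    },
    data='version',  # the Airflow CLI command to run
)

# stdout/stderr come back base64-encoded in a JSON body
result = response.json()
print(base64.b64decode(result['stdout']).decode())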
By default, the api.auth_backend configuration option is set to airflow.api.auth.backend.deny_all in MWAA environments. You need to override it with one of the authentication methods mentioned in the documentation.
Note: it is highly discouraged to use airflow.api.auth.backend.default as it'll leave your environment publicly accessible.
[2021/07/29] Edit:
Based on this comment, AWS blocked access to the REST API.

Local kibana AWS Signature v4

I'm trying to connect a custom Kibana instance to the AWS Elasticsearch Service. The AWS ES cluster uses API-key-based authentication and requires signed requests. Looking at the code, I can see there is an aws4 module installed, but I don't see how it is used to sign requests, and I can't find any Kibana configuration related to it.
Thanks.

AWS SDK for JS in Browser with a CognitoUser instead of IAM credentials?

I have a browser app that interacts with S3. Since it was mostly an in-house tool, after authenticating to an API it directly received the access key ID and secret for a very restricted IAM user, which were then used to set up the AWS SDK in the browser.
I am now trying to change that app to use Cognito for authentication, so it can be accessed by external users without compromising our security.
I wound up using AWS Amplify just to handle the authentication part, and now I'm trying to figure out if there's a way of using the credentials I get from Cognito to set up the AWS JavaScript SDK and replicate the same functionality from that point on. (The way Amplify currently handles interaction with S3 does not cover all of the app's needs.)
Is there a way of doing this? I find the SDK documentation extremely confusing, and have been unable to determine if what I'm trying to do can be done at all.
Additionally, if there's a way to use the JS SDK only (without Amplify) to login a user via Cognito, that would also be preferable to me, but that's a secondary concern.
Yes, you can easily do this with Amplify, and I recommend this approach.
Here's an example from the docs using the Route53 module from the AWS JS SDK, but you can use any of the AWS modules of course.
Via https://aws-amplify.github.io/docs/js/authentication#working-with-aws-service-objects
import { Auth } from 'aws-amplify';
import Route53 from 'aws-sdk/clients/route53';

Auth.currentCredentials()
  .then(credentials => {
    const route53 = new Route53({
      apiVersion: '2013-04-01',
      credentials: Auth.essentialCredentials(credentials)
    });

    // more code working with the route53 object
    // route53.changeResourceRecordSets();
  });

How can I invoke AWS SageMaker endpoint to get inferences?

I want to get real time predictions using my machine learning model with the help of SageMaker. I want to directly get inferences on my website. How can I use the deployed model for predictions?
Sagemaker endpoints are not publicly exposed to the Internet. So, you'll need some way of creating a public HTTP endpoint that can route requests to your Sagemaker endpoint. One way you can do this is with an AWS Lambda function fronted by API gateway.
I created an example web app that takes webcam images and passes them on to a Sagemaker endpoint for classification. This uses the API Gateway -> Lambda -> Sagemaker endpoint strategy that I described above. You can see the whole example, including instructions for how to set up the Lambda (and the code to put in the lambda) at this GitHub repository: https://github.com/gabehollombe-aws/webcam-sagemaker-inference/
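For reference, a minimal sketch of the Lambda piece of that chain, assuming a JSON payload arrives via an API Gateway proxy integration and that the endpoint name below is a placeholder:

import boto3

ENDPOINT_NAME = 'my-sagemaker-endpoint'  # hypothetical endpoint name
runtime = boto3.client('sagemaker-runtime')

def handler(event, context):
    # The proxy integration hands the raw request body to the Lambda as a string
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='application/json',
        Body=event['body'],
    )

    # Pass the model's response straight back to the API caller
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': response['Body'].read().decode(),
    }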
You can invoke the SageMaker endpoint using API Gateway or Lambda.
Lambda:
Use the AWS SDK's SageMaker runtime client and invoke the endpoint from the Lambda function.
API Gateway:
Use API Gateway and pass parameters to the endpoint with an AWS service proxy integration.
Documentation with example:
https://aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/
Hope it helps.
Use the CLI like this:
aws sagemaker-runtime invoke-endpoint \
    --endpoint-name <endpoint-name> \
    --body '{"instances": [{"in0":[863],"in1":[882]}]}' \
    --content-type application/json \
    --accept application/json \
    results
I found it in a tutorial about accessing SageMaker via API Gateway. The trailing results argument is the local file that the response body gets written to.
As other answers have mentioned, your best option is fronting the SageMaker endpoint with a REST API in API Gateway. The API then lets you control authorisation and 'hides' the backend SageMaker endpoint from API clients, lowering the coupling between API clients (your website) and your backend. (By the way, you don't need a Lambda function there; you can integrate the REST API directly with SageMaker as a backend.)
However, if you are simply testing the endpoint after deploying it and you want to quickly get some inferences using Python, there are two options:
After deploying your endpoint with predictor = model.deploy(...), if you still have the predictor object available in your Python scope, you can simply run predictor.predict(), as documented here. However, it's rather likely that you've deployed the endpoint a while ago and you can no longer access the predictor object, and naturally one doesn't want to re-deploy the entire endpoint just to get the predictor.
If your endpoint already exists, you can invoke it using boto3 as follows, as documented here:
import boto3

payload = "string payload"
endpoint_name = "your-endpoint-name"

sm_runtime = boto3.client("runtime.sagemaker")
response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="text/csv",
    Body=payload
)
response_str = response["Body"].read().decode()
Naturally, you can adjust the above invocation according to your content type, to send JSON data for example. Then just be aware of the (de)serializer the endpoint uses, as well as the ContentType in the argument to invoke_endpoint.
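For instance, a sketch of the same call adjusted for a hypothetical JSON endpoint; the payload shape and names below are assumptions and depend on the serializer your endpoint was deployed with:

import json
import boto3

endpoint_name = "your-endpoint-name"
sm_runtime = boto3.client("runtime.sagemaker")

# Hypothetical JSON payload; match whatever your endpoint's serializer expects
payload = json.dumps({"instances": [[863, 882]]})

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Accept="application/json",
    Body=payload,
)
print(json.loads(response["Body"].read().decode()))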