I have an AWS question: I have an application running on Elastic Beanstalk, with two environments, XXX-LIVE and XXX-TEST.
How can I get the environment name using the SDK? I want to point to my test database when the code is running in the XXX-TEST environment.
So far I have only found the .RetrieveEnvironmentInfo() method on the client returned by AWSClientFactory.CreateAmazonElasticBeanstalkClient(),
but this requires that you provide the environment name/ID.
Can anyone help?
Here's how we do it for our application in Ruby:
def self.beanstalk_env
  begin
    # On EC2 (Xen-based instances), /sys/hypervisor/uuid starts with 'ec2'
    uuid = File.readlines('/sys/hypervisor/uuid')
    if uuid
      str = uuid.first.slice(0, 3)
      if str == 'ec2'
        metadata_endpoint = 'http://169.254.169.254/latest/meta-data/'
        dynamic_endpoint  = 'http://169.254.169.254/latest/dynamic/'
        # Ask the instance metadata service for this instance's ID and region
        instance_id = Net::HTTP.get(URI.parse(metadata_endpoint + 'instance-id'))
        document    = Net::HTTP.get(URI.parse(dynamic_endpoint + 'instance-identity/document'))
        parsed_document = JSON.parse(document)
        region = parsed_document['region']
        # Elastic Beanstalk tags every instance with its environment name
        ec2 = AWS::EC2::Client.new(region: region)
        ec2.describe_instances(instance_ids: [instance_id]).reservation_set[0].instances_set[0].tag_set.each do |tag|
          if tag.key == 'elasticbeanstalk:environment-name'
            return tag.value
          end
        end
      end
    end
  rescue
  end
  'No_Env'
end
Your instance's IAM policy will have to allow ec2:Describe*:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:Describe*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
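If you need the same thing outside Ruby, here is a minimal Python/boto3 sketch of the same idea. It is only a sketch: the function name and timeouts are illustrative, it uses the plain IMDSv1 metadata endpoint (IMDSv2-only instances would need a session token first), and it assumes the instance profile allows ec2:DescribeInstances as above.
import json
import urllib.request

import boto3

METADATA = 'http://169.254.169.254/latest/meta-data/'
DYNAMIC = 'http://169.254.169.254/latest/dynamic/'


def beanstalk_env():
    try:
        # Ask the instance metadata service who we are and where we run.
        instance_id = urllib.request.urlopen(METADATA + 'instance-id', timeout=2).read().decode()
        document = json.loads(
            urllib.request.urlopen(DYNAMIC + 'instance-identity/document', timeout=2).read()
        )
        ec2 = boto3.client('ec2', region_name=document['region'])
        reservations = ec2.describe_instances(InstanceIds=[instance_id])['Reservations']
        # Elastic Beanstalk tags each instance with its environment name.
        for tag in reservations[0]['Instances'][0].get('Tags', []):
            if tag['Key'] == 'elasticbeanstalk:environment-name':
                return tag['Value']
    except Exception:
        pass
    return 'No_Env'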
You can add a custom "environment-name" property to both environments. Set the value to the name of the environment, or simply "test" or "production".
If the database access URL is the only difference between the two, then set the URL itself as a property and you will end up with identical code and no branches.
More details on customization can be found here: Customizing and Configuring AWS Elastic Beanstalk Environments
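For example, assuming you defined environment properties named ENVIRONMENT_NAME and DATABASE_URL (hypothetical names) on each environment: on most Elastic Beanstalk platforms these properties surface as environment variables (on the older .NET platform they appear in appSettings instead), so application code can read them at runtime. A minimal Python-flavoured sketch of the idea:
import os

# ENVIRONMENT_NAME and DATABASE_URL are hypothetical property names set per
# environment in the Elastic Beanstalk console or via .ebextensions.
env_name = os.environ.get('ENVIRONMENT_NAME', 'production')
db_url = os.environ.get('DATABASE_URL')

if env_name == 'test':
    # Point the application at the test database.
    print('Using test database at', db_url)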
Using Former2, I noticed that some subnets are not generated with their associated tags. I checked this with both the Troposphere and Raw outputs:
EC2Subnet20 = template.add_resource(ec2.Subnet(
    'EC2Subnet20',
    AvailabilityZone=GetAtt(EC2Instance2, 'AvailabilityZone'),
    CidrBlock="30.0.2.0/24",
    VpcId=Ref(EC2VPC),
    MapPublicIpOnLaunch=False
))

EC2Subnet21 = template.add_resource(ec2.Subnet(
    'EC2Subnet21',
    AvailabilityZone=GetAtt(EC2Instance, 'AvailabilityZone'),
    CidrBlock="30.0.21.0/24",
    VpcId=Ref(EC2VPC),
    MapPublicIpOnLaunch=False,
    Tags=[
        {
            "Key": "Name",
            "Value": "TS-Subnet-Private-Dev"
        }
    ]
))
I have manually checked AWS/VPC/Subnets and the relevant entry for EC2Subnet20 does have two tags.
OK, I hope there is a way to do this. I have a list, let's say var.env, and I want to iterate over it to create multiple policies and attach each one to its counterpart assumable role. I hope that makes sense.
What this would look like is:
# list
variable "env" {
  type    = list(string)
  default = ["dev", "stage", "prod"]
}

# creating the policy documents data
data "aws_iam_policy_document" "policy_document" {
  for_each = toset(var.env) # for_each needs a set or map, so convert the list

  statement {
    actions   = ["s3:*"]
    effect    = "Allow"
    resources = ["arn:...:mybucket_${each.key}"]
  }
}

# attaching the policy document data to a policy
module "bucket_policy" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-policy"
  version = "4.10.0"

  for_each = toset(var.env)

  name        = "example_${each.key}_bucket_policy"
  path        = "/"
  description = "${each.key} Policy document"
  policy      = data.aws_iam_policy_document.policy_document[each.key].json
}

# creating the assumable roles
module "assumable_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
  version = "4.10.0"

  for_each = toset(var.env)

  create_role       = true
  role_name         = "example_${each.key}_bucket_role"
  role_requires_mfa = false

  trusted_role_services = [
    "s3.amazonaws.com"
  ]

  # now this DOESN'T WORK but something like this...
  custom_role_policy_arns = [
    module.bucket_policy.arn[each.key]
  ]
}
I truly appreciate any advice.
Thank you in advance.
The correct way to refer to the ARN is as follows. Because the module uses for_each, each environment gets its own module instance, so you index into the module first and then read its output. Instead of
module.bucket_policy.arn[each.key]
it should be
module.bucket_policy[each.key].arn
Will Terraform fail if a user referenced in a data block does not exist?
I need to specify a user in the non-production environment via a data block:
data "aws_iam_user" "labUser" {
  user_name = "gitlab_user"
}
Then I use this user when granting permissions:
resource "aws_iam_role" "ApiAccessRole_abc" {
name = "${var.stack}-ApiAccessRole_abc"
tags = "${var.tags}"
assume_role_policy = <<EOF
{
"Version": "2019-11-29",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"AWS": [
"${aws_iam_user.labUser.arn}"
]
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
In the production environment this user does not exist. Would Terraform break in that case? What would be a good approach for using the same Terraform configuration in both environments?
In Terraform, a data block like the one you showed here is both a mechanism to fetch data and an assertion by the author (you) that a particular external object is expected to exist in order for this configuration to be applyable.
In your case then, the answer is to ensure that the assertion that the object exists only appears in situations where it should exist. The "big picture" answer to this is to review the Module Composition guide and consider whether this part of your module ought to be decomposed into a separate module if it isn't always a part of the module it's embedded in, but I'll also show a smaller solution that uses conditional expressions to get the behavior you wanted without any refactoring:
variable "lab_user" {
type = string
default = null
}
data "aws_iam_user" "lab_user" {
count = length(var.lab_user[*])
user_name = var.lab_user
}
resource "aws_iam_role" "api_access_role_abc" {
count = length(data.aws_iam_user.lab_user)
name = "${var.stack}-ApiAccessRole_abc"
tags = var.tags
assume_role_policy = jsonencode({
Version = "2019-11-29"
Statement = [
{
Sid = ""
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
AWS = [data.aws_iam_user.lab_user[count.index].arn]
}
},
]
})
}
There are a few different things in the above that I want to draw attention to:
I made the lab username an optional variable rather than a hard-coded value. You can then change the behavior between your environments by assigning a different value to that lab_user variable, or leaving it unset altogether for environments that don't need a "lab user".
In the data "aws_iam_user" I set count to length(var.lab_user[*]). The [*] operator here is asking Terraform to translate the possibly-null string variable var.lab_user into a list of either zero or one elements, and then using the length of that list to decide how many aws_iam_user queries to make. If var.lab_user is null then the length will be zero and so no queries will be made.
Finally, I set the count for the aws_iam_role resource to match the length of the aws_iam_user data result, so that in any situation where there's one user expected there will also be one role created.
If you reflect on the Module Composition guide and conclude that this lab user ought to be a separate concern in a separate module then you'd be able to remove this conditional complexity from the "gitlab user" module itself and instead have the calling module either call that module or not depending on whether such a user is needed for that environment. The effect would be the same, but the decision would be happening in a different part of the configuration and thus it would achieve a different separation of concerns. Which separation of concerns is most appropriate for your system is, in the end, a tradeoff you'll need to make for yourself based on your knowledge of the system and how you expect it might evolve in future.
As suggested in the comments, it will fail.
One approach I can suggest is to supply the username as a variable that you pass externally from a file (dev.tfvars or prod.tfvars) and run Terraform with:
terraform apply --var-file example.tfvars
Then, in your data resource, you can use count or for_each to check whether the variable has been populated (if it has not been passed, you can skip the data lookup):
count = var.enable_gitlab_user ? 1 : 0
The AWS-native approach would be to switch from an IAM user in the Principal to a tag-based Condition, or even role chaining. You can take a look at this AWS blog post for some ideas; there are examples for both cases.
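To make the tag-based idea concrete, here is a rough boto3 sketch of such a trust policy (not the blog post's exact example); the role name, account ID and the "team" tag key/value are hypothetical placeholders:
import json

import boto3

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            # Trust the whole account instead of a single named user...
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            # ...but only allow principals carrying a matching tag.
            "Condition": {"StringEquals": {"aws:PrincipalTag/team": "lab"}},
        }
    ],
}

iam = boto3.client("iam")
iam.create_role(
    RoleName="ApiAccessRole_abc",  # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
With this shape, the trust relationship no longer names a specific user, so the same configuration works in environments where the gitlab user does not exist.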
I have a .NET Core client application using Amazon Textract with S3, SNS and SQS, as per the AWS document Detecting and Analyzing Text in Multipage Documents (https://docs.aws.amazon.com/textract/latest/dg/async.html).
I created an AWS role with the AmazonTextractServiceRole policy and added the following trust relationship, as per the documentation (https://docs.aws.amazon.com/textract/latest/dg/api-async-roles.html):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "textract.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
I subscribed the SQS queue to the topic and gave the Amazon SNS topic permission to send messages to the Amazon SQS queue, as per the AWS documentation.
All resources, including the S3 bucket, the SNS topic and the SQS queue, are in the same region (us-west-2).
The following method fails with a generic InvalidParameterException:
Request has invalid parameters
But if the NotificationChannel section is commented out, the code works fine and returns the correct job ID.
The error message gives no clear picture of which parameter is wrong. Any help is highly appreciated.
public async Task<string> ScanDocument()
{
    string roleArn = "arn:aws:iam::xxxxxxxxxxxx:instance-profile/MyTextractRole";
    string topicArn = "arn:aws:sns:us-west-2:xxxxxxxxxxxx:AmazonTextract-My-Topic";
    string bucketName = "mybucket";
    string filename = "mytestdoc.pdf";

    var request = new StartDocumentAnalysisRequest();

    var notificationChannel = new NotificationChannel();
    notificationChannel.RoleArn = roleArn;
    notificationChannel.SNSTopicArn = topicArn;

    var s3Object = new S3Object
    {
        Bucket = bucketName,
        Name = filename
    };
    request.DocumentLocation = new DocumentLocation
    {
        S3Object = s3Object
    };

    request.FeatureTypes = new List<string>() { "TABLES", "FORMS" };
    request.NotificationChannel = notificationChannel; /* Commenting out this line makes the code work */

    var response = await this._textractService.StartDocumentAnalysisAsync(request);
    return response.JobId;
}
Debugging Invalid AWS Requests
The AWS SDK validates your request object locally, before dispatching it to the AWS servers. This validation can fail with unhelpfully opaque errors, like the one the OP hit.
As the SDK is open source, you can inspect the source to help narrow down the invalid parameter.
Before we look at the code: the SDK (and its documentation) are generated from special JSON files that describe the API, its requirements and how to validate them; the actual validation code is generated from these JSON files.
I'm going to use the Node.js SDK as an example, but I'm sure similar approaches may work for the other SDKs, including .NET
In our case (AWS Textract), the latest Api version is 2018-06-27. Sure enough, the JSON source file is on GitHub, here.
In my case, experimentation narrowed the issue down to the ClientRequestToken. The error was an opaque InvalidParameterException. I searched for it in the SDK source JSON file, and sure enough, on line 392:
"ClientRequestToken": {
"type": "string",
"max": 64,
"min": 1,
"pattern": "^[a-zA-Z0-9-_]+$"
},
A whole bunch of undocumented requirements!
In my case the token I was using violated the regex (pattern in the above source code). Changing my token code to satisfy the regex solved the problem.
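For example, a quick Python check along those lines (a hypothetical helper; the length and pattern constraints are taken directly from the SDK JSON quoted above):
import re

TOKEN_PATTERN = re.compile(r'^[a-zA-Z0-9-_]+$')


def validate_client_request_token(token):
    """Raise ValueError if the token would fail the SDK's local validation."""
    if not 1 <= len(token) <= 64:
        raise ValueError('ClientRequestToken must be 1-64 characters long')
    if not TOKEN_PATTERN.match(token):
        raise ValueError('ClientRequestToken may only contain letters, digits, "-" and "_"')
    return token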
I recommend this approach for these sorts of opaque type errors.
After a long day analyzing the issue, I was able to resolve it. As per the documentation, the topic only required the SendMessage action on the SQS queue, but after changing the permission to all SQS actions it started working. Still, the AWS error message is really misleading and confusing.
You would need to change the permissions to allow all SQS actions and then use code like the following:
def startJob(s3BucketName, objectName):
    response = textract.start_document_text_detection(
        DocumentLocation={
            'S3Object': {
                'Bucket': s3BucketName,
                'Name': objectName
            }
        })
    return response["JobId"]


def isJobComplete(jobId):
    # For production use cases, use SNS based notification
    # Details at: https://docs.aws.amazon.com/textract/latest/dg/api-async.html
    time.sleep(5)
    response = textract.get_document_text_detection(JobId=jobId)
    status = response["JobStatus"]
    print("Job status: {}".format(status))

    while status == "IN_PROGRESS":
        time.sleep(5)
        response = textract.get_document_text_detection(JobId=jobId)
        status = response["JobStatus"]
        print("Job status: {}".format(status))

    return status


def getJobResults(jobId):
    pages = []

    response = textract.get_document_text_detection(JobId=jobId)
    pages.append(response)
    print("Resultset page received: {}".format(len(pages)))

    nextToken = None
    if 'NextToken' in response:
        nextToken = response['NextToken']

    while nextToken:
        response = textract.get_document_text_detection(JobId=jobId, NextToken=nextToken)
        pages.append(response)
        print("Resultset page received: {}".format(len(pages)))
        nextToken = None
        if 'NextToken' in response:
            nextToken = response['NextToken']

    return pages
Invoking Textract with Python, I received the same error until I truncated the ClientRequestToken down to 64 characters:
response = client.start_document_text_detection(
    DocumentLocation={
        'S3Object': {
            'Bucket': bucket,
            'Name': fileName
        }
    },
    ClientRequestToken=fileName[:64],
    NotificationChannel={
        "SNSTopicArn": "arn:aws:sns:us-east-1:AccountID:AmazonTextractXYZ",
        "RoleArn": "arn:aws:iam::AccountId:role/TextractRole"
    }
)

print('Processing started : %s' % json.dumps(response))
What I am trying to achieve is to copy objects from S3 in one account (A1 - not controlled by me) into S3 in another account (A2 - controlled by me).
For that, the ops team from A1 provided me with a role I can assume, using the boto3 library.
session = boto3.Session()
sts_client = session.client('sts')
assumed_role = sts_client.assume_role(
    RoleArn="arn:aws:iam::1234567890123:role/organization",
    RoleSessionName="blahblahblah"
)
This part is ok.
The problem is that a direct S3-to-S3 copy fails because the assumed role cannot access my own S3 bucket.
s3 = boto3.resource('s3')

copy_source = {
    'Bucket': a1_bucket_name,
    'Key': key_name
}

bucket = s3.Bucket(a2_bucket_name)
bucket.copy(copy_source, hardcoded_key)
As a result of this I get
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
in this line of code:
bucket.copy(copy_source, hardcoded_key)
Is there any way I can grant access to my S3 for that assumed role?
I would really like to have direct S3 to S3 copy without downloading file locally before uploading it again.
Please advise if there is a better approach than this.
The idea is to have this script running inside AWS Data Pipeline on a daily basis, for example.
To copy objects from one S3 bucket to another S3 bucket, you need to use one set of AWS credentials that has access to both buckets.
If those buckets are in different AWS accounts, you need 2 things:
Credentials for the target bucket, and
A bucket policy on the source bucket allowing read access to the target AWS account.
With these alone, you can copy objects. You do not need credentials for the source account.
Add a bucket policy to your source bucket allowing read access to the target AWS account.
Here is a sample policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DelegateS3Access",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME",
        "arn:aws:s3:::BUCKET_NAME/*"
      ]
    }
  ]
}
Be sure to replace BUCKET_NAME with your source bucket name. And replace 123456789012 with your target AWS account number.
Using credentials for your target AWS account (the owner of the target bucket), perform the copy.
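For reference, a minimal boto3 sketch of that copy, run under the target account's credentials; the bucket and key names are placeholders:
import boto3

# Credentials here belong to the *target* account (the bucket owner in A2).
s3 = boto3.resource('s3')

copy_source = {
    'Bucket': 'source-bucket-name',   # bucket in account A1
    'Key': 'path/to/object.txt',
}

# Server-side copy into the target bucket; no local download needed.
s3.Bucket('target-bucket-name').copy(copy_source, 'path/to/object.txt')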
Additional Notes:
You can also copy objects by reversing the two requirements:
Credentials for the source AWS account, and
A bucket policy on the target bucket allowing write access to the source AWS account.
However, when done this way, object metadata does not get copied correctly. I have discussed this issue with AWS Support, and they recommend reading from the foreign account rather than writing to the foreign account to avoid this problem.
Here is sample code to transfer data between two S3 buckets owned by two different AWS accounts. Note that it uses the legacy boto library (version 2, with Python 2 syntax), not boto3.
from boto.s3.connection import S3Connection
from boto.s3.key import Key
from Queue import LifoQueue
import threading

source_aws_key = '*******************'
source_aws_secret_key = '*******************'
dest_aws_key = '*******************'
dest_aws_secret_key = '*******************'
srcBucketName = '*******************'
dstBucketName = '*******************'


class Worker(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.source_conn = S3Connection(source_aws_key, source_aws_secret_key)
        self.dest_conn = S3Connection(dest_aws_key, dest_aws_secret_key)
        self.srcBucket = self.source_conn.get_bucket(srcBucketName)
        self.dstBucket = self.dest_conn.get_bucket(dstBucketName)
        self.queue = queue

    def run(self):
        while True:
            key_name = self.queue.get()
            k = Key(self.srcBucket, key_name)
            dist_key = Key(self.dstBucket, k.key)
            # Only copy keys that are missing or whose content has changed
            if not dist_key.exists() or k.etag != dist_key.etag:
                print 'copy: ' + k.key
                self.dstBucket.copy_key(k.key, srcBucketName, k.key, storage_class=k.storage_class)
            else:
                print 'exists and etag matches: ' + k.key
            self.queue.task_done()


def copyBucket(maxKeys=1000):
    print 'start'

    s_conn = S3Connection(source_aws_key, source_aws_secret_key)
    srcBucket = s_conn.get_bucket(srcBucketName)

    resultMarker = ''
    q = LifoQueue(maxsize=5000)

    # Start a pool of copy workers
    for i in range(10):
        print 'adding worker'
        t = Worker(q)
        t.daemon = True
        t.start()

    # Page through the source bucket and enqueue every key
    while True:
        print 'fetch next 1000, backlog currently at %i' % q.qsize()
        keys = srcBucket.get_all_keys(max_keys=maxKeys, marker=resultMarker)
        for k in keys:
            q.put(k.key)
        if len(keys) < maxKeys:
            print 'Done'
            break
        resultMarker = keys[maxKeys - 1].key

    q.join()
    print 'done'


if __name__ == "__main__":
    copyBucket()