Programmatically search AWS SSM Parameter Store for parameters whose name contains a pattern

I'm looking for a programmatic way to retrieve parameters just by giving the name or a part of the complete path (instead of giving the full path with the name).
It's pretty easy using the AWS Systems Manager Parameter Store console: if I type tokens, I retrieve all parameters whose Name contains tokens.
Is there a way to do the same using the AWS CLI or an AWS SDK (Python or Go preferably)?

I think this is what you are after:
aws ssm describe-parameters --parameter-filters Key=Name,Values=token,Option=Contains

Or with Python:
import boto3

response = boto3.client("ssm").describe_parameters(
    ParameterFilters=[
        {
            'Key': 'Name',
            'Option': 'Contains',
            'Values': [
                'token',
            ]
        },
    ]
)
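Note that describe_parameters returns results in pages (at most 50 per call), so if you have many parameters you'll want a paginator. A minimal sketch, reusing the same 'token' filter as above:
import boto3

ssm = boto3.client("ssm")
paginator = ssm.get_paginator("describe_parameters")
pages = paginator.paginate(
    ParameterFilters=[{'Key': 'Name', 'Option': 'Contains', 'Values': ['token']}]
)
# collect the matching parameter names across all pages
names = [p['Name'] for page in pages for p in page['Parameters']]
print(names)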

Related

How to upload to AWS S3 with Object Tagging

Is there a way to upload a file to AWS S3 with tags (not add tags to an existing file/object in S3)? I need the file to appear in S3 with my tags, i.e. in a single API call.
I need this because I use a Lambda function (that uses these S3 object tags) which is triggered by S3 ObjectCreated events.
You can pass the Tagging attribute on the put operation.
Here's an example using Boto3:
import boto3

client = boto3.client('s3')
client.put_object(
    Bucket='bucket',
    Key='key',
    Body='bytes',
    Tagging='Key1=Value1'
)
As per the docs, the Tagging attribute must be encoded as URL Query parameters. (For example, "Key1=Value1")
Tagging (String) -- The tag-set for the object. The tag-set must be encoded as URL Query parameters. (For example, "Key1=Value1")
EDIT: I only noticed the boto3 tag after a while, so I edited my answer to match boto3's way of doing it accordingly.
The Tagging directive is now supported by boto3. You can do the following to add tags if you are using upload_file():
import boto3
from urllib import parse
s3 = boto3.client("s3")
tags = {"key1": "value1", "key2": "value2"}
s3.upload_file(
    "file_path",
    "bucket",
    "key",
    ExtraArgs={"Tagging": parse.urlencode(tags)},
)
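If you want to confirm that the tags were attached, you can read them back with get_object_tagging (the bucket and key here are the same placeholders as above):
import boto3

s3 = boto3.client("s3")
# read back the tag-set that was attached at upload time
response = s3.get_object_tagging(Bucket="bucket", Key="key")
print(response["TagSet"])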
If you're uploading a file using client.upload_file() or other methods that have the ExtraArgs parameter, you need to add tags in a separate request (a sketch of that separate request follows the metadata example below). You can add metadata as follows, but this is not the same thing as tags. For an explanation of the difference, see this SO question:
import boto3
client = boto3.client('s3')
client.upload_file(
    Filename=path_to_your_file,
    Bucket='bucket',
    Key='key',
    ExtraArgs={"Metadata": {"mykey": "myvalue"}}
)
There's an example of this in the S3 docs, but be aware that metadata is not exactly the same thing as tags, though it can function similarly.
s3.upload_file(
"tmp.txt", "bucket-name", "key-name",
ExtraArgs={"Metadata": {"mykey": "myvalue"}}
)
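For completeness, this is roughly what adding tags in a separate request looks like, using put_object_tagging on an object that already exists (placeholder bucket/key names; note it replaces the object's whole tag-set):
import boto3

s3 = boto3.client('s3')
# attach (or replace) the tag-set on an existing object
s3.put_object_tagging(
    Bucket='bucket',
    Key='key',
    Tagging={'TagSet': [{'Key': 'Key1', 'Value': 'Value1'}]}
)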

Tag EIP with Boto3

Is it possible to tag an Elastic IP address with boto3? I know you can tag them through the console and I have tagged many of them this way, but I can't seem to find any documentation on tagging them with boto3.
If this is not possible with boto3, is there some other library I can use to accomplish this?
The boto3 library offers a generic create_tags method that can be used to apply tags to a number of different types of AWS resources, including Elastic IPs. Here's an example of how to use it:
import boto3
ec2 = boto3.client('ec2', region_name='us-east-1')
response = ec2.create_tags(
    Resources=[
        'eipalloc-094ca1234de5abcd',
    ],
    Tags=[
        {
            'Key': 'Description',
            'Value': 'Production EIP'
        },
    ]
)
Note: this adds the tag Description=Production EIP to whatever tags are already present on the resource (or it overwrites the Description tag if it already exists).
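If you only know the Elastic IP address itself rather than its allocation ID, you can look it up first with describe_addresses. A sketch, using a placeholder documentation IP:
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')
# find the allocation ID for a known Elastic IP address
addresses = ec2.describe_addresses(
    Filters=[{'Name': 'public-ip', 'Values': ['203.0.113.10']}]
)
allocation_id = addresses['Addresses'][0]['AllocationId']
ec2.create_tags(
    Resources=[allocation_id],
    Tags=[{'Key': 'Description', 'Value': 'Production EIP'}]
)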

RDS generate_presigned_url does not support the DestinationRegion parameter

I was trying to set up an encrypted RDS replica in another region, but I got stuck on generating the pre-signed URL.
It seems that boto3/botocore does not allow the DestinationRegion parameter, which is defined as a requirement in the AWS API (link) if we want to generate a PreSignedUrl.
Versions used:
boto3 (1.4.7)
botocore (1.7.10)
Output:
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "DestinationRegion", must be one of: DBInstanceIdentifier, SourceDBInstanceIdentifier, DBInstanceClass, AvailabilityZone, Port, AutoMinorVersionUpgrade, Iops, OptionGroupName, PubliclyAccessible, Tags, DBSubnetGroupName, StorageType, CopyTagsToSnapshot, MonitoringInterval, MonitoringRoleArn, KmsKeyId, PreSignedUrl, EnableIAMDatabaseAuthentication, SourceRegion
Example code:
import boto3
url = boto3.client('rds', 'eu-east-1').generate_presigned_url(
    ClientMethod='create_db_instance_read_replica',
    Params={
        'DestinationRegion': 'eu-east-1',
        'SourceDBInstanceIdentifier': 'abc',
        'KmsKeyId': '1234',
        'DBInstanceIdentifier': 'someidentifier'
    },
    ExpiresIn=3600,
    HttpMethod=None
)
The same issue was already reported but got closed.
Thanks for help,
Petar
Generate the pre-signed URL from the source region, then pass that URL to create_db_instance_read_replica in the destination region.
The presigned URL must be a valid request for the CreateDBInstanceReadReplica API action that can be executed in the source AWS Region that contains the encrypted source DB instance
PreSignedUrl (string) --
The URL that contains a Signature Version 4 signed request for the CreateDBInstanceReadReplica API action in the source AWS Region that contains the source DB instance.
import boto3
session = boto3.Session(profile_name='profile_name')
url = session.client('rds', 'SOURCE_REGION').generate_presigned_url(
    ClientMethod='create_db_instance_read_replica',
    Params={
        'DBInstanceIdentifier': 'db-1-read-replica',
        'SourceDBInstanceIdentifier': 'database-source',
        'SourceRegion': 'SOURCE_REGION'
    },
    ExpiresIn=3600,
    HttpMethod=None
)
print(url)

source_db = session.client('rds', 'SOURCE_REGION').describe_db_instances(
    DBInstanceIdentifier='database-source'
)
print(source_db)

response = session.client('rds', 'DESTINATION_REGION').create_db_instance_read_replica(
    SourceDBInstanceIdentifier="arn:aws:rds:SOURCE_REGION:account_number:db:database-source",
    DBInstanceIdentifier="db-1-read-replica",
    KmsKeyId='DESTINATION_REGION_KMS_ID',
    PreSignedUrl=url,
    SourceRegion='SOURCE_REGION'
)
print(response)
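If you want to block until the replica is ready, boto3 ships a db_instance_available waiter for RDS; a minimal sketch reusing the same placeholder names as above:
import boto3

session = boto3.Session(profile_name='profile_name')
# poll DescribeDBInstances until the new read replica reports status "available"
waiter = session.client('rds', 'DESTINATION_REGION').get_waiter('db_instance_available')
waiter.wait(DBInstanceIdentifier='db-1-read-replica')
print('replica is available')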

Boto3 - Create S3 'object created' notification to trigger a lambda function

How do I use boto3 to simulate the Add Event Source action on the AWS GUI Console in the Event Sources tab?
I want to programmatically create a trigger such that if an object is created in MyBucket, it will call the MyLambda function (qualified with an alias).
The relevant API call that I see in the Boto3 documentation is create_event_source_mapping, but it states explicitly that it is only for the AWS pull model, while I think that S3 belongs to the push model. Anyway, I tried using it but it didn't work.
Passing a prefix filter would be nice too.
I was looking at the wrong side; this is configured on S3:
s3 = boto3.resource('s3')
bucket_name = 'mybucket'
bucket_notification = s3.BucketNotification(bucket_name)
response = bucket_notification.put(
    NotificationConfiguration={'LambdaFunctionConfigurations': [
        {
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
            'Events': [
                's3:ObjectCreated:*'
            ],
        },
    ]}
)
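Two additions you'll likely need: S3 must be allowed to invoke the function (S3 typically rejects the notification configuration if it cannot), and the prefix filter mentioned in the question goes into the same configuration. A sketch with the same placeholder ARNs (the statement ID and the uploads/ prefix are made up for illustration):
import boto3

lambda_client = boto3.client('lambda')
# allow the S3 bucket to invoke the (aliased) Lambda function
lambda_client.add_permission(
    FunctionName='arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
    StatementId='s3-invoke-mylambda',
    Action='lambda:InvokeFunction',
    Principal='s3.amazonaws.com',
    SourceArn='arn:aws:s3:::mybucket'
)

s3 = boto3.resource('s3')
s3.BucketNotification('mybucket').put(
    NotificationConfiguration={'LambdaFunctionConfigurations': [
        {
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
            'Events': ['s3:ObjectCreated:*'],
            # only fire for objects created under the uploads/ prefix
            'Filter': {'Key': {'FilterRules': [{'Name': 'prefix', 'Value': 'uploads/'}]}}
        },
    ]}
)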

How do you full text search an Amazon S3 bucket?

I have a bucket on S3 in which I have a large amount of text files.
I want to search for some text within those text files. They contain raw data only, and each text file has a different name.
For example, I have bucket paths like:
abc/myfolder/abac.txt
xyx/myfolder1/axc.txt
and I want to search for text like "I am human" in the above text files.
How do I achieve this? Is it even possible?
The only way to do this will be via CloudSearch, which can use S3 as a source; it builds an index over your documents so retrieval is fast. This should work very well, but check the pricing model thoroughly to make sure it won't be too costly for you.
The alternative is, as Jack said, to transfer the files out of S3 to an EC2 instance and build a search application there.
Since October 1st, 2015, Amazon offers another search service, Amazon Elasticsearch Service (ES), in more or less the same vein as CloudSearch: you can stream data into it from Amazon S3 buckets.
It works with a Lambda function that makes sure any new data sent to an S3 bucket triggers an event notification to that Lambda, which then updates the ES index (a rough sketch of such a function follows the steps below).
All steps are well detailed in the Amazon docs, with Java and JavaScript examples.
At a high level, setting up to stream data to Amazon ES requires the following steps:
Creating an Amazon S3 bucket and an Amazon ES domain
Creating a Lambda deployment package.
Configuring a Lambda function.
Granting authorization to stream data to Amazon ES.
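Roughly, such a Lambda function could look like the sketch below. This is an assumption-laden outline: the ES endpoint and index name are placeholders, and real requests to an Amazon ES domain need to be signed (e.g. with requests-aws4auth), which is omitted here for brevity:
import urllib.parse

import boto3
import requests  # in a real deployment, add SigV4 signing (e.g. requests-aws4auth)

ES_ENDPOINT = 'https://search-mydomain.us-east-1.es.amazonaws.com'  # placeholder domain
INDEX_URL = ES_ENDPOINT + '/s3-files/_doc'  # placeholder index name

s3 = boto3.client('s3')

def handler(event, context):
    # each record corresponds to one ObjectCreated notification from the bucket
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        body = s3.get_object(Bucket=bucket, Key=key)['Body'].read().decode('utf-8')
        document = {'bucket': bucket, 'key': key, 'content': body}
        requests.post(INDEX_URL, json=document)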
Although not an AWS native service, there is Mixpeek, which runs text extraction like Tika, Tesseract and ImageAI on your S3 files, then places them in a Lucene index to make them searchable.
You integrate it as follows:
Download the module: https://github.com/mixpeek/mixpeek-python
Import the module and your API keys:
from mixpeek import Mixpeek, S3
from config import mixpeek_api_key, aws
Instantiate the S3 class (which uses boto3 and requests):
s3 = S3(
    aws_access_key_id=aws['aws_access_key_id'],
    aws_secret_access_key=aws['aws_secret_access_key'],
    region_name='us-east-2',
    mixpeek_api_key=mixpeek_api_key
)
Upload one or more existing S3 files:
# upload all S3 files in bucket "demo"
s3.upload_all(bucket_name="demo")
# upload one single file called "prescription.pdf" in bucket "demo"
s3.upload_one(s3_file_name="prescription.pdf", bucket_name="demo")
Now simply search using the Mixpeek module:
# mixpeek api direct
mix = Mixpeek(
    api_key=mixpeek_api_key
)
# search
result = mix.search(query="Heartgard")
print(result)
Where result can be:
[
    {
        "_id": "REDACTED",
        "api_key": "REDACTED",
        "highlights": [
            {
                "path": "document_str",
                "score": 0.8759502172470093,
                "texts": [
                    {
                        "type": "text",
                        "value": "Vetco Prescription\nVetcoClinics.com\n\nCustomer:\n\nAddress: Canine\n\nPhone: Australian Shepherd\n\nDate of Service: 2 Years 8 Months\n\nPrescription\nExpiration Date:\n\nWeight: 41.75\n\nSex: Female\n\n℞ "
                    },
                    {
                        "type": "hit",
                        "value": "Heartgard"
                    },
                    {
                        "type": "text",
                        "value": " Plus Green 26-50 lbs (Ivermectin 135 mcg/Pyrantel 114 mg)\n\nInstructions: Give one chewable tablet by mouth once monthly for protection against heartworms, and the treatment and\ncontrol of roundworms, and hookworms. "
                    }
                ]
            }
        ],
        "metadata": {
            "date_inserted": "2021-10-07 03:19:23.632000",
            "filename": "prescription.pdf"
        },
        "score": 0.13313256204128265
    }
]
Then you parse the results.
You can use Filestash (disclaimer: I'm the author): install your own instance and connect it to your S3 bucket. Give it a bit of time to index the entire thing if you have a whole lot of data, and you should be good.
If you have an EMR cluster, then create a Spark application and do the search there. We did this; it works as a distributed search.
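A minimal PySpark sketch of that approach (bucket name, prefix and search string are placeholders taken from the question):
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, input_file_name

spark = SparkSession.builder.appName('s3-text-search').getOrCreate()
# read every text file under the prefix and keep track of which file each line came from
lines = spark.read.text('s3://my-bucket/myfolder/*.txt').withColumn('file', input_file_name())
# keep only lines containing the search string, then list the matching files
matches = lines.filter(col('value').contains('I am human'))
matches.select('file').distinct().show(truncate=False)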
I know this is really old, but hopefully someone finds my solution handy.
This is a Python script, using boto3.
import boto3

def search_word(info, search_for):
    # return True if the search string appears anywhere in the file contents
    return search_for in info

aws_access_key_id = 'AKIAWG....'
aws_secret_access_key = 'p9yrNw.....'

client = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)

bucket_name = 'my.bucket.name'
bucket_prefix = '2022/05/'
search_for = 'looking#emailaddress.com'

search_results = []
search_results_keys = []

# list the objects under the prefix (at most 1,000 keys per call)
response = client.list_objects_v2(
    Bucket=bucket_name,
    Prefix=bucket_prefix
)

# download each object and check whether it contains the search string
for i in response['Contents']:
    obj = client.get_object(
        Bucket=bucket_name,
        Key=i['Key']
    )
    body = obj['Body'].read().decode("utf-8")
    key = i['Key']
    if search_word(body, search_for):
        search_results.append({key: body})
        search_results_keys.append(key)

# you can either print the keys (file names/paths), or a map where the key is the
# file name/path and the value is the text of the file
print(search_results)
print(search_results_keys)
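Keep in mind that list_objects_v2 returns at most 1,000 keys per call; for larger prefixes, a paginator sketch using the same placeholder bucket and prefix:
import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects_v2')

matching_keys = []
for page in paginator.paginate(Bucket='my.bucket.name', Prefix='2022/05/'):
    for obj in page.get('Contents', []):
        body = client.get_object(Bucket='my.bucket.name', Key=obj['Key'])['Body'].read().decode('utf-8')
        if 'looking#emailaddress.com' in body:
            matching_keys.append(obj['Key'])
print(matching_keys)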
There is a serverless and cheaper option available:
Use AWS Glue to convert the txt files into a table.
Use AWS Athena to run SQL queries on top of it.
I would recommend storing the data as Parquet on S3; this makes the data on S3 very small and the queries super fast!
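A rough sketch of running such a query from boto3 (the database, table, column name and results location are all assumptions about how you set up the Glue table):
import time
import boto3

athena = boto3.client('athena', region_name='us-east-1')
# kick off a LIKE query against the Glue-catalogued table; results land in the output location
execution = athena.start_query_execution(
    QueryString="SELECT * FROM my_text_table WHERE line LIKE '%I am human%'",
    QueryExecutionContext={'Database': 'my_glue_database'},
    ResultConfiguration={'OutputLocation': 's3://my-athena-results/'}
)
query_id = execution['QueryExecutionId']

# poll until the query finishes
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(2)

if state == 'SUCCEEDED':
    rows = athena.get_query_results(QueryExecutionId=query_id)['ResultSet']['Rows']
    print(rows)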