Automate AWS Marketplace publishing through CLI - amazon-web-services

I have my product uploaded to AWS as an AMI through HashiCorp's Packer. Now I'd like to automate the last step: publishing it to the Marketplace. The product already exists; it's only about adding a revision.
After reading this article, the API_StartChangeSet doc, this add-revisions user guide & fiddling with the Marketplace console, I think I just have to run
aws marketplace-catalog start-change-set --catalog AWSMarketplace --change-set-name "$VERSION" --change-set '[ {"ChangeType": "AddRevisions", "Entity": {"Identifier": "REDACTED@29", "Type": "ServerProduct@1.0"}, "Details": "{\"DataSetArn\": \"?????\", \"RevisionArns\": [\"?????\"] }"} ]'
I'm having a hard time coming up with the "Details" part. I have my AMI ID; I guess that goes in RevisionArns? And what should I put in DataSetArn, the "EntityArn" from the output of aws marketplace-catalog describe-entity --catalog AWSMarketplace --entity-id REDACTED?

The Details facet here is just a product-type-specific facet, encoded as a JSON string. For the AMI that you are offering in the AWS Marketplace, it could include support information, region availability or any other info that provides descriptive text regarding your change. For example:
"Details": "{\"Description\":{}, \"PromotionalResources\":{}, \"RegionAvailability\":{}, \"SupportInformation\":{}}",
The example you found does not necessarily mean that you have to have EntityArn and RevisionArns. The Details facet is used as information describing the details of your change.
Check here.

Turns out I hadn't found the right documentation; my last link was about AWS Data Exchange, whose "Details" field contents were confusing.
Here is the relevant documentation: Marketplace Catalog AMI add version. And here's the snippet I was looking for:
"Details": "{
\"Version\": {
\"VersionTitle\": \"*My new title*\",
\"ReleaseNotes\": \"*My new Release notes*\"
},
\"DeliveryOptions\": [
{
\"Details\": {
\"AmiDeliveryOptionDetails\": {
\"AmiSource\": {
\"AmiId\": \"ami-1234567890abcdef\",
\"AccessRoleArn\": \"arn:aws:iam::12345678901:role/AwsMarketplaceAmiIngestion\",
\"UserName\": \"ec2-user\",
\"OperatingSystemName\": \"AMAZONLINUX\",
\"OperatingSystemVersion\": \"Amazon Linux 2 AMI 2.0.20210126.0 x86_64 HVM gp2\"
},
\"UsageInstructions\": \"Easy to use AMI\",
\"RecommendedInstanceType\": \"m4.xlarge\",
\"SecurityGroups\": [
{
\"IpProtocol\": \"tcp\",
\"FromPort\": 443,
\"ToPort\": 443,
\"IpRanges\": [
\"0.0.0.0/0\"
]
}
]
}
}
}
]
}"

Related

Get AWS Reservation Utilization by custom tag

I am currently assigning AWS MediaLive channels to a specific group by a custom tag and want to get the (Cost Explorer) GetReservationUtilization for a group's channels by filtering by tag. The AWS documentation for GetReservationUtilization lists the filtering options as:
"Filter": {
.
.
"Tags": {
"Key": "string",
"MatchOptions": [ "string" ],
"Values": [ "string" ]
}
.
.
}
I interpret this as meaning it should be possible to filter by a custom tag via:
"Key": "Group",
"Value": [customId]
But I get an error that says "An error occurred (ValidationException) when calling the GetReservationUtilization operation: Tags expression is not allowed, allowed expression(s): And, Not, Dimensions"
It feels like I have tried everything possible, but I can't seem to get it to work.
Have you looked at the examples in the boto3 documentation?
It seems you may need to wrap the tag element inside an And or Not expression, or supply Dimensions.
For anyone coming here in the future: filtering reservation utilization by tag is currently not supported. The following dimensions are supported:
AZ
CACHE_ENGINE
DEPLOYMENT_OPTION
INSTANCE_TYPE
LINKED_ACCOUNT
OPERATING_SYSTEM
PLATFORM
REGION
SERVICE
SCOPE
TENANCY
As specified in the AWS API docs.
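If it helps, here is a minimal boto3 sketch of a GetReservationUtilization call filtered by one of the supported dimensions (LINKED_ACCOUNT); the account ID and date range are placeholders.

# Minimal sketch: filter reservation utilization by a supported dimension instead of a tag.
# The account ID and date range below are placeholders.
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_reservation_utilization(
    TimePeriod={"Start": "2023-01-01", "End": "2023-02-01"},
    Granularity="MONTHLY",
    Filter={
        "Dimensions": {
            "Key": "LINKED_ACCOUNT",
            "Values": ["123456789012"],
        }
    },
)

for period in response["UtilizationsByTime"]:
    print(period["TimePeriod"], period["Total"]["UtilizationPercentage"])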

AWS-Console: DynamoDB scan on nested field

I have the below item in DynamoDB:
{
  "id": 1,
  "user": {
    "age": "26",
    "email": "testuser@gmail.com",
    "name": "test user"
  }
}
Using the AWS console, I want to scan for all the records whose email address contains gmail.com.
I am trying this, but it is giving no results.
I am new to AWS and not sure what's wrong here. Is it not possible to scan on nested fields?
I've been trying to figure this out myself but it would seem that nested item scans are not supported through the console.
I'm going based off of this, which offers some alternative options via the CLI or SDK: https://forums.aws.amazon.com/thread.jspa?messageID=931016
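For completeness, here is a minimal boto3 sketch of the SDK route the linked thread suggests, scanning on the nested user.email attribute; the table name is a placeholder.

# Minimal sketch: scan a DynamoDB table on a nested attribute with boto3.
# "MyTable" is a placeholder table name.
import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("MyTable")

# "user.email" addresses the nested map attribute; contains() does a substring match.
response = table.scan(FilterExpression=Attr("user.email").contains("gmail.com"))

for item in response["Items"]:
    print(item["id"], item["user"]["email"])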

AWS Pinpoint/Ionic - "Resource not found" error when trying to send push through CLI

I am new to programming with AWS services, so some fundamental things are pretty hard for me. Recently, I was asked to develop an app that used Amazon Pinpoint to send push notifications, as a test for considering future implementations.
As you can see in another question I posted here (Amazon Pinpoint and Ionic - Push notifications not working when app is in background), I was having trouble sending push notifications to users when my app is running in the background. The app was developed using Ionic by following these steps.
When I was almost giving up, I decided to try sending the pushes directly through Firebase, and it finally worked. Some research took me to this question, in which another user described the problem as only happening in the AWS Console, so the solution would be to use the CLI. After searching a little about it, I found this tutorial about how to send Pinpoint messages to users using the CLI, which seems to be what I wanted. Combining it with this documentation about the PhoneGap plugin, I was able to generate a JSON file I thought could be a solution:
{
  "ApplicationId": "io.ionic.starter",
  "MessageRequest": {
    "Addresses": {
      "": {
        "BodyOverride": "",
        "ChannelType": "GCM",
        "Context": {
          "": ""
        },
        "RawContent": "",
        "Substitutions": {},
        "TitleOverride": ""
      }
    },
    "Context": {
      "": ""
    },
    "Endpoints": {
      "us-east-1": {
        "BodyOverride": "",
        "Context": {},
        "RawContent": "",
        "Substitutions": {},
        "TitleOverride": ""
      }
    },
    "MessageConfiguration": {
      "GCMMessage": {
        "Action": "OPEN_APP",
        "Body": "string",
        "CollapseKey": "",
        "Data": {
          "": ""
        },
        "IconReference": "",
        "ImageIconUrl": "",
        "ImageUrl": "",
        "Priority": "High",
        "RawContent": "{\"data\":{\"title\":\"sometitle\",\"body\":\"somebody\",\"url\":\"insertyourlinkhere.com\"}}",
        "RestrictedPackageName": "",
        "SilentPush": false,
        "SmallImageIconUrl": "",
        "Sound": "string",
        "Substitutions": {},
        "TimeToLive": 123,
        "Title": "",
        "Url": ""
      }
    }
  }
}
But when I executed it in cmd with aws pinpoint send-messages --color on --region us-east-1 --cli-input-json file://test.json, I got the response An error occurred (NotFoundException) when calling the SendMessages operation: Resource not found.
I believe I didn't write the JSON file correctly, since it's my first time doing this. So please, if any of you know what I am doing wrong, no matter which step I misunderstood, I would appreciate the help!
"Endpoints" field in the Message request deals with the endpoint id (the identifier associated with an end user device while registering to pinpoint and not the region.)
In case if you haven't registered any endpoints with Pinpoint, you can use the "Addresses" field. After registering the GCM Channel in Amazon Pinpoint, you can get the GCM device token from your device and specify it here.
Here is a sample for sending direct messages using Amazon Pinpoint Note: The example deals with sending SMS message. You should have registered a SMS channel first and created an endpoint with the endpoint id as "test-endpoint1". Otherwise, you can use the "Addresses" field instead of "Endpoints" field.
aws pinpoint send-messages --application-id $APP_ID --message-request '{"MessageConfiguration": {"SMSMessage":{"Body":"hi hello"}},"Endpoints": {"test-endpoint1": {}}}
Also note: the ApplicationId is generated by Pinpoint. When you visit the Pinpoint console and choose your application, the URL will be of the format
https://console.aws.amazon.com/pinpoint/home/?region=us-east-1#/apps/someverybigstringhere/
Here "someverybigstringhere" is the ApplicationId, not the name you gave your project.

Querying AWS using Packer

I'm using Packer to query AWS to find an AMI to use as a source AMI. I'd like to find the AMI by tags. Here is my code:
"source_ami_filter": {
"filters": {
"tag": "type=Ubuntu Base"
},
"owners": ["self"],
"most_recent": true
}
which fails with this error:
amazon-ebs: Error querying AMI: InvalidParameterValue: The filter 'Filter.tag' is invalid
I can't for the life of me figure out how to format that filter. Any help would be greatly appreciated.
Your sample code is very close, but the tag name should be specified in the filter key rather than in the value.
This modification of your code should work to find the AMI with a "type" tag containing the value "Ubuntu Base":
"source_ami_filter": {
"filters": {
"tag:type": "Ubuntu Base"
},
"owners": ["self"],
"most_recent": true
}
The Packer documentation for source_ami_filter explains that "any filter described in the docs for DescribeImages is valid."
Then the AWS EC2 documentation for DescribeImages shows that a filter for a value contained in a given tag should use the format tag:key=value:
tag:key=value - The key/value combination of a tag assigned to the resource. Specify the key of the tag in the filter name and the value of the tag in the filter value. For example, for the tag Purpose=X, specify tag:Purpose for the filter name and X for the filter value.
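If you want to sanity-check the filter outside of Packer, here is a minimal boto3 sketch of the equivalent DescribeImages call; the tag key and value mirror the example above.

# Minimal sketch: verify the tag filter that Packer's source_ami_filter will use.
import boto3

ec2 = boto3.client("ec2")

images = ec2.describe_images(
    Owners=["self"],
    Filters=[{"Name": "tag:type", "Values": ["Ubuntu Base"]}],
)["Images"]

# Mimic "most_recent": true by sorting on CreationDate and taking the newest image.
latest = max(images, key=lambda image: image["CreationDate"])
print(latest["ImageId"], latest["CreationDate"])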

How do you full text search an Amazon S3 bucket?

I have a bucket on S3 in which I have a large amount of text files.
I want to search for some text within the text files. They contain raw data only,
and each text file has a different name.
For example, I have objects such as:
abc/myfolder/abac.txt
xyx/myfolder1/axc.txt
and I want to search for text like "I am human" in the above text files.
How can I achieve this? Is it even possible?
The only way to do this will be via CloudSearch, which can use S3 as a source. It works by building an index for rapid retrieval. This should work very well, but thoroughly check out the pricing model to make sure it won't be too costly for you.
The alternative is, as Jack said, to transfer the files out of S3 to an EC2 instance and build a search application there.
Since October 1st, 2015, Amazon offers another search service, Amazon Elasticsearch Service, in more or less the same vein as CloudSearch: you can stream data to it from Amazon S3 buckets.
It works with a Lambda function that makes sure any new data sent to an S3 bucket triggers an event notification to the Lambda, which then updates the ES index.
All the steps are well detailed in the Amazon docs, with Java and JavaScript examples.
At a high level, setting up to stream data to Amazon ES requires the following steps (a rough sketch of the Lambda piece follows this list):
Creating an Amazon S3 bucket and an Amazon ES domain.
Creating a Lambda deployment package.
Configuring a Lambda function.
Granting authorization to stream data to Amazon ES.
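As a rough illustration of that Lambda piece, here is a minimal sketch of a handler that reads the newly created S3 object and indexes it into an Elasticsearch endpoint. The ES endpoint, index name, and unsigned HTTP call are all assumptions (a real setup would sign requests and bundle the requests library), so treat it as a shape rather than a drop-in function.

# Hypothetical sketch of the S3 -> Lambda -> Elasticsearch indexing step.
# ES_ENDPOINT and INDEX are placeholders; the request is unsigned for brevity,
# whereas a real domain would normally require SigV4-signed requests.
import json
import urllib.parse

import boto3
import requests  # must be bundled into the Lambda deployment package

ES_ENDPOINT = "https://search-my-domain.us-east-1.es.amazonaws.com"
INDEX = "documents"

s3 = boto3.client("s3")


def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Pull the new object's text and index it under its S3 key.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        document = {"bucket": bucket, "key": key, "content": body}

        response = requests.put(
            f"{ES_ENDPOINT}/{INDEX}/_doc/{urllib.parse.quote_plus(key)}",
            data=json.dumps(document),
            headers={"Content-Type": "application/json"},
            timeout=10,
        )
        response.raise_for_status()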
Although not an AWS-native service, there is Mixpeek, which runs text extraction tools like Tika, Tesseract and ImageAI on your S3 files, then places the results in a Lucene index to make them searchable.
You integrate it as follows:
Download the module: https://github.com/mixpeek/mixpeek-python
Import the module and your API keys:
from mixpeek import Mixpeek, S3
from config import mixpeek_api_key, aws
Instantiate the S3 class (which uses boto3 and requests):
s3 = S3(
    aws_access_key_id=aws['aws_access_key_id'],
    aws_secret_access_key=aws['aws_secret_access_key'],
    region_name='us-east-2',
    mixpeek_api_key=mixpeek_api_key
)
Upload one or more existing S3 files:
# upload all S3 files in bucket "demo"
s3.upload_all(bucket_name="demo")
# upload one single file called "prescription.pdf" in bucket "demo"
s3.upload_one(s3_file_name="prescription.pdf", bucket_name="demo")
Now simply search using the Mixpeek module:
# mixpeek api direct
mix = Mixpeek(
    api_key=mixpeek_api_key
)

# search
result = mix.search(query="Heartgard")
print(result)
Where result can be:
[
  {
    "_id": "REDACTED",
    "api_key": "REDACTED",
    "highlights": [
      {
        "path": "document_str",
        "score": 0.8759502172470093,
        "texts": [
          {
            "type": "text",
            "value": "Vetco Prescription\nVetcoClinics.com\n\nCustomer:\n\nAddress: Canine\n\nPhone: Australian Shepherd\n\nDate of Service: 2 Years 8 Months\n\nPrescription\nExpiration Date:\n\nWeight: 41.75\n\nSex: Female\n\n℞ "
          },
          {
            "type": "hit",
            "value": "Heartgard"
          },
          {
            "type": "text",
            "value": " Plus Green 26-50 lbs (Ivermectin 135 mcg/Pyrantel 114 mg)\n\nInstructions: Give one chewable tablet by mouth once monthly for protection against heartworms, and the treatment and\ncontrol of roundworms, and hookworms. "
          }
        ]
      }
    ],
    "metadata": {
      "date_inserted": "2021-10-07 03:19:23.632000",
      "filename": "prescription.pdf"
    },
    "score": 0.13313256204128265
  }
]
Then you parse the results.
You can use Filestash (disclaimer: I'm the author): install your own instance and connect it to your S3 bucket. Give it a bit of time to index the entire thing if you have a whole lot of data, and you should be good.
If you have an EMR cluster, create a Spark application and do the search there. We did this; it works as a distributed search.
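To make that concrete, here is a minimal PySpark sketch under the assumption of an EMR cluster with the S3 connector configured; the bucket, prefix, and search string are placeholders.

# Hypothetical sketch: distributed substring search over S3 text files with PySpark on EMR.
# The bucket/prefix and search phrase are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, input_file_name

spark = SparkSession.builder.appName("s3-text-search").getOrCreate()

# Each line of every matching object becomes a row in the "value" column.
lines = spark.read.text("s3://my-bucket/myfolder/*.txt").withColumn("file", input_file_name())

matches = lines.filter(col("value").contains("I am human"))
matches.select("file", "value").show(truncate=False)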
I know this is really old, but hopefully someone finds my solution handy.
This is a Python script using boto3.
import boto3

aws_access_key_id = 'AKIAWG....'
aws_secret_access_key = 'p9yrNw.....'

client = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)

bucket_name = 'my.bucket.name'
bucket_prefix = '2022/05/'
search_for = 'looking@emailaddress.com'

search_results = []
search_results_keys = []


def search_word(info, search_for):
    # True if the search string appears anywhere in the file contents.
    return search_for in info


response = client.list_objects_v2(
    Bucket=bucket_name,
    Prefix=bucket_prefix
)

for i in response['Contents']:
    obj = client.get_object(
        Bucket=bucket_name,
        Key=i['Key']
    )
    body = obj['Body'].read().decode("utf-8")
    key = i['Key']
    if search_word(body, search_for):
        search_results.append({key: body})
        search_results_keys.append(key)

# You can either print the keys (file names/directories), or a map where the key is the
# file name/directory and the value is the text of the file.
print(search_results)
print(search_results_keys)
There is a serverless and cheaper option available:
Use AWS Glue to convert the txt files into a table.
Use AWS Athena to run SQL queries on top of it.
I would recommend storing the data as Parquet on S3; this makes the data size on S3 very small and queries super fast!
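For reference, here is a minimal boto3 sketch of running such an Athena query. The database, table, column name, and output location are placeholders, and it assumes Glue has already catalogued the text files as a table with a single string column.

# Hypothetical sketch: run a full-text LIKE query with Athena over a Glue-catalogued table.
# Database, table, column name, and the S3 output location are all placeholders.
import time

import boto3

athena = boto3.client("athena")

query = "SELECT * FROM my_text_table WHERE line LIKE '%I am human%'"

execution = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "my_glue_database"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])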