What is the aws-cli command for AWS Macie to create a job?

Actually, I want to create a job in AWS Macie using the AWS CLI.
I ran the following command:
aws macie2 create-classification-job --job-type "ONE_TIME" --name "maice-poc" --s3-job-definition bucketDefinitions=[{"accountID"="254378651398", "buckets"=["maice-poc"]}]
but it is giving me an error:
Unknown options: buckets=[maice-poc]}]
Can someone give me the correct command?

The --s3-job-definition option requires a structure as its value.
In your case you want to pass that structure as JSON, so wrap the JSON starting with bucketDefinitions in single quotes. Also, use the JSON syntax : for key-value pairs instead of =, and note that the key is accountId, not accountID.
The following API call should work:
aws macie2 create-classification-job --job-type "ONE_TIME" --name "macie-poc" --s3-job-definition '{"bucketDefinitions":[{"accountId":"254378651398", "buckets":["maice-poc"]}]}'
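
For the record, the same job can be created from Python with boto3, where the structure is passed as a native dict and there is no shell quoting to worry about. A minimal sketch, reusing the account ID and bucket name from the question:

import boto3

# Macie 2 client; region and credentials come from your normal AWS configuration
macie = boto3.client("macie2")

# s3JobDefinition is a plain Python dict, so no quoting or escaping is needed
response = macie.create_classification_job(
    jobType="ONE_TIME",
    name="macie-poc",
    s3JobDefinition={
        "bucketDefinitions": [
            {"accountId": "254378651398", "buckets": ["maice-poc"]}
        ]
    },
)
print(response["jobId"], response["jobArn"])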

Related

New version of task definition in ECS Fargate

I have an ECS cluster running. I want to create a new version of a task definition using the AWS CLI.
I know I need to use the command below to create a new version:
aws ecs register-task-definition --family API-servie-fetch --cli-input-json file://TD-DC.json
But where do I get this JSON file "file://TD-DC.json" from?
I believe I have to update the image tag and version number in this file, but where do I get the file itself?
Note: My task is already running and I only want to update it with a new image; all other parameters should stay the same.
You can obtain the current task definition in JSON format using describe-task-definition. Once you have it, you can modify it as you want and then register it as a new revision.
If you work on the command line, you can use jq to modify/process the original task definition JSON.
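
If you would rather script the whole flow than hand-edit JSON, a rough boto3 sketch of the same idea (describe, swap the image, re-register) could look like this. The family name comes from the question; the image URI is just a placeholder:

import boto3

ecs = boto3.client("ecs")

# Fetch the latest active revision of the existing task definition
current = ecs.describe_task_definition(taskDefinition="API-servie-fetch")["taskDefinition"]

# Point the first container at the new image (placeholder URI and tag)
current["containerDefinitions"][0]["image"] = "123456789012.dkr.ecr.eu-west-1.amazonaws.com/api-service:new-tag"

# Drop read-only fields that register_task_definition does not accept
for key in ("taskDefinitionArn", "revision", "status", "requiresAttributes",
            "compatibilities", "registeredAt", "registeredBy"):
    current.pop(key, None)

# Register the modified definition as a new revision of the same family
new_revision = ecs.register_task_definition(**current)["taskDefinition"]["revision"]
print("Registered revision", new_revision)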

What is the boto3 equivalent of the following AWS CLI command?

I am trying to write the following AWS CLI command in boto3:
aws s3api list-objects-v2 --bucket bucket --query "Contents[?LastModified>='2020-02-20T17:00:00'] | [?LastModified<='2020-02-20T17:10:00']"
My goal is to be able to list only those s3 files in a specified bucket that were created within a specific interval of time. In the example above, when I run this command it returns only files that were created between 17:00 and 17:10.
When looking at the boto3 documentation, I am not seeing anything resembling the '--query' portion of the above command, so I am not sure how to go about returning only the files that fall within a specified time interval.
(Please note: Listing all objects in the bucket and filtering from the larger list is not an option for me. For my specific use case, there are simply too many files to go this route.)
If a boto3 equivalent does not exist, is there a way to launch this exact command via a Python script?
Thanks in advance!
There is no direct equivalent in boto3, but s3pathlib solves your problem. Here's a code snippet that does it:
from datetime import datetime
from s3pathlib import S3Path

# define a bucket
p_bucket = S3Path("bucket")

# filter by last modified
for p in p_bucket.iter_objects().filter(
    # any filterable attribute can be used for filtering
    S3Path.last_modified_at.between(
        datetime(2000, 1, 1),
        datetime(2020, 1, 1),
    )
):
    # do whatever you like
    print(p.console_url)  # click the link to open it in the console and inspect results
If you want to use other S3Path attributes for filtering, use other comparators, or even define a custom filter, you can follow the s3pathlib documentation.
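
If you would rather stay with plain boto3: the CLI's --query filter is applied client-side anyway, so paginating and comparing LastModified in Python gives the same result. A minimal sketch, assuming the bucket name and time window from the question:

from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3")
start = datetime(2020, 2, 20, 17, 0, 0, tzinfo=timezone.utc)
end = datetime(2020, 2, 20, 17, 10, 0, tzinfo=timezone.utc)

# list_objects_v2 returns at most 1000 keys per call, so paginate
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="bucket"):
    for obj in page.get("Contents", []):
        # LastModified is a timezone-aware datetime in boto3 responses
        if start <= obj["LastModified"] <= end:
            print(obj["Key"], obj["LastModified"])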

Is there any way to specify --endpoint-url in aws cli config file

The aws command is
aws s3 ls --endpoint-url http://s3.amazonaws.com
can I load endpoint-url from any config file instead of passing it as a parameter?
This is an open bug in the AWS CLI. The issue links to a CLI plugin which might do what you need.
It's worth pointing out that if you're just connecting to standard Amazon cloud services (like S3) you don't need to specify --endpoint-url at all. But I assume you're trying to connect to some other private service and that the URL in your example was just, well, an example...
alias aws='aws --endpoint-url http://website'
Updated Answer
Here is an alternative alias to address the OP's specific need and comments above
alias aws='aws $([ -r "$SOME_CONFIG_FILE" ] && sed "s,^,--endpoint-url ," $SOME_CONFIG_FILE) '
The SOME_CONFIG_FILE environment variable could point to an aws-endpoint-override file containing
http://localhost:4566
Original Answer
Thought I'd share an alternative version of the alias
alias aws='aws ${AWS_ENDPOINT_OVERRIDE:+--endpoint-url $AWS_ENDPOINT_OVERRIDE} '
This idea I replicated from another alias I use for Terraform
alias terraform='terraform ${TF_DIR:+-chdir=$TF_DIR} '
I happen to use direnv with a /Users/darren/Workspaces/current-client/.envrc containing
source_up
PATH_add bin
export AWS_PROFILE=saml
export AWS_REGION=eu-west-1
export TF_DIR=/Users/darren/Workspaces/current-client/infrastructure-project
...
A possible workflow for AWS-endpoint overriding could entail cd'ing into a docker-env directory, where /Users/darren/Workspaces/current-client/app-project/docker-env/.envrc contains
source_up
...
export AWS_ENDPOINT_OVERRIDE=http://localhost:4566
where LocalStack is running in Docker, exposed on port 4566.
You may not be using Docker or LocalStack, etc, so ultimately you will have to provide the AWS_ENDPOINT_OVERRIDE environment variable via a mechanism and with an appropriate value to suit your use-case.
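
For completeness, the same override idea carries over to Python scripts; here is a small boto3 sketch that reuses the AWS_ENDPOINT_OVERRIDE variable from above (boto3 does not read that variable on its own, it is only the convention used in this answer):

import os
import boto3

# Pass the override through explicitly; None falls back to the default AWS endpoint
endpoint = os.environ.get("AWS_ENDPOINT_OVERRIDE")  # e.g. http://localhost:4566

s3 = boto3.client("s3", endpoint_url=endpoint)
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])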

DataFlow gcloud CLI - "Template metadata was too large"

I've honed my transformations in DataPrep, and am now trying to run the DataFlow job directly using the gcloud CLI.
I've exported my template and template metadata file, and am trying to run them using gcloud dataflow jobs run and passing in the input & output locations as parameters.
I'm getting the error:
Template metadata regex '[ \t\n\x0B\f\r]*\{[ \t\n\x0B\f\r]*((.|\r|\n)*".*"[ \t\n\x0B\f\r]*:[ \t\n\x0B\f\r]*".*"(.|\r|\n)*){17}[ \t\n\x0B\f\r]*\}[ \t\n\x0B\f\r]*' was too large. Max size is 1000 but was 1187.
I've not specified this at the command line, so I know it's getting it from the metadata file - which is straight from DataPrep, unedited by me.
I have 17 input locations - one containing source data, all the others are lookups. There is a regex for each one, plus one extra.
If it's running when prompted by DataPrep, but won't run via CLI, am I missing something?
In this case I'd suspect the root cause is a limitation in gcloud that is not present in the Dataflow API or Dataprep. The best thing to do in this case is to open a new Cloud Dataflow issue in the public tracker and provide details there.

AWS CloudSearch request using CLI returns Invalid Javascript Object error

I'm trying to query my AWS Cloudsearch (2013 API) domain using the AWS CLI on Ubuntu. I haven't been able to get it to work successfully when the search is restricted to a specific field. The following query:
aws --profile myprofile cloudsearchdomain search \
--endpoint-url "https://search-mydomain-abc123xyz.eu-west-1.cloudsearch.amazonaws.com" \
--query-options {"fields":["my_field"]} \
--query-parser "simple" \
--return "my_field" \
--search-query "foo bar"
...returns the following error:
An error occurred (SearchException) when calling the Search operation: q.options contains invalid javascript object
If I remove the --query-options parameter from the above query, then it works. From the AWS CLI docs regarding the fields option of the --query-options parameter:
An array of the fields to search when no fields are specified in a search... Valid for: simple , structured , lucene , and dismax
aws cli version:
aws-cli/1.11.150 Python/2.7.12 Linux/4.10.0-28-generic botocore/1.7.8
I think the documentation is a bit misleading here. The real problem is shell quoting: without outer quotes, the shell strips the inner double quotes before the value reaches the CLI, so the service receives an invalid object. Either wrap the value in double quotes and use single quotes inside:
--query-options "{'fields':['my_field']}"
or escape the inner double quotes:
--query-options "{\"fields\":[\"my_field\"]}"