Is there a way to create Quicksight analysis purely through code (boto3)?

Is there a way to create Quicksight analysis purely through code (boto3)? - amazon-web-services

What I currently have in my Quicksight account is a Data Source (Redshift), some datasets (some Redshift views) and an analysis (graphs and charts that use the datasets). I can view all of these on the AWS Quicksight Console. But when I use boto3 to create a data source and datasets, nothing shows up on the console. They do however show up when I use the list_data_sources and list_data_sets calls.
After this, I need to create all the graphs by code that I created manually. I can't currently find an option to do this through code. There is a 'create_template' api call which is supposed to create a template through an existing Quicksight analysis. But it requires the ARN of the analysis which I can't find.
Any suggestions on what to do?

Note: this only answers why the data sets/sources do not appear in the console. As for the other question, I assume mjgpy3 was of some help.
Summary
Add the permissions at the bottom of this post to your data set and data source in order for them to appear in the console. Make sure to fill in the principal arn with your details.
Details
In order for data sets and data sources to appear in the console when created via the API, you must ensure that the correct permissions have been added to them. Without adding the correct permissions, it is true that the CLI lists them whereas the console does not.
If you have created data sets/sources via the console, you can use the CLI (aws quicksight describe-data-set-permissions and aws quicksight describe-data-source-permissions) to view what permissions AWS gives them so that your account can interact with them.
I've tested this and these are what AWS assigns them as of 25/03/2020.
Data Set permissions:
"permissions": [
{
"Principal": "arn:aws:quicksight:<region>:<aws_account_id>:user/default/{IAM user name}",
"Actions": [
"quicksight:UpdateDataSetPermissions",
"quicksight:DescribeDataSet",
"quicksight:DescribeDataSetPermissions",
"quicksight:PassDataSet",
"quicksight:DescribeIngestion",
"quicksight:ListIngestions",
"quicksight:UpdateDataSet",
"quicksight:DeleteDataSet",
"quicksight:CreateIngestion",
"quicksight:CancelIngestion"
]
}
]
Data Source permissions:
"permissions": [
{
"Principal": "arn:aws:quicksight:<region>:<aws_account_id>:user/default/{IAM user name}",
"Actions": [
"quicksight:UpdateDataSourcePermissions",
"quicksight:DescribeDataSource",
"quicksight:DescribeDataSourcePermissions",
"quicksight:PassDataSource",
"quicksight:UpdateDataSource",
"quicksight:DeleteDataSource"
]
}
]

It sounds like your smaller question is regarding the ARN of the analysis.
The format of analysis ARNs is
arn:aws:quicksight:$AWS_REGION:$AWS_ACCOUNT_ID:analysis/$ANALYSIS_ID
Where
$AWS_REGION is replaced with the region in which the analysis lives
$AWS_ACCOUNT_ID is replaced with your AWS account ID
$ANALYSIS_ID is replaced with the analysis ID
If you're looking for the $ANALYSIS_ID it's the GUID-looking thing on the end of the URL for the analysis in the QuickSight URL
So, if you were on an analysis at the URL
https://quicksight.aws.amazon.com/sn/analyses/018ef6393-2c71-4842-9798-1aa2f0902804
the analysis ID would be 018ef6393-2c71-4842-9798-1aa2f0902804 (this is a fake ID I injected for this example).
Your larger question seems to be whether you can use the create_template API to duplicate your analysis. The answer at this moment (12/16/19) is, unfortunately, no.
You can use the create_dashboard API to publish a Dashboard from a template made with create_template but you can't create an Analysis from a template.
I'm answering this bit just to clarify since you may actually be okay with creating a dashboard (basically the published version of an analysis) rather than another analysis.

There are multiple ways you can find analysis id associated. Use any of the following.
A dashboard url has dashboard id included, Use this ID to execute API call describe-dashboard and you would see analysis ARN in the source entity.
Click on "save as" option on the dashboard and it would take you to the associated analysis. [ One might not see this option if a dashboard is created from a template ]
A dashboard ID can also be found by using list_dashboards API call. Print all the dashboard ID and name. You can match the ID with the given dashboard name.Look at the whole list because a dashboard id is unique but the dashboard name is not. One can have multiple dashboards with the same name.

Yes you can create lambda and trigger using cron Job
import boto3
quicksight = boto3.client('quicksight')
response = quicksight.create_ingestion(AwsAccountId=XXXXXXX,
DataSetId=YYYY,IngestionId=ZZZZ)
https://aws.amazon.com/blogs/big-data/automate-dataset-monitoring-in-amazon-quicksight/
https://aws.amazon.com/blogs/big-data/event-driven-refresh-of-spice-datasets-in-amazon-quicksight/

I've been playing with this as well and ran into the same issue. Make sure that your permissions are set up properly for the data source and the data set by referencing the quicksight user as follows:
arn:aws:quicksight:{region}:xxxxxxxxxx:user/default/{user}
I would include all the quicksight permissions found in the docs to start with and shave down from there. If nothing else, create the data source/set from the console, and then use the describe-* CLI call to see what they use.
It's kind of wonky.

Related

Adding row level permission tag configuration to a dataset

I am trying to embed an AWS QuickSight dashboard for anonymous access. For that dataset used in the dashboard must have tags that specify row level security. From what I see the only way to do this is via update-data-set cli command (or related API request). But this is insane - for this command to work I have to specify additional parameters like dataset name or even physical table map. But I have no intention to modify those, I just need to add RLS tags. Is there a straightforward way to add RLS tags to an existing dataset?

I ended up generating skeleton JSON for update-data-set via --generate-cli-skeleton parameter, then filling it with data from describe-data-set command, and adding block
"RowLevelPermissionTagConfiguration": {
"Status": "ENABLED",
"TagRules": [
{
"TagKey": "my_tag",
"ColumnName": "my_column"
}
]
}
and supplying this resulting JSON file via update-data-set --cli-input-json file://thatfile.json
Cumbersome, but it worked.

GCP: Is it possible to have an access to a resource if don't have project access?

It is my first expirience in Google Cloud Platform and I'm confused.
I've got an access to a resource:
xxx#gmail.com has granted you the following roles for resource resource_name(projects/project_name/datasets/ClientsExport/tables/resource_name) BigQuery Data Editor
But if I open BigQuery Data Editor, I don't see project_name and resource_name. Search by resource_name also returns no result.
Is it only access that I have in the project (I didn't get another accesses and mails).
Could you please help me with this? Maybe should I get some additional access to resource_name will be available? If is there another way to find the resource?
Thank you in advance!

In the message you have access to BigQuery data inside a table. You can query them from your project, you are autorised to access them (and to write also, because you are editor).
However, this table isn't in your project, it's in another project that's why you don't see it directly in the BigQuery console. In addition, you haven't the right to read the metadata (roles/bigquery.metadataViewer) on the dataset of the other project. Eventually, you can't also view the table schema in the console, but the bq CLI allow you to view it.
I had some discussions with Google BigQuery team about that (because I got the same issue in my company), and updates should happen by the end of the year (or soon in 2022) to fix this "view" issue in the console.

It looks like you have IAM permission to access a specific resource in BigQuery but cannot access it from the GUI.
Some reasons you may not see access on your GUI:
You have permission to interact with BigQuery but don't have access to any of the data.
You aren't a member of the organization which provided the resources and they have higher level permissions (on the org level) which prevents sharing of resources outside of the org.
Your access is restricted to the command line/app level. (If your account is a service account then this is likely the case.)

List all LogGroups using cdk

I am quite new to the CDK, but I'm adding a LogQueryWidget to my CloudWatch Dashboard through the CDK, and I need a way to add all LogGroups ending with a suffix to the query.
Is there a way to either loop through all existing LogGroups and finding the ones with the correct suffix, or a way to search through LogGroups.
const queryWidget = new LogQueryWidget({
title: "Error Rate",
logGroupNames: ['/aws/lambda/someLogGroup'],
view: LogQueryVisualizationType.TABLE,
queryLines: [
'fields #message',
'filter #message like /(?i)error/'
],
})
Is there anyway I can add it so logGroupNames contains all LogGroups that end with a specific suffix?

You cannot do that dynamically (i.e. you can't make this work such that if you add a new LogGroup, the query automatically adjusts), without using something like AWS lambda that periodically updates your Log Query.
However, because CDK is just a code, there is nothing stopping you from making an AWS SDK API call inside the code to retrieve all the log groups (See https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CloudWatchLogs.html#describeLogGroups-property) and then populate logGroupNames accordingly.
That way, when CDK compiles, it will make an API call to fetch LogGroups and then generated CloudFormation will contain the log groups you need. Note that this list will only be updated when you re-synthesize and re-deploy your stack.
Finally, note that there is a limit on how many Log Groups you can query with Log Insights (20 according to https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html).

If you want to achieve this, you can create a custom resource using AwsCustomResource and AwsSdkCall classes to do the AWS SDK API call (as mentioned by #Tofig above) as part of the deployment. You can read data from the API call response as well and act on it as you want.

AWS Personalize: Dumping User-item interaction Dataset Created By PutEvent

Following AWS Personalize documents, I successfully imported my datasets (User, Item, Interaction) from S3, created an EventTrcker, trained the model, and deployed the campaign. The solution works without any issue and I get the recommendations.
I rely on Putevent to add new user-item interaction events. I also dump those interaction events using Lambda+firehose in my s3. But I am wondering if AWS Personalize internally creates/augments the original user-item interaction dataset? How I can access and download the revised version of the dataset? I cannot see any new dataset in "Dataset groups > Datasets" rather than my original 3 datasets...
I prefer to dump it regularly from AWS Personalize to my S3 storage rather than using my own Lambda+Firehose solution.
This is the output of my Putevent call. I see 200...but not sure it works fine or not...should I see any new dataset in "Dataset groups > Datasets" created by putevents?
{
"ResponseMetadata": {
"RequestId": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
"HTTPStatusCode": 200,
"HTTPHeaders": {
"content-type": "application/json",
"date": "Mon, 04 Jan 2021 18:04:28 GMT",
"x-amzn-requestid": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
"content-length": "0",
"connection": "keep-alive"
},
"RetryAttempts": 0
}
}

Update: Now it's possible
AWS documentation:
https://docs.aws.amazon.com/personalize/latest/dg/export-data.html
You can use this AWS CLI command for exporting only interactions, that were added but PutEvents/PutUsers/PutItems API calls:
aws personalize create-dataset-export-job \
--job-name job name \
--dataset-arn dataset ARN \
--job-output "{\"s3DataDestination\":{\"kmsKeyArn\":\"kms key ARN\",\"path\":\"s3://bucket-name/folder-name/\"}}" \
--role-arn role ARN \
--ingestion-mode PUT
In that case --ingestion-mode PUT will make sure, that:
Specify PUT to export only data that you imported incrementally using the console or the PutEvents, PutUsers, or PutItems operations.
So I believe it covers your use case.
No, it's not possible
It's simply impossible right now to export this data.
There is no API to retrieve a dump of your Interactions dataset in Personalize.
I believe Lambda + Firehose workaround for this is correct approach.
But how to test, if PutEvents works?
To make sure, that Interactions added through PutEvents, you can make use of Filters feature:
https://docs.aws.amazon.com/personalize/latest/dg/filter-expressions.html
Pretty much create a new Filter, with similar expression:
EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("your_event_type_name")
Which will exclude from recommendations any item, that user previously interacted with.
Then you can test, if events added through PutEvents API are recognized correctly:
Create Filter expression as described above.
Create any campaign for simple recommendations (User-Personalization recipe).
Connect the filter to campaign.
Get recommendations for any user and save them somewhere.
Call PutEvents API with any of the recommended items, that was returned in 4 and user id from 4.
Again get recommendations for the same user as in 4.
If the item, that you did added with PutEvents call is no longer recommended, then you have a proof, that events added through PutEvents call are correctly added to Interactions dataset.
What if PutEvents call doesn't affect recommendations in that case?
Then simply you are providing incorrect values in API call. Personalize might return 200 response, even if event provided was invalid.
To fix that, try:
Make sure date is in correct format. Personalize might ignore events with very old timestamps, if there are much more newer events (it's possible to configure it in Solution config).
Check if you are not passing any strange values like "null" or "undefined" for sessionId, userId, trackingId in PutEvents params. It might cause ignoring the event by Personalize (https://github.com/aws/aws-sdk-js/issues/3371)
Make sure, you are passing correct eventType value (should match eventType in Solution and Filter).
If it still doesn't work, raise a support ticket to AWS with an example PutEvents API call params.
Are there any simpler solutions?
Well, maybe there are, but in our project we use this approach and it also tests, if filtering feature is working correctly. You will probably make use of Filtering anyways in the future, so I believe it's good enough method.

Get the BigQuery Table creator and Google Storage Bucket Creator Details

I am trying to identify the users who created tables in BigQuery.
Is there any command line or API that would provide this information. I know that audit logs do provide this information, but I was looking for a command line which could do the job so that i could wrap this in a shell script and run them against all the tables at one time. Same for Google Storage Buckets as well. I did try
gsutil iam get gs://my-bkt and looked for "role": "roles/storage.admin" role, but I do not find the admin role with all buckets. Any help?

This is a use case for audit logs. BigQuery tables don't report metadata about the original resource creator, so scanning via tables.list or inspecting the ACLs don't really expose who created the resource, only who currently has access.
What's the use case? You could certainly export the audit logs back into BigQuery and query for table creation events going forward, but that's not exactly the same.

You can find it out using Audit Logs. You can access them both via Console/Log Explorer or using gcloud tool from the CLI.
The log filter that you're interested in is this one:
resource.type = ("bigquery_project" OR "bigquery_dataset")
logName="projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"
protoPayload.resourceName = "projects/YOUR_PROJECT/datasets/curb_tracking/tables/YOUR_TABLE"
If you want to run it from the command line, you'd do something like this:
gcloud logging read \
'
resource.type = ("bigquery_project" OR "bigquery_dataset")
logName="projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"
protoPayload.resourceName = "projects/YOUR_PROJECT/datasets/curb_tracking/tables/YOUR_TABLE"
'\
--limit 10
You can then post-process the output to find out who created the table. Look for principalEmail field.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js