Amazon Connect - How to generate a report based on CTRs - amazon-web-services

We have created a contact centre with two contact flows and one customer queue flow. Under Metrics there are multiple report types that focus on the queue or the agents, but what I need is a report based on the customer's choices in the Get customer input blocks, i.e. the path taken by the customer at each intersection. Is there a way to achieve this?
Example:
The customer selected Option A at Level 1 and Option 3 at Level 2, plus the contact flow name, etc. I believe this information resides in the CTRs (contact attributes), but how do I get a cumulative report across all records? As far as I can tell, Contact search only returns one contact at a time.

You can stream the CTR data to a Redshift database using the data streaming feature in Amazon Connect. The general steps are to create a Kinesis stream, turn on data streaming in Amazon Connect, and attach the Kinesis stream.
There is a quickstart to help set this up here: https://aws.amazon.com/quickstart/connect/data-streaming/
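Once CTRs are flowing into the Kinesis stream, you can inspect the contact attributes that your contact flow sets for each Get customer input choice. This is only a rough sketch with boto3, separate from the Quick Start; the stream name, region, single-shard assumption, and CTR field names are placeholders/assumptions:

import json

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")  # placeholder region
stream_name = "connect-ctr-stream"                          # placeholder stream

# For simplicity this reads only the first shard from the beginning.
shard_id = kinesis.describe_stream(StreamName=stream_name)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=stream_name,
    ShardId=shard_id,
    ShardIteratorType="TRIM_HORIZON",
)["ShardIterator"]

records = kinesis.get_records(ShardIterator=iterator, Limit=100)["Records"]
for record in records:
    ctr = json.loads(record["Data"])  # each record carries one CTR as JSON
    # "Attributes" holds the contact attributes set in the flow.
    print(ctr.get("ContactId"), ctr.get("Attributes", {}))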

Related

BigQuery (BQ) alerts for slots and query concurrency

I'm trying to set up project-level email alerts for when a certain query/job concurrency is reached, e.g. 5 concurrent queries. We have a flat-rate pricing model.
I want a similar email notification when total slot usage exceeds a certain threshold as well, e.g. slot usage reaching 1,000 slots.
As a next step I would like to throttle new incoming queries based on the thresholds mentioned above. Meaning if there are already, for example, 5 queries actively running, the 6th one will be put on hold until one of the 5 already running has completed.
You can create an alerting policy in Cloud Monitoring in which you set your desired metric type (e.g. slots) and then configure your desired threshold.
When creating the alerting policy you can also configure the notification channel as an email notification, which is covered in the same documentation.
For the available slot-related metric types in BigQuery, refer to the Google Cloud Metrics for BigQuery documentation.
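If you later want to script this rather than click through the console, a rough sketch with the Cloud Monitoring client library might look like the following. The metric type, resource type, threshold, and notification channel name are assumptions/placeholders; verify them against the metrics documentation referenced above.

from google.cloud import monitoring_v3

project_id = "my-project"  # placeholder
client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="BigQuery concurrent queries above 5",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.AND,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="Queries in flight",
            condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                # Example metric/resource pair; confirm against the BigQuery metrics list.
                filter=(
                    'metric.type = "bigquery.googleapis.com/job/num_in_flight" '
                    'AND resource.type = "bigquery_project"'
                ),
                comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
                threshold_value=5,
                duration={"seconds": 300},
                aggregations=[
                    monitoring_v3.Aggregation(
                        alignment_period={"seconds": 300},
                        per_series_aligner=monitoring_v3.Aggregation.Aligner.ALIGN_MEAN,
                    )
                ],
            ),
        )
    ],
    # An email notification channel created beforehand (placeholder resource name).
    notification_channels=["projects/my-project/notificationChannels/1234567890"],
)

created = client.create_alert_policy(name=f"projects/{project_id}", alert_policy=policy)
print("Created alerting policy:", created.name)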
For your next step, you can write code (Python, Node.js, etc.) against the BigQuery API to count the queries that are actively running (by job ID); when the count hits 5, report that the query queue is full and wait for the number of running jobs to drop below 5 before running the next query. You may refer to the BigQuery Managing Jobs API documentation.
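As a minimal sketch of that approach with the google-cloud-bigquery client (the project ID, threshold of 5, and polling interval are placeholders):

import time

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

def running_query_count() -> int:
    # Count only query jobs that are currently in the RUNNING state.
    return sum(
        1
        for job in client.list_jobs(state_filter="running", all_users=True)
        if job.job_type == "query"
    )

def wait_for_free_slot(max_concurrent: int = 5, poll_seconds: int = 10) -> None:
    # Block until fewer than max_concurrent queries are running.
    while running_query_count() >= max_concurrent:
        print("query queue is full")
        time.sleep(poll_seconds)

wait_for_free_slot()
client.query("SELECT 1")  # submit the next query once the queue has room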

BigQuery summary

Where/how can I easily see how many BigQuery analysis queries have been run per month? How about overall storage usage and its month-over-month changes?
I've had a quick look at "Monitoring > Dashboards > BigQuery". Is that the best place to explore? It only seems to go back to early October; was that when it was released, or does it only display the last X weeks of data? Trying Metrics Explorer for query count (metric: bigquery.googleapis.com/job/num_in_flight) gave me a weird unlabelled y-axis, e.g. a scale of 0 to 0.08, which is odd as I expect to see a few hundred queries run per week.
Context: it would be good to have a high-level summary of BigQuery as the months progress, to give the wider organisation and management an idea of the scale of usage.
You can track your bytes billed by exporting BigQuery usage logs.
Set up the logs export (this uses the Legacy Logs Viewer):
Open Logging -> Logs Viewer
Click Create Sink
Enter "Sink Name"
For "Sink service" choose "BigQuery dataset"
Select your BigQuery dataset to monitor
Create sink
Once the sink is enabled, every executed query will write its data usage logs to the table "cloudaudit_googleapis_com_data_access_YYYYMMDD" under the BigQuery dataset you selected in your sink.
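If you prefer to create the same export sink from code instead of the console, a minimal sketch with the google-cloud-logging client could look like this (the project, sink name, dataset, and log filter are placeholders to adapt):

from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # placeholder project

sink = client.sink(
    "bigquery-audit-sink",  # placeholder sink name
    filter_='resource.type="bigquery_resource"',  # adjust to the logs you want to export
    destination="bigquery.googleapis.com/projects/my-project/datasets/training_big_query",
)

if not sink.exists():
    sink.create()
    print("Created sink", sink.name)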
Here is a sample query to get billed bytes per user:
#standardSQL
WITH data AS (
  SELECT
    protopayload_auditlog.authenticationInfo.principalEmail AS principalEmail,
    protopayload_auditlog.metadataJson AS metadataJson,
    CAST(JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
      "$.jobChange.job.jobStats.queryStats.totalBilledBytes") AS INT64) AS totalBilledBytes
  FROM
    `myproject_id.training_big_query.cloudaudit_googleapis_com_data_access_*`
)
SELECT
  principalEmail,
  SUM(totalBilledBytes) AS billed_bytes
FROM
  data
WHERE
  JSON_EXTRACT_SCALAR(metadataJson, "$.jobChange.job.jobConfig.type") = "QUERY"
GROUP BY principalEmail
ORDER BY billed_bytes DESC
NOTES:
You can only track usage starting from the date you set up the logs export
A new "cloudaudit_googleapis_com_data_access_YYYYMMDD" table is created each day to hold that day's logs
I think Cloud Monitoring is the only place to create and view these metrics. If you are not happy with what it provides for BigQuery by default, the only other alternative is to create your own customised charts and dashboards that satisfy your needs. You can achieve that using Monitoring Query Language (MQL). Using MQL you can achieve what you described in your question. Here are the links for more detailed information.
Introduction to BigQuery monitoring
Introduction to Monitoring Query Language
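As an illustration of the programmatic route, here is a minimal sketch with the Cloud Monitoring client library that reads the in-flight query metric mentioned above (the project ID and time window are placeholders):

import time

from google.cloud import monitoring_v3

project_id = "my-project"  # placeholder
client = monitoring_v3.MetricServiceClient()

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "end_time": {"seconds": now},
        "start_time": {"seconds": now - 7 * 24 * 3600},  # last 7 days
    }
)

results = client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = "bigquery.googleapis.com/job/num_in_flight"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    for point in series.points:
        print(point.interval.end_time, point.value.double_value)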

How do I turn on cost controls at user level on BigQuery?

Felipe Hoffa wrote this very helpful guide on how to turn on custom cost control for a project in BigQuery. However, according to the doc, it should be possible to configure custom cost control at the user level as well. I really need to do this for my production data warehouse project, because I can't let one person's mishap stop all the other users from using the data warehouse. Please help!
Go to console.cloud.google.com > IAM & Admin > Quotas, then filter by BigQuery in the services dropdown.
You are looking to edit "Query usage per day per user". To calculate the number of bytes you can use a service like: https://convertlive.com/u/convert/terabytes/to/bytes#1
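If you prefer not to rely on an online converter, the arithmetic is straightforward; just be aware of the decimal versus binary definitions (which one the quota field expects may depend on the console version):

budget_tb = 1  # placeholder daily budget in terabytes

print(budget_tb * 10**12)  # 1 TB  = 1,000,000,000,000 bytes (decimal)
print(budget_tb * 2**40)   # 1 TiB = 1,099,511,627,776 bytes (binary)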

Is Google BigQuery suitable for inserting data from IoT devices?

I am working at a startup that will sell an IoT device of some sort. These devices will connect to our server hosted on Google Cloud and send data every second, and my server will store it in a database as a time series. Let's say we have 1,000 devices connected, all sending their data every second. Is it suitable to use Google BigQuery to insert this data into a table every second, with each device writing to the table belonging to the owner of the device?
Since my data takes the form of a time series, I am thinking of using a partitioned table for each user (the owner of my device), but with the limits and quotas listed in the official documentation I am worried about reaching the limits with my high number of inserts every second (not to mention that I will query the data on demand from my phone app).
If it's not suitable, what would be suited to my use case?
EDIT: My main concern is the huge number of inserts per second, which can exceed BigQuery's limits or cause slowdowns, since it is mainly a data warehouse. Bigtable seems expensive for us, and Cloud SQL seems the way to go, but we are worried about slow query times once the table fills up, since I am inserting 86,400 rows per user per day.
Thanks.
You should check out Cloud IoT Core, a fully managed service to easily and securely connect, manage, and ingest data from globally dispersed devices.
Device data captured by Cloud IoT Core gets published to Cloud Pub/Sub for downstream analytics. You can do ad hoc analysis using Google BigQuery, easily run advanced analytics and apply machine learning with Cloud Machine Learning Engine, or visualize IoT data results with rich reports and dashboards in Google Data Studio.
Also check out IoT Core with Pub/Sub, Dataflow, and BigQuery
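As a rough sketch of the ingestion path described above, each device reading could be published to a Pub/Sub topic and loaded into BigQuery downstream (for example via Dataflow). The project, topic, and message fields below are placeholders:

import json
import time

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "device-readings")  # placeholders

def publish_reading(device_id: str, value: float) -> None:
    # Publish one time-series sample as a JSON message.
    payload = {"device_id": device_id, "value": value, "timestamp": time.time()}
    future = publisher.publish(topic_path, data=json.dumps(payload).encode("utf-8"))
    future.result()  # block until Pub/Sub acknowledges the message

publish_reading("device-0001", 23.7)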

Matillion for Amazon Redshift support for job monitoring

I am working with Matillion for Amazon Redshift and we have multiple jobs running daily, triggered by SQS messages. I am now looking into the possibility of creating a UI dashboard for stakeholders that will monitor the live progress of jobs and show a report of previous jobs: job name, tables impacted, job status/reason for failure, etc. Does Matillion maintain this kind of information implicitly, or will I have to maintain this information for each job?
Matillion has an API which you can use to obtain details of all task history. Information on the tasks API is here:
https://redshiftsupport.matillion.com/customer/en/portal/articles/2720083-loading-task-information?b_id=8915
You can use this to pull data on either currently running jobs or completed jobs, down to component level, including the name of the job, the name of the component, how long it took to run, whether it ran successfully or not, and any applicable error message.
This information can be pulled into a Redshift table using the Matillion API profile that comes built into the product together with the API Query component. You could then build your dashboard on top of this table. For further information I suggest you reach out to Matillion via their Support Center.
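As a rough sketch only (the host, credentials, endpoint path, and field names below are placeholders; check the task API documentation linked above for the exact URL for your Matillion version), pulling the task history into your own dashboard code could look like:

import requests

MATILLION_HOST = "https://matillion.example.com"  # placeholder host
TASKS_ENDPOINT = f"{MATILLION_HOST}/rest/v1/..."  # see the linked task API docs for the real path

response = requests.get(
    TASKS_ENDPOINT,
    auth=("api-user", "api-password"),  # placeholder credentials
    timeout=30,
)
response.raise_for_status()

for task in response.json():
    # Field names vary by API version; these are illustrative only.
    print(task.get("jobName"), task.get("state"), task.get("startTime"))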
The API is helpful, but you can only pass a date as a parameter (this is for Matillion for Snowflake; I assume it's the same for Redshift). I've requested the ability to pass a datetime so we can run the jobs throughout the day and not pull back the same set of records every time our API call runs.