AWS RDS Performance Insights not showing SQL Queries

I enabled Performance Insights on an existing MySQL database (version 5.6.46) in AWS RDS.
But it still shows 0 sessions and "No active sessions in the selected time range", no matter what duration I select from the list at the top.
Is there some condition I need to meet for my queries to be recorded in Performance Insights? What are the criteria? How can I troubleshoot this?

I created an AWS Support case, where an AWS engineer explained:
Unfortunately, this is a known issue on our end: Performance Insights does not get enabled when it is requested in the same API call as an engine version upgrade. RDS follows a priority order when executing multiple requests submitted as part of the same API call - in this case, the request to enable Performance Insights and the request to upgrade the instance to version 11.1. The Performance Insights request is evaluated first, followed by the engine upgrade. This means that when the Performance Insights request was considered, the instance was still on the previous, incompatible version, so the request did not go through successfully.
The workaround is to disable Performance Insights, wait a few minutes, and then re-enable it.
Enabling/disabling Performance Insights does not cause an outage/downtime. The Performance Insights agent is designed to stay out of your database workloads' way. When Performance Insights detects heavy load or depleted resources, it backs off, still collecting data, but only when it is safe to do so.
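If you prefer to script the workaround rather than click through the console, here is a minimal sketch using boto3 (the instance identifier is a placeholder):

```python
# Sketch of the disable/wait/re-enable workaround via boto3.
# "my-db-instance" is a placeholder identifier.
import boto3

rds = boto3.client("rds")

# Disable Performance Insights and wait for the instance to settle.
rds.modify_db_instance(
    DBInstanceIdentifier="my-db-instance",
    EnablePerformanceInsights=False,
    ApplyImmediately=True,
)
rds.get_waiter("db_instance_available").wait(
    DBInstanceIdentifier="my-db-instance"
)

# Re-enable it in a separate call, as the engineer suggested.
rds.modify_db_instance(
    DBInstanceIdentifier="my-db-instance",
    EnablePerformanceInsights=True,
    ApplyImmediately=True,
)
```

Issuing the enable request on its own, rather than bundled with other modifications, avoids the ordering problem the engineer described.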

Related

How do I disable RDS performance insights for a t4g MySQL instance?

I'm trying to temporarily disable Performance Insights for an RDS MySQL t4g instance (which is on the latest MySQL version and has Performance Insights already enabled and in effect). The RDS documentation describes a "Performance Insights" section where this can be toggled, which I remember using when setting up the instance.
But this section is entirely missing when I go to the "Modify" view from the instance page. This is true even when I click the "Modify retention tier" option on the Performance Insights page directly.
Also (and I'm only mentioning this as possibly another effect of the same underlying issue), when I've visited the Insights page for this instance over the past couple of days, about 2 out of 3 times all the Performance Insights data over any timespan just appears as NaN, even after a system update and multiple reboots.
I do get the metrics to appear if I hard-refresh enough times, but for another instance they always show up right away.
I have combed through the entire Performance Insights documentation and have not found any reason why this section wouldn't appear, but I feel like I'm missing something. Is there anywhere else I should be looking?
EDIT: I just tried to change a different (unrelated) setting on the same instance and got a message that my instance, with Performance Insights already enabled, doesn't support Performance Insights, so I have to disable it...when I don't even have the option!
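Since the console is misbehaving, one way to cross-check (and, if needed, toggle) the actual state is through the API. A minimal boto3 sketch, with a placeholder instance identifier:

```python
# Sketch: ask RDS directly what it thinks the Performance Insights
# state is, bypassing the console. "my-t4g-instance" is a placeholder.
import boto3

rds = boto3.client("rds")

db = rds.describe_db_instances(
    DBInstanceIdentifier="my-t4g-instance"
)["DBInstances"][0]
print("PerformanceInsightsEnabled:", db.get("PerformanceInsightsEnabled"))

# If it reports True and you still want it off despite the missing
# console section, the same modify call works from the API side:
rds.modify_db_instance(
    DBInstanceIdentifier="my-t4g-instance",
    EnablePerformanceInsights=False,
    ApplyImmediately=True,
)
```

If the instance class or engine version genuinely doesn't support Performance Insights, the API error from the modify call is usually more specific than what the console shows.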

General guidance around Bigtable performance

I'm using a single-node Bigtable cluster for my sample application running on GKE. Autoscaling has been incorporated into the client code.
Sometimes I experience slowness (>80 ms) on GET calls. To investigate further, I need some clarity on the following Bigtable behaviour.
I have cached the Bigtable table object to ensure faster GET calls. Is the table object persistent on GKE? I have learned that objects are not persistent on Cloud Functions; should I expect similar behaviour on GKE?
I'm using service-account authentication, but how frequently do auth tokens get refreshed? I have seen frequent refresh logs from the gRPC Java client, and I suspect Bigtable won't be able to serve requests during the token-refresh period (4-5 seconds).
What if the client machine/instance doesn't scale enough? Will that cause slowness on GET calls?
Bigtable client libraries use connection pooling. How frequently do connections/channels close? I have learned that connections are closed after minutes of inactivity (>15 minutes or so).
I'm planning to read only the needed columns instead of the entire row, by specifying the row key together with a column qualifier filter. Can I expect a performance improvement from not reading the entire row?
The official GCP documentation describes the common causes of slower Bigtable performance; I would suggest going through it, along with Troubleshooting performance issues.
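On the filtered-read question specifically: server-side filters do reduce the data returned over the wire. A minimal sketch using the Python Bigtable client for illustration (project, instance, table, family, and qualifier names are all placeholders):

```python
# Sketch: read one column of one row instead of the whole row.
# All identifiers below are placeholders.
from google.cloud import bigtable
from google.cloud.bigtable import row_filters

client = bigtable.Client(project="my-project")
instance = client.instance("my-instance")
table = instance.table("my-table")

# Only cells in family "cf1" whose qualifier is exactly b"status"
# are returned, so less data crosses the network than a full-row read.
col_filter = row_filters.RowFilterChain(filters=[
    row_filters.FamilyNameRegexFilter("cf1"),
    row_filters.ColumnQualifierRegexFilter(b"status"),
])

row = table.read_row(b"row-key-1", filter_=col_filter)
if row is not None:
    cell = row.cells["cf1"][b"status"][0]
    print(cell.value)
```

How much this helps depends on how wide your rows are: the gain is largest when rows carry many large cells you don't need.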

Stackdriver Logging Client Libraries - What happens during Google Downtime?

If you embed the Stackdriver client library in your application and the Google Stackdriver API has downtime (Google's documentation indicates 99.95% availability, i.e. up to 21.92 minutes of downtime per month), what will happen to my application during that downtime? Will logging info build up in memory? Will it cause application errors, or will it discard the log data and continue on?
Logging API downtimes can have different root causes and consequences. Google's system engineers have mechanisms in place to track outages and take mitigating actions, so that downtime and its consequences are minimal, but Google cannot guarantee the prevention of data loss in every outage related to the Logging API.
Hopefully your application and pipeline can withstand up to 21.92 minutes of expected downtime a month (SLA 99.95%), as per the internal SLOs and SLAs of GCP.
The three scenarios you listed are all plausible. During such a period, your application sending the logs may receive 500 responses from the network, so it has to be able to deal with this kind of issue; see the sketch after the scenarios below.
If the logging data manages to reach Google's platform but an outage prevents it from being accessible, then Google's team will do their best to release backlogs, repopulate data, etc. They will post a general notice on https://status.cloud.google.com/
If the issue is caused by the logging agent not sending data to the platform, then the logging data may not be retrievable. That could still be an infrastructural outage with one of the GCP products, but it could also be linked to something other than an outage, such as your application or its underlying host running out of resources, or the logging agent being corrupted; those cases are not covered by the GCP Stackdriver SLA [1].
If the pipeline that ingests data from the Logging API is backlogged, it could cause an outage, but the GCP team will do their best to make the data accessible after the outage ends.
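As an illustration of tolerating those 500 responses, here is a minimal sketch that degrades to local logging when a Cloud Logging write fails (the logger name is a placeholder; whether you buffer, drop, or spool to disk on failure is your design choice):

```python
# Sketch: fall back to a local logger if the Cloud Logging API call
# fails, so an API outage does not crash the application or silently
# lose entries. "my-app-log" is a placeholder logger name.
import logging

import google.cloud.logging
from google.api_core import exceptions

local = logging.getLogger("cloud-logging-fallback")
client = google.cloud.logging.Client()
cloud_logger = client.logger("my-app-log")

def log_safely(message: str) -> None:
    try:
        cloud_logger.log_text(message)
    except exceptions.GoogleAPICallError:
        # API returned an error (e.g. a 5xx during an outage):
        # keep a local copy instead of raising.
        local.warning("cloud logging unavailable, local copy: %s", message)
```

Note that the library's background-thread transports already buffer entries in memory, so the main design question is what you do when that buffer can't be flushed for the full outage window.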
If you suspect the Logging API is malfunctioning, please contact support, file an issue in the issue tracker, or inspect the open issues, where Google's product team provides live updates. Links below:
[1] SLA exclusions: https://cloud.google.com/stackdriver/sla#sla_exclusions
[2] Create a new incident: https://issuetracker.google.com/issues/new?component=187203&template=0
[3] Open issues: https://issuetracker.google.com/savedsearches/559764

WSO2 Throttling API

I have read the various questions involving throttling on Stack Overflow, but I didn't find anyone with an issue similar to mine. I have gone through the tutorials and the setup process on the WSO2 site regarding throttling.
This is what I have done:
- Set up an additional tier to allow 5 calls per minute at the following levels (Advanced Throttling, Application Throttling, Subscription Throttling).
- Edited the API and set the subscription tier to the new custom tier.
- Set the Application to the new tier level.
- Set the Advanced Throttling Policy to apply to the API, then saved & published.
- Ran 1100 HTTP requests from an application that calls the API on an interval every second. Every request was processed successfully without any throttling.
- Installed version 1.9 of API Manager and set up the very same rules; the requests were throttled correctly.
Any help would be greatly appreciated; I'm not really sure if it is a bug or a configuration issue on my end.
Regards
After much digging in the WSO2 documentation, I found that in order to use the advanced throttling techniques (which are enabled by default) you must use the Traffic Manager (which is disabled by default).
There are instructions on how to use the Traffic Manager in the WSO2 documentation. If advanced throttling is disabled, basic throttling works as expected; see the config sketch below.
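For reference, a sketch of roughly where that toggle lives in API-M 2.x-era releases (repository/conf/api-manager.xml); verify the element names against your version's documentation, as they have moved between releases:

```xml
<!-- Sketch for an API-M 2.x-era api-manager.xml. With this set to
     false, the gateway falls back to basic throttling, which works
     without a running Traffic Manager. -->
<ThrottlingConfigurations>
    <EnableAdvanceThrottling>false</EnableAdvanceThrottling>
</ThrottlingConfigurations>
```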
It took some time to discover this, as the documentation doesn't make the distinction very clear.
I hope this helps someone having a similar issue.

How to relieve a rate-limited API?

We run a website which relies heavily on the Amazon Product Advertising API (APAA). When we experience a sudden spike in users, we hit the rate limit and all functions relying on the APAA shut down for a while. What can we do so that doesn't happen?
Obviously we have some basic caching in place, but the APAA doesn't allow us to cache data for very long, and APAA queries vary a lot, so there may not be any cached data to serve at all.
I think your only option is to retry the API calls until they work, but to do so in a smart way. Unfortunately, that's what everybody who gets throttled does, and AWS expects people to handle it themselves.
You can implement exponential backoff and add jitter to prevent clustered retries. AWS has a great blog post about solutions to this kind of problem: https://www.awsarchitectureblog.com/2015/03/backoff.html
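For illustration, a minimal sketch of that pattern: capped exponential backoff with full jitter, as described in the linked post. ThrottledError is a stand-in for however your APAA client surfaces a rate-limit response:

```python
# Sketch: capped exponential backoff with full jitter.
import random
import time

class ThrottledError(Exception):
    """Placeholder for the client's rate-limit exception."""

def call_with_backoff(fn, max_retries=6, base=0.5, cap=30.0):
    for attempt in range(max_retries):
        try:
            return fn()
        except ThrottledError:
            # Full jitter: sleep a random amount up to the capped
            # exponential bound, so retries from many clients spread out
            # instead of hammering the API in lockstep.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    return fn()  # final attempt; let any exception propagate

# Usage: call_with_backoff(lambda: apaa_lookup(query))
```

Full jitter (random in [0, bound]) rather than a fixed exponential delay is what prevents the synchronized retry spikes that make throttling worse.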