GCP Functions Node.js Huge Latency - google-cloud-platform

We are running simple GCP Functions (pure, no Firebase, or any other layer added) that just handle HTTP requests using Node.js engine (previously version 8, now 10) and return some "simple JSON response". What we see is that sometimes (but not rarely) there is a huge latency when the request is "accepted by GCP" and before it gets to our function code. If I say huge I'm not speaking ms but units of seconds! And it is not a cold start (we have separate log messages on the global scope so we know when cold start occurs). Functions have currently 256 or 512 mb and run in close region.
We log at the very first line of the GCP function, for example:
or
Does anyone also experience that? And is that normal that sometimes this delay may take up to 5s (or rarely even more)?
By the way, sometimes the same thing happens on the output side as well. So if unlucky, it may take up to 10s. Thanks in advance for any reply, no matter if you have or have not similar experience.

All such problems I have seen have been related with cold start or it was not possible to prove that they are not related with code start.
This question could be even to broad to stackoverflow. We do not have any chance to reproduce it without example at least functions and number of the executions, however I will try to answer.
It seems that latency analyzes are done mainly on logs. I think you should try to use "Trace" functionality that is available in GCP (direct link and documentation). This should give you data to be able to track the issue.
Example i have used it on helloworld cloud function and was curl'ing it from bush script. It seems that over few hundreds of invocations there was one execution with latency 10 times greater than usually.
I hope it will help somehow :)!

Related

My chatbot (lex v2) exceeded intent & utterances quotas and i need more intents. [AWS][LEXV2]

I am creating a large chatbot, the problem is that I have already exceeded the limits that amazon has, I need approximately 1800 intents(it is a large project), and the "hard limits" cannot be increased (I already spoke with an amazon agent) , I wanted to know if anyone has experienced this problem and how to solve it (not changing Dialogflow/wattson tools).
I was thinking of creating a "Chatbot Orchestrator" and splitting the chatbot into several parts (experiences) and invoking the corresponding bot and intent.
Any ideas?
A possible solution is use Kendra for search the intent, basically i need to activate the fallback intent and use Kendra in a lambda function for the search of the answer.
in this document, there is an example.
Kendra, as mentioned already, is an alternative.
However, I would suggest you do a deep dive through your intents to see how many pertain to the same context and can be combined and effectively managed through the use of slots and Lambdas to get the right behaviour.
Another approach would be to use separate bots if you have clean divisions between the intents. Note that your costs here could increase quite substantially as you'd need to invoke all the bots, evaluate the confidence scores and then decide which response to return to the client.

Background jobs occur very frequently and eat memory

I'd like to optimize my notification system, so here is how it works now:
Every time some change occurred on application, we're calling background job (Sidekiq) in order to compute some values and then to notify users via email.
This approach worked very well for a while, but suddenly we got memory leak as there were a lot of actions very frequently and we had about 30-50 workers per second so I need to refactor this.
What I would like to do is, instead of running worker immediately, to store it in array and perform bit later.
But I'm afraid that also will cause a problem, but just "delayed" problem.
I'm looking forward to hear more approaches and solutions as well.
Thanks in advance
So I found one very interesting solution:
I'm storing values to Redis directly as key - value, where the value is dataset with data I'd need later for computation. Then I'm using simple cron job, which occurs service which is responsible for reading data from Redis and computing them. I optimized Sidekiq workers to work only when cron is executed, everything works perfectly fine and even much faster then before.
I'm still eager to hear if there is any other approach/solution.
Thanks

Code to run that can help benchmark a new SAS drive?

We just got a new Server at work that will be used only for running SAS code, and I've been asked to run some tests and make sure that it's performing better than our other servers. I'm not an expert at this so I want to make sure I avoid making naive mistakes that don't properly measure the performance of the server. My header looks like this:
options fullstimer;
%LET BenchStartTime = %sysfunc(datetime(),22.);
Which I use as a check for the "real time" report in the log. I have a vague understanding of the difference between "user cpu time" and "system cpu time", but if anyone wants to offer up additional information on that, that would be helpful.
Anyway, the main point of this post is that I want to know if there are any standard benchmark tests that I should be using in order to see if this new server is better than the old ones. Currently I'm using something I found online which is just appending a bunch of copies of sashelp.class (but I think this might be a bad idea because pulling from the C: and loading it into a different drive might be the same across all servers, right? If the C: is the slowest drive, doesn't that become the bottleneck?), and I'm also using code that basically generates a bunch of random data of a fixed size and comparing runtimes. Is this the correct approach? Is there something else I should be doing? How many times should I be running these benchmark tests to make sure that it isn't a fluke?
Thanks for your help!
I would test by doing the things that you normally do. If you run larger merges, then you're basically talking I/O; so just make a very large dataset, write it out, read it in, etc., and perform the same test on the other machine. Perform each test a few times each in a fresh SAS session.
Further, it sounds like you need to make sure the new server can handle multiple concurrent sessions. You can simulate this in part by submitting many connections from one computer by using SAS/CONNECT; that allows you to start multiple concurrent sessions. Then do so, perhaps I would create a script that starts a local SAS session, signs on to the server and rsubmits a job of a normal difficulty that takes maybe 5 to 10 minutes, and then run that script 20 times concurrently (you can script this or just double click a .bat file 20 times). See how it handles it as compared to the other server.
On SAS 9.2 and later you can use the %SASHOME%/SASFoundation/9.x/sasiotest.exe utility. You will find some further Guidance in Support note 51659.
Edit:
A similiar utility can be downloaded for Unix Plattform, details in Support note 51660.

Make my desktop app appears to load/quit faster

I currently have a GUI single-threaded application in C++ and Qt. It takes a good 1 minute to load (read from disk) and ~5 seconds to close (saving settings, finalize connections, ...).
What can I do to make my application appear to be faster?
My first thought was to have a server component of the app that does all the works while the GUI component is only for displaying. The communication is done via socket, pipe or memory map. That seems like an overkill (in term of development effort) since my application is only used by a handful of people.
The first step is to start profiling. Use an actual, low-overhead profiling tool (eg, on Linux, you could use oprofile), not guesswork. What is your app doing in that one minute it takes to start up? Can any of that work be deferred until later, or perhaps skipped entirely?
For example, if you're loading, say, a list of document templates, you could defer that until the user tells you to create a new document. If you're scanning the system for a list of fonts, load a cached list from last startup and use that until you finish updating the font list in a separate thread. These are just examples - use a profiler to figure out where the time's actually going, and then attack the code starting with the largest time figures.
In any case, some of the more effective approaches to keep in mind:
Skip work until needed. If you're doing initialization for some feature that's used infrequently, skip it until that feature is actually used.
Defer work until after startup. You can take care of a lot of things on a separate thread while the UI is responsive. If you are collecting information that changes infrequently but is needed immediately, consider caching the value from a previous run, then updating it in the background.
For your shutdown time, hide your GUI instantly, and then spend those five seconds shutting down in the background. As long as the user doesn't notice the work, it might as well be instantaneous.
You could employ the standard trick of showing something interesting while you load.
Like many games nowadays show a tip or two while they are loading
It looks to me like you're only guessing at where all this time is being burned. "Read from disk" would not be high on my list of candidates. Learn more about what's really going on.
Use a decent profiler.
Profiling is a given, of course.
Most likely, you may find I/O is substantial - reading in your startup files. As bdonlan notes, deferring work is a standard technique. Google 'lazy evaluation'.
You can also consider caching data that does not change. Save a cache in a faster format, such as binary. This is most useful if you happen to have a large static data set read into something like an array.

how to perform profiling for a website?

I currently have a django site, and it's kind of slow, so I want to understand what's going on. How can I profile it so to differentiate between:
effect of the network
effect of the hosting I'm using
effect of the javascript
effect of the server side execution (python code) and sql access.
any other effect I am not considering due to the massive headache I happen to have tonight.
Of course, for some of them I can use firebug, but some effects are correlated (e.g. javascript could appear slow because it's doing slow network access)
Thanks
client side:
check with firebug if/which page components take long to load, and how long the browser needs to render the page after loading is completed. If everything is fast but rendering takes its time, then probably your html/css/js is the problem, otherwise it's server side.
server side (i assume you sit on some unix-alike server):
check the web server with a small static content (a small gif or a little html page), using apache bench (ab, part of the apache webserver package) or httperf, the server should be able to answerat least 100 requests per second (of course this depends heavily on the size of your test content, webserver type, hardware and other stuff, so dont take that 100 to seriously). if that looks good,
test django with ab or httperf on a "static view" (one that doesnt use a database object), if thats slow it's a hint that you need more cpu power. check cpu utilization on the server with top. if thats ok, the problem might be in the way the web server executes the python code
if serving semi-static content is ok, your problem might be the database or IO-bound. Database problems are a wide field, here is some general advice:
check i/o throughput with iostat. if you see lot's of writes then you have get a better disc subsystem, faster raid, SSD hard drives .. or optimize your application to write less.
if its lots of reads, the host might not have enough ram dedicated as file system buffer, or your database queries might not be optimized
if i/o looks ok, then the database might be not be suited for your workload or not correctly configured. logging slow queries and monitoring database activity, locks etc might give you some idea
if you let us know what hardware/software you use i might be able to give more detailed advice
edit/PS: forgot one thing: of course your app might have a bad design and does lots of unnecessary/inefficient things ...
Take a look at the Django debug toolbar - that'll help you with the server side code (e.g. what database queries ran and how long they took); and is generally a great resource for Django development.
The other non-Django specific bits you could profile with yslow.
There are various tools, but problems like this are not hard to find because they are big.
You have a problem, and when you remove it, you will experience a speedup. Suppose that speedup is some factor, like 2x. That means the program is spending 50% of its time waiting for the slow part. What I do is just stop it a few times and see what it's waiting for. In this case, I would see the problem 50% of the times I stop it.
First I would do this on the client side. If I see that the 50% is spent waiting for the server, then I would try stopping it on the server side. Then if I see it is waiting for SQL queries, I could look at those.
What I'm almost certain to find out is that more work is being requested than is actually needed. It is not usually something esoteric like a "hotspot" or an "algorithm". It is usually something dumb, like doing multiple queries when one would have been sufficient, so as to avoid having to write the code to save the result from the first query.
Here's an example.
First things first; make sure you know which pages are slow. You might be surprised. I recommend django_dumpslow.