I'm using the following monitoring query to get the current maximum vCPU utilization (i.e., how many vCPUs are in use).
fetch consumer_quota
| metric 'serviceruntime.googleapis.com/quota/allocation/usage'
| filter
(resource.service == 'compute.googleapis.com') &&
(metric.quota_metric == 'compute.googleapis.com/cpus')
| group_by 60s, [value_usage: max(value.usage)]
It only returns a number when I increase the time value (from 60s to 24000s), and the result is not accurate. What should I change in the query in order to get the current usage of vCPUs? E.g., right now all my instances together are using 392 vCPUs.
Thanks!
I'm new to Dataflow.
I'd like to use the Dataflow streaming template "Pub/Sub Subscription to BigQuery" to transfer some messages, say 10000 per day.
My question is about pricing, since I don't understand how it is computed for streaming mode, with Streaming Engine enabled or not.
I've used the Google Calculator, which asks for the following:
machine type, number of worker nodes used by the job, whether it is a streaming or batch job, number of GB of Persistent Disk (PD), and hours the job runs per month.
Consider the easiest case, since I don't need many resources, i.e.
Machine type: n1-standard-1
Max Workers: 1
Job Type: Streaming
Price: in us-central1
Case 1: Streaming Engine DISABLED
Hours using the vCPU = 730 hours (1 month, always active). Is this always true for streaming mode? Or can there be a case in streaming mode in which the usage is lower?
Persistent Disks: 430 GB HDD, which is the default value.
So I will pay:
(vCPU) 730 x $0.069(cost vCPU/hour) = $50.37
(PD) 730 x $0.000054 x 430 GB = $16.95
(RAM) 730 x $0.003557 x 3.75 GB = $9.74
TOTAL: $77.06, as confirmed by the calculator.
Case 2: Streaming Engine ENABLED.
Hours using the vCPU = 730 hours
Persistent Disks: 30 GB HDD, which is the default value
So I will pay:
(vCPU) 730 x $0.069 (cost vCPU/hour) = $50.37
(PD) 730 x $0.000054 x 30 GB = $1.18
(RAM) 730 x $0.003557 x 3.75 GB = $9.74
TOTAL: $61.29 PLUS the amount of Data Processed (which is extra with Streaming Engine)
Considering messages of 1024 bytes, we have a traffic of 1024 x 10,000 x 30 bytes = 0.307 GB per month, and an extra cost of 0.307 GB x $0.018 = $0.005 (almost zero).
Actually, with this kind of traffic, I would save about $15 by using Streaming Engine.
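As a sanity check, here is the same arithmetic as a small Python snippet (using the rates I quoted above, which may not be the current list prices):

# Recomputation of the two cases above (rates as quoted, us-central1).
HOURS = 730  # one month, always active

# Case 1: Streaming Engine disabled, n1-standard-1, 430 GB HDD PD
case1 = HOURS * 0.069 + HOURS * 0.000054 * 430 + HOURS * 0.003557 * 3.75
# -> 50.37 + 16.95 + 9.74 = ~77.06

# Case 2: Streaming Engine enabled, 30 GB HDD PD, plus data processed
data_gb = 1024 * 10_000 * 30 / 1e9  # ~0.307 GB per month
case2 = (HOURS * 0.069 + HOURS * 0.000054 * 30 + HOURS * 0.003557 * 3.75
         + data_gb * 0.018)
# -> 50.37 + 1.18 + 9.74 + ~0.006 = ~61.30

print(round(case1, 2), round(case2, 2), round(case1 - case2, 2))  # ~77.06, ~61.3, saving ~15.8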
Am I correct? Is there something else to consider or something wrong with my assumptions and my calculations?
Additionally, considering the low amount of data, is Dataflow really fitted for this kind of use? Or should I approach this problem in a different way?
Thank you in advance!
It's not wrong, but it's not perfectly accurate.
In streaming mode, your Dataflow job listens to the Pub/Sub subscription continuously, so it needs to be up full time.
In batch processing, you normally start the batch, it performs its job, and then it stops.
In your comparison, you assume a batch job that runs full time. That's not impossible, but I don't think it fits your use case.
As for streaming versus batch, it all depends on your need for real-time data.
If you want to ingest the data into BigQuery with low latency (within a few seconds) to have real-time data, streaming is the right choice.
If having the data updated only every hour or every day is enough, batch is a more suitable solution.
One last remark: if your task is only to read messages from Pub/Sub and stream them into BigQuery, you can consider coding it yourself on Cloud Run or Cloud Functions. With only 10k messages per day, it will be free!
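A very rough sketch of that option, assuming a 1st-gen Python Cloud Function with a Pub/Sub trigger (the table name and the JSON message format are illustrative assumptions, not from your setup):

import base64
import json

from google.cloud import bigquery

bq = bigquery.Client()
TABLE_ID = "my-project.my_dataset.messages"  # hypothetical destination table

def pubsub_to_bigquery(event, context):
    # Pub/Sub delivers the message payload base64-encoded in event["data"].
    row = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # Streaming insert of a single row; returns a list of per-row errors.
    errors = bq.insert_rows_json(TABLE_ID, [row])
    if errors:
        # Raising makes Pub/Sub retry the delivery.
        raise RuntimeError(f"BigQuery insert failed: {errors}")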
I am trying to schedule tasks on different machines. These machines have dynamic available resources, for example:
machine 1: max capacity 4 cores.
At T=t1 => available CPU = 2 cores;
At T=t2 => available CPU = 1 core;
Each interval has a fixed length (e.g., 1 minute).
So in CPLEX, I have a cumulFunction to sum the resources used on a machine:
cumulFunction cumuls[host in Hosts] =
  sum(job in Jobs) pulse(itvs[job][host], requests[job]);
Now the problem is in the constraint:
forall(host in Hosts) {
  cumuls[host] <= ftoi(available_res_function[host](<<Current Period>>));
}
I can't find a way to get the current period so that I can compare the used resources to the available resources in that specific period.
PS: available_res_function is a stepFunction of the available resources.
Thank you so much for your help.
What you can do is add a set of pulses to your cumul function.
For instance, in the sched_cumul example you could change:
cumulFunction workersUsage =
sum(h in Houses, t in TaskNames) pulse(itvs[h][t],1);
into
cumulFunction workersUsage =
sum(h in Houses, t in TaskNames) pulse(itvs[h][t],1)+pulse(1,40,3);
if you want to express that 3 fewer workers are available between time 1 and time 40.
We are hitting "User-rate limit exceeded" but all the quota graphs are well under the limits unless I'm interpreting them wrong.
I'm aware of the user limit of 250 units per second.
There's also a per 100 seconds limit of 25,000 per user
and 2,000,000 units per 100 seconds across all users.
We are making a request every 20 seconds from a runner that connects to the Gmail API, lists messages, and retrieves them.
There are never more than a few messages in the inbox at a time.
According to the queries-per-100-seconds graph, there are 625 units.
I really don't know where it's getting the 250 per second from.
Some of the other graphs are even more confusing.
ids = @gmail.fetch_all(max: 100, items: :messages) do |token|
  @gmail.list_user_messages('me', max_results: 100, q: query, page_token: token)
end.map(&:id)

puts "Processing #{ids.count} emails for #{watcher[:id]}"

# Callback invoked for each message retrieved in the batch.
on_complete = lambda do |result, err|
  retrieve_complete(result, err, watcher)
end

# Retrieve the message bodies in batched requests.
ids.each_slice(1000) do |ids_array|
  @gmail.batch do |gm|
    ids_array.each { |id| gm.get_user_message('me', id, &on_complete) }
  end
end
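For reference, the rough arithmetic behind my confusion, using only the numbers above:

# The quota numbers mentioned above, for reference.
observed_units_per_100s = 625
per_user_limit_per_100s = 25_000
per_user_limit_per_second = 250

print(observed_units_per_100s / per_user_limit_per_100s)  # 0.025 -> 2.5% of the per-user 100 s quota
print(observed_units_per_100s / 100)                      # 6.25 units/s on average
# Averaged out we are nowhere near 250 units/s, which is why the error is confusing,
# unless the batched calls are somehow being counted as a single burst.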
I am running an Aerospike cluster in Google Cloud. Following the recommendation in this post, I updated to the latest version (3.11.1.1) and re-created all servers. In fact, this change caused my 5 servers to operate at a much lower CPU load (it was around 75% before, now it is around 20%, as shown in the graph below):
Because of this low load, I decided to reduce the cluster size to 4 servers. When I did this, my application started to receive the following error:
All batch queues are full
I found this discussion about the topic, which recommends changing the parameters batch-index-threads and batch-max-unused-buffers with the command
asadm -e "asinfo -v 'set-config:context=service;batch-index-threads=NEW_VALUE'"
I tried many values for batch-index-threads (2, 4, 8, 16) and also changed the batch-max-unused-buffers param, but none of these combinations solved the problem. I keep receiving the All batch queues are full error.
Here is the relevant information from my aerospike.conf:
service {
    user root
    group root
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    paxos-recovery-policy auto-reset-master
    pidfile /var/run/aerospike/asd.pid
    service-threads 32
    transaction-queues 32
    transaction-threads-per-queue 4
    batch-index-threads 40
    proto-fd-max 15000
    batch-max-requests 30000
    replication-fire-and-forget true
}
I use 300GB SSD disks on these servers.
A quick note which may or may not pertain to you:
A common mistake we have seen in the past is that developers decide to use 'batch get' as a general-purpose 'get' for both single- and multiple-record requests. A single-record get will perform better for single-record requests.
It's possible that you are being constrained by the network between the clients and servers. Reducing from 5 to 4 nodes reduced the aggregate pipe. In addition, removing a node starts cluster migrations, which add additional network load.
I would look at the batch-max-buffer-per-queue config parameter.
Maximum number of 128KB response buffers allowed in each batch index
queue. If all batch index queues are full, new batch requests are
rejected.
In conjunction with raising this value from the default of 255, you will also want to raise batch-max-unused-buffers to at least batch-index-threads x batch-max-buffer-per-queue + 1. If you do not, new buffers will be created and destroyed constantly, because the number of free (unused) buffers is smaller than the number in use: the moment a batch response is served, the system will try to trim the buffers back down to the max unused number. You will see this reflected in the batch_index_created_buffers metric constantly rising.
Be aware that you need to have enough DRAM for this. For example if you raise the batch-max-buffer-per-queue to 320 you will consume
40 (`batch-index-threads`) x 320 (`batch-max-buffer-per-queue`) x 128K = 1600MB
For the sake of performance, batch-max-unused-buffers should be set to 13000, which will have a max memory consumption of 1625MB (1.59GB) per node.
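To make the buffer sizing explicit, here is the same arithmetic written out in Python (values taken from the example above):

# 128 KB response buffers, expressed in MB.
BUFFER_MB = 128 / 1024

batch_index_threads = 40
batch_max_buffer_per_queue = 320
print(batch_index_threads * batch_max_buffer_per_queue * BUFFER_MB)  # 1600.0 MB when all queues are full

# batch-max-unused-buffers >= 40 * 320 + 1 = 12801, rounded up to 13000 for headroom.
batch_max_unused_buffers = 13000
print(batch_max_unused_buffers * BUFFER_MB)  # 1625.0 MB (~1.59 GB) of buffers kept per node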
I am trying to determine how many nodes I need for my EMR cluster. As part of best practices the recommendations are:
(Total Mappers needed for your job + Time taken to process) / (per instance capacity + desired time) as outlined here: http://www.slideshare.net/AmazonWebServices/amazon-elastic-mapreduce-deep-dive-and-best-practices-bdt404-aws-reinvent-2013, page 89.
The question is how to determine how many parallel mappers an instance will support, since AWS doesn't publish this: https://aws.amazon.com/emr/pricing/
Sorry if I missed something obvious.
Wayne
To determine the number of parallel mappers, you will need to check the EMR documentation called Task Configuration, where EMR has a predefined set of configurations for every instance type that determines the number of mappers/reducers.
http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-task-config.html
For example: let's say you have 5 m1.xlarge core nodes. According to the default mapred-site.xml configuration values for that instance type from the EMR docs, we have:
mapreduce.map.memory.mb = 768
yarn.nodemanager.resource.memory-mb = 12288
yarn.scheduler.maximum-allocation-mb = 12288 (same as above)
You can simply divide the latter by the former to get the maximum number of mappers supported by one m1.xlarge node: 12288 / 768 = 16.
So, for the 5-node cluster, a maximum of 16 * 5 = 80 mappers can run in parallel (considering a map-only job). The same is the case for the maximum number of parallel reducers (30). You can do similar math for a combination of mappers and reducers.
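If it helps, the same calculation in a few lines of Python (numbers are the m1.xlarge defaults quoted above):

# Default EMR settings for m1.xlarge, from the Task Configuration docs.
yarn_nodemanager_resource_memory_mb = 12288
mapreduce_map_memory_mb = 768

mappers_per_node = yarn_nodemanager_resource_memory_mb // mapreduce_map_memory_mb
print(mappers_per_node)  # 16 mappers per node

core_nodes = 5
print(mappers_per_node * core_nodes)  # 80 mappers in parallel for a map-only job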
So, if you want to run more mappers in parallel, you can either resize the cluster or reduce mapreduce.map.memory.mb (and its heap, mapreduce.map.java.opts) on every node and restart the NodeManager for the change to take effect.
To understand what the above mapred-site.xml properties mean and why you need to do these calculations, you can refer to:
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Note: The above calculations and statements hold if EMR stays in its default configuration, using the YARN capacity scheduler with the DefaultResourceCalculator. If, for example, you configure your capacity scheduler to use the DominantResourceCalculator, it will consider vCPUs + memory on every node (not just memory) to decide on the number of parallel mappers.