Mapping of Google Cloud VMs to physical machines - google-cloud-platform

I am using Google Cloud to run a few experiments. Now, when I create a VM instance of say 4 VCPUs, what is the mapping of those 4 VCPUs to the actual physical machine? Also, what does 4 VCPUs actually entail? Am I getting a machine that has say, 4 processors? Or do I get 4 nodes on a machine that has say, 8 processors? If the latter is the case, doesn't the utilization of the remaining 4 nodes affect the performance of my job?
In the Google Cloud documentation, they say that "For the n1 series of machine types, a virtual CPU is implemented as a single hardware hyper-thread." The thing is, I'm not exactly sure what a single hardware hyper-thread means. An interesting fact is that I ran cat /proc/cpuinfo on an 8 VCPU instance that I had reserved, and it had a field called "cpu cores" whose value was 4. Again, what does that indicate?
I would like to understand the underlying hardware below the VM instances as it would help me in optimizing jobs that have multithreading enabled.
Any help will be appreciated. Thanks.

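When cat /proc/cpuinfo lists 8 processors, as in your example, it means the system has access to 8 hardware threads. The "cpu cores" value is 4 because that is the number of physical cores: with hyper-threading, each physical core exposes two threads, and each of those threads is presented to the VM as one vCPU.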
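As a rough illustration of the counting described above, the sketch below tallies logical processors and physical cores from /proc/cpuinfo-style text. The function name summarize_cpuinfo and the SAMPLE text are made up for this example; the field names ("processor", "physical id", "cpu cores") are the ones Linux prints on x86, and other architectures may omit some of them.

```python
def summarize_cpuinfo(text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores_per_socket = {}  # "physical id" value -> "cpu cores" value
    current_socket = "0"   # assumed default when "physical id" is missing
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1            # one entry per hardware thread
            current_socket = "0"
        elif key == "physical id":
            current_socket = value  # which socket this thread belongs to
        elif key == "cpu cores":
            cores_per_socket[current_socket] = int(value)
    return logical, sum(cores_per_socket.values())


# Hypothetical sample mimicking an 8 vCPU instance: 8 threads on 4 cores.
SAMPLE = "\n\n".join(
    f"processor : {i}\nphysical id : 0\nsiblings : 8\ncore id : {i // 2}\ncpu cores : 4"
    for i in range(8)
)

logical, physical = summarize_cpuinfo(SAMPLE)
print(f"logical processors: {logical}, physical cores: {physical}")
```

On a real Linux instance, passing the contents of /proc/cpuinfo to the same function should give the numbers discussed in the question, provided the kernel reports those fields.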
About your question of "optimizing jobs that have multithreading enabled", the only difference is that you're accessing the vCPUs through a hypervisor rather than directly as hyper-threads on an Intel processor. In fact, multi-threading strategies for applications shouldn't really be any different just because of the hypervisor layer.
You can also read the discussion here which is about the relations among virtual CPU, hyper-thread, and physical core.

Related

Will a T2D VM be ~twice as fast as N2D for the same price?

I have been looking over the new GCP price lists and I'm somewhat confused about the T2D VMs. The documentation states that these are running with hyperthreading disabled, at one physical core per vCPU. However, pricing per vCPU stays the same, which would make more sense if you were getting half the threads.
So is the following correct?
N2D 4 vCPU: 2 cores+HT for ~118€/mo (n2d-standard-4)
T2D 4 vCPU: 4 plain cores for ~118€/mo (t2d-standard-4)
If so, that should be a nearly 2x speed boost for scalable compute workloads.
Testing this with two 4 vCPU/16 GB instances in Geekbench:
N2D: 2247 multithreaded
T2D: 4424 multithreaded
So it looks like, yes, you do get the same amount of full cores on T2D that you get threads on N2D for the same base price, and single core performance is similar at least in this superficial benchmark.
First, T2D cores are always Milan. For N2D you have to remember to specify the architecture, otherwise you get the much slower Rome cores.
For each vCPU on N2D you get a thread, where each core has 2 threads. For T2D you get a whole core per vCPU, as HT is disabled. For single-threaded loads the speed will be identical. For multi-threaded loads it depends on the specific workload, but you can get up to something like 70% more performance on T2D.
T2D is priced at a premium over N2D for regular instances from what I see, but the price is indeed the same for 1-year/3-year committed or spot instances. So you get more for the same price.
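For reference, the ratio implied by the Geekbench figures quoted in the question can be worked out as below. This is only arithmetic on the numbers in the post (2247, 4424, and roughly 118€/mo); the helper name relative_speedup is made up for the example, and a single synthetic benchmark says little about any specific workload.

```python
def relative_speedup(baseline_score, candidate_score):
    """Ratio of two benchmark scores (higher means the candidate is faster)."""
    return candidate_score / baseline_score


# Figures taken from the question; the price is the approximate value quoted there.
n2d_score = 2247
t2d_score = 4424
monthly_price_eur = 118

speedup = relative_speedup(n2d_score, t2d_score)
print(f"T2D vs N2D multithreaded score: {speedup:.2f}x")
print(f"N2D score per euro: {n2d_score / monthly_price_eur:.1f}")
print(f"T2D score per euro: {t2d_score / monthly_price_eur:.1f}")
```

On these particular scores the ratio comes out close to 2x, which is higher than the "up to 70%" estimate in the answer, consistent with the answer's point that the gain depends on the workload.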

GCloud N2 machines: 128 vCPUs in 1 chip? [closed]

I saw that GCloud offers N2 instances with up to 128 vCPUs. I wonder what kind of hardware that is. Do they really put 128 cores into 1 chip? If so, Intel doesn't make them generally available for sale to the public, right? If they use several chips, how do they split the cores? Also, I assume that all cores are on the same node, do they place more than 2 CPU chips on that node or do they have chips with 56 cores (which also is a lot)?
Thanks!
You can easily build or purchase a system with 128 vCPUs. Duplicating Google's custom hardware and firmware is another matter. 128 vCPUs is not large today.
Google Cloud publishes the processor families: CPU platforms
The Intel Ice Lake Xeon motherboards support multiple processor chips.
With a two-processor motherboard using the 40 core model (8380), 160 vCPUs are supported.
For your example, Google is using 32-core CPUs.
Note: one physical core provides two vCPUs (one per hyper-thread).
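I am not sure what Google is using for n2d-standard-224, which supports 224 vCPUs. N2D runs on AMD EPYC processors rather than Intel Ice Lake, so that would be a multi-socket EPYC system exposing 112 physical cores as 224 hardware threads.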
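The vCPU counts above follow from multiplying sockets, cores per socket, and threads per core. A minimal sketch of that arithmetic, assuming two hyper-threads per core and a two-socket board as described in the answer (the function name vcpus is made up for the example):

```python
def vcpus(sockets, cores_per_socket, threads_per_core=2):
    """Hardware threads (vCPUs) exposed by a multi-socket server."""
    return sockets * cores_per_socket * threads_per_core


# Two 40-core processors (the 8380 example from the answer).
print(vcpus(sockets=2, cores_per_socket=40))  # 160

# Two 32-core processors, matching the 128 vCPU N2 shape in the question.
print(vcpus(sockets=2, cores_per_socket=32))  # 128
```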
"GCloud N2 machines: 128 vCPUs in 1 chip?"
Currently, the only processors that support 64 cores (128 vCPUs) that I am aware of are ARM processors from Ampere. That means Google is using two or more processor chips on a multi-CPU motherboard.
"If so, Intel doesn't make them generally available for sale to the public, right?"
You can buy just about any processor on Amazon, for example.
"If they use several chips, how do they split the cores? Also, I assume that all cores are on the same node, do they place more than 2 CPU chips on that node or do they have chips with 56 cores (which also is a lot)?"
You are thinking in terms of laptop and desktop technology. Enterprise rack mounted servers typically support two or more processor chips. This has been the norm for a long time (decades).

How do I compare a 200Mhz Cloud Function and a 2 vCPU Compute Engine Instance?

How do I compare a 200Mhz Cloud Function and a 2 vCPU Compute Engine instance? I can't seem to find information on how many MHz a single vCPU stands for. How else can I compare these values?
You can find here the CPUs used by Google Cloud in each region. However, you don't know the type of CPU that runs Cloud Functions, and it also depends on the region of deployment.
"200Mhz" on its own means very little. Performance depends on the CPU generation, the optimisations built into it, and the computation you perform on it (whether you leverage those optimisations or not). A strict comparison is almost impossible.
Also, having 2 CPUs only helps if your process can leverage multi-CPU computation, which is not the case for all apps (Python in its basic form, for example).
The real question is: is it enough for your workload?
I wrote an article comparing Cloud Run and Compute Engine, if it interests you.
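Since a clock figure alone does not allow a strict comparison, one practical option is to time the same piece of work in both environments. The sketch below is only an illustration: cpu_bound_task is a made-up stand-in for your real workload, and best_time is a hypothetical helper built on the standard time.perf_counter.

```python
import time


def cpu_bound_task(n=200_000):
    """Placeholder workload: sum of squares below n (swap in your real computation)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def best_time(task, repeats=5):
    """Run task several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return min(timings)


print(f"best of 5 runs: {best_time(cpu_bound_task):.4f} s")
```

Running the same script inside a Cloud Function and on the Compute Engine instance gives two numbers that can be compared directly for that workload. Note that this single-threaded loop will not benefit from the second vCPU.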

Virtualization: Can I build a 4-core virtual CPU from many physical processors?

I use mechanical simulation software that takes 5-10 hours to solve one simulation. My software licence is limited to 4 cores.
Specs of the machine that currently runs the software:
Windows 7 Pro
1x Xeon E5-2650 v2 2.60GHz (8-core)
32GB Ram
SSD
I'm trying to find a way to reduce as much as possible the time of simulation.
In virtualization, is it possible to take, for example, 2x physical 8-core 2.50GHz CPUs and make a 4-core vCPU running at 10GHz per core? Each virtual core would use 4 physical cores in this example. Is this possible?
Any suggestions?
Is it possible to do this with an Amazon EC2 server?
Thank you!
Amazon EC2 is available in a variety of Instance Types. Each type has a set number of virtual CPUs, RAM, etc. Each vCPU is a hyperthread of an Intel Xeon core, so you might want to check your licensing to confirm how it defines a 'core'.
By choosing Instance Type, you can control how many CPUs you get and you will know the type of processor being used. However, you cannot combine CPUs to make a faster CPU.

Detecting CPU and Core information from my Intel System

I am currently using Windows 8 Pro, with an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz processor and 8 GB of RAM.
I wanted to know how many physical processors and how many actual cores my system has. With my very basic understanding of hardware and this discussion here, when I look up this processor on the Intel site here, it says:
# of Cores 4
# of Threads 8
In the Task Manager of my System for CPU, it says:
Maximum Speed: 3.60 GHz
Sockets: 1
Cores: 4
Physical processors: 8
Am I correct in assuming that I have 1 physical processor with 4 actual physical cores, and that each physical core has 2 virtual cores (= 2 threads)? As such, the total number of physical processors is 8, as shown in my Task Manager. But if my assumption is correct, why does it say physical processors = 8 and not virtual processors?
I need to know the core details of my machine as I need to write Low Latency programs, using maybe OpenMP.
Thanks for your time...
From the perspective of your operating system, even Hyper-Threaded processors are "real" processors: they exist in the CPU. They use real, physical resources like instruction decoders and ALUs. Just because those resources are shared between the hyper-threads of a core doesn't mean they're not "real".
General computing will see a speedup from Hyper-Threading, because the various threads are doing different kinds of things and can make use of the shared resources at different times. A CPU-intensive task running in parallel may not see as large a gain, however, due to contention for the shared resources. For example, if there's only one ALU, it doesn't help to have two threads competing for it.
Run benchmarks and determine for your application what the appropriate settings are, regarding HT being enabled or not. With a question this broad, we can't give you a definitive answer.