Total instances: I have created an EMR cluster with 11 nodes total (1 master instance, 10 core instances).
Job submission: spark-submit myApplication.py
Graph of containers: Next, I've got these graphs, which refer to "containers", and I'm not entirely sure what containers are in the context of EMR, so it isn't obvious what they're telling me:
Actual running executors: Then I've got this in my Spark history UI, which shows that only 4 executors were ever created.
Dynamic allocation: I also have spark.dynamicAllocation.enabled=True, and I can see that in my environment details.
Executor memory: Also, the default executor memory is 5120M.
Executors: Finally, I've got my Executors tab, showing what looks like 3 active and 1 dead executor:
So, at face value, it appears to me that I'm not using all my nodes or the available memory.
How do I know if I'm using all the resources I have available?
If I'm not, how do I change what I'm doing so that the available resources are used to their full potential?
Another way to see how many resources each node of the cluster is using is the Ganglia web tool.
This is published on the master node and shows a graph of each node's resource usage. The catch is that Ganglia must have been enabled as one of the tools on the EMR cluster at the time of cluster creation.
Once it is enabled, though, you can go to the web page and see how heavily each node is being utilized.
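If you create the cluster from the AWS CLI instead of the console, one way to make sure Ganglia is included is to list it under --applications at creation time. A rough sketch (the release label, instance type, and count below are placeholders, not a recommendation):

aws emr create-cluster \
  --name "my-cluster" \
  --release-label emr-5.29.0 \
  --applications Name=Spark Name=Ganglia \
  --instance-type m5.xlarge \
  --instance-count 11 \
  --use-default-roles

Once the cluster is up, the Ganglia UI is served from the master node and reachable through the usual SSH tunnel / proxy setup for the EMR web interfaces.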
Given that AWS EMR provides you with its optimized Spark experience:
If I am planning to use only S3 / EMRFS for both direct reading and direct writing, and not to use s3DistCP,
why do I need at least 1 Core Node?
My suspicion is that at least 1 Core Node is needed to get around the old issue of Spark shuffle files (under YARN dynamic resource allocation) being lost when Core Nodes could be deallocated during scaling.
According to AWS personnel:
Core nodes host the EMRFS/HDFS daemon. So you need at least 1 Core node to talk to S3 using EMRFS.
I had gathered that myself, but to add to it: my suspicion is that at least 1 Core Node is also needed to get around the issue of Spark shuffle files (under YARN dynamic resource allocation for Spark) being lost, as happened in the past when Core Nodes could be deallocated during scaling. Core Nodes cannot be deallocated after auto-scaling or initial allocation.
That said, I note that a lot has gone into EMR Spark resilience over the past couple of years: https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-spark-applications-using-amazon-ec2-spot-instances-with-amazon-emr/
I'm trying to run a 100-node AWS Batch job. When I set my compute environment to use only m4.xlarge and m5.xlarge instances, everything works fine and my job is picked up and runs.
However, when I include other instance types in my compute environment, such as m5.2xlarge, the job is stuck in the RUNNABLE state indefinitely. The only variable I am changing in these updates is the instance types in the compute environment.
I'm not sure what is causing the job not to be picked up when I include other instance types in the compute environment. In the documentation for Compute Environment Parameters, the only note is:
When you create a compute environment, the instance types that you select for the compute environment must share the same architecture. For example, you can't mix x86 and ARM instances in the same compute environment.
The JobDefinition is multi-node:
Node 0:
vCPUs: 1
Memory: 15360 MiB
Node 1:
vCPUs: 2
Memory: 15360 MiB
My compute environment's max vCPUs is set to 10,000, and it is always in a VALID state and always ENABLED. My EC2 vCPU limit is 6,000. CloudWatch provides no logs because the job has not started, and I'm not sure what else to try here. I am also not using the optimal setting for instance types, because I ran into issues with not getting enough instances.
I just resolved this issue: the problem is with the BEST_FIT strategy in Batch. The jobs that I'm submitting are not a close enough fit to the instance types, so they never get picked up.
I figured this out by modifying the job definition to use 8 vCPU and 30GB of memory and the job began with the m5.2xlarge instances.
I'm going to see if using the BEST_FIT_PROGRESSIVE strategy will resolve this issue and report back, although I doubt it will.
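For illustration, the sizing change amounts to something like the following simplified, single-node sketch of a job definition (the image, names, and command are placeholders; a multi-node definition would carry the same container sizing inside its nodeRangeProperties):

aws batch register-job-definition \
  --job-definition-name my-batch-job \
  --type container \
  --container-properties '{
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    "vcpus": 8,
    "memory": 30720,
    "command": ["python", "run.py"]
  }'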
--
Update: I have spoken with AWS Support and got some more insight. The BEST_FIT_PROGRESSIVE allocation strategy has built-in protections against over-scaling so that customers do not accidentally launch thousands of instances. This has the side effect I am experiencing, where jobs fail to start.
The support engineer's recommendation was to use a single instance type per compute environment together with the BEST_FIT allocation strategy. Since my jobs have different instance requirements, I was able to successfully create three separate compute environments targeting different instance types (c5.large, c5.xlarge, m4.xlarge), submit jobs, and have them run in the appropriate compute environment.
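As a rough sketch (the subnet, security group, and roles below are placeholders), each of those compute environments can be created along these lines, changing only the name and instanceTypes:

aws batch create-compute-environment \
  --compute-environment-name c5-xlarge-env \
  --type MANAGED \
  --state ENABLED \
  --service-role arn:aws:iam::123456789012:role/AWSBatchServiceRole \
  --compute-resources type=EC2,allocationStrategy=BEST_FIT,minvCpus=0,maxvCpus=10000,instanceTypes=c5.xlarge,subnets=subnet-0abc,securityGroupIds=sg-0abc,instanceRole=ecsInstanceRole

Each compute environment is then attached to its own job queue, and jobs get submitted to the queue whose environment matches their resource requirements.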
I've set up Hadoop on my laptop, and when I submit a job (through either MapReduce or Tez), the status is always ACCEPTED, but the progress stays stuck at 0% and the description says something like "waiting for AM container to be allocated".
When I check the nodes through the YARN UI (localhost:8088), it shows 0 active nodes.
But the HDFS UI (localhost:50070) shows that there is one live node.
Is the job stuck mainly because there is no available node? If that's the case, what should I do?
Your YARN UI shows that you have zero vcores and zero memory, so there is no way for any job to ever run: you have no compute resources. The DataNode is only for storage (HDFS in this case) and has no bearing on why your application is stuck.
To fix your problem, you need to update your yarn-site.xml and provide settings for the memory and vcore properties described in the following:
http://blog.cloudera.com/blog/2015/10/untangling-apache-hadoop-yarn-part-2/
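As a rough sketch for a laptop with, say, 8 GB of RAM and 4 cores, the relevant properties inside the <configuration> element of yarn-site.xml might look something like this (the values are assumptions; size them to your machine and leave headroom for the OS):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>

After restarting YARN (stop-yarn.sh, then start-yarn.sh), the ResourceManager UI at localhost:8088 should report non-zero memory and vcores, and the AM container should then get allocated.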
You might consider using a Cloudera QuickStart VM or Hortonworks Sandbox (at least as a reference for configuration values for the yarn-site.xml).
https://www.cloudera.com/downloads/quickstart_vms/5-10.html
https://hortonworks.com/products/sandbox/
I am really hoping to use Presto in an ETL pipeline on AWS EMR, but I am having trouble configuring it to fully utilize the cluster's resources. This cluster would exist solely for this one query, and nothing more, then die. Thus, I would like to claim the maximum available memory for each node and the one query by increasing query.max-memory-per-node and query.max-memory. I can do this when I'm configuring the cluster by adding these settings in the "Edit software settings" box of the cluster creation view in the AWS console. But the Presto server doesn't start, reporting in the server.log file an IllegalArgumentException, saying that max-memory-per-node exceeds the useable heap space (which, by default, is far too small for my instance type and use case).
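For reference, what I'm putting into the "Edit software settings" box is JSON roughly like the following (the sizes here are just placeholders for my instance type):

[
  {
    "Classification": "presto-config",
    "Properties": {
      "query.max-memory-per-node": "20GB",
      "query.max-memory": "200GB"
    }
  }
]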
I have tried to use the session setting set session resource_overcommit=true, but that only seems to override query.max-memory, not query.max-memory-per-node, because in the Presto UI, I see that very little of the available memory on each node is being used for the query.
Through Google, I've been led to believe that I need to also increase the JVM heap size by changing the -Xmx and -Xms properties in /etc/presto/conf/jvm.config, but it says here (http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html) that it is not possible to alter the JVM settings in the cluster creation phase.
To change these properties after the EMR cluster is active and the Presto server has been started, do I really have to manually ssh into each node and alter jvm.config and config.properties, and restart the Presto server? While I realize it'd be possible to manually install Presto with a custom configuration on an EMR cluster through a bootstrap script or something, this would really be a deal-breaker.
Is there something I'm missing here? Is there not an easier way to make Presto allocate all of a cluster to one query?
As advertised, increasing query.max-memory-per-node, and also by necessity the -Xmx property, indeed cannot be achieved on EMR until after Presto has already started with the default options. To increase these, the jvm.config and config.properties found in /etc/presto/conf/ have to be changed, and the Presto server restarted on each node (core and coordinator).
One can do this with a bootstrap script using commands like
sudo sed -i "s/query.max-memory-per-node=.*GB/query.max-memory-per-node=20GB/g" /etc/presto/conf/config.properties
sudo restart presto-server
and similarly for /etc/presto/conf/jvm.config. The only caveats are that the bootstrap action's logic needs to execute only after Presto has been installed, and that the server on the coordinating node needs to be restarted last (and possibly with different settings, if the master node's instance type differs from the core nodes').
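One way to express that "run only after Presto has been installed" logic is to have the bootstrap action background a small script along these lines. This is only a sketch: the 60-second polling interval, the 20GB value, and the -Xmx24G value are assumptions to be sized to your instance type.

#!/bin/bash
# Backgrounded from the bootstrap action: wait until EMR has installed Presto
# (using the presence of its config file as a simple signal), then rewrite the
# configs and restart the server.
(
while [ ! -f /etc/presto/conf/config.properties ]; do
  sleep 60
done
sudo sed -i "s/query.max-memory-per-node=.*GB/query.max-memory-per-node=20GB/g" /etc/presto/conf/config.properties
sudo sed -i "s/-Xmx[0-9]*G/-Xmx24G/g" /etc/presto/conf/jvm.config
sudo restart presto-server
) &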
You might also need to change resources.reserved-system-memory from the default by specifying a value for it in config.properties. By default, this value is .4*(Xmx value), which is how much memory is claimed by Presto for the system pool. In my case, I was able to safely decrease this value and give more memory to each node for executing the query.
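If the property is not already present in config.properties, it can be appended in the same pass, for example (the 10GB figure is only an illustration; choose it relative to your -Xmx):

echo "resources.reserved-system-memory=10GB" | sudo tee -a /etc/presto/conf/config.properties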
As a matter of fact, there are configuration classifications available for Presto in EMR. Note, however, that these may vary depending on the EMR release version. For a complete list of the available configuration classifications per release version, see [1] (make sure to switch between the tabs according to your desired release version). Specifically regarding jvm.config properties, you will see in [2] that these are not currently configurable via configuration classifications. That said, you can always edit the jvm.config file manually per your needs.
[1] Amazon EMR 5.x Release Versions
[2] Considerations with Presto on Amazon EMR - Some Presto Deployment Properties not Configurable
When you pick a more performant node, say an r3.xlarge vs an m3.xlarge, will Spark automatically utilize the additional resources? Or is this something you need to manually configure and tune?
As far as configuration goes, which are the most important configuration values to tune to get the most out of your cluster?
It will try...
AWS has a setting you can enable in your EMR cluster configuration that will attempt to do this: spark.dynamicAllocation.enabled. In the past there were issues with this setting where it would give too many resources to Spark; in newer releases they have lowered the amount they give to Spark. However, if you are using PySpark, it will not take Python's resource requirements into account.
I typically disable dynamicAllocation and set the appropriate memory and cores settings dynamically from my own code based upon what instance type is selected.
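For example, with dynamic allocation disabled, the sizing can be passed straight to spark-submit; the application name and all of the numbers below are placeholders, not a recommendation:

spark-submit \
  --conf spark.dynamicAllocation.enabled=false \
  --conf spark.executor.instances=10 \
  --conf spark.executor.memory=10g \
  --conf spark.executor.cores=3 \
  my_job.py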
This page discusses what defaults they will select for you:
http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-spark-configure.html
If you do it manually, at a minimum you will want to set:
spark.executor.memory
spark.executor.cores
Also, you may need to adjust the YARN container size limits with:
yarn.scheduler.maximum-allocation-mb
yarn.scheduler.minimum-allocation-mb
yarn.nodemanager.resource.memory-mb
Make sure you leave a core and some RAM for the OS, and some RAM for Python if you are using PySpark.
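On EMR, these can also be set at cluster creation time through configuration classifications instead of hand-editing files on each node. A rough sketch, with placeholder values that would need to be sized to your instance type:

[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executor.memory": "10g",
      "spark.executor.cores": "3"
    }
  },
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.resource.memory-mb": "12288",
      "yarn.scheduler.maximum-allocation-mb": "12288",
      "yarn.scheduler.minimum-allocation-mb": "1024"
    }
  }
]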