We have an Ambari-managed HDP 2.6.4 cluster with 245 worker machines.
Each worker has a DataNode component and a NodeManager component.
Now we want to add 10 new worker machines to the cluster,
but we want to disable the DataNodes on them so that no HDFS data is transferred from the old DataNodes to the new ones.
I am thinking of setting maintenance mode on the new DataNodes,
but I am not sure whether that alone is enough to disable the DataNodes on the new worker machines.
The goal is to avoid replication of HDFS data from the old DataNodes to the new ones.
I would be happy to get any advice about this.
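What I have in mind is roughly the following, per new worker, via the Ambari REST API (host, cluster name, and credentials are placeholders; as far as I understand, maintenance mode mostly suppresses alerts, so I assume the DATANODE component would also need to stay stopped):

# put the DataNode on the new worker into maintenance mode
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
  -d '{"RequestInfo":{"context":"Maintenance mode for new DataNode"},"Body":{"HostRoles":{"maintenance_state":"ON"}}}' \
  http://AMBARI_HOST:8080/api/v1/clusters/CLUSTER_NAME/hosts/NEW_WORKER/host_components/DATANODE

# keep the DataNode stopped (state INSTALLED) so it never registers with the NameNode
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
  -d '{"RequestInfo":{"context":"Stop DataNode"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' \
  http://AMBARI_HOST:8080/api/v1/clusters/CLUSTER_NAME/hosts/NEW_WORKER/host_components/DATANODE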
Component: Aurora PostgreSQL Serverless v2 (0.5 - 4 ACUs), Multi-AZ deployment
After instance startup, CPU utilization stabilizes at around 55-60% on the writer node only and does not come down. The reader node stabilizes at ~19%.
The only processes running on the database, as checked with pg_stat_activity, are as follows:
RDS Replication
Autovacuum
WAL Process
Checkpoint process
Other internal processes
Number of connections to DB : 1
Database processes running in writer node : 13
Kindly advise on what else can be checked and on the probable cause of the issue.
Tried to kill the autovacuum process
Checked the number of processes from pg_stat_activity
Hi, you could enable and review monitoring stats at the OS level using Aurora Enhanced Monitoring; it will provide process-level details and help you identify which process is using the CPU resources.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_Monitoring.OS.Viewing.html
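As an additional quick check (a sketch only; the endpoint, user, and database below are placeholders), you could also list the non-idle backends and their wait events straight from pg_stat_activity:

psql -h <writer-endpoint> -U <user> -d <database> -c "
  SELECT pid, backend_type, state, wait_event_type, wait_event, query
  FROM pg_stat_activity
  WHERE state IS DISTINCT FROM 'idle'
  ORDER BY backend_type;"

If the CPU is being consumed by background workers rather than client backends, Enhanced Monitoring remains the better tool, since pg_stat_activity does not show the OS-level cost of each process.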
Master node - does this node store HDFS data in an AWS EMR cluster?
Task node - if this node does not store HDFS data, is it a purely computational node? In that case, does Hadoop transfer data to the task node? Doesn't this defeat the data-locality advantage of the computation?
(Other than the edge case of a master-only cluster with no core or task instances...)
The master instance does not store any HDFS data, nor does it act as a computational node. The master instance runs services like the YARN ResourceManager and HDFS NameNode.
The only nodes that store data are those that run HDFS DataNode, which are only the core instances.
The core and task instances both run YARN NodeManager and thus are the "computational nodes".
Regarding your question, "in this case does hadoop transfer to task node", I assume that you are asking whether or not Hadoop transfers (HDFS) data to the task instances so that they may perform computations on HDFS data. In a sense, yes, task instances may read HDFS blocks remotely from core instances where the blocks are stored.
It's true that this means task instances can never take advantage of data locality for HDFS data, but there are many cases where this does not matter anyway, such as tasks that read shuffle data from other nodes, or tasks that are reading data from remote storage anyway (e.g., Amazon S3). Furthermore, depending upon the core instance type being used, keep in mind that even the HDFS blocks might be stored on remote storage (i.e., EBS). That said, even when your task instances are reading data from a remote DataNode or a remote service like S3 or EBS, the overhead might not be noticeable enough for you to need to worry about data locality.
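If you want to verify this on a running cluster (assuming you are on the master instance and the Hadoop CLI is on the PATH), a quick check might look like this:

# report the live DataNodes; on EMR these are the core instances
hdfs dfsadmin -report
# list the NodeManagers; both core and task instances appear here
yarn node -list -all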
I have the Redisson cluster configuration below in a YAML file:
subscriptionConnectionMinimumIdleSize: 1
subscriptionConnectionPoolSize: 50
slaveConnectionMinimumIdleSize: 32
slaveConnectionPoolSize: 64
masterConnectionMinimumIdleSize: 32
masterConnectionPoolSize: 64
readMode: "SLAVE"
subscriptionMode: "SLAVE"
nodeAddresses:
- "redis://X.X.X.X:6379"
- "redis://Y.Y.Y.Y:6379"
- "redis://Z.Z.Z.Z:6379"
I understand it is enough to give the IP address of one master node in the configuration and Redisson automatically identifies all the nodes in the cluster, but my questions are below:
1. Are all nodes identified at application boot and used for future connections?
2. What if one of the master nodes goes down while the application is running? Will the requests to that particular master fail and the Redisson API automatically try contacting the other master nodes, or does it try to connect to the same master node repeatedly and fail?
3. Is it best practice to give DNS names instead of server IPs?
Answering your questions:
1. That's correct, all nodes are identified during the boot process. If you use Config.readMode = MASTER_SLAVE or SLAVE (which is the default), then all nodes will be used. If you use Config.readMode = MASTER, then only master nodes are used.
2. Redisson keeps trying to reach the master node until the moment of the Redis topology update. Until that moment it has no information about the newly elected master node.
3. Cloud services like AWS ElastiCache and Azure Cache provide a single hostname bound to multiple IP addresses.
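For example (the endpoint below is only a placeholder), with such a managed service the node list in the YAML config could be reduced to that single hostname:

nodeAddresses:
- "redis://my-cluster.xxxxxx.clustercfg.use1.cache.amazonaws.com:6379"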
We have been running a production-grade system where we want a secondary NameNode to be started in AWS EMR automatically.
Below is the output of jps, in which the secondary NameNode daemon is not running:
[root@ip-10-2-23-23 ~]# jps
6241 Bootstrap
7041 ResourceManager
10754 RunJar
6818 WebAppProxyServer
10787 SparkSubmit
7619 JobHistoryServer
6922 ApplicationHistoryServer
3661 Main
4877 Main
6318 NameNode
8943 LivyServer
4499 Jps
5908 Bootstrap
4791 Main
10619 StatePusher
9918 HistoryServer
A secondary NameNode is required to do NameNode checkpointing and regular creation of the fsimage. I have not configured any HA for the NameNode.
The command we ran manually to create the fsimage is:
hdfs secondarynamenode -checkpoint
How can a secondary NameNode be started in AWS EMR, or is there any configuration for this?
Hadoop version: Hadoop 2.8.3-amzn-0
AWS EMR doesn't run a secondary NameNode process, so the fsimage won't be created automatically. Running a cron job every hour to create an fsimage solves the problem of excessive disk usage, because creating the fsimage merges the accumulated NameNode metadata (edit logs) into a new, smaller fsimage. Fsimage creation is a costly operation for the NameNode and uses instance resources; if too many edits are pending to be merged, the NameNode may never recover from this tedious process, so it is better to create the fsimage frequently via cron. In a standard Hadoop setup this job is done by a secondary NameNode running on a separate instance, but EMR doesn't have the concept of two masters, so the master node is always a single point of failure.
hdfs secondarynamenode -checkpoint
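A minimal crontab sketch for this (the log path is just a placeholder; run it as a user with HDFS superuser rights, e.g. the hdfs user):

# checkpoint the NameNode metadata once an hour
0 * * * * hdfs secondarynamenode -checkpoint >> /var/log/hdfs-fsimage-checkpoint.log 2>&1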
Another solution to this problem is running EMR with a custom Hadoop distribution like MapR.
We're using Amazon EMR for our Oozie workflows, which contain Spark jobs. In our cluster we have 1 master node and 2 core nodes, and we use a third-party tool to provision task nodes as spot instances.
Autoscaling is set up for the task nodes based on YARN memory usage. We have configured the Application Master (AM) to launch only on core nodes, since task nodes are spot instances that could go down at any time.
Now the problem is that sometimes running jobs occupy the core nodes' memory fully (AM + task executors), which leaves other jobs stuck in the ACCEPTED state waiting for a core node to free up so that their AM can launch.
I'd like to know whether it is possible to restrict only the AMs to the core nodes and the task executors to the task nodes. That way we would be able to run multiple jobs in parallel.
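Something along these lines is what I am hoping is possible (sketch only: the TASK label is hypothetical, since EMR only provides the CORE label by default, and the queue would also need access to the new label in capacity-scheduler.xml):

# add a hypothetical TASK node label and assign it to a task instance (repeat per task node)
yarn rmadmin -addToClusterNodeLabels "TASK(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "<task-node-hostname>=TASK"

# pin the AM to CORE nodes and the executors to TASK nodes via Spark's node-label properties
spark-submit \
  --master yarn --deploy-mode cluster \
  --conf spark.yarn.am.nodeLabelExpression=CORE \
  --conf spark.yarn.executor.nodeLabelExpression=TASK \
  <your-job>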