I am installing Cloudera Manager on an AWS EC2 instance, following the official instructions:
http://www.cloudera.com/documentation/archive/manager/4-x/4-6-0/Cloudera-Manager-Installation-Guide/cmig_install_on_EC2.html
I successfully ran the .bin installer, but when I visit IP:7180, the browser says my access has been denied. Why?
I tried to confirm the status of the CM server with service cloudera-scm-server status. At first it said:
cloudera-scm-server is dead and pid file exists
The log file mentioned "unknown host ip-10-0-0-110", so I added a mapping between ip-10-0-0-110 and the EC2 instance's **public** IP and restarted the scm-server service. It then ran normally, but IP:7180 remained unreachable, saying ERR_CONNECTION_REFUSED. I have uninstalled iptables and disabled my Windows firewall.
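For reference, the mapping I added is a single /etc/hosts entry along these lines (the IP here is just a placeholder for the instance address I used):
# Hypothetical /etc/hosts entry for the internal EC2 hostname
echo "<instance-ip>  ip-10-0-0-110" | sudo tee -a /etc/hosts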
After a few minutes, cloudera-scm-server is dead and pid file exists appeared again...
Using: tail -40 /var/log/cloudera-scm-server/cloudera-scm-server.out
JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000794223000, 319201280, 0) failed; error='Cannot allocate memory' (errno=12)
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (malloc) failed to allocate 319201280 bytes for committing reserved memory.
An error report file with more information is saved as:
/tmp/hs_err_pid5523.log
What type of EC2 instance are you using? The error is pretty descriptive and indicates that CM is unable to allocate memory. Maybe you are using an instance type with too little RAM.
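A quick way to confirm, assuming you can SSH to the instance (a sketch; exact sizing depends on your CM/CDH version, but Cloudera Manager generally wants several GB of free RAM):
# Show total, used, and available memory in MB
free -m
# If the instance is small (e.g. t2.micro/t2.small), move to a larger
# instance type; adding swap only papers over the problem.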
Also - the docs you are referencing are out of date. The latest docs on deploying CDH5 in the cloud can be found here: https://www.cloudera.com/documentation/director/latest/topics/director_get_started_aws.html
These docs also recommend using Cloudera Director, which simplifies much of the deployment and configuration of your cluster.
Related
I have an AWS EKS cluster (Kubernetes version 1.14) which runs a JupyterHub application.
One user's notebook server is returning a 500 error:
500 : Internal Server Error
Redirect loop detected. Notebook has JupyterHub version unknown (likely < 0.8), but the hub expects 0.9.6. Try installing JupyterHub==0.9.6 in the user environment if you continue to have problems.
You can try restarting your server from the homepage.
Only one user is experiencing this issue; others are not. When I run kubectl get pod, this user's pod shows that it is stuck in the "Terminating" state.
I was able to fix it, but I can't say this is the right approach (I would have preferred to diagnose the root cause). A rough diagnostic sketch follows the steps below.
First, I tried deleting the pod kubectl delete pod <pod_name> -- it did not work
Second, I tried force deleting the pod with kubectl delete pod <pod_name> --grace-period=0 --force -- it worked, but it turns out this only deletes the API object; the pod's resources were then orphaned on the cluster.
I checked the node status with kubectl get node and noticed one node was stuck in the NotReady state. I recycled this node -- it still did not work; the user's notebook server was still stuck and returning the 500 error.
Finally, I simply deleted the user's notebook server from the JupyterHub admin page. This fixed it.
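The diagnostic sketch mentioned above, for anyone who wants to dig into the root cause (namespace and names are hypothetical):
# Look for finalizers and events that explain why the pod is stuck in Terminating
kubectl describe pod jupyter-someuser -n jhub
kubectl get pod jupyter-someuser -n jhub -o jsonpath='{.metadata.finalizers}'
# A NotReady node cannot confirm the pod's shutdown, which leaves it Terminating
kubectl describe node <node-name>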
I have good experience working with Elasticsearch; I have worked with version 2.4 and am now trying to learn the newer releases.
I am trying to implement Filebeat to send my Apache and system logs to my Elasticsearch endpoint. To save time, I launched a single-node t2.medium instance on the AWS Elasticsearch Service with a public endpoint, and I attached an access policy that allows everyone to access the cluster.
The AWS Elasticsearch domain is up and running healthy.
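An "allow everyone" policy of that kind looks roughly like this (a sketch; the account ID and domain ARN are placeholders):
aws es update-elasticsearch-domain-config \
  --domain-name my-public-test-domain \
  --access-policies '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":"*"},"Action":"es:*","Resource":"arn:aws:es:ap-southeast-1:123456789012:domain/my-public-test-domain/*"}]}'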
I launched an Ubuntu (18.04) server, downloaded the Filebeat tarball, and made the following configuration in filebeat.yml:
#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://my-public-test-domain.ap-southeast-1.es.amazonaws.com:443"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"
I enabled the required modules:
filebeat modules enable system apache
Then, as per the Filebeat documentation, I changed the ownership of filebeat.yml and started Filebeat with the following commands:
sudo chown root filebeat.yml
sudo ./filebeat -e
When I started Filebeat, I faced the following permission and ownership issue:
Error loading config from file '/home/ubuntu/beats/filebeat-7.2.0-linux-x86_64/modules.d/system.yml', error invalid config: config file ("/home/ubuntu/beats/filebeat-7.2.0-linux-x86_64/modules.d/system.yml") must be owned by the user identifier (uid=0) or root
To resolve this, I changed the ownership of the files that were throwing errors.
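Roughly the following (a sketch; the module file names assume the system and apache modules enabled earlier, and the last line is only a shortcut for testing):
# Give root ownership to the module configs Filebeat complained about
sudo chown root modules.d/system.yml modules.d/apache.yml
# Or skip the ownership/permission checks entirely while testing:
# sudo ./filebeat -e --strict.perms=false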
When I restarted Filebeat, I started facing the following issue:
Connection marked as failed because the onConnect callback failed: cannot retrieve the elasticsearch license: unauthorized access, could not connect to the xpack endpoint, verify your credentials
Going through this link, I found that to work with AWS Elasticsearch I need the Beats OSS versions.
So I downloaded the OSS version of Filebeat from this link and followed the same procedure as above, but still no luck. Now I am facing the following errors:
Error 1:
Attempting to reconnect to backoff(elasticsearch(https://my-public-test-domain.ap-southeast-1.es.amazonaws.com:443)) with 12 reconnect attempt(s)
Error 2:
Failed to connect to backoff(elasticsearch(https://my-public-test-domain.ap-southeast-1.es.amazonaws.com:443)): Connection marked as failed because the onConnect callback failed: 1 error: Error loading pipeline for fileset system/auth: This module requires an Elasticsearch plugin that provides the geoip processor. Please visit the Elasticsearch documentation for instructions on how to install this plugin. Response body: {"error":{"root_cause":[{"type":"parse_exception","reason":"No processor type exists with name [geoip]","header":{"processor_type":"geoip"}}],"type":"parse_exception","reason":"No processor type exists with name [geoip]","header":{"processor_type":"geoip"}},"status":400}
From the second error I understand that the geoip plugin is not available, which is why I am facing this error.
What else needs to be done to get this working?
Has anyone been able to successfully connect Beats to AWS Elasticsearch?
What other steps could I take to mitigate the above issue?
Environment Details:
AWS Elasticsearch version: 6.7
Filebeat: 7.2.0
First, you need to use the OSS version of Filebeat with AWS ES: https://www.elastic.co/downloads/beats/filebeat-oss
Second, AWS Elasticsearch does not provide the geoip plugin, so you will need to edit the pipelines for any of the default modules you want to use and make sure geoip is removed or commented out.
For example, in /usr/share/filebeat/module/system/auth/ingest/pipeline.json (that's the path when installed from the deb package -- your path will be different, of course) comment out:
{
  "geoip": {
    "field": "source.ip",
    "target_field": "source.geo",
    "ignore_failure": true
  }
},
Repeat the same for the apache module.
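A quick way to find every module pipeline that still references geoip, and to force a reload if a pipeline was already pushed to Elasticsearch before editing (a sketch: the path assumes the tarball layout from the question, and the pipeline name is an assumption based on Filebeat's filebeat-<version>-<module>-<fileset>-pipeline convention, so verify it with GET _ingest/pipeline first):
# List module pipeline definitions that still contain a geoip processor
grep -rl '"geoip"' module/
# Delete the previously loaded pipeline so Filebeat re-creates it on next start
curl -XDELETE "https://my-public-test-domain.ap-southeast-1.es.amazonaws.com:443/_ingest/pipeline/filebeat-7.2.0-system-auth-pipeline"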
I've spent hours trying to make the Filebeat IIS module work with AWS Elasticsearch. I kept getting the ingest-geoip error; the changes below fixed the issue.
For Windows IIS logs going to AWS Elasticsearch, remove geoip from the Filebeat module configuration in these files:
C:\Program Files (x86)\filebeat\module\iis\access\ingest\default.json
C:\Program Files (x86)\filebeat\module\iis\access\manifest.yml
C:\Program Files (x86)\filebeat\module\iis\error\ingest\default.json
C:\Program Files (x86)\filebeat\module\iis\error\manifest.yml
I'm trying to install Chef on my AWS EC2 instances. I'm using one EC2 instance as a workstation, another as a node, and Hosted Chef as the Chef server.
On the workstation, I'm able to create a simple project (LAMP stack) and upload it to the Chef server.
When I run knife bootstrap on the workstation with the key pair for the node's EC2 instance, it successfully converges.
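The bootstrap invocation was along these lines (a sketch; the IP, SSH key, and node name here are placeholders):
# Hypothetical knife bootstrap run from the workstation
knife bootstrap 203.0.113.10 -x ubuntu -i ~/.ssh/node-key.pem --sudo -N web-node-1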
Chef Client finished, 1/1 resources updated in 01 minutes 23 seconds
When I go to the chef node however and run chef-client, I get the following error:
Private Key Not Found:
----------------------
Your private key could not be loaded. If the key file exists, ensure that it is
readable by chef-client.
Relevant Config Settings:
-------------------------
validation_key "/etc/chef/validation.pem"
System Info:
------------
chef_version=14.12.9
ruby=ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-linux]
program_name=/usr/bin/chef-client
executable=/opt/chef/bin/chef-client
Running handlers:
[2019-05-17T21:35:04+00:00] ERROR: Running exception handlers
Running handlers complete
[2019-05-17T21:35:04+00:00] ERROR: Exception handlers complete
Chef Client failed. 0 resources updated in 00 seconds
[2019-05-17T21:35:04+00:00] FATAL: Stacktrace dumped to /home/ubuntu/.chef/cache/chef-stacktrace.out
[2019-05-17T21:35:04+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2019-05-17T21:35:04+00:00] FATAL: Chef::Exceptions::PrivateKeyMissing: I cannot read /etc/chef/client.pem, which you told me to use to sign requests!
On the workstation, I run knife list and I can see my node and validator listed. In Hosted Chef, I can't see the node listed under Nodes.
Can you please help me understand the error and how to fix it?
I was under the impression that bootstrap takes care of the node's certificates, so I'm surprised the node can't load the private key.
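For context, the paths in the error are the Chef client defaults on the node; checking what actually exists there looks roughly like this (a sketch):
# Inspect the client config and key the error refers to
sudo ls -l /etc/chef/client.rb /etc/chef/client.pem
# client.pem is written during bootstrap; if it exists but isn't readable,
# the client usually needs to run with sudo:
sudo chef-client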
Thank you
I'm running through a simple (and admittedly useless) toy exercise using PCF on Azure, trying to create and run the stream 'time | log'.
I successfully get SCDF started and the stream created, but when I try to deploy the stream, SCDF creates two CF apps that won't run. They exist as far as cf apps is concerned:
○ → cf apps
Getting apps in org tess / space tess as admin...
OK
name requested state instances memory disk urls
yascdf-server started 1/1 2G 2G yascdf-server.apps.cf.tess.info
yascdf-server-LE7xs4r-tess-log stopped 0/1 512M 2G yascdf-server-LE7xs4r-tess-log.apps.cf.tess.info
yascdf-server-LE7xs4r-tess-time stopped 0/1 512M 2G yascdf-server-LE7xs4r-tess-time.apps.cf.tess.info
If I try to view the logs for either, nothing ever returns, but the logs in Apps Manager look like this:
2017-08-10T10:24:42.147-04:00 [API/0] [OUT] Created app with guid de8fee78-0902-4df7-a7ae-bba8a7710dca
2017-08-10T10:24:43.314-04:00 [API/0] [OUT] Updated app with guid de8fee78-0902-4df7-a7ae-bba8a7710dca ({"route"=>"97e1d26b-d950-479e-b9df-fe1f3b0c8a74", :verb=>"add", :relation=>"routes", :related_guid=>"97e1d26b-d950-479e-b9df-fe1f3b0c8a74"})
The routes don't work:
404 Not Found: Requested route ('yascdf-server-LE7xs4r-tess-log.apps.cf.tess.info') does not exist.
And trying to (re)start the app, I get:
○ → cf start yascdf-server-LE7xs4r-tess-log
Starting app yascdf-server-LE7xs4r-tess-log in org tess / space tess as admin...
Staging app and tracing logs...
The app package is invalid: bits have not been uploaded
FAILED
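For what it's worth, the underlying deploy failure may also surface in the SCDF server's own logs and in the app's events (a sketch using standard cf CLI commands and the app names from above):
cf logs yascdf-server --recent
cf events yascdf-server-LE7xs4r-tess-log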
Here's the SCDF shell session I ran, if this helps:
server-unknown:>dataflow config server http://yascdf-server.apps.cf.tess.info/
Successfully targeted http://yascdf-server.apps.cf.cfpush.info/
dataflow:>app import --uri http://.../1-0-4-GA-stream-applications-rabbit-maven
Successfully registered applications: [<chop>]
dataflow:>stream create tess --definition "time | log"
Created new stream 'tess'
dataflow:>stream deploy tess
Deployment request has been sent for stream 'tess'
dataflow:>
Anyone know what's going on here? I'd be grateful for a nudge...
Spring Cloud Data Flow: Server
1.2.3 (using built spring-cloud-dataflow-server-cloudfoundry-1.2.3.BUILD-SNAPSHOT.jar)
Spring Cloud Data Flow: Shell
1.2.3 (using downloaded spring-cloud-dataflow-shell-1.2.3.RELEASE.jar)
Deployment Environment
PCF v1.11.6 (on Azure)
pcf dev v0.26.0 (on mac)
App Starters
http://bit-dot-ly/1-0-4-GA-stream-applications-rabbit-maven
Logs
stream deploy log
It has been identified that the OP was using java-buildpack 4.4 (JBP4). When SCDF runs against this version, there is an issue with memory allocation in reactor-netty (used internally by JBP4) that causes the out-of-memory error. The Reactor team is addressing this issue in the upcoming 0.6.5 release, and JBP4 will adapt to it eventually.
With all that said, SCDF is not yet compatible with JBP4. It is recommended to downgrade to JBP 3.19 or the latest release in that line instead.
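A rough sketch of how the downgrade might be pinned when pushing the SCDF server (verify the buildpack tag and the deployer property name against your versions; the property name below is an assumption for this release line):
# Push the SCDF server against the 3.x Java buildpack line
cf push yascdf-server -p spring-cloud-dataflow-server-cloudfoundry-1.2.3.BUILD-SNAPSHOT.jar -b https://github.com/cloudfoundry/java-buildpack.git#v3.19
# The stream apps that SCDF deploys follow the deployer's buildpack setting
# (property name is an assumption):
cf set-env yascdf-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_BUILDPACK https://github.com/cloudfoundry/java-buildpack.git#v3.19
cf restage yascdf-server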
We are facing a strange problem. We are running a Magento-based store. In our admin, when we try to view orders, we get the following error:
SQLSTATE[HY000]: General error: 126 Incorrect key file for table '/rdsdbdata/tmp/#sql_20b_0.MYI'; try to repair it
After a lot of research, I found that the tmp folder has run out of space.
I executed the command: show variables like '%tmpdir%'
And the value of that variable was: /rdsdbdata/tmp
I SSHed into my server and executed: df -h
This returned:
/dev/xvda1 mounted on /
tmpfs mounted on /dev/shm
/dev/xvdb mounted on /mnt/data
But I could not find the location /rdsdbdata/tmp anywhere, so I'm not able to free up the space.
I SSHed into my server
Not really. Your database is on an RDS instance, which can't be accessed over SSH. You must have SSHed into your web server instead.
RDS provides you with a managed server with MySQL -- and nothing else -- running on it. It's not the machine where you were looking. You can't perform any administration on the underlying server. Everything -- including increasing the amount of allocated storage -- is done through the AWS console or API.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ModifyInstance.MySQL.html
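For example, the allocated storage can be increased from the CLI as well (a sketch; the instance identifier and size are placeholders):
# Increase the RDS instance's allocated storage to 100 GiB
aws rds modify-db-instance \
  --db-instance-identifier my-magento-db \
  --allocated-storage 100 \
  --apply-immediately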