Chef server on AWS t2 micro memory issue

I am using AWS to install Chef server 12 on an EC2 t2.micro instance. I downloaded the 64-bit deb package, which is the applicable one.
I have set up the following on the box:
- hosts file: mapped the EC2 IP to the EC2 public DNS name
- installed and started the ntp daemon
I am getting a few errors, but the main one, listed below, is a memory issue:
Errno::ENOMEM
-------------
Cannot allocate memory - fork(2)
================================================================================
Error executing action `run` on resource 'execute[restart_rabbitmq_log_service]'
================================================================================
Errno::ENOMEM
-------------
Cannot allocate memory - fork(2)
Resource Declaration:
---------------------
# In /var/opt/opscode/local-mode-cache/cookbooks/enterprise/definitions/component_runit_service.rb
19: execute "restart_#{component}_log_service" do
20: command "#{node['runit']['sv_bin']} restart #{node['runit']['sv_dir']}/#{component}/log"
21: action :nothing
22: end
23:

Yes, you really need at least 4 GB of RAM for a Chef Server. A t2.micro has only 1 GiB, which is why fork(2) fails with ENOMEM. The formal docs at https://docs.chef.io/chef_system_requirements.html#the-chef-server say 8 GB, but 4 GB plus some swap would probably not run too poorly.
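If you want to check a box before installing, something like the following might help. It is only a rough sketch: it reads /proc/meminfo (Linux only) and compares RAM plus swap against the informal 4 GB floor mentioned above. The constant `MIN_TOTAL_KIB` and both function names are made up for this example and are not part of Chef.

```python
from pathlib import Path

# Assumption: 4 GB is the informal floor from the answer above, not an official Chef figure.
MIN_TOTAL_KIB = 4 * 1024 * 1024


def parse_meminfo(text):
    """Return the numeric /proc/meminfo fields as a dict, e.g. {'MemTotal': 1014572}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def has_enough_memory(fields, minimum_kib=MIN_TOTAL_KIB):
    """True when RAM plus swap (both reported in KiB) reaches the minimum."""
    total = fields.get("MemTotal", 0) + fields.get("SwapTotal", 0)
    return total >= minimum_kib


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        fields = parse_meminfo(meminfo.read_text())
        print("RAM + swap looks sufficient:", has_enough_memory(fields))
```

On a t2.micro with no swap this should report False, which matches the ENOMEM seen in the question.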

Related

How to make Openwrt running on AWS EC2?

I have tried two methods to upload an OpenWrt x86_64 image as an AWS AMI and run it on EC2, but both failed.
The image I built runs OK on VirtualBox and VMware.
The first method: vm_import/export.
I followed the instructions at https://amazonaws-china.com/cn/ec2/vm-import/, but the vm_import tool failed at the end with "Not found initrd in Grub".
OpenWrt doesn't use an initrd at the boot stage. This is the default boot entry in grub.cfg:
menuentry "OpenWrt" {
linux /boot/vmlinuz root=PARTUUID=fbdad417-02 rootfstype=ext4 rootwait console=tty0 console=ttyS0,115200n8 noinitrd
}
The second method: ec2-bundle-image/ec2-upgrade-image.
I tried this way. It uploaded the image files and metadata files to S3, and I could create a new AMI and launch an EC2 instance. But the instance did not boot correctly; it stopped at the grubdom> prompt.
I followed the instructions at https://forum.archive.openwrt.org/viewtopic.php?id=41588. They seem a little old: I couldn't find the AKI they mention, so I used an alternative one (aki-7077ab11 pv-grub-hd0_1.05-x86_64.gz).
Both the combined image (OpenWrt default build) and the custom image (release rootfs.tar.gz with the kernel and grub config copied into it) failed. Here is the EC2 instance system log:
Xen Minimal OS!
start_info: 0x10d4000(VA)
nr_pages: 0xe504a
shared_inf: 0xeeb28000(MA)
pt_base: 0x10d7000(VA)
nr_pt_frames: 0xd
mfn_list: 0x9ab000(VA)
mod_start: 0x0(VA)
mod_len: 0
flags: 0x300
cmd_line: root=/dev/sda1 ro console=hvc0 4
stack: 0x96a100-0x98a100
MM: Init
_text: 0x0(VA)
_etext: 0x7b824(VA)
_erodata: 0x97000(VA)
_edata: 0x9cce0(VA)
stack start: 0x96a100(VA)
_end: 0x9aa700(VA)
start_pfn: 10e7
max_pfn: e504a
Mapping memory range 0x1400000 - 0xe504a000
setting 0x0-0x97000 readonly
skipped 0x1000
MM: Initialise page allocator for 1809000(1809000)-e504a000(e504a000)
MM: done
Demand map pfns at e504b000-20e504b000.
Heap resides at 20e504c000-40e504c000.
Initialising timer interface
Initialising console ... done.
gnttab_table mapped at 0xe504b000.
Initialising scheduler
Thread "Idle": pointer: 0x20e504c050, stack: 0x1f10000
Thread "xenstore": pointer: 0x20e504c800, stack: 0x1f20000
xenbus initialised on irq 3 mfn 0xfeffc
Thread "shutdown": pointer: 0x20e504cfb0, stack: 0x1f30000
Dummy main: start_info=0x98a200
Thread "main": pointer: 0x20e504d760, stack: 0x1f40000
"main" "root=/dev/sda1" "ro" "console=hvc0" "4"
vbd 2049 is hd0
******************* BLKFRONT for device/vbd/2049 **********
backend at /local/domain/0/backend/vbd/27482/2049
2097152 sectors of 512 bytes
**************************
vbd 2064 is hd1
******************* BLKFRONT for device/vbd/2064 **********
backend at /local/domain/0/backend/vbd/27482/2064
8377344 sectors of 512 bytes
**************************
GNU GRUB version 0.97 (3752232K lower / 0K upper memory)
[ Minimal BASH-like line editing is supported. For
the first word, TAB lists possible command
completions. Anywhere else TAB lists the possible
completions of a device/filename. ]
grubdom>
Any ideas? Thanks.

This is an easy task that doesn't need any complicated setup.
I used VirtualBox, but any other virtualization product can be used (e.g. VMware or Hyper-V).
In my experience, getting OpenWrt onto AWS fails with every import method other than "importing a snapshot".
1. Download OpenWrt
https://downloads.openwrt.org/releases/19.07.5/targets/x86/64/
2. Install OpenWrt on VirtualBox and create an OVA
https://openwrt.org/docs/guide-user/virtualization/virtualbox-vm
2a) convert img to vdi
- example: VBoxManage convertfromraw --format VDI openwrt-x86-64-combined.img openwrt.vdi
2b) extend vdi to 1GB
- example: VBoxManage modifymedium openwrt.vdi --resize 1024
2c) boot openwrt
2d) change eth0 interface to dhcp
- example: vi /etc/config/network
2e) shutdown
2f) export VM to ova
3. Rename .ova to .zip
4. Unzip the .zip
Unzipping gives you the vmdk file of the virtual disk. (An .ova is a tar archive, so extracting it with tar also works.)
5. Upload the vmdk to an AWS S3 bucket
6. Add the vmimport role to your account
https://www.msp360.com/resources/blog/how-to-configure-vmimport-role/
7. Import the vmdk as a snapshot
https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-import-snapshot.html
8. Create a new EC2 instance
9. Replace the EC2 instance volume with the imported volume
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-restoring-volume.html
10. Boot up
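For the "import vmdk as snapshot" step, the AWS CLI takes a small JSON "disk container" document. Here is a rough sketch of how that document and the matching `aws ec2 import-snapshot` command line could be put together. The bucket name, key, and file name are placeholders I made up, and the helper functions are illustrative only; check the linked AWS page for the authoritative format.

```python
import json


def build_disk_container(bucket, key, description="OpenWrt disk"):
    """Build the disk-container document passed to `aws ec2 import-snapshot`."""
    return {
        "Description": description,
        "Format": "VMDK",
        "UserBucket": {"S3Bucket": bucket, "S3Key": key},
    }


def build_import_command(container_path, description="OpenWrt disk"):
    """Return the CLI invocation as an argument list (nothing is executed here)."""
    return [
        "aws", "ec2", "import-snapshot",
        "--description", description,
        "--disk-container", f"file://{container_path}",
    ]


if __name__ == "__main__":
    # Hypothetical bucket and key: replace with wherever the vmdk was uploaded.
    container = build_disk_container("my-import-bucket", "openwrt-disk001.vmdk")
    print(json.dumps(container, indent=2))  # save this as containers.json
    print(" ".join(build_import_command("containers.json")))
```

The import runs asynchronously; once it finishes you get a snapshot ID that you can turn into a volume for the instance.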

Access to PCF app from another machine

I'm new to PCF and am trying to deploy a simple web app. I've installed the cf CLI and PCF Dev, and pushed my app to PCF:
cf push test-ui -b staticfile_buildpack
...
name: test-ui
requested state: started
instances: 1/1
usage: 256M x 1 instances
routes: test-ui.local.pcfdev.io
last uploaded: Thu 23 Aug 13:09:04 +03 2018
stack: cflinuxfs2
buildpack: staticfile_buildpack
start command: $HOME/boot.sh
     state     since                  cpu    memory         disk          details
#0   running   2018-08-23T10:09:17Z   0.0%   5.3M of 256M   25M of 512M
So now I can open my test app at test-ui.local.pcfdev.io from the same machine where I started my PCF instance, but I don't know how to access the app from another device on the same network.
Could someone tell me what I should do to open my test app from another device on the same network as my local machine?

Since PCF Dev is installed on your local machine, I believe you cannot reach its Cloud Foundry apps from outside that machine unless you set up some networking that gives other machines access.

I've found a solution: use a reverse proxy to forward requests from a port on my local machine to the app's URL
(using nginx as an example):
server {
    listen 8090;
    server_name pcf-rp;
    location / {
        proxy_pass http://test-ui.local.pcfdev.io;
    }
}

Unable to register AWS host to Ambari server

While registering a host with the Ambari server's cluster, I get the following error:
"Host checks were skipped on 1 hosts that failed to register."
I'm trying to install HDP 2.5 on an AWS instance.
I have tried to follow the Hortonworks documentation:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/set_the_hostname.html
I added the public IP address and public hostname to the /etc/hosts file and changed the hostname in /etc/hostname, on both the server and the host. I rebooted both, and the hostnames changed. Then I stopped iptables with
sudo service iptables stop
After doing all this, host registration still fails. Kindly help; I am stuck.

Background
In my experience with Ambari (Hortonworks), you have to explicitly list your Hadoop nodes in each other's /etc/hosts files, using the actual names/IPs that the Hadoop services will bind to. NOTE: hostnames should also be FQDNs (fully qualified domain names).
For example, if you're setting up the hosts as:
node01.mydom.com (10.0.0.2)
node02.mydom.com (10.0.0.3)
node03.mydom.com (10.0.0.4)
These entries should be in all 3 servers' /etc/hosts files, and these should be the names used when referencing them within Ambari's installation/setup wizards.
If you do not pay special attention to this detail, the Ambari server will fail to find/manage any of the other nodes that you're telling it to manage.
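To make the /etc/hosts point concrete, here is a small sketch that renders the same set of lines for every node from one mapping, so all servers end up with identical entries. The function names are invented for this example, the FQDN check is deliberately crude, and adding the short name as an alias is my own assumption rather than an Ambari requirement.

```python
def is_fqdn(name):
    """Rough check: at least two non-empty, dot-separated labels."""
    labels = name.split(".")
    return len(labels) >= 2 and all(labels)


def render_hosts_entries(nodes):
    """Turn a {fqdn: ip} mapping into /etc/hosts lines (IP, canonical name, alias)."""
    lines = []
    for fqdn, ip in nodes.items():
        if not is_fqdn(fqdn):
            raise ValueError(f"{fqdn!r} is not a fully qualified domain name")
        short = fqdn.split(".")[0]  # assumption: the short name is a convenient alias
        lines.append(f"{ip} {fqdn} {short}")
    return lines


if __name__ == "__main__":
    nodes = {
        "node01.mydom.com": "10.0.0.2",
        "node02.mydom.com": "10.0.0.3",
        "node03.mydom.com": "10.0.0.4",
    }
    # Append the same output to /etc/hosts on all three servers.
    print("\n".join(render_hosts_entries(nodes)))
```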
hostname of ambari-agents
The other item to look at is the ambari-agents and what hostnames they think they're running as.
$ ps -eaf|grep ambari_agent
root 3282 1 0 Jul30 ? 00:00:00 /usr/bin/python /usr/lib/python2.6/site-packages/ambari_agent/AmbariAgent.py start --expected-hostname=node01.mydom.com
root 3290 3282 1 Jul30 ? 08:24:29 /usr/bin/python /usr/lib/python2.6/site-packages/ambari_agent/main.py start --expected-hostname=node01.mydom.com
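If you have several nodes to check, pulling the --expected-hostname value out of that ps output can be scripted. This is just a sketch that parses text like the lines above; the function names are mine, and comparing against the machine's own FQDN is an assumption about what you would want to verify.

```python
import re

# Matches the flag seen in the ps output above.
EXPECTED = re.compile(r"--expected-hostname=(\S+)")


def expected_hostnames(ps_output):
    """Collect the --expected-hostname values from `ps -eaf | grep ambari_agent` output."""
    return set(EXPECTED.findall(ps_output))


def mismatches(ps_output, actual_fqdn):
    """Return the expected hostnames that differ from the FQDN the machine reports."""
    return sorted(h for h in expected_hostnames(ps_output) if h != actual_fqdn)
```

On a node you could pass `socket.getfqdn()` from the standard library as `actual_fqdn`; an empty list would mean the agent and the OS agree on the name.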
Debugging further
On the screen where you're attempting to register the other nodes as agents, there's a full log of what's happening. You can typically get the commands from this log and attempt to run them directly; I've done this on a number of occasions. They will often be python ... commands, which you can copy/paste from the logs and run on the Ambari server where you're attempting the install.

H2O + HDFS (Cloudera)

We have a Cloudera cluster up and running with an h2o instance, although it appears to be running off h2o.jar, which as I understand it (please correct me if I'm wrong) is the standalone h2o. I can connect, but it will not load any files from our HDFS. (I can see all of this via 'ps' on the edge node.)
So I started an instance with h2odriver.jar
java -jar /path/to/h2odriver.jar -nodes 2 -mapperXmx 5g -output /my/hdfs/dir
I get several output/callback addresses:
[Possible callback IP address: 10.96.243.46:33728]
[Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 10.96.243.46:33728
So I fire up Python and try to connect (the same thing happens if I use 10.96.243.46):
>>>h2o.connection(ip='127.0.0.1', port='33728')
and get
Connecting to H2O server at http://127.0.0.1:33728..... failed.
H2OConnectionError: Could not establish link to the H2O cloud http://127.0.0.1:33728 after 5 retries
...
Failed to establish a new connection: [Errno 111] Connection refused',))
The thing is, on my screen with the H2O jar/java job I can see:
MapperToDriverMessage: Read invalid type (G) from socket, ignoring...
MapperToDriverMessage: read: Unknown Type
I cannot figure out how to launch h2o in cluster mode and have it access our HDFS, or even connect to it. I can connect to the h2o.jar version, but that sees no HDFS (it can see the filesystem of the edge node). What is the proper way to launch H2O so that it can see the attached HDFS? (We are running Cloudera 5.7 in an enterprise environment, Python is 3.6, H2O is 3.10.0.6, and I know we have a ton of firewalls/security; I believe we are set up through LDAP.)

You are correct that h2o.jar is meant to be the standalone version of H2O which is not meant for connecting to HDFS.
Using the appropriate h2odriver.jar for your particular hadoop distribution is the way to go.
The correct beginner instructions can be found here:
go to http://www.h2o.ai/download/
choose H2O "Latest Stable Release"
choose tab "Install on Hadoop"
It says to run the following command:
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g -output hdfsOutputDirName
[ Note this is "hadoop jar", not "java -jar" as written in the question. ]
You should see output like this:
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 172.16.2.181]
[Possible callback IP address: 127.0.0.1]
...
Waiting for H2O cluster to come up...
H2O node 172.16.2.188:54321 requested flatfile
Sending flatfiles to nodes...
[Sending flatfile to node 172.16.2.188:54321]
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
Then point your web browser to the place where it says to "Open H2O Flow in your web browser".
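(The other addresses in the output are diagnostics, and not for end users. The callback port in particular carries mapper-to-driver messages rather than the REST API, which is presumably why the driver in the question logged "Read invalid type (G)" when a client sent it an HTTP GET.)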
In this case, the python connection command would be:
h2o.connect(ip = '172.16.2.188', port = 54321)
I recommend opening Flow in a web browser, starting a file import by typing in "hdfs://", and seeing if autocompletion works. If it does, your HDFS connection is working.
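If you capture the driver output in a script, the address to connect to can be pulled from the "Open H2O Flow" line rather than from the callback lines. Here is a rough sketch of that parsing; the function name `flow_endpoint` is made up for this example, and it assumes the line has the same wording as the output shown above.

```python
import re
from urllib.parse import urlparse

# Assumption: the driver prints the Flow URL with exactly this wording.
FLOW_LINE = re.compile(r"Open H2O Flow in your web browser:\s*(\S+)")


def flow_endpoint(driver_output):
    """Return (ip, port) from the driver's 'Open H2O Flow' line, or None if absent."""
    match = FLOW_LINE.search(driver_output)
    if match is None:
        return None
    url = urlparse(match.group(1))
    return url.hostname, url.port


# With the h2o package installed, the result can be handed to the client:
#   import h2o
#   ip, port = flow_endpoint(captured_output)
#   h2o.connect(ip=ip, port=port)
```

Note that the callback addresses never match this pattern, so a run that only prints those (as in the question) gives None instead of a misleading port.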

"vagrant up" failing: Vagrant VM failed to remain in the running state

The command vagrant up is failing and I don't know why.
$ egrep -v '^ *(#|$)' Vagrantfile
VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.vm.box = "precise32"
end
$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
[default] Importing base box 'precise32'...
[default] Matching MAC address for NAT networking...
[default] Setting the name of the VM...
[default] Clearing any previously set forwarded ports...
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Preparing network interfaces based on configuration...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
The VM failed to remain in the "running" state while attempting to boot.
This is normally caused by a misconfiguration or host system incompatibilities.
Please open the VirtualBox GUI and attempt to boot the virtual machine
manually to get a more informative error message.
$ vagrant status
Current machine states:
default poweroff (virtualbox)
The VM is powered off. To restart the VM, simply run `vagrant up`
$ VBoxManage list runningvms
$
Here are the messages in the VirtualBox log file, VBoxSVC.log:
$ cat ~/.VirtualBox/VBoxSVC.log
VirtualBox XPCOM Server 4.2.16 r86992 linux.amd64 (Jul 4 2013 16:29:59) release log
00:00:00.000499 main Log opened 2013-08-13T18:40:45.907580000Z
00:00:00.000508 main OS Product: Linux
00:00:00.000509 main OS Release: 3.6.11-4.fc16.x86_64
00:00:00.000510 main OS Version: #1 SMP Tue Jan 8 20:57:42 UTC 2013
00:00:00.000537 main DMI Product Name: X8DA3
00:00:00.000547 main DMI Product Version: 1234567890
00:00:00.000647 main Host RAM: 24103MB total, 17127MB available
00:00:00.000654 main Executable: /usr/local/VirtualBox/VBoxSVC
00:00:00.000655 main Process ID: 9417
00:00:00.000656 main Package type: LINUX_64BITS_GENERIC
00:00:00.110125 nspr-2 Loading settings file "/opt/tomcat/.VirtualBox/VirtualBox.xml" with version "1.12-linux"
00:00:00.110817 nspr-2 Failed to retrive disk info: getDiskName(/dev/md126p1) --> md126p1
00:00:00.264367 nspr-2 VDInit finished
00:00:00.275173 nspr-2 Loading settings file "/opt/tomcat/VirtualBox VMs/vagrant_getting_started_default_1376419129/vagrant_getting_started_default_1376419129.vbox" with version "1.12-linux"
00:00:05.288923 main ERROR [COM]: aRC=VBOX_E_OBJECT_IN_USE (0x80bb000c) aIID={29989373-b111-4654-8493-2e1176cba890} aComponent={Medium} aText={Medium '/opt/tomcat/VirtualBox VMs/vagrant_getting_started_default_1376419129/box-disk1.vmdk' cannot be closed because it is still attached to 1 virtual machines}, preserve=false
00:00:05.290229 Watcher ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={3b2f08eb-b810-4715-bee0-bb06b9880ad2} aComponent={VirtualBox} aText={The object is not ready}, preserve=false
$
Any advice would be greatly appreciated.

Had the same error on OSX. Restarting VirtualBox fixed it :S
sudo /Library/StartupItems/VirtualBox/VirtualBox restart
Also see: https://forums.virtualbox.org/viewtopic.php?t=5489

I solved the problem by re-installing VirtualBox and adding myself to the vboxusers group. The re-installation process printed a message indicating that VM users had to be a member of that group. I don't know if the re-installation was necessary or if being added to the group would have sufficed.

The host machine was 32-bit (Ubuntu) and the guest was 64-bit. I changed the guest to 32-bit and that solved the problem.

My understanding is that the vboxusers group is related to accessing USB devices within the guest, so I'm not sure why it would cause this issue. Normally, as a Vagrant base box build guideline, audio and USB are both disabled.
As per the VirtualBox Manual, section "The vboxusers group":
The Linux installers create the system user group vboxusers during installation. Any system user who is going to use USB devices from VirtualBox guests must be a member of that group. A user can be made a member of the group vboxusers through the GUI user/group management or at the command line with sudo usermod -a -G vboxusers username
Note that adding an active user to that group will require that user to log out and back in again. This should be done manually after successful installation of the package.

I had the same problem. It was caused by a wrong configuration in the provider section of my Vagrantfile: I had tried to make my VM more powerful by giving it 2 CPUs when the host machine has just one.
This often happens when you try to give your VM more hardware than your host machine can provide.
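A quick sanity check before editing the provider section could look like the sketch below. It only compares a requested CPU count against what the host reports; the function name is invented here, and it does not read the Vagrantfile or talk to VirtualBox.

```python
import os


def check_cpu_request(requested_cpus, host_cpus=None):
    """Return a warning string if the VM asks for more CPUs than the host has, else None."""
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    if requested_cpus > host_cpus:
        return f"requested {requested_cpus} CPUs but host has only {host_cpus}"
    return None


if __name__ == "__main__":
    # Hypothetical request of 2 CPUs, as in the case described above.
    print(check_cpu_request(2) or "CPU request fits on this host")
```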
