Reference: jfrog artifactory could not validate router error - centos7

I have tried everyone's suggestions and I still get a failure. This is a new installation of Artifactory (jfrog-artifactory-oss-7.4.1-linux.tar.gz) on a local CentOS VM.
2020-04-18T11:53:25.305Z [jfrt ] [INFO ] [a88f4f6ce96d65bb] [o.j.c.ExecutionUtils:141 ] [pool-13-thread-1 ] - Cluster join: Retry 5: Service registry ping failed, will retry. Error while trying to connect to local router at address ‘http://localhost:8046/access’: Connect to localhost:8046 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
hostname -i
172.16.217.147
more /etc/hosts
127.0.0.1 centos7 centos7.example.com localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.217.147 artifactory-master
system.yaml:
shared:
  node:
    ip: 172.16.217.147
This is from access-service.log:
2020-04-18T11:52:19.789Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.s.r.s.GrpcServerImpl:65 ] [ocalhost-startStop-2] - Starting gRPC Server on port 8045
2020-04-18T11:52:20.072Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.s.r.s.GrpcServerImpl:84 ] [ocalhost-startStop-2] - gRPC Server started, listening on 8045
2020-04-18T11:52:21.995Z [jfac ] [INFO ] [7fbbd46f40602f6b] [o.j.a.AccessApplication:59 ] [ocalhost-startStop-2] - Started AccessApplication in 11.711 seconds (JVM running for 13.514)
2020-04-18T11:52:29.093Z [jfac ] [WARN ] [7b2c676f76c7ef43] [o.j.c.ExecutionUtils:141 ] [pool-6-thread-2 ] - Retry 20 Elapsed 9.54 secs failed: Registration with router on URL http://localhost:8046 failed with error: UNAVAILABLE: io exception. Trying again
2020-04-18T11:52:34.119Z [jfac ] [WARN ] [7b2c676f76c7ef43] [o.j.c.ExecutionUtils:141 ] [pool-6-thread-2 ] - Retry 30 Elapsed 14.57 secs failed: Registration with router on URL http://localhost:8046 failed with error: UNAVAILABLE: io exception. Trying again
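Both logs point at the same symptom: nothing accepts connections on localhost:8046. A plain TCP probe can confirm whether any process is listening there. This is only a minimal sketch: the ports are taken from the logs above, and the helper name `port_is_open` is hypothetical, not part of Artifactory.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # 8046 is the router port from the error message; 8045 is the
    # gRPC port that access-service.log reports as started.
    for port in (8046, 8045):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```

If 8045 is open but 8046 is closed, that would suggest the router process itself never came up, so its own log is the next place to look.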

Related

AWS EC2 Amazon Linux 2 Instances failed to boot after applying os updates

Yesterday we lost contact with 10 identically configured servers. After some investigation, we concluded that a reboot after security updates had failed.
We have so far not been able to get any of the servers back online, but we were lucky enough to be able to reinstall the instances without data loss.
I will paste the console log below. Can anyone help me determine the root cause, and perhaps advise whether there is a better way to configure the servers to make recovery easier (for example, getting past the "Press Enter to continue." prompt, where the boot seems to hang)?
The full log is too big for SO, so I put it on pastebin and pasted a redacted version below. I have removed the escape sequences that colorize the output and removed some double new lines, but besides that it is complete.
[ 0.000000] Linux version 4.14.200-155.322.amzn2.x86_64 (mockbuild@ip-10-0-1-230) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-10) (GCC)) #1 SMP Thu Oct 15 20:11:12 UTC 2020
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: Amazon EC2 t3.micro/, BIOS 1.0 10/16/2017
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] e820: last_pfn = 0x3e3fa max_arch_pfn = 0x400000000
[ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000000] Scanning 1 areas for low memory corruption
[ 0.000000] Using GB pages for direct mapping
[ 0.000000] RAMDISK: [mem 0x3433e000-0x36196fff]
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000000F8F80 000014 (v00 AMAZON)
[ 0.000000] ACPI: RSDT 0x000000003E3FE360 00003C (v01 AMAZON AMZNRSDT 00000001 AMZN 00000001)
[ 0.000000] ACPI: FACS 0x000000003E3FFF40 000040
[ 0.000000] ACPI: SSDT 0x000000003E3FF6C0 00087A (v01 AMAZON AMZNSSDT 00000001 AMZN 00000001)
[ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
[ 0.000000] e820: [mem 0x40000000-0xdfffffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on KVM
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.200-155.322.amzn2.x86_64 root=UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 ro console=tty0 console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 LANG=en_US.UTF-8
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Memory: 943540K/1019488K available (10252K kernel code, 1958K rwdata, 2780K rodata, 2088K init, 4240K bss, 75948K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] Kernel/User page tables isolation: enabled
[ 0.000000] ftrace: allocating 26683 entries in 105 pages
[ 0.004000] Hierarchical RCU implementation.
[ 0.004000] RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=2.
[ 0.004000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[ 0.004000] NR_IRQS: 524544, nr_irqs: 440, preallocated irqs: 16
[ 0.004000] Console: colour VGA+ 80x25
[ 0.004000] console [tty0] enabled
[ 0.004000] console [ttyS0] enabled
[ 0.004005] tsc: Detected 2500.000 MHz processor
[ 0.007582] Calibrating delay loop (skipped) preset value.. 5000.00 BogoMIPS (lpj=10000000)
[ 0.008002] pid_max: default: 32768 minimum: 301
[ 0.012006] ACPI: Core revision 20170728
[ 0.016560] ACPI: 2 ACPI AML tables successfully acquired and loaded
[ 0.020015] Security Framework initialized
[ 0.024002] SELinux: Initializing.
[ 0.028159] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.032082] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.036012] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.040006] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.044325] Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
[ 0.048003] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
[ 0.052003] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[ 0.056003] Spectre V2 : Mitigation: Full generic retpoline
[ 0.060002] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 0.064002] Speculative Store Bypass: Vulnerable
[ 0.067720] TAA: Vulnerable: Clear CPU buffers attempted, no microcode
[ 0.068002] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[ 0.072086] Freeing SMP alternatives memory: 24K
[ 0.076807] smpboot: Max logical packages: 1
[ 0.080264] x2apic enabled
[ 0.084003] Switched APIC routing to physical x2apic.
[ 0.088000] ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 0.088000] smpboot: CPU0: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz (family: 0x6, model: 0x55, stepping: 0x4)
[ 0.088074] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
[ 0.092046] Hierarchical SRCU implementation.
[ 0.095857] NMI watchdog: Perf event create on CPU 0 failed with -2
[ 0.096002] NMI watchdog: Perf NMI watchdog permanently disabled
[ 0.100049] smp: Bringing up secondary CPUs ...
[ 0.103696] x86: Booting SMP configuration:
[ 0.104003] .... node #0, CPUs: #1
[ 0.004000] kvm-clock: cpu 1, msr 0:3e357041, secondary cpu clock
[ 0.106853] KVM setup async PF for cpu 1
[ 0.107214] kvm-stealtime: cpu 1, msr 3e1161c0
[ 0.112307] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[ 0.116006] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
[ 0.120007] smp: Brought up 1 node, 2 CPUs
[ 0.123417] smpboot: Total of 2 processors activated (10000.00 BogoMIPS)
[ 0.124320] devtmpfs: initialized
[ 0.126970] x86/mm: Memory block size: 128MB
[ 0.128137] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.132008] futex hash table entries: 512 (order: 3, 32768 bytes)
[ 0.136156] NET: Registered protocol family 16
[ 0.139769] cpuidle: using governor ladder
[ 0.140013] cpuidle: using governor menu
[ 0.143281] ACPI: bus type PCI registered
[ 0.144000] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 0.148144] PCI: Using configuration type 1 for base access
[ 0.156770] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[ 0.160017] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.164044] ACPI: Added _OSI(Module Device)
[ 0.168007] ACPI: Added _OSI(Processor Device)
[ 0.172007] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.176004] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.180007] ACPI: Interpreter enabled
[ 0.184011] ACPI: (supports S0 S4 S5)
[ 0.187094] ACPI: Using IOAPIC for interrupt routing
[ 0.188018] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.300750] ACPI: Enabled 16 GPEs in block 00 to 0F
[ 0.308023] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 0.312007] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
[ 0.316010] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[ 0.320007] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[ 0.328324] acpiphp: Slot [3] registered
[ 0.420040] acpiphp: Slot [31] registered
[ 0.424003] PCI host bridge to bus 0000:00
[ 0.536451] pci 0000:00:03.0: vgaarb: setting as boot VGA device
[ 0.540000] pci 0000:00:03.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[ 0.548009] pci 0000:00:03.0: vgaarb: bridge control possible
[ 0.551996] vgaarb: loaded
[ 0.556090] EDAC MC: Ver: 3.0.0
[ 0.559140] PCI: Using ACPI for IRQ routing
[ 0.560280] NetLabel: Initializing
[ 0.563268] NetLabel: domain hash size = 128
[ 0.568019] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 0.571902] NetLabel: unlabeled traffic allowed by default
[ 0.576145] clocksource: Switched to clocksource kvm-clock
[ 0.586755] VFS: Disk quotas dquot_6.6.0
[ 0.590090] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.594562] pnp: PnP ACPI init
[ 0.597855] pnp: PnP ACPI: found 5 devices
[ 0.608231] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 0.614881] NET: Registered protocol family 2
[ 0.618324] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.622749] TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
[ 0.626965] TCP: Hash tables configured (established 8192 bind 8192)
[ 0.631170] UDP hash table entries: 512 (order: 2, 16384 bytes)
[ 0.635163] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[ 0.639358] NET: Registered protocol family 1
[ 0.642779] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 0.646797] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.651113] pci 0000:00:03.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 0.657825] Unpacking initramfs...
[ 0.734208] Freeing initrd memory: 31076K
[ 0.737636] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x240939f1bb2, max_idle_ns: 440795263295 ns
[ 0.745181] Scanning for low memory corruption every 60 seconds
[ 0.750602] audit: initializing netlink subsys (disabled)
[ 0.754606] audit: type=2000 audit(1603879247.564:1): state=initialized audit_enabled=0 res=1
[ 0.754917] Initialise system trusted keyrings
[ 0.764927] Key type blacklist registered
[ 0.768266] workingset: timestamp_bits=36 max_order=18 bucket_order=0
[ 0.773861] zbud: loaded
[ 0.905903] Key type asymmetric registered
[ 0.909292] Asymmetric key parser 'x509' registered
[ 0.912915] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[ 0.918972] io scheduler noop registered (default)
[ 0.922543] io scheduler cfq registered
[ 0.925904] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
[ 0.964594] crc32c_combine: 8373 self tests passed
[ 0.968628] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 1.000785] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 1.007649] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[ 1.014310] i8042: Warning: Keylock active
[ 1.018572] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 1.022414] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 1.026284] rtc_cmos 00:00: RTC can wake from S4
[ 1.030475] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
[ 1.034755] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
[ 1.038955] hidraw: raw HID events driver (C) Jiri Kosina
[ 1.042936] NET: Registered protocol family 17
[ 1.046622] mce: Using 32 MCE banks
[ 1.049627] sched_clock: Marking stable (1049607566, 0)->(1755024155, -705416589)
[ 1.056014] registered taskstats version 1
[ 1.059279] Loading compiled-in X.509 certificates
[ 1.064832] Loaded X.509 cert 'Build time autogenerated kernel key: 121ffea65ca15230f4a21fe7e5b65abaabaa433c'
[ 1.072013] zswap: loaded using pool lzo/zbud
[ 1.075526] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
[ 1.079746] ima: Allocated hash algorithm: sha1
[ 1.083589] rtc_cmos 00:00: setting system clock to 2020-10-28 09:59:31 UTC (1603879171)
[ 1.091820] Freeing unused kernel memory: 2088K
[ 1.116102] Write protecting the kernel read-only data: 16384k
[ 1.120697] Freeing unused kernel memory: 2016K
[ 1.126528] Freeing unused kernel memory: 1316K
[ 1.160972] systemd[1]: Inserted module 'autofs4'
[ 1.176133] NET: Registered protocol family 10
[ 1.181508] Segment Routing with IPv6
[ 1.184828] systemd[1]: Inserted module 'ipv6'
[ 1.189116] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.193763] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.198171] random: systemd: uninitialized urandom read (16 bytes read)
[ 1.205354] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[ 1.217384] systemd[1]: Detected virtualization kvm.
[ 1.221077] systemd[1]: Detected architecture x86-64.
[ 1.224774] systemd[1]: Running in initial RAM disk.
Welcome to Amazon Linux 2 dracut-033-535.amzn2.1.3 (Initramfs)
[ 1.230712] systemd[1]: No hostname configured.
[ 1.234213] systemd[1]: Set hostname to <localhost>.
[ 1.237934] systemd[1]: Initializing machine ID from KVM UUID.
[ OK ] Reached target Swap.
[ 1.265844] systemd[1]: Reached target Swap.
[ 1.269312] systemd[1]: Starting Swap.
[ OK ] Created slice Root Slice.
[ 1.274036] systemd[1]: Created slice Root Slice.
[ OK ] Listening on Journal Socket.
[ OK ] Reached target Timers.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Reached target Local File Systems.
[ OK ] Listening on udev Control Socket.
[ OK ] Created slice System Slice.
Starting Setup Virtual Console...
Starting Journal Service...
Starting Create list of required st... nodes for the current kernel...
Starting Apply Kernel Variables...
[ OK ] Reached target Slices.
[ OK ] Listening on udev Kernel Socket.
[ OK ] Reached target Sockets.
Starting dracut cmdline hook...
[ OK ] Started Setup Virtual Console.
[ OK ] Started Create list of required sta...ce nodes for the current kernel.
[ OK ] Started Apply Kernel Variables.
Starting Create Static Device Nodes in /dev...
[ OK ] Started Create Static Device Nodes in /dev.
[ OK ] Started Journal Service.
[ OK ] Started dracut cmdline hook.
Starting dracut pre-udev hook...
[ 1.390579] device-mapper: uevent: version 1.0.3
[ 1.394255] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
[ OK ] Started dracut pre-udev hook.
Starting udev Kernel Device Manager...
[ OK ] Started udev Kernel Device Manager.
Starting dracut pre-trigger hook...
[ OK ] Started dracut pre-trigger hook.
Starting udev Coldplug all Devices...
[ OK ] Started udev Coldplug all Devices.
Starting Show Plymouth Boot Screen...
[ OK ] Reached target System Initialization.
Starting dracut initqueue hook...
[ 1.534629] nvme nvme0: pci function 0000:00:04.0
[ OK ] Started Show Plymouth Boot Screen.
[ OK ] Reached target Paths.
[ OK ] Reached target Basic System.
[ 1.543815] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[ 1.546543] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 1.556607] nvme nvme1: pci function 0000:00:1f.0
[ 1.557854] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
[ 1.576394] AVX2 version of gcm_enc/dec engaged.
[ 1.580503] AES CTR mode by8 optimization enabled
[ 1.601321] alg: No test for pcbc(aes) (pcbc-aes-aesni)
[ 1.776495] nvme0n1: p1 p128
[ 1.908576] random: fast init done
[ OK ] Found device /dev/disk/by-uuid/a1e1011e-e38f-408e-878b-fed395b47ad6.
Starting File System Check on /dev/...e-e38f-408e-878b-fed395b47ad6...
[ OK ] Started File System Check on /dev/d...11e-e38f-408e-878b-fed395b47ad6.
[ OK ] Started dracut initqueue hook.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
Starting dracut pre-mount hook...
[ OK ] Started dracut pre-mount hook.
Mounting /sysroot...
[ 2.235770] SGI XFS with ACLs, security attributes, no debug enabled
[ 2.242333] XFS (nvme0n1p1): Mounting V5 Filesystem
[ 4.142597] XFS (nvme0n1p1): Ending clean mount
[ OK ] Mounted /sysroot.
[ OK ] Reached target Initrd Root File System.
Starting Reload Configuration from the Real Root...
[ OK ] Started Reload Configuration from the Real Root.
[ OK ] Reached target Initrd File Systems.
[ OK ] Reached target Initrd Default Target.
Starting dracut pre-pivot and cleanup hook...
[ OK ] Started dracut pre-pivot and cleanup hook.
Starting Cleaning Up and Shutting Down Daemons...
[ OK ] Stopped Cleaning Up and Shutting Down Daemons.
[ OK ] Stopped target Timers.
[ OK ] Stopped dracut pre-pivot and cleanup hook.
Stopping dracut pre-pivot and cleanup hook...
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped target Remote File Systems (Pre).
[ OK ] Stopped target Initrd Default Target.
Starting Plymouth switch root service...
[ OK ] Stopped dracut pre-mount hook.
Stopping dracut pre-mount hook...
[ OK ] Stopped dracut initqueue hook.
Stopping dracut initqueue hook...
[ OK ] Stopped target Basic System.
[ OK ] Stopped target Sockets.
[ OK ] Stopped target System Initialization.
[ OK ] Stopped target Swap.
[ OK ] Stopped target Local File Systems.
[ OK ] Stopped Apply Kernel Variables.
Stopping Apply Kernel Variables...
[ OK ] Stopped target Local Encrypted Volumes.
[ OK ] Stopped udev Coldplug all Devices.
Stopping udev Coldplug all Devices...
[ OK ] Stopped dracut pre-trigger hook.
Stopping dracut pre-trigger hook...
Stopping udev Kernel Device Manager...
[ OK ] Stopped target Slices.
[ OK ] Stopped target Paths.
[ OK ] Stopped udev Kernel Device Manager.
[ OK ] Stopped Create Static Device Nodes in /dev.
Stopping Create Static Device Nodes in /dev...
[ OK ] Stopped Create list of required sta...ce nodes for the current kernel.
Stopping Create list of required st... nodes for the current kernel...
[ OK ] Stopped dracut pre-udev hook.
Stopping dracut pre-udev hook...
[ OK ] Stopped dracut cmdline hook.
Stopping dracut cmdline hook...
[ OK ] Closed udev Kernel Socket.
[ OK ] Closed udev Control Socket.
Starting Cleanup udevd DB...
[ OK ] Started Cleanup udevd DB.
[ OK ] Reached target Switch Root.
[ 4.553875] systemd-journald[667]: Received SIGTERM from PID 1 (systemd).
[ OK ] Started Plymouth switch root service.
Starting Switch Root...
[ 4.885212] systemd: 30 output lines suppressed due to ratelimiting
[ 5.925390] SELinux: Disabled at runtime.
[ 5.980115] audit: type=1404 audit(1603879176.396:2): selinux=0 auid=4294967295 ses=4294967295
[ 6.083250] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 6.106470] systemd[1]: Inserted module 'ip_tables'
Welcome to Amazon Linux 2
[ OK ] Stopped Switch Root.
[ OK ] Stopped Journal Service.
Starting Journal Service...
[ OK ] Reached target Swap.
[ OK ] Listening on Delayed Shutdown Socket.
Mounting Huge Pages File System...
[ OK ] Stopped target Switch Root.
[ OK ] Stopped target Initrd Root File System.
[ OK ] Created slice system-getty.slice.
[ OK ] Listening on udev Control Socket.
[ OK ] Listening on Device-mapper event daemon FIFOs.
[ OK ] Created slice User and Session Slice.
Starting Create list of required st... nodes for the current kernel...
[ OK ] Listening on LVM2 poll daemon socket.
[ OK ] Stopped target Initrd File Systems.
[ OK ] Listening on udev Kernel Socket.
Mounting Debug File System...
[ OK ] Reached target Slices.
[ OK ] Listening on LVM2 metadata daemon socket.
Mounting POSIX Message Queue File System...
[ OK ] Created slice system-selinux\x2dpol...grate\x2dlocal\x2dchanges.slice.
Starting Monitoring of LVM2 mirrors... dmeventd or progress polling...
[ OK ] Created slice system-serial\x2dgetty.slice.
Starting Read and set NIS domainname from /etc/sysconfig/network...
[ OK ] Listening on /dev/initctl Compatibility Named Pipe.
[ OK ] Set up automount Arbitrary Executab...ats File System Automount Point.
Starting Remount Root and Kernel File Systems...
[ OK ] Started Journal Service.
[ OK ] Mounted Debug File System.
[ OK ] Mounted POSIX Message Queue File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Started Create list of required sta...ce nodes for the current kernel.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Started Read and set NIS domainname from /etc/sysconfig/network.
Starting udev Coldplug all Devices...
Starting Configure read-only root support...
Starting Relabel kernel modules early in the boot, if needed...
Starting Create Static Device Nodes in /dev...
Starting Flush Journal to Persistent Storage...
[ OK ] Started Relabel kernel modules early in the boot, if needed.
Starting Load Kernel Modules...
[ 7.047237] systemd-journald[1398]: Received request to flush runtime journal from PID 1
[ 7.069936] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.2.10g
[ 7.084119] ena: ena device version: 0.10
[ 7.089001] ena: ena controller version: 0.0.1 implementation version 1
[ OK ] Started Configure read-only root support.
Starting Load/Save Random Seed...
[ OK ] Started Load/Save Random Seed.
[ 7.156042] ena 0000:00:05.0: LLQ is not supported Fallback to host mode policy.
[ OK ] Started udev Coldplug all Devices.
Starting udev Wait for Complete Device Initialization...
[ 7.181318] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem febf4000, mac addr 0a:cf:65:4e:dd:ff
[ OK ] Started Load Kernel Modules.
Starting Apply Kernel Variables...
[ OK ] Started LVM2 metadata daemon.
Starting LVM2 metadata daemon...
[ OK ] Started Apply Kernel Variables.
[ OK ] Started Create Static Device Nodes in /dev.
Starting udev Kernel Device Manager...
[ OK ] Started Flush Journal to Persistent Storage.
[ OK ] Started udev Kernel Device Manager.
[ OK ] Found device /dev/ttyS0.
[ 7.776329] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 7.783413] ACPI: Power Button [PWRF]
[ 7.786723] input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input4
[ 7.793032] ACPI: Sleep Button [SLPF]
Starting Relabel kernel modules early in the boot, if needed...
[ OK ] Created slice system-ec2net\x2difup.slice.
[ OK ] Started Relabel kernel modules early in the boot, if needed.
[ 7.888784] input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input5
[ 7.904661] mousedev: PS/2 mouse device common for all mice
[ OK ] Started udev Wait for Complete Device Initialization.
Starting Activation of DM RAID sets...
[ OK ] Started Activation of DM RAID sets.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Started Monitoring of LVM2 mirrors,...ng dmeventd or progress polling.
[ OK ] Reached target Local File Systems (Pre).
[ 59.305661] random: crng init done
[ 59.308921] random: 7 urandom warning(s) missed due to ratelimiting
[ TIME ] Timed out waiting for device dev-sdf.device.
[DEPEND] Dependency failed for /home/storage.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
[DEPEND] Dependency failed for Migrate local... structure to the new structure.
Starting Preprocess NFS configuration...
[ OK ] Reached target Timers.
[ OK ] Reached target Network (Pre).
[ OK ] Reached target Login Prompts.
[ OK ] Reached target Cloud-init target.
Starting Initial hibernation setup job...
Starting Initial cloud-init job (metadata service crawler)...
[ OK ] Reached target Network.
[ OK ] Reached target Paths.
[ OK ] Reached target Sockets.
Starting Create Volatile Files and Directories...
[ OK ] Started Emergency Shell.
Starting Emergency Shell...
[ OK ] Reached target Emergency Mode.
Starting Tell Plymouth To Write Out Runtime Data...
[ OK ] Started Preprocess NFS configuration.
[ OK ] Started Create Volatile Files and Directories.
Mounting RPC Pipe File System...
Starting Security Auditing Service...
Starting RPC bind service...
[ 97.160193] RPC: Registered named UNIX socket transport module.
[ 97.160194] RPC: Registered udp transport module.
[ 97.160194] RPC: Registered tcp transport module.
[ 97.160195] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ OK ] Mounted RPC Pipe File System.
[ OK ] Reached target rpc_pipefs.target.
[ OK ] Reached target NFS client services.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
[ OK ] Started Tell Plymouth To Write Out Runtime Data.
[ OK ] Started RPC bind service.
[ OK ] Started Security Auditing Service.
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started Update UTMP about System Boot/Shutdown.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
[ 99.871085] hibinit-agent[1855]: Traceback (most recent call last):
[ 99.871339] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 496, in <module>
[ 99.871592] hibinit-agent[1855]: main()
[ 99.872080] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 435, in main
[ 99.872516] hibinit-agent[1855]: if not hibernation_enabled(config.state_dir):
[ 99.873017] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 390, in hibernation_enabled
[ 99.873487] hibinit-agent[1855]: imds_token = get_imds_token()
[ 99.873793] hibinit-agent[1855]: File "/usr/bin/hibinit-agent", line 365, in get_imds_token
[ 99.875332] hibinit-agent[1855]: response = requests.put(token_url, headers=request_header)
[ 99.877065] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 121, in put
[ 99.877230] hibinit-agent[1855]: return request('put', url, data=data, **kwargs)
[ 99.877959] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
[ 99.878225] hibinit-agent[1855]: response = session.request(method=method, url=url, **kwargs)
[ 99.878614] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 486, in request
[ 99.879747] hibinit-agent[1855]: resp = self.send(prep, **send_kwargs)
[ 99.880157] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 598, in send
[ 99.884411] hibinit-agent[1855]: r = adapter.send(request, **kwargs)
[ 99.884728] hibinit-agent[1855]: File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 419, in send
[ 99.892094] hibinit-agent[1855]: raise ConnectTimeout(e, request=request)
[ 99.892377] hibinit-agent[1855]: requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7efc029fa390>: Failed to establish a new connection: [Errno 101] Network is unreachable',))
[FAILED] Failed to start Initial hibernation setup job.
See 'systemctl status hibinit-agent.service' for details.
[ 101.215791] cloud-init[1856]: Cloud-init v. 19.3-3.amzn2 running 'init' at Wed, 28 Oct 2020 10:01:11 +0000. Up 101.18 seconds.
[ 101.264707] cloud-init[1856]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
[ 101.264940] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[ 101.272469] cloud-init[1856]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
[ 101.274166] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[ 101.274497] cloud-init[1856]: ci-info: | eth0 | False | . | . | . | 0a:cf:65:4e:dd:ff |
[ 101.284890] cloud-init[1856]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[ 101.286727] cloud-init[1856]: ci-info: | lo | True | ::1/128 | . | host | . |
[ 101.286986] cloud-init[1856]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[ 101.291933] cloud-init[1856]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
[ 101.292215] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+
[ 101.294122] cloud-init[1856]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ 101.294383] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+
[ 101.294543] cloud-init[1856]: ci-info: +-------+-------------+---------+-----------+-------+
Welcome to emerg
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.
OK, shortly after posting we figured it out. It seems the device name behind a mount point changed (I expect due to a Linux kernel update), and we had not used the nofail option in /etc/fstab as described in the AWS Knowledge Center, so the server hung at boot.
Going forward we will also use UUID mounting so that we are independent of the device naming in /dev/.
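To catch this before the next reboot, /etc/fstab can be scanned for entries that would block boot if the device is missing. The following is a rough sketch, not an official AWS tool: the function name `risky_fstab_entries` is hypothetical, and the sample lines are made up (the /dev/sdf device and the /home/storage mount point come from the console log, while the ext4 type and the mount options are assumptions).

```python
from typing import List


def risky_fstab_entries(fstab_text: str) -> List[str]:
    """Return warnings for non-root fstab entries that could block boot."""
    warnings = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line, ignored in this sketch
        device, mount_point, _fstype, options = fields[:4]
        if mount_point == "/":
            continue  # the root filesystem has to mount in any case
        if "nofail" not in options.split(","):
            warnings.append(f"{mount_point}: missing 'nofail' option")
        if device.startswith("/dev/") and not device.startswith("/dev/disk/by-"):
            warnings.append(
                f"{mount_point}: uses device name {device}, consider UUID="
            )
    return warnings


if __name__ == "__main__":
    # Hypothetical sample; on a real host, read /etc/fstab instead.
    sample = """\
UUID=a1e1011e-e38f-408e-878b-fed395b47ad6 / xfs defaults,noatime 1 1
/dev/sdf /home/storage ext4 defaults 0 2
"""
    for warning in risky_fstab_entries(sample):
        print(warning)
```

On a real server the same function could be fed the contents of /etc/fstab, for example as part of a pre-reboot check.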
I think I've narrowed this down to the ec2-utils package. We had the same issue with devices not mounting properly, which we initially thought was related to the ENA or NVMe driver. Once we ran a yum update, it was resolved.
If you downgrade the ec2-utils package to ec2-utils-1.2-2.amzn2, the issue returns. This seems to affect only Nitro-based instances. To fix it, you can temporarily boot as a t2 or other older instance type and update the package.

Deploying Spring Cloud Eureka in AWS ECS (EC2) with DNS props but getting: 'Failed to bind elastic IP (IP)'. I attached a policy to allow user

I am using AWS ECS to deploy Eureka in my cluster to zones inside the us-east-1 region. ECS dynamically deploys to any region and I cannot predetermine the IP or domain of the EC2 instance, hence I use DNS.
I am using DNS as illustrated here https://github.com/Netflix/eureka/wiki/Deploying-Eureka-Servers-in-EC2. Below are my configurations:
eureka:
  instance:
    healthCheckUrlPath: /manage/health
  client:
    region: us-east-1
    availabilityZones:
      us-east-1: us-east-1a,us-east-1c
    eurekaServerPort: 8761
    useDnsForFetchingServiceUrls: true
    eurekaServerDNSName: eureka.mydomain.com
    eurekaServerURLContext: eureka
    registerWithEureka: true
    fetchRegistry: true
cloud:
  aws:
    credentials:
      accessKey: AWS_KEY
      secretKey: AWS_KEY_SECRET
    region:
      static: us-east-1
The user with AWS_KEY has this policy attached:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:AllocateAddress",
        "ec2:AssociateAddress",
        "ec2:DescribeAddresses",
        "ec2:DisassociateAddress"
      ],
      "Sid": "Stmt1375723773000",
      "Resource": [
        "*"
      ],
      "Effect": "Allow"
    }
  ]
}
and I configured the EurekaInstanceConfigBean as:
@Bean
@Profile("!default")
public EurekaInstanceConfigBean eurekaInstanceConfig(InetUtils inetUtils) {
    EurekaInstanceConfigBean config = new EurekaInstanceConfigBean(inetUtils);
    AmazonInfo info = AmazonInfo.Builder.newBuilder().autoBuild("eureka");
    info.getMetadata().put(AmazonInfo.MetaDataKey.publicHostname.getName(), info.get(AmazonInfo.MetaDataKey.publicIpv4));
    config.setHostname(info.get(AmazonInfo.MetaDataKey.publicHostname));
    config.setIpAddress(info.get(AmazonInfo.MetaDataKey.publicIpv4));
    config.setNonSecurePort(port);
    config.setDataCenterInfo(info);
    return config;
}
GOOD THING: Eureka recognises the EIPs behind my Route 53 DNS name eureka.mydomain.com, and it tries to bind the (available and unassigned) EIP in zone us-east-1c to the instance where my Eureka server is deployed.
PROBLEM: I get the following logs, ending in an Unauthorized error, when booting my app:
...................................
.................................
2017-04-10 16:07:42.141 DEBUG 5 --- [ main] c.n.d.s.r.a.DnsTxtRecordClusterResolver : Resolved txt.us-east-1.eureka.mydomain.com to [AwsEndpoint{ serviceUrl='http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka', region='us-east-1', zone='us-east-1c'}]
2017-04-10 16:07:42.141 DEBUG 5 --- [ main] c.n.d.s.r.a.ZoneAffinityClusterResolver : Local zone=us-east-1c; resolved to: [AwsEndpoint{ serviceUrl='http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka', region='us-east-1', zone='us-east-1c'}]
2017-04-10 16:07:42.204 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Disable delta property : false
2017-04-10 16:07:42.209 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Single vip registry refresh property : null
2017-04-10 16:07:42.209 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Force full registry fetch : false
2017-04-10 16:07:42.209 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Application is null : false
2017-04-10 16:07:42.209 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Registered Applications size is zero : true
2017-04-10 16:07:42.209 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Application version is -1: true
2017-04-10 16:07:42.211 INFO 5 --- [ main] com.netflix.discovery.DiscoveryClient : Getting all instance registry info from the eureka server
2017-04-10 16:07:42.213 DEBUG 5 --- [ main] c.n.d.s.t.d.SessionedEurekaHttpClient : Ending a session and starting anew
2017-04-10 16:07:42.222 DEBUG 5 --- [ main] n.d.s.t.j.AbstractJerseyEurekaHttpClient : Created client for url: http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka
2017-04-10 16:07:42.313 DEBUG 5 --- [ main] c.n.d.shared.MonitoredConnectionManager : Get connection: {}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761, timeout = 5000
2017-04-10 16:07:42.314 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : [{}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761] total kept alive: 0, total issued: 0, total allocated: 0 out of 200
2017-04-10 16:07:42.314 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : No free connections [{}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761][null]
2017-04-10 16:07:42.314 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : Available capacity: 50 out of 50 [{}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761][null]
2017-04-10 16:07:42.314 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : Creating new connection [{}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761]
2017-04-10 16:07:42.330 DEBUG 5 --- [ main] c.n.d.shared.MonitoredConnectionManager : Released connection is not reusable.
2017-04-10 16:07:42.331 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : Releasing connection [{}->http://ec2-34.200.47.82.compute-1.amazonaws.com:8761][null]
2017-04-10 16:07:42.331 DEBUG 5 --- [ main] c.n.d.shared.NamedConnectionPool : Notifying no-one, there are no waiting threads
2017-04-10 16:07:42.331 DEBUG 5 --- [ main] n.d.s.t.j.AbstractJerseyEurekaHttpClient : Jersey HTTP GET http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka/apps/?; statusCode=N/A
2017-04-10 16:07:42.345 ERROR 5 --- [ main] c.n.d.s.t.d.RedirectingEurekaHttpClient : Request execution
....................
....................
2017-04-10 16:07:49.455 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : This client will talk to the following serviceUrls in order : [http://ec2-34.206.31.211.compute-1.amazonaws.com:8761/eureka/]
2017-04-10 16:07:49.455 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The region url to be looked up is txt.us-east-1.eureka.mydomain.com :
2017-04-10 16:07:49.456 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The zoneName mapped to region us-east-1 is us-east-1c
2017-04-10 16:07:49.456 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : Checking if the instance zone us-east-1c is the same as the zone from DNS us-east-1c
2017-04-10 16:07:49.456 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The zone index from the list [us-east-1c] that matches the instance zone us-east-1c is 0
2017-04-10 16:07:49.456 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The zone url to be looked up is txt.us-east-1c.eureka.mydomain.com :
2017-04-10 16:07:49.457 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The eureka url for the dns name txt.us-east-1c.eureka.mydomain.com is ec2-34.200.47.82.compute-1.amazonaws.com
2017-04-10 16:07:49.457 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : The EC2 url is http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka/
2017-04-10 16:07:49.457 DEBUG 5 --- [ Thread-11] c.n.discovery.endpoint.EndpointUtils : This client will talk to the following serviceUrls in order : [http://ec2-34.200.47.82.compute-1.amazonaws.com:8761/eureka/]
2017-04-10 16:07:49.527 ERROR 5 --- [ Thread-11] com.netflix.eureka.aws.EIPManager : Failed to bind elastic IP: 34.200.47.82 to i-0bc1018ccdcc69148
com.amazonaws.AmazonServiceException: You are not authorized to perform this operation. (Service: AmazonEC2; Status Code: 403; Error Code: UnauthorizedOperation; Request ID: f9b2dec4-6d79-4da2-bbac-061416bde000)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1378) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:924) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:702) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:454) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:416) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:365) ~[aws-java-sdk-core-1.11.18.jar!/:na]
at com.amazonaws.services.ec2.AmazonEC2Client.doInvoke(AmazonEC2Client.java:12003) ~[aws-java-sdk-ec2-1.11.18.jar!/:na]
at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:11973) ~[aws-java-sdk-ec2-1.11.18.jar!/:na]
at com.amazonaws.services.ec2.AmazonEC2Client.describeAddresses(AmazonEC2Client.java:4716) ~[aws-java-sdk-ec2-1.11.18.jar!/:na]
at com.netflix.eureka.aws.EIPManager.bindEIP(EIPManager.java:202) [eureka-core-1.4.12.jar!/:1.4.12]
at com.netflix.eureka.aws.EIPManager.handleEIPBinding(EIPManager.java:136) [eureka-core-1.4.12.jar!/:1.4.12]
at com.netflix.eureka.aws.EIPManager.start(EIPManager.java:105) [eureka-core-1.4.12.jar!/:1.4.12]
at com.netflix.eureka.aws.AwsBinderDelegate.start(AwsBinderDelegate.java:42) [eureka-core-1.4.12.jar!/:1.4.12]
at org.springframework.cloud.netflix.eureka.server.EurekaServerBootstrap.initEurekaServerContext(EurekaServerBootstrap.java:145) [spring-cloud-netflix-eureka-server-1.2.6.RELEASE.jar!/:1.2.6.RELEASE]
at org.springframework.cloud.netflix.eureka.server.EurekaServerBootstrap.contextInitialized(EurekaServerBootstrap.java:81) [spring-cloud-netflix-eureka-server-1.2.6.RELEASE.jar!/:1.2.6.RELEASE]
at org.springframework.cloud.netflix.eureka.server.EurekaServerInitializerConfiguration$1.run(EurekaServerInitializerConfiguration.java:70) [spring-cloud-netflix-eureka-server-1.2.6.RELEASE.jar!/:1.2.6.RELEASE]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
2017-04-10 16:07:49.527 INFO 5 --- [ Thread-11] com.netflix.eureka.aws.EIPManager : No EIP is free to be associated with this instance. Candidate EIPs are: [34.200.47.82]
......................................
........................................
........................................
QUESTION: I have attached the policy to allow Eureka to bind the Elastic IP to the instance where it is deployed, so WHY am I getting "You are not authorized to perform this operation. (Service: AmazonEC2; Status Code: 403; Error Code: UnauthorizedOperation", and how can I fix this? I have spent more than a day Googling and still get the same error :(
I tried the Netflix way of configuring Eureka, as below, but to no avail :(
eureka:
  awsAccessId: AWS_KEY
  awsSecretKey:AWS_KEY_SECRET
  asgName: EIPAccessPolicyGroup
So I finally got a solution, with help from @DirkLachowski and @spencergibb on this post. Thanks a lot, guys. I only had to change this:
eureka:
  awsAccessId: AWS_KEY
  awsSecretKey:AWS_KEY_SECRET
  asgName: EIPAccessPolicyGroup
To this:
eureka:
  server:
    aWSAccessId: AWS_KEY
    aWSSecretKey: AWS_SECRET_KEY
    asgName: EC2ContainerService_AUTO_SCALING_GROUP_CREATED_BY_ECS_FOR_MY_CLUSTER
Now each Eureka server binds an unused/free EIP, taken from the ones I put in my TXT DNS records, to the EC2 instance where that Eureka server is running :)
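For reference, the DNS names that show up in the debug log of the question follow a simple pattern. The Python sketch below only mirrors the strings visible in that log (txt.<region>.<domain>, txt.<zone>.<domain>, and the final service URL); it is not Eureka's implementation, and the function names are hypothetical.

```python
def region_txt_name(region, dns_name):
    """TXT record queried for the region, as printed in the log."""
    return f"txt.{region}.{dns_name}"


def zone_txt_name(zone, dns_name):
    """TXT record queried for a single zone, as printed in the log."""
    return f"txt.{zone}.{dns_name}"


def service_url(host, port, context):
    """Service URL assembled from a host found in the zone TXT record."""
    return f"http://{host}:{port}/{context}/"


# Values taken from the configuration and log in the question.
DNS_NAME = "eureka.mydomain.com"
print(region_txt_name("us-east-1", DNS_NAME))
print(zone_txt_name("us-east-1c", DNS_NAME))
print(service_url("ec2-34.200.47.82.compute-1.amazonaws.com", 8761, "eureka"))
```

Checking that each of these names resolves (for example with dig TXT) is a quick way to separate DNS problems from the permission problem discussed above.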

Elasticsearch 5.2 crashes in Ubuntu14.04, EC2 t2.large machine

I'm trying to run Elasticsearch 5.2 hosted on an EC2 Ubuntu 14.04 machine (t2.large, which has 8 GB of RAM, the minimum specified by Elastic to run Elasticsearch). But Elasticsearch is shutting down unexpectedly.
I can't work out the cause of the shutdown.
This is the elasticsearch.log:
[2017-03-20T10:07:53,410][INFO ][o.e.p.PluginsService ] [QrRfI_U] loaded module [transport-netty4]
[2017-03-20T10:07:53,411][INFO ][o.e.p.PluginsService ] [QrRfI_U] no plugins loaded
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] initialized
[2017-03-20T10:07:55,555][INFO ][o.e.n.Node ] [QrRfI_U] starting ...
[2017-03-20T10:07:55,626][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: f6:fd:16:e4:90:62:fe:d6
[2017-03-20T10:07:55,673][INFO ][o.e.t.TransportService ] [QrRfI_U] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2017-03-20T10:07:58,755][INFO ][o.e.c.s.ClusterService ] [QrRfI_U] new_master {QrRfI_U}{QrRfI_UKQxWwvvhvgYxGmQ}{Rne8jnb_S0KVRnXvJj1m2w}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-20T10:07:58,793][INFO ][o.e.h.HttpServer ] [QrRfI_U] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2017-03-20T10:07:58,793][INFO ][o.e.n.Node ] [QrRfI_U] started
[2017-03-20T10:07:59,072][INFO ][o.e.g.GatewayService ] [QrRfI_U] recovered [6] indices into cluster_state
[2017-03-20T10:07:59,724][INFO ][o.e.c.r.a.AllocationService] [QrRfI_U] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[logstash-2017.02.26][4], [logstash-2017.02.26][3], [logstash-2017.02.26][1], [logstash-2017.02.26][0]] ...]).
[2017-03-20T10:50:12,228][INFO ][o.e.c.m.MetaDataMappingService] [QrRfI_U] [logstash-2017.03.20/HXANYkA9RRKne-YAK9cNQg] update_mapping [logs]
[2017-03-20T11:06:55,449][INFO ][o.e.n.Node ] [QrRfI_U] stopping ...
[2017-03-20T11:06:55,514][INFO ][o.e.n.Node ] [QrRfI_U] stopped
[2017-03-20T11:06:55,515][INFO ][o.e.n.Node ] [QrRfI_U] closing ...
[2017-03-20T11:06:55,523][INFO ][o.e.n.Node ] [QrRfI_U] closed
When I restart Elasticsearch, these are the node stats after 1 Logstash input (I've never had more than 3 inputs before Elasticsearch crashes):
Request:
curl -i -XGET 'localhost:9200/_nodes/stats'
Response:
{"_nodes":{"total":1,"successful":1,"failed":0},"cluster_name":"elasticsearch","nodes":{"QrRfI_UKQxWwvvhvgYxGmQ":{"timestamp":1490011241990,"name":"QrRfI_U","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0.1:9300","roles":["master","data","ingest"],"indices":{"docs":{"count":17,"deleted":0},"store":{"size_in_bytes":235863,"throttle_time_in_millis":0},"indexing":{"index_total":2,"index_time_in_millis":111,"index_current":0,"index_failed":0,"delete_total":0,"delete_time_in_millis":0,"delete_current":0,"noop_update_total":0,"is_throttled":false,"throttle_time_in_millis":0},"get":{"total":2,"time_in_millis":3,"exists_total":2,"exists_time_in_millis":3,"missing_total":0,"missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":84,"query_time_in_millis":70,"query_current":0,"fetch_total":80,"fetch_time_in_millis":91,"fetch_current":0,"scroll_total":0,"scroll_time_in_millis":0,"scroll_current":0,"suggest_total":0,"suggest_time_in_millis":0,"suggest_current":0},"merges":{"current":0,"current_docs":0,"current_size_in_bytes":0,"total":0,"total_time_in_millis":0,"total_docs":0,"total_size_in_bytes":0,"total_stopped_time_in_millis":0,"total_throttled_time_in_millis":0,"total_auto_throttle_in_bytes":545259520},"refresh":{"total":2,"total_time_in_millis":89,"listeners":0},"flush":{"total":0,"total_time_in_millis":0},"warmer":{"current":0,"total":28,"total_time_in_millis":72},"query_cache":{"memory_size_in_bytes":0,"total_count":0,"hit_count":0,"miss_count":0,"cache_size":0,"cache_count":0,"evictions":0},"fielddata":{"memory_size_in_bytes":0,"evictions":0},"completion":{"size_in_bytes":0},"segments":{"count":17,"memory_in_bytes":137618,"terms_memory_in_bytes":130351,"stored_fields_memory_in_bytes":5304,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":384,"points_memory_in_bytes":15,"doc_values_memory_in_bytes":1564,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto
_id_timestamp":-1,"file_sizes":{}},"translog":{"operations":2,"size_in_bytes":6072},"request_cache":{"memory_size_in_bytes":12740,"evictions":0,"hit_count":0,"miss_count":20},"recovery":{"current_as_source":0,"current_as_target":0,"throttle_time_in_millis":0}},"os":{"timestamp":1490011241998,"cpu":{"percent":1,"load_average":{"1m":0.18,"5m":0.08,"15m":0.06}},"mem":{"total_in_bytes":8371847168,"free_in_bytes":5678006272,"used_in_bytes":2693840896,"free_percent":68,"used_percent":32},"swap":{"total_in_bytes":0,"free_in_bytes":0,"used_in_bytes":0}},"process":{"timestamp":1490011241998,"open_file_descriptors":220,"max_file_descriptors":66000,"cpu":{"percent":1,"total_in_millis":14800},"mem":{"total_virtual_in_bytes":3171389440}},"jvm":{"timestamp":1490011241998,"uptime_in_millis":205643,"mem":{"heap_used_in_bytes":195922864,"heap_used_percent":37,"heap_committed_in_bytes":519438336,"heap_max_in_bytes":519438336,"non_heap_used_in_bytes":75810224,"non_heap_committed_in_bytes":81326080,"pools":{"young":{"used_in_bytes":96089960,"max_in_bytes":139591680,"peak_used_in_bytes":139591680,"peak_max_in_bytes":139591680},"survivor":{"used_in_bytes":11413088,"max_in_bytes":17432576,"peak_used_in_bytes":17432576,"peak_max_in_bytes":17432576},"old":{"used_in_bytes":88419816,"max_in_bytes":362414080,"peak_used_in_bytes":88419816,"peak_max_in_bytes":362414080}}},"threads":{"count":43,"peak_count":45},"gc":{"collectors":{"young":{"collection_count":5,"collection_time_in_millis":164},"old":{"collection_count":1,"collection_time_in_millis":39}}},"buffer_pools":{"direct":{"count":29,"used_in_bytes":70307265,"total_capacity_in_bytes":70307264},"mapped":{"count":17,"used_in_bytes":217927,"total_capacity_in_bytes":217927}},"classes":{"current_loaded_count":10981,"total_loaded_count":10981,"total_unloaded_count":0}},"thread_pool":{"bulk":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"fetch_shard_started":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"co
mpleted":26},"fetch_shard_store":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"flush":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"force_merge":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"generic":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":54},"get":{"threads":2,"queue":0,"active":0,"rejected":0,"largest":2,"completed":2},"index":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"listener":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"management":{"threads":5,"queue":0,"active":1,"rejected":0,"largest":5,"completed":203},"refresh":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":550},"search":{"threads":4,"queue":0,"active":0,"rejected":0,"largest":4,"completed":165},"snapshot":{"threads":0,"queue":0,"active":0,"rejected":0,"largest":0,"completed":0},"warmer":{"threads":1,"queue":0,"active":0,"rejected":0,"largest":1,"completed":23}},"fs":{"timestamp":1490011241999,"total":{"total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008},"data":[{"path":"/home/ubuntu/elasticsearch-5.2.0/data/nodes/0","mount":"/ 
(/dev/xvda1)","type":"ext4","total_in_bytes":8309932032,"free_in_bytes":3226181632,"available_in_bytes":2780459008,"spins":"false"}],"io_stats":{"devices":[{"device_name":"xvda1","operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}],"total":{"operations":901,"read_operations":4,"write_operations":897,"read_kilobytes":16,"write_kilobytes":10840}}},"transport":{"server_open":0,"rx_count":10,"rx_size_in_bytes":3388,"tx_count":10,"tx_size_in_bytes":3388},"http":{"current_open":5,"total_opened":12},"breakers":{"request":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"fielddata":{"limit_size_in_bytes":311663001,"limit_size":"297.2mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.03,"tripped":0},"in_flight_requests":{"limit_size_in_bytes":519438336,"limit_size":"495.3mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0},"parent":{"limit_size_in_bytes":363606835,"limit_size":"346.7mb","estimated_size_in_bytes":0,"estimated_size":"0b","overhead":1.0,"tripped":0}},"script":{"compilations":0,"cache_evictions":0},"discovery":{"cluster_state_queue":{"total":0,"pending":0,"committed":0}},"ingest":{"total":{"count":0,"time_in_millis":0,"current":0,"failed":0},"pipelines":{}}}}}
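The response above is hard to read as a single blob. Here is a minimal Python sketch, assuming the JSON has already been parsed into a dict, that pulls out the memory figures per node. The field names come from the response itself; the function name is hypothetical, and the sample contains only the handful of values copied from the post.

```python
def summarize_memory(stats):
    """Return per-node heap and OS memory figures from a _nodes/stats response."""
    rows = {}
    for node in stats["nodes"].values():
        heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
        os_total = node["os"]["mem"]["total_in_bytes"]
        rows[node["name"]] = {
            "heap_max_mb": round(heap_max / 2**20, 1),
            "heap_used_percent": node["jvm"]["mem"]["heap_used_percent"],
            "os_total_mb": round(os_total / 2**20, 1),
            "heap_share_of_ram_percent": round(100 * heap_max / os_total, 1),
        }
    return rows


# Trimmed-down sample using the numbers from the response in the question.
SAMPLE_STATS = {
    "nodes": {
        "QrRfI_UKQxWwvvhvgYxGmQ": {
            "name": "QrRfI_U",
            "os": {"mem": {"total_in_bytes": 8371847168}},
            "jvm": {"mem": {"heap_max_in_bytes": 519438336, "heap_used_percent": 37}},
        }
    }
}

print(summarize_memory(SAMPLE_STATS))
```

With the values posted, the maximum heap comes out at roughly 495 MB, about 6% of the machine's 8 GB, which is worth keeping in mind when comparing the node against the stated memory requirement.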

getting auth failure on salt-cloud command

I'm using SaltStack and I want to provision new EC2 instances using the salt-cloud command. But I'm getting an auth failure from salt-cloud:
[root@salt:~] #salt-cloud -p base_ec2_public ops.example.com
[ERROR ] AWS Response Status Code and Error: [401 401 Client Error: Unauthorized] {'Errors': {'Error': {'Message': 'AWS was not able to validate the provided access credentials', 'Code': 'AuthFailure'}}, 'RequestID': '3a5e33e2-d1a9-44fa-983c-26691d4f8ee7'}
[ERROR ] AWS Response Status Code and Error: [401 401 Client Error: Unauthorized] {'Errors': {'Error': {'Message': 'AWS was not able to validate the provided access credentials', 'Code': 'AuthFailure'}}, 'RequestID': '163079c6-2b79-4301-80c8-77ba0d7c896d'}
[ERROR ] There was a profile error: string indices must be integers, not str
This is my /etc/salt/cloud.providers.d/aws.conf file
----
my-ec2-us-east-public-ips:
  # Set up the location of the salt master
  #
  minion:
    master: salt.example.com
  # Set up grains information, which will be common for all nodes
  # using this provider
  grains:
    node_type: broker
    release: 1.0.1
  # Specify whether to use public or private IP for deploy script.
  #
  # Valid options are:
  #     private_ips - The salt-cloud command is run inside the EC2
  #     public_ips - The salt-cloud command is run outside of EC2
  #
  ssh_interface: public_ips
  # Optionally configure the Windows credential validation number of
  # retries and delay between retries. This defaults to 10 retries
  # with a one second delay between retries
  win_deploy_auth_retries: 10
  win_deploy_auth_retry_delay: 1
  # Set the EC2 access credentials (see below)
  #
  id: "REDACTED"
  key: "REDACTED"
  # Make sure this key is owned by root with permissions 0400.
  #
  private_key: /etc/salt/my_test_key.pem
  keyname: my_test_key
  securitygroup: default
  # Optionally configure default region
  # Use salt-cloud --list-locations <provider> to obtain valid regions
  #
  location: us-east-1
  availability_zone: us-east-1a
  #
  ssh_username: ec2-user
  # Optionally add an IAM profile
  iam_profile: 'arn:aws:iam::REDACTED:user/bluethundr'
  driver: ec2
my-ec2-us-east-private-ips:
  # Set up the location of the salt master
  #
  minion:
    master: salt.example.com
  # Specify whether to use public or private IP for deploy script.
  #
  # Valid options are:
  #     private_ips - The salt-master is also hosted with EC2
  #     public_ips - The salt-master is hosted outside of EC2
  #
  ssh_interface: private_ips
  # Optionally configure the Windows credential validation number of
  # retries and delay between retries. This defaults to 10 retries
  # with a one second delay between retries
  win_deploy_auth_retries: 10
  win_deploy_auth_retry_delay: 1
  # Set the EC2 access credentials (see below)
  #
  id: "REDACTED"
  key: "REDACTED"
  # Make sure this key is owned by root with permissions 0400.
  #
  private_key: /etc/salt/my_test_key.pem
  keyname: my_test_key
  # This one should NOT be specified if VPC was not configured in AWS to be
  # the default. It might cause an error message which says that network
  # interfaces and an instance-level security groups may not be specified
  # on the same request.
  #
  securitygroup: default
  # Optionally configure default region
  #
  location: us-east-1
  availability_zone: us-east-1a
  # Configure which user to use to run the deploy script. This setting is
  # dependent upon the AMI that is used to deploy. It is usually safer to
  # configure this individually in a profile, than globally. Typical users
  # are:
  #
  # Amazon Linux -> ec2-user
  # RHEL -> ec2-user
  # CentOS -> ec2-user
  # Ubuntu -> ubuntu
  #
  ssh_username: ec2-user
  # Optionally add an IAM profile
  iam_profile: 'arn:aws:iam::REDACTED:user/bluethundr'
  driver: ec2
And this is my /etc/salt/cloud.profiles.d/aws_profiles.conf
base_ec2:
  provider: my-ec2-us-east-public-ips
  image: ami-869a9cee
  size: t2.micro
  ssh_username: ec2-user
base_ec2_private:
  provider: my-ec2-us-east-private-ips
  image: ami-869a9cee
  size: t2.micro
  ssh_username: ec2-user
base_ec2_public:
  provider: my-ec2-us-east-public-ips
  image: ami-e565ba8c
  size: t2.micro
  ssh_username: ec2-user
base_ec2_db:
  provider: my-ec2-us-east-public-ips
  image: ami-e565ba8c
  size: m1.xlarge
  ssh_username: ec2-user
  volumes:
    - { size: 10, device: /dev/sdf }
    - { size: 10, device: /dev/sdg, type: io1, iops: 1000 }
    - { size: 10, device: /dev/sdh, type: io1, iops: 1000 }
    - { size: 10, device: /dev/sdi, tags: {"Environment": "production"} }
  # optionally add tags to profile:
  tag: {'Environment': 'production', 'Role': 'database'}
  # force grains to sync after install
  sync_after_install: grains
base_ec2_vpc:
  provider: my-ec2-us-east-public-ips
  image: ami-a73264ce
  size: m1.xlarge
  ssh_username: ec2-user
  script: /etc/salt/cloud.deploy.d/user_data.sh
  network_interfaces:
    - DeviceIndex: 0
      PrivateIpAddresses:
        - Primary: True
      #auto assign public ip (not EIP)
      AssociatePublicIpAddress: True
      SubnetId: subnet-813d4bbf
      SecurityGroupId:
        - sg-750af413
  del_root_vol_on_destroy: True
  del_all_vol_on_destroy: True
  volumes:
    - { size: 10, device: /dev/sdf }
    - { size: 10, device: /dev/sdg, type: io1, iops: 1000 }
    - { size: 10, device: /dev/sdh, type: io1, iops: 1000 }
  tag: {'Environment': 'production', 'Role': 'database'}
  sync_after_install: grains
Here's some debug output of the command I'm trying to get working:
[root@salt:~] #salt-cloud -p base_ec2_public ops.example.com -l debug
[DEBUG ] Reading configuration from /etc/salt/cloud
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: salt.example.com
[DEBUG ] Missing configuration file: /etc/salt/cloud.providers
[DEBUG ] Including configuration from '/etc/salt/cloud.providers.d/aws.conf'
[DEBUG ] Reading configuration from /etc/salt/cloud.providers.d/aws.conf
[DEBUG ] Missing configuration file: /etc/salt/cloud.profiles
[DEBUG ] Including configuration from '/etc/salt/cloud.profiles.d/aws_profiles.conf'
[DEBUG ] Reading configuration from /etc/salt/cloud.profiles.d/aws_profiles.conf
[DEBUG ] Configuration file path: /etc/salt/cloud
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[INFO ] salt-cloud starting
[DEBUG ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG ] LazyLoaded parallels.avail_locations
[DEBUG ] LazyLoaded proxmox.avail_sizes
[DEBUG ] Could not LazyLoad saltify.destroy: 'saltify.destroy' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_sizes: 'saltify.avail_sizes' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_images: 'saltify.avail_images' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_locations: 'saltify.avail_locations' is not available.
[DEBUG ] LazyLoaded rackspace.reboot
[DEBUG ] LazyLoaded openstack.list_locations
[DEBUG ] LazyLoaded rackspace.list_locations
[DEBUG ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG ] LazyLoaded parallels.avail_locations
[DEBUG ] LazyLoaded proxmox.avail_sizes
[DEBUG ] Could not LazyLoad saltify.destroy: 'saltify.destroy' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_sizes: 'saltify.avail_sizes' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_images: 'saltify.avail_images' is not available.
[DEBUG ] Could not LazyLoad saltify.avail_locations: 'saltify.avail_locations' is not available.
[DEBUG ] LazyLoaded rackspace.reboot
[DEBUG ] LazyLoaded openstack.list_locations
[DEBUG ] LazyLoaded rackspace.list_locations
[DEBUG ] Using AWS endpoint: ec2.us-east-1.amazonaws.com
[DEBUG ] AWS Request: https://ec2.us-east-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
[DEBUG ] AWS Response Status Code: 401
[ERROR ] AWS Response Status Code and Error: [401 401 Client Error: Unauthorized] {'Errors': {'Error': {'Message': 'AWS was not able to validate the provided access credentials', 'Code': 'AuthFailure'}}, 'RequestID': '0f483305-6cb2-4c09-ae2f-ec804fd3beea'}
[DEBUG ] Failed to execute 'ec2.list_nodes()' while querying for running nodes: An error occurred while listing nodes: AWS was not able to validate the provided access credentials
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/salt/cloud/__init__.py", line 2383, in run_parallel_map_providers_query
cloud.clouds[data['fun']]()
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3496, in list_nodes
nodes = list_nodes_full(get_location())
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3346, in list_nodes_full
return _list_nodes_full(location)
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3436, in _list_nodes_full
instances['error']['Errors']['Error']['Message']
SaltCloudSystemExit: An error occurred while listing nodes: AWS was not able to validate the provided access credentials
[DEBUG ] Generating minion keys for 'ops.jokefire.com'
[DEBUG ] LazyLoaded cloud.fire_event
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Sending event - data = {'profile': 'base_ec2_public', 'event': 'starting create', '_stamp': '2016-09-13T19:24:13.555913', 'name': 'ops.jokefire.com', 'provider': 'my-ec2-us-east-public-ips:ec2'}
[INFO ] Creating Cloud VM ops.jokefire.com in us-east-1
[DEBUG ] Using AWS endpoint: ec2.us-east-1.amazonaws.com
[DEBUG ] AWS Request: https://ec2.us-east-1.amazonaws.com/?Action=DescribeAvailabilityZones&Filter.0.Name=region-name&Filter.0.Value.0=us-east-1&Version=2014-10-01
[DEBUG ] AWS Response Status Code: 401
[ERROR ] AWS Response Status Code and Error: [401 401 Client Error: Unauthorized] {'Errors': {'Error': {'Message': 'AWS was not able to validate the provided access credentials', 'Code': 'AuthFailure'}}, 'RequestID': 'e9912cf2-2e9b-496f-b607-4b9bae8b8938'}
[ERROR ] There was a profile error: string indices must be integers, not str
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/salt/cloud/cli.py", line 284, in run
self.config.get('names')
File "/usr/lib/python2.7/site-packages/salt/cloud/__init__.py", line 1454, in run_profile
ret[name] = self.create(vm_)
File "/usr/lib/python2.7/site-packages/salt/cloud/__init__.py", line 1284, in create
output = self.clouds[func](vm_)
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 2512, in create
data, vm_ = request_instance(vm_, location)
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 1742, in request_instance
az_ = get_availability_zone(vm_)
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 1094, in get_availability_zone
zones = _list_availability_zones(vm_)
File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 1242, in _list_availability_zones
ret[zone['zoneName']] = zone['zoneState']
TypeError: string indices must be integers, not str
Can someone take a stab at this and let me know why I'm getting auth failures? The AWS keys (redacted here) were taken straight from the AWS interface and copied into the cloud.providers file.
It seems the EC2 credentials are not being provided correctly. You may need to check the key and ID of the EC2 credentials, and the policy attached to them.
For the credentials, replace the "REDACTED" string with your real key/ID.
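The second error in the output, "string indices must be integers, not str", looks like a follow-on from the same 401 rather than a separate profile problem. The Python sketch below is a simplified illustration, not Salt's actual code: only the loop body is taken from the traceback, and the shape of the error result is an assumption inferred from the instances['error']['Errors']['Error']['Message'] line in the first traceback. Function names are hypothetical.

```python
def zone_states(result):
    """Mimics the failing line in the traceback: expects a list of zone dicts."""
    ret = {}
    for zone in result:
        ret[zone["zoneName"]] = zone["zoneState"]
    return ret


def zone_states_checked(result):
    """Same as above, but surfaces an API error instead of indexing into it."""
    if isinstance(result, dict) and "error" in result:
        message = result["error"]["Errors"]["Error"]["Message"]
        raise RuntimeError(f"EC2 query failed: {message}")
    return zone_states(result)


# Assumed shape of a failed query result (inferred from the traceback).
AUTH_FAILURE = {
    "error": {
        "Errors": {
            "Error": {
                "Message": "AWS was not able to validate the provided access credentials",
                "Code": "AuthFailure",
            }
        }
    }
}

try:
    zone_states(AUTH_FAILURE)  # iterating a dict yields its string keys
except TypeError as exc:
    print("unchecked:", exc)

try:
    zone_states_checked(AUTH_FAILURE)
except RuntimeError as exc:
    print("checked:", exc)
```

In other words, once the credentials are accepted by AWS, the TypeError should disappear along with the AuthFailure.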

connection issue with hazelcast on amazon AWS

I am using Hazelcast v3.6 on two Amazon AWS virtual machines (not using the AWS-specific settings for Hazelcast). The connection is supposed to work via the TCP/IP join settings (not multicast). I have opened ports 5701-5801 for connections on the virtual machines.
I have tested with iperf on the two virtual machines, and I can see that the client on one VM connects to the server on the other VM (and vice versa when I swap the client and server roles).
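The iperf check can also be repeated against the Hazelcast ports themselves. Here is a minimal Python sketch using only the standard library; the function names are hypothetical, and a successful TCP connect only shows that something is listening and reachable, not that a Hazelcast member will accept the join.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def scan(host, ports, timeout=2.0):
    """Map each port to whether a TCP connection succeeded."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, scan("22.23.24.25", range(5701, 5704)) run from the other VM would report which of the first three member ports accept connections.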
When I launch two Hazelcast servers on different VMs, the connection is not established. The log statements and the hazelcast.xml config are given below (I am not using programmatic configuration for Hazelcast). I have changed the IP addresses below:
20160401-16:41:02.812 [cached2] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5701, timeout: 0, bind-any: true
20160401-16:41:02.812 [cached3] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5703, timeout: 0, bind-any: true
20160401-16:41:02.813 [cached1] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5702, timeout: 0, bind-any: true
20160401-16:41:02.816 [cached1] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Could not connect to: /22.23.24.25:5702. Reason: SocketException[Connection refused to address /22.23.24.25:570
2]
20160401-16:41:02.816 [cached1] TcpIpJoiner INFO - [45.46.47.48]:5701 [dev] [3.6] Address[22.23.24.25]:5702 is added to the blacklist.
20160401-16:41:02.817 [cached3] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Could not connect to: /22.23.24.25:5703. Reason: SocketException[Connection refused to address /22.23.24.25:570
3]
20160401-16:41:02.817 [cached3] TcpIpJoiner INFO - [45.46.47.48]:5701 [dev] [3.6] Address[22.23.24.25]:5703 is added to the blacklist.
20160401-16:41:02.834 [cached2] TcpIpConnectionManager INFO - [45.46.47.48]:5701 [dev] [3.6] Established socket connection between /45.46.47.48:51965 and /22.23.24.25:5701
20160401-16:41:02.849 [hz._hzInstance_1_dev.IO.thread-in-0] TcpIpConnection INFO - [45.46.47.48]:5701 [dev] [3.6] Connection [Address[22.23.24.25]:5701] lost. Reason: java.io.EOFException[Remote socket closed!]
20160401-16:41:02.851 [hz._hzInstance_1_dev.IO.thread-in-0] NonBlockingSocketReader WARN - [45.46.47.48]:5701 [dev] [3.6] hz._hzInstance_1_dev.IO.thread-in-0 Closing socket to endpoint Address[54.89.161.228]:5701, Cause:java.io.EOFException: Remote socket closed!
20160401-16:41:03.692 [cached2] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5701, timeout: 0, bind-any: true
20160401-16:41:03.693 [cached2] TcpIpConnectionManager INFO - [45.46.47.48]:5701 [dev] [3.6] Established socket connection between /45.46.47.48:60733 and /22.23.24.25:5701
20160401-16:41:03.696 [hz._hzInstance_1_dev.IO.thread-in-1] TcpIpConnection INFO - [45.46.47.48]:5701 [dev] [3.6] Connection [Address[22.23.24.25]:5701] lost. Reason: java.io.EOFException[Remote socket closed!]
Part of the Hazelcast config:
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.6.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>abc</name>
        <password>defg</password>
    </group>
    <network>
        <port auto-increment="true" port-count="100">5701</port>
        <outbound-ports>
            <ports>0-5900</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false">
                <!--<multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>-->
            </multicast>
            <tcp-ip enabled="true">
                <member>22.23.24.25</member>
            </tcp-ip>
        </join>
        <interfaces enabled="true">
            <interface>45.46.47.48</interface>
        </interfaces>
        <ssl enabled="false" />
        <socket-interceptor enabled="false" />
        <symmetric-encryption enabled="false">
            <algorithm>PBEWithMD5AndDES</algorithm>
            <!-- salt value to use when generating the secret key -->
            <salt>thesalt</salt>
            <!-- pass phrase to use when generating the secret key -->
            <password>thepass</password>
            <!-- iteration count to use when generating the secret key -->
            <iteration-count>19</iteration-count>
        </symmetric-encryption>
    </network>
    <partition-group enabled="false"/>
iperf server and client log statements:
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 22.23.24.25, TCP port 5701
TCP window size: 1.33 MByte (default)
------------------------------------------------------------
[ 5] local 172.31.17.104 port 57398 connected with 22.23.24.25 port 5701
[ 4] local 172.31.17.104 port 5701 connected with 22.23.24.25 port 55589
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 662 MBytes 555 Mbits/sec
[ 4] 0.0-10.0 sec 797 MBytes 666 Mbits/sec
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local xxx.xx.xxx.xx port 5701 connected with 22.23.24.25 port 57398
------------------------------------------------------------
Client connecting to 22.23.24.25, TCP port 5701
TCP window size: 1.62 MByte (default)
------------------------------------------------------------
[ 6] local 172.31.17.23 port 55589 connected with 22.23.24.25 port 5701
[ ID] Interval Transfer Bandwidth
[ 6] 0.0-10.0 sec 797 MBytes 669 Mbits/sec
[ 4] 0.0-10.0 sec 662 MBytes 553 Mbits/sec
Note:
I forgot to mention that a Hazelcast client can connect to a server, i.e. when I use a Hazelcast client to connect to a single Hazelcast server node, the connection works just fine.
An outbound port range that includes 0 is interpreted by Hazelcast as "use ephemeral ports", so the <outbound-ports> element actually has no effect in your configuration. There is an associated test in the Hazelcast sources: https://github.com/hazelcast/hazelcast/blob/75251c4f01d131a9624fc3d0c4190de5cdf7d93a/hazelcast/src/test/java/com/hazelcast/nio/NodeIOServiceTest.java#L60
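If the intent was to restrict which local ports Hazelcast uses for outgoing connections, a minimal sketch would be a range that does not include 0. The 33000-35000 range below is only an illustrative assumption, not a value taken from the question; whatever range is chosen would also have to be allowed by the firewall or security group rules on both VMs. This only makes the <outbound-ports> element take effect and is not claimed to resolve the connection failure by itself.

```xml
<network>
    <port auto-increment="true" port-count="100">5701</port>
    <outbound-ports>
        <!-- Assumed example range; it must not contain 0, which Hazelcast
             treats as "use ephemeral ports" and thereby ignores the restriction -->
        <ports>33000-35000</ports>
    </outbound-ports>
</network>
```

Removing the <outbound-ports> element entirely would behave the same as the original 0-5900 range, since in both cases the operating system picks an ephemeral port.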