AWS OpenVPN DNS not resolving - amazon-web-services

We're using an OpenVPN server on AWS, which we configured using this tutorial. However, when we connect to the VPN, the internet does not work because DNS is not resolving anything. When we switch the DNS to 8.8.8.8 in the configuration panel, everything works as expected.
We've tried reinstalling everything from scratch, but the problem remains. We used the standard OpenVPN AMI provided by AWS.
Our DNS is:
nameserver[0] : 172.31.0.2
When I ping this IP, this is the response:
Request timeout for icmp_seq 0
ping: sendto: No route to host
I've executed some commands to provide more information:
dig @127.0.0.1 google.com
; <<>> DiG 9.11.3-1ubuntu1.17-Ubuntu <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
dig google.com
; <<>> DiG 9.11.3-1ubuntu1.17-Ubuntu <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45371
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 124 IN A 142.250.185.238
;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Jul 19 07:30:15 UTC 2022
;; MSG SIZE rcvd: 55
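For reference, the 8.8.8.8 workaround can also be applied server-side so that every client gets it automatically. A sketch, assuming the stock OpenVPN server config layout (the file path may differ on the AWS AMI; `push "dhcp-option DNS …"` is the standard OpenVPN directive for handing clients a resolver):

```text
# /etc/openvpn/server.conf (sketch) -- push a resolver the clients can
# actually reach. The VPC resolver 172.31.0.2 is only reachable from
# inside the VPC, so VPN clients need either a route to it through the
# tunnel or a public resolver such as 8.8.8.8.
push "dhcp-option DNS 8.8.8.8"
```

Note also that the VPC resolver typically does not answer ICMP, so a ping timeout against 172.31.0.2 does not by itself prove DNS is unreachable.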

Related

GCP: Can't get Compute Engine VMs to contact a specific public DNS name server after applying a DNS outbound forwarding policy

TL;DR solution: dig (with +trace) cannot be used to verify DNS forwarding policies, because the resolution is performed directly against the root servers instead of going through the configured forwarder.
I am trying to create a DNS outbound forwarding policy for forwarding DNS requests to a specific name server. It is going to be implemented in an on-prem setup, but I am currently just testing it with public DNS servers such as 1.1.1.1. I have created the DNS outbound policy, but every request from inside VMs still uses the metadata server: 169.254.169.254
CURRENT SETUP:
I have created a VPC, defined a subnet, and created a VM connected to this subnet.
resource "google_compute_network" "vpc-hub" {
  name = "vpc-hub"
  depends_on = [
    google_project_iam_member.assign-roles,
    google_project_service.enable_apis
  ]
}

resource "google_compute_subnetwork" "vpc-subnet-hub" {
  name          = "vpc-subnet-hub"
  ip_cidr_range = "10.0.0.0/16"
  region        = "europe-west1"
  network       = google_compute_network.vpc-hub.id
  depends_on = [
    google_compute_network.vpc-hub
  ]
}

resource "google_compute_instance" "hub-vm" {
  name         = "hub-vm"
  machine_type = "e2-medium"
  zone         = "europe-west1-b"
  depends_on = [
    google_compute_subnetwork.vpc-subnet-hub
  ]
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }
  network_interface {
    subnetwork = google_compute_subnetwork.vpc-subnet-hub.name
    access_config {
    }
  }
  metadata_startup_script = file("${path.module}/startup_install.sh")
}
The startup-script installs dnsutils. Additionally, I have defined a firewall rule to be able to connect to it via SSH:
resource "google_compute_firewall" "ssh-rule-hub" {
  name    = "demo-ssh"
  network = google_compute_network.vpc-hub.name
  allow {
    protocol = "tcp"
    ports    = ["22"]
  }
  source_ranges = ["REDACTED"]
  depends_on = [
    google_compute_network.vpc-hub
  ]
}
ISSUE:
I have created a DNS forwarding policy by specifying the public name server 1.1.1.1:
resource "google_dns_policy" "dns-policy" {
  name                      = "dns-policy"
  enable_inbound_forwarding = false
  enable_logging            = false
  alternative_name_server_config {
    target_name_servers {
      ipv4_address    = "1.1.1.1"
      forwarding_path = "default"
    }
  }
  networks {
    network_url = google_compute_network.vpc-hub.id
  }
  depends_on = [
    google_compute_network.vpc-hub
  ]
}
When I SSH into the VM and use dig to send a DNS request for any domain, it always uses the internal metadata server within Compute Engine, and thus does not use 1.1.1.1 to retrieve the IP. Instead, the answer is always provided by the metadata server 169.254.169.254. Why is this the case? Output from dig can be seen below:
bash command: dig google.com
; <<>> DiG 9.10.3-P4-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60323
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 300 IN A 64.233.184.101
google.com. 300 IN A 64.233.184.139
google.com. 300 IN A 64.233.184.102
google.com. 300 IN A 64.233.184.100
google.com. 300 IN A 64.233.184.113
google.com. 300 IN A 64.233.184.138
;; Query time: 18 msec
;; SERVER: 169.254.169.254#53(169.254.169.254)
;; WHEN: Tue Apr 26 12:09:53 UTC 2022
;; MSG SIZE rcvd: 135
EDIT: output from bash command dig google.com +trace
; <<>> DiG 9.10.3-P4-Debian <<>> google.com +trace
;; global options: +cmd
. 510859 IN NS a.root-servers.net.
. 510859 IN NS b.root-servers.net.
. 510859 IN NS c.root-servers.net.
. 510859 IN NS d.root-servers.net.
. 510859 IN NS e.root-servers.net.
. 510859 IN NS f.root-servers.net.
. 510859 IN NS g.root-servers.net.
. 510859 IN NS h.root-servers.net.
. 510859 IN NS i.root-servers.net.
. 510859 IN NS j.root-servers.net.
. 510859 IN NS k.root-servers.net.
. 510859 IN NS l.root-servers.net.
. 510859 IN NS m.root-servers.net.
;; Received 239 bytes from 169.254.169.254#53(169.254.169.254) in 22 ms
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 86400 IN DS 30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com. 86400 IN RRSIG DS 8 1 86400 20220510050000 20220427040000 47671 . hGerEs3N471ZCOosNSuakxBfxXh8H+qPP9UxVBakXvVfgLofu40+aNyw X9tfaNxmyFP7LUJDRBrURhNjN1cOdJhbTqa54AXvTlPbd31N5MRF3ZHT seJJLCe8Hv2UYLrnLSzAArpD2M+N0XI+3A6wR8/fE4/q0NULX0gpKxS9 Y/zHpr/Mu/2I8DLmI8sE411vSlK2MFxWj2LfQ0TAocjnmqkZQK9GfqQ4 IEDjcc2OV41JlxSKYtAy9OI3HJPfXcIrmo4aO9Qvoe1ZKn1fu46IUYpo zx1n5NgX4Ou8kOgjePMkpGMlvjLVY2KMxYdln7v4FCRB7mb1WsBp5H+3 3Sv7vQ==
;; Received 1170 bytes from 192.33.4.12#53(c.root-servers.net) in 10 ms
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0Q1GIN43N1ARRC9OSM6QPQR81H5M9A NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20220503042354 20220426031354 37269 com. vPwsL+I4ptKlrtrsZXrs/Z3/1Ebd3eBCm5f73FcuoljLz9RHDPdxfGCd rHmK3oWXZIErmCLYy6NgcpiMEM+JiqTQm/mNFhVfoFTGicCB9JwGgA3p XfgAB6w/hsNvtqTVuFsszhq4+CjXf1+ky0hFzTHACJ3bPmEeMa80ASwX EHu+vpL9LdLSZrtS68G2ieGvjrwcn6EZfmfJIly3i8UU9g==
S84BKCIBC38P58340AKVNFN5KR9O59QC.com. 86400 IN NSEC3 1 1 0 - S84BUO64GQCVN69RJFUO6LVC7FSLUNJ5 NS DS RRSIG
S84BKCIBC38P58340AKVNFN5KR9O59QC.com. 86400 IN RRSIG NSEC3 8 2 86400 20220504051944 20220427040944 37269 com. G/tnTZRrBZApHzz+bWDsUrDBa6PcH5O15adPaWvIlvIMe1FQUYFRJeMm pzJ586JzBfz5bE+5IjU0uAh6AnWrDVtryUig0CUpMVDn8mm8Axa9QTu/ 1h3hwE6HbKx/4d7hNpZIuKGH98iMKJLaYdfJHFLRV85/AOT9HdL2ihtF t2wOZ3Px0Wo6QVxpwth2hO2iLw0k/tszErnHw6BqxEAebQ==
;; Received 836 bytes from 192.41.162.30#53(l.gtld-servers.net) in 8 ms
google.com. 300 IN A 142.251.5.101
google.com. 300 IN A 142.251.5.113
google.com. 300 IN A 142.251.5.100
google.com. 300 IN A 142.251.5.138
google.com. 300 IN A 142.251.5.102
google.com. 300 IN A 142.251.5.139
;; Received 135 bytes from 216.239.32.10#53(ns1.google.com) in 1 ms
The SERVER address 169.254.169.254 does not identify the server that actually produced the answer. That address is the metadata server, which knows how to forward internal (project-specific) and global domain names to the correct DNS server. It is a virtualized service that provides a number of services, such as DNS, DHCP, and NTP.
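Since dig's SERVER field will show 169.254.169.254 regardless, one way to get visibility into the policy is Cloud DNS query logging. A sketch of the same policy resource with logging flipped on (`enable_logging` is a real argument of `google_dns_policy`; exactly which fields appear in the logs is something to verify, but the logs at least confirm queries are flowing through the policy-attached network):

```hcl
# Same policy as above, with query logging enabled so DNS traffic on
# the attached network shows up in Cloud Logging. A plain `dig` from
# the VM still reports SERVER 169.254.169.254 either way, because all
# queries pass through the metadata resolver before being forwarded.
resource "google_dns_policy" "dns-policy" {
  name           = "dns-policy"
  enable_logging = true
  alternative_name_server_config {
    target_name_servers {
      ipv4_address    = "1.1.1.1"
      forwarding_path = "default"
    }
  }
  networks {
    network_url = google_compute_network.vpc-hub.id
  }
}
```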

Issue validating AWS ACM certification using Terraform

Disclaimer: I am new to both AWS and Terraform.
I am testing something out and when deploying my code, I keep running into the same error after running Terraform Apply
Error describing created certificate: Expected certificate to be issued but was in state PENDING_VALIDATION
I have tried running my code multiple times and it times out at 60 minutes each time. I cannot exceed 60 minutes due to a timeout of my session.
For this, there were two steps:
1. Create the hosted zone in account 2 (this was done by a colleague and completed successfully):
resource "aws_route53_zone" "<example>" {
  name = "<domain name>"
}
2. Create the A record, ACM cert, validation record, and validation object in account 1:
resource "aws_route53_record" "<service>" {
  provider = aws.account2
  zone_id  = data.terraform_remote_state.account2.outputs.route53_<example>_zone_id[0]
  name     = var.domain_name
  type     = "A"
  alias {
    name                   = aws_alb.<service>.dns_name
    zone_id                = aws_alb.<service>.zone_id
    evaluate_target_health = true
  }
}

resource "aws_acm_certificate" "<service>" {
  domain_name       = var.domain_name
  validation_method = "DNS"
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "<service>_validation" {
  provider = aws.account2
  for_each = {
    for dvo in aws_acm_certificate.<service>.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }
  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = data.terraform_remote_state.account2.outputs.route53_<example>_zone_id[0]
}

resource "aws_acm_certificate_validation" "<service>" {
  certificate_arn         = aws_acm_certificate.<service>.arn
  validation_record_fqdns = [for record in aws_route53_record.<service>_validation : record.fqdn]
  timeouts {
    create = "60m"
  }
}
I have looked at quite a few examples online and cannot figure out yet where I went wrong. This is the final piece of my Terraform Apply.
I have checked the AWS console on account 2 and saw that the subdomain hosted zone was created and contained the necessary NS record, as described here: https://aws.amazon.com/premiumsupport/knowledge-center/create-subdomain-route-53/
When I run dig A <domain name>, I get:
; <<>> DiG 9.16.1-Ubuntu <<>> A <domain name>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 48891
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;<domain name>. IN A
;; AUTHORITY SECTION:
<parent domain>. 300 IN SOA <xxxxxx>.com. <xxxx>.com. 1 14400 3600 604800 300
;; Query time: 100 msec
;; SERVER: x.x.x.x#53(x.x.x.x)
;; WHEN: Fri Apr 01 20:11:12 PDT 2022
;; MSG SIZE rcvd: 152

AWS ACM certificate not validating

Before I begin let me say that I read thoroughly all the stack overflow posts and resources in the appendix, and could not find a solution to my problem.
I am trying to create, validate and connect a subdomain through Route53 and AWS Certificate Manager. The subdomain is challenge.sre.mycompany.com.
The terraform plan looks something like this:
# module.project_challenge.module.challenge-certificate.aws_acm_certificate.cert will be created
+ resource "aws_acm_certificate" "cert" {
+ arn = (known after apply)
+ domain_name = "challenge.sre.mycompany.com"
+ domain_validation_options = [
+ {
+ domain_name = "challenge.sre.mycompany.com"
+ resource_record_name = (known after apply)
+ resource_record_type = (known after apply)
+ resource_record_value = (known after apply)
},
]
+ id = (known after apply)
+ status = (known after apply)
+ subject_alternative_names = (known after apply)
+ tags_all = (known after apply)
+ validation_emails = (known after apply)
+ validation_method = "DNS"
}
# module.project_challenge.module.challenge-certificate.aws_acm_certificate_validation.cert will be created
+ resource "aws_acm_certificate_validation" "cert" {
+ certificate_arn = (known after apply)
+ id = (known after apply)
+ validation_record_fqdns = (known after apply)
}
# module.project_challenge.module.challenge-certificate.aws_route53_record.cert["challenge.sre.mycompany.com"] will be created
+ resource "aws_route53_record" "cert" {
+ allow_overwrite = true
+ fqdn = (known after apply)
+ id = (known after apply)
+ name = (known after apply)
+ records = (known after apply)
+ ttl = 60
+ type = (known after apply)
+ zone_id = (known after apply)
}
# module.project_challenge.module.vpc.aws_route53_zone.public will be created
+ resource "aws_route53_zone" "public" {
+ arn = (known after apply)
+ comment = "Managed by Terraform"
+ force_destroy = false
+ id = (known after apply)
+ name = "sre.mycompany.com"
+ name_servers = (known after apply)
+ tags_all = (known after apply)
+ zone_id = (known after apply)
}
As you can see, it creates a public hosted zone, an ACM certificate, and even the validation record. The problem is that the certificate has been stuck in 'Pending Validation' for about 48 hours.
Some details:
The domain is registered through our production account, while I am working in our development account for this.
Both accounts are in the same AWS organisation (if this matters)
Terraform created a public hosted zone sre.mycompany.com with the following attributes:
sre.mycompany.com NS Records:
ns-001.awsdns-01.com.
ns-002.awsdns-02.net.
ns-003.awsdns-03.co.uk.
ns-004.awsdns-04.org.
sre.mycompany.com SOA Simple Record:
ns-001.awsdns-01.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
CNAME Simple Record
_g938534f3gfe03832h34.challenge.sre.mycompany.com _89432htieh4934hw043f.tkfpekghn.acm-validations.aws.
Obviously, the real values are obfuscated.
When I dig sre.mycompany.com or dig challenge.sre.mycompany.com I get:
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 16577
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
When I dig just mycompany.com I get:
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61857
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 5
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mycompany.com. IN A
;; ANSWER SECTION:
mycompany.com. 300 IN A <some-ip-hidden>
;; AUTHORITY SECTION:
mycompany.com. 169554 IN NS ns-555.awsdns-55.com.
mycompany.com. 169554 IN NS ns-666.awsdns-66.net.
mycompany.com. 169554 IN NS ns-777.awsdns-77.org.
mycompany.com. 169554 IN NS ns-888.awsdns-88.co.uk.
Notice that the nameservers here are different from the ones I see in the console of my terraform created hosted zone (scroll above ns-001.awsdns-01.com. etc)
I cannot seem to fetch the CNAME record from my terminal.
On the other hand, in AWS everything seems to work fine. When I go to Route 53 > Hosted zones > Test record, I do get the value of the CNAME record:
Response returned by Route 53
Response from Route 53 based on the following options.
Hosted zone: sre.mycompany.com
Record name: _g938534f3gfe03832h34.challenge.
Record type: CNAME
DNS response code: No Error
Protocol: UDP
Response returned by Route 53: _89432htieh4934hw043f.tkfpekghn.acm-validations.aws.
Finally, when I trace the query, the response is:
;; Received 888 bytes from <some-ip-hidden>#53(ns-666.awsdns-66.net) in 3 ms
mycompany.com. 169201 IN NS ns-666.awsdns-66.net.
mycompany.com. 169201 IN NS ns-777.awsdns-77.org.
mycompany.com. 169201 IN NS ns-888.awsdns-88.co.uk.
mycompany.com. 169201 IN NS ns-555.awsdns-55.com.
;; BAD (HORIZONTAL) REFERRAL
;; Received 888 bytes from <some-ip-hidden>#53(ns-888.awsdns-88.co.uk) in 4 ms
mycompany.com. 169201 IN NS ns-777.awsdns-77.org.
mycompany.com. 169201 IN NS ns-666.awsdns-66.net.
mycompany.com. 169201 IN NS ns-888.awsdns-88.co.uk.
mycompany.com. 169201 IN NS ns-555.awsdns-55.com.
;; BAD (HORIZONTAL) REFERRAL
;; Received 888 bytes from <some-ip-hidden>#53(ns-555.awsdns-55.com) in 4 ms
mycompany.com. 169201 IN NS ns-666.awsdns-66.net.
mycompany.com. 169201 IN NS ns-777.awsdns-77.org.
mycompany.com. 169201 IN NS ns-555.awsdns-55.com.
mycompany.com. 169201 IN NS ns-888.awsdns-88.co.uk.
;; BAD (HORIZONTAL) REFERRAL
;; Received 888 bytes from <some-ip-hidden>#53(ns-888.awsdns-88.co.uk) in 4 ms
mycompany.com. 169201 IN NS ns-777.awsdns-77.org.
mycompany.com. 169201 IN NS ns-666.awsdns-66.net.
mycompany.com. 169201 IN NS ns-888.awsdns-88.co.uk.
mycompany.com. 169201 IN NS ns-555.awsdns-55.com.
;; BAD (HORIZONTAL) REFERRAL
;; Received 888 bytes from <some-ip-hidden>#53(ns-777.awsdns-77.org) in 5 ms
mycompany.com. 169201 IN NS ns-777.awsdns-77.org.
mycompany.com. 169201 IN NS ns-888.awsdns-88.co.uk.
mycompany.com. 169201 IN NS ns-555.awsdns-55.com.
mycompany.com. 169201 IN NS ns-666.awsdns-66.net.
;; BAD (HORIZONTAL) REFERRAL
Key takeaways:
I cannot get the CNAME with any command from my terminal
The certificate is not validating
Appendix
Certificate in Pending state in AWS Certificate Manager
Certificate with DNS Validation is stuck in Pending Validation
AWS ACM certificate state is pending validation and not changing to issues
My domain is pending validation in AWS Certificate Manager
AWS ACM Stuck in Pending Validation Unless NS Changed in Domain
Resolve ACM certificate still pending
Everything your Terraform is creating is fine; however, when you create a new zone in AWS, you need to add its nameservers in the root DNS panel (most likely wherever you bought the domain mycompany.com).
You need to add an NS entry for the subdomain you want to use (the new zone you're creating).
You can reference this article: https://webmasters.stackexchange.com/questions/93897/can-i-use-different-nameservers-for-different-subdomains
When you have multiple Route53 hosted zones for a domain and its subdomains, you need to link them together.
This is done by adding the subdomain's nameservers as an NS record in the domain's hosted zone.
Adding records cannot break the domain's hosted zone; at worst you break the link to the subdomain's hosted zone.
So to clarify with an example, let's say you have a domain Route53 hosted zone for mycompany.com:

Record Name          Type  Value
mycompany.com        NS    ns-xxxx.org ns-xxx.uk
sre.mycompany.com    NS    ns-yyy.org ns-yyy.uk

The first row is created when you create the Route53 hosted zone. Afterwards, you need to take those nameservers and add them at your domain provider. This links the domain with AWS so it knows the zone is valid.
The second row you need to add manually after you have created your subdomain Route53 hosted zone (sre.mycompany.com); its values are the nameservers Route53 created for that subdomain zone. This tells Route53 that the domain (mycompany.com) owns the subdomain (sre.mycompany.com).
All of this needs to be done before any ACM certificates are created, because ACM's DNS validation tries to create a record in your domain or subdomain. If the domain or subdomain isn't linked to a valid domain, the ACM validation will fail.
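In Terraform, that manual "second row" delegation step could be sketched like this (hypothetical resource names; in a cross-account setup like the one in the question, the parent zone resource would additionally need the other account's provider alias):

```hcl
# NS delegation record in the PARENT hosted zone (mycompany.com),
# pointing at the nameservers Route53 assigned to the subdomain zone.
resource "aws_route53_record" "sre_delegation" {
  zone_id = aws_route53_zone.parent.zone_id          # mycompany.com zone
  name    = "sre.mycompany.com"
  type    = "NS"
  ttl     = 172800
  records = aws_route53_zone.subdomain.name_servers  # sre.mycompany.com zone
}
```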
Your CNAME in your zone file has mycompany.com on the end. That's not the normal way to write a CNAME inside a zone; it should probably be:
CNAME Simple Record
_g938534f3gfe03832h34.challenge.sre _89432htieh4934hw043f.tkfpekghn.acm-validations.aws.
If you include mycompany.com in the record name, the name that actually gets resolved is _g938534f3gfe03832h34.challenge.sre.mycompany.com.mycompany.com.
The only way I have found to get correct validation records into Route53 through Terraform looks like this:
resource "aws_route53_record" "cert-verify" {
  for_each = {
    for dvo in aws_acm_certificate.cert_name.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  name    = each.value.name
  records = [each.value.record]
  ttl     = 60
  type    = each.value.type
  zone_id = aws_route53_zone.zone.zone_id
}
It makes for a messy state file, but it works.

Why my request works locally and does not on heroku?

I'm running rake send_events locally and heroku run rake send_events on heroku.
values = {
  sendSmsRequest: {
    from: "ABC",
    to: "5581999999999",
    msg: "msg",
    callbackOption: "NONE",
    id: "c_1541"
  }
}
headers = {
  :content_type  => 'application/json',
  :authorization => 'Basic xxxxxxxxxxxxxxxxxxxxxxxx',
  :accept        => 'application/json'
}
RestClient.post 'https://api-rest.zenvia360.com.br/services/send-sms', values.to_json, headers
The log prints:
RestClient::Exceptions::OpenTimeout
Thanks.
OpenTimeout means that rest-client timed out trying to open a connection to the server.
Are you sure that the server is up and reachable from heroku? I'm not able to reach it from my computer or any server that I tried with.
$ nc -w 10 -v api-rest.zenvia360.com.br 443
nc: connect to api-rest.zenvia360.com.br port 443 (tcp) timed out

connection issue with hazelcast on amazon AWS

I am using Hazelcast v3.6 on two Amazon AWS virtual machines (not using the AWS-specific settings for Hazelcast). The connection is supposed to work via the TCP/IP connection settings (not multicast). I have opened ports 5701-5801 for connections on the virtual machines.
I have tried iperf between the two virtual machines, and I can see that the client on one VM connects to the server on the other VM (and vice versa when I swap the client/server roles for iperf).
When I launch two Hazelcast servers on different VMs, the connection is not established. The log statements and the hazelcast.xml config are given below (I am not using the programmatic settings for Hazelcast). I have changed the IP addresses below:
20160401-16:41:02.812 [cached2] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5701, timeout: 0, bind-any: true
20160401-16:41:02.812 [cached3] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5703, timeout: 0, bind-any: true
20160401-16:41:02.813 [cached1] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5702, timeout: 0, bind-any: true
20160401-16:41:02.816 [cached1] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Could not connect to: /22.23.24.25:5702. Reason: SocketException[Connection refused to address /22.23.24.25:5702]
20160401-16:41:02.816 [cached1] TcpIpJoiner INFO - [45.46.47.48]:5701 [dev] [3.6] Address[22.23.24.25]:5702 is added to the blacklist.
20160401-16:41:02.817 [cached3] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Could not connect to: /22.23.24.25:5703. Reason: SocketException[Connection refused to address /22.23.24.25:5703]
20160401-16:41:02.817 [cached3] TcpIpJoiner INFO - [45.46.47.48]:5701 [dev] [3.6] Address[22.23.24.25]:5703 is added to the blacklist.
20160401-16:41:02.834 [cached2] TcpIpConnectionManager INFO - [45.46.47.48]:5701 [dev] [3.6] Established socket connection between /45.46.47.48:51965 and /22.23.24.25:5701
20160401-16:41:02.849 [hz._hzInstance_1_dev.IO.thread-in-0] TcpIpConnection INFO - [45.46.47.48]:5701 [dev] [3.6] Connection [Address[22.23.24.25]:5701] lost. Reason: java.io.EOFException[Remote socket closed!]
20160401-16:41:02.851 [hz._hzInstance_1_dev.IO.thread-in-0] NonBlockingSocketReader WARN - [45.46.47.48]:5701 [dev] [3.6] hz._hzInstance_1_dev.IO.thread-in-0 Closing socket to endpoint Address[54.89.161.228]:5701, Cause:java.io.EOFException: Remote socket closed!
20160401-16:41:03.692 [cached2] InitConnectionTask INFO - [45.46.47.48]:5701 [dev] [3.6] Connecting to /22.23.24.25:5701, timeout: 0, bind-any: true
20160401-16:41:03.693 [cached2] TcpIpConnectionManager INFO - [45.46.47.48]:5701 [dev] [3.6] Established socket connection between /45.46.47.48:60733 and /22.23.24.25:5701
20160401-16:41:03.696 [hz._hzInstance_1_dev.IO.thread-in-1] TcpIpConnection INFO - [45.46.47.48]:5701 [dev] [3.6] Connection [Address[22.23.24.25]:5701] lost. Reason: java.io.EOFException[Remote socket closed!]
Part of Hazelcast config
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.6.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>abc</name>
        <password>defg</password>
    </group>
    <network>
        <port auto-increment="true" port-count="100">5701</port>
        <outbound-ports>
            <ports>0-5900</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false">
                <!--<multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>-->
            </multicast>
            <tcp-ip enabled="true">
                <member>22.23.24.25</member>
            </tcp-ip>
        </join>
        <interfaces enabled="true">
            <interface>45.46.47.48</interface>
        </interfaces>
        <ssl enabled="false" />
        <socket-interceptor enabled="false" />
        <symmetric-encryption enabled="false">
            <algorithm>PBEWithMD5AndDES</algorithm>
            <!-- salt value to use when generating the secret key -->
            <salt>thesalt</salt>
            <!-- pass phrase to use when generating the secret key -->
            <password>thepass</password>
            <!-- iteration count to use when generating the secret key -->
            <iteration-count>19</iteration-count>
        </symmetric-encryption>
    </network>
    <partition-group enabled="false"/>
iperf server and client log statements
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 22.23.24.25, TCP port 5701
TCP window size: 1.33 MByte (default)
------------------------------------------------------------
[ 5] local 172.31.17.104 port 57398 connected with 22.23.24.25 port 5701
[ 4] local 172.31.17.104 port 5701 connected with 22.23.24.25 port 55589
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 662 MBytes 555 Mbits/sec
[ 4] 0.0-10.0 sec 797 MBytes 666 Mbits/sec
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local xxx.xx.xxx.xx port 5701 connected with 22.23.24.25 port 57398
------------------------------------------------------------
Client connecting to 22.23.24.25, TCP port 5701
TCP window size: 1.62 MByte (default)
------------------------------------------------------------
[ 6] local 172.31.17.23 port 55589 connected with 22.23.24.25 port 5701
[ ID] Interval Transfer Bandwidth
[ 6] 0.0-10.0 sec 797 MBytes 669 Mbits/sec
[ 4] 0.0-10.0 sec 662 MBytes 553 Mbits/sec
Note:
I forgot to mention that I can connect from a Hazelcast client to a server, i.e. when I use a Hazelcast client to connect to a single Hazelcast server node, I am able to connect just fine.
An outbound ports range which includes 0 is interpreted by Hazelcast as "use ephemeral ports", so the <outbound-ports> element actually has no effect in your configuration. There is an associated test in the Hazelcast sources: https://github.com/hazelcast/hazelcast/blob/75251c4f01d131a9624fc3d0c4190de5cdf7d93a/hazelcast/src/test/java/com/hazelcast/nio/NodeIOServiceTest.java#L60
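So if the intent really is to restrict outbound ports, the configured range must not contain 0. A sketch of the relevant fragment (the concrete range here is illustrative, not a recommendation):

```xml
<outbound-ports>
    <!-- a range containing 0 means "use ephemeral ports", which
         disables the restriction; start the range above 0 to actually
         constrain which local ports Hazelcast binds for outbound
         connections -->
    <ports>5701-5900</ports>
</outbound-ports>
```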