Random terraform dial tcp: lookup no such host errors - amazon-web-services

When I run terraform apply, I get random "terraform dial tcp: lookup no such host" errors. Sometimes they go away when I run apply again. Here are some examples:
Post "https://iam.amazonaws.com/": dial tcp: lookup iam.amazonaws.com on [2001:8a0:6f38:7c00::1]:53: no such host
Post "https://firehose.eu-west-1.amazonaws.com/": dial tcp: lookup firehose.eu-west-1.amazonaws.com on [2001:8a0:6f38:7c00::1]:53: no such host
Post "https://events.eu-west-1.amazonaws.com/": dial tcp: lookup events.eu-west-1.amazonaws.com on [2001:8a0:6f38:7c00::1]:53: no such host
I can't find any problem with my network, and definitely no problem looking up names outside of Terraform.
I sometimes also get this with terraform destroy.
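One way to check whether the lookups really are intermittent is to resolve the same hostnames repeatedly and count the failures. The following is only a minimal sketch: probe_dns and its parameters are hypothetical names, and it uses Python's system resolver, which may not behave the same way as the resolver inside Terraform (a Go program).

```python
import socket
import time
from collections import Counter


def probe_dns(hostnames, attempts=20, delay=0.5, resolver=socket.getaddrinfo):
    """Resolve each hostname `attempts` times and count failed lookups.

    `resolver` defaults to socket.getaddrinfo. It is a parameter only so the
    probe can be exercised without network access.
    """
    failures = Counter()
    for _ in range(attempts):
        for host in hostnames:
            try:
                resolver(host, 443)
            except socket.gaierror:
                # Name resolution failed for this attempt.
                failures[host] += 1
        time.sleep(delay)
    # Report every hostname, including those that never failed.
    return {host: failures[host] for host in hostnames}
```

Calling probe_dns(["iam.amazonaws.com", "firehose.eu-west-1.amazonaws.com"]) returns a dict of failure counts per hostname. A non-zero count would point at the resolver shown in the error (here the one listening on port 53) rather than at Terraform itself.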

Related

AWS GreenGrass connectivity issues

greengrass v1.8.1 log
[2019-08-02T03:30:28.743-04:00][WARN]-MQTT[client] dial tcp: lookup a1ijl5s1n8kf6a-ats.iot.us-east-1.amazonaws.com on 127.0.0.53:53: read udp 127.0.0.1:49567->127.0.$
[2019-08-02T03:30:28.743-04:00][WARN]-MQTT[client] Failed to connect to a broker
[2019-08-02T03:30:39.74-04:00][WARN]-MQTT[store] Trying to close memory store, but not open
[2019-08-02T03:30:39.74-04:00][WARN]-MQTT connection attempt failed and will be retried. {"attemptId": "nVwA", "clientId": "YoungsoftIoTGroup_Core", "errorStrin$
[2019-08-02T03:30:59.742-04:00][WARN]-MQTT[client] dial tcp: lookup a1ijl5s1n8kf6a-ats.iot.us-east-1.amazonaws.com on 127.0.0.53:53: read udp 127.0.0.1:49845->127.0.$
[2019-08-02T03:30:59.742-04:00][WARN]-MQTT[client] Failed to connect to a broker
[2019-08-02T03:31:10.741-04:00][WARN]-MQTT[store] Trying to close memory store, but not open
[2019-08-02T03:31:10.741-04:00][WARN]-MQTT connection attempt failed and will be retried. {"attemptId": "BAxL", "clientId": "YoungsoftIoTGroup_Core", "errorStrin$
[2019-08-02T03:31:30.742-04:00][WARN]-MQTT[client] dial tcp: lookup a1ijl5s1n8kf6a-ats.iot.us-east-1.amazonaws.com on 127.0.0.53:53: read udp 127.0.0.1:39151->127.0.$
[2019-08-02T03:31:30.742-04:00][WARN]-MQTT[client] Failed to connect to a broker
[2019-08-02T03:31:41.741-04:00][WARN]-MQTT[store] Trying to close memory store, but not open
[2019-08-02T03:31:41.741-04:00][WARN]-MQTT connection attempt failed and will be retried. {"attemptId": "mQbd", "clientId": "YoungsoftIoTGroup_Core", "errorStrin$
[2019-08-02T03:32:03.742-04:00][WARN]-MQTT[client] dial tcp: lookup a1ijl5s1n8kf6a-ats.iot.us-east-1.amazonaws.com on 127.0.0.53:53: read udp 127.0.0.1:46991->127.0.$
[2019-08-02T03:32:03.742-04:00][WARN]-MQTT[client] Failed to connect to a broker
[2019-08-02T03:32:14.741-04:00][WARN]-MQTT[store] Trying to close memory store, but not open
[2019-08-02T03:32:14.741-04:00][WARN]-MQTT connection attempt failed and will be retried. {"attemptId": "UDaF", "clientId": "YoungsoftIoTGroup_Core", "errorStrin$
[2019-08-02T03:32:41.743-04:00][WARN]-MQTT[client] dial tcp: lookup a1ijl5s1n8kf6a-ats.iot.us-east-1.amazonaws.com on 127.0.0.53:53: read udp 127.0.0.1:56790->127.0.$
My OS is Ubuntu 18.04 and I am using AWS IoT Greengrass v1.8.1.
I followed the README.md in each package to set up and install everything step by step. All Lambdas failed to connect to the AWS Greengrass group I created.
This was all working, and then all of a sudden I encountered this issue.

getsockopt: connection refused : Transferring logs from Filebeat to Logstash on other host

I'm trying to transfer logs from Filebeat to Logstash
Both are running on different EC2 instances in the same network.
Apparently the socks5 protocol is used instead of http.
This is my filebeat.yml config file
filebeat.prospectors:
- type: log
  paths:
    - /camel-logs/app.log
output.logstash:
  hosts: ["remote-host:5044"]
  proxy_url: socks5://10.0.0.10:5044
filebeat.inputs:
- type: log
  paths:
    - /camel-logs/app.log
Honestly, I don't really know whether I should use prospectors, inputs, or both here. Neither works for now.
I'm positive that TCP port 5044 between the two hosts is open and accessible, but I don't know if socks5 is even possible over TCP? My knowledge about this stuff is quite limited.
I'm getting this error:
pipeline/output.go:74 Failed to connect: dial tcp 10.0.0.10:5044: getsockopt: connection refused
This could also be interesting:
log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":24},"total":{"ticks":40,"time":52,"value":40},"user":{"ticks":20,"time":28}},"info":{"ephemeral_id":"192acef7-0adb-4fbb-adfe-90cade7a5498","uptime":{"ms":30011}},"memstats":{"gc_next":4194304,"memory_alloc":2166616,"memory_total":4100568,"rss":21409792}},"filebeat":{"events":{"active":334,"added":335,"done":1},"harvester":{"open_files":1,"running":1,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"logstash"},"pipeline":{"clients":1,"events":{"active":318,"filtered":17,"published":318,"retry":852,"total":335}}},"registrar":{"states":{"current":1,"update":1},"writes":2},"system":{"cpu":{"cores":2},"load":{"1":3.22,"15":0.47,"5":1.23,"norm":{"1":1.61,"15":0.235,"5":0.615}}}}}}
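The error shows Filebeat dialing 10.0.0.10:5044, which is the address given in proxy_url, so it may help to test separately whether anything accepts TCP connections on the proxy address and on the Logstash address. A minimal sketch, assuming it is run from the Filebeat host (can_connect is a hypothetical helper, not part of Filebeat):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution errors.
        return False
```

For example, can_connect("10.0.0.10", 5044) and can_connect("remote-host", 5044) can be compared. If the first returns False, nothing is listening on the address configured as the SOCKS5 proxy, which would match the "connection refused" message.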

Kafka broker on AWS - IP setup

I have installed Kafka on EC2. My problem is connecting to the broker from outside AWS. It all works for me from inside.
I can start the broker, and both kafka-console-producer and kafka-console-consumer work from the same server. I have ports 2181 and 9092 open to the remote location from which I would like to use the producer, i.e. my local development machine. If I telnet to port 9092 from there, it connects. If I try to use kafka-console-producer, I get this error:
[2017-03-09 15:04:44,971] ERROR Error when sending message to topic topic2 with key: null, value: 5 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topic2-0: 1521 ms has passed since batch creation plus linger time
I tried all sorts of combinations in the server.properties file with the keys listeners and advertised.listeners.
I would really appreciate some help...
It might be caused by the fact that the public hostname/IP of AWS machines cannot be used inside AWS. If so, you need to fudge a bit. Two things are needed:
Make sure you set advertised.listeners to your private address.
In your local /etc/hosts, map the private hostname of the AWS machine (e.g. ip-10-0-0-1.eu-west-1.compute.internal) to its public IP.
Then make sure you always use only the private hostname. This has been the root cause of many weird issues for me, none of which produced any logs.
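The /etc/hosts step can be illustrated with a short sketch. The helper names are hypothetical and the public IP 203.0.113.10 used below is only a documentation placeholder. Substitute the real public IP of the broker.

```python
def hosts_entry(public_ip, private_hostname):
    """Build the /etc/hosts line that maps the private hostname to the public IP."""
    return f"{public_ip}\t{private_hostname}"


def has_mapping(hosts_text, public_ip, private_hostname):
    """Check whether the contents of a hosts file already contain that mapping."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into the IP and its names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == public_ip and private_hostname in fields[1:]:
            return True
    return False
```

hosts_entry("203.0.113.10", "ip-10-0-0-1.eu-west-1.compute.internal") gives the line to append on the local machine, and has_mapping can be run against the text of /etc/hosts to confirm the line is present before starting the producer.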

Validating a host using SSL Wildcard certificates

My HTTPS Client uses Poco C++ to connect with our server, which uses a wildcard certificate (*.example.com). The connection fails with a CertificateValidationException and the error message is "Unacceptable certificate from x.y.z.w: application verification failure".
The weird thing is it doesn't ALWAYS fail, just most of the time. After much debugging, my hunch is it has something to do with topology (going across subnets, for example) or with how/when the host name is translated to an I.P. address.
I think this because in cases where everything works as expected, the local DNS is routing the host name. But in cases where it doesn't work (above error message), the host name translation is on a local box like my PC.
Is there a way to narrow down what's going on here? Is this a common or known problem?
Thanks.
I just ran into this same symptom myself. Using Poco::Net::HTTPSClientSession, I could connect just fine to non-wildcard sites, but failed with the exception and message mentioned above when connecting to *.example.com wildcard sites. Note however that the behavior I observed was 100% consistent, never intermittent.
After debugging through Poco source code, I found the problem in how the HTTPSClientConnection class sets itself up to perform certificate validation. I filed Poco issue #1303 on pocoproject.org, but the skinny is that if you create an HTTPSClientSession using the no-arg constructor, you will always get this exception when connecting to server with a wildcard cert. For example:
Poco::URI uri("https://blah.example.com"); // SSL cert is for *.example.com
Poco::Net::HTTPSClientSession session; // Note no-arg constructor
session.setHost(uri.getHost());
session.setPort(uri.getPort());
Poco::Net::HTTPRequest req;
// Populate req...
session.sendRequest(req); // Throws CertificateValidationException:
// "Unacceptable certificate from x.x.x.x,
// application verification failure"
The problem is that when validating the cert, the various Poco::Net classes look up the peer name as an IP address if the hostname hasn't been set already, and then subsequently try to match that IP addr against the wildcard cert CN (obviously, x.y.z.w will fail to match against *.example.com).
The good news is that there are several easy workarounds. The easiest is to just use the HTTPSClientSession(host, port) constructor, which will set the proper host name on the underlying SecureStreamSocket so that subsequent certificate validation matches a real hostname (blah.example.com) against the cert (CN=*.example.com), instead of an IP addr:
Poco::URI uri("https://blah.example.com");
Poco::Net::HTTPSClientSession session(uri.getHost(), uri.getPort()); // Calls SecureStreamSocket::setPeerHostName() internally
There are other workarounds, too: create your own SecureStreamSocket first, call setPeerHostName() on it, and then pass that into the appropriate HTTPSClientSession constructor, etc. See the issue tracker link above for a few more ideas if needed.
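To see why matching against an IP address cannot succeed, here is a simplified sketch of wildcard name matching. It is not Poco's implementation and it ignores many rules of real certificate validation. It only models a "*" standing in for the whole left-most label, and the function names are hypothetical.

```python
import ipaddress


def is_ip_address(name):
    """Return True if `name` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def matches_wildcard(pattern, name):
    """Simplified check of a peer name against a certificate name pattern."""
    if is_ip_address(name):
        # An IP address is never a valid match for a wildcard DNS name.
        return False
    pattern_labels = pattern.lower().split(".")
    name_labels = name.lower().split(".")
    if len(pattern_labels) != len(name_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard covers exactly one label; the rest must be equal.
        return pattern_labels[1:] == name_labels[1:]
    return pattern_labels == name_labels
```

Under this model, matches_wildcard("*.example.com", "blah.example.com") succeeds while matches_wildcard("*.example.com", "192.0.2.10") fails, which mirrors the difference between a session that knows its peer host name and one that only knows the peer IP.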

Zookeeper unable to listen on port 3888

I've got 3 servers on AWS, each with OpenJDK 7 and ZooKeeper 3.4.6. All have unique Elastic IPs.
Each conf/zoo.cfg has
clientPort=2181
server.1=server1:2888:3888
server.2=server2:2888:3888
server.3=server3:2888:3888
Then I start it with ./zkServer.sh start (it says STARTED)
and zookeeper.out says
2015-01-14 09:27:55,919 [myid:1] - INFO [Thread-1:QuorumCnxManager$Listener#504] - My election bind port: /server1ipaddress:3888
2015-01-14 09:27:55,920 [myid:1] - ERROR [/server1ipaddress:3888:QuorumCnxManager$Listener#517] - Exception while listening
java.net.BindException: Cannot assign requested address
at java.net.PlainSocketImpl.socketBind(Native Method)
at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
at java.net.ServerSocket.bind(ServerSocket.java:376)
at java.net.ServerSocket.bind(ServerSocket.java:330)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:507)
So it can't open the port.
I eventually opened all ports in the AWS security group to rule that out.
Telnet to 2181 with ruok gets imok.
Telnet to 2888 cannot connect: connection refused.
Telnet to 3888 cannot connect: connection refused.
netstat shows that nothing is blocking 2888 and 3888.
I've even tried this with ZooKeeper started on all 3 servers.
What's going on? How do I get those ports open for use?
Your problem is answered here.
In a few words: on each ZooKeeper machine, in your conf/zoo.cfg, you have to set the current server's address to 0.0.0.0.
For example: if you are currently on server1, the config should contain the following lines:
server.1=0.0.0.0:2888:3888
server.2=server2:2888:3888
server.3=server3:2888:3888
This step solved the problem in my case.
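Since the server lines differ on every machine, they can be generated per node. A minimal sketch, assuming the server ids and hostnames from the question (server_lines is a hypothetical helper, not part of ZooKeeper):

```python
def server_lines(hosts, my_id, quorum_port=2888, election_port=3888):
    """Build the server.N lines of zoo.cfg for the node whose myid is `my_id`.

    `hosts` maps each server id to its hostname. The local node's own entry
    is written as 0.0.0.0 so that it binds on all local interfaces.
    """
    lines = []
    for server_id, host in sorted(hosts.items()):
        address = "0.0.0.0" if server_id == my_id else host
        lines.append(f"server.{server_id}={address}:{quorum_port}:{election_port}")
    return lines
```

With hosts = {1: "server1", 2: "server2", 3: "server3"}, calling server_lines(hosts, 1) produces the three lines shown above for server1, and server_lines(hosts, 2) produces the variant for server2.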
Cross-verify the myid on every node against zoo.cfg. The same issue happened to me; on inspection, the myid values had been changed on 2 of the nodes.