GCP IPSEC Issues - google-cloud-platform

Since HA VPNs were introduced, it has been a nightmare to get tunnels up and working again, and the errors do not make sense.
When adding the tunnel to the gateway, I get:
"Invalid value for field 'resource.destRange': ''. is not a valid IPv4 address range"
Since when is 192.168.1.0/24 not CIDR? Looking at the logs, it ends up with:
"establishing IKE_SA failed, peer not responding"
Yet I can see the traffic flowing. What did they do?
I created 10 tunnels: 5 work, 5 don't.
Can someone explain what's going on?

It seems there is an internal issue when configuring IKEv1 with route-based VPN tunnels. This will be fixed in the next few days.
If this is your case, as a workaround you can try using either IKEv2 or a policy-based VPN.
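
One detail worth noting in the error quoted in the question: the rejected value is shown as '' (an empty string), not 192.168.1.0/24, which may mean the range never reached the API. A minimal sketch using Python's standard ipaddress module shows the difference between the two values; the helper name is_valid_ipv4_cidr is only for illustration and this is not how GCP validates the field.

```python
import ipaddress


def is_valid_ipv4_cidr(value: str) -> bool:
    """Return True if value parses as an IPv4 network in CIDR notation."""
    try:
        # strict=True also rejects ranges with host bits set, e.g. 192.168.1.5/24
        ipaddress.IPv4Network(value, strict=True)
    except ValueError:
        return False
    return True
```

Calling is_valid_ipv4_cidr("192.168.1.0/24") accepts the range from the question, while is_valid_ipv4_cidr("") rejects the empty string that the error message actually reports.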

Related

My SSH session into my VM Cloud is suddenly lagging

Every day I log into an SSH session on a Google Cloud VM I maintain (Debian).
About a week ago, I noticed lag as I typed into the VM or did anything else. I mostly log into this VM to check the log files of my scheduled scripts, and even "cat script.log", which used to take less than 2 seconds, now takes at least 5 to 7 seconds to load the log text.
Pinging different websites gives me a reasonable 10 - 15 ms. I'm pretty sure it's not my local connection either; everything else I do works fine on my local computer.
A warning has now started to appear in my session, saying
"Please consider adding the IAP-secured Tunnel User IAM role to start using Cloud IAP for TCP forwarding for better performance."
I've already configured the IAP-secured tunnel for my account, which is the owner account of the GCP project.
A coworker of mine can access the VM without any performance issues whatsoever.
In my opinion, your issue is with the ISP. For some reason the SSH sessions are lagging.
That's why even other computers using your home ISP lag in SSH sessions too. If a firewall rule were interfering, you wouldn't be able to connect at all.
You may try resetting all the network hardware in your home, and if that doesn't help,
run the tracert command in a Windows shell, then contact your ISP and pass on your findings. It's possible the problem is on their end (and if not, maybe with their upstream ISP, etc.).
To solve the problem, you need to add the "IAP-secured Tunnel User" role at the project level in IAM for that user. See the instructions here in a blog I wrote about this. That should solve your problem.
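
For anyone who prefers the command line to the console, here is a rough sketch that builds the equivalent gcloud command. It assumes the gcloud CLI is installed and that the "IAP-secured Tunnel User" role corresponds to the role ID roles/iap.tunnelResourceAccessor; the function name and the project and email values are placeholders, not anything from the original post.

```python
import shlex

# Role ID assumed to correspond to "IAP-secured Tunnel User".
IAP_TUNNEL_ROLE = "roles/iap.tunnelResourceAccessor"


def iap_binding_command(project_id: str, user_email: str) -> list[str]:
    """Build the gcloud argv that grants the IAP tunnel role at project level."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=user:{user_email}",
        f"--role={IAP_TUNNEL_ROLE}",
    ]


def as_shell_line(argv: list[str]) -> str:
    """Render the argv as a single shell-quoted line for copy and paste."""
    return shlex.join(argv)
```

Printing as_shell_line(iap_binding_command("my-project", "me@example.com")) gives a line you can review before running it yourself; the sketch deliberately does not execute anything.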

I can't connect to my aws instance anymore

I've been running Tomcat on my Amazon EC2 instance for a few weeks just fine, but all of a sudden I became unable to connect to it. When I use PuTTY I can connect to it fine, but when I try to connect with my browser using ip:8080, I can't connect anymore. I've tried restarting the instance (and, of course, adjusting the IP I enter accordingly), restarting the Tomcat server within the instance, and checking the security groups. Nothing seems to work. I have no idea why it stopped working out of the blue. How should I proceed?
There are many reasons why you might be unable to connect. The best way to solve this is to follow Amazon's troubleshooting tips, found here.
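
Before going through a full troubleshooting guide, it can help to confirm whether the port is reachable at all from your machine, independent of the browser. This is a minimal sketch using only Python's standard socket module; the function name port_is_open is made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If port_is_open(your_instance_ip, 22) is True but port_is_open(your_instance_ip, 8080) is False, that points toward the security group, a firewall, or Tomcat not listening, rather than the instance being unreachable.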
I had the same issue. The Amazon troubleshooting did not help. Then I remembered that I had added a line to my system32/drivers/etc/hosts file when I did some site migration.
I deleted that line and was back in business again.
Hope that helps.
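
To check for a leftover hosts entry like the one described above, something along these lines could scan the file contents for a given name. It is only a sketch: the function name is hypothetical, and you would read the file yourself (on Windows it normally lives under C:\Windows\System32\drivers\etc\hosts) and pass its text in.

```python
def hosts_overrides(hosts_text: str, hostname: str) -> list[str]:
    """Return the IP addresses that the hosts file text maps to hostname."""
    wanted = hostname.lower()
    matches = []
    for line in hosts_text.splitlines():
        # Drop anything after '#', then split into "ip name [alias ...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (name.lower() for name in fields[1:]):
            matches.append(fields[0])
    return matches
```

A non-empty result for the name you type into the browser means the hosts file is overriding DNS for it.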

EC2 Database through Laravel Forge has stopped being accessible

I've been running an EC2 instance through Laravel Forge for about 2000 hours, and this morning I got this error while trying to reach it:
SQLSTATE[08006] [7] could not connect to server: Connection refused Is
the server running on host "172...***" and accepting TCP/IP
connections on port 5432?
After SSHing into the server, I get a similar error when trying to run a command. I've dug through AWS but don't see any errors being thrown. I double-checked the IP address of the instance to make sure it hadn't changed for any reason. Of course, I'm a little behind on my backups for the application, so I'm hoping someone might have some ideas about what else I can do to try to access this data. I haven't made any changes to the app in about 10 days, but I found the error while I was pushing an update. I have six other instances of the same app that weren't affected (thankfully), which makes me even more confused about the cause of the issue.
In case anyone comes across a similar issue, here's what had happened. I had an error running in the background which had filled up the EC2 instance's hard drive with log output. Since the default Laravel/Forge image has a DB running within the EC2 instance, once it ran out of room everything stopped working. I was able to SSH in and delete the log, though, and everything started working again.
To prevent the issue from happening again, I then created an Amazon RDS instance and used that rather than the database on the EC2 instance. It's about three or four times the price of just an EC2 instance, but still not that much, and the confidence I now have in the system is well worth it.
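
If you stay on a single instance, a small check like the following could warn you before the disk fills up again. This is a sketch using Python's standard shutil.disk_usage; the function names and the 90% threshold are arbitrary choices for illustration, and you would still need to schedule it and hook it up to some alerting yourself.

```python
import shutil


def disk_used_fraction(path: str = "/") -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem at path that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_nearly_full(path: str = "/", threshold: float = 0.9) -> bool:
    """Return True if the used fraction is at or above threshold."""
    return disk_used_fraction(path) >= threshold
```

Running disk_nearly_full("/") from a cron job and sending yourself a message when it returns True would have flagged the runaway log before the database stopped accepting connections.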

EC2 server can't resolve hostnames

When trying to resolve a hostname (e.g. using dig), the server almost always fails, saying ;; connection timed out; no servers could be reached. Around one in ten attempts works, usually after a long wait.
The strange thing is that the same behavior also happens when I query a different DNS server (Google's).
My default nameserver is Amazon's, 172.31.0.2. I get this one automatically when the server connects using DHCP.
Pinging the IPs (8.8.8.8 & 172.31.0.2) also usually fails.
I've tried checking the VPC settings and security group settings, but found nothing. Also the fact it works every once in a while makes me even more confused.
The problem disappeared by itself after around 48 hours. I don't know how to analyze the issue further, so I'm closing this question. I can't think of anything about the server or AWS configuration that could have caused this, so I assume it was something in AWS's infrastructure.
Thanks
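
For intermittent failures like this, it may help to put a number on "one in ten" so you have something concrete to show AWS support. A rough sketch, assuming the system resolver is what you want to test (unlike dig, socket.gethostbyname goes through the OS resolver and may be affected by local caching); the function name and parameters are made up for illustration.

```python
import socket
import time


def resolution_success_rate(hostname, attempts=10, resolve=socket.gethostbyname, pause=0.0):
    """Try to resolve hostname several times and return the fraction that succeeded."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    successes = 0
    for _ in range(attempts):
        try:
            resolve(hostname)
            successes += 1
        except OSError:
            # socket.gaierror and timeouts are both OSError subclasses.
            pass
        time.sleep(pause)
    return successes / attempts
```

Logging the result of resolution_success_rate("example.com", attempts=20, pause=1.0) every few minutes would show whether the failure rate changes over time or clears up on its own.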

Custom Services and NATS Connection Issues

I am creating a custom service on a single-node instance of CloudFoundry which I built from vcap_dev_setup. I have followed these instructions to get an idea of what to do when creating new services.
When I try to start the new service gateway by running 'vcap_dev start service_gateway', I get the following error:
Exiting due to NATS error: Could not connect to server on nats://nats:nats#172.16.4.146:4222/
The configuration for the :mbus property on the service_gateway is fine and is identical to that of all of the other services which start without issue.
Does anyone know of any reason why a single service could not connect to nats correctly assuming the configuration is correct?
Thanks
Chris
I am not sure why this would be the case, assuming other services are able to connect to NATS.
If you are willing to share your changes to VCAP as a patch, I will happily take a look. What service are you looking to integrate?
I would also advise posting your query to the VCAP dev google group at https://groups.google.com/a/cloudfoundry.org/forum/?fromgroups#!forum/vcap-dev
Make sure you have NATS running on the IP address: 172.16.4.146 Port: 4222
The IP address should most likely be your localhost. DHCP has most likely assigned an IP address other than 172.16.4.146. Make sure your computer has 172.16.4.146 as its IP address. You can check that by running ifconfig.
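
As an alternative to reading through ifconfig output, one way to test whether an address is assigned to the local machine is to try binding a socket to it. This is a sketch only: the function name is hypothetical, and on hosts configured to allow non-local binds the check can report True for addresses the machine does not actually own.

```python
import socket


def address_is_local(ip: str) -> bool:
    """Return True if a TCP socket can be bound to ip on this host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, 0))  # port 0 lets the OS pick any free port
        except OSError:
            # Typically "cannot assign requested address" when ip is not ours.
            return False
    return True
```

If address_is_local("172.16.4.146") returns False on the CloudFoundry node, the :mbus setting is pointing at an address the machine no longer has.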
This tricky problem may be caused by a version conflict that produces a misleading exception; you could try the latest code.