How to fix - Error "client: etcd cluster is unavailable or misconfigured"



If you are setting up a Kubernetes cluster with Kubespray, you may get stuck during the setup with the error client: etcd cluster is unavailable or misconfigured.

Before scrolling down for the solution, check the logs again to see whether the message ends with connection refused or starts with subnet watch failed client:, because the two errors have different solutions.

Solution 1 - etcd cluster is unavailable or misconfigured getsockopt: connection refused

Solution 2 - subnet watch failed client: etcd cluster is unavailable or misconfigured getsockopt

(Solution 1) How I fixed : etcd cluster is unavailable or misconfigured getsockopt: connection refused

Here is the error log I got while setting up my cluster. I am assuming you are getting something similar -

fatal: [node1]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/etcdctl --no-sync
--endpoints=https://192.168.140.191:2379,https://192.168.140.192:2379,https://192.168.140.193:2379
member list | grep -q 192.168.140.191", "delta": "0:00:00.020942", "end": "2018-05-13 18:28:37.103184",
"msg": "non-zero return code", "rc": 1, "start": "2018-05-13 18:28:37.082242", "stderr": "client:
etcd cluster is unavailable or misconfigured; error #0: dial tcp 192.168.140.191:2379:
getsockopt: connection refused\n; error #1: dial tcp 192.168.140.192:2379: getsockopt: no route to host\n;
error #2: dial tcp 192.168.140.193:2379: getsockopt: no route to host",
"stderr_lines": ["client: etcd cluster is unavailable or misconfigured;
error #0: dial tcp 192.168.140.191:2379: getsockopt: connection refused", ";
error #1: dial tcp 192.168.140.192:2379: getsockopt: no


Issue

After spending a day checking the logs and pod status and doing a ton of Googling, I still couldn't find the issue. When I looked at the logs again the next day, I realised it was a connection refused error that I had not paid attention to initially.

The problem was with the firewall and ports. If you look carefully at the error, etcdctl is calling a REST web service (the etcd client API) at https://192.168.140.191:2379, and at the same port on the other two nodes. If that call cannot get through, it fails with connection refused or no route to host, prefixed with etcd cluster is unavailable or misconfigured. A no route to host error on an otherwise reachable machine is typically what a firewall that rejects the connection produces.
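
To see at a glance which endpoint failed and why, the stderr text can be reduced to one line per endpoint. This is only a minimal sketch: summarize_dial_errors is a hypothetical helper name, and the sample string is the stderr from the log above with the literal \n sequences removed.

```shell
# Sample stderr, copied from the log above (literal "\n" sequences removed).
err='client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 192.168.140.191:2379: getsockopt: connection refused; error #1: dial tcp 192.168.140.192:2379: getsockopt: no route to host; error #2: dial tcp 192.168.140.193:2379: getsockopt: no route to host'

# summarize_dial_errors (hypothetical helper): prints "host:port -> reason"
# for every "dial tcp" failure found in the text passed as $1.
summarize_dial_errors() {
  printf '%s\n' "$1" \
    | grep -oE 'dial tcp [0-9.]+:[0-9]+: getsockopt: [a-z ]+' \
    | sed -E 's/^dial tcp ([0-9.]+:[0-9]+): getsockopt: (.*)$/\1 -> \2/'
}

summarize_dial_errors "$err"
```

For the sample above this prints three lines, one for each etcd endpoint, which makes it easy to see that every node is affected and not just node1.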



How to fix

To make things work, I updated the firewalld settings on all my nodes and opened port 2379.

Ultimately I opened the following port ranges in the firewall on all 3 nodes (Master, Worker1, Worker2) -

  1. 2379-2380
  2. 10250-10255

Here are the Linux commands you can use to open these port ranges (the line after each firewall-cmd --list-ports is its output) -

firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250-10255/tcp
firewall-cmd --reload
firewall-cmd --list-ports
6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 30000-32767/tcp 6783/tcp 21/tcp

firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=10255/tcp
firewall-cmd --permanent --add-port=30000-32767/tcp
firewall-cmd --permanent --add-port=6783/tcp
firewall-cmd --reload
firewall-cmd --list-ports
10255/tcp 10250/tcp 30000-32767/tcp 6783/tcp
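
After reloading the firewall, it may be worth confirming from another node that the ports really are reachable. The sketch below is one way to do that, assuming bash and the coreutils timeout and seq commands are available; expand_ports, probe_port and check_node are hypothetical helper names, not part of Kubespray or firewalld.

```shell
# expand_ports: turns specs like "2379-2380 6443" into one port per line.
expand_ports() {
  for spec in "$@"; do
    case "$spec" in
      *-*) seq "${spec%-*}" "${spec#*-}" ;;  # a range such as 2379-2380
      *)   echo "$spec" ;;                    # a single port
    esac
  done
}

# probe_port: prints "host:port open" or "host:port closed".
# Uses bash's /dev/tcp pseudo-device and gives up after 2 seconds, so
# "closed" covers refused, rejected and silently dropped connections.
probe_port() {
  if timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}

# check_node: probes every port in the given specs on one host.
check_node() {
  host=$1
  shift
  for port in $(expand_ports "$@"); do
    probe_port "$host" "$port"
  done
}
```

For example, running check_node 192.168.140.191 2379-2380 6443 from one of the worker nodes probes the etcd and API server ports on the master used in this post. A port also shows as closed when nothing is listening on it yet, so check again once the services are running before blaming the firewall.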


If the firewalld settings above do not work for you, you can also disable the firewall with the following command.

systemctl stop firewalld && systemctl disable firewalld

I would not recommend this approach in production at all. Before disabling the firewall in a production environment, consult your sysadmin for security reasons.


(Solution 2) How I fixed : subnet watch failed client: etcd cluster is unavailable or misconfigured

This error is a little tricky. I faced it while trying to set up a Kubernetes cluster on a cloud service such as AWS (Amazon Web Services) or GCP (Google Cloud Platform).

The error I got was in the flanneld logs -

Watch subnets: client: etcd cluster is unavailable or misconfigured


How to fix

To fix this issue, first check which etcd API you are using. In my case it did not work with etcd API v2; I had to use etcd API v3 instead. See the related GitHub thread for more details.
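
One way to check the API version is through etcdctl, which picks the API it speaks from the ETCDCTL_API environment variable. The sketch below only prints the health-check command for each version instead of running it. etcd_health_cmd is a hypothetical helper name, the endpoint is the one from the log earlier in this post, and the TLS certificate flags your cluster needs are left out.

```shell
# etcd_health_cmd (hypothetical helper): prints the etcdctl health check
# for the requested API version ($1) against the given endpoints ($2).
# TLS flags are omitted here; add the ones your cluster requires.
etcd_health_cmd() {
  case "$1" in
    2) echo "ETCDCTL_API=2 etcdctl --endpoints=$2 cluster-health" ;;
    3) echo "ETCDCTL_API=3 etcdctl --endpoints=$2 endpoint health" ;;
    *) echo "unknown etcd API version: $1" >&2; return 1 ;;
  esac
}

etcd_health_cmd 2 https://192.168.140.191:2379
etcd_health_cmd 3 https://192.168.140.191:2379
```

Running each printed command on an etcd node shows which API actually answers on your cluster.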



