With the 1.4 release of Kubernetes, Google have made instantiating a cluster a whole lot easier. Using Kubeadm, you can bring up a cluster with a single command on each node. A further command will create a DaemonSet which brings up a Weave mesh network between all your nodes.
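
Before any of that, every node needs Docker and the kubeadm tooling installed. As a rough sketch of the prerequisites, lifted from the getting started guide of the time (the repository URL and package names are the guide's, and this assumes Ubuntu 16.04 - adjust for your distribution):

# Add the Kubernetes apt repository, then install Docker and the kubeadm packages
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y docker.io kubelet kubeadm kubectl kubernetes-cni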

As always with complex systems such as Kubernetes, there are some potential pitfalls to be aware of. Firstly, the getting started guide notes that Docker v1.11.2 is recommended, although v1.10.3 and v1.12.1 also work well (don't go straight for the latest release like I tried to). If you wish to have your nodes talk over a private network, you'll also need to declare this explicitly when you init the master node; otherwise Kubernetes will default to using your primary interface/route:

root@kube1:~# kubeadm init --api-advertise-addresses=172.16.0.128 --api-external-dns-names=kube1.int.domain.com
Running pre-flight checks
<master/tokens> generated token: "486896.cc87489554fbb1c7"
<master/pki> generated Certificate Authority key and certificate:
Issuer: CN=kubernetes | Subject: CN=kubernetes | CA: true
Not before: 2016-11-16 20:36:56 +0000 UTC Not After: 2026-11-14 20:36:56 +0000 UTC
Public: /etc/kubernetes/pki/ca-pub.pem
Private: /etc/kubernetes/pki/ca-key.pem
Cert: /etc/kubernetes/pki/ca.pem
<master/pki> generated API Server key and certificate:
Issuer: CN=kubernetes | Subject: CN=kube-apiserver | CA: false
Not before: 2016-11-16 20:36:56 +0000 UTC Not After: 2017-11-16 20:36:56 +0000 UTC
Alternate Names: [172.16.0.128 10.96.0.1 kube1.int.domain.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local]
Public: /etc/kubernetes/pki/apiserver-pub.pem
Private: /etc/kubernetes/pki/apiserver-key.pem
Cert: /etc/kubernetes/pki/apiserver.pem
<master/pki> generated Service Account Signing keys:
Public: /etc/kubernetes/pki/sa-pub.pem
Private: /etc/kubernetes/pki/sa-key.pem
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
<master/apiclient> all control plane components are healthy after 13.060118 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 6.003302 seconds
<master/apiclient> attempting a test deployment
<master/apiclient> test deployment succeeded
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 1.003735 seconds
<master/addons> created essential addon: kube-proxy
<master/addons> created essential addon: kube-dns

Kubernetes master initialised successfully!

You can now join any number of machines by running the following on each node:

kubeadm join --token=486896.cc87489554fbb1c7 172.16.0.128

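kubeadm also writes an admin kubeconfig to /etc/kubernetes/admin.conf (you can see it being created in the output above). If you'd rather drive the cluster from your workstation instead of the master, something along these lines should work - a convenience aside rather than part of the original steps:

scp root@kube1.int.domain.com:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf get nodes
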
With that done, you can see all the service containers up and running on the master node:

root@kube1:~# kubectl get pods --namespace=kube-system
NAME                                          READY     STATUS              RESTARTS   AGE
dummy-2088944543-whf5o                        1/1       Running             0          14h
etcd-kube1.domain.com                         1/1       Running             0          14h
kube-apiserver-kube1.domain.com               1/1       Running             0          14h
kube-controller-manager-kube1.domain.com      1/1       Running             0          14h
kube-discovery-1150918428-qy41j               1/1       Running             0          14h
kube-dns-654381707-s92ze                      1/1       ContainerCreating   0          14h
kube-proxy-2fh61                              1/1       Running             0          13h
kube-scheduler-kube1.domain.com               1/1       Running             0          14h

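The etcd, kube-apiserver, kube-controller-manager and kube-scheduler pods above are static pods, which is why their names carry the master's hostname - the kubelet runs them directly from manifest files on disk rather than scheduling them through the API server. If you're curious, kubeadm drops those manifests in its default location:

ls /etc/kubernetes/manifests/
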
It’s now time to join your slave nodes to the master:

root@kube2:~# kubeadm join --token=871c6f.af7b03c1d93ad11a 172.16.0.128
Running pre-flight checks
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://172.16.0.128:9898/cluster-info/v1/?token-id=871c6f"
<node/discovery> cluster info object received, verifying signature using given token
<node/discovery> cluster info signature and contents are valid, will use API endpoints [https://172.16.0.128:6443]
<node/bootstrap> trying to connect to endpoint https://172.16.0.128:6443
<node/bootstrap> detected server version v1.4.4
<node/bootstrap> successfully established connection with endpoint https://172.16.0.128:6443
<node/csr> created API client to obtain unique certificate for this node, generating keys and certificate signing request
<node/csr> received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:kube2.domain.com | CA: false
Not before: 2016-11-16 12:37:00 +0000 UTC Not After: 2017-11-16 12:37:00 +0000 UTC
<node/csr> generating kubelet configuration
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
  received.
* Kubelet informed of new secure connection details.

Run ‘kubectl get nodes’ on the master to see this machine join

root@kube1:~# kubectl get nodes
NAME                  STATUS    AGE
kube1.domain.com      Ready     14h
kube2.domain.com      Ready     14h
kube3.domain.com      Ready     14h

The master now knows about all the nodes; however, the DNS container will be stuck creating - this is because it needs a pod network in place before it can spawn successfully. Kubeadm makes adding one easy enough:

root@kube1:/# kubectl apply -f https://git.io/weave-kube
daemonset "weave-net" created

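While the images are being pulled down on each node, you can keep an eye on progress by watching the DaemonSet and its pods (purely a convenience check):

kubectl -n kube-system get ds weave-net
kubectl -n kube-system get pods -o wide -w
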
The DaemonSet, provided by Weaveworks, pulls down the necessary Docker images to each node and starts the containers which build the mesh network:

root@kube1:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                          READY     STATUS             RESTARTS   AGE
kube-system   dummy-2088944543-0h6ww                        1/1       Running            0          1h
kube-system   etcd-kube1.domain.com                         1/1       Running            0          1h
kube-system   kube-apiserver-kube1.domain.com               1/1       Running            0          1h
kube-system   kube-controller-manager-kube1.domain.com      1/1       Running            0          1h
kube-system   kube-discovery-1150918428-6hcm9               1/1       Running            0          1h
kube-system   kube-dns-654381707-p7ul4                      3/3       Running            0          1h
kube-system   kube-proxy-3pchs                              1/3       Running            0          1h
kube-system   kube-proxy-b5hx8                              2/3       Running            0          1h
kube-system   kube-proxy-y9k33                              3/3       Running            0          1h
kube-system   kube-scheduler-kube1.domain.com               1/1       Running            0          1h
kube-system   weave-net-8vbez                               2/3       Running            0          1h
kube-system   weave-net-x1bfw                               1/3       CrashLoopBackOff   15         1h
kube-system   weave-net-pg4s1                               1/3       CrashLoopBackOff   15         1h

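If you want to see which node each pod landed on - handy for working out which pods are misbehaving where - add -o wide to the listing:

kubectl get pods --all-namespaces -o wide
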
You can now see the DNS service container has started correctly and is able to serve requests; however, the two new weave-net containers on the slave nodes fail to start. Inspecting the logs for an affected container shows the following:

root@kube1:~# kubectl logs weave-net-pg4s1 -c weave --namespace=kube-system
2016/11/16 13:41:49 error contacting APIServer: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout; trying with fallback: http://localhost:8080
2016/11/16 13:41:49 Could not get peers: Get http://localhost:8080/api/v1/nodes: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers

This is due to the local kube-proxy service on that node using the wrong interface. Fortunately this is easy enough to fix, although you'll need jq installed first to manipulate the JSON.
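
Installing jq is distribution-specific; on Debian/Ubuntu the package is simply called jq, so something like this will do:

apt-get install -y jq

With jq available, the following updates the kube-proxy DaemonSet and then deletes the existing pods so they respawn with the new flag: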

root@kube1:~# kubectl -n kube-system get ds -l 'component\=kube-proxy' -o json | jq '.items[0].spec.template.spec.containers[0].command |= .+ ["--cluster-cidr=10.32.0.0/12"]' | kubectl apply -f - && kubectl -n kube-system delete pods -l 'component\=kube-proxy'
daemonset "kube-proxy" configured
pod "kube-proxy-3tqjn" deleted
pod "kube-proxy-572au" deleted
pod "kube-proxy-tsc3c" deleted

Note that the above is now incorrect for Kubernetes 1.6; you'll need the following instead (thanks Mike!):

kubectl -n kube-system get ds -l 'component=kube-proxy' -o json | jq '.items[0].spec.template.spec.containers[0].command |= .+ ["--cluster-cidr=10.32.0.0/12"]' | kubectl apply -f - && kubectl -n kube-system delete pods -l 'component=kube-proxy'
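
If you'd rather not pipe JSON through jq at all, a manual alternative (untested here, but it amounts to the same change) is to edit the DaemonSet by hand, append --cluster-cidr=10.32.0.0/12 to the container's command list, and then delete the pods as before:

kubectl -n kube-system edit ds kube-proxy
kubectl -n kube-system delete pods -l 'component=kube-proxy'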

Either way, this corrects the --cluster-cidr flag and then deletes the old kube-proxy pods. Kubernetes applies a restart policy of 'always' to the service containers, so these will be respawned:

root@kube1:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                          READY     STATUS    RESTARTS   AGE
kube-system   dummy-2088944543-whf5o                        1/1       Running   0          28m
kube-system   etcd-kube1.domain.com                         1/1       Running   0          28m
kube-system   kube-apiserver-kube1.domain.com               1/1       Running   0          28m
kube-system   kube-controller-manager-kube1.domain.com      1/1       Running   0          28m
kube-system   kube-discovery-1150918428-qy41j               1/1       Running   0          28m
kube-system   kube-dns-654381707-s92ze                      3/3       Running   0          27m
kube-system   kube-proxy-2fh61                              1/1       Running   0          48s
kube-system   kube-proxy-k0c64                              1/1       Running   0          48s
kube-system   kube-proxy-xm82v                              1/1       Running   0          48s
kube-system   kube-scheduler-kube1.domain.com               1/1       Running   0          28m
kube-system   weave-net-gzakd                               2/2       Running   1          3m
kube-system   weave-net-x1bfw                               2/2       Running   0          4m
kube-system   weave-net-y9k33                               2/2       Running   1          57s
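
As a final smoke test - a quick sanity check rather than part of the original walkthrough - you can schedule a few pods and confirm they spread across the nodes and pick up addresses from the Weave range:

kubectl run nginx --image=nginx --replicas=3
kubectl get pods -o wide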

Success! One working cluster. It is worth bearing in mind that deploying via Kubeadm is fairly limited at the moment - it only creates a single etcd instance, so the cluster state won't survive the loss of the master node. The tool is being heavily worked on, however, and more functionality will be added in later releases.