Developing an Admission Controller That Blocks Namespace Deletion

Hi everyone, I'm Joker (喬克).

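Yesterday a friend messaged me: he had accidentally deleted the business namespace in his cluster, which brought the whole business to a halt, and he asked whether there was a way to forbid deleting namespaces.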

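As far as I remember, Kubernetes has no built-in admission controller for this, so I told him he would need to write his own admission controller to get there.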

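If you are going to do something, do it properly! I can't start a job and leave it half-done, so here are my notes on how to write a custom Kubernetes admission controller.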

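Theory

An admission controller lives inside the API server. It intercepts requests to the API server after they have been authenticated and authorized but before the object is persisted, and it is typically used to validate or modify the request. Two admission controllers are special: MutatingAdmissionWebhook and ValidatingAdmissionWebhook.

  • MutatingAdmissionWebhook: modifies the requested object. Istio, for example, uses it to inject a sidecar into every Pod.
  • ValidatingAdmissionWebhook: validates the requested object.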

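The overall admission flow works as follows. When an API request arrives, the mutating and validating controllers call the external webhooks listed in the configuration. Mutating webhooks run first, one after another; validating webhooks run afterwards and are called in parallel. The rules are:

  • If every webhook approves the request, the admission chain continues.
  • If any webhook rejects the request, admission stops and the reason given by the first rejecting webhook is returned. Even when several webhooks reject, only the first reason is returned.
  • If an error occurs while calling a webhook, the request is either rejected or the webhook is ignored, depending on that webhook's failurePolicy (Fail or Ignore).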
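The three rules can be sketched as a small decision function. This is only an illustration in plain Go, not API server code: the verdict type, the decide function and the errTimeout value are hypothetical names, and the verdicts are evaluated in list order after they have already been collected.

```go
package main

import (
	"errors"
	"fmt"
)

// failurePolicy mirrors the two values a webhook configuration accepts.
type failurePolicy string

const (
	policyFail   failurePolicy = "Fail"
	policyIgnore failurePolicy = "Ignore"
)

// errTimeout is a hypothetical call failure used in the examples.
var errTimeout = errors.New("timeout")

// verdict is a hypothetical, simplified outcome of calling one webhook.
type verdict struct {
	name    string
	allowed bool
	reason  string
	err     error         // non-nil when the call itself failed
	policy  failurePolicy // what to do when err is non-nil
}

// decide returns the overall decision and, on rejection, the first reason.
func decide(verdicts []verdict) (bool, string) {
	for _, v := range verdicts {
		if v.err != nil {
			if v.policy == policyIgnore {
				continue // behave as if this webhook were not configured
			}
			return false, fmt.Sprintf("webhook %q failed: %v", v.name, v.err)
		}
		if !v.allowed {
			return false, fmt.Sprintf("webhook %q denied the request: %s", v.name, v.reason)
		}
	}
	return true, ""
}

func main() {
	allowed, reason := decide([]verdict{
		{name: "a", allowed: true},
		{name: "b", err: errTimeout, policy: policyIgnore},
		{name: "c", allowed: false, reason: "namespace deletion is blocked"},
	})
	fmt.Println(allowed, reason)
}
```

With failurePolicy set to Ignore, a webhook that is down does not block requests, which also means it no longer protects anything while it is down.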

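Admission controllers are enabled through API server startup flags (--enable-admission-plugins). A given admission controller may be validating, mutating, or both.

A Kubernetes cluster is normally deployed with a set of admission controllers enabled by default. Without them the cluster is essentially running unprotected, and you should ask the administrator to enable them.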

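Implementation

Design

Before writing any code, here is roughly how an admission webhook works:

  • The webhook is a standard HTTP service that receives HTTP requests.
  • Each request carries an AdmissionReview object.
  • Our custom hook processes that AdmissionReview object.
  • When it is done, it returns an AdmissionReview object containing the result.

The AdmissionReview struct looks like this: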

// AdmissionReview describes an admission review request/response.
type AdmissionReview struct {
 metav1.TypeMeta `json:",inline"`
 // Request describes the attributes for the admission request.
 // +optional
 Request *AdmissionRequest `json:"request,omitempty" protobuf:"bytes,1,opt,name=request"`
 // Response describes the attributes for the admission response.
 // +optional
 Response *AdmissionResponse `json:"response,omitempty" protobuf:"bytes,2,opt,name=response"`
}

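The names make the roles clear. When a request reaches the webhook, we only need to look at the inner AdmissionRequest (the actual input); once the webhook has finished, it only needs to return an AdmissionReview that contains an AdmissionResponse (the actual output). In short, AdmissionReview is an envelope: the request is the AdmissionRequest inside it and the response is the AdmissionResponse inside it.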
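The envelope idea can be tried out with nothing but the standard library. The sketch below uses cut-down stand-in structs (hypothetical names, only a handful of fields) rather than the real k8s.io/api types; the JSON field names follow the real objects, including the response's status field. The respond helper and the sampleReview payload are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type groupVersionKind struct {
	Group   string `json:"group"`
	Version string `json:"version"`
	Kind    string `json:"kind"`
}

type admissionRequest struct {
	UID       string           `json:"uid"`
	Kind      groupVersionKind `json:"kind"`
	Name      string           `json:"name,omitempty"`
	Operation string           `json:"operation"`
}

type status struct {
	Message string `json:"message,omitempty"`
}

type admissionResponse struct {
	UID     string  `json:"uid"`
	Allowed bool    `json:"allowed"`
	Result  *status `json:"status,omitempty"`
}

type admissionReview struct {
	APIVersion string             `json:"apiVersion,omitempty"`
	Kind       string             `json:"kind,omitempty"`
	Request    *admissionRequest  `json:"request,omitempty"`
	Response   *admissionResponse `json:"response,omitempty"`
}

// sampleReview is a made-up DELETE request for a namespace called "joker".
const sampleReview = `{
  "apiVersion": "admission.k8s.io/v1beta1",
  "kind": "AdmissionReview",
  "request": {
    "uid": "abc-123",
    "kind": {"group": "", "version": "v1", "kind": "Namespace"},
    "name": "joker",
    "operation": "DELETE"
  }
}`

// respond unwraps the request and wraps a verdict with the same UID.
func respond(body []byte, allowed bool, msg string) ([]byte, error) {
	var in admissionReview
	if err := json.Unmarshal(body, &in); err != nil {
		return nil, fmt.Errorf("decode review: %w", err)
	}
	if in.Request == nil {
		return nil, errors.New("malformed admission review: request is nil")
	}
	out := admissionReview{
		APIVersion: in.APIVersion,
		Kind:       in.Kind,
		Response: &admissionResponse{
			UID:     in.Request.UID, // the response must echo the request UID
			Allowed: allowed,
			Result:  &status{Message: msg},
		},
	}
	return json.Marshal(out)
}

func main() {
	out, err := respond([]byte(sampleReview), false, "namespaces cannot be deleted")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```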

Code

(1) First, create an HTTP server that listens on a port and accepts requests.

package main

import (
    "context"
    "errors"
    "flag"
    nethttp "net/http"
    "os"
    "os/signal"
    "syscall"

    "github.com/joker-bai/validate-namespace/http"
    log "k8s.io/klog/v2"
)

var (
    tlscert, tlskey, port string
)

func main() {
    flag.StringVar(&tlscert, "tlscert", "/etc/certs/cert.pem", "Path to the TLS certificate")
    flag.StringVar(&tlskey, "tlskey", "/etc/certs/key.pem", "Path to the TLS key")
    flag.StringVar(&port, "port", "8443", "The port to listen")
    flag.Parse()

    server := http.NewServer(port)
    go func() {
        // ListenAndServeTLS returns ErrServerClosed after Shutdown; that is not a failure.
        if err := server.ListenAndServeTLS(tlscert, tlskey); err != nil && !errors.Is(err, nethttp.ErrServerClosed) {
            log.Errorf("Failed to listen and serve: %v", err)
        }
    }()
    
    log.Infof("Server running in port: %s", port)
    
    // listen shutdown signal
    signalChan := make(chan os.Signal, 1)
    signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)
    <-signalChan
    
    log.Info("Shutdown gracefully...")
    if err := server.Shutdown(context.Background()); err != nil {
        log.Error(err)
    }
}

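Because the API server talks to the webhook over TLS, the port above is a TLS port, served through server.ListenAndServeTLS. When the service is deployed later, the certificate has to be mounted into the corresponding directory.

(2) Define the handler that dispatches requests to the concrete processing method.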

package http

import (
 "fmt"
 "github.com/joker-bai/validate-namespace/namespace"
 "net/http"
)

// NewServer creates and returns an http.Server
func NewServer(port string) *http.Server {
 // Instantiate hooks
 nsValidation := namespace.NewValidationHook()

 // Routers
 ah := newAdmissionHandler()
 mux := http.NewServeMux()
 mux.Handle("/healthz", healthz())
 mux.Handle("/validate/delete-namespace", ah.Serve(nsValidation))

 return &http.Server{
  Addr:    fmt.Sprintf(":%s", port),
  Handler: mux,
 }
}

Next, implement admissionHandler. It parses the HTTP body into an AdmissionReview object, calls the concrete hook, puts the result into an AdmissionReview and returns it to the client.

package http

import (
 "encoding/json"
 "fmt"
 "io"
 "net/http"

 "github.com/douglasmakey/admissioncontroller"

 "k8s.io/api/admission/v1beta1"
 admission "k8s.io/api/admission/v1beta1"
 meta "k8s.io/apimachinery/pkg/apis/meta/v1"
 "k8s.io/apimachinery/pkg/runtime"
 "k8s.io/apimachinery/pkg/runtime/serializer"
 log "k8s.io/klog/v2"
)

// admissionHandler represents the HTTP handler for an admission webhook
type admissionHandler struct {
 decoder runtime.Decoder
}

// newAdmissionHandler returns an instance of AdmissionHandler
func newAdmissionHandler() *admissionHandler {
 return &admissionHandler{
  decoder: serializer.NewCodecFactory(runtime.NewScheme()).UniversalDeserializer(),
 }
}

// Serve returns a http.HandlerFunc for an admission webhook
func (h *admissionHandler) Serve(hook admissioncontroller.Hook) http.HandlerFunc {
 return func(w http.ResponseWriter, r *http.Request) {
  w.Header().Set("Content-Type", "application/json")
  if r.Method != http.MethodPost {
   http.Error(w, fmt.Sprint("invalid method only POST requests are allowed"), http.StatusMethodNotAllowed)
   return
  }

  if contentType := r.Header.Get("Content-Type"); contentType != "application/json" {
   http.Error(w, fmt.Sprint("only content type 'application/json' is supported"), http.StatusBadRequest)
   return
  }

  body, err := io.ReadAll(r.Body)
  if err != nil {
   http.Error(w, fmt.Sprintf("could not read request body: %v", err), http.StatusBadRequest)
   return
  }

  var review admission.AdmissionReview
  if _, _, err := h.decoder.Decode(body, nil, &review); err != nil {
   http.Error(w, fmt.Sprintf("could not deserialize request: %v", err), http.StatusBadRequest)
   return
  }

  if review.Request == nil {
   http.Error(w, "malformed admission review: request is nil", http.StatusBadRequest)
   return
  }

  result, err := hook.Execute(review.Request)
  if err != nil {
   log.Error(err)
   w.WriteHeader(http.StatusInternalServerError)
   return
  }

  admissionResponse := v1beta1.AdmissionReview{
   Response: &v1beta1.AdmissionResponse{
    UID:     review.Request.UID,
    Allowed: result.Allowed,
    Result:  &meta.Status{Message: result.Msg},
   },
  }

  res, err := json.Marshal(admissionResponse)
  if err != nil {
   log.Error(err)
   http.Error(w, fmt.Sprintf("could not marshal response: %v", err), http.StatusInternalServerError)
   return
  }

  log.Infof("Webhook [%s - %s] - Allowed: %t", r.URL.Path, review.Request.Operation, result.Allowed)
  w.WriteHeader(http.StatusOK)
  w.Write(res)
 }
}

func healthz() http.HandlerFunc {
 return func(w http.ResponseWriter, r *http.Request) {
  w.WriteHeader(http.StatusOK)
  w.Write([]byte("ok"))
 }
}
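A handler of this shape can be exercised in-process before it ever reaches a cluster. The following is a minimal sketch that uses only the standard library: denyAll is a hypothetical stand-in with the same method and content-type checks as the handler above (it is not the article's admissionHandler), and call is a helper invented here that drives it through net/http/httptest.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// review and response are cut-down stand-ins for the real admission types.
type review struct {
	Request *struct {
		UID       string `json:"uid"`
		Operation string `json:"operation"`
	} `json:"request,omitempty"`
	Response *response `json:"response,omitempty"`
}

type response struct {
	UID     string `json:"uid"`
	Allowed bool   `json:"allowed"`
}

// denyAll validates the HTTP envelope and then rejects every request.
func denyAll(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "only POST requests are allowed", http.StatusMethodNotAllowed)
		return
	}
	if r.Header.Get("Content-Type") != "application/json" {
		http.Error(w, "only content type 'application/json' is supported", http.StatusBadRequest)
		return
	}
	var in review
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil || in.Request == nil {
		http.Error(w, "malformed admission review", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(review{Response: &response{UID: in.Request.UID, Allowed: false}})
}

// call sends one in-memory request to denyAll and returns status and body.
func call(method, contentType, body string) (int, string) {
	req := httptest.NewRequest(method, "/validate/delete-namespace", strings.NewReader(body))
	if contentType != "" {
		req.Header.Set("Content-Type", contentType)
	}
	rec := httptest.NewRecorder()
	denyAll(rec, req)
	out, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, strings.TrimSpace(string(out))
}

func main() {
	code, body := call(http.MethodPost, "application/json", `{"request":{"uid":"abc","operation":"DELETE"}}`)
	fmt.Println(code, body)
	code, body = call(http.MethodGet, "application/json", "")
	fmt.Println(code, body)
}
```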

The request is processed through hook.Execute. Hook is a struct provided by the admissioncontroller package, with one function field for each operation:

// AdmitFunc defines how to process an admission request
type AdmitFunc func(request *admission.AdmissionRequest) (*Result, error)

// Hook represents the set of functions for each operation in an admission webhook.
type Hook struct {
 Create  AdmitFunc
 Delete  AdmitFunc
 Update  AdmitFunc
 Connect AdmitFunc
}

What remains is to implement the concrete AdmitFunc and register it.
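To make the role of the four fields concrete, here is one way a per-operation dispatch could look. This is a sketch under assumptions, not the library's source: request, result, hook and execute are local stand-ins written for this example, and treating an unregistered operation as an error is a choice made here.

```go
package main

import "fmt"

// request and result are simplified stand-ins for the admission types.
type request struct {
	Operation string // "CREATE", "UPDATE", "DELETE" or "CONNECT"
	Name      string
}

type result struct {
	Allowed bool
	Msg     string
}

type admitFunc func(r *request) (*result, error)

// hook holds one optional function per operation.
type hook struct {
	Create, Delete, Update, Connect admitFunc
}

// execute picks the function that matches the request's operation.
func (h hook) execute(r *request) (*result, error) {
	var fn admitFunc
	switch r.Operation {
	case "CREATE":
		fn = h.Create
	case "DELETE":
		fn = h.Delete
	case "UPDATE":
		fn = h.Update
	case "CONNECT":
		fn = h.Connect
	default:
		return nil, fmt.Errorf("unknown operation %q", r.Operation)
	}
	if fn == nil {
		return nil, fmt.Errorf("operation %s is not registered", r.Operation)
	}
	return fn(r)
}

func main() {
	h := hook{Delete: func(r *request) (*result, error) {
		return &result{Allowed: false, Msg: "blocked " + r.Name}, nil
	}}
	res, err := h.execute(&request{Operation: "DELETE", Name: "joker"})
	fmt.Println(res.Allowed, res.Msg, err)
	_, err = h.execute(&request{Operation: "CREATE", Name: "joker"})
	fmt.Println(err)
}
```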

(3) Register our own function in the Hook.

package namespace

import (
 "github.com/douglasmakey/admissioncontroller"
)

// NewValidationHook returns the validation hook that handles namespace deletion
func NewValidationHook() admissioncontroller.Hook {
 return admissioncontroller.Hook{
  Delete: validateDelete(),
 }
}

(4) Implement the concrete AdmitFunc.

package namespace

import (
 "github.com/douglasmakey/admissioncontroller"
 log "k8s.io/klog/v2"

 "k8s.io/api/admission/v1beta1"
)

func validateDelete() admissioncontroller.AdmitFunc {
 return func(r *v1beta1.AdmissionRequest) (*admissioncontroller.Result, error) {
  // Reject the request whenever the target kind is Namespace.
  if r.Kind.Kind == "Namespace" {
   log.Info("You cannot delete namespace: ", r.Name)
   return &admissioncontroller.Result{Allowed: false}, nil
  }
  return &admissioncontroller.Result{Allowed: true}, nil
 }
}

The implementation is simple: if the Kind is Namespace, the operation is rejected.
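A possible variation, shown only as a sketch: protect a named set of namespaces instead of all of them, and return a message so the client sees why the request was denied. Everything here is hypothetical (the request and result stand-ins, newDeleteValidator, the namespace names) and it deliberately differs from the article's behaviour of rejecting every namespace deletion.

```go
package main

import "fmt"

// request and result are simplified stand-ins for the admission types.
type request struct {
	Kind      string
	Name      string
	Operation string
}

type result struct {
	Allowed bool
	Msg     string
}

// newDeleteValidator returns a check that denies deleting the listed namespaces.
func newDeleteValidator(protected ...string) func(*request) *result {
	set := make(map[string]struct{}, len(protected))
	for _, name := range protected {
		set[name] = struct{}{}
	}
	return func(r *request) *result {
		// Anything that is not a namespace deletion passes through.
		if r.Kind != "Namespace" || r.Operation != "DELETE" {
			return &result{Allowed: true}
		}
		if _, ok := set[r.Name]; ok {
			return &result{
				Allowed: false,
				Msg:     fmt.Sprintf("namespace %q is protected and cannot be deleted", r.Name),
			}
		}
		return &result{Allowed: true}
	}
}

func main() {
	validate := newDeleteValidator("prod", "kube-system")
	for _, name := range []string{"prod", "scratch"} {
		res := validate(&request{Kind: "Namespace", Name: name, Operation: "DELETE"})
		fmt.Println(name, res.Allowed, res.Msg)
	}
}
```

Filling in the message matters for usability: with an empty message the client is only told that the request was denied.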

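Deployment and Testing

The business logic is done, so let's deploy it to a Kubernetes cluster and try it out.

Deployment

(1) Write a Dockerfile and package the application as an image.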

FROM golang:1.17.5 AS build-env
ENV GOPROXY https://goproxy.cn
ADD . /go/src/app
WORKDIR /go/src/app
RUN go mod tidy
RUN cd cmd && GOOS=linux GOARCH=amd64 go build -v -a -ldflags '-extldflags "-static"' -o /go/src/app/app-server /go/src/app/cmd/main.go

FROM registry.cn-hangzhou.aliyuncs.com/coolops/ubuntu:22.04
ENV TZ=Asia/Shanghai
COPY --from=build-env /go/src/app/app-server /opt/app-server
WORKDIR /opt
EXPOSE 8443
CMD [ "./app-server" ]

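(2) Create the TLS certificate with a script. The script relies on the certificates.k8s.io/v1beta1 CertificateSigningRequest API, which was removed in Kubernetes 1.22, so as written it only works on older clusters.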

#!/bin/bash

set -e

usage() {
    cat <<EOF
Generate certificate suitable for use with an admission webhook service.

This script uses k8s' CertificateSigningRequest API to generate a
certificate signed by k8s CA suitable for use with admission webhook
services. This requires permissions to create and approve CSR. See
https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster for
detailed explanation and additional instructions.

The server key/cert k8s CA cert are stored in a k8s secret.

usage: ${0} [OPTIONS]

The following flags are required.

       --service          Service name of webhook.
       --namespace        Namespace where webhook service and secret reside.
       --secret           Secret name for CA certificate and server certificate/key pair.
EOF
    exit 1
}

while [[ $# -gt 0 ]]; do
    case ${1} in
        --service)
            service="$2"
            shift
            ;;
        --secret)
            secret="$2"
            shift
            ;;
        --namespace)
            namespace="$2"
            shift
            ;;
        *)
            usage
            ;;
    esac
    shift
done

[ -z ${service} ] && service=validate-delete-namespace
[ -z ${secret} ] && secret=validate-delete-namespace-tls
[ -z ${namespace} ] && namespace=default

if [ ! -x "$(command -v openssl)" ]; then
    echo "openssl not found"
    exit 1
fi

csrName=${service}.${namespace}
tmpdir=$(mktemp -d)
echo "creating certs in tmpdir ${tmpdir} "

cat <<EOF >> ${tmpdir}/csr.conf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1 = ${service}
DNS.2 = ${service}.${namespace}
DNS.3 = ${service}.${namespace}.svc
EOF

openssl genrsa -out ${tmpdir}/server-key.pem 2048
openssl req -new -key ${tmpdir}/server-key.pem -subj "/CN=${service}.${namespace}.svc" -out ${tmpdir}/server.csr -config ${tmpdir}/csr.conf

# clean-up any previously created CSR for our service. Ignore errors if not present.
kubectl delete csr ${csrName} 2>/dev/null || true

# create  server cert/key CSR and  send to k8s API
cat <<EOF | kubectl create -f -
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: ${csrName}
spec:
  groups:
  - system:authenticated
  request: $(cat ${tmpdir}/server.csr | base64 | tr -d '\n')
  usages:
  - digital signature
  - key encipherment
  - server auth
EOF

# verify CSR has been created; the check sits in the loop condition so that
# a failed lookup retries instead of aborting the script under `set -e`
until kubectl get csr ${csrName}; do
    sleep 1
done

# approve and fetch the signed certificate
kubectl certificate approve ${csrName}
# verify certificate has been signed
for x in $(seq 10); do
    serverCert=$(kubectl get csr ${csrName} -o jsonpath='{.status.certificate}')
    if [[ ${serverCert} != '' ]]; then
        break
    fi
    sleep 1
done
if [[ ${serverCert} == '' ]]; then
    echo "ERROR: After approving csr ${csrName}, the signed certificate did not appear on the resource. Giving up after 10 attempts." >&2
    exit 1
fi
echo ${serverCert} | openssl base64 -d -A -out ${tmpdir}/server-cert.pem


# create the secret with CA cert and server cert/key
kubectl create secret generic ${secret} \
        --from-file=key.pem=${tmpdir}/server-key.pem \
        --from-file=cert.pem=${tmpdir}/server-cert.pem \
        --dry-run -o yaml |
    kubectl -n ${namespace} apply -f -
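For local experiments, for instance running the server on a workstation, a certificate with the same three DNS names can also be produced directly in Go. Note that this swaps in a different technique from the script: the certificate is self-signed with an ECDSA key rather than signed by the cluster CA through the CSR API. The function names are hypothetical and the one-year validity is an arbitrary choice.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// webhookDNSNames returns the names the script lists under [alt_names].
func webhookDNSNames(service, namespace string) []string {
	return []string{
		service,
		service + "." + namespace,
		service + "." + namespace + ".svc",
	}
}

// selfSignedCert returns a PEM certificate and its PEM private key.
func selfSignedCert(service, namespace string) ([]byte, []byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 62))
	if err != nil {
		return nil, nil, err
	}
	serial.Add(serial, big.NewInt(1)) // keep the serial number strictly positive
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: service + "." + namespace + ".svc"},
		DNSNames:              webhookDNSNames(service, namespace),
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
	}
	// Template and parent are the same certificate, so it signs itself.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// certCovers reports whether the certificate is valid for the given host name.
func certCovers(certPEM []byte, host string) error {
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return errors.New("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return err
	}
	return cert.VerifyHostname(host)
}

func main() {
	certPEM, keyPEM, err := selfSignedCert("validate-delete-namespace", "default")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if _, err := tls.X509KeyPair(certPEM, keyPEM); err != nil {
		fmt.Println("key pair does not match:", err)
		return
	}
	fmt.Println("covers service name:", certCovers(certPEM, "validate-delete-namespace.default.svc") == nil)
}
```

The name check is the part worth keeping in mind: the API server reaches the webhook as service.namespace.svc, so that name has to appear among the certificate's subject alternative names.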

(3) Write a Deployment (and a Service) to run the webhook server.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: validate-delete-namespace
  labels:
    app: validate-delete-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: validate-delete-namespace
  template:
    metadata:
      labels:
        app: validate-delete-namespace
    spec:
      containers:
        - name: server
          image: registry.cn-hangzhou.aliyuncs.com/coolops/validate-delete-namespace:latest
          imagePullPolicy: Always
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
          ports:
            - containerPort: 8443
          volumeMounts:
            - name: tls-certs
              mountPath: /etc/certs
              readOnly: true
      volumes:
        - name: tls-certs
          secret:
            secretName: validate-delete-namespace-tls
---
apiVersion: v1
kind: Service
metadata:
  name: validate-delete-namespace
spec:
  selector:
    app: validate-delete-namespace
  ports:
    - port: 443
      targetPort: 8443

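(4) Deploy the webhook configuration. Note that admissionregistration.k8s.io/v1beta1 was removed in Kubernetes 1.22; newer clusters need admissionregistration.k8s.io/v1, which additionally requires the sideEffects and admissionReviewVersions fields.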

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: validate-delete-namespace
webhooks:
  - name: validate-delete-namespace.default.svc.cluster.local
    clientConfig:
      service:
        namespace: default
        name: validate-delete-namespace
        path: "/validate/delete-namespace"
      caBundle: "${CA_BUNDLE}"
    rules:
      - operations:
          - DELETE
        apiGroups:
          - ""
        apiVersions:
          - "v1"
        resources:
          - namespaces
    failurePolicy: Ignore

The manifest contains a ${CA_BUNDLE} placeholder, which has to be replaced before the webhook is created. Use the following command:

cat ./validate-delete-namespace.yaml | sh ./patch-webhook-ca.sh > ./webhook.yaml

Then apply webhook.yaml.

kubectl apply -f webhook.yaml
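The contents of patch-webhook-ca.sh are not shown here, so the following is only a guess at the substitution step, written in Go: replace the ${CA_BUNDLE} placeholder with the base64-encoded CA certificate, which is the form the caBundle field expects. The function name injectCABundle and the sample inputs are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// placeholder is the literal token used in the webhook manifest.
const placeholder = "${CA_BUNDLE}"

// injectCABundle substitutes the placeholder with the base64-encoded CA PEM.
func injectCABundle(manifest string, caPEM []byte) (string, error) {
	if !strings.Contains(manifest, placeholder) {
		return "", fmt.Errorf("manifest has no %s placeholder", placeholder)
	}
	encoded := base64.StdEncoding.EncodeToString(caPEM)
	return strings.ReplaceAll(manifest, placeholder, encoded), nil
}

func main() {
	manifest := `      caBundle: "${CA_BUNDLE}"`
	// A real run would read the cluster CA certificate instead of this dummy value.
	out, err := injectCABundle(manifest, []byte("dummy CA PEM"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```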

All of the files above are in the code repository, and a script deploys everything in one step.

# sh deploy.sh 
creating certs in tmpdir /tmp/tmp.SvMHWcPI6x 
Generating RSA private key, 2048 bit long modulus
..........................................+++
.............................................................+++
e is 65537 (0x10001)
certificatesigningrequest.certificates.k8s.io/validate-delete-namespace.default created
NAME                                AGE   REQUESTOR          CONDITION
validate-delete-namespace.default   0s    kubernetes-admin   Pending
certificatesigningrequest.certificates.k8s.io/validate-delete-namespace.default approved
secret/validate-delete-namespace-tls created
Creating k8s admission deployment
deployment.apps/validate-delete-namespace created
service/validate-delete-namespace created
validatingwebhookconfiguration.admissionregistration.k8s.io/validate-delete-namespace created

Once it finishes, you can inspect the resulting resources.

# kubectl get po
NAME                                         READY   STATUS    RESTARTS   AGE
validate-delete-namespace-74c9b8b7bd-5g9zv   1/1     Running   0          3s
# kubectl get secret
NAME                            TYPE                                  DATA   AGE
default-token-kx5wf             kubernetes.io/service-account-token   3      72d
validate-delete-namespace-tls   Opaque                                2      53s
# kubectl get ValidatingWebhookConfiguration
NAME                                  CREATED AT
validate-delete-namespace             2022-06-24T09:39:26Z

Testing

(1) First, tail the logs of the webhook pod.

# kubectl logs validate-delete-namespace-74c9b8b7bd-5g9zv -f
I0624 17:39:27.858753       1 main.go:30] Server running in port: 8443

(2) Create a namespace and then try to delete it.

# kubectl create ns joker
# kubectl get ns | grep joker
joker                             Active   4h5m
# kubectl delete ns joker
Error from server: admission webhook "validate-delete-namespace.default.svc.cluster.local" denied the request without explanation
# kubectl get ns | grep joker
joker                             Active   4h5m

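As you can see, the delete request was rejected and the namespace still exists. The rejection is reported as "without explanation" because validateDelete leaves the Msg field of the Result empty.

The same can be seen in the logs: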

# kubectl logs validate-delete-namespace-74c9b8b7bd-5g9zv -f
I0624 17:39:27.858753       1 main.go:30] Server running in port: 8443
2022/06/24 17:43:34 You cannot delete namespace:  joker
I0624 17:43:34.664945       1 handler.go:94] Webhook [/validate/delete-namespace - DELETE] - Allowed: false
2022/06/24 17:43:34 You cannot delete namespace:  joker
I0624 17:43:34.667043       1 handler.go:94] Webhook [/validate/delete-namespace - DELETE] - Allowed: false

And that is a simple admission controller.

References

https://www.qikqiak.com/post/k8s-admission-webhook

https://github.com/douglasmakey/admissioncontroller

https://mritd.com/2020/08/19/write-a-dynamic-admission-control-webhook/