SaFi Bank Space : Thought Machine for SandBox TM-3 (Manual Installation)

Content Sections

Preparation

The gcrane utility can be used to recursively copy container images between container registries (Google Container Registry in this case, but it works for registries in general):

gcrane cp -r asia.gcr.io/safi-sandbox-tm3 asia.gcr.io/safi-sandbox-tm4 | tee asia.gcr.io_safi-sandbox-tm3-tm4-copy.lst
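
If gcrane is not yet installed, it ships with Google's go-containerregistry project; a minimal sketch for installing and authenticating it (assuming a local Go toolchain and gcloud are available):

#install gcrane from go-containerregistry
go install github.com/google/go-containerregistry/cmd/gcrane@latest

#register gcloud as the Docker credential helper for the GCR registry before copying
gcloud auth configure-docker asia.gcr.io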

Step 0 - Access to the Kubernetes in GCP project

The Terraform run creates a bastion server instance in GCP, which can be used as a gateway to access GCP resources. One of the parameters specified in Terraform is the public key to put into the instance template.

Log in to the bastion via GCP ssh, using the private key:

gcloud compute ssh <bastion-instance-name> --zone=asia-southeast1-a --ssh-key-file=/home/<user>/.ssh/<ssh-private-key-file>

The gcloud utility will log you in as the current user (the user name you are logged in with on the local machine).

Create a SOCKS5 proxy via dynamic port forwarding; ssh will run in the background without a visible terminal:

gcloud compute ssh <bastion-instance-name> --zone=asia-southeast1-a --ssh-key-file=/home/<user>/.ssh/<ssh-private-key-file> -- -fN -D 8888

Afterwards you can use the local kubectl through the proxy:

https_proxy=socks5://localhost:8888 kubectl get nodes

Note: this approach only works for kubectl get commands; it does not seem to work with kubectl exec, which uses the SPDY protocol.

For kubectl exec you need to install an HTTP proxy (such as tinyproxy) on the bastion (apt-get update; apt-get install tinyproxy) and use a different port-forwarding parameter:

gcloud compute ssh <bastion-instance-name> --zone=asia-southeast1-a --ssh-key-file=/home/<user>/.ssh/<ssh-private-key-file> -- -fN -L 8888:127.0.0.1:8888

and

https_proxy=http://localhost:8888 kubectl get nodes
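
A minimal tinyproxy configuration on the bastion that works with this setup might look like the following (a sketch; the config path and restart command assume a Debian/Ubuntu bastion):

#/etc/tinyproxy/tinyproxy.conf (relevant directives only)
Port 8888
Listen 127.0.0.1
Allow 127.0.0.1

#restart tinyproxy after changing the config
sudo systemctl restart tinyproxy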

Step 1.A - HashiCorp Vault Installation (SKIP if you have vault already installed)

If you don’t have hc-vault prepared, you can install it in the sandbox cluster with the following procedure.

source: https://learn.hashicorp.com/tutorials/vault/kubernetes-raft-deployment-guide?in=vault/kubernetes

  1. Login to sandbox cluster and create namespace for vault
    kubectl create namespace hc-vault

  2. Helm must be installed and configured on your machine
    helm repo add hashicorp https://helm.releases.hashicorp.com

  3. Install Vault with a specific version (one that matches production, or omit --version to get the latest)
    helm install vault hashicorp/vault --namespace hc-vault --version 0.9.0 -f values.yaml

    #values.yaml sample
    ---
    global:
      metrics:
        enabled: true
    server:
      standalone:
        enabled: true
    ui:
      enabled: true

    Check that the Vault pods are running properly.

  4. Initialize Vault and save ALL unseal keys and the root token to a text file

    $ kubectl exec -n hc-vault --stdin=true --tty=true vault-0 -- vault operator init
    
    Unseal Key 1: <example unseal key 1>
    Unseal Key 2: <example unseal key 2>
    Unseal Key 3: <example unseal key 3>
    Unseal Key 4: <example unseal key 4>
    Unseal Key 5: <example unseal key 5>
    
    Initial Root Token: <root_token>

  5. Unseal the vault

    ## Unseal the first vault server until it reaches the key threshold
    $ kubectl exec -n hc-vault --stdin=true --tty=true vault-0 -- vault operator unseal # ... Unseal Key 1
    $ kubectl exec -n hc-vault --stdin=true --tty=true vault-0 -- vault operator unseal # ... Unseal Key 2
    $ kubectl exec -n hc-vault --stdin=true --tty=true vault-0 -- vault operator unseal # ... Unseal Key 3
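
    Once the key threshold is reached, you can confirm that the server is unsealed (Sealed should report false):

    $ kubectl exec -n hc-vault vault-0 -- vault status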

Step 1.B - HashiCorp Vault Kube Authentication (SKIP if you have vault kube auth enabled)

Source: https://tansanrao.com/hashicorp-vault-kubernetes-auth-kv-secrets/

  1. Create a ServiceAccount and ClusterRoleBinding for Vault to access the TokenReview API. If you are running separate clusters, you will have to set your kubectl context to the cluster running the workloads.

    kubectl create serviceaccount vault-sa -n hc-vault
    
    kubectl apply -f -<<EOH
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: role-tokenreview-binding
      namespace: hc-vault
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:auth-delegator
    subjects:
    - kind: ServiceAccount
      name: vault-sa
      namespace: hc-vault
    EOH
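
    Note: on Kubernetes 1.24 and newer, a token Secret is no longer created automatically for a ServiceAccount. If the secret lookup in the script under step 4 fails, you can create a token Secret for vault-sa manually and read the token from it instead (a sketch using the standard service-account-token Secret type; the Secret name is arbitrary):

    kubectl apply -f -<<EOH
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: vault-sa-token
      namespace: hc-vault
      annotations:
        kubernetes.io/service-account.name: vault-sa
    type: kubernetes.io/service-account-token
    EOH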

  2. Set up access to HashiCorp Vault (example for a GKE internal load balancer; requires HTTP Load Balancing to be enabled on the GKE cluster)

    apiVersion: v1
    kind: Service
    metadata:
      name: vault-ui-lb
      namespace: hc-vault
      annotations:
        networking.gke.io/load-balancer-type: "Internal"
    spec:
      selector:
        app.kubernetes.io/instance: vault
        app.kubernetes.io/name: vault
        component: server    
      ports:
        - protocol: TCP
          port: 8200
          targetPort: 8200
      type: LoadBalancer
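
    Once the Service is applied, the internal IP assigned by GCP can be read from the Service status; this is the address to use for VAULT_ADDR in the next step:

    kubectl get svc vault-ui-lb -n hc-vault -o jsonpath='{.status.loadBalancer.ingress[0].ip}'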

  3. Access Vault from the command line to test the connection (with the http_proxy environment variable set to the bastion port forward over ssh):

    export https_proxy=http://127.0.0.1:8888
    export http_proxy=http://127.0.0.1:8888
    export VAULT_ADDR=http://10.0.0.10:8200   # use the current internal LB IP; check the K8s services
    export VAULT_SKIP_VERIFY=true             # if SSL verification problems occur
    export VAULT_TOKEN=<root_token>
    
    #Login to Vault
    vault login #input root token

  4. Configuring Kubernetes Auth Method - If you are running separate clusters, you will have to replace k8s_host and k8s_port values with the Kubernetes API endpoint for your workloads cluster and set your kubectl context to the cluster running the workloads.

    #!/usr/bin/env bash
    
    namespace="hc-vault"
    user="vault-sa"   # user name is not that important, it just needs toha ClusterRoleBinding "role-tokenreview-binding"
    
    k8s_host="$(kubectl exec vault-0 -n ${namespace} -- printenv | grep KUBERNETES_PORT_443_TCP_ADDR | cut -f 2- -d "=" | tr -d " ")"
    
    k8s_port="443"
    
    k8s_cacert="$(kubectl config view --raw --minify --flatten -o jsonpath='{.clusters[].cluster.certificate-authority-data}' | base64 --decode)"
    
    secret_name="$(kubectl get serviceaccount ${user} -n ${namespace} -o go-template='{{ (index .secrets 0).name }}')"
    
    tr_account_token="$(kubectl get secret ${secret_name} -n ${namespace} -o go-template='{{ .data.token }}' | base64 --decode)"
    
    vault auth enable -path=k8-sandbox kubernetes
    vault write auth/k8-sandbox/config token_reviewer_jwt="${tr_account_token}" kubernetes_host="https://${k8s_host}:${k8s_port}" kubernetes_ca_cert="${k8s_cacert}"

    Note: in some cases disabling iss (JWT issuer validation) is necessary; see the sketch below.
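
    A sketch of what that looks like; the disable_iss_validation parameter of the Kubernetes auth config controls this (whether you need it depends on your Vault and Kubernetes versions):

    vault write auth/k8-sandbox/config \
        token_reviewer_jwt="${tr_account_token}" \
        kubernetes_host="https://${k8s_host}:${k8s_port}" \
        kubernetes_ca_cert="${k8s_cacert}" \
        disable_iss_validation=true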

  5. Create Secrets / Role / Policy

    vault secrets enable -version=1 -path=secret/dev kv
    
    #create vault-installer policy
    vault policy write vault-installer -<<EOF 
    # Required for vault installer to manage internal secrets and certs used by Thought Machine Vault
    path "secret/dev/*" {
      capabilities = ["create", "read", "update", "delete", "list"]
    }
    # Required by Observability packages installer
    path "secret/monitoring/*" {
      capabilities = ["create", "read", "update", "delete", "list"]
    }
    # Required for the vault installer to determine whether it has a valid HashiCorp Vault access token
    path "auth/token/lookup-self" {
      capabilities = ["read"]
    }
    # Optional: allows vault installer to automatically create policies for services in HashiCorp Vault
    path "sys/policy/*" {
      capabilities = ["create", "read", "update", "delete"]
    }
    # Optional: allows vault installer to automatically create policies for services in HashiCorp Vault
    path "auth/k8-sandbox/role/*"{
      capabilities = ["create", "read", "update", "delete"]
    }
    EOF
    
    #create role and attach policy
    vault write auth/k8-sandbox/role/vault-installer \
        bound_service_account_names=vault-installer \
        bound_service_account_namespaces=tm-system \
        policies=vault-installer \
        ttl=24h
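
    You can verify that the policy and role were written as expected:

    vault policy read vault-installer
    vault read auth/k8-sandbox/role/vault-installer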

Step 2. Postgresql Adjustments

  1. Check that the PostgreSQL server is configured with at least 300 max connections (a CLI sketch for these checks follows this list)

  2. Check that the Vault server can communicate with the PostgreSQL server; from the Vault server, test a TCP connection to PostgreSQL on port 5432

  3. Create a KV secret named root-db-secrets under the same path (in this case secret/dev) with the IP or domain as the key and the root password of PostgreSQL as the value

    #IP OR DOMAIN
    {
      "<IP address of the Postgresql instance>": "dbR00tPa$$SOWRd"
    }
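
    A sketch of these steps from the command line (assumes psql access with the postgres superuser, and the Vault CLI environment from Step 1.B; the throwaway test pod is just one way to check connectivity from inside the cluster):

    #check the configured connection limit on the PostgreSQL server
    psql -h <postgresql-ip> -U postgres -c "SHOW max_connections;"

    #connectivity test from inside the cluster on port 5432
    kubectl run pg-test -n hc-vault --rm -i --image=postgres:14 --restart=Never -- pg_isready -h <postgresql-ip> -p 5432

    #write the root-db-secrets KV entry (KV v1 mounted at secret/dev)
    vault kv put secret/dev/root-db-secrets '<IP address of the Postgresql instance>'='<postgres root password>'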


Step 3. Thought Machine SAML secret

Stored in secret/dev/dummy-saml-idp-secrets

{
  "basic_auth_password": "password",
  "basic_auth_user": "username"
}
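
The same secret can be created from the Vault CLI (a sketch; substitute the real credentials):

vault kv put secret/dev/dummy-saml-idp-secrets basic_auth_user='username' basic_auth_password='password'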

Ingress

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace

Cert-manager

https://cert-manager.io/docs/installation/

https://www.howtogeek.com/devops/how-to-install-kubernetes-cert-manager-and-configure-lets-encrypt/

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml

Create ClusterIssuer resource

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: lets-encrypt
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: <email address specific to the project certificate management>
    privateKeySecretRef:
      name: lets-encrypt
    solvers:
      - http01:
          ingress:
            class: nginx
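
Save the manifest to a file and apply it, then confirm that the issuer reports Ready (cluster-issuer.yaml is just an assumed local file name):

kubectl apply -f cluster-issuer.yaml
kubectl get clusterissuer lets-encrypt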

Installation

NOTE: At the time of writing this document, TM Vault is at version 3.3.1; the process might change for newer versions of TM Vault, in which case this document needs to be updated.

Step 1. Setup Vault Installer

  1. To set up the vault installer pod, you will need Kubernetes resources such as a namespace, service account, cluster role, cluster role binding, and the installer pod itself.
    SSH into the bastion, log in to the Kubernetes cluster, and execute the following (assuming you have the required permissions):

    #create vault-installer namespace
    kubectl create ns tm-system
    
    #create service account
    kubectl create serviceaccount vault-installer -n tm-system
    
    #create cluster role binding. NOTE: for sandbox purposes only, bind the SA to the cluster-admin role to avoid permission issues.
    kubectl apply -f -<<EOH
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: vault-installer
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: vault-installer
      namespace: tm-system
    EOH

2. Test the HCVault K8s backend auth login

kubectl run curl-test -n default --image=curlimages/curl -i --tty -- sh

The JWT token for the jwt parameter is the token of the "vault-installer" service account in the "tm-system" namespace.
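
One way to obtain that token (a sketch; on Kubernetes 1.24+ the kubectl create token subcommand is the simplest option, while older clusters can read the auto-created token Secret):

#Kubernetes 1.24 and newer
kubectl create token vault-installer -n tm-system

#older clusters
kubectl get secret -n tm-system $(kubectl get sa vault-installer -n tm-system -o jsonpath='{.secrets[0].name}') -o jsonpath='{.data.token}' | base64 --decode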

curl -L --request POST --data-raw '{"jwt":"eyJhb.......", "role":"vault-installer"}' http://vault.hc-vault.svc.cluster.local:8200/v1/auth/k8-sandbox/login

result:

{"request_id":"36f8dccf-0d35-b118-bcf9-46a7a0244319","lease_id":"","renewable":false,"lease_duration":0,"data":null,"wrap_info":null,"warnings":null,"auth":{"client_token":"s.6dbOOLH6nGG1DF6CqNRuUp4a","accessor":"NP8fdq2OnLJNz0YJD3OZ8UiP","policies":["default","vault-installer"],"token_policies":["default","vault-installer"],"metadata":{"role":"vault-installer","service_account_name":"vault-installer","service_account_namespace":"tm-system","service_account_secret_name":"vault-installer-token-rvzw5","service_account_uid":"16ecae88-8b77-48e2-8058-51b8cf002899"},"lease_duration":86400,"renewable":true,"entity_id":"c27bc105-95fb-e714-65f9-ca166fbcdbbd","token_type":"service","orphan":true}}

3. Apply the vault-installer pod manifest below, providing values for the template variables such as {DOCKER_REGISTRY_PATH}, {VAULT_NAMESPACE} and {CLOUD_PROVIDER} → we use GCP

#apply priority class
kubectl apply -n tm-system -f priority-classes.yaml

#create vault-installer pod
kubectl apply -n tm-system -f base_vault_installer.yaml

4. Check the pod by running kubectl -n tm-system get po. If everything went well, you should have a running installer pod.

Step 1.2. Configuring the values.yaml

One of the most crucial parts of the installation is values.yaml, so make sure you configure it correctly. Take a look at base.values.yaml to see all the context and other parameters that can be applied.

  1. For this sandbox installation, we will use the provided values.yaml; every setup is different, so check all the values applicable to your installation.

#Copy the values.yaml to the vault-installer pod
kubectl -n tm-system cp values.yaml vault-installer:/release-artifacts/config-templates/

#Copy the packages.txt to the vault-installer pod
kubectl -n tm-system cp packages.txt vault-installer:/release-artifacts/packages.txt
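
You can confirm that the files landed in the installer pod:

kubectl exec -n tm-system vault-installer -- ls -l /release-artifacts/config-templates/ /release-artifacts/packages.txt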

Step 2. Install Istio

  1. Ensure the Kubernetes service account, cluster role and cluster role binding for Istio are applied by running:

kubectl apply -n tm-system -f install-istio-service-account.yaml
kubectl apply -n tm-system -f install-istio-cluster-role.yaml
kubectl apply -n tm-system -f install-istio-cluster-role-binding.yaml

2. Install Istio into the istio-system namespace on a cluster by running:

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component istio

3. Label the Thought Machine namespace to use the istio-annotation-tm-webhook: here we use tm-system for the vault-installer and thought-machine for the TM Vault namespace

#create TM vault namespace
kubectl create ns thought-machine

#label namespace for istio
kubectl label ns thought-machine --overwrite istio-annotation-tm-webhook-thought-machine=enabled

# remove any previous labels for Istio injection
kubectl label ns thought-machine --overwrite istio.io/rev-
kubectl label ns thought-machine --overwrite istio-injection-

# If installing Istio with the Vault installer provided by Thought Machine, the Vault namespace should be labelled for injection by Istio 1.11.
kubectl label ns thought-machine --overwrite istio.io/rev=canary-v1-11

4. To ensure that new configuration is applied to the Istio control plane, restart all pods within istio-system:

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/restart-pods --namespace istio-system

5. Check that the Istio pods are running

kubectl -n istio-system get po

6. You can now delete the cluster role and cluster role binding:

kubectl delete -n tm-system -f install-istio-cluster-role-binding.yaml
kubectl delete -n tm-system -f install-istio-cluster-role.yaml

Step 2.5. Install Kafka

  1. First, ensure that the following parameters have values in values.yaml; generate your own cert using openssl (see the sketch after the example below)

secrets_management:
  hashicorp_vault:
    address: http://yourhashicorp-vault.com:8200
    ca_pem: |
      -----BEGIN CERTIFICATE-----
      Blahblahblah
      Blahblahblah
      -----END CERTIFICATE-----
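
A self-signed certificate for the ca_pem field can be generated with openssl (a sketch; the key file names, subject and validity are placeholders):

#generate a private key and a self-signed CA certificate valid for one year
openssl req -x509 -newkey rsa:4096 -nodes -keyout ca.key -out ca.pem -days 365 -subj "/CN=hc-vault-ca"

#paste the contents of ca.pem into secrets_management.hashicorp_vault.ca_pem
cat ca.pem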

2. Run the install-vault command within the pod to perform Kafka installation:

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component kafka

3. Verify kafka installation

kubectl -n thought-machine get po 

4. Add external LB Services with external IPs for each Kafka broker (a sketch of one such Service follows)
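
A sketch of such a Service for one broker (the Service name, port and per-pod selector label are assumptions; verify them against the labels actually set on the broker pods, and repeat per broker):

kubectl apply -n thought-machine -f -<<EOH
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-0-external-lb
spec:
  type: LoadBalancer
  selector:
    statefulset.kubernetes.io/pod-name: kafka-0
  ports:
    - protocol: TCP
      port: 9096
      targetPort: 9096
EOH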

5. To be able to use Kafka for manual tests by developers over the LB (each broker has its own LB with a public IP added to DNS), the environment variable EXTERNAL_KAFKA_V2_DOMAIN needs to be set, otherwise the list of brokers returned won’t contain the full domain names:

kafkacat -b kafka-0.kafka.tm3.sandbox.safibank.online:9096 -L | head -7
Metadata for all topics (from broker -1: kafka-0.kafka.tm3.sandbox.safibank.online:9096/bootstrap):
 3 brokers:
  broker 0 at kafka-0.:9096 (controller)
  broker 2 at kafka-2.:9096
  broker 1 at kafka-1.:9096

In the ConfigMap kafka-broker-config-kafka-package-ff1352e156 (to make sure the name suffix hasn’t changed between deployments, check the “broker” container settings in the kafka StatefulSet):

      containers:
        - name: broker
...
...
...
          envFrom:
            - configMapRef:
                name: kafka-broker-config-kafka-package-ff1352e156

Update the value: EXTERNAL_KAFKA_V2_DOMAIN = 'kafka.tm3.sandbox.safibank.online'
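
One way to set it (a sketch; the ConfigMap name suffix may differ in your deployment, as noted above):

kubectl patch configmap kafka-broker-config-kafka-package-ff1352e156 -n thought-machine --type merge -p '{"data":{"EXTERNAL_KAFKA_V2_DOMAIN":"kafka.tm3.sandbox.safibank.online"}}'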

Restart the pods (rolling update) managed by the StatefulSet:

kubectl rollout restart statefulset kafka -n thought-machine

You can watch the progress:

kubectl rollout status statefulset kafka -n thought-machine
waiting for statefulset rolling update to complete 0 pods at revision kafka-8545f558c8...
Waiting for 1 pods to be ready...
Waiting for 1 pods to be ready...

Step 3. Install Webhook Operator

This operator should always be installed prior to any Vault installation, as it manages critical webhook configurations required to mutate Kubernetes resources for Vault installation.

Run the install-vault command within the pod to perform Webhook Operator installation:

kubectl create namespace webhook-operator
kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component webhook-operator

Step 3.5. Add DNS entries

Add DNS entries for Kafka LBs and Ingress for domain (tm3.sandbox.safibank.online)

https://github.com/SafiBank/SaFiMono/blob/main/devops/terraform/tf-dns-safibankonline/sandbox_tm3.tf

Step 3.6. Add ingress entries

cd support_files/thoughmachine_installer/manifests/ingress
kubectl apply -f .

Step 4. Install Thought Machine Vault

NOTE: This will take quite some time, so stand by and watch the progress of the installation to catch any errors.

Before you run the Vault installation, you may want to test it first in dry-run mode.

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component vault --dry_run

Once you are happy with the dry-run result, deploy TM Vault.

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component vault

or for minimal resource consumption and no HPA deployment:

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component vault --dev_mode

This will deploy a number of packages that are included in packages.txt.
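
To keep an eye on the rollout while the installer runs, you can watch the pods in the Thought Machine namespace from another terminal:

kubectl get pods -n thought-machine -w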

Step 5. Observability package

kubectl exec -it -n tm-system vault-installer -- /deployment-tools/install-vault --component observability

Step 6. Cleanup

Deletion of Vault installer pod (recommended)

Since the Vault installer pod is granted a number of permissions to perform deployments in the Kubernetes cluster, we recommend that you delete the Vault installer pod once the installation is complete.

#delete vault-installer pod
kubectl -n tm-system delete -f base_vault_installer.yaml

#delete vault-installer service account
kubectl -n tm-system delete sa vault-installer
 
#delete vault-installer crb
kubectl -n tm-system delete clusterrolebinding vault-installer

Uninstall/removal

Istio namespace

kubectl delete namespaces istio-system
kubectl delete daemonset istio-cni-node -n kube-system
kubectl delete configmap istio-ca-root-cert -n default
kubectl delete configmap istio-cni-config -n kube-system
kubectl get crd | grep -i istio | cut -f 1 -d " " | xargs kubectl delete crd

Thought Machine namespace

# if present, also delete the Istio webhook configurations
kubectl delete mutatingwebhookconfigurations istio-sidecar-injector-canary-v1-11
kubectl delete mutatingwebhookconfigurations istio-annotation-tm-webhook-thought-machine
kubectl delete namespaces thought-machine

# if it stays stuck in the Terminating state for a long time, see the Troubleshooting section of this page, or run:
kubectl patch crd/managedwebhooks.tmachine.io -p '{"metadata":{"finalizers":[]}}' --type=merge

Webhook operator namespace

kubectl delete namespaces webhook-operator

Thought Machine installer namespace

kubectl delete namespaces tm-system

HashiCorp Vault namespace

kubectl delete namespaces hc-vault

Troubleshooting

When removing a namespace does not work, one possible cause is a finalizers deadlock: https://stackoverflow.com/questions/52009124/not-able-to-completely-remove-kubernetes-customresource https://www.redhat.com/sysadmin/troubleshooting-terminating-namespaces

For the managedwebhooks CRD, the solution is:

kubectl patch crd/managedwebhooks.tmachine.io -p '{"metadata":{"finalizers":[]}}' --type=merge
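
If a namespace itself stays in Terminating (rather than a CRD), clearing the namespace finalizers through the finalize subresource is a common last resort (a sketch; requires jq):

kubectl get namespace thought-machine -o json | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/thought-machine/finalize" -f -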