
Enabling Kubernetes self-service the operator way

Learn how operators can serve as governance tools in a multitenant setting.

This article presents an approach for using a custom operator and the Operator SDK to transform a collection of YAML files into a custom resource. It outlines the steps for creating a custom resource (a CustomResource backed by a CustomResourceDefinition) that you can use to deploy a predefined set of resources, such as a particular policy, governance rule, or deployment.

The primary aim of this article is to demonstrate how operators can serve as governance tools in a multitenant setting. By delegating access controls to tenant teams, platform administrators can enable self-service capabilities within their organizations without jeopardizing coexistence among tenants.

Prerequisites for this tutorial

  1. You understand the basics of Kubernetes and have administrator access to the cluster where you are deploying. A local cluster instance is sufficient for this example.
  2. You understand the basics of Ansible.
  3. You need access to a container image repository to publish images built during this exercise.

Get started

For this tutorial, you'll simulate a scenario or a problem statement you are trying to solve.

You want your cluster's tenants to have the ability to create or delete a namespace in a multitenant environment.

However, this scenario poses some problems that you need to solve to answer this request:

  • There is a single resource type, kind: Namespace. Granting create or delete access to this resource would also allow a tenant to delete namespaces belonging to another tenant in the same cluster.
  • Tenants could create namespaces without restrictions, meaning without the namespace being mapped to resource quotas, LDAP bindings, or other required settings.
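
To make the first problem concrete, here is a minimal sketch of the kind of cluster-wide grant a tenant would need to manage namespaces directly. The role name is hypothetical and the manifest is only an illustration, not something to apply. Because Namespace is a cluster-scoped resource and RBAC rules cannot select objects by label, the delete verb below would cover every tenant's namespaces.

```yaml
# Illustration of the problem only; this ClusterRole is NOT part of the operator.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tenant-namespace-manager    # hypothetical name
rules:
  - apiGroups: [""]                 # core API group, where Namespace lives
    resources: ["namespaces"]
    verbs: ["create", "delete"]     # not limited to one tenant's namespaces
```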

Now I'll explore how you can solve this problem and make this scenario a self-service capability for tenants using operators.

Step 1: Install the dependent libraries

The following tools must be available on your PATH:

  1. kubectl (or oc, if you use OpenShift)
  2. Container runtime such as Podman
  3. Operator SDK (see documentation)

Step 2: Set up the environment

Log in to OpenShift or Kubernetes as a cluster administrator such as kubeadmin (preferably on a local instance).

Clone an empty git repo to your local machine. This is needed to store the code you generate. Then switch into the cloned folder.

For this example, assume you have 2 tenants:

  • Tenant 1 is named Acme
  • Tenant 2 is named Umbrella

In this article, I aim to create 2 custom resources, kind: AcmeNamespace and kind: UmbrellaNamespace, each of which creates a namespace, attaches it to the appropriate cluster resource quota, and performs other actions. Once the permissions are mapped, tenants in Acme cannot interfere with namespaces belonging to Umbrella.

Initiate an operator project:

$ operator-sdk init --domain mytenant.com --plugins ansible

Create custom resource definitions for your resources:

$ operator-sdk create api --group tenants \
    --version v1alpha1 --kind AcmeNamespace --generate-role

$ operator-sdk create api --group tenants \
    --version v1alpha1 --kind UmbrellaNamespace --generate-role

Open the generated codebase in an IDE.

Step 3: Understand the codebase generated by the Operator SDK

For the purposes of this article, you will follow a simplified, minimum viable product (MVP) approach. You will edit only the files required to make it work, without anything extra. For a production environment, you may need to add more checks and logic according to your requirements.

You need the following files to follow along with this article, which are created when the operator-sdk create api command is executed in step 2:

  • config/crd: Contains 2 custom resource definitions (CRDs), one for each kind you added: kind: AcmeNamespace and kind: UmbrellaNamespace.
  • roles/<kind name> (for example, roles/acmenamespace): The Ansible role that gets executed when the operator encounters an instance of the corresponding custom resource.

A CRD is what an operator tracks. When the operator receives a resource or YAML instance belonging to this CRD, it performs an action. A CRD contains the OpenAPI schema specification which defines the structure of the resource YAML file.
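
The link between a kind and the role that handles it is the watches.yaml file at the root of an Ansible-based operator project. As a rough sketch, the entries for this tutorial would look something like the following; the exact content the Operator SDK generates may differ by version.

```yaml
---
# watches.yaml (sketch): maps each custom resource kind to an Ansible role
- version: v1alpha1
  group: tenants.mytenant.com
  kind: AcmeNamespace
  role: acmenamespace          # runs roles/acmenamespace
- version: v1alpha1
  group: tenants.mytenant.com
  kind: UmbrellaNamespace
  role: umbrellanamespace      # runs roles/umbrellanamespace
```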

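In this article, you will use the free-form OpenAPI schema key x-kubernetes-preserve-unknown-fields: true, which tells the API server to accept and preserve fields that are not declared in the schema, so the resource's spec can have any structure.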
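
For orientation, here is an abridged sketch of what the generated CRD for AcmeNamespace might look like under config/crd. Descriptions and some fields are omitted, and the details are an assumption based on typical Operator SDK output rather than a copy of the generated file.

```yaml
# Abridged sketch of a generated CRD; compare with your own file in config/crd.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: acmenamespaces.tenants.mytenant.com   # <plural>.<group>
spec:
  group: tenants.mytenant.com
  names:
    kind: AcmeNamespace
    listKind: AcmeNamespaceList
    plural: acmenamespaces
    singular: acmenamespace
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              # Free-form: any keys under spec are accepted and preserved
              x-kubernetes-preserve-unknown-fields: true
            status:
              type: object
              x-kubernetes-preserve-unknown-fields: true
      subresources:
        status: {}
```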

Step 4: Build the Ansible role

Under roles/acmenamespace/tasks create a file called main.yaml. The main.yaml is the entrypoint for this role:

---
- name: Create namespace
  kubernetes.core.k8s:
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: "{{ ansible_operator_meta.name }}"
        labels:
          tenant: acme

This task creates a new namespace. You can add additional LDAP bindings as necessary. You can also add other resources such as security context constraints (SCC), limit ranges, labels, or any other resources you consider as default tenant settings.
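
As one possible example of such a default tenant setting, the task below adds a limit range to the new namespace. It is a sketch to append to the same tasks file; the resource name and the CPU and memory values are hypothetical and should be replaced with your own defaults.

```yaml
- name: Create default limit range (example values)
  kubernetes.core.k8s:
    definition:
      apiVersion: v1
      kind: LimitRange
      metadata:
        name: acme-default-limits             # hypothetical name
        namespace: "{{ ansible_operator_meta.name }}"
      spec:
        limits:
          - type: Container
            default:                          # limits applied when a container sets none
              cpu: 500m
              memory: 256Mi
            defaultRequest:                   # requests applied when a container sets none
              cpu: 100m
              memory: 128Mi
```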

Do the same for Umbrella tenants, but with a different label, tenant: umbrella, in the file roles/umbrellanamespace/tasks/main.yaml.

Note: To showcase tenant-specific configuration, I'm also adding a network policy to this namespace:

---
- name: Create namespace
  kubernetes.core.k8s:
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: "{{ ansible_operator_meta.name }}"
        labels:
          tenant: umbrella

- name: Create default network policy to allow networking between namespaces owned by tenant umbrella.
  kubernetes.core.k8s:
    definition:
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: allow-ingress-from-all-namespaces-under-umbrella-tenant
        namespace: "{{ ansible_operator_meta.name }}"
      spec:
        podSelector: {}  # empty selector: applies to all pods in the namespace
        ingress:
          - from:
              - namespaceSelector:
                  matchLabels:
                    tenant: umbrella

In the same way as with the previous tenant, you can add your custom configuration and resources here.

Step 5: Prepare the cluster to accept these new namespaces

To ensure your cluster maps the new namespaces to the appropriate resource quotas, you'll need to prepare the cluster.

Create a resource quota mapping to the labels you used to define the tenants above:

$ oc create clusterquota acme \
     --project-label-selector tenant=acme \
     --hard pods=10 \
     --hard secrets=20

$ oc create clusterquota umbrella \
     --project-label-selector tenant=umbrella \
     --hard requests.cpu=4
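
If you prefer to keep the quotas as YAML alongside the rest of the configuration, the commands above should correspond roughly to the following OpenShift ClusterResourceQuota manifests. This is a sketch; compare it with the output of oc get clusterresourcequota acme -o yaml on your cluster before relying on it.

```yaml
apiVersion: quota.openshift.io/v1
kind: ClusterResourceQuota
metadata:
  name: acme
spec:
  selector:
    labels:
      matchLabels:
        tenant: acme          # same label the Ansible role puts on the namespace
  quota:
    hard:
      pods: "10"
      secrets: "20"
---
apiVersion: quota.openshift.io/v1
kind: ClusterResourceQuota
metadata:
  name: umbrella
spec:
  selector:
    labels:
      matchLabels:
        tenant: umbrella
  quota:
    hard:
      requests.cpu: "4"
```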

If you use Podman in your environment, replace occurrences of docker with podman in the Makefile:

$ sed -i 's/docker/podman/g' Makefile

Publish the operator you built to a container image repository. Modify the Makefile in the root directory and replace the value of the IMG variable with the address of your own repository. You may need to log in to your repository first.

## Change
IMG ?= controller:latest

## To your in-house repo
IMG ?= docker.io/datatruckerio/operatorize-tenancy:latest 

Build and push the operator image, then deploy the operator to the cluster:

$ make podman-build podman-push
$ make deploy

The deployment automatically creates the operator's namespace, named <project-name>-system.

You'll need to authorize the operator's service account via RBAC. To keep the project relatively simple, you'll bind the service account to the cluster-admin role, which grants all privileges. For a real deployment, you should instead create a specific cluster role and role binding that grant only the permissions the Ansible role needs to provision its resources.

To add the service account to role cluster-admin, execute this command:

$ oc adm policy add-cluster-role-to-user cluster-admin \
    -z <project-name>-controller-manager \
    -n <project-name>-system

Alternatively, you can add the granular RBAC configuration in the config/rbac folder of your operator.
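
As a starting point for that granular configuration, here is a sketch of a cluster role covering only the two resource types the roles in this article create, namespaces and network policies. The role and binding names are hypothetical, and the verb list is an assumption; verify it against what your Ansible tasks actually do. If you place this under config/rbac, adapt the names to the kustomize conventions the project already uses.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tenant-namespace-provisioner        # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tenant-namespace-provisioner        # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tenant-namespace-provisioner
subjects:
  - kind: ServiceAccount
    name: <project-name>-controller-manager   # placeholder, as in the command above
    namespace: <project-name>-system
```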

Push your code to the git repository for storage. You can reuse this repository to add more custom resources or upgrade the operator later. This helps in having one operator for several custom resources.

Step 6: Test

Create a new AcmeNamespace using the operator:

apiVersion: tenants.mytenant.com/v1alpha1
kind: AcmeNamespace
metadata:
  name: acmenamespace-sample
  namespace: <project-name>-system

Now you can see the new namespace automatically mapped to the cluster resource quota.

[operatorize-my-stuff]$ oc get namespace | grep acme
acmenamespace-sample                               Active   36s

[operatorize-my-stuff]$ oc describe clusterresourcequota acme
Name:           acme
Created:        16 hours ago
Labels:         <none>
Annotations:    <none>
Namespace Selector: ["acmenamespace-sample"]
Label Selector: tenant=acme
AnnotationSelector: map[]
Resource        Used    Hard
--------        ----    ----
pods            0       10
secrets         6       20

Create Umbrella namespace:

apiVersion: tenants.mytenant.com/v1alpha1
kind: UmbrellaNamespace
metadata:
  name: umbrella-example
  namespace: <project-name>-system

Now you can see the new namespace automatically mapped to the cluster resource quota. You can also see the network policy deployed. Even if the tenant deletes it, the operator will recreate it during its next reconciliation.

[operatorize-my-stuff]$ oc get namespaces | grep umbrella
umbrella-example                                   Active   7m33s

[operatorize-my-stuff]$ oc describe clusterresourcequota umbrella
Name:           umbrella
Created:        16 hours ago
Labels:         <none>
Annotations:    <none>
Namespace Selector: ["umbrella-example"]
Label Selector: tenant=umbrella
AnnotationSelector: map[]
Resource        Used    Hard
--------        ----    ----
requests.cpu    0       4

[operatorize-my-stuff]$ oc project umbrella-example
[operatorize-my-stuff]$ oc get networkpolicy
NAME                                                      POD-SELECTOR   AGE
allow-ingress-from-all-namespaces-under-umbrella-tenant   <none>         90s

Tips and tricks

You can use this method for more than provisioning. A few examples:

  • Add a Grafana instance prepared with configured data sources
  • Add an Ingress resource integrated with cert-manager to ensure tenants can only create TLS ingress
  • Add a namespace with default Prometheus rules, alerts, and alert receivers
  • Add egress node labels

You can also convert existing Helm charts to a service component like authentication-as-a-service or proxy-as-a-service. The operator allows Ansible automation in its execution, thus enabling a complex list of instructions to be executed in a specific order as well as triggering APIs outside the scope of the Kubernetes cluster. The custom resources also let you set variables using spec, allowing customizations based on variables. You can find more info in the documentation.
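
As a sketch of how spec variables work, the Ansible-based operator passes the keys under spec to the role as variables, converting camelCase names to snake_case by default. In the example below, the costCenter field and the cost-center label are hypothetical and exist only to show the mechanism.

```yaml
# Custom resource with a hypothetical spec field
apiVersion: tenants.mytenant.com/v1alpha1
kind: AcmeNamespace
metadata:
  name: acmenamespace-sample
  namespace: <project-name>-system
spec:
  costCenter: "cc-1234"        # hypothetical; reaches the role as cost_center
---
# roles/acmenamespace/tasks/main.yaml (excerpt using that variable)
- name: Create namespace
  kubernetes.core.k8s:
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: "{{ ansible_operator_meta.name }}"
        labels:
          tenant: acme
          # Falls back to "unassigned" when spec.costCenter is not set
          cost-center: "{{ cost_center | default('unassigned') }}"
```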

Summary

Now that you have 2 separate custom resources kind: AcmeNamespace and kind: UmbrellaNamespace, you can map specific RBAC permissions allowing only members of tenant Acme to access AcmeNamespace, and members of Umbrella to access UmbrellaNamespace.
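
A minimal sketch of such a mapping for the Acme tenant follows. It assumes the custom resources are created in the operator's namespace, as in the test step, and that Acme's users belong to a group called acme-developers; the group name and the role and binding names are hypothetical. A matching pair for Umbrella would reference umbrellanamespaces and Umbrella's own group.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: acme-namespace-editor               # hypothetical name
  namespace: <project-name>-system
rules:
  - apiGroups: ["tenants.mytenant.com"]
    resources: ["acmenamespaces"]           # only Acme's kind, not umbrellanamespaces
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: acme-namespace-editor               # hypothetical name
  namespace: <project-name>-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: acme-namespace-editor
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: acme-developers                   # hypothetical group of Acme users
```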

By doing this, tenants can provision their own namespaces without having access to delete other tenants' namespaces. Applying governance policies to the objects a tenant creates enables self-service capabilities in a "GitOps-like" way, meaning you can store the entire configuration as YAML files in a Git repository.

A copy of this example implementation is stored in my public GitHub repo for reference.

Gaurav Shankar

Gaurav Shankar is a Senior App Dev Consultant at Red Hat Canada, where he specializes in application development on OpenShift.