Understand labels and selectors in Kubernetes

by Gink, Fri, 10 Sep 2021

1. Labels

Labels are key/value pairs that always live in an object's metadata. Quoting the Kubernetes documentation directly:

Valid label keys have two segments: an optional prefix and name, separated by a slash /. The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character [a-z0-9A-Z] with dashes -, underscores _, dots ., and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots ., not longer than 253 characters in total, followed by a slash /.

If the prefix is omitted, the label Key is presumed to be private to the user. Automated system components which add labels to end-user objects must specify a prefix.

Valid label value must be 63 characters or less (can be empty), unless empty, must begin and end with an alphanumeric character [a-z0-9A-Z], could contain dashes -, underscores _, dots ., and alphanumerics between.

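In summary, from a user's perspective, we should stick to alphanumerics plus three symbols (dash, dot, underscore) and no other special characters, with a maximum length of 63 characters. The same rule applies to both the key name and the value, except that a value may be empty while the key name may not.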
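To make the rules above concrete, here is a minimal Python sketch of a validator for label keys and values. The function names are hypothetical, and the check on each DNS label of the prefix (lowercase alphanumerics and dashes) is my assumption about what "DNS subdomain" means here; the quoted text does not spell it out. Treat it as an illustration, not as the validation Kubernetes itself runs.

```python
import re

# Name segment or value: alphanumeric at both ends; '-', '_', '.' allowed between.
_NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?")
# Assumption: each DNS label of the prefix is lowercase alphanumerics and dashes.
_DNS_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """A value may be empty; otherwise it follows the name-segment rule."""
    if value == "":
        return True
    return len(value) <= 63 and _NAME_RE.fullmatch(value) is not None


def is_valid_label_key(key: str) -> bool:
    """A key is an optional DNS-subdomain prefix, a slash, and a required name."""
    prefix, slash, name = key.rpartition("/")
    if slash:
        # A slash is present, so the prefix must be non-empty and well-formed.
        if not prefix or len(prefix) > 253:
            return False
        if not all(_DNS_LABEL_RE.fullmatch(part) for part in prefix.split(".")):
            return False
    return len(name) <= 63 and _NAME_RE.fullmatch(name) is not None
```

With this sketch, app and app.kubernetes.io/name pass as keys, while -app or a 64-character name do not.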

In the manifest below, we declare a Pod with two labels, environment and app:

apiVersion: v1
kind: Pod
metadata:
  name: label-demo
  labels:
    environment: production
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80

2. Selectors

Let's move on to the interesting part: selectors, because labels on their own accomplish nothing. First, note that the Kubernetes API currently supports two types of selectors:

Equality-based
- Three kinds of operators: =, ==, !=
Set-based
- Three kinds of operators: in, notin, exists

Keep this in mind, because some resources such as Service and ReplicationController support only equality-based selectors, while others such as Job, ReplicaSet, Deployment, and DaemonSet also support set-based ones.

Therefore, the syntax for selecting will be a little bit different:

Equality-based:

selector:
  app: nginx

Set-based:

selector:
  matchLabels:
    app: nginx
  matchExpressions:
    - {key: environment, operator: NotIn, values: [develop]}

That's why the selector syntax differs between a Service manifest and a Deployment manifest.
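To see how a set-based selector is evaluated, here is a rough Python sketch of the matching logic. The function name matches is hypothetical and this is not the Kubernetes implementation; it only mirrors the documented semantics: every matchLabels entry and every matchExpressions entry must hold, and NotIn also matches objects that do not carry the key at all.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Return True if `labels` satisfies a set-based `selector` (illustrative only)."""
    # matchLabels: each key must be present with exactly the given value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False

    # matchExpressions: all expressions are ANDed together.
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # Objects without the key also match NotIn.
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# The set-based selector from the example above, written as a Python dict.
selector = {
    "matchLabels": {"app": "nginx"},
    "matchExpressions": [
        {"key": "environment", "operator": "NotIn", "values": ["develop"]},
    ],
}
```

Calling matches(selector, {"app": "nginx", "environment": "production"}) returns True, while the same labels with environment set to develop return False. An equality-based selector like the one a Service uses behaves like the matchLabels part alone.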

Now, let's take a closer look at a Deployment manifest to see how it works:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
  namespace: testing
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
      environment: dev
  template:
    metadata:
      labels:
        app: web
        environment: dev
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        imagePullPolicy: IfNotPresent
        name: hello-app

In this example, the Pod is declared inside spec → template and carries two labels, app and environment. Its manager, the Deployment, selects on the same two labels through matchLabels, which is how it finds its own Pods and makes sure the right number of replicas is running in the cluster.
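The relationship between the selector and the template labels can be checked with a few lines of Python. This is a sketch under assumptions: the manifest is already loaded into a dict (for example with a YAML parser), only matchLabels is considered, and the helper name selector_matches_template is made up for illustration. As far as I know, the API server rejects a Deployment whose selector does not match its template labels, so a check like this is mostly useful before applying a manifest.

```python
def selector_matches_template(deployment: dict) -> bool:
    """Check that every matchLabels entry appears in the Pod template labels."""
    spec = deployment["spec"]
    wanted = spec["selector"].get("matchLabels", {})
    pod_labels = spec["template"]["metadata"].get("labels", {})
    # The template may carry extra labels; the selector only needs a subset.
    return all(pod_labels.get(key) == value for key, value in wanted.items())


# The relevant parts of the Deployment above, as a Python dict.
deployment = {
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "web", "environment": "dev"}},
        "template": {
            "metadata": {"labels": {"app": "web", "environment": "dev"}},
        },
    },
}
```

For the manifest above, selector_matches_template(deployment) returns True. Changing the template's environment label to something else would make it return False.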

3. Selector via API

With kubectl, we can also pass a selector using the -l flag, like so:

# equality-based
kubectl get pod -l app=web,environment=production

# set-based
kubectl get pod -l 'app=web,environment in (production)'
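When selectors are assembled in a script, it can help to build the -l string from data rather than by hand. The helper below is a hypothetical sketch (build_selector and its parameters are my own names, not part of kubectl or any client library); it only produces the string and does not call kubectl.

```python
def build_selector(equals=None, in_sets=None, not_in_sets=None) -> str:
    """Build a label selector string suitable for `kubectl get ... -l`."""
    parts = []
    # Equality-based requirements: key=value
    for key, value in (equals or {}).items():
        parts.append(f"{key}={value}")
    # Set-based requirements: key in (a,b) and key notin (a,b)
    for key, values in (in_sets or {}).items():
        parts.append(f"{key} in ({','.join(values)})")
    for key, values in (not_in_sets or {}).items():
        parts.append(f"{key} notin ({','.join(values)})")
    # Requirements are comma-separated and all of them must hold.
    return ",".join(parts)
```

For example, build_selector({"app": "web"}, {"environment": ["production"]}) yields the same string as the set-based command above. Remember to quote the result in the shell, since it contains spaces and parentheses.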

That's it!

With a well-chosen set of labels on each object, we can easily organize all the resources inside Kubernetes, apply configuration, and filter resources more effectively, especially as more and more services run inside a cluster.

I recommend always setting more than one label on important resources such as Deployments, StatefulSets, or specific Pods. It helps a lot when filtering log data by label.