Extending kubectl

  • September 7, 2019
  • tuxotron
  • Kubectl

    Kubectl Plugin

    As you probably already know, kubectl is the official tool to interact with Kubernetes from the command line. This tool, besides all the functionality it already provides, allows us to extend it through plugins.

    A kubectl plugin is nothing but a file with the following three requirements:

    • It has to be an executable (binary or script)
    • It must be in your system’s PATH
    • Its name must start with kubectl- (including the dash!)

    The plugin system in kubectl was introduced as alpha in version 1.8.0 and it was rewritten in version 1.12.0, which is the minimum version recommended if you are going to play with this feature.

    Let’s write our first plugin. It will be a bash script named kubectl-hello. This is its content:

    #!/bin/bash
    
    echo "Hello, World!"
    

    Prefixing the file name with kubectl- is one of the requirements. Another requirement is to make it an executable:

    chmod +x kubectl-hello
    

    The last requirement is to make sure the file is somewhere in the PATH. You can either copy the file to a directory that is already in the PATH, or add the directory where you created the file to the PATH. In our case, we’ll take the second approach:

    export PATH=$PATH:~/tmp/kplugin
    

    This command only takes effect in the session where you run it. Once that session is closed (or if you open a different terminal), the directory will no longer be part of your PATH. To make it persistent you will need to add that command to your ~/.bashrc, ~/.zshrc or similar.
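
    For instance, assuming you use bash, you can persist it by appending that export line to your ~/.bashrc:

    echo 'export PATH=$PATH:~/tmp/kplugin' >> ~/.bashrc
    source ~/.bashrc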

    Now we can run our plugin:

    kubectl hello
    Hello, World!
    

    Let’s create another plugin, a little bit more useful this time. It will also be a bash script. This plugin will generate a configuration file with the given account’s credentials (in this case a service account). This is pretty handy when you need to interact with a Kubernetes cluster from, let’s say, your CI/CD server. In order to give your server access to the cluster, you will need to provide some type of credential, so you can use this plugin to create a configuration file based on a service account, with the necessary credentials and cluster information.

    Our file will have the following content:

    #!/bin/bash
    
    set -e
    
    usage="
    USAGE: 
      kubectl kubeconf-generator -a SERVICE_ACCOUNT -n NAMESPACE
    "
    
    while getopts a:n: option
    do
    case "${option}"
    in
    a) SA=${OPTARG};;
    n) NAMESPACE=${OPTARG};;
    esac
    done
    
    [[ -z "$SA" ]] && { echo "Service account is required" ; echo "$usage" ; exit 1; }
    [[ -z "$NAMESPACE" ]] && { echo "Namespace is required" ; echo "$usage" ; exit 1; }
    
    # Get secret name
    SECRET_NAME=$(kubectl get sa $SA -n $NAMESPACE -o jsonpath='{.secrets[0].name}')
    
    # Get secret value
    SECRET=$(kubectl get secret $SECRET_NAME -n $NAMESPACE -o jsonpath='{.data.token}' | base64 -D) # -D on macOS; use -d on Linux
    
    # Get cluster server name
    SERVER=$(kubectl config view --minify -o json | jq -r '.clusters[].cluster.server')
    # Get cluster name
    CLUSTER_NAME=$(kubectl config view --minify -o json | jq -r '.clusters[].name')
    
    # Get cluster certs
    CERTS=$(kubectl config view --raw --minify -o json | jq -r '.clusters[].cluster."certificate-authority-data"')
    
    cat << EOM
    apiVersion: v1
    kind: Config
    users:
    - name: $SA
      user:
        token: $SECRET
    clusters:
    - cluster:
        certificate-authority-data: $CERTS
        server: $SERVER
      name: $CLUSTER_NAME
    contexts:
    - context:
        cluster: $CLUSTER_NAME
        user: $SA
      name: svcs-acct-context
    current-context: svcs-acct-context
    EOM
    

    As you can see, besides kubectl, this plugin also uses jq. So to make it work, you need to have that tool installed as well.
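
    jq is available from most package managers; for instance, on macOS with Homebrew, or on a Debian-based Linux:

    brew install jq
    # or, on Debian/Ubuntu
    sudo apt-get install jq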

    The plugin expects two parameters: the service account and the namespace for such account.

    Let’s dump the content listed above into a file named kubectl-kubeconf_generator. Pay attention to the underscore character. We’ll come back to this later.

    Let’s run our plugin without any parameters:

    kubectl kubeconf_generator
    Service account is required
    
    USAGE:
      kubectl kubeconf-generator -a SERVICE_ACCOUNT -n NAMESPACE
    

    Now let’s call it with a service account (default) and a namespace (default). This time the plugin will fetch some information from the cluster defined in your current context. In this case I’m using minikube (if you are too, make sure it is up and running):

    kubectl kubeconf_generator -a default -n default
    apiVersion: v1
    kind: Config
    users:
    - name: default
      user:
        token: REDACTED
    clusters:
    - cluster:
        certificate-authority-data: null
        server: https://192.168.99.110:8443
      name: minikube
    contexts:
    - context:
        cluster: minikube
        user: default
      name: svcs-acct-context
    current-context: svcs-acct-context
    

    The output is a configuration file we could use to provide access to our minikube using the default service account from the default namespace (with whatever permissions granted to that account).
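
    For example, you could redirect the output to a file and point kubectl at it (the file name here is just an example):

    kubectl kubeconf_generator -a default -n default > ci-kubeconfig.yaml
    kubectl --kubeconfig=ci-kubeconfig.yaml get pods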

    Let’s go back to the name of the file. Remember, in our case we named our file kubectl-kubeconf_generator. When kubectl sees an underscore character in the plugin name, it allows us to call the plugin with either the underscore or a dash:

    kubectl kubeconf_generator -a default -n default
    ...
    

    or:

    kubectl kubeconf-generator -a default -n default
    ...

    Now, if you rename the file to kubectl-kubeconf-generator (replacing the underscore with a dash), kubectl recognizes it as a command plus a subcommand. So to call it this time, we will have to put a blank space between kubeconf and generator:

    kubectl kubeconf generator -a default -n default
    ...
    

    Following this pattern we can create subcommands for our plugin command. For instance, imagine we want to be able to generate two different output formats: yaml and json. We could create two files:

    kubectl-kubeconf-generator-yaml
    kubectl-kubeconf-generator-json
    

    And this is how we would invoke them:

    kubectl kubeconf generator json ...
    ...
    kubectl kubeconf generator yaml ...
    ...
    

    *This example is not the best way to customize the output. For that you may want to use a parameter, probably -o in this case, to align with kubectl “standards”.*
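
    A minimal sketch of that approach, assuming we extend the getopts loop of our existing script with an -o flag (FORMAT is a hypothetical variable, defaulting to yaml):

    while getopts a:n:o: option
    do
    case "${option}"
    in
    a) SA=${OPTARG};;
    n) NAMESPACE=${OPTARG};;
    o) FORMAT=${OPTARG};; # yaml (default) or json
    esac
    done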

    To wrap up, it is worth mentioning that you can’t override an existing kubectl command. For instance, kubectl provides the version command, so if you create a new plugin and name it kubectl-version, when you call kubectl version the internal version command will be called, and not your plugin. There are also some rules about name conflicts, such as when you have two plugins with the same name in different directories, but I’m not going to touch on those rules in this entry. You can always consult the official kubectl plugin documentation.

    Last but not least, I just want to mention that kubectl has a plugin command with a list option, which shows the kubectl plugins on our system:

    kubectl plugin list
    The following compatible plugins are available:
    
    /Users/tuxotron/tmp/kplugin/kubectl-hello
    /Users/tuxotron/tmp/kplugin/kubectl-kubeconf-generator
    

    In a future entry I will talk about a cleaner way to handle and install plugins.

How kubectl uses Kubernetes API

  • August 19, 2019
  • tuxotron
  • kubectl

    kubectl (Source: https://blog.risingstack.com/what-is-kubernetes-how-to-get-started/)

    As you probably already know, any query or command you run against Kubernetes is carried out by sending an API request to a component called the API server. This component lives in the master node(s).

    API Server (Source: https://blog.openshift.com/kubernetes-deep-dive-api-server-part-1/)

    Although you have several graphical options, the most common way to interact with a Kubernetes cluster is through a command line tool called kubectl.

    This tool provides quite a large number of options, but in this entry I’m going to focus on verbosity, which is a very handy option if we want to learn more about how kubectl interacts with the component mentioned previously: the API server.

    For any command we run through kubectl, we can ask for more verbose output by adding the -v or --v option. This option takes as a parameter the level of verbosity we would like to get out of our command. The level is specified by a number between 0 and 9 inclusive, and each level provides a certain degree of verbosity, as you can see in the following image:

    Verbosity levels (Source: https://kubernetes.io/docs/reference/kubectl/cheatsheet/)

    For instance, if we run the following command:

    kubectl get pods
    NAME                     READY   STATUS    RESTARTS   AGE
    nginx-7bb7cd8db5-8z6t8   1/1     Running   0          33s
    

    We get the pods running in the default namespace.

    Now, if we add to the previous command the option -v=5:

    kubectl get pods -v=5
    I0819 17:02:54.174578   30833 get.go:564] no kind "Table" is registered for version "meta.k8s.io/v1beta1" in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:30"
    NAME                     READY   STATUS    RESTARTS   AGE
    nginx-7bb7cd8db5-8z6t8   1/1     Running   0          26s
    

    Besides the same result we obtained previously, we can also see some extra information. Levels between 0 and 5 give you back extra information about what’s going on while kubectl runs, which can be very helpful for debugging purposes. But in this post I want to focus on levels 6 to 9, in which kubectl also provides information about the API resources called and the information sent and received (headers and body) in those calls.

    Let’s run the previous command again, changing the verbosity level to 6:

    kubectl get pods -v=6
    ...
    I0819 17:11:39.565753   30923 round_trippers.go:438] GET https://192.168.99.110:8443/api/v1/namespaces/default/pods?limit=500 200 OK in 12 milliseconds
    ...
    

    As you can see, we now get some extra information about the calls made to the API server, in this case a GET request to https://192.168.99.110:8443/api/v1/namespaces/default/pods?limit=500. Here you can also see the limit parameter issued by kubectl, so if you run a get pods and only get 500 pods back, now you know where the limitation is coming from ;)
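
    That limit comes from kubectl’s --chunk-size flag, which defaults to 500 (passing 0 disables chunking). For instance:

    kubectl get pods --chunk-size=0 -v=6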

    Let’s try now with verbosity level 7:

    kubectl get pods -v=7
    ...
    I0819 17:22:29.600084   31029 round_trippers.go:416] GET https://192.168.99.110:8443/api/v1/namespaces/default/pods?limit=500
    I0819 17:22:29.600108   31029 round_trippers.go:423] Request Headers:
    I0819 17:22:29.600118   31029 round_trippers.go:426]     Accept: application/json;as=Table;v=v1beta1;g=meta.k8s.io, application/json
    I0819 17:22:29.600132   31029 round_trippers.go:426]     User-Agent: kubectl/v1.15.2 (darwin/amd64) kubernetes/f627830
    I0819 17:22:29.612086   31029 round_trippers.go:441] Response Status: 200 OK in 11 milliseconds
    ...
    

    As you can see, the difference between levels 6 and 7 is that at level 6 we only see the resources called, while at level 7 we also see the HTTP headers of those calls.

    In levels 8 and 9, besides the headers, we also get the body content. The difference between these two is that at level 8 the content is truncated, while at level 9 it is not. Let’s see an example:

    kubectl get pods -v=8
    ...
    I0819 17:22:22.188395   31000 request.go:947] Response Body: {"kind":"Table","apiVersion":"meta.k8s.io/v1beta1","metadata":{"selfLink":"/api/v1/namespaces/default/pods","resourceVersion":"70162"},"columnDefinitions":[{"name":"Name","type":"string","format":"name","description":"Name must be unique within a namespace. Is required when creating resources, although some resources may allow a client to request the generation of an appropriate name automatically. Name is primarily intended for creation idempotence and configuration definition. Cannot be updated. More info: http://kubernetes.io/docs/user-guide/identifiers#names","priority":0},{"name":"Ready","type":"string","format":"","description":"The aggregate readiness state of this pod for accepting traffic.","priority":0},{"name":"Status","type":"string","format":"","description":"The aggregate status of the containers in this pod.","priority":0},{"name":"Restarts","type":"integer","format":"","description":"The number of times the containers in this pod have been restarted.","priority":0},{"name":"Age","type":"string [truncated 2611 chars]
    ...
    

    Let’s check out now a different command with a little bit more complexity:

    kubectl describe pod nginx-7bb7cd8db5-8z6t8 -v=6
    ...
    I0819 17:26:27.770772   31121 round_trippers.go:438] GET https://192.168.99.110:8443/api/v1/namespaces/default/pods/nginx-7bb7cd8db5-8z6t8 200 OK in 12 milliseconds
    I0819 17:26:27.777728   31121 round_trippers.go:438] GET https://192.168.99.110:8443/api/v1/namespaces/default/pods/nginx-7bb7cd8db5-8z6t8 200 OK in 2 milliseconds
    I0819 17:26:27.786906   31121 round_trippers.go:438] GET https://192.168.99.110:8443/api/v1/namespaces/default/events?fieldSelector=involvedObject.name%3Dnginx-7bb7cd8db5-8z6t8%2CinvolvedObject.namespace%3Ddefault%2CinvolvedObject.uid%3D9e77227d-cc08-4365-aeab-c0bbbfc1c1d8 200 OK in 2 milliseconds
    ...
    

    Here you can observe that describe requires more than one call to the API server.

    And lastly, let’s create a deployment using the run command:

    kubectl run nginx2 --image nginx -v=8
    ...
    I0819 17:29:23.727063   31398 round_trippers.go:416] GET https://192.168.99.110:8443/apis/apps/v1?timeout=32s
    I0819 17:29:23.727097   31398 round_trippers.go:423] Request Headers:
    ...
    I0819 17:29:23.749539   31398 request.go:947] Request Body: {"kind":"Deployment","apiVersion":"apps/v1","metadata":{"name":"nginx2","creationTimestamp":null,"labels":{"run":"nginx2"}},"spec":{"replicas":1,"selector":{"matchLabels":{"run":"nginx2"}},"template":{"metadata":{"creationTimestamp":null,"labels":{"run":"nginx2"}},"spec":{"containers":[{"name":"nginx2","image":"nginx","resources":{}}]}},"strategy":{}},"status":{}}
    I0819 17:29:23.749618   31398 round_trippers.go:416] POST https://192.168.99.110:8443/apis/apps/v1/namespaces/default/deployments
    I0819 17:29:23.749631   31398 round_trippers.go:423] Request Headers:
    I0819 17:29:23.749638   31398 round_trippers.go:426]     Content-Type: application/json
    I0819 17:29:23.749645   31398 round_trippers.go:426]     User-Agent: kubectl/v1.15.2 (darwin/amd64) kubernetes/f627830
    ...
    

    In this case we see that kubectl not only makes GET requests, but also a POST request, and because we are using verbosity level 8, we can also see the body content of that POST request, as well as the responses coming back from the GET and POST calls.
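
    By the way, an easy way to replay these calls yourself is kubectl proxy, which exposes the API server on a local port and takes care of authentication for you. A quick sketch, reusing the pod list URL we saw in the logs above:

    kubectl proxy --port=8001 &
    curl 'http://127.0.0.1:8001/api/v1/namespaces/default/pods?limit=500'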

    This is a very nice way to see how kubectl uses the Kubernetes API behind the scenes, and also an interactive way to learn more about this API, besides obviously consulting its documentation.

    Happy Hacking!

Moving the blog images from Flickr to Digital Ocean Spaces

  • February 3, 2019
  • tuxotron
  • Cyberhades Digital Ocean Spaces


    In this blog we have been using Flickr as our main image repository since 2008. We even paid for a pro account for a couple of years, back in 2015 and 2016, although I can’t recall the benefits of the pro account versus the free one.

    Our experience with Flickr has always been very positive and we never had an issue with them, but after the acquisition of Flickr by SmugMug, they recently changed their policies and announced that free accounts will be limited to 1,000 images; the rest will be removed. In Cyberhades we have exactly 3,997 images in Flickr, so if we want to keep all these images we need to pay for a pro account, which is around $50 a year or $6 a month. The Flickr pro account offers more than just an unlimited number of images, and if you are a photographer you may benefit from those other perks, but in our case, besides the CDN, we are not getting any benefit.

    The reason for this blog post is not to announce that we moved away from Flickr, but to explain how we did it.

    The first thing we needed to decide was where to migrate. After looking into several cloud providers, we decided to go with Digital Ocean’s (DO) Spaces. One thing I do like about DO is their fixed price policy; also, we moved our blog infrastructure to DO about 3 years ago and the service and experience have been excellent.

    DO offers a service called Spaces. It is compatible with AWS S3, which means you can interact with it using any tool that can talk to AWS S3, and it also has a CDN, which is pretty much all we need for the blog.

    Once we had decided where to migrate, it was time to get our hands dirty. The first thing we needed was to download our pictures from Flickr. Luckily, Flickr allows you to download all the data they have about you, including all the files (images, videos, etc). You can do this from your account settings page, where there is an option to request your data. After you do so, it can take a while, depending on how many files you have there.

    When your data is ready, you will see something like this:

    Flickr Data

    Each one of these zip archives contains 500 files. After downloading the archives and extracting their content, we faced our first problem: the filenames are not the same as when we uploaded them. Flickr adds an id number plus “_o” at the end of the filename, before the extension. For instance, if you upload an image to Flickr named libro-microhistorias-informatica--nuevo0xword.jpg, Flickr will store it as something like libro-microhistorias-informatica--nuevo0xword_8768892888_o.jpg. That 8768892888 is a unique identifier.

    The next problem we had to solve was how to know which picture corresponded to which link in our current blog posts. Some of the links to Flickr looked like this: https://farm4.staticflickr.com/3803/8768892888_8932423465.jpg. As you can see, there is no picture name in the link, although we do have the picture id. So we needed a way to map the picture ids to their filenames in order to later replace all these links.

    To do that I wrote a small Python script that reads all the images downloaded from Flickr, extracts the id from each filename and stores it in a dictionary as a key, with the filename as its value. Here is a little snippet:

    from os import listdir
    from os.path import isfile, join

    def loadFilenames(picspath):

        dict = {}

        # Keep only regular files (not directories) under picspath
        onlyfiles = [f for f in listdir(picspath) if isfile(join(picspath, f))]
        for entry in onlyfiles:
            tokens = entry.split("_")
            # With the "_o" (and later "_opt") suffixes appended to the names,
            # the Flickr id ends up third or fourth from the end
            if tokens[-3].isdigit():
                dict[tokens[-3]] = entry
            if len(tokens) > 3 and tokens[-4].isdigit():
                dict[tokens[-4]] = entry

        return dict
    

    The next step was to find all the links to Flickr in our over 6,000 blog posts. We found out that the links weren’t consistent; there were slightly different link formats. We identified 3 different groups, so to match them we came up with 3 different regular expressions:

    matches = re.findall("http[s]?://farm?.\.static\.*flickr\.com/\d+/\d+_\w+\.[a-z]{3,4}", c)
    matches = matches + re.findall("http[s]?://www\.flickr\.com/photos/cyberhades/\d+/*", c)
    matches = matches + re.findall("http[s]?://c?.\.staticflickr\.com/\d+/\d+/\d+_\w+\.[a-z]{3,4}", c)
    

    After this, we needed to go through each blog post, find any link matching any of these regular expressions and replace it with the corresponding link from DO Spaces. But to do this, we first needed our images in DO.

    Spaces is $5 a month, and you get 250 GB of space. When we created our Space, we activated the CDN option. Once the Space was created, we were ready to start uploading our images. You can use their web interface, a third-party client compatible with S3, or write your own code to do so. As a passionate developer, I took the latter option :) and wrote a small Go application for that. One thing to keep in mind is to make sure you make the images public, and you also need to set the right Content-Type. You will also need to create a token from the DO website to access your Space. Here is a little snippet:

    // This snippet uses the minio-go S3 client (github.com/minio/minio-go)
    func GetFileContentType(out *os.File) (string, error) {
    
      // Only the first 512 bytes are used to sniff the content type.
      buffer := make([]byte, 512)
    
      _, err := out.Read(buffer)
      if err != nil {
        return "", err
      }
      contentType := http.DetectContentType(buffer)
    
      return contentType, nil
    }
    
    ...
    // Make the file public
    userMetaData := map[string]string{"x-amz-acl": "public-read"}
    
    // Upload the file with FPutObject
    n, err := client.FPutObject(spaceName, objectName+strings.Replace(path, dirPath, "", 1), path, minio.PutObjectOptions{ContentType: contentType, CacheControl: cacheControl, UserMetadata: userMetaData})
    if err != nil {
      log.Fatalln(err)
    }
    ...
    
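    If you’d rather not write code, any S3-compatible client works as well. For instance, a sketch using the AWS CLI pointed at the Spaces endpoint (the space name here is an example; credentials would come from aws configure):

    aws s3 cp image_opt.jpg s3://myspace/imagenes/ --endpoint-url https://ams3.digitaloceanspaces.com --acl public-read --content-type image/jpeg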

    There is one more thing we needed to do before uploading all the images to our Space. The images we downloaded from Flickr are the original ones, which means most of them are pretty large and not suitable for a blog post. One thing Flickr does when you upload an image is create different sizes out of the original, ideal for blogs and other uses. So before uploading our images, we needed to downscale them to 600px width, without upscaling any image that is smaller, and keeping the aspect ratio. To do this we used the magnificent ffmpeg:

    for i in *; do ffmpeg -i "$i" -vf "scale='min(600,iw)':-1" "${i%.*}_opt.${i#*.}"; done
    

    With the scale='min(600,iw)' option we are telling ffmpeg to only scale down images whose width is larger than 600px, and with :-1 we are telling it to compute the height so that the aspect ratio is kept. This generates another set of files with the same names but with “_opt” appended at the end (before the extension); this way, we preserve the original images.

    Now it is time to upload our optimized images to DO.

    Cyberhades Images

    Once your images are uploaded, you can see there are two links available to access them: Origin and Edge. The link we are interested in is the Edge one, which uses the CDN to deliver our images.

    Finally, all we need to do is to replace all the Flickr links in our blog posts with the new ones. Here is a snippet of the code that takes care of that part:

    ...
    for match in matches:
        if '_' in match:
            k = match.split('/')[-1].split('_')[0]
        else:
            tokens = match.split('/')
            if tokens[-1].isdigit():
                k = tokens[-1]
            else:
                k = tokens[-2]
    
        if k in dict:
            c = c.replace(match, "https://cyberhades.ams3.cdn.digitaloceanspaces.com/imagenes/" + dict[k])
    
    print(entry)
    o = open(entry, "w", encoding = "ISO-8859-1")
    o.write(c)
    o.close()
    ...
    

    The complete Python script can be found here.

    To wrap up this post: DO Spaces is not cheaper than paying for a Flickr pro account (if you pay yearly), but with Spaces we have more control over our pictures, and now all the links are consistent, which means that if we ever need to do another migration or anything else, our life will be way easier.