Deploy Wasm with Knative
Wasm did you say?
One of the latest fun tech topics is server-side WebAssembly (Wasm). Wasm has started showing up all over because it has many characteristics that make it really exciting: fast cold-start times, security, and portability.
WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications.
— webassembly.org
There are a number of ways to get started with Wasm, including many different frameworks, runtimes, and open source projects. Navigating the world of Wasm can be a bit difficult, but for good reason: the tech is very young. It is maturing rapidly, though, and there are lots of nice, helpful folks in the community who are more than willing to help out, so don’t be afraid to jump in and ask!
Use-Case
I’ve been using Wasm to deploy Functions-as-a-Service (FaaS) style apps for a number of months now. Think of an AWS Lambda-style service where you write code in a browser and have it hosted and executed for you automatically. A challenge with this type of service is that it needs to run untrusted code written by potentially untrusted people.
There are a number of ways you could accomplish this: booting VMs, utilizing another service (e.g. AWS Lambda), or running a hypervisor like Firecracker. These are all great solutions, but as with any system design decision, you need to pick what’s best for you.
Leaving that discussion for another blog post, let’s get to the point of this one. =)
Kubernetes, Knative, & Spin
Kubernetes is my go-to platform for deploying applications. I’ve been a part of it for some time now and have helped to contribute some tech to it over the years.
Kubernetes is great because of something I find really interesting: control loops. I realize this isn’t a new topic in software engineering, but it fits my mental model very nicely: you declare a desired state, and the system takes action to bring the actual state back in line with it as needed.
For Wasm specifically, an interesting piece of tech in the community is Spin from the folks at Fermyon. Spin is great because you can “spin up” a new template based on the language you want to code in, then run a few commands, and all the complexity of Wasm (WIT files, WAT files, etc.) is abstracted away from you.
At the moment, my FaaS service runs Spin apps represented as Kubernetes Deployments, which means one (or more) pods are running all the time regardless of how many requests are directed at them. For a typical company this is fine: if you deployed a service, you’d expect traffic, otherwise why deploy it?
For my scenario, there are free tiers and paid tiers, but all users can deploy apps, so this means I have a lot of users that sign up, deploy something, then never come back. This leaves my Kubernetes cluster still running that “hello-world” app all the time which is expensive and wasteful.
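As an illustrative sketch (not the actual manifest from this service), an always-on Spin app deployed as a plain Kubernetes Deployment might look roughly like the following. The `hello-world` name and `app` label are hypothetical; the image and the `wasmtime-spin` runtime class are the ones used later in this post.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world            # hypothetical name
spec:
  replicas: 1                  # at least one pod runs at all times, even with zero traffic
  selector:
    matchLabels:
      app: hello-world         # hypothetical label
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      runtimeClassName: wasmtime-spin   # RuntimeClass defined in the setup section
      containers:
        - name: hello-world
          image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:v0.10.0
          command: ["/"]
```

Nothing in a Deployment like this reacts to request volume, which is why an abandoned app keeps consuming cluster resources.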
Knative provides (among other features) a way to deploy an application and have its replica count scaled up and down automatically. This gives me a “scale-to-zero” approach that only runs the applications that are in use, and it automatically handles scale, which matters since Wasm is single-threaded at the moment (more on this later).
Set up
Mikkel Hegnhoj from Fermyon has a great blog post that outlines much of what I’ve already covered and also walks through some setup you’d need to do on your cluster. I’d recommend reading through that to get started, but I’ll copy/paste my setup below for you to follow.
Runtime & Containerd Shim
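Deploying these two files will create a runtime class for Spin and install a corresponding containerd shim to execute the Wasm. Note that the DaemonSet only schedules onto nodes labeled wasm-runtime=spin-20, so make sure the nodes that should run Wasm carry that label.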
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin
handler: spin
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spin-installer-20
spec:
  selector:
    matchLabels:
      name: spin-installer-20
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: spin-installer-20
    spec:
      nodeSelector:
        kubernetes.io/os: linux
        wasm-runtime: spin-20
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      hostPID: true
      volumes:
        - name: host-root
          hostPath:
            path: /
      initContainers:
        - name: installer
          image: ghcr.io/fermyon/spin-containerd-shim-installer:0.10.0
          imagePullPolicy: Always
          securityContext:
            privileged: true
          env:
            - name: HOST_ROOT
              value: /host
          volumeMounts:
            - name: host-root
              mountPath: /host
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.1
          imagePullPolicy: IfNotPresent
Install Knative
Knative can be installed a number of ways; however, on EKS (where I deploy this to), I ran into a couple of problems that led me to use their Operator for deployment.
Install the operator into your cluster:
$ kubectl apply -f https://github.com/knative/operator/releases/download/knative-v1.12.2/operator.yaml
Configure an instance of Knative Serving which we’ll use next to deploy our Wasm application.
NOTE: I’m using a custom ingress-class of contour since that’s my preferred Ingress controller, but you should be able to swap it out if desired. See the docs if you want more detail.
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
---
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  ingress:
    contour:
      enabled: true
  config:
    network:
      ingress-class: "contour.ingress.networking.knative.dev"
    domain:
      "kn.stevesloka.com": ""
    features:
      kubernetes.podspec-runtimeclassname: "enabled"
      kubernetes.podspec-affinity: "enabled"
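Scale-to-zero behavior can also be tuned cluster-wide through the same KnativeServing resource. The fragment below is a sketch of the extra keys that could be merged under `spec.config`; it is not part of the original setup, and the values shown are, as far as I know, simply the Knative defaults written out explicitly.

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    # Entries here end up in the config-autoscaler ConfigMap.
    autoscaler:
      enable-scale-to-zero: "true"        # allow idle revisions to drop to zero pods
      scale-to-zero-grace-period: "30s"   # upper bound on waiting before the last pod is removed
```

Merge these keys into the existing resource rather than applying the fragment on its own, so the ingress, domain, and feature settings stay in place.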
Deploy!
Now that everything is running, let’s deploy our app and see how things work.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: steve-spin2
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: wasm-runtime
                    operator: In
                    values:
                      - spin-20
      runtimeClassName: wasmtime-spin
      timeoutSeconds: 30
      containers:
        - image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:v0.10.0
          ports:
            - containerPort: 80
              protocol: TCP
          command:
            - "/"
          livenessProbe:
            tcpSocket:
              port: 80
Now when you curl the endpoint of your application (e.g. https://steve-spin2.default.kn.stevesloka.com), Knative will dynamically increase the replica count of your application as needed, and when no requests come through, it will scale down to zero.
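If you want to bound that scaling per service, Knative exposes autoscaling annotations and a concurrency limit on the revision template. The fragment below is a sketch of what could be added to the Service above; the specific numbers are assumptions chosen for illustration, not settings from my deployment.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: steve-spin2
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # permit scale-to-zero when idle
        autoscaling.knative.dev/max-scale: "5"   # illustrative cap on replicas
    spec:
      # Hard limit of one in-flight request per pod (an assumption, chosen to
      # illustrate pushing concurrency onto the replica count).
      containerConcurrency: 1
      # affinity, runtimeClassName, timeoutSeconds, and containers stay as shown above
```

With a concurrency limit of one, each additional simultaneous request pushes Knative to add another pod, up to the max-scale cap.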
Conclusion
There is so much more to what I just described, but hopefully it’s enough to get you excited about Wasm and its surrounding community! Please reach out with questions and I’ll look to make a deeper, technical post about how this works in the future.