It was our great honour to speak at Kubeflow Summit 2019. We talked about how we use Kubeflow to help European financial institutions deploy their ML workloads.
The summit took place in the fall of 2019 on Google's premises in Silicon Valley, alongside a larger community event focused on TensorFlow Extended. It was the first Kubeflow summit where the community of developers exchanged ideas with early adopters, and CREDO is one of those early adopters.
CREDO uses Kubeflow as the runtime and orchestration environment for ML workflows in the YQ solution. Rado presented a Kubeflow setup in the context of a financial institution, a setting quite different from the one Kubeflow was originally designed for. He briefly covered the advantages and challenges this setup brings and how CREDO tackles them.
The presentation can be found here.
Walking with the giants
Kubeflow may be a relatively new open-source project (currently in beta), but the community is backed by industry giants such as Google, IBM, and Cisco.
Community effort led by Google, IBM and Cisco.
As we soon learned from the summit agenda, there were some 'big names' on the early-adopter side as well (CERN, Spotify, LinkedIn, Microsoft, JP Morgan Chase, GitHub).
Among the fellow speakers' talks, we especially liked Michelle Casbon's (Google) on how Kubeflow helped CERN's scientists sift through petabytes of data in their quest for the Higgs boson.
Another talk we truly enjoyed was Josh Baer's (Spotify) on how Kubeflow Pipelines helps Spotify automate its daily machine learning workflows.
What is Kubeflow
Kubeflow started off at Google as an outgrowth of TensorFlow Extended pipelines, with the ambition to broaden the "pipeline on Kubernetes" idea to other ML frameworks (PyTorch, Keras, Spark, scikit-learn, …).
Kubeflow is a machine learning toolkit for Kubernetes.
Citing the Kubeflow documentation, the mission of Kubeflow is to make scaling and deploying ML models to production as simple as possible, by letting Kubernetes do what it's great at:
- Easy, repeatable, portable deployments on a diverse infrastructure (for example, experimenting on a laptop, then moving to an on-premises cluster or to the cloud)
- Deploying and managing loosely-coupled microservices
- Scaling based on demand
Because ML practitioners use a diverse set of tools, one of the key goals is to customize the stack based on user requirements (within reason) and let the system take care of the “boring stuff”.
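To make the "pipeline on Kubernetes" idea concrete: a Kubeflow pipeline is essentially a DAG of containerized steps that the system schedules in dependency order. The following is a toy sketch of that idea in plain Python (not the actual Kubeflow Pipelines SDK, and the step names are hypothetical); in a real pipeline, each step would run in its own container on the cluster.

```python
# Toy sketch of the "pipeline" idea: an ML workflow expressed as a DAG of
# steps. Kubeflow would run each step as its own container on Kubernetes;
# here we only compute a valid execution order. Plain Python, not the kfp SDK.
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the steps it depends on.
workflow = {
    "ingest": [],
    "preprocess": ["ingest"],
    "train": ["preprocess"],
    "evaluate": ["train"],
    "deploy": ["evaluate"],
}

def run_order(dag):
    """Return one valid execution order for the workflow steps."""
    return list(TopologicalSorter(dag).static_order())

print(run_order(workflow))
# → ['ingest', 'preprocess', 'train', 'evaluate', 'deploy']
```

In the real SDK you would declare each step as a component with an image and command, and the framework would derive the same kind of ordering from the data passed between steps.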