The KubeCon EU morning keynotes were a veritable call to action for the cloud-native community to get involved and help evolve the important technologies and themes that were a focus of this year’s event. The first call related to AI, asking the community to help scale the infrastructure required to meet generative AI’s computational needs. This call was balanced with encouragement to make a cloud-native platform’s "golden path" green and sustainable, ensuring that any innovation is also responsible.
The opening day keynote of KubeCon + CloudNativeCon EU 2024 focused on the combination of AI and Kubernetes, exploring how the cloud-native ecosystem can help meet the computational demands of the generative AI gold rush. Priyanka Sharma, executive director of the CNCF, opened with a demo, running the LLaVA model with Ollama to analyse a picture of the audience. She continued by introducing a CNCF reference architecture for AI applications, which is part of a whitepaper from the CNCF AI Working Group.
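The exact commands used on stage were not shown, but the idea is straightforward to reproduce. The minimal sketch below calls Ollama's local REST API, assuming Ollama is running on its default port with the llava model already pulled (`ollama pull llava`); the image filename is a placeholder.

```python
import base64
import json
import urllib.request

# Read and base64-encode the image, as Ollama's generate API expects
# images as a list of base64-encoded strings. "audience.jpg" is a
# placeholder for whatever picture you want analysed.
with open("audience.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Ollama exposes a local REST API on port 11434 by default.
payload = json.dumps({
    "model": "llava",
    "prompt": "Describe this picture of the audience.",
    "images": [image_b64],
    "stream": False,  # ask for a single JSON response rather than a stream
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```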
The keynote continued with a panel consisting of Paige Bailey, lead product manager (generative models) at Google, Timothee Lacroix, chief technology officer of Mistral, and Jeffrey Morgan, founder of Ollama, exploring the needs of the cloud-native AI ecosystem. They felt that today there is a big gap between AI engineers and operations engineers, similar to the gap a couple of years ago between operations engineers and software developers. All three concluded that as more AI applications leave the research labs and move into real life, it will be important to work together, especially since infrastructure engineers are currently playing catch-up with the needs of AI engineers.
The panel considered that the ecosystem will change as developers discover the best size of AI model for their scenario. They anticipated that hardware manufacturers would also start providing different offerings that are more efficient for different scenarios. The use cases of AI fall into three categories: training, fine-tuning and inference, each with different requirements. According to the panel, everything needs to be released as open source, from the models to the tools being used. Lacroix went a step further and stated that even the hardware should be open source.
Invited by Sharma, the three panellists each shared a wish for the future of the open-source community:
Paige Bailey: Even though much experimentation is done on top of paid services, it is of utmost importance for the future that the community starts creating patterns that replicate, with purely open-source tooling, the experience of working with a paid-service API.
Timothee Lacroix: We need to abstract the hardware away, so that when running in production I shouldn’t have to care whether my GPU dies in some weird way, or whether I need to scale my infrastructure.
Jeffrey Morgan: These models are not running in a vacuum; we have built a fair share of powerful tooling to run applications at scale, and these challenges will keep repeating in this new world of running AI-powered applications. My big ask is how we take existing tooling, like monitoring, scanning, and logging, to the world of AI, and do it fast, as the AI landscape is evolving swiftly.
To balance the exuberance around AI, the other message relayed throughout the keynotes was "efficiency". Aparna Subramanian, director of production engineering at Shopify, encouraged the cloud-native community to create architectures and platforms that provide the ability to "innovate responsibly". Ignoring this advice translates not only into higher costs, but also into a bigger carbon footprint and higher water consumption.
Next in the keynotes, WebAssembly (Wasm) was introduced as a feasible runtime option, even when working with Kubernetes, that can significantly improve an application’s resource footprint. In the example provided by Kai Walter, a distinguished architect at Zeiss, the memory used by the application decreased from 423 MB to 2.4 MB just by using Wasm as the runtime. The keynotes also saw the announcement that SpinKube was being donated to the CNCF as a sandbox project.
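SpinKube’s own manifests were not shown in the keynote, but the underlying Kubernetes mechanism such projects build on can be sketched as follows: a RuntimeClass that maps to a Wasm-capable containerd shim installed on the nodes, referenced from an otherwise ordinary Deployment. All names and the image below are illustrative, not taken from the demo.

```yaml
# A RuntimeClass whose handler must match a Wasm-capable containerd shim
# registered on the nodes (for Spin workloads, e.g. containerd-shim-spin).
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasm-spin
handler: spin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wasm-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wasm-demo
  template:
    metadata:
      labels:
        app: wasm-demo
    spec:
      # Run the pod on the Wasm runtime instead of the default OCI runtime.
      runtimeClassName: wasm-spin
      containers:
        - name: app
          # An OCI artifact containing the compiled Wasm module (illustrative).
          image: ghcr.io/example/wasm-demo:latest
```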
The pressure of making cloud computing greener doesn’t fall only on the cloud vendors, but also on the cloud consumers. Gualter Barbas Baptista, chief advisor of platform strategy and enablement at Deutsche Bahn, emphasised that by making our infrastructure greener, we can help meet the target of limiting global warming to 2 °C established by the 2015 Paris Climate Agreement. He mentioned that an easy win is to turn off infrastructure when it’s not needed, saving costs and producing less carbon, as sketched below. He also underlined that software developers have the power to architect an application more efficiently.
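Deutsche Bahn’s own tooling wasn’t detailed, but as a minimal sketch of the "turn it off when not needed" advice, a Kubernetes CronJob can scale a non-production Deployment to zero outside working hours. The names, schedule and service account here are illustrative, and the service account needs RBAC permission to patch the deployments/scale subresource; a matching morning CronJob would scale the workload back up.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev
  namespace: dev
spec:
  # 19:00 on weekdays: stop the dev workload overnight.
  schedule: "0 19 * * 1-5"
  jobTemplate:
    spec:
      template:
        spec:
          # Illustrative service account; it needs RBAC rights to
          # patch deployments/scale in this namespace.
          serviceAccountName: deployment-scaler
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:1.29   # illustrative kubectl image
              command:
                - kubectl
                - scale
                - deployment/dev-app
                - --replicas=0
```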
With the industry-wide migration to cloud-native platforms and infrastructure continuing, the CNCF community was encouraged to take rapid action and get involved with projects, a message underscored as the community celebrated its newly sandboxed, incubating and graduated projects.
2024 will also see the Kubernetes project celebrate the tenth anniversary of its inception, and the CNCF team announced that celebration events will be organised around the globe over the coming year. Given the importance and complexity of the challenges involved in adopting cloud-native technologies, the community was encouraged to contribute in any way it can. For example, the Zero to Merge program has been positioned to help new contributors get started.