
QCon New York 2023 Panel Discussion: Navigating the Future - LLM in Production

The recent QCon New York conference featured a panel discussion titled "Navigating the Future: LLM in Production." Key takeaways include two emerging trends in LLMs, closed models served behind APIs and open-source models, and the need for organizations using LLMs to think deeply about testing and evaluating the models themselves, with a strong emphasis on risk mitigation.

The panel was moderated by Bozhao (Bo) Yu. Panelists included Sherwin Wu, a member of technical staff at OpenAI; Hien Luu, Sr. engineering manager at DoorDash; and Rishab Ramanathan, co-founder & CTO of Openlayer. The panelists discussed questions about large language models (LLMs) posed by Yu and fielded a few more from the audience at the end of the session.

Yu began by asking the panelists for their opinions on the future of LLMs in production and how they will be used. Ramanathan predicted that there would be two broad categories of use: low-risk scenarios, such as internal document retrieval, and higher-risk scenarios, where LLMs would likely be used as a "copilot" rather than acting autonomously. Luu referred to a recent blog post by DoorDash's Head of AI, which identified five usage areas; Luu elaborated on the use case of LLMs as digital assistants. Wu posited that there would be a mix of use cases: calling out to APIs for "closed" foundation models vs. running self-hosted open-source models.

Yu next posed the question of whether operating LLMs (LLMOps) would continue to be a part of MLOps, or if it would be a new discipline. Luu, who manages an MLOps team, thought it would be an extension of MLOps, pointing out that the goal of MLOps is to allow an organization to use ML "quickly, easily and efficiently." Ramanathan agreed, but thought that there would be components of MLOps that might not be as important.

The next question was what parts of the ML workflow would be kept and what parts might need rethinking, in particular due to the challenges of serving very large models. Luu praised the efforts of the open-source community in researching ways to distribute models across GPUs. Wu suggested the focus would be on the input and output of the pipeline: the input being the data needed to fine-tune models, and the output being careful evaluation of the model output. Ramanathan seconded the need for evaluation, pointing out that consumers of an LLM's output should "think deeply about testing it and evaluating it themselves."
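The kind of output evaluation the panelists described can be sketched as a set of automated checks run against model responses. The following is a minimal illustration, not any specific tool mentioned in the discussion; the check names and thresholds are hypothetical.

```python
# Minimal sketch of automated checks on an LLM response, in the spirit
# of "testing and evaluating it themselves." All names are illustrative.

def evaluate_response(output: str, must_contain: list[str], max_chars: int) -> dict:
    """Run simple automated checks on a model response."""
    return {
        # Does the response mention every required term?
        "contains_required": all(term.lower() in output.lower() for term in must_contain),
        # Is the response within the expected length budget?
        "within_length": len(output) <= max_chars,
    }

result = evaluate_response(
    "Paris is the capital of France.",
    must_contain=["Paris", "France"],
    max_chars=200,
)
print(result)  # both checks pass for this response
```

In practice such checks would be run over a curated evaluation set on every model or prompt change, much as unit tests gate code changes.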

Yu concluded by asking the panelists for their "wish list" of LLM developments. Ramanathan, who had previously worked at Apple, wished for assistants such as Siri to gain abilities on par with ChatGPT. Wu wished for more progress on multimodal models as well as improvements in AI safety.

The panelists then answered several questions from audience members. One asked about whether prompt engineering would be a long-term need, or whether models would improve to the point where it was not needed. Wu agreed it was an open question, but speculated prompt engineering would be needed for at least five more years. Ramanathan pointed out that there were open-source libraries to help with prompt generation.
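The prompt-generation libraries Ramanathan alluded to typically revolve around reusable templates with named variables. The sketch below illustrates the general idea using only the standard library; the function and template names are hypothetical, not from any particular library.

```python
# Minimal sketch of templated prompt generation, similar in spirit to
# open-source prompt-engineering libraries. Names here are illustrative.

def build_prompt(template: str, **variables: str) -> str:
    """Fill a prompt template with named variables."""
    return template.format(**variables)

SUMMARIZE_TEMPLATE = (
    "You are a helpful assistant.\n"
    "Summarize the following text in {num_sentences} sentences:\n\n"
    "{text}"
)

prompt = build_prompt(
    SUMMARIZE_TEMPLATE,
    num_sentences="2",
    text="Large language models are increasingly deployed in production systems.",
)
print(prompt)
```

Centralizing prompts as templates makes them easier to version, review, and evaluate than prompts scattered inline through application code.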

Several audience members asked questions about privacy and regulation, especially in light of the recent EU AI Act. Wu said that OpenAI's perspective is that they would always follow the law, and would work to improve or fix their models to "have as far reach as possible." Ramanathan followed up by pointing out that the new Act would require transparency of training datasets; he noted however that the law was rather "handwavy."
