eBay recently shared the lessons it learned from applying generative AI to its development process. These efforts uncovered three pivotal avenues for enhancing developer productivity: integrating commercial offerings, fine-tuning existing Large Language Models (LLMs), and harnessing an internal knowledge network.
Adopting commercial AI solutions such as GitHub Copilot has yielded promising results for eBay's developer community. In a carefully designed A/B test, developers using Copilot demonstrated heightened productivity, marked by a 27% code acceptance rate (reported through Copilot telemetry) and 60% accuracy for the generated code. The introduction of GitHub Copilot also brought a decrease in PRs (about 17%) and a decrease in Lead Time for Change (about 12%). However, limitations such as prompt size constraints underscore the need for tailored solutions in the context of eBay's vast codebase.
By post-training and fine-tuning open-source LLMs such as Code Llama, and in particular Code Llama 13B, eBay has opened new ways to streamline labor-intensive tasks and mitigate code duplication. The development of eBayCoder, a bespoke model trained on the organization's proprietary data (code base and documentation), showcases the potential of LLM customization for addressing nuanced challenges unique to eBay's ecosystem. This approach worked well for some previously time-intensive tasks, such as upgrading libraries to fix vulnerabilities.
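The article does not describe eBay's training pipeline, but a common first step in fine-tuning a code model on proprietary data is converting repositories and documentation into prompt/completion records. The sketch below illustrates that preparation step under assumed file contents and a hypothetical `to_training_records` helper; it is not eBay's actual tooling.

```python
import json

# Hypothetical stand-ins for an internal code base and its documentation
# (illustrative only; a real pipeline would walk repositories on disk).
SOURCES = {
    "billing/client.py": "def charge(account_id, amount):\n    ...",
    "docs/billing.md": "The billing client retries failed charges three times.",
}

def to_training_records(sources: dict, max_chars: int = 2000) -> list:
    """Turn raw files into prompt/completion records for supervised fine-tuning."""
    records = []
    for path, text in sources.items():
        # Split long files into chunks that fit the model's context window.
        for start in range(0, len(text), max_chars):
            chunk = text[start:start + max_chars]
            records.append({
                "prompt": f"# File: {path}\n",  # the file path gives the model context
                "completion": chunk,
            })
    return records

def write_jsonl(records: list, path: str) -> None:
    """Serialize records as JSON Lines, a common fine-tuning input format."""
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

records = to_training_records(SOURCES)
```

The resulting JSONL file would then feed a standard supervised fine-tuning run against the base model.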
Given eBay's extensive and diverse codebase, a typical commercial Large Language Model might only access data and code directly pertinent to a specific query: usually nearby files, the current repository, and a handful of dependent libraries. Such models may overlook internal services or non-dependent libraries maintained by other teams, even when those already provide the functionality being developed. This often results in significant code redundancy. In contrast, a fine-tuned LLM can draw on a broader context, potentially reducing code duplication.
Recognizing the significance of streamlined access to internal knowledge, eBay has implemented an internal GPT-driven query system. Leveraging Retrieval Augmented Generation (RAG) techniques, the system integrates with existing documentation sources, giving developers timely and relevant answers. Despite occasional lapses in response quality, ongoing refinement via Reinforcement Learning from Human Feedback (RLHF) allows eBay to improve the system over time.
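eBay has not published the internals of its query system, but the core RAG loop can be sketched: embed the documentation, retrieve the snippets most similar to the query, and prepend them to the prompt sent to the LLM. The version below uses a toy bag-of-words similarity in place of a neural embedding model, and the document snippets are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical internal documentation snippets (illustrative, not eBay's docs).
DOCS = [
    "To rotate service credentials, run the credential-manager CLI with the rotate flag.",
    "Deployment pipelines are configured through the internal CI portal.",
    "Library upgrades for vulnerability fixes are tracked in the security dashboard.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; production systems use neural embedding models."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list:
    """Return the k documentation snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Augment the user query with retrieved context before calling the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How do I rotate service credentials?")
```

In a production system the retrieved context would be sent to the LLM, and user feedback on the answers would feed the RLHF refinement loop the article mentions.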
In the dynamic landscape of technological advancement, eBay's journey is a testament to the value of pragmatic AI integration in driving tangible outcomes for developers and organizations alike.