QCon AI New York Chair Wes Reisz talks with LinkedIn’s Karthik Ramgopal and Prince Valluri about enabling AI agents at enterprise scale. They discuss how platform teams orchestrate secure, multi-agentic systems, the role of MCP, the use of foreground and background agents, improving developer experience, and reducing toil.
Key Takeaways
- Platform engineering teams must recognize AI agents as a new execution model (on par with microservices) that requires shared, production-grade infrastructure for security, compliance, reliability, and observability to move beyond scattered proofs of concept.
- A comprehensive AI strategy distinguishes between "foreground agents" that assist developers directly in the IDE and "background agents" that autonomously handle repetitive toil, such as large-scale code migrations, upgrades, and test coverage improvements.
- To maintain high engineering standards, autonomous agents should operate within a secure sandboxed environment to execute specific developer intent, ultimately producing standard Pull Requests that must pass existing CI/CD pipelines and human review.
- MCP provides a unified, vendor-neutral way for LinkedIn’s foreground and background agents to use the same tools and enterprise context (powered by RAG, PR history, and semantic code indexes) while evals, sandboxing, and auditing ensure safe, compliant, production-grade agentic workflows.
- Organizations should avoid reinventing generic coding tools and instead focus on solving specific domain problems by feeding agents deep organizational context—such as historical PR data—and implementing rigorous evaluations to track performance.
Transcript
Wes Reisz: On today's episode of the InfoQ Podcast, we're diving into a topic that is rapidly transforming how engineering organizations build software today: how platform teams are enabling AI, or more specifically, how platform teams are enabling AI at enterprise scale.
Hi, my name is Wes Reisz, I'm the creator of the InfoQ Podcast and a technical principal at Thoughtworks, where I focus on platform engineering, AI-first software delivery, and the intersection of modern architectures with AI. I'm also the chair for QCon AI, which is the newest addition to the QCon family, focusing deeply on how software engineers and leaders are shipping responsibly with AI. Today's guests, Karthik Ramgopal and Prince Valluri, are leading the work at LinkedIn to bring secure, compliant, observable and developer-friendly multi-agentic systems into the platform.
Their QCon AI New York talk explores how LinkedIn's platform teams are giving engineers the ability to define tasks, orchestrate agents, and generate real PRs safely using structured specifications, governance, and a secure MCP framework.
In today's conversation, we'll walk through that and look specifically at how LinkedIn is approaching the shift: why platform teams, not just ML teams, are becoming central to enabling AI; what tools and abstractions are emerging; how MCP fits into the developer workflow and experience; and what it takes to build enterprise-scale, safe agentic systems. Prince, Karthik, welcome to the show.
Karthik Ramgopal: Hey, Wes, thanks for hosting us.
Wes Reisz: Absolutely. I'm excited about QCon AI too. Karthik, I think you are one of the first people I spoke with about getting a talk together, so thank you for supporting it and thank you for sharing everything you all are doing at LinkedIn with us.
Karthik Ramgopal: Sure, thanks for the opportunity.
Moving Beyond Siloed Proofs of Concept [02:16]
Wes Reisz: So both of you are involved in leading efforts at LinkedIn to bring AI, multi-agentic systems, and tooling into the hands of thousands of developers. What was the key problem that you had to deal with that said, "Hey, this is a problem for a platform team that we need to solve"?
Karthik Ramgopal: So at LinkedIn we saw two converging challenges. The first is that a lot of teams are experimenting with AI, but in silos: people do proofs of concept, one-off scripts, local tools. It is valuable, but it's also very inconsistent because everyone's reinventing the same plumbing: prompt orchestration, data access, safety evals, deployment. Also, the way people work is changing. You no longer have hermit developers and teams who just pull up an IDE, code it there and they're done, right? Engineers are actually spending huge amounts of time on cross-system coordination. They triage issues, they pull data from multiple services, they synthesize insights, they trigger workflows, and of course they write code and they go through the entire coding life cycle.
But what this means is that these aren't single-shot queries. They are multistep stateful tasks, which agents are remarkably good at handling. So when we put these two together, we were like, "Hey, we have an obvious need. If we want AI to meaningfully improve productivity and not just be bling-bling, we cannot have every team build their own mini-agent platform. We need a unified open-ended platform so that teams focus on their domain problems and we focus on the system and infrastructure problems".
Wes Reisz: Do you find that still gives developers the autonomy to do what they need to do within these guardrails? How do you balance that paved path with developer autonomy for what they're doing?
Karthik Ramgopal: I think it starts with meeting the employees where they are. So we talk a bunch about coding agents here, but it isn't just specific to coding agents. We also have agents integrated into workplace productivity suites, messaging systems like Teams, et cetera. And it's not just inside the IDE. So first is you try to meet them where they are and you try to fit into their workflow. That itself makes a huge difference in terms of adoption, right? Now, specifically within the context of development tools, the developers still use a combination of prompts which they write, MCP tools (we will cover why MCP is so critical later), as well as built-in infrastructure hooks. What this means is that they still have a lot of autonomy and a lot of control in what they do. They just aren't reinventing the boilerplate infrastructure pieces.
Wes Reisz: Just to put things at scale. How much are we talking about? Are we talking about 1,000 developers, 10,000 developers, and how much are we using the tools that we're about to talk about across that scale?
Karthik Ramgopal: So we have thousands of developers who are using these tools every single day. So it's very widespread and more importantly, it's like something which we are trying to highly encourage in order to increase productivity, but without compromising quality, which means the developers are still in charge, the developers are still responsible for the quality of the output. They're just using these tools to accelerate their productivity.
Agents as a New Infrastructure Execution Model [05:18]
Wes Reisz: I mentioned that both of you are speaking at QCon AI here in just a few weeks in New York. What are the big things that you're going to be talking about in your talk?
Karthik Ramgopal: We are going to be talking about how we built this platform, some of the use cases of this platform, and how central MCP is for tool orchestration in general. And more importantly, the message we want to convey is that AI agents aren't a feature. AI agents are a new execution model, which means that platform teams have to treat them with the same level of sensitivity and support with which we treat something else like microservice infrastructure or compute infrastructure with Kubernetes, et cetera. Which means it's shared infrastructure which is built for scale, reliability and trust. Without it, you end up with fancy demos, but you cannot actually ship something to production at scale.
Wes Reisz: You say a new execution model. Talk more about that.
Karthik Ramgopal: Essentially, it's like supporting any other first-class infrastructure just like you have a team for storage infrastructure or machine learning infrastructure. In the same way you need a fully funded agentic platform team, which is thinking about all aspects, right? The process changes, the technology changes, the rebuilding of these common building blocks so that people aren't repeating this again and again. And also there is a bunch of new technology coming in here every day. So somebody has to be thinking about how to thoughtfully adopt it and incorporate it into the existing workflows in the enterprise.
Structuring Intent with Specs and Sandboxes [06:47]
Wes Reisz: So to get beyond just POCs, we really need to be thinking about AI as an operating model, a way of bringing it across your systems. Yes, makes sense. Makes total sense. Prince, as Karthik was talking, he mentioned developers being in control. How are you making sure developers are in control and have that ownership when you're talking about orchestration of these agents?
Prince Valluri: I think the main point of control that developers have is how they communicate with these agents, right? A spec is how we translate the developer's intent into something the agent can reliably execute. It's the contract between the developer and the system, and at LinkedIn we deliberately structured these so that the developer expresses what they want to change, how the work should be broken down, what tools should be allowed and, importantly, what good looks like at the end of it through acceptance criteria and other checks. It's that intent that needs to be made explicit, and we try to remove ambiguity from it as much as possible. So that's where developers are in control. With this, the agents can plan way more deterministically, reviewers understand the context, and we can run automated validations along the way, similar to what developers do when they work on a task themselves.
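To make that concrete, here is a minimal sketch of what such a task spec might be modeled as. The field names and the example task are entirely hypothetical, not LinkedIn's actual schema; they simply mirror the elements Prince describes: the intended change, the work breakdown, the allowed tools, and the acceptance criteria.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskSpec:
    """Hypothetical structure for a developer-authored task spec.

    Captures the intent described above: what to change, how to break the
    work down, which tools the agent may use, and what "good" looks like
    at the end (acceptance criteria that agents and reviewers can check).
    """
    title: str
    intent: str                                                   # what the developer wants changed, stated explicitly
    work_breakdown: List[str] = field(default_factory=list)       # ordered steps the agent should plan around
    allowed_tools: List[str] = field(default_factory=list)        # MCP/native tools the agent may call
    acceptance_criteria: List[str] = field(default_factory=list)  # checks that define "done"

# Example usage (illustrative only)
spec = TaskSpec(
    title="Add retry logic to the payment client",
    intent="Wrap outbound calls in PaymentClient with exponential backoff retries.",
    work_breakdown=[
        "Locate PaymentClient call sites",
        "Introduce a shared retry helper",
        "Update unit tests to cover retry behavior",
    ],
    allowed_tools=["code_search", "run_tests"],
    acceptance_criteria=["Build passes", "New tests cover the retry path"],
)
```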
Wes Reisz: What scope are we talking about or does it vary? Are we talking at a use case level or at a given use case we're doing a specific thing, a task level where we're doing an individual task or are you talking more at an epic level where there's orchestration of many different things happening throughout several different, potentially even codebases?
Prince Valluri: Eventually we'll get there with this at the epic level, but where we are today is really at the task level, right? It is the same level of granularity that a developer generally thinks in for themselves when they pick up a task: what they want to do and the change that they want to produce. That's essentially the same mental model we want to bring. So you want to treat the agent as a teammate, give it the same amount of context that you would generally have while completing that task, and expect the same kind of output, which in most cases is a pull request with that change. So that's really the level of granularity that we have today with these tasks.
Wes Reisz: Okay. So when you say a task level, is this create a repository to be able to do something, to actually talk to a database or is it like a feature that is actually creating a repository, creating a component, creating a service?
Prince Valluri: It can be either. There's nothing stopping us from excluding one or the other or including one or the other, but it really depends on what tools the agent has access to and what you ask it to do. So it is essentially just bounded by how much context you give it, what you ask it to do and what tools you give it. So definitely both.
Wes Reisz: So then you give it a spec around a task for maybe a simple use case. What happens from there? Do you give it certain tools that you want it to have access to? Does the platform deliver a set of tools? What does that experience look like?
Prince Valluri: We start from a safe starting point, which is that the agent has access to certain tools which we deem basic, and we give the agent the same level of knowledge as any LinkedIn engineer would have: information about proprietary systems, how you make certain calls, how you talk to certain systems. Essentially all the things that our engineers go through during boot camp. Now, on top of this, it's up to you as a developer what kind of other tools you want to give this agent access to, because not everything makes sense all the time and you don't want to overload the agent with too many tools that are not useful for the task, which just deteriorates quality over time.
So just to be mindful of that, you have a starting point and then you can add more on top of it. If you're dealing with a very specific use case for a very specific system, you can choose to have a tool that will surface additional information or context about that specific system and give it to the agent so you get better quality results.
Wes Reisz: So what's, I guess, the execution model? Is this inside Cursor? Is this inside some IDE that you're running and then triggering this MCP call? What does that look like?
Prince Valluri: What we mainly try to aim for here is to give these agents a very safe sandbox environment for them to execute in. As soon as you as a developer say, "Hey, this is my entire task that I need you to execute", we then orchestrate that flow in a remote sandbox environment, where the agent is free to run and execute and do what it needs to do. But because it's under the platform, there are certain restrictions that the agent cannot cross. It cannot talk to certain systems, it cannot make certain calls, but it is otherwise free to do what it needs to do with the file system and make changes, all in that local sandbox environment.
So we instantiate the agent there along with the context that you provided with your spec or your prompt, and then set it up with the tools that you want it to have access to, which is either through MCP or native tools that are already available in the platform, because you don't want to keep redoing tools that are available locally, like reading files and searching and globbing; you don't need MCP for stuff like that.
But then there are remote things that you would definitely want to use MCP for, and that's really the setup that we give it. And then there are other pieces like authentication with GitHub and how it should pull repos and how it should push to branches and how it should create a pull request and so on. That part, again, is handled by the platform. Once the agent is done executing, we then move it on to the next stage of, "Okay, let's create a pull request out of this change", and kind of manage that workflow.
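As an illustration only, the flow Prince describes could be sketched roughly like this. None of these names are LinkedIn APIs; the stage functions are injected placeholders standing in for the platform-owned pieces (sandbox provisioning, repo auth, agent execution, PR creation).

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class SandboxPolicy:
    """Hypothetical sandbox restrictions: anything not listed is blocked."""
    allowed_hosts: Sequence[str]   # systems the agent may reach
    allowed_tools: Sequence[str]   # MCP and native tools wired into the agent

def run_background_task(
    spec_text: str,
    policy: SandboxPolicy,
    provision_sandbox: Callable[[SandboxPolicy], object],  # platform-owned: isolation, repo checkout, auth
    run_agent: Callable[[object, str], str],               # agent plans and edits inside the sandbox, returns a diff
    open_pull_request: Callable[[str], str],               # pushes a branch and opens a PR, returns its URL
) -> str:
    sandbox = provision_sandbox(policy)     # restricted environment; agent cannot cross these boundaries
    diff = run_agent(sandbox, spec_text)    # free to use the file system and allowed tools within the sandbox
    return open_pull_request(diff)          # everything still lands as a reviewable PR for humans and CI
```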
The "Human-in-the-Loop" Review Process [12:23]
Wes Reisz: And so that's the primary flow. Everything generates a PR, developers review the PR. Makes sense.
Prince Valluri: And a very critical piece of this then becomes: as a developer, I want to review the PR like I would a teammate's PR. So I would ask for changes, or request that certain changes need to be made for certain files, or certain logic that it implemented, or certain documentation, or whatever it is. I'm able to review that pull request and ask for changes, and the agent then picks it back up where it left off, addresses your changes and comes back with updates to the same pull request. So that human-in-the-loop aspect in the pull request is very critical.
Wes Reisz: One of the things that I've found affects how much autonomy I can give an agent before I get to that PR is the level of domain knowledge that the developer has about what's actually happening. When you have deep domain knowledge, you can put evals in there to measure and make sure you're going the right way. How does the platform at LinkedIn allow that domain knowledge to be encapsulated into evals as it moves towards that PR? How do you do that?
Prince Valluri: This really comes from a couple of different places. The first one is we have a huge plethora of pull requests that we have been reviewing and submitting over the last couple of years at least, just to make sure that recency gives better quality and is more relevant to what exists today. There are tens or hundreds of thousands of pull requests that we can already use, where humans have provided good feedback and other humans have addressed that feedback. If you really use all of this data and show an agent two PRs, we have enough data for that agent to make a decision and say which is mergeable and which is not. So that creates a very good baseline for us to go off of. And not only that, because of all this historical data, it can also pretty accurately tell you which comments should be added to a certain PR or a certain diff to then take it to a level where it can be mergeable.
Karthik Ramgopal: I want to underscore one very important thing right there: there are also quantitative, stochastic, verifiable things here. For example, does the build pass?
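To illustrate how those two signals might be combined, here is a minimal eval sketch: a deterministic check ("does the build pass?") plus a pairwise judgment against a human-reviewed reference PR. The `judge_mergeable` callable is a placeholder for an LLM-backed comparison; nothing here is LinkedIn's actual eval system.

```python
import subprocess
from typing import Callable

def build_passes(workdir: str) -> bool:
    """Deterministic, verifiable signal: run the build and check the exit code."""
    result = subprocess.run(["./gradlew", "build"], cwd=workdir)
    return result.returncode == 0

def evaluate_agent_pr(agent_diff: str, reference_diff: str, workdir: str,
                      judge_mergeable: Callable[[str, str], bool]) -> dict:
    """Score an agent-produced change against a human-reviewed reference PR."""
    return {
        "build_passes": build_passes(workdir),
        # Pairwise preference informed by historical review data: which of the
        # two diffs looks mergeable? (Injected judge; hypothetical.)
        "preferred_over_reference": judge_mergeable(agent_diff, reference_diff),
    }
```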
Foreground vs. Background Agents [14:30]
Wes Reisz: That makes sense. That kind of leads into: what other patterns are you seeing as you implement agents in these workflows? Thinking about the platform orchestrating agents that developers are using, the stochastic measures that you can apply to actually measure success, that's certainly a pattern. Are there other patterns that you're seeing that become applicable to orchestrating agents like this?
Karthik Ramgopal: Firstly, during development, we need to understand that there are two distinct kinds of agents in the picture here. The first is what I call foreground agents, which are in the IDE. These are where you're in the IDE and we again augment these agents. We don't build them from the ground up, we just buy GitHub Copilot and use it inside the IDEs, but we will augment it with MCP tools. We will augment it with some instructions which essentially get into the system prompt, which help the developers when they are coding in the IDE, and the developers can see what the agent is doing and they're more actively involved. So that is the foreground mode. The second one, which Prince talked about, where we have built an entire orchestration system in the background and essentially built the agent from the ground up, is the background agents, where you give it a high-level description of the task.
The agent is going and doing things in the background. You may not see all of the sausage-making. All you see is the PR which gets put up, and after that you have an opportunity to go give comments on the PR and then the agent picks it up again and it responds to those comments, et cetera. And both of these are equally important and applicable to the different kinds of tasks which you're doing. For example, the former is useful when you want to be more actively in control and the latter is useful for things like doing refactors or migrations or things like that where you just want to see the output and periodically provide input.
Wes Reisz: You mentioned refactorings are good for background tasks. What are some of the other things that you're finding a lot of success for maybe the background tasks versus the foreground?
Prince Valluri: One very interesting pattern that has emerged is just the reduction of toil. All the things that are essential but never get prioritized, and people hate doing them. Migrations are sometimes critical, so you don't really have a choice, but there are things like improving your code coverage in your repo, right? Cleaning up A/B tests that no longer need to be there in your repo. All of these things are increasing complexity in your codebase, increasing tech debt that you eventually need to pay, but people have found that using unsupervised background agents for these use cases is really turning out to be quite productive, because all you need to do is define that context upfront for what you need the agent to do, and then it's a very repeatable pattern.
Karthik Ramgopal: Something else we've seen is that in order to provide these agents examples for some of these repetitive tasks, you can have a human do a few PRs or code changes and teach the agent the pattern, similar to how you would teach a junior developer on the team, and then the agent uses those as examples, learns from them and does it. The other place where background agents are incredibly useful is that development is not just about coding. You also run systems in production. So sometimes you want observability agents which are looking at metrics in production, which are looking at logs, alerts, et cetera, responding to outages, responding to availability dips. These are, again, areas where it's really good to have an ambient agent running in the background, which is able to escalate to a human when something of interest happens or take remedial actions if they're absolutely safe to take.
Wes Reisz: I really like that specific example. So if you have a lot of refactoring that you need to run across a large codebase, take a particular class, refactor that, establish that as a spec, and then give that as a one-shot example and say: looking at this particular class, I need to now apply this to the rest of the codebase, or migrate it, or refactor it, whatever it may be. That's an interesting pattern. I like that a lot. You mentioned spec-driven development, using a spec to orchestrate more of a long-running background task. What about in the foreground? Are you using spec-driven development in the foreground as well to define even something that needs to happen to generate code? How are you using it in the foreground?
Prince Valluri: The foreground is really where the human is very actively interacting with the agent to get the work done, right? It's still good to have a very detailed spec for what you want to accomplish out of it, but because of the very high-fidelity nature of that interaction with the agent, you can always keep going back and adding more context to it, getting the agent to refine its output and keep making progress. These foreground modes are amazing for cases where you need to do a lot of active thinking and you want to test out a few things. You want to validate how things are working versus not, and you're kind of working through a solution. That's where these foreground agents really shine, because they eliminate some of that toil of setting up some stuff and tearing it down or running things in the background. A lot of that can get eliminated. That's where foreground agents really shine.
Karthik Ramgopal: This is in response to your prior question, where you asked, "How do you ensure you don't get in the way of developers?" As Prince said, in foreground agents, the developer really wants to be in control. So if we were very heavy-handed and restricted the control of the developers, they wouldn't be happy. But at the same time, we do not want them to make common mistakes or repeat the same things again and again. So a bunch of these foreground coding agents have instruction-like files. For example, for GitHub Copilot, it's called Copilot instructions and it's essentially an MD file. So we generate these files. We help developers update these files in different repos. We provide a standard library of MCP tools for common tasks.
We encapsulate context common to LinkedIn development, to general development in a particular language, and to development inside that repo in these files so that all of that context is preserved. It's similar to how you would train a human to follow certain rules; in the same way we essentially train and guide the agent to follow certain rules, but the human developer still has full agency over what they want these agents to do for that particular task.
Prince Valluri: We use the background agents to generate those instructions for the foreground agents.
Karthik Ramgopal: And the MCP tools are actually shared between the background agents and foreground agents. So we try to get as much leverage as possible.
The Role of Model Context Protocol (MCP) [21:06]
Wes Reisz: Very nice. So Karthik, earlier you talked about MCP and its role and importance at LinkedIn on the platform. I'm sure everyone's familiar with MCP, but step back for a second. What is MCP and why is it so important to AI tooling?
Karthik Ramgopal: So MCP stands for Model Context Protocol. Right now there is a huge foundation which is responsible for its development. It essentially standardizes tool calling. Before MCP, every AI team ended up wiring their tool calls very differently, because every model vendor has their own function-calling format. People are kind of settling on the OpenAI format, but there are still subtle differences. Every internal service is going to expose APIs very differently and every agent framework is creating its own adapters. What this means is that it's all going to work, but there's a lot of fragmentation. It slows down adoption. It requires a lot of work to put it all together.
So MCP is essentially trying to solve that by giving us a common protocol, and as long as you implement that protocol, any language, any agent, any tool, any model can interact with each other. So it standardizes this entire workflow, and as I said before, the standardization is what is really cool and what is enabling us to use the same tools in the foreground agents, which we aren't building from the ground up, we are just using a product like Copilot, as well as the background agents, which we are pretty much writing ourselves from the ground up.
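For readers who haven't seen it, here is a minimal sketch of exposing a capability as an MCP tool, written against the open-source MCP Python SDK (the `mcp` package and its `FastMCP` helper). The code-search behavior is a hypothetical stand-in, not LinkedIn's actual tool; the point is that once a tool sits behind the protocol, any MCP-capable agent, foreground or background, can discover and call it the same way.

```python
from mcp.server.fastmcp import FastMCP

# A tiny MCP server exposing one hypothetical tool.
mcp = FastMCP("example-code-tools")

@mcp.tool()
def code_search(query: str, max_results: int = 10) -> list[str]:
    """Search the codebase for the given query and return matching file paths."""
    # Placeholder implementation; a real server would query an internal code index.
    return [f"src/example/match_{i}.java" for i in range(max_results) if query]

if __name__ == "__main__":
    mcp.run()  # defaults to stdio transport, so an agent host can spawn it locally
```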
Wes Reisz: So what are some of the tools or some of the MCP servers that anybody listening might be able to get access to, and then what are some of the tools that you're creating internally to build these workflows?
Karthik Ramgopal: I think we have some common ones for things like code search or running static analysis, executing some internal command-line tools, structured impact analysis to try and look at the impact of our changes. We also have a bunch of documentation tools, which let agents pull structured knowledge from internal sources. We have semantic indexes over internal sources. We have observability tools, which help agents pull production observability and metrics data. In general, the way we are thinking about it is this: if you go to any infrastructure and tooling team, earlier you would ask them to think about building a UX or UI layer for humans to interact with. And right now we are telling them, "Hey, you also have agentic actors in the picture. So also think about how you would want to expose MCP tools for agents to interact with your systems".
Security, Compliance, and Observability [23:33]
Wes Reisz: All right, so we're coming up on the end of the podcast. There are a couple of other things that I wanted to ask you both about. I think we could go on for another 30, 40 minutes just talking about this because it's so interesting. But the first thing I wanted to talk about is: what, from a traditional platform perspective, are you building into the platform specifically for AI? What I'm talking about are things like security, compliance, observability, and I thought I'd toss that one to you, Karthik. What are the types of things that you're building into the platform and how does it change for an AI-centric workflow?
Karthik Ramgopal: I think let's first start with some developer-facing stuff. You need some sort of mechanism for managing prompts. After that, you need abstractions for easily defining tools, spinning up MCP servers, adopting these tools, et cetera, so that you can actually tie the prompts to the tools. You also need some standard infrastructure for abstracting out inference, either with commercial models, in-house models, et cetera. So that is another bit. And for security and compliance, primarily you need technology change as well as process change. The technology change is the sandbox environment which Prince was talking about earlier, where all of these agents are pretty much running, and this sandbox environment has restrictions on what you can and cannot do. Certain systems, for example, are simply isolated from the sandbox environment, and only limited context is provided there. Authentication and authorization are limited, and all these agents have identities as well.
So all the work they do is auditable and they have a limited set of permissions assigned to them. Again, we are not reinventing the wheel here. We are essentially reusing a lot of abstractions which work in the human development world and applying them to agents. Then the other important thing is having transparency and control here. Everything which agents are doing is observable by default, which means every step, every tool call, every decision, everything they're doing is audited. So the human can then go take a look and try to understand the chain of thought, the chain of actions.
Now, the more important thing, though, is the process change, which is that agents simply cannot make a code change. They can propose a code change, and that code change will go through the exact same reviews and tests and ownership which a human would be exposed to. So the developers can essentially replay the traces, inspect the reasoning, verify what the agent actually saw, walk through the thought process, look at the final changes, and after that approve it. Agents aren't replacing engineering judgment; they are essentially a structured collaborator who is increasing the productivity of the human developers involved here.
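A minimal sketch of the "every tool call is audited" idea might look like the following: a wrapper that records the agent's identity, the tool, its arguments, and the outcome to an append-only trace that a reviewer can replay. This is illustrative only, not LinkedIn's audit infrastructure.

```python
import json
import time
from typing import Any, Callable

def audited(agent_id: str, tool_name: str, tool: Callable[..., Any],
            log_path: str = "agent_audit.jsonl") -> Callable[..., Any]:
    """Wrap a tool so every call by an identified agent is written to an audit log."""
    def wrapper(*args, **kwargs):
        entry = {
            "ts": time.time(),
            "agent": agent_id,      # agents have identities, like human users
            "tool": tool_name,
            "args": {"args": args, "kwargs": kwargs},
        }
        try:
            result = tool(*args, **kwargs)
            entry["status"] = "ok"
            return result
        except Exception as exc:
            entry["status"] = f"error: {exc}"
            raise
        finally:
            with open(log_path, "a") as f:   # append-only trace for later replay
                f.write(json.dumps(entry, default=str) + "\n")
    return wrapper
```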
Wes Reisz: Yes, absolutely. Makes total sense. Prince, we talked a little bit about developer experience before, about background agents versus maybe IDE foreground-type agents. What are you seeing from building this platform to really enable the developer experience?
Prince Valluri: It's funny. All the problems in general engineering that we used to have with just people, with humans around, we still have with agents. We're having to solve the same problems, but now everybody's thinking of them from a different lens. The key problem everywhere is that you have a lot of context in your head when you're making code changes, and the agent does not, and how do you bridge that? How do you elaborate your task, or elaborate your task description, or have all the adjacent knowledge to that task listed out and given to the agent in a way that it can comprehend it? That really is the key piece of the puzzle. I don't think there's one big solution for it, but we're going to keep evolving it and keep working on it, and that's what developers are facing today: if you give me this task, I'll do it, but I don't know how to explain that to the agent.
Solving Context with RAG and Historical Data [27:32]
Wes Reisz: You just keyed up another question that I really wanted to focus on. When you're supervising in the foreground and you're using an agent to do some kind of work, the developer has context between different hops, different steps that you're actually doing. So they're maintaining the knowledge graph of what's actually happening. When you're in the background and you have multistep, inter-tool, inter-agent communication, how is LinkedIn tackling the problem of context between agents?
Prince Valluri: Across engineering, if you look at how our collective knowledge base exists or is evolving, it's constantly changing. There are thousands of PRs created and merged every day, and that evolves our codebase and evolves our semantic understanding of everything. What we try to do is maintain a semantic understanding of the codebase as it exists and as it evolves, in a way that is queryable by the agents to understand how something works or how to do something, in the same way that a human would by searching through wikis or Slack or asking the teammate at the next desk. That really is a very critical source of information for the agents to get their changes right. It is made available to the agents as a general RAG system that they can query to get more information.
Wes Reisz: I was going to ask, so what I'm hearing is a RAG system, basically, that has the engineering context of the PRs that are going in, what the architecture looks like, what your principles are, so that it can actually begin to follow them.
Prince Valluri: An interesting thing here also is something we decided to do as an experiment, and it turned out it actually works pretty well: we started taking PRs individually and describing them using AI to figure out what changes were actually made. So if, let's say, you bumped a dependency version of a certain system like Gradle in a PR, it would say that this is what happened in this PR. Then the agents, when they're asked to do something, can query these past PRs and say, "Hey, are there any PRs that bumped the Gradle version?" They would be able to find all the PRs that made those changes, and they could also see what other things are generally impacted when you bump a Gradle version, so they don't have to figure it out themselves. So that's an interesting experiment that paid off pretty well.
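To show the shape of that experiment, here is a hedged sketch: describe each PR with a model, embed the descriptions, and let an agent retrieve similar past changes later. The summarizer and embedder are injected placeholders for whatever model endpoints are available; none of this is a LinkedIn API.

```python
from typing import Callable, List, Tuple

def index_prs(pr_diffs: List[Tuple[str, str]],              # (pr_id, diff text)
              summarize: Callable[[str], str],              # LLM: diff -> natural-language description
              embed: Callable[[str], List[float]]) -> List[dict]:
    """Build a queryable index of natural-language PR descriptions."""
    index = []
    for pr_id, diff in pr_diffs:
        description = summarize(diff)                       # e.g. "Bumps Gradle from 7.6 to 8.4 ..."
        index.append({"pr": pr_id, "description": description,
                      "vector": embed(description)})
    return index

def find_similar(query: str, index: List[dict],
                 embed: Callable[[str], List[float]], top_k: int = 5) -> List[dict]:
    """Return the PRs whose descriptions are most similar to the query."""
    qv = embed(query)
    def cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        return dot / norm if norm else 0.0
    return sorted(index, key=lambda e: cosine(qv, e["vector"]), reverse=True)[:top_k]
```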
Key Advice for Platform Engineers [29:44]
Wes Reisz: Again, you all are going to be speaking in New York in a few weeks, so if we get this out ahead of time, come take a look at QCon AI to learn more from Karthik and Prince. But before we wrap: as developers are on this journey, all of us are using different AI tools in our IDEs, and we're moving to try to operationalize these tools as a platform. What advice do you give developers as they're on this journey, going from the foreground to the background?
Karthik Ramgopal: I would give three pieces of advice to developers. The first is invest in solid engineering and platform abstractions. That is the only way to look past the hype and to actually get something working in production. The second is understand the strengths and the limitations of the AI and use it wisely. AI is good at certain tasks. AI isn't good at certain tasks. AI right now still isn't at a state where it's fully autonomous everywhere. It's not a replacement for human judgment, so use it appropriately. And the third piece of guidance is if you try to fit AI into your existing processes, which are human heavy, where stuff is undocumented, it's tribal knowledge, it's in people's heads, it's not going to work. You have to change the way you work and change some of your processes as well in order for AI to be maximally effective.
Wes Reisz: Yes, that's a really great point. I'm doing a talk in a couple of weeks, and what I've been trying to focus on is that AI is not a one-size-fits-all solution, and our conversation really drove down into the different approaches that you can use with AI, in the foreground or in the background, for example. Great advice, Karthik. Prince, what about you?
Prince Valluri: So Karthik really hit the nail on the head with those three. I would add two more. The first one is don't take evals too lightly. They are very critical for you to know if your system is improving or regressing, so please don't treat them as phase two. They're a core part of the platform. We learned this the hard way and have now figured out that we really need to invest in this and understand how these agents are improving over time. The second thing is to solve for your company's specific context. Don't try to recreate GitHub Copilot or Cursor or Replit inside your company. Instead, ask the right questions: what is repetitive, what are the high-friction engineering tasks that are unique to us? Where can we provide the most value for the problems that are unique to us? That is how you're really going to save developers a lot of time.
Wes Reisz: Well, gentlemen, thank you for joining us on the InfoQ Podcast. And again, if you're listening to this and you want to hear more, Prince and Karthik will be in New York on the 16th and 17th at QCon AI, and you can ask them questions directly. Gentlemen, thank you.
Prince Valluri: Thank you.
Karthik Ramgopal: Thank you so much.