Virtual Panel - AI in the Trenches: How Developers Are Rewriting the Software Process

Key Takeaways

  • AI tools accelerate development when paired with strong context and validation. They work best when teams provide clear structure and verify results with tests and human review.
  • AI is changing the role of developers: from authors to orchestrators. Instead of writing code line by line, developers increasingly manage AI output, introducing new architectural concerns such as “context engineering” to keep generative agents within constraints.
  • Developer onboarding is faster, but with limitations. AI lowers the entry barrier for junior developers by helping them navigate unfamiliar codebases, but trust and long-term skill growth depend on mentorship, runtime feedback, and a strong ownership culture.
  • Common productivity metrics are misleading without context. Vanity metrics like lines of code, PR volume, and commits can spike with AI, while real productivity shows up in stability, incident rates, and code churn.
  • Cultural change matters as much as technical integration. Teams that succeed with AI adapt their mindset, expectations, and collaboration, making AI part of a shared process rather than an isolated productivity hack.

Introduction

From code generation to automated documentation, AI has begun to insert itself into nearly every stage of the software development lifecycle. But beyond the hype, what’s actually changed? We asked a group of engineers, architects, and technical leaders how the rise of AI-assisted tools has reshaped the established rhythms of software development - and what they’ve learned from adopting it in the real world.

The panelists:

  • Mariia Bulycheva - Senior Machine Learning Engineer - Intapp
  • Phil Calçado - CEO - Outropy
  • Andreas Kollegger - Senior Developer Advocate - Neo4j
  • May Walter - Founder, CTO - Hud.io

InfoQ: How has the rise of AI-assisted tools impacted the software development process in your organization? Have they changed the way you think about software architecture?

Mariia Bulycheva: AI-assisted tools have accelerated prototyping and reduced time spent on repetitive coding tasks, allowing our teams to focus more on architectural decisions and on designing complex online experiments, which are critical for iteratively improving recommender systems at large scale. Getting initial insights from the large amounts of multimodal data typical of digital platforms has also become faster, smoother, and more consistent, since we can delegate the initial data analysis to AI.

Another very important aspect of our work is keeping up with the rapid pace of scientific developments in our field. Every year, thousands of new research papers are published at top conferences, and it used to be quite time-consuming to read them all and determine which ones could be relevant for our team's daily ML tasks. Today, AI tools provide high-quality summaries and even highlight which methods could be applicable to our use cases. This has already led to several quick implementations of new modeling ideas that would otherwise have taken us weeks or even months to discover and test.

Phil Calçado: Absolutely. We run a consumer engagement platform with more features than any sane person could keep in their head. As an example, we recently needed to change how we handle time zones in scheduling. The code change itself was maybe ten lines, but the real work was spelunking through hundreds of places that touch scheduling, figuring out each one’s assumptions, and adding unit tests to assert that each call site wouldn’t break under the change in behavior. We thought this was going to be a six-month project, as we’d have to gradually research and make small changes.

With tools like Cursor and Claude Code, we cut that down dramatically. They helped us surface all the impacted locations, generate unit tests for each, and split the rollout into small PRs grouped by subsystem. Each PR came with a context-aware description for the owning team—not just "fixing scheduling, please review", but an explanation of the why and the expected impact in their world.

So while we have seen the same uptick in raw code output as everyone else, in a mature, hyperscale system like ours the biggest lift is in how AI helps us research our own codebase and stitch together the boring but essential safety checks, making systemic changes less terrifying.
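
To make that concrete, here is a minimal sketch in Python of the kind of call-site test described above, using the standard zoneinfo module. The next_run_utc function and its contract are invented for illustration, not the platform's actual code.

    from datetime import datetime
    from zoneinfo import ZoneInfo

    def next_run_utc(wall_clock: datetime, user_tz: str) -> datetime:
        # Interpret the naive wall-clock time in the user's timezone,
        # then normalize to UTC for the scheduler queue.
        return wall_clock.replace(tzinfo=ZoneInfo(user_tz)).astimezone(ZoneInfo("UTC"))

    def test_next_run_uses_user_timezone_not_server_local():
        # 9:00 in New York on 2024-01-15 is 14:00 UTC (EST, UTC-5).
        wall = datetime(2024, 1, 15, 9, 0)
        assert next_run_utc(wall, "America/New_York").hour == 14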

Andreas Kollegger: Across our organization, all employees now have access to AI-assisted tools. For surface-level interface design, these tools have helped us iterate faster, explore new ideas, and unlock new approaches, like vibe coding, to focus on higher-level design and strategy.

But we’ve also run into AI’s limits. Like many organizations, we’ve seen that large language models (LLMs) struggle with highly specialized code requiring deep domain expertise and a holistic view of global architecture. Our codebase alone exceeds the capacity of any LLM context window, and the models themselves haven’t been trained on the unique complexity within it. Simply put, AI cannot invent what it doesn’t understand. As a result, we’ve taken an intentionally human-centric approach: while AI helps us accelerate and augment, it’s our engineers’ expertise that drives the breakthroughs in software architecture.

May Walter: AI-assisted tools have dramatically shortened the path from idea to working code. Once intent is clear, iteration cycles compress significantly. Developers are shifting from being sole authors of code to acting more like managers - guiding agents, validating outputs, and ensuring requirements are truly met.

Pre-AI, architecture was about ownership between teams and scalable interfaces. AI introduces a new dimension: context architecture - designing the inputs, scaffolding, and guardrails an agent needs to generate production-ready code. Context engineering is becoming a core part of the system, streamlining the ability to build fast in complex environments like distributed and event-based systems.

But speed creates a new bottleneck: preparing AI-generated changes for production. Even as reviews become AI-assisted, the challenge is less about spotting syntax errors and more about verifying unintended consequences across large, scaled systems.
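
As one minimal sketch of what such context-architecture guardrails might look like, a pre-review check can reject agent-generated changes that cross declared module boundaries before a human ever sees them. The module names and rules below are invented for illustration.

    # Hypothetical boundary rules: packages an agent's code may not import.
    FORBIDDEN_IMPORTS = {
        "billing": {"analytics.internal"},   # billing must not reach into analytics internals
        "scheduling": {"billing.db"},        # scheduling must not touch billing's data layer
    }

    def boundary_violations(module: str, imports: list[str]) -> list[str]:
        banned = FORBIDDEN_IMPORTS.get(module, set())
        return [i for i in imports if any(i.startswith(b) for b in banned)]

    # An agent's patch to the scheduling module is flagged before human review.
    assert boundary_violations("scheduling", ["billing.db.models", "zoneinfo"]) == ["billing.db.models"]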

InfoQ: How does the adoption of AI impact the onboarding process within a team? Were the junior developers in your team or organization impacted by the adoption of AI in the software development process?

Mariia Bulycheva: AI tools can substantially speed up the learning process by providing instant code examples, documentation summaries, and test suggestions, which supports junior developers. In teams working on complex domains like personalization and recommender systems, this has been especially helpful, because juniors can now explore a new codebase faster without always depending on senior engineers. At the same time, we pair them with more experienced colleagues to ensure they learn the underlying modeling and system design fundamentals, not just shortcuts.

Phil Calçado: We just had our summer interns present their projects, and almost every one of them called out AI as a lifesaver. Dropping into a decade-old Rails codebase with thousands of moving parts is intimidating. But being able to say to Cursor or Claude Code, "I’m a third-year student who knows Python and C++, explain this Rails code to me using parallels to what I know" meant they could get productive in weeks instead of burning those weeks just figuring out the basics.

And it’s not just interns. In a system this large, even senior engineers need more ramp-up time than they would at a smaller company. AI doesn’t remove the need to actually understand the system, but it does take the edge off the "where do we handle authentication?" or "do we already have an implementation of the Observer pattern somewhere?" kind of questions.

Of course, there’s a catch. Generative AI is great at copying patterns, and this often means legacy styles and architectures we’d rather not see anymore. So we’ve had to adapt. We’re making our workflows and architecture more AI-friendly, and we’ve started embedding our current guidelines directly into agents on Claude Code and Cursor. That way, when the AI offers help, it’s nudging people toward the present, not the past.
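
Claude Code reads project-level guidance from a CLAUDE.md file, and Cursor supports similar rules files. The snippet below is an invented illustration of the kind of guideline embedding Calçado describes, not the team's actual file.

    # CLAUDE.md (project root) - illustrative contents only
    - New services use the shared EventBus abstraction; do not instantiate queue clients directly.
    - Follow the current service template; do not copy patterns from legacy controllers.
    - Every behavioral change ships with a unit test next to the code it touches.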

Andreas Kollegger: AI adoption has enhanced our onboarding process, particularly for junior developers new to graph databases. While AI can’t replace the guidance of an experienced mentor, it has complemented our existing onboarding resources by helping newcomers get up to speed more quickly.

Onboarding is also about much more than teaching coding skills. It’s about building domain expertise. Coding ability matters, but it’s far more critical to understand what to code and why. That’s why our onboarding developers, who bring deep knowledge of the codebase and its architecture, play an essential role in transferring expertise and context to junior team members.

May Walter: AI has lowered the barrier to contribution. A new developer can now produce usable code on their very first day - a dramatic shift from the days when early work was limited to boilerplate or bug fixes. But the real opportunity isn’t speed; it’s depth and range of competence.

The concern I hear most often is that AI risks making onboarding shallow - juniors can generate code without understanding why it behaves the way it does. My experience has been the opposite. When code generation is paired with runtime feedback, junior developers gain exposure to systems thinking from the start: how architecture behaves under load, how dependencies interact, and how changes ripple into business outcomes. Engineers become your business ambassadors in the agentic code generation process.

Instead of spending months grinding through low-value work, they’re now able to tackle more of the team’s load. Done well, this doesn’t skip steps - it accelerates them. With the right culture and expectation setting, juniors can develop into well-rounded engineers faster, because they learn not just how to write code, but why it matters in the context of the system.

InfoQ: Have you measured the productivity or quality impact of AI-assisted development in your team or organization? What did you learn?

Mariia Bulycheva: We’ve seen tangible productivity gains in boilerplate code and unit test generation, and even in setting up simulation experiments for recommender systems. However, when working with critical systems, such as those influencing customer experience at scale, the real benefits come when AI assistance is combined with deep engineer involvement. We’ve learned that while AI improves productivity, quality still depends on careful validation with clear metrics and tests.

Phil Calçado: Not formally. And frankly, I don’t buy most of the "productivity" numbers being thrown around. In software you can massage metrics until they say whatever you want, and the AI hype cycle has made that worse. The fact that people are seriously counting lines of code again just to juice a funding round or goose a stock price is embarrassing.

Andreas Kollegger: On the front-end side, we’ve seen a boost in productivity, especially for our engineers working with tools like Cursor. Many of our engineers have used AI support for faster understanding, surface-level coding, and testing of our codebase, but the real impact we’ve seen from AI is on the developer experience. By using AI tools to support some of their activities, our engineers now have more time to be creative, and ultimately, improve how they solve problems and create new approaches to their work.

May Walter: Yes - and the first thing we learned is that most of the common measurements don’t mean much. Accepted lines, commits, PRs: AI inflates those instantly, but they’re vanity metrics for engineering productivity.

The real signals live downstream. Release stability, incident frequency, time spent on-call, and even code churn tell us whether we’re actually moving faster or just generating more fragility. AI shifts velocity to the front of the pipeline, but unless validation loops are tight, the debt surfaces later - in bugs, regressions, and burned-out teams.

With continuous production feedback from day one, we could see where the truth lay: feature development got faster, but review cycles grew longer and post-deploy errors emerged as well.

The lesson is that AI productivity comes with a learning curve and requires an iterative approach. Once measured, adoption can be refined to capture the upside while avoiding the trap of shipping faster only to drown in stability issues.
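
As a rough illustration of one downstream signal Walter mentions, code churn can be approximated as the share of recently added lines that are rewritten or deleted soon afterwards. The data shape here is hypothetical.

    def churn_rate(weekly_stats: list[dict]) -> float:
        # Each entry: lines added that week, plus lines younger than four
        # weeks that were modified or deleted in that same week.
        added = sum(w["lines_added"] for w in weekly_stats)
        churned = sum(w["young_lines_changed"] for w in weekly_stats)
        return churned / added if added else 0.0

    # High churn in AI-heavy weeks suggests speed is being paid back as rework.
    print(churn_rate([{"lines_added": 1200, "young_lines_changed": 420}]))  # 0.35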

InfoQ: What is something non-technical that had to change in your team or organization in order to make the use of AI tools effective?

Mariia Bulycheva: The biggest change was in mindset. Teams had to move away from expecting AI suggestions to be "correct" and instead treat them as starting points that require thorough validation, discussion, and testing. This cultural shift encouraged experimentation and cross-disciplinary collaboration, replacing a focus on certainty with a focus on exploration. In large-scale personalization work, we also needed alignment with product and legal teams on responsible data usage and reproducibility. These agreements created the guardrails that allowed engineers to safely explore and deploy AI-assisted solutions.

Phil Calçado: I think the biggest thing about Generative AI tools—and this goes beyond coding—is that, like any other tool, you have to watch for the side effects. Generative AI makes it really easy to generate content: code, PR comments, tech specs, emails, Slack messages. It also makes it really easy to summarize a wall of text and filter out what isn’t essential.

The combination of those two traits creates a weird incentive: people generate tons of low signal-to-noise content, and then other people use AI again to filter it back down. That’s incredibly ineffective. We’ve started talking internally about the right way to use AI when producing content. Spoiler: it’s not about having AI write for you but rather about using AI to help you write better.

Andreas Kollegger: We established an AI Ethics Board during the early stages of AI adoption, with representatives from across our organization, to better understand and guide every aspect of how AI impacts our business. All technology can be a force for good, yet it also requires intentional thought, action, and guidance.

Because we’re trusted with customer data, our developers need to apply a heightened sensitivity to any area where AI is introduced as an assistant, from simple planning documents and email threads to the codebase itself. As we adopt, integrate, and scale AI, all our developers must ensure that human judgement, not AI, guides and oversees every step.

May Walter: The biggest change wasn’t technical - it was cultural. Developers naturally adopt tools individually, but AI doesn’t work well when treated as a personal productivity hack. It only becomes effective when it’s part of a shared process, with aligned validation steps and clear accountability. Furthermore, AI tools don’t fail outright when they lack context; they generate inaccurate responses, which can hurt users’ trust and add friction to the change.

At 10 engineers, everyone can experiment in their own way. At 100, that breaks down. Different agents generating code in isolation creates fragmentation and risk. We shifted toward common setups and shared workflows so that AI wasn’t just helping individuals move faster, but making the whole team move faster.

InfoQ: Which guardrails (cultural, ethical, or technical) have you put in place to manage AI-assisted coding, and how do you manage the matter of trust in AI output across individuals, teams, and organizations?

Mariia Bulycheva: We treat AI output as "first draft code" for repetitive or boilerplate tasks, which always goes through unit tests and peer review. On the cultural side, we emphasize accountability: the developer who submits code is responsible for it, regardless of AI assistance. For machine learning workflows, we don’t trust AI to generate models directly; instead, we rely on automated offline evaluation against established baselines before any model change can even be considered for production. This ensures AI-driven contributions meet the same quality bar as human-written ones.
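
A minimal sketch of such an offline evaluation gate might look like the following; the metrics, baseline values, and thresholds are illustrative, not Intapp's actual numbers.

    # A candidate model must beat the production baseline on held-out
    # metrics before it can even be considered for promotion.
    BASELINE = {"ndcg@10": 0.412, "coverage": 0.87}
    MIN_LIFT = {"ndcg@10": 0.005, "coverage": -0.01}  # tolerate a tiny coverage dip

    def passes_gate(candidate: dict) -> bool:
        return all(candidate[m] - BASELINE[m] >= MIN_LIFT[m] for m in BASELINE)

    assert passes_gate({"ndcg@10": 0.421, "coverage": 0.88})
    assert not passes_gate({"ndcg@10": 0.413, "coverage": 0.88})  # lift too small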

Phil Calçado: This is still a very nascent practice, so we’ve been experimenting with different guardrails and tooling. On security and compliance, our stance was clear from day zero: as a public company handling data for some of the world’s biggest brands, we have to apply the same governance practices to AI coding tools that we do everywhere else. A few years ago that meant being stuck behind the curve, but today most vendors have solid enterprise programs, so we can safely use state-of-the-art models without compromising security or auditability.

Culturally, we set expectations early: just because an AI tool wrote the change doesn’t mean it isn’t your code. You still own it, and you need to treat every line as if you typed it yourself. It’s no different from using IntelliJ’s extract-method refactoring: it may automate the mechanics, but you’re still accountable for understanding and validating the result.

Andreas Kollegger: Large-scale enterprise software can provide safeguards against AI-generated mistakes, but a higher level of accuracy, context, and traceability is what makes AI outputs explainable and verifiable, not just performant. That’s why we’ve incorporated an extensive testing regimen that spans everything from individual unit tests to exhaustive production-level validations.

At the same time, it’s critical that our engineers balance discipline with innovation. We encourage engineers to experiment with ideas and explore projects that may not yet be production-ready. This environment allows for rapid iteration and creativity, while ensuring only the most valuable and well-tested innovations transition into production. The result is a unique balance: preserving trust and stability for customers, while continuously advancing graph-powered innovation that makes AI more accurate, transparent, and explainable.

May Walter: Trust in AI output has to be earned, and the only way to earn it is with context. Every AI-generated change goes through the same standards as human-written code - reviews, tests, validation - but with one extra bar: it has to prove itself once it runs.

For us, trust doesn’t come from believing in the model; it comes from watching the code behave in real-world conditions. Does the new version perform like the old one? Does it introduce new errors or shift performance under load? When that runtime context is continuously available, AI stops being a black box. It becomes a partner that can be trusted because it’s reasoning with the same signals engineers rely on.
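
One hedged sketch of that runtime comparison: a canary running an AI-generated change must stay within error and latency envelopes relative to the stable version before full rollout. Field names and thresholds are invented for illustration.

    def canary_healthy(stable: dict, canary: dict,
                       max_error_delta: float = 0.002,
                       max_latency_ratio: float = 1.10) -> bool:
        # The canary may not add more than 0.2% absolute error rate,
        # nor regress p95 latency by more than 10%.
        return (canary["error_rate"] - stable["error_rate"] <= max_error_delta
                and canary["p95_latency_ms"] <= stable["p95_latency_ms"] * max_latency_ratio)

    assert canary_healthy({"error_rate": 0.004, "p95_latency_ms": 180},
                          {"error_rate": 0.005, "p95_latency_ms": 190})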

InfoQ: What do you think software development teams are underestimating about AI coding tools? Are there any current AI-enhanced developer workflows or models that you think are overhyped, and which are still underutilized?

Mariia Bulycheva: Many teams underestimate the importance of context management, since AI is only as effective as the context you provide (codebase, documentation, architecture, experimental setup for the online test). In large systems, this means curating not just code snippets but model performance data, logs, and experiment history to guide AI tools effectively. Overhyped: "push-button development" where AI supposedly replaces engineering judgment. Underutilized: AI-assisted debugging, experiment setup, and documentation of complex ML workflows, which could drastically reduce long-term maintenance costs.
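
As a small illustration of that curation step, a context bundle for an assistant might concatenate far more than code - architecture notes, the latest experiment summary, recent logs - under a size budget. The names and structure below are hypothetical.

    def build_context(task: str, sections: dict[str, str], budget: int = 4000) -> str:
        # Concatenate curated context sections, trimming each and capping the total.
        parts = [f"## Task\n{task}"]
        parts += [f"## {name}\n{body[:1000]}" for name, body in sections.items()]
        return "\n\n".join(parts)[:budget]

    prompt_context = build_context(
        "Fix the cold-start fallback in the ranker",
        {"Architecture notes": "...", "Experiment exp_042 summary": "...", "Recent error logs": "..."},
    )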

Phil Calçado: There’s so much empty hype in AI right now it’s hard to pick just one offender. But the thing most teams underestimate is this: AI coding tools aren’t a single-trip magic box. You can’t just throw a prompt at them and expect consistent, correct results.

This is a bitter lesson anyone who’s actually built AI products already knows. No matter how clever your prompt engineering is, effective use of LLMs comes from combining workflows and making sure the right context is available at the right time. Otherwise you’re just rolling dice.

I saw this firsthand in a previous life building AI pipelines for a popular code review tool. The model might have memorized every Python book ever written, but ask ten developers "the right way" to do something and you’ll get eleven answers. Without context of your codebase, your org’s standards, and your actual goals, the LLM can’t know which one applies. That’s why you end up with solutions that are completely different, even antagonistic—depending on which way the probability gods felt like leaning when you asked.

Andreas Kollegger: Many software development teams underestimate how AI coding tools can simplify developers’ least favorite tasks, like writing tests and documentation. While AI coding demos with promises of low-code and no-code often seem trivial or unreliable, they show how AI can translate between natural language and code, which is ideal for automating tedious tasks and repetitive setup. Similarly, there's an entire subgenre of coding tools dedicated to project initialization and code generation.

One workflow that is both overhyped and underutilized is kicking off coding agents to run overnight and reviewing their work in the morning. I wouldn’t recommend refactoring new product features or substantial code unsupervised, but coding agents are perfect for a well-defined GitHub issue with a good discussion, an isolated and reproducible example, and a testable fix.
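
A toy triage function along those lines, with the issue fields invented for illustration:

    def agent_ready(issue: dict) -> bool:
        # Only hand an issue to an unsupervised overnight agent when it is
        # well discussed, reproducible, and its fix is verifiable by a test.
        return (issue["comment_count"] >= 3
                and issue["has_reproducible_example"]
                and issue["has_failing_test"])

    assert agent_ready({"comment_count": 5, "has_reproducible_example": True, "has_failing_test": True})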

May Walter: What most teams underestimate is that the models are already good enough (and getting better) - the missing ingredient is organizational context. Waiting for ‘better models’ is a distraction. The real challenge is designing systems that provide the context needed to generate production-grade code: your architecture, coding standards, data boundaries, and business priorities. Without that, even the best models (or engineers) will underperform.

On the flip side, what’s overhyped today is raw code generation and static code review. Those workflows look impressive in demos, but they don’t address the hardest part of software engineering in large organizations: debugging and quality assurance. Agents still lack runtime context and have few tools to assess which changes are truly critical in terms of business impact.

That gap matters because faster code generation means more changes flowing into production - and without stronger processes to decide what to monitor, teams risk trading speed for fragility. The underutilized frontier isn’t writing code faster, it’s building validation loops and runtime-aware tooling that increase certainty before those changes ever get deployed.

Conclusions

The first and perhaps most important conclusion from this discussion is that while the adoption of AI tools in the software development process has undoubtedly lowered the barrier to contribution, AI is still a multiplier, not a silver bullet. It amplifies productivity only when paired with strong organizational context. AI-based engineering has the potential to become as central to software development as CI/CD pipelines once were; however, architecture, coding standards, and experiment scaffolding are the sustaining pillars of successful AI adoption.

As AI tools evolve, the role of developers within organizations also tilts from code author to system orchestrator. The newly adopted process of curating, validating, and integrating AI outputs does not replace software engineering as a craft; instead, it adds to it. Critical thinking and architectural awareness are more important than ever.

There are, of course, pitfalls that come with the adoption of any new technology, and AI-based tools are no exception. A lowered barrier to contribution also means an increased risk of shallow understanding and subpar code, which can hurt both junior developers' careers and the organization as a whole. Mentorship and runtime feedback are important guardrails, together with cultural and ethical safeguards: AI outputs must be treated as first drafts, and humans must be held accountable for them. When it comes to AI, trust is not granted: it is earned through tests, peer review, runtime validation, and transparency.

Success metrics must also be rethought, since AI inflates all traditional productivity metrics. Meaningful signals come later: stability, churn, incidents, and how much time is freed for creativity and architecture. Scaling AI must be viewed as a collaborative process, not a personal productivity boost, which requires a coordinated workflow and an enhanced level of maturity for the surrounding processes.

With the good and the bad, it is clear that the changes brought by AI are already here, reshaping the craft of software development. There are still underutilized aspects to it, but context design and runtime-aware tooling are already the next architectural frontier. In the long run, the winners of the AI race will be those who integrate it into team-level processes with accountability, trust, and systems that can evolve together in a responsible manner.
