GitHub has introduced a significant update to its CodeQL engine, enabling developers to define custom sanitizers and validators directly through "models-as-data," a move that simplifies how teams extend security analysis across their codebases. The update allows engineers to configure how trusted and validated data is handled without writing custom CodeQL queries, marking a shift toward more accessible and scalable application security practices.
The enhancement addresses a key limitation in traditional static analysis workflows, where extending detection logic often required deep expertise in query languages. With the new approach, teams can define these behaviors declaratively using YAML-based data extensions, making it easier to adapt CodeQL to project-specific frameworks, internal libraries, and custom validation logic.
At the core of the update is improved control over taint tracking, a method used to trace how untrusted data flows through an application. CodeQL now allows developers to define sanitizers (functions that clean or neutralize data) and validators (checks that confirm data safety) as "barriers" and "barrier guards." These constructs determine where potentially unsafe data should stop propagating through the system.
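To make the idea of a barrier concrete, here is a minimal Python sketch of taint tracking as graph reachability. It is a toy illustration, not how CodeQL is implemented: the node names (request.param, escape_html.result, response.write, log.write) and the tainted_nodes helper are hypothetical, chosen only to show that marking one node as a barrier cuts the path from source to sink while leaving other paths untouched.

```python
from collections import deque


def tainted_nodes(edges, sources, barriers):
    """Return every node reachable from a source without passing through a barrier."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    reached = set()
    queue = deque(s for s in sources if s not in barriers)
    while queue:
        node = queue.popleft()
        if node in reached:
            continue
        reached.add(node)
        for nxt in graph.get(node, []):
            # A barrier node is treated as clean: taint neither enters nor passes it.
            if nxt not in barriers and nxt not in reached:
                queue.append(nxt)
    return reached


# Hypothetical flow: a request parameter goes through escape_html() before
# reaching the HTML response, and also reaches a log call unsanitized.
edges = [
    ("request.param", "escape_html.result"),
    ("escape_html.result", "response.write"),
    ("request.param", "log.write"),
]

without_barrier = tainted_nodes(edges, {"request.param"}, barriers=set())
with_barrier = tainted_nodes(edges, {"request.param"}, barriers={"escape_html.result"})
```

Without the barrier, the analysis would flag response.write as receiving untrusted data; once escape_html.result is declared a barrier, only the unsanitized log.write path remains tainted.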
Two new extensible predicates, barrierModel and barrierGuardModel, enable this functionality. The former stops tainted data flow at a function that is known to sanitize its input, while the latter stops propagation on the code path where a validation check succeeds. Previously, implementing this required writing custom CodeQL logic; now it can be done declaratively, which reduces complexity and makes advanced security analysis easier for teams to adopt.
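The difference between the two predicates can be sketched in a few lines of Python. The row layout below is an assumption made for illustration and is not CodeQL's actual data-extension schema; the function names escape_html and is_valid_id, the accepting_value field, and the reaches_sink helper are likewise hypothetical. The point is only that the rules live in plain data while a generic engine interprets them.

```python
# Hypothetical "models as data": plain rows, no analysis logic.
BARRIER_MODELS = [
    {"function": "escape_html"},  # the return value is considered clean
]
BARRIER_GUARD_MODELS = [
    # The value is considered clean on the branch where the check returned True.
    {"function": "is_valid_id", "accepting_value": True},
]


def reaches_sink(path, barrier_models, barrier_guard_models):
    """Decide whether taint survives a single path from source to sink.

    `path` is a list of steps:
      ("call", name)           the value is passed through function `name`
      ("guard", name, outcome) the path lies on the branch where name(value) == outcome
    """
    sanitizers = {row["function"] for row in barrier_models}
    guards = {(row["function"], row["accepting_value"]) for row in barrier_guard_models}

    for step in path:
        if step[0] == "call" and step[1] in sanitizers:
            return False  # barrier: the sanitizer's output is clean
        if step[0] == "guard" and (step[1], step[2]) in guards:
            return False  # barrier guard: validation succeeded on this branch
    return True


sanitized = reaches_sink([("call", "escape_html")], BARRIER_MODELS, BARRIER_GUARD_MODELS)
validated = reaches_sink([("guard", "is_valid_id", True)], BARRIER_MODELS, BARRIER_GUARD_MODELS)
failed_check = reaches_sink([("guard", "is_valid_id", False)], BARRIER_MODELS, BARRIER_GUARD_MODELS)
unmodeled = reaches_sink([("call", "trim")], BARRIER_MODELS, BARRIER_GUARD_MODELS)
```

In this sketch a barrier applies unconditionally once the value passes through the sanitizer, whereas a barrier guard applies only on the branch where the check produced the accepting outcome, so the path where is_valid_id returned False stays tainted. Supporting a new internal sanitizer means adding a row rather than changing the engine.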
The update applies across a wide range of programming languages, including C/C++, C#, Go, Java/Kotlin, JavaScript/TypeScript, Python, Ruby, and Rust. This broad support ensures that organizations with polyglot codebases can standardize how they model and enforce security rules without duplicating effort across different tooling or languages.
By allowing teams to encode knowledge about their own systems, such as internal sanitization functions or validation patterns, CodeQL can produce more accurate and context-aware results, reducing false positives and improving the detection of real vulnerabilities. This is particularly important in modern development environments where custom frameworks and abstractions can obscure traditional analysis.
Extending models-as-data to sanitizers and validators reflects a broader trend in application security: moving from code-centric customization to data-driven configuration. Instead of writing and maintaining complex queries, teams can now manage this security logic as structured data, making it easier to version, share, and scale across organizations.
This aligns with GitHub's ongoing efforts to integrate security more deeply into developer workflows, enabling teams to extend built-in tooling rather than relying on external or bespoke solutions. It also supports faster onboarding for security practices, as developers can adopt and adapt models without specialized training in CodeQL's query language.
Ultimately, the update aims to make advanced security analysis more accessible, flexible, and maintainable. By reducing the need for custom query development, GitHub is enabling more teams to tailor CodeQL to their specific environments, closing coverage gaps and improving vulnerability detection accuracy.
Other platforms are tackling the same challenge that GitHub's CodeQL update addresses: making security modeling more accessible, integrated, and developer-friendly. They differ in how they balance flexibility, usability, and depth of analysis.
For example, GitLab takes a more pipeline-centric approach, embedding static application security testing (SAST), dependency scanning, and secret detection directly into CI/CD workflows. Rather than exposing deep customization through query languages, GitLab emphasizes prebuilt rules and policy-driven enforcement, making it easier for teams to adopt security without needing specialized expertise. Similarly, Snyk focuses on developer-first security, automatically identifying vulnerabilities in code and dependencies and providing remediation guidance inline, prioritizing ease of use over deep customization.
On the more flexible and customizable end of the spectrum, tools like Semgrep offer an alternative model closer to what GitHub is evolving toward. Semgrep allows teams to define custom security rules using code-like patterns, avoiding the complexity of full query languages while still enabling tailored analysis. This makes it easier for developers to extend detection logic. Meanwhile, platforms such as SonarQube provide continuous code inspection, combining security, quality, and maintainability checks into a unified dashboard, with a strong focus on ongoing visibility rather than deep, query-driven modeling.
Across these approaches, a clear trend is emerging. GitHub's CodeQL update moves toward data-driven, declarative security modeling, and the broader ecosystem is likewise converging on reducing friction, whether through simplified rule definitions, built-in policies, or tighter CI/CD integration. The key trade-off remains consistent: platforms must balance depth and precision of analysis against usability and scalability for everyday developers, and each tool is evolving along that spectrum in a different way.