Exploration phase: Grassroots learning
Our journey began with exploration. Developers were encouraged to use AI tools in their daily work at their own pace. This grassroots effort provided invaluable feedback and revealed where AI could add immediate value without disrupting workflows or compromising standards.
At first, things were chaotic. We tested multiple tools, quickly realizing that the “arms race” between platforms would continue. Our takeaway: pick a tool, learn its nuances and avoid the trap of constant switching.
Today, our core toolset includes GitHub Copilot, Jules and Claude Code, with OpenAI keys powering additional integrations.
We continue to explore and evaluate new tools, but we are also focused on getting the most out of this core toolset and our ongoing investment in AI.
Execution phase: From small steps to bold moves
Phase 1: Big things have small beginnings
We started with simple, repetitive tasks:
- Inline comments and documentation auto-suggestions
- Generic or boilerplate object creation, such as Enums and POJOs (plain old Java objects); a sketch of this kind of boilerplate follows this list
- Straightforward method logic that simply does what the method name describes
- Leveraging inline chat features to get quick feedback on specific blocks of code
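As a hedged illustration of what that boilerplate looks like (class and field names below are invented for this sketch, not taken from our codebase), this is the kind of code we were comfortable letting the assistant draft first:

```java
// Hypothetical illustration only: the kind of boilerplate an assistant autocompletes.
// OrderSummary and OrderStatus are invented names for this sketch.
public class OrderSummary {

    /** A simple enum the assistant can generate from a one-line prompt. */
    public enum OrderStatus { PENDING, SHIPPED, DELIVERED, CANCELLED }

    private final String orderId;
    private final OrderStatus status;

    public OrderSummary(String orderId, OrderStatus status) {
        this.orderId = orderId;
        this.status = status;
    }

    public String getOrderId() {
        return orderId;
    }

    public OrderStatus getStatus() {
        return status;
    }

    /** "Does what the name says" logic: true once the order reaches a terminal state. */
    public boolean isComplete() {
        return status == OrderStatus.DELIVERED || status == OrderStatus.CANCELLED;
    }
}
```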
Phase 2: Smarter code reviews
Next, we moved to AI-assisted code reviews. By enabling automated reviews across all pull requests, we unlocked several benefits:
- Change summarization of large pull requests
- Real-time bug prevention (e.g., catching unclosed HTML tags)
- Secondary code checks prompting closer human review
- Code performance optimization, catching issues that could have slowed execution significantly (an illustrative example follows this list)
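To make the last point concrete, here is a generic illustration rather than an actual finding from our reviews: the kind of slowdown an automated reviewer can flag, alongside the fix it typically suggests.

```java
// Illustrative only: a classic pattern an automated review can flag and suggest fixing.
public final class ReportBuilder {

    // Before: quadratic behavior, because each += copies the accumulated string.
    static String joinSlow(java.util.List<String> lines) {
        String report = "";
        for (String line : lines) {
            report += line + "\n";
        }
        return report;
    }

    // After: the reviewer-suggested version using StringBuilder, linear in total length.
    static String joinFast(java.util.List<String> lines) {
        StringBuilder report = new StringBuilder();
        for (String line : lines) {
            report.append(line).append('\n');
        }
        return report.toString();
    }
}
```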
Based on the insights we gathered, our first formal initiative was to enable automated code reviews across all pull requests against platform code, a clear opportunity to leverage AI with manageable risk. To start, we used GitHub Copilot, which uses OpenAI’s models.
This approach has significantly enhanced our development process by streamlining code review and bug detection in the ways listed above.
Over the long term, catching and correcting issues like these can reduce operational and staffing costs, because site reliability engineers have to intervene less often for non-critical issues.
Phase 3: Unit tests and documentation
We then applied AI to low-risk but high-value areas: unit tests and documentation. Verification overhead is lower here, making these areas ideal for AI integration.
Using Copilot and Jules, developers collaborated with AI to:
- Suggest edge cases they hadn’t considered (an example test sketch follows this list)
- Format code to align with internal guidelines
- Iterate on output, much as they would with a junior developer who needs guidance but improves over time
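As a sketch of what that collaboration produces (JUnit 5, with `parsePercentage` as an invented helper inlined so the example is self-contained), the assistant-suggested edge cases often look like this:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// Hypothetical sketch: AI-suggested edge cases for an invented helper method.
class PercentageParserTest {

    // Method under test, inlined here so the sketch is self-contained.
    static int parsePercentage(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalArgumentException("value is required");
        }
        int value = Integer.parseInt(raw.trim().replace("%", ""));
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        return value;
    }

    @Test
    void parsesTrimmedValueWithPercentSign() {
        assertEquals(42, parsePercentage(" 42% "));
    }

    // Edge cases the assistant proposed that we had not written ourselves.
    @Test
    void rejectsNullAndBlankInput() {
        assertThrows(IllegalArgumentException.class, () -> parsePercentage(null));
        assertThrows(IllegalArgumentException.class, () -> parsePercentage("   "));
    }

    @Test
    void rejectsValuesOutsideZeroToOneHundred() {
        assertThrows(IllegalArgumentException.class, () -> parsePercentage("101"));
        assertThrows(IllegalArgumentException.class, () -> parsePercentage("-1"));
    }
}
```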
Phase 4: AI agents as developer assistants
Finally, we began experimenting with AI as a day-to-day development assistant for production code.
AI agents excel at:
- Automating tedious work, such as building POJOs from JSON specs (see the sketch after this list)
- Acting as “hyper-intelligent rubber ducks” for architectural discussions
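For the JSON-to-POJO case, here is a hedged sketch, assuming Jackson annotations and invented field names rather than a real spec from our platform:

```java
import com.fasterxml.jackson.annotation.JsonProperty;

/*
 * Hypothetical JSON spec the agent starts from (field names invented for this sketch):
 * { "campaign_id": "abc-123", "impressions": 1800, "click_through_rate": 0.031 }
 */
public record CampaignStats(
        @JsonProperty("campaign_id") String campaignId,
        @JsonProperty("impressions") long impressions,
        @JsonProperty("click_through_rate") double clickThroughRate) {

    // Compact constructor: the hand-written validation we add after the agent
    // generates the boilerplate above.
    public CampaignStats {
        if (campaignId == null || campaignId.isBlank()) {
            throw new IllegalArgumentException("campaign_id is required");
        }
    }
}
```

Deserializing the payload is then a single ObjectMapper call, which keeps the tedious part automated while the validation stays under human control.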
But with great power comes risk. Agents can accelerate complex tasks, but they also make mistakes. For example, one agent attempted a misguided optimization that slowed execution by more than 10x. That’s why human review remains non-negotiable.
Iteration phase: Continuous refinement
Our phased approach ensures continuous learning:
- Develop a deep understanding: We gain practical insight into where AI truly excels and where human expertise remains indispensable. We are not just adopting tools but building a profound understanding of their capabilities and limitations in our environment.
- Tailor solutions to our needs: We are not simply adopting off-the-shelf AI. Instead, we are actively experimenting and adapting tools to align with our specific development workflows, coding standards and project requirements.
- Ensure strategic value: Every AI initiative we undertake is evaluated for its tangible impact. By carefully tracking AI’s influence on productivity and code quality, we ensure that every effort delivers measurable value and contributes directly to our organizational goals.
At first, engineers were skeptical. Today, adoption is high, and productivity is up.
Where we see risks
- Lazily accepting code: to guard against this, AI output is reviewed as critically as human-written code.
- Production issues: While no AI-produced code has caused failures, we remain vigilant about avoiding increases in Mean Time to Resolution (MTTR) in QA, staging or production.
Measuring the ROI of AI
Our AI investment is delivering measurable returns.
Productivity gains
Adoption of AI tools has boosted productivity: weekly task completion has improved by about 20% on average, up from less than 10% early in our rollout. AI agents capable of handling more complex tasks have been a key driver of this gain.
We’ve also seen faster code generation, which requires more detailed reviews. To maximize returns, we continually measure uplift and foster rapid learning. AI is positioned as a productive co-author, supported through an #ai Slack channel and monthly sessions that engage both engineering and non-engineering teams in safe, systematic adoption.
Adoption curve
The adoption curve remains strong, with significant productivity gains expected across engineering by year-end. In January 2025, our thesis centered on applying AI to code reviews, document reviews, production code and UI tests.
This is the chart we published to internal leadership in February, and we are pacing ahead of our original predictions (e.g., we have turned on code reviews in 100% of our repos, and document reviews are running well ahead of schedule).
Investment
Our initial exploration of AI was restrictive, and the investment was open-ended, which left us with a poor understanding of yield and outcomes.
To address this, we have since set up a budget and governance structure that equips our teams to request and use company-approved AI tools.
To illustrate how the budgeting process played out, below is a simplified version of the model we used with our finance teams to measure the impact of our AI investments.