<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://ddobrin.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ddobrin.github.io/" rel="alternate" type="text/html" /><updated>2026-04-30T20:29:05+00:00</updated><id>https://ddobrin.github.io/feed.xml</id><title type="html">Dan Dobrin</title><subtitle>Practical deep dives into Java, Gemini, LangChain4J, and modern cloud engineering.</subtitle><author><name>Dan Dobrin</name></author><entry><title type="html">The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code</title><link href="https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code/" rel="alternate" type="text/html" title="The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code" /><published>2025-12-17T00:00:00+00:00</published><updated>2025-12-17T00:00:00+00:00</updated><id>https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code</id><content type="html" xml:base="https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code/"><![CDATA[<h1 id="the-senior-engineer-vs-the-speedster-how-ai-tackles-advent-of-code">The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code</h1>
<p>Every December, developers from around the world tackle the <strong><a href="https://adventofcode.com/">Advent of Code</a></strong> challenges — a much-loved tradition of daily programming puzzles that grow steadily harder, ranging from warm-up exercises to serious algorithmic challenges.</p>
<p>This year, having early access to Gemini 3 Pro and Flash, I decided to take a totally different approach to solving these puzzles: delegate to the Gemini models and run an experiment that combines the creative puzzles of Advent of Code with the power of state-of-the-art LLMs. I was curious how the different models would compare.</p>
<h2 id="tldr">TL;DR</h2>
<p><strong>A Focused Goal</strong></p>
<ul>
<li>use state-of-the-art models: <em><strong><a href="https://ai.google.dev/gemini-api/docs/models#gemini-3-pro">Gemini 3 Pro Preview</a></strong></em> and <em><strong><a href="https://ai.google.dev/gemini-api/docs/models#gemini-3-flash">Gemini 3 Flash Preview</a></strong></em></li>
<li><strong>employ only minimalistic</strong> prompting</li>
<li>give each model the same puzzle and dataset with a simple prompt, for all 12 days (a format different from previous years, now with 2 puzzles per day)</li>
<li>validate that each model can solve it correctly in a <strong>one-shot</strong> approach, no hints, no intervention, for <strong>all</strong> 12 days</li>
<li>compare not just <em><strong>whether</strong></em> the models solve it, but <em><strong>how</strong></em> they solve it, as a follow-up</li>
<li>Gemini CLI is used for working with both models</li>
</ul>
<p><strong>Results</strong></p>
<ul>
<li><em><strong>BOTH</strong></em> models successfully solve <em><strong>ALL 24 puzzles</strong></em> in one shot, as I hoped they would!</li>
<li>the results reveal some interesting differences in what I'd call &quot;engineering maturity&quot;</li>
</ul>
<p><strong>Note to the Reader</strong></p>
<ul>
<li>experimenting with all the puzzles is left to the reader; the last few have even more interesting solutions</li>
<li>this is not an exhaustive, complex prompting exercise; it is just a snapshot of what can be done with LLMs today</li>
<li>why this exercise? Imagine what becomes possible when you add optimization, human-in-the-loop review, software engineering guidance, and more!</li>
</ul>
<p><img src="/assets/images/aoc.png" alt="Image1" /></p>
<h2 id="why-advent-of-code-is-great-for-a-personal-llm-evaluation">Why Advent of Code Is Great for a Personal LLM Evaluation</h2>
<p>Before looking into the exercise, let's touch briefly on two points:</p>
<h3 id="advent-of-code-packing-serious-challenges-as-personal-developer-fun">Advent of Code: Packing Serious Challenges as Personal Developer Fun</h3>
<p>Advent of Code occupies an interesting middle ground - it's both a fun exercise <em>and</em> a serious programming challenge. The problems genuinely test algorithmic thinking, especially in the later days of the schedule (the last days usually involve graph algorithms, dynamic programming, or computational geometry).</p>
<p>Many professional developers use it for interview prep or to challenge themselves and stay sharp.</p>
<h3 id="why-it-works-as-an-llm-benchmark">Why It Works as an LLM Benchmark</h3>
<p>Advent of Code is actually an excellent framework for comparing LLMs:</p>
<table>
<thead>
<tr>
<th>Criterion</th>
<th>Why It Matters</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Objective correctness</strong></td>
<td>Problems have definitive right answers with test cases</td>
</tr>
<tr>
<td><strong>Graduated difficulty</strong></td>
<td>Days 1-6 focus more on basic reasoning; <br> Days 7-12 reveal which models understand complex algorithms</td>
</tr>
<tr>
<td><strong>Logical focus</strong></td>
<td>Focus on solving a problem in a logical way, with no distraction <br> from frameworks or libraries in the code</td>
</tr>
<tr>
<td><strong>Flexibility</strong></td>
<td>Implement in the programming language of your choice</td>
</tr>
<tr>
<td><strong>Multi-dimensional evaluation</strong></td>
<td>You can assess correctness, code quality, performance, <br> explanation clarity, and debugging ability all at once</td>
</tr>
<tr>
<td><strong>Real-world-like</strong></td>
<td>Unlike small demo challenges, Advent of Code requires parsing <br> varied and messy input, handling edge cases, and sometimes optimizing for scale</td>
</tr>
<tr>
<td><strong>Reproducible</strong></td>
<td>Same problems, same inputs, easy to compare apples-to-apples across models</td>
</tr>
</tbody>
</table>
<h2 id="the-experiment-setup">The Experiment Setup</h2>
<p>Create an account on the Advent of Code website to unlock solution correctness evaluation and the subsequent exercises.</p>
<h3 id="the-models">The Models</h3>
<table>
<thead>
<tr>
<th>Model</th>
<th>Persona</th>
<th>Expected Behavior</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Gemini 3 Pro Preview</strong></td>
<td>&quot;The Senior Engineer&quot;</td>
<td>Robust, maintainable, production-ready code. <br> Proper error handling, modern APIs, clean architecture</td>
</tr>
<tr>
<td><strong>Gemini 3 Flash Preview</strong></td>
<td>&quot;The Speedster&quot;</td>
<td>Get the right answer <em>fast</em>. <br> Correct logic, but less emphasis on &quot;gold plating&quot; or defensive coding</td>
</tr>
</tbody>
</table>
<h3 id="the-prompt">The Prompt</h3>
<p>Both models received the same simple prompt, with only the file names changing: the puzzle description and input, plus a request to solve it in Java. No hints about code quality, no instructions about error handling - just solve the problem.</p>
<pre><code>Read carefully the following puzzle text from puzzle1.md and the associated input file puzzle1.input for testing; plan and implement the puzzle in Java 25 with a main() method for testing
</code></pre>
<h3 id="evaluation-methodology">Evaluation Methodology</h3>
<p>I used a hybrid approach, and this is just a guideline - <strong>remember</strong> that we used minimalistic prompting and have not looked into optimizations!</p>
<ol>
<li><strong>Automated correctness testing</strong> - pass/fail against the actual puzzle input online at <a href="https://adventofcode.com/">Advent of Code</a></li>
<li><strong>Side-by-side analysis</strong> - in an IDE</li>
<li><strong>Independent LLM evaluation</strong> for qualitative aspects:
<ul>
<li>Code clarity and structure</li>
<li>Algorithmic adherence</li>
<li>Software engineering principles</li>
<li>Handling of edge cases</li>
</ul>
</li>
<li><strong>Performance benchmarking</strong> - execution time comparison</li>
</ol>
<h2 id="lets-examine-puzzle-1-in-details-the-secret-entrance">Let's Examine Puzzle 1 in Details: The Secret Entrance</h2>
<p>The <a href="https://adventofcode.com/2025/day/1">first puzzle</a> involves a combination safe with a dial numbered 0-99. Starting at position <strong>50</strong>, you receive a list of instructions to rotate the dial Left (L) or Right (R). The challenge: calculate the &quot;password&quot; by counting how many times the dial points to <strong>0</strong> after completing each rotation.</p>
<h3 id="the-algorithmic-challenge">The Algorithmic Challenge</h3>
<p>This is a classic modular arithmetic problem with a twist: handling negative numbers correctly when rotating left past zero. In Java, the <code>%</code> operator can return negative values for negative operands, so the implementation must account for this.</p>
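<p>As an aside, Java offers a built-in way around the negative-modulo pitfall: <code>Math.floorMod</code> always returns a non-negative result for a positive modulus. A minimal sketch of the dial update using it (not taken from either model's output) could look like this:</p>
<pre><code class="language-java">// Illustrative only; neither model's solution used this form.
// In Java, (-18 % 100) == -18, whereas Math.floorMod(-18, 100) == 82.
static int rotate(int pos, char direction, int amount) {
    int delta = (direction == 'L') ? -amount : amount;
    return Math.floorMod(pos + delta, 100);
}
</code></pre>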
<p><strong>Example walkthrough:</strong>
Starting at 50, applying rotations: L68, L30, R48, L5, R60, L55, L1, L99, R14, L82</p>
<table>
<thead>
<tr>
<th>Rotation</th>
<th>Position After</th>
<th>Ends at 0?</th>
</tr>
</thead>
<tbody>
<tr>
<td>L68</td>
<td>82</td>
<td>No</td>
</tr>
<tr>
<td>L30</td>
<td>52</td>
<td>No</td>
</tr>
<tr>
<td>R48</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>L5</td>
<td>95</td>
<td>No</td>
</tr>
<tr>
<td>R60</td>
<td>55</td>
<td>No</td>
</tr>
<tr>
<td>L55</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>L1</td>
<td>99</td>
<td>No</td>
</tr>
<tr>
<td>L99</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>R14</td>
<td>14</td>
<td>No</td>
</tr>
<tr>
<td>L82</td>
<td>32</td>
<td>No</td>
</tr>
</tbody>
</table>
<p><strong>Password: 3</strong> (the dial pointed to 0 three times)</p>
<h2 id="code-comparison-where-engineering-maturity-shows">Code Comparison: Where Engineering Maturity Shows</h2>
<p>Both models solved the puzzle correctly. But <em>how</em> they solved it reveals the seniority angle.</p>
<h3 id="the-core-logic-handling-negative-modulo">1. The Core Logic: Handling Negative Modulo</h3>
<p><strong>The Senior Engineer (Gemini 3 Pro)</strong> used an explicit check for negative numbers:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Senior approach - explicit and readable</span>
<span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'L'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">-</span> <span class="n">amount</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
    <span class="k">if</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
        <span class="n">currentPos</span> <span class="o">+=</span> <span class="mi">100</span><span class="o">;</span>
    <span class="o">}</span>
<span class="o">}</span> <span class="k">else</span> <span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'R'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">+</span> <span class="n">amount</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
    <span class="k">throw</span> <span class="k">new</span> <span class="nf">IllegalArgumentException</span><span class="o">(</span><span class="s">"Unknown direction: "</span> <span class="o">+</span> <span class="n">direction</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div></div>
<p><strong>The Speedster (Gemini 3 Flash)</strong> used a mathematical one-liner:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Speedy approach - clever but redundant</span>
<span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'R'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">+</span> <span class="n">distance</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span> <span class="k">else</span> <span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'L'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">-</span> <span class="o">(</span><span class="n">distance</span> <span class="o">%</span> <span class="mi">100</span><span class="o">)</span> <span class="o">+</span> <span class="mi">100</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>
<p><strong>Analysis:</strong></p>
<ul>
<li>Both are correct, but Speedster's version includes a redundant <code>distance % 100</code> operation—the outer modulo already handles wrap-around.</li>
<li>The Senior's code is also easier to debug: the logic is explicit and the negative case is clearly handled.</li>
</ul>
<h3 id="api-design-flexibility-vs-hardcoding">2. API Design: Flexibility vs. Hardcoding</h3>
<table>
<thead>
<tr>
<th>Aspect</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Method signature</td>
<td><code>solve(List&lt;String&gt;, int startPos)</code></td>
<td><code>solve(List&lt;String&gt;)</code></td>
</tr>
<tr>
<td>Start position</td>
<td>Parameterized (configurable)</td>
<td>Hardcoded (<code>int currentPos = 50;</code>)</td>
</tr>
<tr>
<td>Return type</td>
<td><code>long</code> (prevents overflow)</td>
<td><code>int</code></td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<p>By parameterizing <code>startPos</code>, the Senior's code is immediately testable with different scenarios without modifying any source code. This is a hallmark of production-ready design. The <code>long</code> return type also shows foresight about potential overflow with large inputs.</p>
<h3 id="error-handling-fail-fast-vs-silent">3. Error Handling: Fail-Fast vs. Silent</h3>
<table>
<thead>
<tr>
<th>Behavior</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unknown direction</td>
<td><code>throw IllegalArgumentException</code></td>
<td>Silently ignored</td>
</tr>
<tr>
<td>File not found</td>
<td>Explicit check with clear message</td>
<td>Generic IOException</td>
</tr>
<tr>
<td>IOException handling</td>
<td><code>throws IOException</code> (caller decides)</td>
<td>try-catch (handled internally)</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<ul>
<li>The Senior implements &quot;Fail Fast&quot; logic. If the input contains an unknown direction (e.g., 'U'), it throws an exception immediately.</li>
<li>The Speedster silently ignores invalid lines — which might prevent a crash, but could lead to incorrect results with no warning. This is a classic debugging issue in production.</li>
</ul>
<h3 id="modern-java-idioms">4. Modern Java Idioms</h3>
<table>
<thead>
<tr>
<th>Aspect</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Path API</td>
<td><code>Path.of()</code> (Java 11+)</td>
<td><code>Paths.get()</code> (Java 7+)</td>
</tr>
<tr>
<td>File existence check</td>
<td><code>Files.exists()</code></td>
<td>None</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<ul>
<li><code>Path.of()</code> is the modern, preferred API. Using it signals awareness of current Java best practices.</li>
</ul>
<h3 id="test-code">5. Test code</h3>
<p><strong>BOTH</strong> Pro and Flash models generated almost identical test code, based on the outline of Puzzle 1:</p>
<pre><code>// imports added for completeness (JUnit 5 assumed)
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;

class Puzzle1Test {

    @Test
    void testExample() {
        List&lt;String&gt; instructions = List.of(
            &quot;L68&quot;,
            &quot;L30&quot;,
            &quot;R48&quot;,
            &quot;L5&quot;,
            &quot;R60&quot;,
            &quot;L55&quot;,
            &quot;L1&quot;,
            &quot;L99&quot;,
            &quot;R14&quot;,
            &quot;L82&quot;
        );
        
        long result = Puzzle1.solve(instructions, 50);
        assertEquals(3, result);
    }
}
</code></pre>
<h3 id="performance-analysis">6. Performance Analysis</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Time complexity</td>
<td>O(N)</td>
<td>O(N)</td>
</tr>
<tr>
<td>Operations per rotation</td>
<td>1 modulo + 1 conditional</td>
<td>2-3 modulo operations</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer (marginally)</strong></p>
<ul>
<li>The extra <code>distance % 100</code> in Flash's implementation is unnecessary work.</li>
<li>Both are O(N) and the difference is negligible in practice, but the Senior's code is technically more efficient.</li>
</ul>
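<p>For completeness, here is the kind of wall-clock harness that is sufficient for this comparison at this input size. It is an illustrative sketch, not the experiment's exact benchmarking code; <code>Puzzle1.solve</code> mirrors the signature generated by the Pro model, and a JMH benchmark would be the rigorous alternative:</p>
<pre><code class="language-java">import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative timing harness; not the exact code used in the experiment
public class TimedRun {
    public static void main(String[] args) throws Exception {
        List&lt;String&gt; instructions = Files.readAllLines(Path.of("puzzle1.input"));
        long start = System.nanoTime();
        long answer = Puzzle1.solve(instructions, 50); // signature from the Pro solution
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("Answer: %d (%d ms)%n", answer, elapsedMs);
    }
}
</code></pre>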
<h2 id="the-scorecard">The Scorecard</h2>
<table>
<thead>
<tr>
<th>Criterion</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Correctness</strong></td>
<td>Pass</td>
<td>Pass</td>
</tr>
<tr>
<td><strong>API Design</strong></td>
<td><strong>Winner</strong> (configurable)</td>
<td>Basic (hardcoded)</td>
</tr>
<tr>
<td><strong>Error Handling</strong></td>
<td><strong>Winner</strong> (fail-fast)</td>
<td>Silent/permissive</td>
</tr>
<tr>
<td><strong>Input Robustness</strong></td>
<td>Good</td>
<td>Good</td>
</tr>
<tr>
<td><strong>Modern Java</strong></td>
<td><strong>Winner</strong> (<code>Path.of</code>)</td>
<td>Older (<code>Paths.get</code>)</td>
</tr>
<tr>
<td><strong>Performance</strong></td>
<td><strong>Winner</strong> (marginal)</td>
<td>Good</td>
</tr>
</tbody>
</table>
<h2 id="key-insights-what-have-we-learned">Key Insights: What Have We Learned</h2>
<h3 id="the-speedster-as-a-developer-under-deadline">The Speedster as a Developer Under Deadline</h3>
<p>Gemini 3 Flash acted like a developer racing against a deadline: it got the right answer efficiently, but left behind some technical debt. Hardcoded values, older APIs, silent error handling—these are the hallmarks of &quot;I'll fix it later&quot; code that often lingers for a long time or never gets fixed.</p>
<h3 id="the-senior-as-a-production-minded-engineer">The Senior as a Production-Minded Engineer</h3>
<p>Gemini 3 Pro solved the problem <em>and</em> built a reusable, robust component. The parameterized design (passing <code>startPos</code> as an argument), explicit error handling, and modern APIs demonstrate a deeper understanding of software craftsmanship.</p>
<h3 id="the-adoption-decision">The Adoption Decision</h3>
<p><strong>For a quick script or prototype?</strong> Flash is perfectly fine. It gets the job done.</p>
<p><strong>For the production codebase?</strong> Adopt the Senior's solution. The generated code is:</p>
<ul>
<li>Testable without modification</li>
<li>Self-documenting through explicit error handling</li>
<li>Future-proof with modern idioms</li>
<li>Easier to debug when things go wrong</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>For Puzzle 1 of Advent of Code 2025, the distinction between the Senior Engineer and the Speedster was clear:</p>
<ul>
<li><strong>Gemini Pro</strong> delivered a working solution <em>and</em> a production-ready component</li>
<li><strong>Gemini Flash</strong> delivered a working solution with minimal fuss, but left some technical debt</li>
</ul>
<p>The metaphor extends beyond this experiment. When using AI for code generation, the model you choose—and the prompts you craft—can produce either throwaway prototypes or building blocks for serious software.</p>
<p><strong>For quick scripts? The Speedster is fine. For the production codebase? Start by adopting the Senior's solutions.</strong></p>]]></content><author><name>Dan Dobrin</name></author><category term="Gemini" /><category term="GenAI" /><category term="java" /><category term="googlecloud" /><summary type="html"><![CDATA[Gemini 3 Pro vs. Gemini 3 Flash solving Advent of Code 2025 — a study in engineering maturity, code quality, and when to use each model.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://ddobrin.github.io/assets/images/aoc.png" /><media:content medium="image" url="https://ddobrin.github.io/assets/images/aoc.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Long Document Summarization Techniques in Java with LangChain4J and Gemini</title><link href="https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models/" rel="alternate" type="text/html" title="Long Document Summarization Techniques in Java with LangChain4J and Gemini" /><published>2024-07-06T00:00:00+00:00</published><updated>2024-07-06T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models</id><content type="html" xml:base="https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models/"><![CDATA[<h1 id="long-document-summarization-techniques-with-java-with-langchain4j-and-gemini-models">Long Document Summarization Techniques in Java with LangChain4J and Gemini Models</h1>
<p>Suppose your organization has a large number of documents, in various formats, and you, a Java developer, are tasked to efficiently summarize the content of each document.</p>
<p>While summarizing any document with just a few paragraphs is a simple task, there are several challenges to overcome when summarizing large documents with many pages.</p>
<p>Generative AI is top of mind for both developers and business stakeholders, and you want to explore how Large Language Models (LLMs) can help you with large-document summarization, a complex use-case with universal applicability in the enterprise.</p>
<p>As a Java developer, you're adept at building robust, scalable, high-performance applications. While Python dominates the ML and NLP landscape, Java has long been the backbone of enterprise software. Its strength in enterprise systems makes it an ideal platform for integrating advanced NLP techniques. With LLMs, you can now add powerful, AI-driven insights to your Java applications, bridging the gap between traditional enterprise software and cutting-edge machine learning capabilities.</p>
<p>This blog post explores various summarization techniques using LLMs, leaving you with practical information and a codebase with ready-to-test Java examples. The objective is to enable you with both theoretical knowledge and hands-on skills for effective document summarization.</p>
<p>We'll be leveraging <a href="https://cloud.google.com/vertex-ai?e=48754805&amp;hl=en">Vertex AI</a> with the latest <a href="https://deepmind.google/technologies/gemini/">Gemini models</a> and the open-source <a href="https://docs.langchain4j.dev/">LangChain4J</a> LLM orchestration framework.</p>
<h2 id="why-consider-llms-for-text-summarization">Why consider LLMs for text summarization</h2>
<p>LLMs offer a number of advantages over traditional extractive summarization methods:</p>
<ul>
<li><strong>Context comprehension</strong>: can grasp complex nuances in text, producing more coherent and relevant summaries</li>
<li><strong>Abstractive capabilities</strong>: will generate new sentences capturing the essence of the original text</li>
<li><strong>Flexibility</strong>: can be fine-tuned for specific domains or styles of summarization</li>
<li><strong>Multilingual support</strong>: many LLMs work across multiple languages, with versatility important for global applications</li>
</ul>
<h2 id="text-summarization-techniques">Text Summarization Techniques</h2>
<p>We'll explore the following three summarization techniques in detail in this blog post:</p>
<ul>
<li><strong>Prompt Stuffing</strong> - pass the content of the entire document as a prompt into the LLM's context window</li>
<li><strong>Map-reduce</strong> - split the document into smaller (potentially overlapping) chunks, summarize each chunk in parallel, then summarize the individual summaries in a second and final step</li>
<li><strong>Refine iteratively</strong> - split the document as in map-reduce, summarize the first chunk, then ask the LLM to iteratively refine the summary with the text of each following chunk, until the end of the text</li>
</ul>
<h2 id="before-you-start">Before you start</h2>
<p>The summarization code uses Java 21. If not already installed, use the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/ai-patterns/summarization-langchain4j#setup-java-ecosystem">following instructions</a> to set it up.</p>
<p><a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md">Documentation</a> provides details for <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md#clone-the-code">cloning the repository</a>, setting the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md#summarization-techniques---langchain4j-vertexai-gemini">required environment vriables up</a> and <a href="https://cloud.google.com/vertex-ai/docs/authentication">authenticating to Vertex AI</a>.</p>
<h2 id="loading-and-splitting-the-document">Loading and splitting the document</h2>
<p>Before summarization can start, you need to load the document, then, depending on your summarization approach, split the content into smaller chunks that can fit into the context window of your chosen LLM.</p>
<p>The latest multimodal Gemini models in Vertex AI have <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models">very large context windows</a>, up to 2M tokens; however, you will have to adapt to the context window of the LLM of your choice.</p>
<p>LangChain4J offers a number of out-of-the-box <a href="https://docs.langchain4j.dev/tutorials/rag#document-loader">Document Loaders</a>, <a href="https://docs.langchain4j.dev/tutorials/rag#document-parser">Document Parsers</a> and <a href="https://docs.langchain4j.dev/tutorials/rag#document-splitter">DocumentSplitters</a>. It is very important to explore which one would yield the best results for your use-case.</p>
<p>The codebase for this blog loads the documents from the test folder using a <code>FileSystemDocumentLoader</code> and the <code>TextDocumentParser</code>. Sample documents are provided in text format in this repo.</p>
<p>For text splitting, the <code>DocumentByParagraphSplitter</code> is being used. It splits a Document into paragraphs and attempts to fit as many paragraphs as possible into a single TextSegment, within the limit set for the chunk size. The splitter allows you to specify an <strong>overlap window for chunks</strong>, with benefits discussed later in the post.</p>
<p>Choosing <strong>the right chunk size</strong> is an exercise that depends on the length of the context window of the LLM of your choice.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// load and parse the document  </span>
<span class="nc">Document</span> <span class="n">document</span> <span class="o">=</span> <span class="n">loadDocument</span><span class="o">(</span><span class="n">resource</span><span class="o">,</span> <span class="k">new</span> <span class="nc">TextDocumentParser</span><span class="o">());</span>  
  
<span class="c1">// Overlap window size between chunks set to OVERLAP_SIZE - can be configured  </span>
<span class="c1">// from 0 - text.length()  </span>
<span class="nc">DocumentSplitter</span> <span class="n">splitter</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">DocumentByParagraphSplitter</span><span class="o">(</span><span class="no">CHUNK_SIZE</span><span class="o">,</span> <span class="no">OVERLAP_SIZE</span><span class="o">);</span>  
<span class="nc">List</span><span class="o">&lt;</span><span class="nc">TextSegment</span><span class="o">&gt;</span> <span class="n">chunks</span> <span class="o">=</span> <span class="n">splitter</span><span class="o">.</span><span class="na">split</span><span class="o">(</span><span class="n">document</span><span class="o">);</span>
</code></pre></div></div>
<h2 id="llm-conversation-inputs">LLM conversation inputs</h2>
<p>@SystemMessage and @UserMessage are commonly used when prompting and interacting with Large Language Models (LLMs).</p>
<p><a href="https://docs.langchain4j.dev/tutorials/ai-services#systemmessage">@SystemMessage</a> is used to set the context or role of the AI model, and is usually not visible to the user. We will use the same @SystemMessage for system instructions whenever the AI Service is invoked.</p>
<p><a href="https://docs.langchain4j.dev/tutorials/ai-services#usermessage">@UserMessage</a> represents the actual input from the human user interacting with the AI. It's the question, prompt, or statement that the user wants the AI to respond to.</p>
<p>@SystemMessage and @UserMessage can be provided directly as Strings or loaded from a prompt template in resources: <code>@SystemMessage(fromResource = &quot;my-system-prompt-template.txt&quot;)</code> or <code>@UserMessage(fromResource = &quot;my-user-template.txt&quot;)</code>.</p>
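<p>The examples below assume a configured <code>chatModel</code>. A minimal sketch of how one might be built with LangChain4J's Vertex AI Gemini integration follows; the project, location, and model name are placeholders, and builder options can vary between LangChain4J versions:</p>
<pre><code class="language-java">import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

// Hypothetical wiring; replace the placeholder values with your own setup
ChatLanguageModel chatModel = VertexAiGeminiChatModel.builder()
    .project("your-project-id")        // placeholder
    .location("us-central1")           // placeholder
    .modelName("gemini-1.5-flash-001") // placeholder; pick a model whose
                                       // context window fits your documents
    .build();
</code></pre>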
<h2 id="prompt-stuffing">#1: Prompt Stuffing</h2>
<p>Stuffing is the simplest summarization technique: you pass the content of the entire document as a prompt into the LLM's context window. However, as prompts for LLMs are token-count-limited, different techniques need to be used for large documents, depending on the size of the context window.</p>
<p>Google's Gemini models have very large context windows, making them an easy choice for summarizing large documents (see limits <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models">here</a>).</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">StuffingSummarizationAssistant</span> <span class="o">{</span>  
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="s">"""  
    You are a helpful AI assistant.    
    You are an AI assistant that helps people summarize information.    
    Your name is Gemini    
    You should reply to the users request with your name and also in the style 
    of a literary critic    
    Strictly ignore Project Gutenberg &amp; ignore copyright notice in summary 
    output.    
    """</span><span class="o">)</span>  
    <span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
    Please provide a concise summary in strictly no more 
    than 10 one sentence bullet points,    
    starting with an introduction and ending with a conclusion, 
    of the following text
                  TEXT: &lt;content&gt;    
    """</span><span class="o">)</span>  
    <span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>  
<span class="o">}</span>

<span class="o">...</span>
<span class="c1">// summarize the document with the help of the StuffingSummarizationAssistant </span>
<span class="nc">StuffingSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">StuffingSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">document</span><span class="o">.</span><span class="na">text</span><span class="o">());</span>
<span class="o">...</span>				 
</code></pre></div></div>
<h4 id="pros">Pros:</h4>
<ul>
<li>Single call to the LLM required to summarize the text, most likely faster than with multiple summarization calls</li>
<li>Model has access to the entire document content at once, potentially resulting in better summary results</li>
</ul>
<h4 id="cons">Cons</h4>
<ul>
<li>Stuffing is applicable only as long as the entire document content can fit into the LLM context window</li>
</ul>
<h2 id="map-reduce">#2: Map-Reduce</h2>
<p>Map-reduce is more intricate than prompt stuffing and implements multi-stage summarization: you split the document into smaller (optionally overlapping) chunks, summarize each chunk in parallel, then summarize the individual summaries in a second and final step.</p>
<p>In this method, you need to prepare two user prompt templates, one for the initial chunk summarization step and another for the final combine step. The system instructions remain the same across all LLM calls.</p>
<h4 id="splitting-the-text-and-summarizing-individual-chunks-the-quotmapquot-step">Splitting the text and summarizing individual chunks (the &quot;map&quot; step)</h4>
<p>You'll be using the following @UserMessage:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">ChunkSummarizationAssistant</span> <span class="o">{</span>
	<span class="nd">@SystemMessage</span><span class="o">(</span><span class="n">fromResource</span> <span class="o">=</span> <span class="s">"my-system-prompt-template.txt"</span><span class="o">)</span>
	<span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
	Taking the following context delimited by triple backquotes into consideration   
	'''&lt;context&gt;'''
	Write a concise summary of the following text delimited by triple backquotes.  
	'''&lt;content&gt;'''  
	Output starts with CONCISE SUB-SUMMARY:  
	"""</span><span class="o">)</span>
	<span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"context"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">context</span><span class="o">,</span> <span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>
<span class="o">}</span>

<span class="o">...</span>
<span class="nc">ChunkSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">ChunkSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">context</span><span class="o">.</span><span class="na">toString</span><span class="o">(),</span> <span class="n">segment</span><span class="o">);</span>
</code></pre></div></div>
<p>Map-reduce allows you to parallelize the individual chunk summarization steps, as they are independent of each other:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">List</span><span class="o">&lt;</span><span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;&gt;&gt;</span> <span class="n">futures</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ArrayList</span><span class="o">&lt;&gt;();</span>  
<span class="nc">ExecutorService</span> <span class="n">executor</span> <span class="o">=</span> <span class="nc">Executors</span><span class="o">.</span><span class="na">newVirtualThreadPerTaskExecutor</span><span class="o">();</span>  
<span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">resultMap</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">TreeMap</span><span class="o">&lt;&gt;();</span> <span class="c1">// TreeMap to automatically sort by key  </span>
  
<span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">segments</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>  
    <span class="kt">int</span> <span class="n">index</span> <span class="o">=</span> <span class="n">i</span><span class="o">;</span>  
    <span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;&gt;</span> <span class="n">future</span> <span class="o">=</span> <span class="nc">CompletableFuture</span>  
        <span class="o">.</span><span class="na">supplyAsync</span><span class="o">(()</span> <span class="o">-&gt;</span> <span class="n">summarizeChunk</span><span class="o">(</span><span class="n">index</span><span class="o">,</span> <span class="n">segments</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">index</span><span class="o">).</span><span class="na">text</span><span class="o">()),</span> <span class="n">executor</span><span class="o">);</span>  
    <span class="n">futures</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">future</span><span class="o">);</span>  
<span class="o">}</span>  
  
<span class="c1">// Wait for all futures to complete and collect the results in resultMap  </span>
<span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Void</span><span class="o">&gt;</span> <span class="n">allDone</span> <span class="o">=</span> <span class="nc">CompletableFuture</span><span class="o">.</span><span class="na">allOf</span><span class="o">(</span><span class="n">futures</span><span class="o">.</span><span class="na">toArray</span><span class="o">(</span><span class="k">new</span> <span class="nc">CompletableFuture</span><span class="o">[</span><span class="mi">0</span><span class="o">]))</span>  
        <span class="o">.</span><span class="na">thenAccept</span><span class="o">(</span><span class="n">v</span> <span class="o">-&gt;</span> <span class="n">futures</span><span class="o">.</span><span class="na">forEach</span><span class="o">(</span><span class="n">f</span> <span class="o">-&gt;</span> <span class="n">f</span><span class="o">.</span><span class="na">thenAccept</span><span class="o">(</span><span class="nl">resultMap:</span><span class="o">:</span><span class="n">putAll</span><span class="o">)));</span>  
  
<span class="n">allDone</span><span class="o">.</span><span class="na">get</span><span class="o">();</span> <span class="c1">// Wait for all processing to complete</span>
</code></pre></div></div>
<p><strong>Important</strong>: please be aware that parallel summarization of individual chunks does not guarantee the ordering of the individual chunk summaries at the end of the process. It is up to the developer to keep summary ordering in mind before proceeding to the second and final step. An index value is attached to each chunk in the sample codebase, guaranteeing an ordered list of individual summaries!</p>
<p>A <strong>key factor for improving summarization results</strong> is the concept of <strong>overlapping chunks</strong>.</p>
<p>Splitting a document by a specific chunk size is arbitrary from a summarization perspective, even when done with utility classes that split the text cleanly into paragraphs, then attempt to fit entire sentences into the remaining chunk space.</p>
<p>Allowing chunks to overlap up to a specified overlap size can yield better summarization results by preserving more context between the individual chunks.</p>
<p>Please consider experimenting with different chunk and overlap sizes for your respective summarization use-cases.</p>
<p><strong>Note</strong>: the degree to which you can parallelize LLM calls depends on the rate limit of API calls per minute imposed by the LLM provider!</p>
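<p>If you run into provider-side rate limits, a simple mitigation is to bound the number of in-flight calls. The sketch below reworks the <code>supplyAsync</code> call from the snippet above with a <code>java.util.concurrent.Semaphore</code>; the permit count of 4 is purely illustrative:</p>
<pre><code class="language-java">// Illustrative throttle: cap concurrent LLM calls so parallel chunk
// summarization stays under the provider's requests-per-minute quota.
Semaphore permits = new Semaphore(4); // illustrative limit

CompletableFuture&lt;Map&lt;Integer, String&gt;&gt; future = CompletableFuture.supplyAsync(() -&gt; {
    try {
        permits.acquire();
        try {
            return summarizeChunk(index, segments.get(index).text());
        } finally {
            permits.release();
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for a permit", e);
    }
}, executor);
</code></pre>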
<h4 id="summary-of-summaries-the-quotreducequot-part">Summary of summaries (the &quot;reduce&quot; part)</h4>
<p>With all individual summaries on hand, you can move on to the second and final step, the summarization of the individual summaries.</p>
<p>You would be using a different @UserMessage in this step:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">FinalSummarizationAssistant</span> <span class="o">{</span>  
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="n">fromResource</span> <span class="o">=</span> <span class="s">"my-system-prompt-template.txt"</span><span class="o">)</span>  
    <span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
    Please provide a concise summary in strictly no more than 10 one sentence bullet points,    
    starting with an introduction and ending with a conclusion, 
    of the following text delimited by triple backquotes.
          '''Text:&lt;content&gt;'''  
      Output starts with SUMMARY:  
    """</span><span class="o">)</span>  
    <span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>  
<span class="o">}</span>
<span class="o">...</span>
<span class="nc">FinalSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">FinalSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">content</span><span class="o">);</span>
</code></pre></div></div>
<h4 id="pros-1">Pros:</h4>
<ul>
<li>Large documents can be summarized even with LLMs with smaller context windows</li>
<li>Parallel processing leads to reduced summarization latency</li>
<li>Overlapping chunks can improve summarization accuracy</li>
</ul>
<h4 id="cons-1">Cons</h4>
<ul>
<li>Multiple LLM calls are required</li>
<li>There can be context loss due to arbitrary text splitting</li>
<li>Overlapping chunks can slightly increase latency and create larger input text</li>
</ul>
<h2 id="refine">#3: Refine</h2>
<p>The refine method is an alternative to map-reduce for handling large document summarization. You split the document as in map-reduce, summarize the first chunk, then ask the LLM to iteratively refine the summary with the text of each following chunk, until the end of the document.</p>
<p>This approach ensures that the summary is both comprehensive and accurate, as each step takes into consideration the context of the previous chunk(s).</p>
<p>You would be using the same @UserMessages illustrated in the two steps of the &quot;Map-reduce&quot; approach: <code>ChunkSummarizationAssistant</code> and <code>FinalSummarizationAssistant</code>.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// process each individual chunk in order  </span>
<span class="c1">// summary refined in each step by adding the summary of the current chunk  </span>
<span class="kt">long</span> <span class="n">start</span> <span class="o">=</span> <span class="nc">System</span><span class="o">.</span><span class="na">currentTimeMillis</span><span class="o">();</span>  
<span class="nc">StringBuilder</span> <span class="n">context</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">StringBuilder</span><span class="o">();</span>  
<span class="n">chunks</span><span class="o">.</span><span class="na">forEach</span><span class="o">(</span><span class="n">segment</span> <span class="o">-&gt;</span> <span class="n">summarizeChunk</span><span class="o">(</span><span class="n">context</span><span class="o">,</span> <span class="n">segment</span><span class="o">.</span><span class="na">text</span><span class="o">()));</span>  
  
<span class="c1">// process the final summary  of the text  </span>
<span class="nc">String</span> <span class="n">output</span> <span class="o">=</span> <span class="n">buildFinalSummary</span><span class="o">(</span><span class="n">context</span><span class="o">.</span><span class="na">toString</span><span class="o">());</span>
</code></pre></div></div>
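<p>The <code>summarizeChunk</code> helper used above is not shown in the snippet. A hypothetical sketch of what it might do, reusing the <code>ChunkSummarizationAssistant</code> from the map-reduce section (the actual codebase may differ):</p>
<pre><code class="language-java">// Hypothetical sketch of the refine step's helper; the real codebase may differ.
// Each call summarizes the next chunk in light of the accumulated context, then
// carries the refined sub-summary forward as the context for the next iteration.
static void summarizeChunk(StringBuilder context, String chunkText) {
    ChunkSummarizationAssistant assistant =
        AiServices.create(ChunkSummarizationAssistant.class, chatModel);
    String refined = assistant.summarize(context.toString(), chunkText);
    context.setLength(0);  // replace the context with the latest refined summary
    context.append(refined);
}
</code></pre>
<p>Whether each step replaces the running context (as above) or appends to it is a design choice; replacing keeps the prompt size bounded as the number of chunks grows.</p>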
<h4 id="pros-2">Pros:</h4>
<ul>
<li>Large documents can be summarized even with LLMs with smaller context windows</li>
<li>Context is preserved between chunks, improving summarization accuracy and completeness</li>
<li>Overlapping chunks can improve summarization accuracy even further</li>
</ul>
<h4 id="cons-2">Cons</h4>
<ul>
<li>Multiple LLM calls are required</li>
<li>Must be executed iteratively and does not lend itself to parallel processing, due to the interdependent nature of the individual chunks and their associated context</li>
<li>Latency is significantly higher than with map-reduce</li>
</ul>
<h2 id="summary">Summary</h2>
<p>In this blog post, we have explored different programmatic summarization techniques for large documents using Google's Gemini LLM, as an advanced use-case for generative AI in enterprise software.</p>
<p>LLM orchestration frameworks do not provide out-of-the-box summarization functionality; therefore, you have a full codebase available, with practical examples demonstrating how to implement these techniques efficiently in Java.</p>
<p>As an enterprise Java developer, you now have powerful options to leverage LLMs and add AI-driven insights to your applications, potentially transforming how you handle document analysis and summarization.</p>
<p>The field of AI-powered document summarization is rapidly evolving, with new models and techniques emerging regularly. Stay tuned for future developments that could further enhance these capabilities.</p>
<p>Don't hesitate to reach out at <a href="https://twitter.com/ddobrin">@ddobrin</a> for feedback, questions or to discuss new summarization techniques.</p>]]></content><author><name>Dan Dobrin</name></author><category term="Gemini" /><category term="GenAI" /><category term="java" /><category term="springboot" /><category term="langchain4j" /><category term="googlecloud" /><summary type="html"><![CDATA[Comparing Prompt Stuffing, Map-Reduce, and Refine summarization patterns for enterprise documents using Java, LangChain4J, and Vertex AI Gemini models.]]></summary></entry><entry><title type="html">Test-Driven Development with Java, Spring Boot, and Duet AI</title><link href="https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI/" rel="alternate" type="text/html" title="Test-Driven Development with Java, Spring Boot, and Duet AI" /><published>2024-02-21T00:00:00+00:00</published><updated>2024-02-21T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI</id><content type="html" xml:base="https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI/"><![CDATA[<h1 id="test-driven-development-with-duet-ai-for-java-amp-spring-boot-applications">Test-driven Development with Duet AI for Java &amp; Spring Boot Applications</h1>
<p>Many times, as a developer, I have been asked to quickly add business functionality to an existing app.
As I might not be familiar with the app, I wondered whether Duet AI could help me gain an understanding of the
app, review the code, then help add new functionality.</p>
<p>I like to follow a test-driven development approach, and I want to explore how Duet AI can help me in my work.</p>
<h2 id="whats-the-ask">What's the Ask</h2>
<p>The Quotes app requires a new endpoint to retrieve book information by book name.
This endpoint should respond on the “/quotes/book/{book}” path.
The ask is to implement this particular endpoint.</p>
<h2 id="before-you-start">Before you start</h2>
<p>I'll be using Duet AI with Cloud Code in IntelliJ on my Mac, however you can use Duet AI from other IDEs supported by Cloud Code,
such as VSCode. If you prefer an online experience, you can use it in the Cloud Shell Editor or Cloud Workstations
from Google Cloud Console.</p>
<p>The <a href="https://cloud.google.com/duet-ai/docs/developers/overview">Duet AI</a> page provides details on how to get setup with Duet AI.</p>
<p>Once we have Duet AI set up, open a Terminal window and install the prerequisites for the work.</p>
<p>The Quotes application uses Java 21. If you do not have it installed, it's an easy setup using <a href="https://sdkman.io/">SDKMan</a>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-s</span> <span class="s2">"https://get.sdkman.io"</span> | bash <span class="o">&amp;&amp;</span> <span class="nb">source</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.sdkman/bin/sdkman-init.sh"</span>
</code></pre></div></div>
<p>Install the latest OpenJDK and confirm it as the default JDK in the install:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sdk <span class="nb">install </span>java 21.0.2-tem <span class="o">&amp;&amp;</span> sdk use java 21.0.2-tem <span class="o">&amp;&amp;</span> java <span class="nt">-version</span>
</code></pre></div></div>
<p>With Java installed, if you wish to test the service in Cloud Run when done, enable the APIs needed to deploy the Quotes app to Cloud Run, if not already enabled:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud services <span class="nb">enable </span>cloudaicompanion.googleapis.com
gcloud services <span class="nb">enable </span>cloudbuild.googleapis.com 
gcloud services <span class="nb">enable </span>run.googleapis.com
gcloud services <span class="nb">enable </span>logging.googleapis.com 
</code></pre></div></div>
<h2 id="the-quotes-app">The Quotes app</h2>
<p>Let's clone the GitHub repo for Quotes and switch to the /services/quotes folder, then open the codebase in the IDE, as well as a new terminal window:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp.git <span class="o">&amp;&amp;</span> <span class="nb">cd </span>serverless-production-readiness-java-gcp/services/quotes
</code></pre></div></div>
<p>We can check which project is currently set, and set it if needed:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># get project </span>
gcloud config list
<span class="c"># set project</span>
gcloud config <span class="nb">set </span>project <span class="o">[</span>PROJECT_ID]
</code></pre></div></div>
<p>We then check that the Quotes app is building correctly and can be tested:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./mvnw package spring-boot:run
</code></pre></div></div>
<p>From a Terminal window, we execute:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl localhost:8083/random-quote <span class="nt">-w</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">"</span>
</code></pre></div></div>
<h2 id="explain-the-app">Explain the app</h2>
<p>Duet AI can be used to explain the app, perform code reviews, and suggest changes and tests.</p>
<h4 id="use-duet-ai-to-explain-the-quotes-app">Use Duet AI to explain the Quotes app</h4>
<p>Let's open the QuoteApplication class, then type the following prompt in the Duet AI Chat window and observe the response:</p>
<p><strong>Prompt</strong></p>
<blockquote>
<p>I want to get details about the QuotesApplication; please provide a detailed overview of the QuotesApplication</p>
</blockquote>
<p><strong>Response</strong></p>
<p>Duet AI will respond with a detailed overview of the QuotesApplication. Remember that responses are non-deterministic;
the responses shown in this blog post are examples and could vary slightly.</p>
<pre><code>Sure. The QuotesApplication is a Spring Boot application that provides a REST API for managing quotes. The application is divided into several components:
* The QuoteController class handles requests for quotes.
* The QuoteService class provides business logic for managing quotes.
* The QuoteRepository class persists quotes to a database.
The QuotesApplication can be deployed to a variety of platforms, including Cloud Run, GKE.

Here is a more detailed overview of each component:
* The QuoteController class handles requests for quotes. The controller has several methods, including:
    * getQuote(): Returns a random quote.
    * getQuotes(): Returns all quotes.
    * getQuoteByAuthor(): Returns all quotes by a given author.
    * createQuote(): Creates a new quote.
    * updateQuote(): Updates an existing quote.
    * deleteQuote(): Deletes an existing quote.
* The QuoteService class provides business logic for managing quotes. The service has several methods, including:
    * getQuote(): Returns a random quote.
    * getQuotes(): Returns all quotes.
    * getQuoteByAuthor(): Returns all quotes by a given author.
    * createQuote(): Creates a new quote.
    * updateQuote(): Updates an existing quote.
    * deleteQuote(): Deletes an existing quote.
* The QuoteRepository class persists quotes to a database. The repository has several methods, including:
    * save(): Saves a quote to the database.
    * findAll(): Returns all quotes from the database.
    * findByAuthor(): Returns all quotes by a given author from the database.
    * delete(): Deletes a quote from the database.
</code></pre>
<p>We might want to review the code in the QuoteController before adding an endpoint. Let's open the QuoteController class,
then type the following prompt in the Duet AI chat window:</p>
<pre><code>Please perform a detailed code review of the QuoteController and suggest any improvements I can bring to the code
</code></pre>
<p>Duet AI will respond with code review items and actually provide suggestions for how to rewrite the code.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Sure, I can <span class="nb">help </span>you with that. Here are a few suggestions <span class="k">for </span>improving the code <span class="k">in </span>QuoteController:
<span class="k">*</span> Use more descriptive variable names. For example, instead of using quotes, you could use allQuotes or listOfQuotes. This would make the code easier to <span class="nb">read </span>and understand.
<span class="k">*</span> Use more comments to explain the code. This would <span class="nb">help </span>other developers understand what the code is doing and how it works.
<span class="k">*</span> Use a more consistent coding style. For example, you could use the same indentation <span class="k">for </span>all code blocks. This would make the code look more organized and easier to read.
<span class="k">*</span> Use a linter to check <span class="k">for </span>errors and inconsistencies. A linter can <span class="nb">help </span>you identify errors and inconsistencies <span class="k">in </span>your code, which can <span class="nb">help </span>you improve its quality.
<span class="k">*</span> Test your code regularly. This will <span class="nb">help </span>you catch bugs early and prevent them from causing problems <span class="k">in </span>production.

Here is an example of how the code could be improved:
...
</code></pre></div></div>
<h2 id="use-tdd-with-duet-ai-to-add-business-logic">Use TDD with Duet AI to add business logic</h2>
<p>The Quotes app is missing an endpoint to retrieve quotes by book name. This endpoint should respond on the “/quotes/book/{book}” path.</p>
<p>We'll use a test-driven approach to add this functionality, starting from writing an application test.</p>
<p>Start by asking Duet AI to recommend which types of tests you need to write:</p>
<blockquote>
<p>Which types of tests should I be writing for the QuoteController</p>
</blockquote>
<p>Duet AI will reply with a list including:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>There are many different types of tests that you can write <span class="k">for </span>the QuoteController class. Some of the most important types of tests include:
<span class="k">*</span> Unit tests: Unit tests are designed to <span class="nb">test </span>the individual components of a program. In the <span class="k">case</span> of the QuoteController class, you could write unit tests to <span class="nb">test </span>the following methods:
  <span class="k">*</span> getAllQuotes<span class="o">(</span><span class="p">)</span>
  <span class="k">*</span> getQuoteById<span class="o">()</span>
  <span class="k">*</span> createQuote<span class="o">()</span>
  <span class="k">*</span> updateQuote<span class="o">()</span>
  <span class="k">*</span> deleteQuote<span class="o">()</span>
<span class="k">*</span> Integration tests: Integration tests are designed to <span class="nb">test </span>the interactions between different components of a program. In the <span class="k">case</span> of the QuoteController class, you could write integration tests to <span class="nb">test </span>the following:
  <span class="k">*</span> The interaction between the QuoteController class and the QuoteService class
  <span class="k">*</span> The interaction between the QuoteController class and the Quote class
<span class="k">*</span> System tests: System tests are designed to <span class="nb">test </span>the entire system, including all of its components and interactions. In the <span class="k">case</span> of the QuoteController class, you could write system tests to <span class="nb">test </span>the following:
  <span class="k">*</span> The ability of the QuoteController class to handle a variety of user requests
  <span class="k">*</span> The performance of the QuoteController class
  <span class="k">*</span> The reliability of the QuoteController class
In addition to these types of tests, you may also want to write some security tests to ensure that the QuoteController class is secure from attacks.

Here are some additional tips <span class="k">for </span>writing tests <span class="k">for </span>the QuoteController class:
<span class="k">*</span> Use a <span class="nb">test </span>framework that supports Java. There are many different <span class="nb">test </span>frameworks available, such as JUnit, TestNG, and Mockito.
<span class="k">*</span> Write your tests <span class="k">in </span>a modular way. This will make it easier to maintain and update your tests <span class="k">in </span>the future.
<span class="k">*</span> Use descriptive <span class="nb">test </span>names. This will make it easier to understand what your tests are doing.
<span class="k">*</span> Use assertions to verify the results of your tests. This will <span class="nb">help </span>you to catch errors early and prevent them from causing problems <span class="k">in </span>production.
By following these tips, you can write comprehensive and effective tests <span class="k">for </span>the QuoteController class.
</code></pre></div></div>
<p>We might wonder whether we should also add network tests, and can prompt Duet AI about it:
<code>Should I add tests for network failures</code></p>
<p>Duet AI will provide a detailed response starting along the lines of:</p>
<pre><code>Yes, you should add tests for network failures. Network failures can occur for a variety of reasons, such as:
* The client or server is offline
* The network connection is interrupted
* The network is congested
...
</code></pre>
<p>Let's open the <strong><em>/src/test/java/com/example/quotes</em></strong> folder and observe that several test classes have already been created:</p>
<ul>
<li>QuotesControllerTest</li>
<li>QuotesRepositoryTest</li>
<li>QuotesApplicationNetworkFailuresTests</li>
</ul>
<p>To add the quote retrieval by book name functionality, start writing code in true TDD fashion by adding tests
to both the QuotesControllerTest (for the endpoint) and QuotesRepositoryTest (for data retrieval from the db).</p>
<p><strong>Step 1: Generate the web controller test first</strong>
Open the <strong>QuotesControllerTest</strong> class in the <strong><em>com.example.quotes.web</em></strong> package</p>
<p>In the code, add the following comment, say towards the end of the file, then press Enter:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Answer as a Software Engineer with expertise in Java. Create a test for the QuotesController for a method getByBook which responds at the HTTP endpoint /quotes/book/{book} and retrieves a quote from the book The Road</span>
</code></pre></div></div>
<p>Click the Duet AI SmartAction and select “Generate code”, or use the keyboard shortcut to generate code (CTRL+Enter in VSCode, OPTION+Shift+\ in IntelliJ).
Duet AI will make the suggestion:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nd">@Test</span>
  <span class="nd">@DisplayName</span><span class="o">(</span><span class="s">"Test returns Quotes by Book"</span><span class="o">)</span>
  <span class="kt">void</span> <span class="nf">shouldReturnQuoteByBook</span><span class="o">()</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
    <span class="n">mockMvc</span><span class="o">.</span><span class="na">perform</span><span class="o">(</span><span class="n">get</span><span class="o">(</span><span class="s">"/quotes/book/The Road"</span><span class="o">))</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">status</span><span class="o">().</span><span class="na">isOk</span><span class="o">())</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">content</span><span class="o">().</span><span class="na">contentType</span><span class="o">(</span><span class="nc">MediaType</span><span class="o">.</span><span class="na">APPLICATION_JSON</span><span class="o">))</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">jsonPath</span><span class="o">(</span><span class="s">"$[0].book"</span><span class="o">,</span> <span class="nc">Matchers</span><span class="o">.</span><span class="na">equalTo</span><span class="o">(</span><span class="s">"The Road"</span><span class="o">)));</span>
  <span class="o">}</span>
</code></pre></div></div>
<p>If the suggestion meets your requirements, accept it with Tab or by clicking Accept.</p>
<p>In the Terminal window, run the command:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./mvnw clean verify
</code></pre></div></div>
<p>We observe that the test fails, as expected, with a ‘404’ error, as the business logic has not been implemented:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span><span class="no">ERROR</span><span class="o">]</span> <span class="nl">Failures:</span> 
<span class="o">[</span><span class="no">ERROR</span><span class="o">]</span>   <span class="nc">QuotesControllerTest</span><span class="o">.</span><span class="na">shouldReturnQuoteByBook</span><span class="o">:</span><span class="mi">94</span> <span class="nc">Status</span> <span class="nl">expected:</span><span class="o">&amp;</span><span class="n">lt</span><span class="o">;</span><span class="mi">200</span><span class="o">&gt;</span> <span class="n">but</span> <span class="nl">was:</span><span class="o">&amp;</span><span class="n">lt</span><span class="o">;</span><span class="mi">404</span><span class="o">&gt;</span>
<span class="o">...</span>
</code></pre></div></div>
<p><strong>Step 2: Generate controller code</strong></p>
<p>Add the missing controller method getByBook. Open the QuoteController class. Add the following comment towards the end of the class:</p>
<pre><code>// generate a getByBook method which responds at the HTTP endpoint /quotes/book/{book} and retrieves a quote by book name; use the QuoteService class to retrieve the book by name, as a String
</code></pre>
<p>Duet AI will respond with a code block along the lines of:</p>
<pre><code>    @GetMapping(&quot;/quotes/book/{book}&quot;)
    public ResponseEntity&lt;List&lt;Quote&gt;&gt; quoteByBook(@PathVariable(&quot;book&quot;) 
                String book) {
        try {
            List&lt;Quote&gt; quotes = quoteService.getByBook(book);

            if(!quotes.isEmpty()){
                return new  ResponseEntity&lt;List&lt;Quote&gt;&gt;(quotes, 
                                                        HttpStatus.OK);
            } else {
                return 
                   new ResponseEntity&lt;List&lt;Quote&gt;&gt;(HttpStatus.NOT_FOUND);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
            return 
          new ResponseEntity&lt;List&lt;Quote&gt;&gt;(HttpStatus.INTERNAL_SERVER_ERROR);
        }
    }
</code></pre>
<p>Note that the code is missing the ‘getByBook()’ implementation in the QuoteService class, which leads you to the next step of the implementation.</p>
<p><strong>Step 3: Now generate test methods for repository access</strong></p>
<p>Open the QuoteService class and observe that the getByBook method is missing.
Generate a test for the service class first, then add methods to access the repository, and test it out.</p>
<p>Open the QuotesRepositoryTest class and add the following comment towards the bottom of the class:</p>
<pre><code>// generate a unit test for the getByBook method in the QuoteService; create a Quote in the QuoteService first then test the getByBook method against the new Quote
</code></pre>
<p>Observe that the generated code looks like:</p>
<pre><code>  @Test
  @DisplayName(&quot;Get a quote by book - failed&quot;)
  void testGetQuoteByBookFailed(){
    var quotes = this.quoteService.getByBook(&quot;The Glass Menagerie&quot;);
    assertThat(quotes).isEmpty();
  }
</code></pre>
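<p>Note that the generated test covers only the not-found path. A happy-path companion test is worth adding as well; here is a sketch, assuming the Quote entity exposes standard setters and the QuoteService provides the createQuote method listed in the earlier overview:</p>
<pre><code>  @Test
  @DisplayName(&quot;Get a quote by book - success&quot;)
  void testGetQuoteByBook(){
    // create and persist a quote first, then retrieve it by book name
    var quote = new Quote();
    quote.setAuthor(&quot;Cormac McCarthy&quot;);
    quote.setBook(&quot;The Road&quot;);
    quote.setQuote(&quot;We carry the fire.&quot;);
    this.quoteService.createQuote(quote);

    var quotes = this.quoteService.getByBook(&quot;The Road&quot;);
    assertThat(quotes).isNotEmpty();
    assertThat(quotes.get(0).getBook()).isEqualTo(&quot;The Road&quot;);
  }
</code></pre>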
<p>With the tests in place, it is time to implement the missing functionality in the QuoteRepository and the QuoteService classes.</p>
<p>The ask is to retrieve a Quote by book name; therefore, we need to add a findByBook method to the JPA repository class, QuoteRepository, then allow the QuoteService to use this method.</p>
<p>Open the QuoteRepository class and add this comment towards the end of the class:</p>
<pre><code>// generate a find by book method which retrieves a quote by book name; use the native query syntax
</code></pre>
<p>Duet AI will generate code along the lines of:</p>
<pre><code>  @Query( nativeQuery = true, value =
            &quot;SELECT id,quote,author,book FROM quotes WHERE book = :book&quot;)
  List&lt;Quote&gt; findByBook(String book);
</code></pre>
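<p>As a side note on the design choice: since QuoteRepository is a Spring Data JPA repository, the same result can be achieved without a native query at all, by letting Spring Data derive the query from the method name (assuming the Quote entity has a book field); a minimal sketch:</p>
<pre><code>  // Spring Data JPA derives &quot;WHERE book = :book&quot; from the method name
  List&lt;Quote&gt; findByBook(String book);
</code></pre>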
<p>With the repository method in place, generate the missing link, the getByBook method in the service class and test it out.
Open the QuoteService class and add the comment:</p>
<pre><code>// add get by book method, use the QuoteRepository
</code></pre>
<p>Duet AI will generate code along the lines of:</p>
<pre><code>  public List&lt;Quote&gt; getByBook(String book) {
    return quoteRepository.findByBook(book);
  }
</code></pre>
<p><strong>Done</strong> with coding! Let’s test the result.</p>
<h2 id="test-your-code">Test your code</h2>
<p>We'll run ALL tests from the Terminal:</p>
<pre><code>./mvnw verify
</code></pre>
<p>The Quotes app can be started with the following command:</p>
<pre><code>./mvnw spring-boot:run
</code></pre>
<p>Ask Duet AI to generate cURL commands to test the newly added functionality. Switch to a different terminal window to run a cURL command.</p>
<p>In the Duet AI chat window, we can prompt Duet AI to generate a test command:</p>
<pre><code>Generate a curl command for the /quotes/book endpoint for a local environment at port 8083 for the book &quot;The Lord of the Rings&quot;
</code></pre>
<p>Duet AI will generate the cURL command, which we can run:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Lord%20of%20the%20Rings
</code></pre>
<p>While we can assume that the command has not found a matching book, we want to see the HTTP response code, so we refine the prompt:</p>
<pre><code>Update the curl command to print the HTTP response code
</code></pre>
<p>Run the updated command generated by Duet AI, which should return a 404:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Lord%20of%20the%20Rings -o /dev/null -s -w '%{http_code}\n'
</code></pre>
<p>Now refine the prompt to generate a successful command:</p>
<pre><code>Update the command again to use the book &quot;The Road&quot;
</code></pre>
<p>Run the updated command generated by Duet AI, which should return a 200:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Road -o /dev/null -s -w '%{http_code}\n'
</code></pre>
<p><strong>Done</strong> with testing!</p>
<h2 id="test-our-code-in-the-google-cloud-run">Test our code in the Google Cloud Run</h2>
<p>To deploy and test the code in Cloud Run, follow the detailed Build and Deploy instructions in the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/services/quotes#readme">README</a>.</p>
<h2 id="what-have-we-learned-today">What have we learned today</h2>
<p>In this blog post, I started from an existing app and used Duet AI to help me get an overview of the codebase,
perform code reviews, generate test code and business logic at different levels of functionality and, last but not least, generate
cURL commands to test the added functionality. All of this while following my preferred test-driven development approach.</p>
<p>It saved me time (and search effort) beyond generating code and performing analysis: Duet AI was constantly available to generate the right cURL command for testing when I forgot a parameter, or to suggest areas of code improvement, removing many of these small interruptions from my workflow.</p>
<p>For any questions or feedback, feel free to contact me on Twitter/X <a href="https://twitter.com/ddobrin">@ddobrin</a>.</p>]]></content><author><name>Dan Dobrin</name></author><category term="DuetAI" /><category term="GenAI" /><category term="java" /><category term="springboot" /><category term="cloudrun" /><category term="googlecloud" /><summary type="html"><![CDATA[How Duet AI accelerates TDD workflows in Java and Spring Boot — from understanding unfamiliar codebases to implementing and testing new endpoints on Cloud Run.]]></summary></entry><entry><title type="html">The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden</title><link href="https://ddobrin.github.io/2024/02/13/Future-of-Java/" rel="alternate" type="text/html" title="The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden" /><published>2024-02-13T00:00:00+00:00</published><updated>2024-02-13T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/02/13/Future-of-Java</id><content type="html" xml:base="https://ddobrin.github.io/2024/02/13/Future-of-Java/"><![CDATA[<h1 id="the-future-of-java-performance-in-cloud-run-native-java-crac-and-project-leyden">The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden</h1>
<p>Applications run better on Cloud Run if they start fast, with instant peak performance and lower CPU/memory consumption.
You’ll save on costs if the application needs less time and fewer resources to run, handles more requests with less CPU/memory, and
achieves better performance.</p>
<p>We love to use Java for its stability, performance and portability; however, we all know that Java and its various web frameworks aren't known
for fast startup or low resource consumption.</p>
<p>Don’t worry: I’ll explore how this all changes with three technologies geared towards improving Java app runtime efficiency in serverless environments.</p>
<ul>
<li>Native Java Images, with <a href="https://www.graalvm.org/">GraalVM</a></li>
<li>JVM Checkpoint and Restore, with <a href="https://wiki.openjdk.org/display/crac/Main">CRaC</a></li>
<li>Upcoming OpenJDK runtime efficiency project, <a href="https://openjdk.org/projects/leyden/">Project Leyden</a></li>
</ul>
<p>This is a fast-emerging space where nothing is completely settled at this time, with multiple paths to optimizing Java runtime efficiency for
your Cloud Run applications taking shape. Java performance optimization in Cloud Run is the sum of multiple factors: the cloud runtime,
the JVM, the web frameworks/dependencies and, last but not least, your application code.</p>
<p>In this blog post we’ll dive into these optimization options, starting from the current state.
My goal is to help you identify the best options for <strong><em>your</em></strong> <em>application running in production in Cloud Run</em>.</p>
<h2 id="sample-apps">Sample apps</h2>
<p>The blog post is supported by Spring Boot sample apps, using the latest <a href="https://openjdk.org/projects/jdk/21/">Java 21</a> LTS,
<a href="https://github.com/spring-projects/spring-boot">Spring Boot</a> and <a href="https://github.com/spring-projects/spring-framework/wiki/What&#x27;s-New-in-Spring-Framework-6.x">Spring Framework</a> versions, including build, test, deployment and runtime guidelines for <a href="https://cloud.google.com/run/">Google Cloud Run</a>.</p>
<p>Start by cloning the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">Git repository supporting this blog</a>
and <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main?tab=readme-ov-file#getting-started">setting up your environment</a>.
Follow along with instructions on building the Quotes service as a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">JIT image</a>,
a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">GraalVM Native image</a>
and a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/runtimes/crac/quotes-crac/README.md">CRaC image</a>, and deploying them to Cloud Run.</p>
<p>A super-early <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/runtimes/project-leyden/README.md">Project Leyden</a> sample has also been provided.</p>
<h2 id="key-terms">Key terms</h2>
<p>Here are a few concepts that I’ll be referring to:</p>
<ul>
<li>Startup latency: time to get to processing the first request in the app</li>
<li>Warmup: time it takes the app to reach peak performance</li>
<li>Peak performance: level of performance where the Java app handles the highest possible workload with the lowest possible latency and resource consumption</li>
<li>RSS memory: amount of physical memory actively used by a Java process</li>
</ul>
<h2 id="strongplain-jit-images-openjdkstrong"><strong>Plain JIT Images (OpenJDK)</strong></h2>
<p>In traditional Java applications, source code is compiled into bytecode and packaged into a Jar archive (and containerized).
The JVM uses a bytecode interpreter to execute the program on the host runtime. The <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/jjdev/Oracle-JVM-JIT.html#GUID-23D5BA60-A2B3-45F9-93DF-81A3D971CA50">JIT compiler</a> translates frequently executed bytecode (hotspots) into machine code to improve peak performance.</p>
<p>It is important to understand this process, as JIT is the default compilation mode in the JVM; we'll use it as the baseline against which to measure improvements.</p>
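<p>To see the JIT compiler at work, you can run a small program with the standard HotSpot flag <code>-XX:+PrintCompilation</code> and watch frequently executed methods get compiled. A minimal sketch (the class name, method and iteration counts are arbitrary):</p>
<pre><code>// JitDemo.java: run with java -XX:+PrintCompilation JitDemo
public class JitDemo {

  // A deliberately &quot;hot&quot; method: invoked often enough to cross the
  // JIT threshold and be compiled from bytecode to machine code
  static long hotLoop(int n) {
    long sum = 0;
    for (int i = 0; i &lt; n; i++) {
      sum += i * 31L;
    }
    return sum;
  }

  public static void main(String[] args) {
    long total = 0;
    for (int i = 0; i &lt; 20_000; i++) {
      total += hotLoop(1_000);
    }
    // hotLoop will show up in the -XX:+PrintCompilation output
    System.out.println(&quot;total = &quot; + total);
  }
}
</code></pre>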
<p><strong>Fast build time + less initial optimization → slower startup + higher resource consumption</strong></p>
<p><img src="/assets/images/Image1.png" alt="Image1" /></p>
<p>Note the start-up time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes service</a> JIT container image:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> docker run <span class="nt">--rm</span> <span class="nt">-p8080</span>:8083 quotes
Starting QuotesApplication v1.0.0 using Java 21 with PID 1
...
Tomcat started on port 8083 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Started QuotesApplication <span class="k">in </span>3.224 seconds <span class="o">(</span>process running <span class="k">for </span>3.466<span class="o">)</span>
</code></pre></div></div>
<p>While ~3 seconds might sound acceptable for many applications, in the real world, when you run your enterprise-grade apps in the cloud, this startup time might be measured in tens of seconds or even minutes. Scale-to-zero could become unacceptable, while scale-out speed would be severely impacted.</p>
<p><strong>JIT Images at runtime in Cloud Run</strong>
<a href="https://cloud.google.com/blog/products/serverless/announcing-startup-cpu-boost-for-cloud-run--cloud-functions">CPU Boost</a> in Cloud Run
is a great feature for improving the cold startup time of JIT images: it dynamically allocates more CPU to your container during startup,
with some applications observing a 50% reduction in startup time. It could be sufficient to meet the SLOs for your application.</p>
<h2 id="strongnext-level-aot-performance-with-native-java-images-graalvmstrong"><strong>Next-level AOT performance with Native Java Images (GraalVM)</strong></h2>
<p>Native Images follow the same process as above; however, the application is transformed ahead-of-time (AOT) into a native executable at build time, for the specific OS and machine architecture of the runtime environment, and can run without the need of a full JVM.</p>
<p>Building Native Images leveraging AOT compilation makes <a href="https://docs.oracle.com/en/graalvm/enterprise/21/docs/reference-manual/native-image/basics/#static-analysis">closed-world assumptions</a> (at build time) about all the classes required by the
application (at runtime). This static analysis is time-consuming. The resulting application image contains only the classes required
to run the app, with no further JIT optimizations at runtime. This results in smaller container images, faster and more efficient deployment,
and a reduced security attack surface.</p>
<p>The static analysis starts from your application entry point and includes any class reachable across source code,
dependent libraries and JDK classes. If classes can’t be reached due to Java’s dynamic features, say reflection, proxying,
serialization or resource access, they will not be included in the image and have to be supplied externally through configuration,
called <a href="https://medium.com/graalvm/enhancing-3rd-party-library-support-in-graalvm-native-image-with-shared-metadata-9eeae1651da4">shared metadata</a>.</p>
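<p>With Spring Boot 3, such metadata can also be contributed directly in code through the RuntimeHints API. Here is a minimal sketch, assuming a Quote class that is reached only via reflection (the configuration class names are illustrative):</p>
<pre><code>import org.springframework.aot.hint.MemberCategory;
import org.springframework.aot.hint.RuntimeHints;
import org.springframework.aot.hint.RuntimeHintsRegistrar;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.ImportRuntimeHints;

// Registers reflection metadata so the GraalVM static analysis keeps
// the Quote constructors and public methods in the native image
@Configuration
@ImportRuntimeHints(QuoteRuntimeHints.class)
class QuoteNativeConfiguration { }

class QuoteRuntimeHints implements RuntimeHintsRegistrar {
  @Override
  public void registerHints(RuntimeHints hints, ClassLoader classLoader) {
    hints.reflection().registerType(Quote.class,
        MemberCategory.INVOKE_DECLARED_CONSTRUCTORS,
        MemberCategory.INVOKE_PUBLIC_METHODS);
  }
}
</code></pre>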
<p><strong>Slow build time + higher optimization → super-fast startup + lower resource consumption</strong></p>
<p><img src="/assets/images/Image2.png" alt="Image1" /></p>
<p><strong>Why use native images with GraalVM?</strong></p>
<p><img src="/assets/images/Image3.png" alt="Image1" /></p>
<p><strong>Any GraalVM trade-offs?</strong></p>
<p><img src="/assets/images/Image4.png" alt="Image1" /></p>
<p><strong>Warmup for peak performance with GraalVM</strong></p>
<p>Let’s note that peak performance was an additional trade-off of GraalVM usage, as there is no just-in-time optimization at runtime.</p>
<p>The recent release of the <a href="https://www.graalvm.org/downloads/">Oracle GraalVM</a> distribution under the GraalVM Free Terms and Conditions <a href="https://medium.com/graalvm/a-new-graalvm-release-and-new-free-license-4aab483692f5">license</a>
(see <a href="https://www.oracle.com/java/technologies/javase/jdk-faqs.html#GraalVM-licensing">restrictions</a>) addresses this aspect with the
introduction of <a href="https://www.graalvm.org/latest/reference-manual/native-image/guides/optimize-native-executable-with-pgo/">profile-guided optimization (PGO)</a>
for peak performance and the availability of <a href="https://www.graalvm.org/latest/reference-manual/native-image/optimizations-and-performance/MemoryManagement/#g1-garbage-collector">G1GC</a> in GraalVM.</p>
<p><strong>Note</strong> the startup time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes service</a> Native Java container image:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> docker run <span class="nt">--rm</span> <span class="nt">-p8080</span>:8083 quotes-native
Starting AOT-processed QuotesApplication using Java 21 with PID 1
...
Tomcat started on port 8083 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Started QuotesApplication <span class="k">in </span>0.392 seconds <span class="o">(</span>process running <span class="k">for </span>0.399<span class="o">)</span>
</code></pre></div></div>
<p>Observe the super-fast startup of the Quotes app, a 10x startup time improvement.
Note that the improvement would be even more substantial for real-world applications, due to the larger number of dependencies which can be optimized ahead of time.</p>
<p><strong>Native Images in Cloud Run</strong>
Native Java images generally benefit only to a small degree from enabling CPU Boost.
From a startup perspective, they could require as little as a single CPU, as startup is a plain container loading effort without any reflection, dynamic proxying, deserialization or other Java-specific operations.</p>
<p>The running container requires less memory at runtime than a JIT image, as fewer classes are loaded, and consumes less CPU as well, since no further optimizations are performed at runtime.</p>
<h2 id="strongjvm-checkpoint-and-restore-cracstrong"><strong>JVM Checkpoint and Restore (CRaC)</strong></h2>
<p>CRaC is an emerging OpenJDK project focused on improving runtime efficiency in Java applications.
It is based on <a href="https://github.com/checkpoint-restore/criu">CRIU</a>,
a project that implements checkpoint/restore functionality on Linux, which means CRaC images can run only on Linux.</p>
<p>Building a CRaC application follows the same compile process as JIT images. At runtime, the Java application is started normally, optionally executing a number of requests. At this point, a memory snapshot of the running Java application is triggered (checkpointed), stored to disk as a new image, then restored super-fast when the application is restarted. Thus, CRaC can save a significant amount of startup time, especially for large and complex applications.</p>
<p>As a developer, be aware that file descriptors, sockets and pools are objects in memory, which you have to gracefully close, then restore according to the CRaC lifecycle.</p>
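<p>In code, this is typically done by implementing the Resource callbacks from the org.crac library and registering with the global context. A minimal sketch, with an illustrative connection pool standing in for any resource holding open sockets or file descriptors:</p>
<pre><code>import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;

// Closes pooled connections before the checkpoint and reopens them
// after restore, following the CRaC lifecycle
public class PooledConnectionResource implements Resource {

  public PooledConnectionResource() {
    // register this resource with the global CRaC context
    Core.getGlobalContext().register(this);
  }

  @Override
  public void beforeCheckpoint(Context&lt;? extends Resource&gt; context) {
    // gracefully close sockets, file descriptors, pools before the snapshot
    closePool();
  }

  @Override
  public void afterRestore(Context&lt;? extends Resource&gt; context) {
    // re-establish connections after the snapshot is restored
    openPool();
  }

  private void closePool() { /* illustrative */ }
  private void openPool() { /* illustrative */ }
}
</code></pre>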
<p>To deploy your CRaC container image to Cloud Run, build the regular Docker image with a CRaC OpenJDK and start it (see the Quotes CRaC sample). Checkpoint the running app and store the resulting container image in Artifact Registry, then deploy it to Cloud Run. You can then update your production application configuration at restoration time in Cloud Run, preventing any potential leak of sensitive information, say your production database password. Let’s not forget that everything loaded in memory at startup will be serialized to the snapshot files!</p>
<p>CRaC OpenJDK is being developed by Azul, which has made it available under an open-source license. CRaC is not yet supported by all Java frameworks and libraries. Spring Boot, used for all samples in this blog post, introduced CRaC support with the 3.2 GA version.</p>
<p><strong>Fast build time + less initial optimization → super-fast startup + higher resource consumption</strong></p>
<p><img src="/assets/images/Image5.png" alt="Image1" /></p>
<p><strong>Why use Project CRaC?</strong></p>
<p><img src="/assets/images/Image6.png" alt="Image1" /></p>
<p><strong>Any Project CRaC trade-offs?</strong></p>
<p><img src="/assets/images/Image7.png" alt="Image1" /></p>
<p><strong>Warmup for peak performance with CRaC</strong></p>
<p>While peak performance of applications checkpointed/restored using CRaC follows the same process as JIT based applications,
with Hotspot optimization at runtime, instant peak performance is dependent on when the checkpoint has been taken.</p>
<p>Snapshots taken in a CI/CD pipeline may only capture web framework and app initialization.
Checkpointing the application after requests have been executed may allow the application to reach instant peak performance.</p>
<p>Checkpointing while secrets are held in memory can leak sensitive data, as the secrets will be serialized into the snapshot.
To mitigate this, checkpoint right after application startup and <a href="https://docs.spring.io/spring-framework/reference/6.1/integration/checkpoint-restore.html#_automatic_checkpointrestore_at_startup">refresh the context</a>.</p>
<p><strong>Observe</strong> the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac/README.md">Quotes service</a> CRaC Java image checkpoint process:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ ./checkpoint.sh

...
Using CRaC enabled JDK /bin/zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz
<span class="o">[</span>INFO] Building quotes 1.0.0
...
<span class="o">=&gt;</span> <span class="o">[</span>internal] load metadata <span class="k">for </span>docker.io/library/ubuntu:22.04                                                                              
<span class="o">=&gt;</span> https://.../zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz                                                        
<span class="o">=&gt;</span> <span class="o">[</span>1/6] FROM docker.io/library/ubuntu:22.04                                                                                             
<span class="o">=&gt;</span> CACHED <span class="o">[</span>2/6] ADD /bin/zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz
/opt/jdk/openjdk.tar.gz            
...
<span class="o">=&gt;</span> <span class="o">[</span>5/6] COPY target/quotes-crac-1.0.0.jar /opt/app/quotes-crac-1.0.0.jar                                                                
<span class="o">=&gt;</span> <span class="o">[</span>6/6] COPY src/scripts/entrypoint.sh /opt/app/entrypoint.sh                                                                           
<span class="o">=&gt;</span> exporting to image                                                                                                                    
<span class="o">=&gt;</span> <span class="o">=&gt;</span> exporting layers                                                                                                                   
<span class="o">=&gt;</span> <span class="o">=&gt;</span> writing image
<span class="o">=&gt;</span> <span class="o">=&gt;</span> naming to docker.io/library/quotes-crac:builder                                                                                    
...
Please <span class="nb">wait </span>during checkpoint creation...
sha256:1e48fde5ae0b39a4a458d659b34e491c9fe965fd757e0b861c61fe891a4796fb
Image: 28982acffecd
</code></pre></div></div>
<p>You can follow the checkpointing process and see the Java app being built, containerized, started then a new image being created during
checkpointing with a new Docker image digest.</p>
<p><strong>Note</strong> the startup time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac/README.md">Quotes service</a> CRaC Java image restore:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ ./restore.sh
Restarting Spring-managed lifecycle beans after JVM restore
Tomcat started on port 8080 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Spring-managed lifecycle restart completed <span class="k">in </span>13 ms <span class="o">(</span>restored JVM running <span class="k">for </span>64 ms<span class="o">)</span>
Completed initialization <span class="k">in </span>2 ms
</code></pre></div></div>
<p>Observe that the startup is super-fast, similar to native images, orders of magnitude faster than JIT images.</p>
<p><strong>CRaC Images in Cloud Run</strong></p>
<p>CRaC Java images benefit to some degree from enabling CPU Boost in Cloud Run, depending on when the checkpoint was taken.</p>
<p>If the app has been checkpointed in a CI/CD pipeline, with the application context refreshed at runtime during startup, it will benefit from more CPU during context initialization. If the app has been checkpointed in a production environment, fully warmed up, CPU Boost would not yield any benefit (remember the security limitation above).</p>
<p>At runtime, after the startup phase, CRaC-based apps consume the same amount of CPU and memory as JIT images, as they run in a similar manner on a regular JVM.</p>
<h2 id="a-peek-into-the-future-of-openjdk-with-project-leyden">A peek into the future of OpenJDK with Project Leyden</h2>
<p>I compared two technologies to improve Java application runtime efficiency, each providing significant benefits, but also presenting non-trivial trade-offs. Is there a (future) alternative that would allow developers to balance static AOT with dynamic JIT?</p>
<p><a href="https://openjdk.org/projects/leyden/">Project Leyden</a> is a new OpenJDK project with the same goal of improving startup/warmup time and
lowering the footprint of Java applications. The focus of Project Leyden is to allow selectively shifting and constraining computation
(<a href="https://www.youtube.com/watch?v=O1Oz2-AXKKM">talk</a>, <a href="https://openjdk.org/projects/leyden/notes/03-toward-condensers">concepts</a>),
while employing the  concept of meaning preservation, i.e. the resulting image has the same meaning as the original, without side effects.</p>
<p>While the project is in its early stages, the Java Platform Group is looking into experimenting with and combining various optimization options.
One very promising “early” optimization is the combination of <a href="https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91">Class Data Sharing</a> (CDS) from the JDK
with <a href="https://docs.spring.io/spring-framework/reference/core/aot.html">Spring AOT</a>, into an <a href="https://github.com/openjdk/leyden/tree/premain/test/hotspot/jtreg/premain/javac_new_workflow">experiment</a> which shows a 15% startup improvement.</p>
<h2 id="additional-contributors-to-runtime-efficiency">Additional contributors to runtime efficiency</h2>
<p>Runtime efficiency is achieved from faster startup time, smaller container images and lower CPU and memory consumption (<a href="https://en.wikipedia.org/wiki/Resident_set_size#">RSS</a>).</p>
<p>Observe that Native Java images are smaller than CRaC images, while RSS memory consumption for Native Images is significantly lower than for CRaC images (which are on par with a regular JIT image).</p>
<pre><code class="language-shelll">❯ docker images | grep quotes
quotes                 latest          231MB
quotes-native          latest          198MB
quotes-crac            checkpoint      994MB

&gt; ps aux | grep quotes
USER     PID     RSS     COMMAND
&lt;user&gt;   47831   541964  java -jar target/quotes-1.0.0.jar
&lt;user&gt;   47901   143848  ./target/quotes
</code></pre>
<h2 id="a-summary-of-runtime-efficiency-optimization-options">A summary of runtime efficiency optimization options</h2>
<p>Let’s summarize all the concepts addressed in this blog post, starting with JIT images and the added optimization technologies available to you when running apps in Cloud Run.</p>
<p><img src="/assets/images/Image8.png" alt="Image1" /></p>
<h2 id="which-technology-should-you-use">Which technology should you use?</h2>
<p>Native Java with GraalVM, CRaC and Project Leyden all share the same goal: improving the runtime efficiency of Java applications running on scale-to-zero platforms, by improving start-up and warm-up time, and reducing resource consumption.</p>
<p><strong>Native Java with GraalVM</strong> and <strong>CRaC</strong> both offer excellent start-up performance,
typically up to 50x faster than the start-up time on a regular JVM. Peak performance with Java 21
and profile-guided optimization puts both solutions on par.</p>
<p><strong>Project Leyden</strong> is a future project that aims to combine the best of both Native Java and CRaC. Leyden will use a new intermediate representation that is designed to be both performant and easy to develop for.</p>
<p>In short:</p>
<ul>
<li><strong>GraalVM technology is production-ready at this time</strong>, with support across all major Java web frameworks with the lowest startup time, resource consumption and security attack surface</li>
<li><strong>CRaC is a very good emerging initiative</strong>, with full production readiness dependent on the availability of CRaC lifecycle support in various web frameworks and dependent libraries</li>
<li><strong>Project Leyden</strong> is a future project that has the potential to combine the best of both worlds: static ahead-of-time (AOT) with dynamic just-in-time (JIT)</li>
</ul>
<h2 id="next-steps">Next steps</h2>
<ul>
<li>Inspect the codebase for the Quotes service illustrated throughout this blog post,
build the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes Native Java</a>
and <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac">Quotes CRaC</a>
versions and deploy them to Cloud Run</li>
<li>Peek into the future with OpenJDK’s <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/java21/runtimes/project-leyden">Project Leyden</a></li>
<li>For a general overview of Java optimization in Cloud Run, watch the <a href="https://cloudonair.withgoogle.com/events/developer-stories-road-to-java-on-gcp-serverless">Developer Stories: Road to Java on GCP Serverless - What can trip you up?</a> session</li>
</ul>
<p>For any questions or feedback, feel free to contact me on Twitter/X <a href="https://twitter.com/ddobrin">@ddobrin</a>.</p>]]></content><author><name>Dan Dobrin</name></author><category term="native" /><category term="graalvm" /><category term="crac" /><category term="leyden" /><category term="java" /><category term="serverless" /><category term="cloudrun" /><category term="googlecloud" /><summary type="html"><![CDATA[A deep-dive comparison of GraalVM Native Images, CRaC checkpoint/restore, and Project Leyden for improving Java startup time and resource efficiency in serverless environments.]]></summary></entry></feed>