<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://ddobrin.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ddobrin.github.io/" rel="alternate" type="text/html" /><updated>2026-04-30T20:29:05+00:00</updated><id>https://ddobrin.github.io/feed.xml</id><title type="html">Dan Dobrin</title><subtitle>Practical deep dives into Java, Gemini, LangChain4J, and modern cloud engineering.</subtitle><author><name>Dan Dobrin</name></author><entry><title type="html">The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code</title><link href="https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code/" rel="alternate" type="text/html" title="The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code" /><published>2025-12-17T00:00:00+00:00</published><updated>2025-12-17T00:00:00+00:00</updated><id>https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code</id><content type="html" xml:base="https://ddobrin.github.io/2025/12/17/The-Senior-Engineer-vs-The-Speedster-How-AI-Tackles-Advent-of-Code/"><![CDATA[<h1 id="the-senior-engineer-vs-the-speedster-how-ai-tackles-advent-of-code">The Senior Engineer vs. The Speedster: How AI Tackles Advent of Code</h1>
<p>Every December, developers from around the world tackle the <strong><a href="https://adventofcode.com/">Advent of Code</a></strong> challenges — a much-loved tradition of daily programming puzzles that grow steadily harder, ranging from warm-up exercises to serious algorithmic challenges.</p>
<p>This year, having early access to Gemini 3 Pro and Flash, I decided to take a totally different approach to solving these puzzles: delegate to the Gemini models and run an experiment that combines the creative puzzles of Advent of Code with the power of state-of-the-art LLMs. I was curious how the different models would compare.</p>
<h2 id="tldr">TL;DR</h2>
<p><strong>A Focused Goal</strong></p>
<ul>
<li>use state-of-the-art models: <em><strong><a href="https://ai.google.dev/gemini-api/docs/models#gemini-3-pro">Gemini 3 Pro Preview</a></strong></em> and <em><strong><a href="https://ai.google.dev/gemini-api/docs/models#gemini-3-flash">Gemini 3 Flash Preview</a></strong></em></li>
<li><strong>employ only minimalistic</strong> prompting</li>
<li>give each model the same puzzle and dataset with a simple prompt, for all 12 days (a format different from previous years, now with 2 puzzles per day)</li>
<li>validate that each model can solve it correctly in a <strong>one-shot</strong> approach, no hints, no intervention, for <strong>all</strong> 12 days</li>
<li>compare not just <em><strong>whether</strong></em> the models solve it, but <em><strong>how</strong></em> they solve it, as a follow-up</li>
<li>Gemini CLI is used for working with both models</li>
</ul>
<p><strong>Results</strong></p>
<ul>
<li><em><strong>BOTH</strong></em> models successfully solve <em><strong>ALL 24 puzzles</strong></em> in one shot, as I hoped they would!</li>
<li>the results reveal some interesting differences in what I'd call &quot;engineering maturity&quot;</li>
</ul>
<p><strong>Note to the Reader</strong></p>
<ul>
<li>experimenting with all the puzzles is left to the reader; the last few have even more interesting solutions</li>
<li>this is not an exhaustive, complex prompting exercise; it is just a snapshot of what can be done with LLMs today</li>
<li>why this exercise? Imagine what becomes possible when you add optimization, human-in-the-loop review, software engineering guidance, and more!</li>
</ul>
<p><img src="/assets/images/aoc.png" alt="Image1" /></p>
<h2 id="why-advent-of-code-is-great-for-a-personal-llm-evaluation">Why Advent of Code Is Great for a Personal LLM Evaluation</h2>
<p>Before looking into the exercise, let's touch briefly on two points:</p>
<h3 id="advent-of-code-packing-serious-challenges-as-personal-developer-fun">Advent of Code: Packing Serious Challenges as Personal Developer Fun</h3>
<p>Advent of Code occupies an interesting middle ground - it's both a fun exercise <em>and</em> a serious programming challenge. The problems genuinely test algorithmic thinking, especially in the later days of the schedule (the last days usually involve graph algorithms, dynamic programming, or computational geometry).</p>
<p>Many professional developers use it for interview prep or to challenge themselves and stay sharp.</p>
<h3 id="why-it-works-as-an-llm-benchmark">Why It Works as an LLM Benchmark</h3>
<p>Advent of Code is actually an excellent framework for comparing LLMs:</p>
<table>
<thead>
<tr>
<th>Criterion</th>
<th>Why It Matters</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Objective correctness</strong></td>
<td>Problems have definitive right answers with test cases</td>
</tr>
<tr>
<td><strong>Graduated difficulty</strong></td>
<td>Days 1-6 focus more on basic reasoning; <br> Days 7-12 reveal which models understand complex algorithms</td>
</tr>
<tr>
<td><strong>Logical focus</strong></td>
<td>Focus on solving a problem in a logical way, with no distraction <br> from frameworks or libraries in the code</td>
</tr>
<tr>
<td><strong>Flexibility</strong></td>
<td>Implement in the programming language of your choice</td>
</tr>
<tr>
<td><strong>Multi-dimensional evaluation</strong></td>
<td>You can assess correctness, code quality, performance, <br> explanation clarity, and debugging ability all at once</td>
</tr>
<tr>
<td><strong>Real-world-like</strong></td>
<td>Unlike small demo challenges, Advent of Code requires parsing <br> varied and messy input, handling edge cases, and sometimes optimizing for scale</td>
</tr>
<tr>
<td><strong>Reproducible</strong></td>
<td>Same problems, same inputs, easy to compare apples-to-apples across models</td>
</tr>
</tbody>
</table>
<h2 id="the-experiment-setup">The Experiment Setup</h2>
<p>Create an account on the Advent of Code website to unlock solution correctness evaluation and the subsequent exercises.</p>
<h3 id="the-models">The Models</h3>
<table>
<thead>
<tr>
<th>Model</th>
<th>Persona</th>
<th>Expected Behavior</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Gemini 3 Pro Preview</strong></td>
<td>&quot;The Senior Engineer&quot;</td>
<td>Robust, maintainable, production-ready code. <br> Proper error handling, modern APIs, clean architecture</td>
</tr>
<tr>
<td><strong>Gemini 3 Flash Preview</strong></td>
<td>&quot;The Speedster&quot;</td>
<td>Get the right answer <em>fast</em>. <br> Correct logic, but less emphasis on &quot;gold plating&quot; or defensive coding</td>
</tr>
</tbody>
</table>
<h3 id="the-prompt">The Prompt</h3>
<p>Both models received the same simple prompt, with only the file names changing: the puzzle description and input, plus a request to solve it in Java. No hints about code quality, no instructions about error handling - just solve the problem.</p>
<pre><code>Read carefully the following puzzle text from puzzle1.md and the associated input file puzzle1.input for testing; plan and implement the puzzle in Java 25 with a main() method for testing
</code></pre>
<h3 id="evaluation-methodology">Evaluation Methodology</h3>
<p>I used a hybrid approach, and this is just a guideline - <strong>remember</strong> that we used minimalistic prompting and have not looked into optimizations!</p>
<ol>
<li><strong>Automated correctness testing</strong> - pass/fail against the actual puzzle input online at <a href="https://adventofcode.com/">Advent of Code</a></li>
<li><strong>Side-by-side analysis</strong> - in an IDE</li>
<li><strong>Independent LLM evaluation</strong> for qualitative aspects:
<ul>
<li>Code clarity and structure</li>
<li>Algorithmic adherence</li>
<li>Software engineering principles</li>
<li>Handling of edge cases</li>
</ul>
</li>
<li><strong>Performance benchmarking</strong> - execution time comparison</li>
</ol>
<h2 id="lets-examine-puzzle-1-in-details-the-secret-entrance">Let's Examine Puzzle 1 in Details: The Secret Entrance</h2>
<p>The <a href="https://adventofcode.com/2025/day/1">first puzzle</a> involves a combination safe with a dial numbered 0-99. Starting at position <strong>50</strong>, you receive a list of instructions to rotate the dial Left (L) or Right (R). The challenge: calculate the &quot;password&quot; by counting how many times the dial points to <strong>0</strong> after completing each rotation.</p>
<h3 id="the-algorithmic-challenge">The Algorithmic Challenge</h3>
<p>This is a classic modular arithmetic problem with a twist: handling negative numbers correctly when rotating left past zero. In Java, the <code>%</code> operator can return negative values for negative operands, so the implementation must account for this.</p>
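<p>As an aside, Java offers a built-in way around the negative-modulo pitfall: <code>Math.floorMod</code> always returns a non-negative result for a positive modulus. A minimal sketch of the dial update using it (not taken from either model's output) could look like this:</p>
<pre><code class="language-java">// Illustrative only; neither model's solution used this form.
// In Java, (-18 % 100) == -18, whereas Math.floorMod(-18, 100) == 82.
static int rotate(int pos, char direction, int amount) {
    int delta = (direction == 'L') ? -amount : amount;
    return Math.floorMod(pos + delta, 100);
}
</code></pre>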
<p><strong>Example walkthrough:</strong>
Starting at 50, applying rotations: L68, L30, R48, L5, R60, L55, L1, L99, R14, L82</p>
<table>
<thead>
<tr>
<th>Rotation</th>
<th>Position After</th>
<th>Ends at 0?</th>
</tr>
</thead>
<tbody>
<tr>
<td>L68</td>
<td>82</td>
<td>No</td>
</tr>
<tr>
<td>L30</td>
<td>52</td>
<td>No</td>
</tr>
<tr>
<td>R48</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>L5</td>
<td>95</td>
<td>No</td>
</tr>
<tr>
<td>R60</td>
<td>55</td>
<td>No</td>
</tr>
<tr>
<td>L55</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>L1</td>
<td>99</td>
<td>No</td>
</tr>
<tr>
<td>L99</td>
<td><strong>0</strong></td>
<td>Yes</td>
</tr>
<tr>
<td>R14</td>
<td>14</td>
<td>No</td>
</tr>
<tr>
<td>L82</td>
<td>32</td>
<td>No</td>
</tr>
</tbody>
</table>
<p><strong>Password: 3</strong> (the dial pointed to 0 three times)</p>
<h2 id="code-comparison-where-engineering-maturity-shows">Code Comparison: Where Engineering Maturity Shows</h2>
<p>Both models solved the puzzle correctly. But <em>how</em> they solved it reveals the seniority angle.</p>
<h3 id="the-core-logic-handling-negative-modulo">1. The Core Logic: Handling Negative Modulo</h3>
<p><strong>The Senior Engineer (Gemini 3 Pro)</strong> used an explicit check for negative numbers:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Senior approach - explicit and readable</span>
<span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'L'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">-</span> <span class="n">amount</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
    <span class="k">if</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
        <span class="n">currentPos</span> <span class="o">+=</span> <span class="mi">100</span><span class="o">;</span>
    <span class="o">}</span>
<span class="o">}</span> <span class="k">else</span> <span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'R'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">+</span> <span class="n">amount</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
    <span class="k">throw</span> <span class="k">new</span> <span class="nf">IllegalArgumentException</span><span class="o">(</span><span class="s">"Unknown direction: "</span> <span class="o">+</span> <span class="n">direction</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div></div>
<p><strong>The Speedster (Gemini 3 Flash)</strong> used a mathematical one-liner:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Speedy approach - clever but redundant</span>
<span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'R'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">+</span> <span class="n">distance</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span> <span class="k">else</span> <span class="k">if</span> <span class="o">(</span><span class="n">direction</span> <span class="o">==</span> <span class="sc">'L'</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">currentPos</span> <span class="o">=</span> <span class="o">(</span><span class="n">currentPos</span> <span class="o">-</span> <span class="o">(</span><span class="n">distance</span> <span class="o">%</span> <span class="mi">100</span><span class="o">)</span> <span class="o">+</span> <span class="mi">100</span><span class="o">)</span> <span class="o">%</span> <span class="mi">100</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>
<p><strong>Analysis:</strong></p>
<ul>
<li>Both are correct, but Speedster's version includes a redundant <code>distance % 100</code> operation—the outer modulo already handles wrap-around.</li>
<li>The Senior's code is also easier to debug: the logic is explicit and the negative case is clearly handled.</li>
</ul>
<h3 id="api-design-flexibility-vs-hardcoding">2. API Design: Flexibility vs. Hardcoding</h3>
<table>
<thead>
<tr>
<th>Aspect</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Method signature</td>
<td><code>solve(List&lt;String&gt;, int startPos)</code></td>
<td><code>solve(List&lt;String&gt;)</code></td>
</tr>
<tr>
<td>Start position</td>
<td>Parameterized (configurable)</td>
<td>Hardcoded (<code>int currentPos = 50;</code>)</td>
</tr>
<tr>
<td>Return type</td>
<td><code>long</code> (prevents overflow)</td>
<td><code>int</code></td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<p>By parameterizing <code>startPos</code>, the Senior's code is immediately testable with different scenarios without modifying any source code. This is a hallmark of production-ready design. The <code>long</code> return type also shows foresight about potential overflow with large inputs.</p>
<h3 id="error-handling-fail-fast-vs-silent">3. Error Handling: Fail-Fast vs. Silent</h3>
<table>
<thead>
<tr>
<th>Behavior</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unknown direction</td>
<td><code>throw IllegalArgumentException</code></td>
<td>Silently ignored</td>
</tr>
<tr>
<td>File not found</td>
<td>Explicit check with clear message</td>
<td>Generic IOException</td>
</tr>
<tr>
<td>IOException handling</td>
<td><code>throws IOException</code> (caller decides)</td>
<td>try-catch (handled internally)</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<ul>
<li>The Senior implements &quot;Fail Fast&quot; logic. If the input contains an unknown direction (e.g., 'U'), it throws an exception immediately.</li>
<li>The Speedster silently ignores invalid lines — which might prevent a crash, but could lead to incorrect results with no warning. This is a classic debugging issue in production.</li>
</ul>
<h3 id="modern-java-idioms">4. Modern Java Idioms</h3>
<table>
<thead>
<tr>
<th>Aspect</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Path API</td>
<td><code>Path.of()</code> (Java 11+)</td>
<td><code>Paths.get()</code> (Java 7+)</td>
</tr>
<tr>
<td>File existence check</td>
<td><code>Files.exists()</code></td>
<td>None</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer</strong></p>
<ul>
<li><code>Path.of()</code> is the modern, preferred API. Using it signals awareness of current Java best practices.</li>
</ul>
<h3 id="test-code">5. Test code</h3>
<p><strong>BOTH</strong> Pro and Flash models generated almost identical test code, based on the outline of Puzzle 1:</p>
<pre><code>// imports added for completeness (JUnit 5 assumed)
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;

class Puzzle1Test {

    @Test
    void testExample() {
        List&lt;String&gt; instructions = List.of(
            &quot;L68&quot;,
            &quot;L30&quot;,
            &quot;R48&quot;,
            &quot;L5&quot;,
            &quot;R60&quot;,
            &quot;L55&quot;,
            &quot;L1&quot;,
            &quot;L99&quot;,
            &quot;R14&quot;,
            &quot;L82&quot;
        );
        
        long result = Puzzle1.solve(instructions, 50);
        assertEquals(3, result);
    }
}
</code></pre>
<h3 id="performance-analysis">6. Performance Analysis</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Time complexity</td>
<td>O(N)</td>
<td>O(N)</td>
</tr>
<tr>
<td>Operations per rotation</td>
<td>1 modulo + 1 conditional</td>
<td>2-3 modulo operations</td>
</tr>
</tbody>
</table>
<p><strong>Winner: Senior Engineer (marginally)</strong></p>
<ul>
<li>The extra <code>distance % 100</code> in Flash's implementation is unnecessary work.</li>
<li>Both are O(N) and the difference is negligible in practice, but the Senior's code is technically more efficient.</li>
</ul>
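<p>For completeness, here is the kind of wall-clock harness that is sufficient for this comparison at this input size. It is an illustrative sketch, not the experiment's exact benchmarking code; <code>Puzzle1.solve</code> mirrors the signature generated by the Pro model, and a JMH benchmark would be the rigorous alternative:</p>
<pre><code class="language-java">import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative timing harness; not the exact code used in the experiment
public class TimedRun {
    public static void main(String[] args) throws Exception {
        List&lt;String&gt; instructions = Files.readAllLines(Path.of("puzzle1.input"));
        long start = System.nanoTime();
        long answer = Puzzle1.solve(instructions, 50); // signature from the Pro solution
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("Answer: %d (%d ms)%n", answer, elapsedMs);
    }
}
</code></pre>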
<h2 id="the-scorecard">The Scorecard</h2>
<table>
<thead>
<tr>
<th>Criterion</th>
<th>Senior (Pro 3)</th>
<th>Speedster (Flash 3)</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Correctness</strong></td>
<td>Pass</td>
<td>Pass</td>
</tr>
<tr>
<td><strong>API Design</strong></td>
<td><strong>Winner</strong> (configurable)</td>
<td>Basic (hardcoded)</td>
</tr>
<tr>
<td><strong>Error Handling</strong></td>
<td><strong>Winner</strong> (fail-fast)</td>
<td>Silent/permissive</td>
</tr>
<tr>
<td><strong>Input Robustness</strong></td>
<td>Good</td>
<td>Good</td>
</tr>
<tr>
<td><strong>Modern Java</strong></td>
<td><strong>Winner</strong> (<code>Path.of</code>)</td>
<td>Older (<code>Paths.get</code>)</td>
</tr>
<tr>
<td><strong>Performance</strong></td>
<td><strong>Winner</strong> (marginal)</td>
<td>Good</td>
</tr>
</tbody>
</table>
<h2 id="key-insights-what-have-we-learned">Key Insights: What Have We Learned</h2>
<h3 id="the-speedster-as-a-developer-under-deadline">The Speedster as a Developer Under Deadline</h3>
<p>Gemini 3 Flash acted like a developer racing against a deadline: it got the right answer efficiently, but left behind some technical debt. Hardcoded values, older APIs, silent error handling—these are the hallmarks of &quot;I'll fix it later&quot; code that often lingers for a long time or never gets fixed.</p>
<h3 id="the-senior-as-a-production-minded-engineer">The Senior as a Production-Minded Engineer</h3>
<p>Gemini 3 Pro solved the problem <em>and</em> built a reusable, robust component. The parameterized design (passing <code>startPos</code> as an argument), explicit error handling, and modern APIs demonstrate a deeper understanding of software craftsmanship.</p>
<h3 id="the-adoption-decision">The Adoption Decision</h3>
<p><strong>For a quick script or prototype?</strong> Flash is perfectly fine. It gets the job done.</p>
<p><strong>For the production codebase?</strong> Adopt the Senior's solution. The generated code is:</p>
<ul>
<li>Testable without modification</li>
<li>Self-documenting through explicit error handling</li>
<li>Future-proof with modern idioms</li>
<li>Easier to debug when things go wrong</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>For Puzzle 1 of Advent of Code 2025, the distinction between the Senior Engineer and the Speedster was clear:</p>
<ul>
<li><strong>Gemini Pro</strong> delivered a working solution <em>and</em> a production-ready component</li>
<li><strong>Gemini Flash</strong> delivered a working solution with minimal fuss, but left some technical debt</li>
</ul>
<p>The metaphor extends beyond this experiment. When using AI for code generation, the model you choose—and the prompts you craft—can produce either throwaway prototypes or building blocks for serious software.</p>
<p><strong>For quick scripts? The Speedster is fine. For the production codebase? Start by adopting the Senior's solutions.</strong></p>]]></content><author><name>Dan Dobrin</name></author><category term="Gemini" /><category term="GenAI" /><category term="java" /><category term="googlecloud" /><summary type="html"><![CDATA[Gemini 3 Pro vs. Gemini 3 Flash solving Advent of Code 2025 — a study in engineering maturity, code quality, and when to use each model.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://ddobrin.github.io/assets/images/aoc.png" /><media:content medium="image" url="https://ddobrin.github.io/assets/images/aoc.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Long Document Summarization Techniques in Java with LangChain4J and Gemini</title><link href="https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models/" rel="alternate" type="text/html" title="Long Document Summarization Techniques in Java with LangChain4J and Gemini" /><published>2024-07-06T00:00:00+00:00</published><updated>2024-07-06T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models</id><content type="html" xml:base="https://ddobrin.github.io/2024/07/06/Long-document-summarization-techniques-Java-with-Gemini-models/"><![CDATA[<h1 id="long-document-summarization-techniques-with-java-with-langchain4j-and-gemini-models">Long Document Summarization Techniques in Java with LangChain4J and Gemini Models</h1>
<p>Suppose your organization has a large number of documents, in various formats, and you, a Java developer, are tasked to efficiently summarize the content of each document.</p>
<p>While summarizing any document with just a few paragraphs is a simple task, there are several challenges to overcome when summarizing large documents with many pages.</p>
<p>Generative AI is top of mind for both developers and business stakeholders, and you want to explore how Large Language Models (LLMs) can help you with large-document summarization, a complex use-case with universal applicability in the enterprise.</p>
<p>As a Java developer, you're adept at building robust, scalable, high-performance applications. While Python dominates the ML and NLP landscape, Java has long been the backbone of enterprise software. Its strength in enterprise systems makes it an ideal platform for integrating advanced NLP techniques. With LLMs, you can now add powerful, AI-driven insights to your Java applications, bridging the gap between traditional enterprise software and cutting-edge machine learning capabilities.</p>
<p>This blog post explores various summarization techniques using LLMs, leaving you with practical information and a codebase with ready-to-test Java examples. The objective is to enable you with both theoretical knowledge and hands-on skills for effective document summarization.</p>
<p>We'll be leveraging <a href="https://cloud.google.com/vertex-ai?e=48754805&amp;hl=en">Vertex AI</a> with the latest <a href="https://deepmind.google/technologies/gemini/">Gemini models</a> and the open-source <a href="https://docs.langchain4j.dev/">LangChain4J</a> LLM orchestration framework.</p>
<h2 id="why-consider-llms-for-text-summarization">Why consider LLMs for text summarization</h2>
<p>LLMs offer a number of advantages over traditional extractive summarization methods:</p>
<ul>
<li><strong>Context comprehension</strong>: can grasp complex nuances in text, producing more coherent and relevant summaries</li>
<li><strong>Abstractive capabilities</strong>: will generate new sentences capturing the essence of the original text</li>
<li><strong>Flexibility</strong>: can be fine-tuned for specific domains or styles of summarization</li>
<li><strong>Multilingual support</strong>: many LLMs work across multiple languages, with versatility important for global applications</li>
</ul>
<h2 id="text-summarization-techniques">Text Summarization Techniques</h2>
<p>We'll explore the following three summarization techniques in detail in this blog post:</p>
<ul>
<li><strong>Prompt Stuffing</strong> - pass the content of the entire document as a prompt into the LLM's context window</li>
<li><strong>Map-reduce</strong> - split the document into smaller (potentially overlapping) chunks, summarize each chunk in parallel, then summarize the individual summaries in a second and final step</li>
<li><strong>Refine iteratively</strong> - split the document as in map-reduce, summarize the first chunk, then ask the LLM to iteratively refine the summary with the text of each following chunk, until the end of the text</li>
</ul>
<h2 id="before-you-start">Before you start</h2>
<p>The summarization code uses Java 21. If not already installed, use the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/ai-patterns/summarization-langchain4j#setup-java-ecosystem">following instructions</a> to set it up.</p>
<p><a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md">Documentation</a> provides details for <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md#clone-the-code">cloning the repository</a>, setting the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/ai-patterns/summarization-langchain4j/README.md#summarization-techniques---langchain4j-vertexai-gemini">required environment vriables up</a> and <a href="https://cloud.google.com/vertex-ai/docs/authentication">authenticating to Vertex AI</a>.</p>
<h2 id="loading-and-splitting-the-document">Loading and splitting the document</h2>
<p>Before summarization can start, you need to load the document, then, depending on your summarization approach, split the content into smaller chunks that can fit into the context window of your chosen LLM.</p>
<p>The latest multimodal Gemini models in Vertex AI have <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models">very large context windows</a>, up to 2M tokens; however, you will have to adapt to the context window of the LLM of your choice.</p>
<p>LangChain4J offers a number of out-of-the-box <a href="https://docs.langchain4j.dev/tutorials/rag#document-loader">Document Loaders</a>, <a href="https://docs.langchain4j.dev/tutorials/rag#document-parser">Document Parsers</a> and <a href="https://docs.langchain4j.dev/tutorials/rag#document-splitter">DocumentSplitters</a>. It is very important to explore which one would yield the best results for your use-case.</p>
<p>The codebase for this blog loads the documents from the test folder using a <code>FileSystemDocumentLoader</code> and the <code>TextDocumentParser</code>. Sample documents are provided in text format in this repo.</p>
<p>For text splitting, the <code>DocumentByParagraphSplitter</code> is being used. It splits a Document into paragraphs and attempts to fit as many paragraphs as possible into a single TextSegment, within the limit set for the chunk size. The splitter allows you to specify an <strong>overlap window for chunks</strong>, with benefits discussed later in the post.</p>
<p>Choosing <strong>the right chunk size</strong> is an exercise that depends on the length of the context window of the LLM of your choice.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// load and parse the document  </span>
<span class="nc">Document</span> <span class="n">document</span> <span class="o">=</span> <span class="n">loadDocument</span><span class="o">(</span><span class="n">resource</span><span class="o">,</span> <span class="k">new</span> <span class="nc">TextDocumentParser</span><span class="o">());</span>  
  
<span class="c1">// Overlap window size between chunks set to OVERLAP_SIZE - can be configured  </span>
<span class="c1">// from 0 - text.length()  </span>
<span class="nc">DocumentSplitter</span> <span class="n">splitter</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">DocumentByParagraphSplitter</span><span class="o">(</span><span class="no">CHUNK_SIZE</span><span class="o">,</span> <span class="no">OVERLAP_SIZE</span><span class="o">);</span>  
<span class="nc">List</span><span class="o">&lt;</span><span class="nc">TextSegment</span><span class="o">&gt;</span> <span class="n">chunks</span> <span class="o">=</span> <span class="n">splitter</span><span class="o">.</span><span class="na">split</span><span class="o">(</span><span class="n">document</span><span class="o">);</span>
</code></pre></div></div>
<h2 id="llm-conversation-inputs">LLM conversation inputs</h2>
<p>@SystemMessage and @UserMessage are commonly used when prompting and interacting with Large Language Models (LLMs).</p>
<p><a href="https://docs.langchain4j.dev/tutorials/ai-services#systemmessage">@SystemMessage</a> is used to set the context or role of the AI model, and is usually not visible to the user. We will use the same @SystemMessage for system instructions whenever the AI Service is invoked.</p>
<p><a href="https://docs.langchain4j.dev/tutorials/ai-services#usermessage">@UserMessage</a> represents the actual input from the human user interacting with the AI. It's the question, prompt, or statement that the user wants the AI to respond to.</p>
<p>@SystemMessage and @UserMessage can be provided directly as Strings or loaded from a prompt template in resources: <code>@SystemMessage(fromResource = &quot;my-system-prompt-template.txt&quot;)</code> or <code>@UserMessage(fromResource = &quot;my-user-template.txt&quot;)</code>.</p>
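<p>The examples below assume a configured <code>chatModel</code>. A minimal sketch of how one might be built with LangChain4J's Vertex AI Gemini integration follows; the project, location, and model name are placeholders, and builder options can vary between LangChain4J versions:</p>
<pre><code class="language-java">import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

// Hypothetical wiring; replace the placeholder values with your own setup
ChatLanguageModel chatModel = VertexAiGeminiChatModel.builder()
    .project("your-project-id")        // placeholder
    .location("us-central1")           // placeholder
    .modelName("gemini-1.5-flash-001") // placeholder; pick a model whose
                                       // context window fits your documents
    .build();
</code></pre>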
<h2 id="prompt-stuffing">#1: Prompt Stuffing</h2>
<p>Stuffing is the simplest summarization technique: you pass the content of the entire document as a prompt into the LLM's context window. However, as prompts for LLMs are token-count-limited, different techniques need to be used for large documents, depending on the size of the context window.</p>
<p>Google's Gemini models have very large context windows, making them an easy choice for summarizing large documents (see limits <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models">here</a>).</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">StuffingSummarizationAssistant</span> <span class="o">{</span>  
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="s">"""  
    You are a helpful AI assistant.    
    You are an AI assistant that helps people summarize information.    
    Your name is Gemini    
    You should reply to the users request with your name and also in the style 
    of a literary critic    
    Strictly ignore Project Gutenberg &amp; ignore copyright notice in summary 
    output.    
    """</span><span class="o">)</span>  
    <span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
    Please provide a concise summary in strictly no more 
    than 10 one sentence bullet points,    
    starting with an introduction and ending with a conclusion, 
    of the following text
                  TEXT: &lt;content&gt;    
    """</span><span class="o">)</span>  
    <span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>  
<span class="o">}</span>

<span class="o">...</span>
<span class="c1">// summarize the document with the help of the StuffingSummarizationAssistant </span>
<span class="nc">StuffingSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">StuffingSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">document</span><span class="o">.</span><span class="na">text</span><span class="o">());</span>
<span class="o">...</span>				 
</code></pre></div></div>
<h4 id="pros">Pros:</h4>
<ul>
<li>Single call to the LLM required to summarize the text, most likely faster than with multiple summarization calls</li>
<li>Model has access to the entire document content at once, potentially resulting in better summary results</li>
</ul>
<h4 id="cons">Cons</h4>
<ul>
<li>Stuffing is applicable only as long as the entire document content can fit into the LLM context window</li>
</ul>
<h2 id="map-reduce">#2: Map-Reduce</h2>
<p>Map-reduce is more intricate than prompt stuffing and implements multi-stage summarization: you split the document into smaller (optionally overlapping) chunks, summarize each chunk in parallel, then summarize the individual summaries in a second and final step.</p>
<p>In this method, you need to prepare two user prompt templates, one for the initial chunk summarization step and another for the final combine step. The system instructions remain the same across all LLM calls.</p>
<h4 id="splitting-the-text-and-summarizing-individual-chunks-the-quotmapquot-step">Splitting the text and summarizing individual chunks (the &quot;map&quot; step)</h4>
<p>You'll be using the following @UserMessage:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">ChunkSummarizationAssistant</span> <span class="o">{</span>
	<span class="nd">@SystemMessage</span><span class="o">(</span><span class="n">fromResource</span> <span class="o">=</span> <span class="s">"my-system-prompt-template.txt"</span><span class="o">)</span>
	<span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
	Taking the following context delimited by triple backquotes into consideration   
	'''&lt;context&gt;'''
	Write a concise summary of the following text delimited by triple backquotes.  
	'''&lt;content&gt;'''  
	Output starts with CONCISE SUB-SUMMARY:  
	"""</span><span class="o">)</span>
	<span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"context"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">context</span><span class="o">,</span> <span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>
<span class="o">}</span>

<span class="o">...</span>
<span class="nc">ChunkSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">ChunkSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">context</span><span class="o">.</span><span class="na">toString</span><span class="o">(),</span> <span class="n">segment</span><span class="o">);</span>
</code></pre></div></div>
<p>Map-reduce allows you to parallelize the individual chunk summarization steps, as they are independent of each other:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">List</span><span class="o">&lt;</span><span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;&gt;&gt;</span> <span class="n">futures</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ArrayList</span><span class="o">&lt;&gt;();</span>  
<span class="nc">ExecutorService</span> <span class="n">executor</span> <span class="o">=</span> <span class="nc">Executors</span><span class="o">.</span><span class="na">newVirtualThreadPerTaskExecutor</span><span class="o">();</span>  
<span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">resultMap</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">TreeMap</span><span class="o">&lt;&gt;();</span> <span class="c1">// TreeMap to automatically sort by key  </span>
  
<span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">segments</span><span class="o">.</span><span class="na">size</span><span class="o">();</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>  
    <span class="kt">int</span> <span class="n">index</span> <span class="o">=</span> <span class="n">i</span><span class="o">;</span>  
    <span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Map</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;&gt;</span> <span class="n">future</span> <span class="o">=</span> <span class="nc">CompletableFuture</span>  
        <span class="o">.</span><span class="na">supplyAsync</span><span class="o">(()</span> <span class="o">-&gt;</span> <span class="n">summarizeChunk</span><span class="o">(</span><span class="n">index</span><span class="o">,</span> <span class="n">segments</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="n">index</span><span class="o">).</span><span class="na">text</span><span class="o">()),</span> <span class="n">executor</span><span class="o">);</span>  
    <span class="n">futures</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">future</span><span class="o">);</span>  
<span class="o">}</span>  
  
<span class="c1">// Wait for all futures to complete and collect the results in resultMap  </span>
<span class="nc">CompletableFuture</span><span class="o">&lt;</span><span class="nc">Void</span><span class="o">&gt;</span> <span class="n">allDone</span> <span class="o">=</span> <span class="nc">CompletableFuture</span><span class="o">.</span><span class="na">allOf</span><span class="o">(</span><span class="n">futures</span><span class="o">.</span><span class="na">toArray</span><span class="o">(</span><span class="k">new</span> <span class="nc">CompletableFuture</span><span class="o">[</span><span class="mi">0</span><span class="o">]))</span>  
        <span class="o">.</span><span class="na">thenAccept</span><span class="o">(</span><span class="n">v</span> <span class="o">-&gt;</span> <span class="n">futures</span><span class="o">.</span><span class="na">forEach</span><span class="o">(</span><span class="n">f</span> <span class="o">-&gt;</span> <span class="n">f</span><span class="o">.</span><span class="na">thenAccept</span><span class="o">(</span><span class="nl">resultMap:</span><span class="o">:</span><span class="n">putAll</span><span class="o">)));</span>  
  
<span class="n">allDone</span><span class="o">.</span><span class="na">get</span><span class="o">();</span> <span class="c1">// Wait for all processing to complete</span>
</code></pre></div></div>
<p><strong>Important</strong>: please be aware that parallel summarization of individual chunks does not guarantee the ordering of the individual chunk summaries at the end of the process. It is up to the developer to keep summary ordering in mind before proceeding to the second and final step. An index value is attached to each chunk in the sample codebase, guaranteeing an ordered list of individual summaries!</p>
<p>A <strong>key factor for improving summarization results</strong> is the concept of <strong>overlapping chunks</strong>.</p>
<p>Splitting a document by a specific chunk size is arbitrary from a summarization perspective, even when done with utility classes that split the text cleanly into paragraphs, then attempt to fit entire sentences into the remaining chunk space.</p>
<p>Allowing chunks to overlap up to a specified overlap size can yield better summarization results by preserving more context between the individual chunks.</p>
<p>Please consider experimenting with different chunk and overlap sizes for your respective summarization use-cases.</p>
<p><strong>Note</strong>: the degree to which you can parallelize LLM calls depends on the rate limit of API calls per minute imposed by the LLM provider!</p>
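<p>If you run into provider-side rate limits, a simple mitigation is to bound the number of in-flight calls. The sketch below reworks the <code>supplyAsync</code> call from the snippet above with a <code>java.util.concurrent.Semaphore</code>; the permit count of 4 is purely illustrative:</p>
<pre><code class="language-java">// Illustrative throttle: cap concurrent LLM calls so parallel chunk
// summarization stays under the provider's requests-per-minute quota.
Semaphore permits = new Semaphore(4); // illustrative limit

CompletableFuture&lt;Map&lt;Integer, String&gt;&gt; future = CompletableFuture.supplyAsync(() -&gt; {
    try {
        permits.acquire();
        try {
            return summarizeChunk(index, segments.get(index).text());
        } finally {
            permits.release();
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for a permit", e);
    }
}, executor);
</code></pre>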
<h4 id="summary-of-summaries-the-quotreducequot-part">Summary of summaries (the &quot;reduce&quot; part)</h4>
<p>With all individual summaries on hand, you can move on to the second and final step, the summarization of the individual summaries.</p>
<p>You would be using a different @UserMessage in this step:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">FinalSummarizationAssistant</span> <span class="o">{</span>  
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="n">fromResource</span> <span class="o">=</span> <span class="s">"my-system-prompt-template.txt"</span><span class="o">)</span>  
    <span class="nd">@UserMessage</span><span class="o">(</span><span class="s">"""  
    Please provide a concise summary in strictly no more than 10 one sentence bullet points,    
    starting with an introduction and ending with a conclusion, 
    of the following text delimited by triple backquotes.
          '''Text:&lt;content&gt;'''  
      Output starts with SUMMARY:  
    """</span><span class="o">)</span>  
    <span class="nc">String</span> <span class="nf">summarize</span><span class="o">(</span><span class="nd">@V</span><span class="o">(</span><span class="s">"content"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">content</span><span class="o">);</span>  
<span class="o">}</span>
<span class="o">...</span>
<span class="nc">FinalSummarizationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">FinalSummarizationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="n">chatModel</span><span class="o">);</span>  
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">summarize</span><span class="o">(</span><span class="n">content</span><span class="o">);</span>
</code></pre></div></div>
<h4 id="pros-1">Pros:</h4>
<ul>
<li>Large documents can be summarized even with LLMs with smaller context windows</li>
<li>Parallel processing leads to reduced summarization latency</li>
<li>Overlapping chunks can improve summarization accuracy</li>
</ul>
<h4 id="cons-1">Cons</h4>
<ul>
<li>Multiple LLM calls are required</li>
<li>There can be context loss due to arbitrary text splitting</li>
<li>Overlapping chunks can slightly increase latency and create larger input text</li>
</ul>
<h2 id="refine">#3: Refine</h2>
<p>The refine method is an alternative to map-reduce for handling large document summarization. You split the document as in map-reduce, summarize the first chunk, then ask the LLM to iteratively refine the summary with the text of each following chunk, until the end of the document.</p>
<p>This approach ensures that the summary is both comprehensive and accurate, as each step takes into consideration the context of the previous chunk(s).</p>
<p>You would be using the same @UserMessages illustrated in the two steps of the &quot;Map-reduce&quot; approach: <code>ChunkSummarizationAssistant</code> and <code>FinalSummarizationAssistant</code>.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// process each individual chunk in order  </span>
<span class="c1">// summary refined in each step by adding the summary of the current chunk  </span>
<span class="kt">long</span> <span class="n">start</span> <span class="o">=</span> <span class="nc">System</span><span class="o">.</span><span class="na">currentTimeMillis</span><span class="o">();</span>  
<span class="nc">StringBuilder</span> <span class="n">context</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">StringBuilder</span><span class="o">();</span>  
<span class="n">chunks</span><span class="o">.</span><span class="na">forEach</span><span class="o">(</span><span class="n">segment</span> <span class="o">-&gt;</span> <span class="n">summarizeChunk</span><span class="o">(</span><span class="n">context</span><span class="o">,</span> <span class="n">segment</span><span class="o">.</span><span class="na">text</span><span class="o">()));</span>  
  
<span class="c1">// process the final summary  of the text  </span>
<span class="nc">String</span> <span class="n">output</span> <span class="o">=</span> <span class="n">buildFinalSummary</span><span class="o">(</span><span class="n">context</span><span class="o">.</span><span class="na">toString</span><span class="o">());</span>
</code></pre></div></div>
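<p>The <code>summarizeChunk</code> helper used above is not shown in the snippet. A hypothetical sketch of what it might do, reusing the <code>ChunkSummarizationAssistant</code> from the map-reduce section (the actual codebase may differ):</p>
<pre><code class="language-java">// Hypothetical sketch of the refine step's helper; the real codebase may differ.
// Each call summarizes the next chunk in light of the accumulated context, then
// carries the refined sub-summary forward as the context for the next iteration.
static void summarizeChunk(StringBuilder context, String chunkText) {
    ChunkSummarizationAssistant assistant =
        AiServices.create(ChunkSummarizationAssistant.class, chatModel);
    String refined = assistant.summarize(context.toString(), chunkText);
    context.setLength(0);  // replace the context with the latest refined summary
    context.append(refined);
}
</code></pre>
<p>Whether each step replaces the running context (as above) or appends to it is a design choice; replacing keeps the prompt size bounded as the number of chunks grows.</p>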
<h4 id="pros-2">Pros:</h4>
<ul>
<li>Large documents can be summarized even with LLMs with smaller context windows</li>
<li>Context is preserved between chunks, improving summarization accuracy and completeness</li>
<li>Overlapping chunks can improve summarization accuracy even further</li>
</ul>
<h4 id="cons-2">Cons</h4>
<ul>
<li>Multiple LLM calls are required</li>
<li>Must be executed iteratively and does not lend itself to parallel processing, due to the interdependent nature of the individual chunks and their associated context</li>
<li>Latency is significantly higher than with map-reduce</li>
</ul>
<h2 id="summary">Summary</h2>
<p>In this blog post, we have explored different programmatic summarization techniques for large documents using Google's Gemini LLM, as an advanced use-case for generative AI in enterprise software.</p>
<p>LLM orchestration frameworks do not provide out-of-the-box summarization functionality; therefore, you have a full codebase available, with practical examples demonstrating how to implement these techniques efficiently in Java.</p>
<p>As an enterprise Java developer, you now have powerful options to leverage LLMs and add AI-driven insights to your applications, potentially transforming how you handle document analysis and summarization.</p>
<p>The field of AI-powered document summarization is rapidly evolving, with new models and techniques emerging regularly. Stay tuned for future developments that could further enhance these capabilities.</p>
<p>Don't hesitate to reach out at <a href="https://twitter.com/ddobrin">@ddobrin</a> for feedback, questions or to discuss new summarization techniques.</p>]]></content><author><name>Dan Dobrin</name></author><category term="Gemini" /><category term="GenAI" /><category term="java" /><category term="springboot" /><category term="langchain4j" /><category term="googlecloud" /><summary type="html"><![CDATA[Comparing Prompt Stuffing, Map-Reduce, and Refine summarization patterns for enterprise documents using Java, LangChain4J, and Vertex AI Gemini models.]]></summary></entry><entry><title type="html">Test-Driven Development with Java, Spring Boot, and Duet AI</title><link href="https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI/" rel="alternate" type="text/html" title="Test-Driven Development with Java, Spring Boot, and Duet AI" /><published>2024-02-21T00:00:00+00:00</published><updated>2024-02-21T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI</id><content type="html" xml:base="https://ddobrin.github.io/2024/02/21/TDD-with-Java-Spring-Boot-and-DuetAI/"><![CDATA[<h1 id="test-driven-development-with-duet-ai-for-java-amp-spring-boot-applications">Test-driven Development with Duet AI for Java &amp; Spring Boot Applications</h1>
<p>Many times, as a developer, I have been asked to quickly add business functionality to an existing app.
As I might not be familiar with the app, I wondered whether Duet AI could help me gain an understanding of the
app, review the code, then help add new functionality.</p>
<p>I like to follow a test-driven development approach, and I want to explore how Duet AI can help me in my work.</p>
<h2 id="whats-the-ask">What's the Ask</h2>
<p>The Quotes app requires a new endpoint to retrieve book information by book name.
This endpoint should respond on the “/quotes/book/{book}” path.
The ask is to implement this particular endpoint.</p>
<h2 id="before-you-start">Before you start</h2>
<p>I'll be using Duet AI with Cloud Code in IntelliJ on my Mac, however you can use Duet AI from other IDEs supported by Cloud Code,
such as VSCode. If you prefer an online experience, you can use it in the Cloud Shell Editor or Cloud Workstations
from Google Cloud Console.</p>
<p>The <a href="https://cloud.google.com/duet-ai/docs/developers/overview">Duet AI</a> page provides details on how to get setup with Duet AI.</p>
<p>Once we have Duet AI set up, open a Terminal window and install the prerequisites for the work.</p>
<p>The Quotes application uses Java 21. If you do not have it installed, it's an easy setup using <a href="https://sdkman.io/">SDKMan</a>:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-s</span> <span class="s2">"https://get.sdkman.io"</span> | bash <span class="o">&amp;&amp;</span> <span class="nb">source</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.sdkman/bin/sdkman-init.sh"</span>
</code></pre></div></div>
<p>Install the latest OpenJDK and confirm it as the default JDK in the install:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sdk <span class="nb">install </span>java 21.0.2-tem <span class="o">&amp;&amp;</span> sdk use java 21.0.2-tem <span class="o">&amp;&amp;</span> java <span class="nt">-version</span>
</code></pre></div></div>
<p>With Java installed, if you wish to test the service in Cloud Run when done, enable the APIs needed to deploy the Quotes app to Cloud Run, if not already enabled:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud services <span class="nb">enable </span>cloudaicompanion.googleapis.com
gcloud services <span class="nb">enable </span>cloudbuild.googleapis.com 
gcloud services <span class="nb">enable </span>run.googleapis.com
gcloud services <span class="nb">enable </span>logging.googleapis.com 
</code></pre></div></div>
<h2 id="the-quotes-app">The Quotes app</h2>
<p>Let's clone the GitHub repo for Quotes and switch to the /services/quotes folder, then open the codebase in the IDE, as well as a new terminal window:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp.git <span class="o">&amp;&amp;</span> <span class="nb">cd </span>serverless-production-readiness-java-gcp/services/quotes
</code></pre></div></div>
<p>We can check which project is currently set, and set it if needed:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># get project </span>
gcloud config list
<span class="c"># set project</span>
gcloud config <span class="nb">set </span>project <span class="o">[</span>PROJECT_ID]
</code></pre></div></div>
<p>We then check that the Quotes app is building correctly and can be tested:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./mvnw package spring-boot:run
</code></pre></div></div>
<p>From a Terminal window, we execute:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl localhost:8083/random-quote <span class="nt">-w</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">"</span>
</code></pre></div></div>
<h2 id="explain-the-app">Explain the app</h2>
<p>Duet AI can be used to explain the app, perform code reviews, and suggest changes and tests.</p>
<h4 id="use-duet-ai-to-explain-the-quotes-app">Use Duet AI to explain the Quotes app</h4>
<p>Let's open the QuoteApplication class, then type the following prompt in the Duet AI Chat window and observe the response:</p>
<p><strong>Prompt</strong></p>
<blockquote>
<p>I want to get details about the QuotesApplication; please provide a detailed overview of the QuotesApplication</p>
</blockquote>
<p><strong>Response</strong></p>
<p>Duet AI will respond with a detailed overview of the QuotesApplication. Remember that responses are non-deterministic;
the responses shown in this blog post are examples and could vary slightly.</p>
<pre><code>Sure. The QuotesApplication is a Spring Boot application that provides a REST API for managing quotes. The application is divided into several components:
* The QuoteController class handles requests for quotes.
* The QuoteService class provides business logic for managing quotes.
* The QuoteRepository class persists quotes to a database.
The QuotesApplication can be deployed to a variety of platforms, including Cloud Run, GKE.

Here is a more detailed overview of each component:
* The QuoteController class handles requests for quotes. The controller has several methods, including:
    * getQuote(): Returns a random quote.
    * getQuotes(): Returns all quotes.
    * getQuoteByAuthor(): Returns all quotes by a given author.
    * createQuote(): Creates a new quote.
    * updateQuote(): Updates an existing quote.
    * deleteQuote(): Deletes an existing quote.
* The QuoteService class provides business logic for managing quotes. The service has several methods, including:
    * getQuote(): Returns a random quote.
    * getQuotes(): Returns all quotes.
    * getQuoteByAuthor(): Returns all quotes by a given author.
    * createQuote(): Creates a new quote.
    * updateQuote(): Updates an existing quote.
    * deleteQuote(): Deletes an existing quote.
* The QuoteRepository class persists quotes to a database. The repository has several methods, including:
    * save(): Saves a quote to the database.
    * findAll(): Returns all quotes from the database.
    * findByAuthor(): Returns all quotes by a given author from the database.
    * delete(): Deletes a quote from the database.
</code></pre>
<p>We might want to review the code in the QuoteController before adding an endpoint. Let's open the QuoteController class,
then type the following prompt in the Duet AI chat window:</p>
<pre><code>Please perform a detailed code review of the QuoteController and suggest any improvements I can bring to the code
</code></pre>
<p>Duet AI will respond with code review items and actually provide suggestions for how to rewrite the code.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Sure, I can <span class="nb">help </span>you with that. Here are a few suggestions <span class="k">for </span>improving the code <span class="k">in </span>QuoteController:
<span class="k">*</span> Use more descriptive variable names. For example, instead of using quotes, you could use allQuotes or listOfQuotes. This would make the code easier to <span class="nb">read </span>and understand.
<span class="k">*</span> Use more comments to explain the code. This would <span class="nb">help </span>other developers understand what the code is doing and how it works.
<span class="k">*</span> Use a more consistent coding style. For example, you could use the same indentation <span class="k">for </span>all code blocks. This would make the code look more organized and easier to read.
<span class="k">*</span> Use a linter to check <span class="k">for </span>errors and inconsistencies. A linter can <span class="nb">help </span>you identify errors and inconsistencies <span class="k">in </span>your code, which can <span class="nb">help </span>you improve its quality.
<span class="k">*</span> Test your code regularly. This will <span class="nb">help </span>you catch bugs early and prevent them from causing problems <span class="k">in </span>production.

Here is an example of how the code could be improved:
...
</code></pre></div></div>
<h2 id="use-tdd-with-duet-ai-to-add-business-logic">Use TDD with Duet AI to add business logic</h2>
<p>The Quotes app is missing an endpoint to retrieve quotes by book name. This endpoint should respond on the “/quotes/book/{book}” path.</p>
<p>We'll use a test-driven approach to add this functionality, starting from writing an application test.</p>
<p>Start by asking Duet AI to recommend which types of tests you need to write:</p>
<blockquote>
<p>Which types of tests should I be writing for the QuoteController</p>
</blockquote>
<p>Duet AI will reply with a list including:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>There are many different types of tests that you can write <span class="k">for </span>the QuoteController class. Some of the most important types of tests include:
<span class="k">*</span> Unit tests: Unit tests are designed to <span class="nb">test </span>the individual components of a program. In the <span class="k">case</span> of the QuoteController class, you could write unit tests to <span class="nb">test </span>the following methods:
  <span class="k">*</span> getAllQuotes<span class="o">(</span><span class="p">)</span>
  <span class="k">*</span> getQuoteById<span class="o">()</span>
  <span class="k">*</span> createQuote<span class="o">()</span>
  <span class="k">*</span> updateQuote<span class="o">()</span>
  <span class="k">*</span> deleteQuote<span class="o">()</span>
<span class="k">*</span> Integration tests: Integration tests are designed to <span class="nb">test </span>the interactions between different components of a program. In the <span class="k">case</span> of the QuoteController class, you could write integration tests to <span class="nb">test </span>the following:
  <span class="k">*</span> The interaction between the QuoteController class and the QuoteService class
  <span class="k">*</span> The interaction between the QuoteController class and the Quote class
<span class="k">*</span> System tests: System tests are designed to <span class="nb">test </span>the entire system, including all of its components and interactions. In the <span class="k">case</span> of the QuoteController class, you could write system tests to <span class="nb">test </span>the following:
  <span class="k">*</span> The ability of the QuoteController class to handle a variety of user requests
  <span class="k">*</span> The performance of the QuoteController class
  <span class="k">*</span> The reliability of the QuoteController class
In addition to these types of tests, you may also want to write some security tests to ensure that the QuoteController class is secure from attacks.

Here are some additional tips <span class="k">for </span>writing tests <span class="k">for </span>the QuoteController class:
<span class="k">*</span> Use a <span class="nb">test </span>framework that supports Java. There are many different <span class="nb">test </span>frameworks available, such as JUnit, TestNG, and Mockito.
<span class="k">*</span> Write your tests <span class="k">in </span>a modular way. This will make it easier to maintain and update your tests <span class="k">in </span>the future.
<span class="k">*</span> Use descriptive <span class="nb">test </span>names. This will make it easier to understand what your tests are doing.
<span class="k">*</span> Use assertions to verify the results of your tests. This will <span class="nb">help </span>you to catch errors early and prevent them from causing problems <span class="k">in </span>production.
By following these tips, you can write comprehensive and effective tests <span class="k">for </span>the QuoteController class.
</code></pre></div></div>
<p>We might wonder whether we should also add network tests, and can prompt Duet AI about it:
<code>Should I add tests for network failures</code></p>
<p>Duet AI will provide a detailed response starting along the lines of:</p>
<pre><code>Yes, you should add tests for network failures. Network failures can occur for a variety of reasons, such as:
* The client or server is offline
* The network connection is interrupted
* The network is congested
...
</code></pre>
<p>Let's open the <strong><em>/src/test/java/com/example/quotes</em></strong> folder and observe that several test classes have already been created:</p>
<ul>
<li>QuotesControllerTest</li>
<li>QuotesRepositoryTest</li>
<li>QuotesApplicationNetworkFailuresTests</li>
</ul>
<p>To add the quote retrieval by book name functionality, start writing code in true TDD fashion by adding tests
to both the QuotesControllerTest (for the endpoint) and QuotesRepositoryTest (for data retrieval from the db).</p>
<p><strong>Step 1: Generate the web controller test first</strong>
Open the <strong>QuotesControllerTest</strong> class in the <strong><em>com.example.quotes.web</em></strong> package</p>
<p>In the code, add the following comment, say towards the end of the file, then press Enter:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Answer as a Software Engineer with expertise in Java. Create a test for the QuotesController for a method getByBook which responds at the HTTP endpoint /quotes/book/{book} and retrieves a quote from the book The Road</span>
</code></pre></div></div>
<p>Click the Duet AI SmartAction and select “Generate code”, or use the keyboard shortcut to generate code (CTRL+Enter in VSCode, OPTION+Shift+\ in IntelliJ).
Duet AI will make the suggestion:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nd">@Test</span>
  <span class="nd">@DisplayName</span><span class="o">(</span><span class="s">"Test returns Quotes by Book"</span><span class="o">)</span>
  <span class="kt">void</span> <span class="nf">shouldReturnQuoteByBook</span><span class="o">()</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
    <span class="n">mockMvc</span><span class="o">.</span><span class="na">perform</span><span class="o">(</span><span class="n">get</span><span class="o">(</span><span class="s">"/quotes/book/The Road"</span><span class="o">))</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">status</span><span class="o">().</span><span class="na">isOk</span><span class="o">())</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">content</span><span class="o">().</span><span class="na">contentType</span><span class="o">(</span><span class="nc">MediaType</span><span class="o">.</span><span class="na">APPLICATION_JSON</span><span class="o">))</span>
        <span class="o">.</span><span class="na">andExpect</span><span class="o">(</span><span class="n">jsonPath</span><span class="o">(</span><span class="s">"$[0].book"</span><span class="o">,</span> <span class="nc">Matchers</span><span class="o">.</span><span class="na">equalTo</span><span class="o">(</span><span class="s">"The Road"</span><span class="o">)));</span>
  <span class="o">}</span>
</code></pre></div></div>
<p>If the suggestion meets your requirements, accept it with Tab or by clicking Accept.</p>
<p>In the Terminal window, run the command:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./mvnw clean verify
</code></pre></div></div>
<p>We observe that the test fails, as expected, with a ‘404’ error, as the business logic has not been implemented:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span><span class="no">ERROR</span><span class="o">]</span> <span class="nl">Failures:</span> 
<span class="o">[</span><span class="no">ERROR</span><span class="o">]</span>   <span class="nc">QuotesControllerTest</span><span class="o">.</span><span class="na">shouldReturnQuoteByBook</span><span class="o">:</span><span class="mi">94</span> <span class="nc">Status</span> <span class="nl">expected:</span><span class="o">&amp;</span><span class="n">lt</span><span class="o">;</span><span class="mi">200</span><span class="o">&gt;</span> <span class="n">but</span> <span class="nl">was:</span><span class="o">&amp;</span><span class="n">lt</span><span class="o">;</span><span class="mi">404</span><span class="o">&gt;</span>
<span class="o">...</span>
</code></pre></div></div>
<p><strong>Step 2: Generate controller code</strong></p>
<p>Add the missing controller method getByBook. Open the QuoteController class. Add the following comment towards the end of the class:</p>
<pre><code>// generate a getByBook method which responds at the HTTP endpoint /quotes/book/{book} and retrieves a quote by book name; use the QuoteService class to retrieve the book by name, as a String
</code></pre>
<p>Duet AI will respond with a code block along the lines of:</p>
<pre><code>    @GetMapping(&quot;/quotes/book/{book}&quot;)
    public ResponseEntity&lt;List&lt;Quote&gt;&gt; quoteByBook(@PathVariable(&quot;book&quot;) 
                String book) {
        try {
            List&lt;Quote&gt; quotes = quoteService.getByBook(book);

            if(!quotes.isEmpty()){
                return new  ResponseEntity&lt;List&lt;Quote&gt;&gt;(quotes, 
                                                        HttpStatus.OK);
            } else {
                return 
                   new ResponseEntity&lt;List&lt;Quote&gt;&gt;(HttpStatus.NOT_FOUND);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
            return 
          new ResponseEntity&lt;List&lt;Quote&gt;&gt;(HttpStatus.INTERNAL_SERVER_ERROR);
        }
    }
</code></pre>
<p>Note that the code is missing the ‘getByBook()’ implementation in the QuoteService class, which leads you to the next step of the implementation.</p>
<p><strong>Step 3: Now generate test methods for repository access</strong></p>
<p>Open the QuoteService class and observe that the getByBook method is missing.
Generate a test for the service class first, then add methods to access the repository, and test it out.</p>
<p>Open the QuotesRepositoryTest class and add the following comment towards the bottom of the class:</p>
<pre><code>// generate a unit test for the getByBook method in the QuoteService; create a Quote in the QuoteService first then test the getByBook method against the new Quote
</code></pre>
<p>Observe that the generated code looks like:</p>
<pre><code>  @Test
  @DisplayName(&quot;Get a quote by book - failed&quot;)
  void testGetQuoteByBookFailed(){
    var quotes = this.quoteService.getByBook(&quot;The Glass Menagerie&quot;);
    assertThat(quotes).isEmpty();
  }
</code></pre>
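<p>Note that the generated test covers only the not-found path. A happy-path companion test is worth adding as well; here is a sketch, assuming the Quote entity exposes standard setters and the QuoteService provides the createQuote method listed in the earlier overview:</p>
<pre><code>  @Test
  @DisplayName(&quot;Get a quote by book - success&quot;)
  void testGetQuoteByBook(){
    // create and persist a quote first, then retrieve it by book name
    var quote = new Quote();
    quote.setAuthor(&quot;Cormac McCarthy&quot;);
    quote.setBook(&quot;The Road&quot;);
    quote.setQuote(&quot;We carry the fire.&quot;);
    this.quoteService.createQuote(quote);

    var quotes = this.quoteService.getByBook(&quot;The Road&quot;);
    assertThat(quotes).isNotEmpty();
    assertThat(quotes.get(0).getBook()).isEqualTo(&quot;The Road&quot;);
  }
</code></pre>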
<p>With the tests in place, it is time to implement the missing functionality in the QuoteRepository and the QuoteService classes.</p>
<p>The ask is to retrieve a Quote by book name; therefore, we need to add a findByBook method to the JPA repository class, QuoteRepository, then allow the QuoteService to use this method.</p>
<p>Open the QuoteRepository class and add this comment towards the end of the class:</p>
<pre><code>// generate a find by book method which retrieves a quote by book name; use the native query syntax
</code></pre>
<p>Duet AI will generate code along the lines of:</p>
<pre><code>  @Query( nativeQuery = true, value =
            &quot;SELECT id,quote,author,book FROM quotes WHERE book = :book&quot;)
  List&lt;Quote&gt; findByBook(String book);
</code></pre>
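<p>As a side note on the design choice: since QuoteRepository is a Spring Data JPA repository, the same result can be achieved without a native query at all, by letting Spring Data derive the query from the method name (assuming the Quote entity has a book field); a minimal sketch:</p>
<pre><code>  // Spring Data JPA derives &quot;WHERE book = :book&quot; from the method name
  List&lt;Quote&gt; findByBook(String book);
</code></pre>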
<p>With the repository method in place, generate the missing link, the getByBook method in the service class and test it out.
Open the QuoteService class and add the comment:</p>
<pre><code>// add get by book method, use the QuoteRepository
</code></pre>
<p>Duet AI will generate code along the lines of:</p>
<pre><code>  public List&lt;Quote&gt; getByBook(String book) {
    return quoteRepository.findByBook(book);
  }
</code></pre>
<p><strong>Done</strong> with coding! Let’s test the result.</p>
<h2 id="test-your-code">Test your code</h2>
<p>We'll run ALL tests from the Terminal:</p>
<pre><code>./mvnw verify
</code></pre>
<p>The Quotes app can be started with the following command:</p>
<pre><code>./mvnw spring-boot:run
</code></pre>
<p>Ask Duet AI to generate cURL commands to test the newly added functionality. Switch to a different terminal window to run a cURL command.</p>
<p>In the Duet AI chat window, we can prompt Duet AI to generate a test command:</p>
<pre><code>Generate a curl command for the /quotes/book endpoint for a local environment at port 8083 for the book &quot;The Lord of the Rings&quot;
</code></pre>
<p>Duet AI will generate the cURL command, which we can run:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Lord%20of%20the%20Rings
</code></pre>
<p>While we can assume that the command has not found a matching book, we want to see the HTTP response code, so we refine the prompt:</p>
<pre><code>Update the curl command to print the HTTP response code
</code></pre>
<p>Run the updated command generated by Duet AI, which should return a 404:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Lord%20of%20the%20Rings -o /dev/null -s -w '%{http_code}\n'
</code></pre>
<p>Now refine the prompt to generate a successful command:</p>
<pre><code>Update the command again to use the book &quot;The Road&quot;
</code></pre>
<p>Run the updated command generated by Duet AI, which should return a 200:</p>
<pre><code>curl -X GET http://localhost:8083/quotes/book/The%20Road -o /dev/null -s -w '%{http_code}\n'
</code></pre>
<p><strong>Done</strong> with testing!</p>
<h2 id="test-our-code-in-the-google-cloud-run">Test our code in the Google Cloud Run</h2>
<p>To deploy and test the code in Cloud Run, follow the detailed Build and Deploy instructions in the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/services/quotes#readme">README</a>.</p>
<h2 id="what-have-we-learned-today">What have we learned today</h2>
<p>In this blog post, I started from an existing app and used Duet AI to help me get an overview of the codebase,
perform code reviews, generate test code and business logic at different levels of functionality and, last but not least, generate
cURL commands to test the added functionality. All of this while following my preferred test-driven development approach.</p>
<p>It saved me time (and search effort) beyond generating code and performing analysis: Duet AI was constantly available to generate the right cURL command for testing when I forgot a parameter, or to suggest areas of code improvement, removing many of these small interruptions from my workflow.</p>
<p>For any questions or feedback, feel free to contact me on Twitter/X <a href="https://twitter.com/ddobrin">@ddobrin</a>.</p>]]></content><author><name>Dan Dobrin</name></author><category term="DuetAI" /><category term="GenAI" /><category term="java" /><category term="springboot" /><category term="cloudrun" /><category term="googlecloud" /><summary type="html"><![CDATA[How Duet AI accelerates TDD workflows in Java and Spring Boot — from understanding unfamiliar codebases to implementing and testing new endpoints on Cloud Run.]]></summary></entry><entry><title type="html">The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden</title><link href="https://ddobrin.github.io/2024/02/13/Future-of-Java/" rel="alternate" type="text/html" title="The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden" /><published>2024-02-13T00:00:00+00:00</published><updated>2024-02-13T00:00:00+00:00</updated><id>https://ddobrin.github.io/2024/02/13/Future-of-Java</id><content type="html" xml:base="https://ddobrin.github.io/2024/02/13/Future-of-Java/"><![CDATA[<h1 id="the-future-of-java-performance-in-cloud-run-native-java-crac-and-project-leyden">The Future of Java Performance in Cloud Run: Native Java, CRaC and Project Leyden</h1>
<p>Applications run better on Cloud Run if they start fast, with instant peak performance and lower CPU/memory consumption.
You’ll save on costs if the application needs less time and fewer resources to run, handles more requests with less CPU/memory, and
achieves better performance.</p>
<p>We love to use Java for its stability, performance and portability; however, we all know that Java and its various web frameworks aren't known
for fast startup or low resource consumption.</p>
<p>Don’t worry: I’ll explore how this all changes with three technologies geared towards improving Java app runtime efficiency in serverless environments.</p>
<ul>
<li>Native Java Images, with <a href="https://www.graalvm.org/">GraalVM</a></li>
<li>JVM Checkpoint and Restore, with <a href="https://wiki.openjdk.org/display/crac/Main">CRaC</a></li>
<li>Upcoming OpenJDK runtime efficiency project, <a href="https://openjdk.org/projects/leyden/">Project Leyden</a></li>
</ul>
<p>This is a fast-emerging space where nothing is completely settled at this time, with multiple paths to optimizing Java runtime efficiency for
your Cloud Run applications taking shape. Java performance optimization in Cloud Run is the sum of multiple factors: the cloud runtime,
the JVM, the web frameworks/dependencies and, last but not least, your application code.</p>
<p>In this blog post we’ll dive into these optimization options, starting from the current state.
My goal is to help you identify the best options for <strong><em>your</em></strong> <em>application running in production in Cloud Run</em>.</p>
<h2 id="sample-apps">Sample apps</h2>
<p>The blog post is supported by Spring Boot sample apps, using the latest <a href="https://openjdk.org/projects/jdk/21/">Java 21</a> LTS,
<a href="https://github.com/spring-projects/spring-boot">Spring Boot</a> and <a href="https://github.com/spring-projects/spring-framework/wiki/What&#x27;s-New-in-Spring-Framework-6.x">Spring Framework</a> versions, including build, test, deployment and runtime guidelines for <a href="https://cloud.google.com/run/">Google Cloud Run</a>.</p>
<p>Start by cloning the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">Git repository supporting this blog</a>
and <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main?tab=readme-ov-file#getting-started">setting up your environment</a>.
Follow along with instructions on building the Quotes service as a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">JIT image</a>,
a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">GraalVM Native image</a>
and a <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/runtimes/crac/quotes-crac/README.md">CRaC image</a>, and deploying them to Cloud Run.</p>
<p>A super-early <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/runtimes/project-leyden/README.md">Project Leyden</a> sample has also been provided.</p>
<h2 id="key-terms">Key terms</h2>
<p>Here are a few concepts that I’ll be referring to:</p>
<ul>
<li>Startup latency: time to get to processing the first request in the app</li>
<li>Warmup: time it takes the app to reach peak performance</li>
<li>Peak performance: level of performance where the Java app handles the highest possible workload with the lowest possible latency and resource consumption</li>
<li>RSS memory: amount of physical memory actively used by a Java process</li>
</ul>
<h2 id="strongplain-jit-images-openjdkstrong"><strong>Plain JIT Images (OpenJDK)</strong></h2>
<p>In traditional Java applications, source code is compiled into bytecode and packaged into a Jar archive (and containerized).
The JVM uses a bytecode interpreter to execute the program on the host runtime. The <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/jjdev/Oracle-JVM-JIT.html#GUID-23D5BA60-A2B3-45F9-93DF-81A3D971CA50">JIT compiler</a> translates frequently executed bytecode (hotspots) into machine code to improve peak performance.</p>
<p>It is important to understand this process, as JIT is the default compilation mode in the JVM; we'll use it as the baseline against which to measure improvements.</p>
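<p>To see the JIT compiler at work, you can run a small program with the standard HotSpot flag <code>-XX:+PrintCompilation</code> and watch frequently executed methods get compiled. A minimal sketch (the class name, method and iteration counts are arbitrary):</p>
<pre><code>// JitDemo.java: run with java -XX:+PrintCompilation JitDemo
public class JitDemo {

  // A deliberately &quot;hot&quot; method: invoked often enough to cross the
  // JIT threshold and be compiled from bytecode to machine code
  static long hotLoop(int n) {
    long sum = 0;
    for (int i = 0; i &lt; n; i++) {
      sum += i * 31L;
    }
    return sum;
  }

  public static void main(String[] args) {
    long total = 0;
    for (int i = 0; i &lt; 20_000; i++) {
      total += hotLoop(1_000);
    }
    // hotLoop will show up in the -XX:+PrintCompilation output
    System.out.println(&quot;total = &quot; + total);
  }
}
</code></pre>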
<p><strong>Fast build time + less initial optimization → slower startup + higher resource consumption</strong></p>
<p><img src="/assets/images/Image1.png" alt="Image1" /></p>
<p>Note the start-up time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes service</a> JIT container image:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> docker run <span class="nt">--rm</span> <span class="nt">-p8080</span>:8083 quotes
Starting QuotesApplication v1.0.0 using Java 21 with PID 1
...
Tomcat started on port 8083 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Started QuotesApplication <span class="k">in </span>3.224 seconds <span class="o">(</span>process running <span class="k">for </span>3.466<span class="o">)</span>
</code></pre></div></div>
<p>While ~3 seconds might sound acceptable for many applications, in the real world, when you run your enterprise-grade apps in the cloud, this startup time might be measured in tens of seconds or even minutes. Scale-to-zero could become unacceptable, while scale-out speed would be severely impacted.</p>
<p><strong>JIT Images at runtime in Cloud Run</strong>
<a href="https://cloud.google.com/blog/products/serverless/announcing-startup-cpu-boost-for-cloud-run--cloud-functions">CPU Boost</a> in Cloud Run
is a great feature for improving the cold startup time of JIT images: it dynamically allocates more CPU to your container during startup,
with some applications observing a 50% reduction in startup time. It could be sufficient to meet the SLOs for your application.</p>
<h2 id="strongnext-level-aot-performance-with-native-java-images-graalvmstrong"><strong>Next-level AOT performance with Native Java Images (GraalVM)</strong></h2>
<p>Native Images follow the same process as above; however, the application is transformed ahead-of-time (AOT) into a native executable at build time, for the specific OS and machine architecture of the runtime environment, and can run without the need of a full JVM.</p>
<p>Building Native Images leveraging AOT compilation makes <a href="https://docs.oracle.com/en/graalvm/enterprise/21/docs/reference-manual/native-image/basics/#static-analysis">closed-world assumptions</a> (at build time) about all the classes required by the
application (at runtime). This static analysis is time-consuming. The resulting application image contains only the classes required
to run the app, with no further JIT optimizations at runtime. This results in smaller container images, faster and more efficient deployment,
and a reduced security attack surface.</p>
<p>The static analysis starts from your application entry point and includes any class reachable across source code,
dependent libraries and JDK classes. If classes can’t be reached due to Java’s dynamic features, say reflection, proxying,
serialization or resource access, they will not be included in the image and have to be supplied externally through configuration,
called <a href="https://medium.com/graalvm/enhancing-3rd-party-library-support-in-graalvm-native-image-with-shared-metadata-9eeae1651da4">shared metadata</a>.</p>
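<p>With Spring Boot 3, such metadata can also be contributed directly in code through the RuntimeHints API. Here is a minimal sketch, assuming a Quote class that is reached only via reflection (the configuration class names are illustrative):</p>
<pre><code>import org.springframework.aot.hint.MemberCategory;
import org.springframework.aot.hint.RuntimeHints;
import org.springframework.aot.hint.RuntimeHintsRegistrar;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.ImportRuntimeHints;

// Registers reflection metadata so the GraalVM static analysis keeps
// the Quote constructors and public methods in the native image
@Configuration
@ImportRuntimeHints(QuoteRuntimeHints.class)
class QuoteNativeConfiguration { }

class QuoteRuntimeHints implements RuntimeHintsRegistrar {
  @Override
  public void registerHints(RuntimeHints hints, ClassLoader classLoader) {
    hints.reflection().registerType(Quote.class,
        MemberCategory.INVOKE_DECLARED_CONSTRUCTORS,
        MemberCategory.INVOKE_PUBLIC_METHODS);
  }
}
</code></pre>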
<p><strong>Slow build time + higher optimization → super-fast startup + lower resource consumption</strong></p>
<p><img src="/assets/images/Image2.png" alt="Image1" /></p>
<p><strong>Why use native images with GraalVM?</strong></p>
<p><img src="/assets/images/Image3.png" alt="Image1" /></p>
<p><strong>Any GraalVM trade-offs?</strong></p>
<p><img src="/assets/images/Image4.png" alt="Image1" /></p>
<p><strong>Warmup for peak performance with GraalVM</strong></p>
<p>Let’s note that peak performance was an additional trade-off of GraalVM usage, as there is no just-in-time optimization at runtime.</p>
<p>The recent release of the <a href="https://www.graalvm.org/downloads/">Oracle GraalVM</a> distribution under the GraalVM Free Terms and Conditions <a href="https://medium.com/graalvm/a-new-graalvm-release-and-new-free-license-4aab483692f5">license</a>
(see <a href="https://www.oracle.com/java/technologies/javase/jdk-faqs.html#GraalVM-licensing">restrictions</a>) addresses this aspect with the
introduction of <a href="https://www.graalvm.org/latest/reference-manual/native-image/guides/optimize-native-executable-with-pgo/">profile-guided optimization (PGO)</a>
for peak performance and the availability of <a href="https://www.graalvm.org/latest/reference-manual/native-image/optimizations-and-performance/MemoryManagement/#g1-garbage-collector">G1GC</a> in GraalVM.</p>
<p><strong>Note</strong> the startup time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes service</a> Native Java container image:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> docker run <span class="nt">--rm</span> <span class="nt">-p8080</span>:8083 quotes-native
Starting AOT-processed QuotesApplication using Java 21 with PID 1
...
Tomcat started on port 8083 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Started QuotesApplication <span class="k">in </span>0.392 seconds <span class="o">(</span>process running <span class="k">for </span>0.399<span class="o">)</span>
</code></pre></div></div>
<p>Observe the super-fast startup of the Quotes app, a 10x startup time improvement.
Note that the improvement would be even more substantial for real-world applications, due to the larger number of dependencies which can be optimized ahead of time.</p>
<p><strong>Native Images in Cloud Run</strong>
Native Java images generally benefit only to a small degree from enabling CPU Boost.
From a startup perspective, they could require as little as a single CPU, as startup is a plain container loading effort without any reflection, dynamic proxying, deserialization or other Java-specific operations.</p>
<p>The running container requires less memory at runtime than a JIT image, as fewer classes are loaded, and consumes less CPU as well, since no further optimizations are performed at runtime.</p>
<h2 id="strongjvm-checkpoint-and-restore-cracstrong"><strong>JVM Checkpoint and Restore (CRaC)</strong></h2>
<p>CRaC is an emerging OpenJDK project focused on improving runtime efficiency in Java applications.
It is based on <a href="https://github.com/checkpoint-restore/criu">CRIU</a>,
a project that implements checkpoint/restore functionality on Linux, which means CRaC images can run only on Linux.</p>
<p>Building a CRaC application follows the same compile process as JIT images. At runtime, the Java application is started normally, optionally executing a number of requests. At this point, a memory snapshot of the running Java application is triggered (checkpointed), stored to disk as a new image, then restored super-fast when the application is restarted. Thus, CRaC can save a significant amount of startup time, especially for large and complex applications.</p>
<p>As a developer, be aware that file descriptors, sockets and pools are objects in memory, which you have to gracefully close, then restore according to the CRaC lifecycle.</p>
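<p>In code, this is typically done by implementing the Resource callbacks from the org.crac library and registering with the global context. A minimal sketch, with an illustrative connection pool standing in for any resource holding open sockets or file descriptors:</p>
<pre><code>import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;

// Closes pooled connections before the checkpoint and reopens them
// after restore, following the CRaC lifecycle
public class PooledConnectionResource implements Resource {

  public PooledConnectionResource() {
    // register this resource with the global CRaC context
    Core.getGlobalContext().register(this);
  }

  @Override
  public void beforeCheckpoint(Context&lt;? extends Resource&gt; context) {
    // gracefully close sockets, file descriptors, pools before the snapshot
    closePool();
  }

  @Override
  public void afterRestore(Context&lt;? extends Resource&gt; context) {
    // re-establish connections after the snapshot is restored
    openPool();
  }

  private void closePool() { /* illustrative */ }
  private void openPool() { /* illustrative */ }
}
</code></pre>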
<p>To deploy your CRaC container image to Cloud Run, build the regular Docker image with a CRaC OpenJDK and start it (see the Quotes CRaC sample). Checkpoint the running app and store the resulting container image in Artifact Registry, then deploy it to Cloud Run. You can then update your production application configuration at restoration time in Cloud Run, preventing any potential leak of sensitive information, say your production database password. Let’s not forget that everything loaded in memory at startup will be serialized to the snapshot files!</p>
<p>CRaC OpenJDK is being developed by Azul, which has made it available under an open-source license. CRaC is not yet supported by all Java frameworks and libraries. Spring Boot, used for all samples in this blog post, introduced CRaC support with the 3.2 GA version.</p>
<p><strong>Fast build time + less initial optimization → super-fast startup + higher resource consumption</strong></p>
<p><img src="/assets/images/Image5.png" alt="Image1" /></p>
<p><strong>Why use Project CRaC?</strong></p>
<p><img src="/assets/images/Image6.png" alt="Image1" /></p>
<p><strong>Any Project CRaC trade-offs?</strong></p>
<p><img src="/assets/images/Image7.png" alt="Image1" /></p>
<p><strong>Warmup for peak performance with CRaC</strong></p>
<p>While peak performance of applications checkpointed/restored using CRaC follows the same process as JIT based applications,
with Hotspot optimization at runtime, instant peak performance is dependent on when the checkpoint has been taken.</p>
<p>Snapshots taken in a CI/CD pipeline may only capture web framework and app initialization.
Checkpointing the application after requests have been executed may allow the application to reach instant peak performance.</p>
<p>Checkpointing while secrets are held in memory can leak sensitive data, as the secrets will be serialized into the snapshot.
To mitigate this, checkpoint right after application startup and <a href="https://docs.spring.io/spring-framework/reference/6.1/integration/checkpoint-restore.html#_automatic_checkpointrestore_at_startup">refresh the context</a>.</p>
<p><strong>Observe</strong> the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac/README.md">Quotes service</a> CRaC Java image checkpoint process:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ ./checkpoint.sh

...
Using CRaC enabled JDK /bin/zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz
<span class="o">[</span>INFO] Building quotes 1.0.0
...
<span class="o">=&gt;</span> <span class="o">[</span>internal] load metadata <span class="k">for </span>docker.io/library/ubuntu:22.04                                                                              
<span class="o">=&gt;</span> https://.../zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz                                                        
<span class="o">=&gt;</span> <span class="o">[</span>1/6] FROM docker.io/library/ubuntu:22.04                                                                                             
<span class="o">=&gt;</span> CACHED <span class="o">[</span>2/6] ADD /bin/zulu21.28.89-ca-crac-jdk21.0.0-linux_aarch64.tar.gz
/opt/jdk/openjdk.tar.gz            
...
<span class="o">=&gt;</span> <span class="o">[</span>5/6] COPY target/quotes-crac-1.0.0.jar /opt/app/quotes-crac-1.0.0.jar                                                                
<span class="o">=&gt;</span> <span class="o">[</span>6/6] COPY src/scripts/entrypoint.sh /opt/app/entrypoint.sh                                                                           
<span class="o">=&gt;</span> exporting to image                                                                                                                    
<span class="o">=&gt;</span> <span class="o">=&gt;</span> exporting layers                                                                                                                   
<span class="o">=&gt;</span> <span class="o">=&gt;</span> writing image
<span class="o">=&gt;</span> <span class="o">=&gt;</span> naming to docker.io/library/quotes-crac:builder                                                                                    
...
Please <span class="nb">wait </span>during checkpoint creation...
sha256:1e48fde5ae0b39a4a458d659b34e491c9fe965fd757e0b861c61fe891a4796fb
Image: 28982acffecd
</code></pre></div></div>
<p>You can follow the checkpointing process and see the Java app being built, containerized, started then a new image being created during
checkpointing with a new Docker image digest.</p>
<p><strong>Note</strong> the startup time of the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac/README.md">Quotes service</a> CRaC Java image restore:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ ./restore.sh
Restarting Spring-managed lifecycle beans after JVM restore
Tomcat started on port 8080 <span class="o">(</span>http<span class="o">)</span> with context path <span class="s1">''</span>
Spring-managed lifecycle restart completed <span class="k">in </span>13 ms <span class="o">(</span>restored JVM running <span class="k">for </span>64 ms<span class="o">)</span>
Completed initialization <span class="k">in </span>2 ms
</code></pre></div></div>
<p>Observe that the startup is super-fast, similar to native images, orders of magnitude faster than JIT images.</p>
<p><strong>CRaC Images in Cloud Run</strong></p>
<p>CRaC Java images benefit to some degree from enabling CPU Boost in Cloud Run, depending on when the checkpoint was taken.</p>
<p>If the app has been checkpointed in a CI/CD pipeline, with the application context refreshed at runtime during startup, it will benefit from more CPU during context initialization. If the app has been checkpointed in a production environment, fully warmed up, CPU Boost would not yield any benefit (remember the security limitation above).</p>
<p>At runtime, after the startup phase, CRaC-based apps consume the same amount of CPU and memory as JIT images, as they run in a similar manner on a regular JVM.</p>
<h2 id="a-peek-into-the-future-of-openjdk-with-project-leyden">A peek into the future of OpenJDK with Project Leyden</h2>
<p>I compared two technologies to improve Java application runtime efficiency, each providing significant benefits, but also presenting non-trivial trade-offs. Is there a (future) alternative that would allow developers to balance static AOT with dynamic JIT?</p>
<p><a href="https://openjdk.org/projects/leyden/">Project Leyden</a> is a new OpenJDK project with the same goal of improving startup/warmup time and
lowering the footprint of Java applications. The focus of Project Leyden is to allow selectively shifting and constraining computation
(<a href="https://www.youtube.com/watch?v=O1Oz2-AXKKM">talk</a>, <a href="https://openjdk.org/projects/leyden/notes/03-toward-condensers">concepts</a>),
while employing the  concept of meaning preservation, i.e. the resulting image has the same meaning as the original, without side effects.</p>
<p>While the project is in its early stages, the Java Platform Group is looking into experimenting with and combining various optimization options.
One very promising “early” optimization is the combination of <a href="https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91">Class Data Sharing</a> (CDS) from the JDK
with <a href="https://docs.spring.io/spring-framework/reference/core/aot.html">Spring AOT</a>, into an <a href="https://github.com/openjdk/leyden/tree/premain/test/hotspot/jtreg/premain/javac_new_workflow">experiment</a> which shows a 15% startup improvement.</p>
<h2 id="additional-contributors-to-runtime-efficiency">Additional contributors to runtime efficiency</h2>
<p>Runtime efficiency is achieved from faster startup time, smaller container images and lower CPU and memory consumption (<a href="https://en.wikipedia.org/wiki/Resident_set_size#">RSS</a>).</p>
<p>Observe that Native Java images are smaller than CRaC images, while RSS memory consumption for Native Images is significantly lower than for CRaC images (which are on par with a regular JIT image).</p>
<pre><code class="language-shelll">❯ docker images | grep quotes
quotes                 latest          231MB
quotes-native          latest          198MB
quotes-crac            checkpoint      994MB

&gt; ps aux | grep quotes
USER     PID     RSS     COMMAND
&lt;user&gt;   47831   541964  java -jar target/quotes-1.0.0.jar
&lt;user&gt;   47901   143848  ./target/quotes
</code></pre>
<h2 id="a-summary-of-runtime-efficiency-optimization-options">A summary of runtime efficiency optimization options</h2>
<p>Let’s summarize all the concepts addressed in this blog post, starting with JIT images and the added optimization technologies available to you when running apps in Cloud Run.</p>
<p><img src="/assets/images/Image8.png" alt="Image1" /></p>
<h2 id="which-technology-should-you-use">Which technology should you use?</h2>
<p>Native Java with GraalVM, CRaC and Project Leyden all share the same goal: improving the runtime efficiency of Java applications running on scale-to-zero platforms, by improving start-up and warm-up time, and reducing resource consumption.</p>
<p><strong>Native Java with GraalVM</strong> and <strong>CRaC</strong> both offer excellent start-up performance,
typically up to 50x faster than the start-up time on a regular JVM. Peak performance with Java 21
and profile-guided optimization puts both solutions on par.</p>
<p><strong>Project Leyden</strong> is a future project that aims to combine the best of both Native Java and CRaC. Leyden will use a new intermediate representation that is designed to be both performant and easy to develop for.</p>
<p>In short:</p>
<ul>
<li><strong>GraalVM technology is production-ready at this time</strong>, with support across all major Java web frameworks with the lowest startup time, resource consumption and security attack surface</li>
<li><strong>CRaC is a very good emerging initiative</strong>, with full production readiness dependent on the availability of CRaC lifecycle support in various web frameworks and dependent libraries</li>
<li><strong>Project Leyden</strong> is a future project that has the potential to combine the best of both worlds: static ahead-of-time (AOT) with dynamic just-in-time (JIT)</li>
</ul>
<h2 id="next-steps">Next steps</h2>
<ul>
<li>Inspect the codebase for the Quotes service illustrated throughout this blog post,
build the <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/services/quotes/README.md">Quotes Native Java</a>
and <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main/runtimes/crac/quotes-crac">Quotes CRaC</a>
versions and deploy them to Cloud Run</li>
<li>Peek into the future with OpenJDK’s <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/java21/runtimes/project-leyden">Project Leyden</a></li>
<li>For a general overview of Java optimization in Cloud Run, watch the <a href="https://cloudonair.withgoogle.com/events/developer-stories-road-to-java-on-gcp-serverless">Developer Stories: Road to Java on GCP Serverless - What can trip you up?</a> session</li>
</ul>
<p>For any questions or feedback, feel free to contact me on Twitter/X <a href="https://twitter.com/ddobrin">@ddobrin</a>.</p>]]></content><author><name>Dan Dobrin</name></author><category term="native" /><category term="graalvm" /><category term="crac" /><category term="leyden" /><category term="java" /><category term="serverless" /><category term="cloudrun" /><category term="googlecloud" /><summary type="html"><![CDATA[A deep-dive comparison of GraalVM Native Images, CRaC checkpoint/restore, and Project Leyden for improving Java startup time and resource efficiency in serverless environments.]]></summary></entry></feed>