<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agent on XEDCZQ Blog</title><link>https://xedczq.cn/en/tags/agent/</link><description>Recent content in Agent on XEDCZQ Blog</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 22 May 2026 10:30:00 +0800</lastBuildDate><atom:link href="https://xedczq.cn/en/tags/agent/index.xml" rel="self" type="application/rss+xml"/><item><title>Agent_RAG Optimization</title><link>https://xedczq.cn/en/post/agent_rag%E4%BC%98%E5%8C%96/</link><pubDate>Fri, 22 May 2026 10:30:00 +0800</pubDate><guid>https://xedczq.cn/en/post/agent_rag%E4%BC%98%E5%8C%96/</guid><description>&lt;h1 id="rag-optimization-notes-first-person"&gt;&lt;a href="#rag-optimization-notes-first-person" class="header-anchor"&gt;&lt;/a&gt;RAG Optimization Notes (First-Person)
&lt;/h1&gt;&lt;p&gt;After reviewing recent RAG optimization materials, my conclusion is straightforward:&lt;/p&gt;
&lt;p&gt;The bottleneck of RAG is no longer &amp;ldquo;can it run,&amp;rdquo; but &amp;ldquo;can it hit reliably, stay controllable, and remain measurable in production.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;I now break RAG optimization into four layers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Pre-retrieval optimization (Query + Chunk)&lt;/li&gt;
&lt;li&gt;Retrieval-time optimization (Recall + Rank)&lt;/li&gt;
&lt;li&gt;Post-retrieval optimization (Context Packing + Compression)&lt;/li&gt;
&lt;li&gt;Production loop optimization (Evaluation + Feedback)&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="1-pre-retrieval-optimization-fix-input-and-corpus-quality-first"&gt;&lt;a href="#1-pre-retrieval-optimization-fix-input-and-corpus-quality-first" class="header-anchor"&gt;&lt;/a&gt;1) Pre-Retrieval Optimization: Fix Input and Corpus Quality First
&lt;/h2&gt;&lt;h3 id="what-i-focus-on"&gt;&lt;a href="#what-i-focus-on" class="header-anchor"&gt;&lt;/a&gt;What I focus on
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Semantic chunking&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;I no longer use fixed 300/500-token hard cuts.&lt;/li&gt;
&lt;li&gt;I chunk by semantic paragraphs, code boundaries, and heading hierarchy.&lt;/li&gt;
&lt;li&gt;My goal is to make each chunk self-contained and independently citable.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Query rewriting&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Normalize colloquial user questions into domain terms.&lt;/li&gt;
&lt;li&gt;Handle abbreviations, aliases, and typo normalization.&lt;/li&gt;
&lt;li&gt;Decompose complex questions into sub-queries.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;HyDE (Hypothetical Document Embeddings)&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Generate an &amp;ldquo;ideal answer draft&amp;rdquo; first.&lt;/li&gt;
&lt;li&gt;Retrieve using the draft embedding, not only the short user query.&lt;/li&gt;
&lt;li&gt;I treat HyDE as a recall-boost switch, enabled only in low-recall scenarios.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="my-assessment"&gt;&lt;a href="#my-assessment" class="header-anchor"&gt;&lt;/a&gt;My assessment
&lt;/h3&gt;&lt;p&gt;If pre-retrieval is weak, reranking/compression/caching are mostly damage control.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="2-retrieval-time-optimization-multi-path-recall--rerank-not-vector-only"&gt;&lt;a href="#2-retrieval-time-optimization-multi-path-recall--rerank-not-vector-only" class="header-anchor"&gt;&lt;/a&gt;2) Retrieval-Time Optimization: Multi-Path Recall + Rerank, Not Vector-Only
&lt;/h2&gt;&lt;h3 id="my-current-approach"&gt;&lt;a href="#my-current-approach" class="header-anchor"&gt;&lt;/a&gt;My current approach
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Hybrid search&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Dense vectors for semantic recall.&lt;/li&gt;
&lt;li&gt;Sparse retrieval (BM25/keywords) to recover exact-match cases.&lt;/li&gt;
&lt;li&gt;Fuse results before reranking.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Two-stage ranking (Recall L1 -&amp;gt; Rank L2)&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Stage 1 maximizes recall (better to over-fetch).&lt;/li&gt;
&lt;li&gt;Stage 2 reranker narrows to top-k precision.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Cross-encoder / API rerank&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Score query-doc pairs directly.&lt;/li&gt;
&lt;li&gt;More stable than pure embedding similarity, especially on long chunks.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="my-assessment-1"&gt;&lt;a href="#my-assessment-1" class="header-anchor"&gt;&lt;/a&gt;My assessment
&lt;/h3&gt;&lt;p&gt;In production, the issue is often not &amp;ldquo;nothing found,&amp;rdquo; but &amp;ldquo;too many low-precision hits.&amp;rdquo; Rerank is not optional; it is a quality gate.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="3-post-retrieval-optimization-turn-context-into-high-density-evidence"&gt;&lt;a href="#3-post-retrieval-optimization-turn-context-into-high-density-evidence" class="header-anchor"&gt;&lt;/a&gt;3) Post-Retrieval Optimization: Turn Context into High-Density Evidence
&lt;/h2&gt;&lt;h3 id="three-things-i-optimize"&gt;&lt;a href="#three-things-i-optimize" class="header-anchor"&gt;&lt;/a&gt;Three things I optimize
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Evidence compression&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Rerank first, then compress.&lt;/li&gt;
&lt;li&gt;Remove weakly relevant sentences, template noise, and duplicates.&lt;/li&gt;
&lt;li&gt;Keep entities, numbers, and conclusion-bearing sentences.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Context packing strategy&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Do not concatenate by raw retrieval order.&lt;/li&gt;
&lt;li&gt;Repack by &amp;ldquo;question sub-intent -&amp;gt; evidence groups.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Tag each evidence block with source IDs for traceability.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Cache-friendly prompt assembly&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Place stable system prefixes and static background first.&lt;/li&gt;
&lt;li&gt;Maximize prefix reuse and cache hit rate (cost + latency benefits).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="my-assessment-2"&gt;&lt;a href="#my-assessment-2" class="header-anchor"&gt;&lt;/a&gt;My assessment
&lt;/h3&gt;&lt;p&gt;RAG cost is often dominated not by retrieval itself, but by sending low-value context to the LLM. Post-retrieval refinement is one of the most direct cost levers.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="4-production-loop-optimization-make-rag-a-system-not-a-demo"&gt;&lt;a href="#4-production-loop-optimization-make-rag-a-system-not-a-demo" class="header-anchor"&gt;&lt;/a&gt;4) Production Loop Optimization: Make RAG a System, Not a Demo
&lt;/h2&gt;&lt;h3 id="my-evaluation-perspective"&gt;&lt;a href="#my-evaluation-perspective" class="header-anchor"&gt;&lt;/a&gt;My evaluation perspective
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Retrieval-layer metrics&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Recall@k&lt;/li&gt;
&lt;li&gt;MRR / nDCG&lt;/li&gt;
&lt;li&gt;Hit-rate buckets (short query / long query / code query)&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Generation-layer metrics&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Faithfulness (is the answer grounded in evidence?)&lt;/li&gt;
&lt;li&gt;Answer relevance (does it answer the actual question?)&lt;/li&gt;
&lt;li&gt;Context precision (how much retrieved context is truly useful?)&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;System-layer metrics&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;P95 latency&lt;/li&gt;
&lt;li&gt;Per-query token cost&lt;/li&gt;
&lt;li&gt;Cache hit rate&lt;/li&gt;
&lt;li&gt;Fallback-routing ratio (needs backup retrieval/web search)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="my-feedback-loop"&gt;&lt;a href="#my-feedback-loop" class="header-anchor"&gt;&lt;/a&gt;My feedback loop
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;User query -&amp;gt; recall -&amp;gt; rerank -&amp;gt; generate answer&lt;/li&gt;
&lt;li&gt;Evaluator scores answer and evidence automatically&lt;/li&gt;
&lt;li&gt;Low-score samples flow into a hard-case dataset&lt;/li&gt;
&lt;li&gt;Weekly regression over retrieval params, chunking policy, and reranker setup&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="vendorframework-recommendations-i-use-as-baseline"&gt;&lt;a href="#vendorframework-recommendations-i-use-as-baseline" class="header-anchor"&gt;&lt;/a&gt;Vendor/Framework Recommendations I Use as Baseline
&lt;/h2&gt;&lt;p&gt;I prioritize official vendor/framework docs over second-hand summaries.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Microsoft Learn: &lt;a class="link" href="https://learn.microsoft.com/en-us/azure/developer/ai/advanced-retrieval-augmented-generation" target="_blank" rel="noopener"
 &gt;Build Advanced Retrieval-Augmented Generation Systems&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;End-to-end advanced RAG workflow&lt;/li&gt;
&lt;li&gt;Strong emphasis on query rewriting, post-retrieval processing, and evaluation loops&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Azure Architecture Center: &lt;a class="link" href="https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/rag/rag-information-retrieval" target="_blank" rel="noopener"
 &gt;Develop a RAG Solution—Information-Retrieval Phase&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Systematic retrieval-phase guidance&lt;/li&gt;
&lt;li&gt;Explicitly covers query augmentation/decomposition/rewriting/HyDE&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Anthropic Engineering: &lt;a class="link" href="https://www.anthropic.com/engineering/contextual-retrieval" target="_blank" rel="noopener"
 &gt;Contextual Retrieval&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Practical guidance on hybrid retrieval and context utilization&lt;/li&gt;
&lt;li&gt;Clearly addresses &amp;ldquo;retrieved is not equal to used correctly&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Anthropic Help: &lt;a class="link" href="https://support.anthropic.com/en/articles/11473015-retrieval-augmented-generation-rag-for-projects" target="_blank" rel="noopener"
 &gt;Retrieval Augmented Generation (RAG) for Projects&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Checklist-oriented practical recommendations for productization&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Cohere Docs: &lt;a class="link" href="https://docs.cohere.com/docs/reranking-best-practices" target="_blank" rel="noopener"
 &gt;Best Practices for using Rerank&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Practical rerank guidance for input organization and deployment&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Paper: &lt;a class="link" href="https://arxiv.org/abs/2307.03172" target="_blank" rel="noopener"
 &gt;Lost in the Middle&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Evidence for middle-context utilization degradation&lt;/li&gt;
&lt;li&gt;Supports the need for reranking, compression, and packing&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="7"&gt;
&lt;li&gt;Paper: &lt;a class="link" href="https://arxiv.org/abs/2005.11401" target="_blank" rel="noopener"
 &gt;RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Foundational retrieval+generation paradigm&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="how-i-integrate-these-optimizations-into-real-ai-application-iteration"&gt;&lt;a href="#how-i-integrate-these-optimizations-into-real-ai-application-iteration" class="header-anchor"&gt;&lt;/a&gt;How I Integrate These Optimizations into Real AI Application Iteration
&lt;/h2&gt;&lt;p&gt;I run a weekly optimization loop:&lt;/p&gt;
&lt;h3 id="step-0-define-scenario-buckets-and-baseline"&gt;&lt;a href="#step-0-define-scenario-buckets-and-baseline" class="header-anchor"&gt;&lt;/a&gt;Step 0: Define scenario buckets and baseline
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Build 100–300 real QA samples (bucketed by scenario).&lt;/li&gt;
&lt;li&gt;Record baseline: retrieval hit quality, answer quality, latency, and cost.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="step-1-change-only-one-variable-per-iteration"&gt;&lt;a href="#step-1-change-only-one-variable-per-iteration" class="header-anchor"&gt;&lt;/a&gt;Step 1: Change only one variable per iteration
&lt;/h3&gt;&lt;p&gt;I modify one parameter at a time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Chunking policy&lt;/li&gt;
&lt;li&gt;Query rewriting switch&lt;/li&gt;
&lt;li&gt;Hybrid fusion weights&lt;/li&gt;
&lt;li&gt;Reranker model/threshold&lt;/li&gt;
&lt;li&gt;Context compression ratio&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This avoids confounded results.&lt;/p&gt;
&lt;h3 id="step-2-pass-offline-evaluation-first"&gt;&lt;a href="#step-2-pass-offline-evaluation-first" class="header-anchor"&gt;&lt;/a&gt;Step 2: Pass offline evaluation first
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;No offline pass, no online rollout.&lt;/li&gt;
&lt;li&gt;I check three dimensions: quality gain, latency impact, cost impact.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="step-3-online-canary-with-rollback-thresholds"&gt;&lt;a href="#step-3-online-canary-with-rollback-thresholds" class="header-anchor"&gt;&lt;/a&gt;Step 3: Online canary with rollback thresholds
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Roll out on small traffic.&lt;/li&gt;
&lt;li&gt;Set automatic rollback thresholds (P95, complaint rate, empty-answer rate).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="step-4-convert-wins-into-engineering-assets"&gt;&lt;a href="#step-4-convert-wins-into-engineering-assets" class="header-anchor"&gt;&lt;/a&gt;Step 4: Convert wins into engineering assets
&lt;/h3&gt;&lt;p&gt;I persist proven improvements into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Retrieval config templates&lt;/li&gt;
&lt;li&gt;Prompt/context assembly conventions&lt;/li&gt;
&lt;li&gt;RAG regression scripts&lt;/li&gt;
&lt;li&gt;Failure case datasets and labeling rules&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="my-conclusion"&gt;&lt;a href="#my-conclusion" class="header-anchor"&gt;&lt;/a&gt;My Conclusion
&lt;/h2&gt;&lt;p&gt;My final view on RAG optimization:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Pre-retrieval defines the ceiling (is the question represented correctly?)&lt;/li&gt;
&lt;li&gt;Retrieval-time defines hit quality (are we finding the right evidence?)&lt;/li&gt;
&lt;li&gt;Post-retrieval defines cost and usability (is high-density evidence delivered to the LLM?)&lt;/li&gt;
&lt;li&gt;Production loop defines sustainability (can quality keep improving?)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;One-line summary:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;RAG optimization is not &amp;#34;just tune model parameters&amp;#34;; it is engineering governance across retrieval, reranking, context construction, evaluation, and feedback.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description></item><item><title>Agent_Context Engineering</title><link>https://xedczq.cn/en/post/agent_%E4%B8%8A%E4%B8%8B%E6%96%87%E5%B7%A5%E7%A8%8B/</link><pubDate>Tue, 19 May 2026 16:35:00 +0800</pubDate><guid>https://xedczq.cn/en/post/agent_%E4%B8%8A%E4%B8%8B%E6%96%87%E5%B7%A5%E7%A8%8B/</guid><description>&lt;h1 id="what-context-engineering-is"&gt;&lt;a href="#what-context-engineering-is" class="header-anchor"&gt;&lt;/a&gt;What Context Engineering Is
&lt;/h1&gt;&lt;p&gt;Context engineering can be defined as:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Injecting the &amp;ldquo;just-enough and highly relevant&amp;rdquo; information at every agent step, while continuously managing the lifecycle of that information.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If prompt engineering focuses on &amp;ldquo;how to phrase the task,&amp;rdquo; context engineering focuses on &amp;ldquo;what information to provide, in what order, and when to prune or rebuild it.&amp;rdquo;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="phase-1-passive-truncation-and-sliding-window-20202022--every-token-counts"&gt;&lt;a href="#phase-1-passive-truncation-and-sliding-window-20202022--every-token-counts" class="header-anchor"&gt;&lt;/a&gt;Phase 1: Passive Truncation and Sliding Window (2020–2022) — &amp;ldquo;Every Token Counts&amp;rdquo;
&lt;/h2&gt;&lt;h3 id="typical-characteristics"&gt;&lt;a href="#typical-characteristics" class="header-anchor"&gt;&lt;/a&gt;Typical Characteristics
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Context windows were generally small, and tokens were highly constrained.&lt;/li&gt;
&lt;li&gt;The default strategy was &amp;ldquo;truncate when over limit.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;A common implementation was sliding window (keep only the latest N turns).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="what-it-solved"&gt;&lt;a href="#what-it-solved" class="header-anchor"&gt;&lt;/a&gt;What It Solved
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Prevented immediate failure from overlong input.&lt;/li&gt;
&lt;li&gt;Preserved recent interaction and basic multi-turn continuity.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="core-problems"&gt;&lt;a href="#core-problems" class="header-anchor"&gt;&lt;/a&gt;Core Problems
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Early critical information was often dropped.&lt;/li&gt;
&lt;li&gt;Goal drift was severe in long tasks.&lt;/li&gt;
&lt;li&gt;Historical state could not be inherited reliably.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="phase-2-external-topology-introduction-20212023--the-birth-of-an-external-brain-rag"&gt;&lt;a href="#phase-2-external-topology-introduction-20212023--the-birth-of-an-external-brain-rag" class="header-anchor"&gt;&lt;/a&gt;Phase 2: External Topology Introduction (2021–2023) — &amp;ldquo;The Birth of an External Brain (RAG)&amp;rdquo;
&lt;/h2&gt;&lt;h3 id="typical-characteristics-1"&gt;&lt;a href="#typical-characteristics-1" class="header-anchor"&gt;&lt;/a&gt;Typical Characteristics
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;The paradigm shifted from &amp;ldquo;stuff everything into context&amp;rdquo; to &amp;ldquo;retrieve on demand then inject.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Vector retrieval and semantic recall became mainstream.&lt;/li&gt;
&lt;li&gt;RAG decoupled parametric knowledge from external knowledge.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="what-it-solved-1"&gt;&lt;a href="#what-it-solved-1" class="header-anchor"&gt;&lt;/a&gt;What It Solved
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Broke through the memory ceiling of single-window context.&lt;/li&gt;
&lt;li&gt;Reduced hallucinations by grounding responses with retrievable evidence.&lt;/li&gt;
&lt;li&gt;Enabled knowledge updates without retraining the model.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="core-problems-1"&gt;&lt;a href="#core-problems-1" class="header-anchor"&gt;&lt;/a&gt;Core Problems
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Retrieval quality remained unstable (missed recall, wrong recall).&lt;/li&gt;
&lt;li&gt;Attention dilution still occurred after retrieval chunks were merged.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Retrieved&amp;rdquo; did not necessarily mean &amp;ldquo;used correctly by the model.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="phase-3-fine-grained-compression-and-reordering-20232024--addressing-the-lost-in-the-middle-problem"&gt;&lt;a href="#phase-3-fine-grained-compression-and-reordering-20232024--addressing-the-lost-in-the-middle-problem" class="header-anchor"&gt;&lt;/a&gt;Phase 3: Fine-Grained Compression and Reordering (2023–2024) — &amp;ldquo;Addressing the Lost-in-the-Middle Problem&amp;rdquo;
&lt;/h2&gt;&lt;h3 id="typical-characteristics-2"&gt;&lt;a href="#typical-characteristics-2" class="header-anchor"&gt;&lt;/a&gt;Typical Characteristics
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;The community began to systematically focus on long-context utilization.&lt;/li&gt;
&lt;li&gt;Research and engineering attention increased around the Lost-in-the-Middle effect.&lt;/li&gt;
&lt;li&gt;Strategy evolved from &amp;ldquo;adding more context&amp;rdquo; to &amp;ldquo;compressing, reordering, and layered memory.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="common-methods"&gt;&lt;a href="#common-methods" class="header-anchor"&gt;&lt;/a&gt;Common Methods
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;History summarization (state snapshot / handoff summary)&lt;/li&gt;
&lt;li&gt;Tool-output pruning (keep recent critical rounds)&lt;/li&gt;
&lt;li&gt;Information reordering (place highest-priority evidence near strong attention zones)&lt;/li&gt;
&lt;li&gt;Task segmentation and stage-wise handoff&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="what-it-solved-2"&gt;&lt;a href="#what-it-solved-2" class="header-anchor"&gt;&lt;/a&gt;What It Solved
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Reduced middle-section information neglect.&lt;/li&gt;
&lt;li&gt;Improved long-task state continuity.&lt;/li&gt;
&lt;li&gt;Made cross-window agent execution more controllable.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="core-problems-2"&gt;&lt;a href="#core-problems-2" class="header-anchor"&gt;&lt;/a&gt;Core Problems
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Compression summaries could introduce information loss.&lt;/li&gt;
&lt;li&gt;Reordering rules were task-dependent and hard to generalize.&lt;/li&gt;
&lt;li&gt;Evaluation was required to verify post-compression executability.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="phase-4-ultra-long-context-and-infrastructure-caching-20242026-current--kv-cache-and-intelligent-memory"&gt;&lt;a href="#phase-4-ultra-long-context-and-infrastructure-caching-20242026-current--kv-cache-and-intelligent-memory" class="header-anchor"&gt;&lt;/a&gt;Phase 4: Ultra-Long Context and Infrastructure Caching (2024–2026, Current) — &amp;ldquo;KV Cache and Intelligent Memory&amp;rdquo;
&lt;/h2&gt;&lt;h3 id="typical-characteristics-3"&gt;&lt;a href="#typical-characteristics-3" class="header-anchor"&gt;&lt;/a&gt;Typical Characteristics
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Context windows continued to expand.&lt;/li&gt;
&lt;li&gt;Vendors and frameworks introduced stronger cache/reuse mechanisms.&lt;/li&gt;
&lt;li&gt;Agent systems moved from &amp;ldquo;context management&amp;rdquo; to &amp;ldquo;context infrastructure.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="common-capabilities"&gt;&lt;a href="#common-capabilities" class="header-anchor"&gt;&lt;/a&gt;Common Capabilities
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Prompt/prefix caching (reducing repeated token cost)&lt;/li&gt;
&lt;li&gt;Session state snapshots and resume&lt;/li&gt;
&lt;li&gt;Multi-layer memory architecture (short-term working memory + long-term external memory)&lt;/li&gt;
&lt;li&gt;Policy-based dynamic context construction&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="what-it-solved-3"&gt;&lt;a href="#what-it-solved-3" class="header-anchor"&gt;&lt;/a&gt;What It Solved
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Lowered long-chain cost and latency.&lt;/li&gt;
&lt;li&gt;Improved continuity in long-running tasks.&lt;/li&gt;
&lt;li&gt;Made memory management governable as an engineering subsystem.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="core-problems-3"&gt;&lt;a href="#core-problems-3" class="header-anchor"&gt;&lt;/a&gt;Core Problems
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Cost and system complexity increased.&lt;/li&gt;
&lt;li&gt;Memory contamination and stale-information governance became harder.&lt;/li&gt;
&lt;li&gt;Strong observability was required to diagnose context failure points.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="representative-industry-articles-and-references"&gt;&lt;a href="#representative-industry-articles-and-references" class="header-anchor"&gt;&lt;/a&gt;Representative Industry Articles and References
&lt;/h2&gt;&lt;p&gt;Below are high-value public references for context engineering:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" target="_blank" rel="noopener"
 &gt;Effective context engineering for AI agents&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Clearly positions context engineering as the natural extension of prompt engineering.&lt;/li&gt;
&lt;li&gt;Emphasizes that reliability bottlenecks in agents are often in context construction, not single prompts.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://www.anthropic.com/research/prompting-long-context" target="_blank" rel="noopener"
 &gt;Prompt engineering for Claude&amp;rsquo;s long context window&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Early long-context practice guidance with concrete input-structuring patterns.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Anthropic Docs: &lt;a class="link" href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips" target="_blank" rel="noopener"
 &gt;Long context prompting tips&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Practical implementation checklist style guidance.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;LangChain Docs: &lt;a class="link" href="https://docs.langchain.com/oss/python/langchain/context-engineering" target="_blank" rel="noopener"
 &gt;Context engineering in agents&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Implementation-oriented strategies for what to inject at each agent step.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Paper: &lt;a class="link" href="https://arxiv.org/abs/2307.03172" target="_blank" rel="noopener"
 &gt;Lost in the Middle: How Language Models Use Long Contexts&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Provides systematic evidence for degraded utilization of middle context.&lt;/li&gt;
&lt;li&gt;Directly influenced later compression/reordering practices.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Foundational RAG Paper: &lt;a class="link" href="https://arxiv.org/abs/2005.11401" target="_blank" rel="noopener"
 &gt;Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Established the mainstream retrieval+generation paradigm.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="what-problems-context-engineering-solves"&gt;&lt;a href="#what-problems-context-engineering-solves" class="header-anchor"&gt;&lt;/a&gt;What Problems Context Engineering Solves
&lt;/h2&gt;&lt;p&gt;This can be summarized into 6 core problem classes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Information selection&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Not all data should be provided; only context relevant to the current step.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Memory continuity&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Keep long tasks continuous across turns, windows, and sessions.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Cost and performance&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Control token spend, latency, and throughput by reducing low-value context.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Reliability&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Reduce missed evidence, state misreads, and repeated failed attempts.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Governance&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Make context policies (compression/retrieval/reordering) configurable, measurable, and iteratable.&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Toolchain coordination&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Integrate context with RAG, caching, state machines, and orchestration systems.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One-line summary:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Context engineering is not about whether a model can answer once; it is about whether it can keep answering correctly, consistently, and cost-effectively in complex workflows.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id="my-practical-conclusion"&gt;&lt;a href="#my-practical-conclusion" class="header-anchor"&gt;&lt;/a&gt;My Practical Conclusion
&lt;/h2&gt;&lt;p&gt;For agent projects, a pragmatic build order is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Start with prompt engineering (clear task contract)&lt;/li&gt;
&lt;li&gt;Then add context engineering (information lifecycle management)&lt;/li&gt;
&lt;li&gt;Finally implement harness engineering (end-to-end execution loop)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you only do prompt engineering, long tasks remain fragile. If you skip context engineering and jump directly to harness engineering, complexity increases quickly and debugging becomes expensive.&lt;/p&gt;</description></item><item><title>Agent_Prompt Engineering</title><link>https://xedczq.cn/en/post/agent_%E6%8F%90%E7%A4%BA%E8%AF%8D%E5%B7%A5%E7%A8%8B/</link><pubDate>Tue, 19 May 2026 16:20:00 +0800</pubDate><guid>https://xedczq.cn/en/post/agent_%E6%8F%90%E7%A4%BA%E8%AF%8D%E5%B7%A5%E7%A8%8B/</guid><description>&lt;h1 id="what-prompt-engineering-is"&gt;&lt;a href="#what-prompt-engineering-is" class="header-anchor"&gt;&lt;/a&gt;What Prompt Engineering Is
&lt;/h1&gt;&lt;p&gt;Prompt engineering is essentially:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Designing input structure (instructions, context, examples, and output constraints) to improve model output quality, stability, and usability.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;At an early stage, this was mainly a “single-call optimization” problem:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How to reduce model drift for the same question&lt;/li&gt;
&lt;li&gt;How to force structured output for programmatic integration&lt;/li&gt;
&lt;li&gt;How to make the model focus on the most relevant information under limited context&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One-line view:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Prompt engineering = translating natural-language requirements into stable, executable model input contracts
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="what-early-prompt-engineering-tried-to-solve"&gt;&lt;a href="#what-early-prompt-engineering-tried-to-solve" class="header-anchor"&gt;&lt;/a&gt;What Early Prompt Engineering Tried to Solve
&lt;/h2&gt;&lt;p&gt;In early LLM usage, the main pain points were direct:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Unstable outputs&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Same input, varying output quality across runs&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Inconsistent instruction following&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Missing constraints, skipped steps, or task boundary drift&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Uncontrolled output format&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Hard to reliably produce JSON/table/structured fields&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Hallucination and fabrication&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Models tend to fill gaps with invented facts&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;High engineering integration cost&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Hard to plug responses into automated pipelines (parse/store/invoke)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The real value of prompt engineering was turning “probabilistic conversation behavior” into “repeatable invocation behavior.”&lt;/p&gt;
&lt;h2 id="typical-methods-in-prompt-engineering"&gt;&lt;a href="#typical-methods-in-prompt-engineering" class="header-anchor"&gt;&lt;/a&gt;Typical Methods in Prompt Engineering
&lt;/h2&gt;&lt;h3 id="1-instruction-clarification"&gt;&lt;a href="#1-instruction-clarification" class="header-anchor"&gt;&lt;/a&gt;1. Instruction Clarification
&lt;/h3&gt;&lt;p&gt;Break tasks into explicit actions and avoid vague intent.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You are a backend code review assistant.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Goal: identify concurrency safety issues.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Scope: only check src/service/*.java.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Output: return a Markdown table with columns risk_level/file_path/fix_suggestion.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="2-structured-constraints"&gt;&lt;a href="#2-structured-constraints" class="header-anchor"&gt;&lt;/a&gt;2. Structured Constraints
&lt;/h3&gt;&lt;p&gt;Define a fixed output schema to reduce “looks good but unusable” responses.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;risk_level&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;high|medium|low&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;file&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;issue&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;fix&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="3-few-shot-examples"&gt;&lt;a href="#3-few-shot-examples" class="header-anchor"&gt;&lt;/a&gt;3. Few-shot Examples
&lt;/h3&gt;&lt;p&gt;Provide 1-3 high-quality examples to improve style consistency and task alignment.&lt;/p&gt;
&lt;h3 id="4-role-and-boundary-control"&gt;&lt;a href="#4-role-and-boundary-control" class="header-anchor"&gt;&lt;/a&gt;4. Role and Boundary Control
&lt;/h3&gt;&lt;p&gt;State what the model can and cannot do, especially no guessing.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;If evidence is insufficient, return &amp;#34;insufficient information&amp;#34; and do not fabricate.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="5-iterative-tuning"&gt;&lt;a href="#5-iterative-tuning" class="header-anchor"&gt;&lt;/a&gt;5. Iterative Tuning
&lt;/h3&gt;&lt;p&gt;Treat prompts like code: version, test, and refine.&lt;/p&gt;
&lt;h2 id="how-to-use-it-in-real-development-executable-workflow"&gt;&lt;a href="#how-to-use-it-in-real-development-executable-workflow" class="header-anchor"&gt;&lt;/a&gt;How to Use It in Real Development (Executable Workflow)
&lt;/h2&gt;&lt;h3 id="step-0-define-the-task-interface-first"&gt;&lt;a href="#step-0-define-the-task-interface-first" class="header-anchor"&gt;&lt;/a&gt;Step 0: Define the Task Interface First
&lt;/h3&gt;&lt;p&gt;Define clearly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What the input is&lt;/li&gt;
&lt;li&gt;Who consumes the output (human/program)&lt;/li&gt;
&lt;li&gt;What qualifies as acceptable output&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is essentially defining an API contract for prompts.&lt;/p&gt;
&lt;h3 id="step-1-use-prompt-templates-not-one-off-writing"&gt;&lt;a href="#step-1-use-prompt-templates-not-one-off-writing" class="header-anchor"&gt;&lt;/a&gt;Step 1: Use Prompt Templates, Not One-off Writing
&lt;/h3&gt;&lt;p&gt;Use a stable template:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Role&lt;/li&gt;
&lt;li&gt;Goal&lt;/li&gt;
&lt;li&gt;Input&lt;/li&gt;
&lt;li&gt;Constraints&lt;/li&gt;
&lt;li&gt;Output format&lt;/li&gt;
&lt;li&gt;Failure handling rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Role]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You are a senior frontend reviewer.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Goal]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Check whether the following PR diff contains accessibility issues.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Input]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;{{DIFF_CONTENT}}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Constraints]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Judge only based on the provided diff
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Do not infer unprovided code
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Output Format]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;JSON array: [{&amp;#34;severity&amp;#34;:&amp;#34;&amp;#34;,&amp;#34;file&amp;#34;:&amp;#34;&amp;#34;,&amp;#34;issue&amp;#34;:&amp;#34;&amp;#34;,&amp;#34;fix&amp;#34;:&amp;#34;&amp;#34;}]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Failure Handling]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;If evidence is insufficient, return an empty array and include a reason field.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="step-2-add-automatic-evaluation-to-prompts"&gt;&lt;a href="#step-2-add-automatic-evaluation-to-prompts" class="header-anchor"&gt;&lt;/a&gt;Step 2: Add Automatic Evaluation to Prompts
&lt;/h3&gt;&lt;p&gt;Do not rely only on manual reading. At least run:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Format checks: JSON parsable, required fields present&lt;/li&gt;
&lt;li&gt;Quality checks: key constraints satisfied (e.g. &lt;code&gt;file&lt;/code&gt; and &lt;code&gt;fix&lt;/code&gt; must exist)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="step-3-feed-failure-samples-back-into-prompt-design"&gt;&lt;a href="#step-3-feed-failure-samples-back-into-prompt-design" class="header-anchor"&gt;&lt;/a&gt;Step 3: Feed Failure Samples Back into Prompt Design
&lt;/h3&gt;&lt;p&gt;Convert typical failures into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;New constraints&lt;/li&gt;
&lt;li&gt;New examples&lt;/li&gt;
&lt;li&gt;New counter-examples&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the core learning loop in prompt engineering.&lt;/p&gt;
&lt;h3 id="step-4-split-prompts-by-scenario"&gt;&lt;a href="#step-4-split-prompts-by-scenario" class="header-anchor"&gt;&lt;/a&gt;Step 4: Split Prompts by Scenario
&lt;/h3&gt;&lt;p&gt;Do not expect one mega-prompt to cover all tasks. Split by function:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Information extraction prompt&lt;/li&gt;
&lt;li&gt;Code review prompt&lt;/li&gt;
&lt;li&gt;Planning prompt&lt;/li&gt;
&lt;li&gt;Generation prompt&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This improves stability and testability.&lt;/p&gt;
&lt;h2 id="limits-of-prompt-engineering-alone"&gt;&lt;a href="#limits-of-prompt-engineering-alone" class="header-anchor"&gt;&lt;/a&gt;Limits of Prompt Engineering Alone
&lt;/h2&gt;&lt;p&gt;Prompt engineering is effective, but has natural boundaries, especially in agent/long-running development:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Limited memory management&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Prompt tuning optimizes “how to ask now,” not “how to manage multi-turn state”&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Long-context degradation&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;As history grows, prompt constraints alone cannot solve token/attention dilution&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Weak state continuity&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;After interruption, a single prompt cannot reliably restore full task state&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;No execution loop by itself&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;A prompt can say “run tests,” but that does not guarantee tests are executed, logs collected, and state updated&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;No system-level governance&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;It cannot alone solve tool orchestration, failure recovery, observability, and quality gates&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="why-it-evolved-into-context-engineering"&gt;&lt;a href="#why-it-evolved-into-context-engineering" class="header-anchor"&gt;&lt;/a&gt;Why It Evolved into Context Engineering
&lt;/h2&gt;&lt;p&gt;Once tasks evolved from Q&amp;amp;A to continuous development, the key problems became:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What history to keep&lt;/li&gt;
&lt;li&gt;When to compress history&lt;/li&gt;
&lt;li&gt;How to retrieve and refill old information&lt;/li&gt;
&lt;li&gt;How to hand off state without loss across context windows&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is the scope of context engineering:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Prompt engineering focuses on: how to express tasks
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Context engineering focuses on: how to manage task history and state
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="why-it-further-evolved-into-harness-engineering"&gt;&lt;a href="#why-it-further-evolved-into-harness-engineering" class="header-anchor"&gt;&lt;/a&gt;Why It Further Evolved into Harness Engineering
&lt;/h2&gt;&lt;p&gt;Even with prompt + context engineering, a larger challenge remains:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How to make agents reliably deliver in real engineering workflows.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That requires system capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Toolchain orchestration (lint/test/build/deploy)&lt;/li&gt;
&lt;li&gt;Quality gates and automatic verification&lt;/li&gt;
&lt;li&gt;Failure recovery and retry strategies&lt;/li&gt;
&lt;li&gt;Task scheduling and state tracking&lt;/li&gt;
&lt;li&gt;Rule accumulation and observability&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is the scope of harness engineering:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Harness engineering = assembling prompt, context, tools, checks, and workflow into a sustainable delivery system
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="relationship-among-the-three"&gt;&lt;a href="#relationship-among-the-three" class="header-anchor"&gt;&lt;/a&gt;Relationship Among the Three
&lt;/h2&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Dimension&lt;/th&gt;
 &lt;th&gt;Prompt Engineering&lt;/th&gt;
 &lt;th&gt;Context Engineering&lt;/th&gt;
 &lt;th&gt;Harness Engineering&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Core question&lt;/td&gt;
 &lt;td&gt;How to improve single-call output&lt;/td&gt;
 &lt;td&gt;How to manage multi-turn memory and state&lt;/td&gt;
 &lt;td&gt;How to make end-to-end delivery stable&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Main object&lt;/td&gt;
 &lt;td&gt;Single input text&lt;/td&gt;
 &lt;td&gt;History, summaries, retrieval, state&lt;/td&gt;
 &lt;td&gt;Toolchains, rules, validation, orchestration&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Typical artifact&lt;/td&gt;
 &lt;td&gt;Prompt templates&lt;/td&gt;
 &lt;td&gt;State snapshots, compression summaries, memory layers&lt;/td&gt;
 &lt;td&gt;Agent workflows, check loops, runtime policies&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Main failure point&lt;/td&gt;
 &lt;td&gt;Drift in long tasks&lt;/td&gt;
 &lt;td&gt;Lacks execution/governance&lt;/td&gt;
 &lt;td&gt;Higher implementation cost, but highest stability&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="my-practical-conclusion"&gt;&lt;a href="#my-practical-conclusion" class="header-anchor"&gt;&lt;/a&gt;My Practical Conclusion
&lt;/h2&gt;&lt;p&gt;Prompt engineering is not outdated. It is the foundational layer.&lt;/p&gt;
&lt;p&gt;In real development, a practical sequence is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Stabilize prompt engineering first (stable input/output)&lt;/li&gt;
&lt;li&gt;Add context engineering next (handle long-running memory)&lt;/li&gt;
&lt;li&gt;Build harness engineering last (close the system loop for stable delivery)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you jump directly to harness while prompt quality is unstable, complexity rises quickly and failures become harder to debug. If you only do prompt engineering, long-running development remains fragile.&lt;/p&gt;
&lt;h2 id="references"&gt;&lt;a href="#references" class="header-anchor"&gt;&lt;/a&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;a class="link" href="https://platform.openai.com/docs/guides/prompting" target="_blank" rel="noopener"
 &gt;Prompt Engineering Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;OpenAI: &lt;a class="link" href="https://help.openai.com/en/articles/6654000-comprehensive-guide-to-prompt-engineering" target="_blank" rel="noopener"
 &gt;Best practices for prompt engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview" target="_blank" rel="noopener"
 &gt;Prompt engineering overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags" target="_blank" rel="noopener"
 &gt;Use XML tags to structure prompts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Agent_Context Compression Prompt</title><link>https://xedczq.cn/en/post/agent_contextcompression/</link><pubDate>Fri, 15 May 2026 17:58:59 +0800</pubDate><guid>https://xedczq.cn/en/post/agent_contextcompression/</guid><description>&lt;h1 id="notes-on-agent-context-compression-design"&gt;&lt;a href="#notes-on-agent-context-compression-design" class="header-anchor"&gt;&lt;/a&gt;Notes on Agent Context Compression Design
&lt;/h1&gt;
 &lt;blockquote&gt;
 &lt;p&gt;Reference: &lt;a class="link" href="https://wakeup-jin.github.io/Practical-Guide-to-Context-Engineering/%E4%B8%8A%E4%B8%8B%E6%96%87%E7%AE%A1%E7%90%86/%E4%B8%8A%E4%B8%8B%E6%96%87%E5%8E%8B%E7%BC%A9%E6%8C%87%E4%BB%A4%EF%BC%9AClaudeCode%E4%B8%8EGemini%E7%9A%84%E5%8E%8B%E7%BC%A9%E6%8F%90%E7%A4%BA%E8%AF%8D%E8%A7%A3%E6%9E%90.html" target="_blank" rel="noopener"
 &gt;Context Compression Instruction: Prompt Analysis of Claude Code and Gemini&lt;/a&gt;&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;h2 id="what-problem-does-context-compression-solve"&gt;&lt;a href="#what-problem-does-context-compression-solve" class="header-anchor"&gt;&lt;/a&gt;What Problem Does Context Compression Solve?
&lt;/h2&gt;&lt;p&gt;An agent’s context window is not infinite. As multi-turn conversations, tool calls, file reads, error logs, and code diffs accumulate, the model gradually approaches the token limit. The goal of context compression is not simply to “make it shorter,” but to preserve task continuity while reorganizing history into a state that the next agent turn can continue from.&lt;/p&gt;
&lt;p&gt;I treat context compression as a work handoff:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keep what the user is actually trying to accomplish&lt;/li&gt;
&lt;li&gt;Keep project constraints, tech stack, and key decisions&lt;/li&gt;
&lt;li&gt;Keep file states that were read, modified, or created&lt;/li&gt;
&lt;li&gt;Keep errors, fixes, and unresolved issues&lt;/li&gt;
&lt;li&gt;Drop repetitive, outdated, and noisy tool outputs&lt;/li&gt;
&lt;li&gt;Let the next context window continue execution instead of re-exploring&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A good compression system should answer three questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When to compress: scheduling strategy based on token thresholds, message length, tool output size, etc.&lt;/li&gt;
&lt;li&gt;What to compress: user messages, system constraints, tool results, file states, or plans&lt;/li&gt;
&lt;li&gt;How to compress: LLM summarization, rule-based trimming, retrieval reconstruction, or a hybrid approach&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="classic-approach-1-llm-summarization-compression"&gt;&lt;a href="#classic-approach-1-llm-summarization-compression" class="header-anchor"&gt;&lt;/a&gt;Classic Approach 1: LLM Summarization Compression
&lt;/h2&gt;&lt;p&gt;Both Claude Code and Gemini CLI follow a core idea: when context is too long, pass history to a model and let it output a structured summary. This summary becomes the core memory in the next context window.&lt;/p&gt;
&lt;p&gt;The advantage is strong semantic retention: goals, constraints, errors, and plans scattered across long history can be reorganized. The downside is that quality depends on prompt design. A weak prompt may lose file paths, snippets, user preferences, or unfinished tasks.&lt;/p&gt;
&lt;h3 id="claude-code-style-detailed-structured-handoff"&gt;&lt;a href="#claude-code-style-detailed-structured-handoff" class="header-anchor"&gt;&lt;/a&gt;Claude Code Style: Detailed Structured Handoff
&lt;/h3&gt;&lt;p&gt;Claude Code-style compression is closer to a full handoff document. It emphasizes chronological analysis and focuses on user requests, technical details, file changes, error handling, and next steps.&lt;/p&gt;
&lt;p&gt;Suggested fields:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Field&lt;/th&gt;
 &lt;th&gt;Purpose&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Primary requests and intent&lt;/td&gt;
 &lt;td&gt;Preserve the initial user goal and later intent shifts&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Key technical concepts&lt;/td&gt;
 &lt;td&gt;Record stack, frameworks, architecture patterns, dependencies&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Files and code sections&lt;/td&gt;
 &lt;td&gt;Track read/modified/created files and key snippets&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Errors and fixes&lt;/td&gt;
 &lt;td&gt;Prevent repeating the same mistakes after compression&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Problem-solving status&lt;/td&gt;
 &lt;td&gt;Separate resolved issues from ongoing debugging&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;User messages&lt;/td&gt;
 &lt;td&gt;Preserve original feedback to reduce intent distortion&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Pending tasks&lt;/td&gt;
 &lt;td&gt;Make remaining work explicit&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Current work state&lt;/td&gt;
 &lt;td&gt;Capture what was in progress before compression&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optional next steps&lt;/td&gt;
 &lt;td&gt;Keep only directly relevant follow-up actions&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The point is not “a pretty summary,” but “a handoff that can keep coding.” In coding-agent workflows, file paths, function names, test commands, failed logs, and user corrections are critical.&lt;/p&gt;
&lt;p&gt;Compression template:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Please compress the conversation history into a handoff summary that can continue execution.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Must keep:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;1. User’s primary goals and explicit requests
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2. Tech stack, architecture constraints, and key decisions
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;3. Files read/modified/created/deleted and why
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;4. Key code snippets, function signatures, config items
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;5. Encountered errors, failure logs, and fixes
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;6. Important user feedback and preferences
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;7. Completed items, pending items, and current pause point
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;8. Next-step suggestions directly related to the current task only
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Must remove:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;1. Repetitive explanations
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2. Outdated tool outputs
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;3. Intermediate attempts that no longer help
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;4. Irrelevant small talk
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="gemini-cli-style-state-snapshot"&gt;&lt;a href="#gemini-cli-style-state-snapshot" class="header-anchor"&gt;&lt;/a&gt;Gemini CLI Style: State Snapshot
&lt;/h3&gt;&lt;p&gt;Gemini CLI-style compression is more like generating a compact &lt;code&gt;state_snapshot&lt;/code&gt;. It uses fewer fields but packs higher density.&lt;/p&gt;
&lt;p&gt;Typical fields:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Field&lt;/th&gt;
 &lt;th&gt;Purpose&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;overall_goal&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;One-line high-level user objective&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;key_knowledge&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Facts, constraints, and conventions that must be remembered&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;file_system_state&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Created/read/modified/deleted file state&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;recent_actions&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Recent key actions and outcomes&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;current_plan&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Current plan and progress&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This style works well as a runtime snapshot, especially for recovery after interruption. It is shorter than the Claude-style handoff but requires stricter detail retention.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-xml" data-lang="xml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;&amp;lt;state_snapshot&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;lt;overall_goal&amp;gt;&lt;/span&gt;User&amp;#39;s current high-level goal&lt;span class="nt"&gt;&amp;lt;/overall_goal&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;lt;key_knowledge&amp;gt;&lt;/span&gt;Critical facts, constraints, preferences, technical decisions&lt;span class="nt"&gt;&amp;lt;/key_knowledge&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;lt;file_system_state&amp;gt;&lt;/span&gt;File read/modify/create/delete state&lt;span class="nt"&gt;&amp;lt;/file_system_state&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;lt;recent_actions&amp;gt;&lt;/span&gt;Recent important actions and outcomes&lt;span class="nt"&gt;&amp;lt;/recent_actions&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;lt;current_plan&amp;gt;&lt;/span&gt;Current plan, completed steps, pending steps&lt;span class="nt"&gt;&amp;lt;/current_plan&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;&amp;lt;/state_snapshot&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="classic-approach-2-tool-message-trimming"&gt;&lt;a href="#classic-approach-2-tool-message-trimming" class="header-anchor"&gt;&lt;/a&gt;Classic Approach 2: Tool Message Trimming
&lt;/h2&gt;&lt;p&gt;In real agent systems, the biggest token consumer is often tool output, not user text or assistant replies. File reads, code search, test runs, and logs can explode token usage.&lt;/p&gt;
&lt;p&gt;So tool-message trimming is highly practical:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keep system messages&lt;/li&gt;
&lt;li&gt;Keep normal user and assistant messages&lt;/li&gt;
&lt;li&gt;Remove outdated tool calls and tool outputs&lt;/li&gt;
&lt;li&gt;Keep only the last N tool rounds&lt;/li&gt;
&lt;li&gt;Summarize key tool outputs before deleting raw long outputs&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A common policy: identify all tool rounds, keep only the last &lt;code&gt;N&lt;/code&gt;, and remove older tool-related messages.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-ts" data-lang="ts"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;MessageRole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;system&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;user&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;tool&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;role&lt;/span&gt;: &lt;span class="kt"&gt;MessageRole&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;content&lt;/span&gt;: &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;tool_calls?&lt;/span&gt;: &lt;span class="kt"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;tool_call_id?&lt;/span&gt;: &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;CompressionOptions&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;enabled&lt;/span&gt;: &lt;span class="kt"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;keepLastToolRounds&lt;/span&gt;: &lt;span class="kt"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;compressToolMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;: &lt;span class="kt"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;options&lt;/span&gt;: &lt;span class="kt"&gt;CompressionOptions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;toolRounds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;identifyToolRounds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;roundsToKeep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toolRounds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;keepLastToolRounds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keepIndexes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;roundsToKeep&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flatMap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;round&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;round&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;indexes&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;system&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keepIndexes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isToolRelated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;tool&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;Boolean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isToolRelated&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The key decision is whether a tool output still helps future decisions. If it has already been absorbed into conclusions or is only exploratory noise, remove it. If it is a fresh test result, key error log, or important file content, keep or summarize it first.&lt;/p&gt;
&lt;h2 id="classic-approach-3-middle-drop-oldest-drop-and-hybrid-strategy"&gt;&lt;a href="#classic-approach-3-middle-drop-oldest-drop-and-hybrid-strategy" class="header-anchor"&gt;&lt;/a&gt;Classic Approach 3: Middle Drop, Oldest Drop, and Hybrid Strategy
&lt;/h2&gt;&lt;p&gt;Besides LLM summarization, rule-based algorithms can also trim messages directly. They are more controllable and cheaper, but weaker in semantic understanding.&lt;/p&gt;
&lt;p&gt;Three common methods:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Strategy&lt;/th&gt;
 &lt;th&gt;Method&lt;/th&gt;
 &lt;th&gt;Best for&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Middle drop&lt;/td&gt;
 &lt;td&gt;Keep head and tail, remove middle&lt;/td&gt;
 &lt;td&gt;Head has constraints, tail has current work&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Oldest drop&lt;/td&gt;
 &lt;td&gt;Remove earliest messages first&lt;/td&gt;
 &lt;td&gt;Long-running sessions where recent context matters most&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Hybrid&lt;/td&gt;
 &lt;td&gt;Choose dynamically by conversation shape&lt;/td&gt;
 &lt;td&gt;Mixed workloads and different model limits&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="middle-drop"&gt;&lt;a href="#middle-drop" class="header-anchor"&gt;&lt;/a&gt;Middle Drop
&lt;/h3&gt;&lt;p&gt;Works well when history has this structure:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Head: system prompt, project rules, user goals
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Middle: heavy tool usage, search process, trial-and-error
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Tail: current issue, latest code, latest errors
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Advantage: keeps task framing and current working context. Risk: key decisions may be lost if the middle is removed without summarization.&lt;/p&gt;
&lt;h3 id="oldest-drop"&gt;&lt;a href="#oldest-drop" class="header-anchor"&gt;&lt;/a&gt;Oldest Drop
&lt;/h3&gt;&lt;p&gt;This is a sliding-window style approach. It assumes the newest messages are most relevant.&lt;/p&gt;
&lt;p&gt;Advantage: simple and effective for continuity in long sessions. Risk: early constraints, architecture decisions, or initial goals may be dropped.&lt;/p&gt;
&lt;h3 id="hybrid-strategy"&gt;&lt;a href="#hybrid-strategy" class="header-anchor"&gt;&lt;/a&gt;Hybrid Strategy
&lt;/h3&gt;&lt;p&gt;Dynamic selection can use:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compression ratio target (current tokens vs target)&lt;/li&gt;
&lt;li&gt;Total message count&lt;/li&gt;
&lt;li&gt;Share of recent-message tokens&lt;/li&gt;
&lt;li&gt;Presence of long messages&lt;/li&gt;
&lt;li&gt;Presence of system messages&lt;/li&gt;
&lt;li&gt;Heavy tool-message density&lt;/li&gt;
&lt;li&gt;Model context window size&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A practical decision table:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Condition&lt;/th&gt;
 &lt;th&gt;Recommended strategy&lt;/th&gt;
 &lt;th&gt;Why&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Light compression + short dialogue&lt;/td&gt;
 &lt;td&gt;Middle drop&lt;/td&gt;
 &lt;td&gt;Head and tail are often most important&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Heavy compression + very long dialogue&lt;/td&gt;
 &lt;td&gt;Oldest drop&lt;/td&gt;
 &lt;td&gt;Recent context usually has higher priority&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Recent messages dominate tokens&lt;/td&gt;
 &lt;td&gt;Middle drop&lt;/td&gt;
 &lt;td&gt;Protect the current working context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;System/tool-heavy history&lt;/td&gt;
 &lt;td&gt;Middle drop&lt;/td&gt;
 &lt;td&gt;Keep opening rules and latest state&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Uncertain&lt;/td&gt;
 &lt;td&gt;Try both and score&lt;/td&gt;
 &lt;td&gt;Data-driven selection&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;A simple score:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;efficiency_score = token_reduction_ratio * 0.6 + message_retention_ratio * 0.4
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If the system prioritizes staying under target tokens, increase token-reduction weight. If it prioritizes context continuity, increase retention weight.&lt;/p&gt;
&lt;h2 id="recommended-hybrid-compression-architecture"&gt;&lt;a href="#recommended-hybrid-compression-architecture" class="header-anchor"&gt;&lt;/a&gt;Recommended Hybrid Compression Architecture
&lt;/h2&gt;&lt;p&gt;A single method is usually not robust enough. For coding agents, I prefer a combined pipeline:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Raw history
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Token and structure statistics
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Compression threshold check
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Trim outdated tool messages
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;LLM structured summary for key history
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Generate state snapshot / handoff summary
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ↓
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Rebuild next context window
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I usually preserve four layers:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Layer&lt;/th&gt;
 &lt;th&gt;Content&lt;/th&gt;
 &lt;th&gt;Storage&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Stable rules layer&lt;/td&gt;
 &lt;td&gt;System prompt, project rules, security constraints&lt;/td&gt;
 &lt;td&gt;Persistent prompt/rule files&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Working memory layer&lt;/td&gt;
 &lt;td&gt;Current goal, plan, TODOs, user preferences&lt;/td&gt;
 &lt;td&gt;Structured summary&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Evidence layer&lt;/td&gt;
 &lt;td&gt;Latest tool results, key errors, key snippets&lt;/td&gt;
 &lt;td&gt;Last N tool rounds or summarized evidence&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;External knowledge layer&lt;/td&gt;
 &lt;td&gt;Docs, codebase, history&lt;/td&gt;
 &lt;td&gt;RAG / file retrieval&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Rebuilt context layout:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;System prompt
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Project rules
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Compression preface
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Structured summary
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Recent full conversation rounds
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Recent key tool results
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Current user request
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The “recent full rounds” part is important. Summaries keep the big picture, but recent raw turns often carry subtle intent, tone, corrections, and boundary conditions.&lt;/p&gt;
&lt;h2 id="compression-prompt-design-principles"&gt;&lt;a href="#compression-prompt-design-principles" class="header-anchor"&gt;&lt;/a&gt;Compression Prompt Design Principles
&lt;/h2&gt;&lt;p&gt;The goal is not to let the model freestyle. It is to enforce a stable handoff format.&lt;/p&gt;
&lt;p&gt;Recommended prompt constraints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Explicit role: you are a context compressor, not an executor&lt;/li&gt;
&lt;li&gt;Explicit goal: generate a state that the next agent can continue from&lt;/li&gt;
&lt;li&gt;Explicit retention: goals, constraints, files, code, errors, plan, user feedback&lt;/li&gt;
&lt;li&gt;Explicit deletion: repetition, irrelevant tool output, small talk, intermediate noise&lt;/li&gt;
&lt;li&gt;Explicit output format: Markdown, XML, JSON, or custom tags&lt;/li&gt;
&lt;li&gt;Explicit prohibition: do not fabricate file states, do not invent decisions, do not execute next steps&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Practical prompt template:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You are the context compressor for an agent.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Please compress the conversation history into a Chinese handoff summary.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;This summary will be the primary context for continuing execution in the next context window.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Must keep:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- User goals, explicit requests, and important feedback
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Tech stack, project constraints, architecture decisions, tool preferences
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- File paths read/modified/created/deleted
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Key code snippets, function names, config items, commands
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Encountered errors, failed tests, and fixes
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Completed tasks, pending tasks, and current pause point
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Next-step suggestions directly relevant to the current task
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Must remove:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Repetitive explanations
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Irrelevant small talk
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Tool output with no further value
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- Intermediate attempts that do not affect final decisions
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Do not fabricate information not present in history.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Do not execute tasks. Only output the compressed summary.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="engineering-implementation-notes"&gt;&lt;a href="#engineering-implementation-notes" class="header-anchor"&gt;&lt;/a&gt;Engineering Implementation Notes
&lt;/h2&gt;&lt;h3 id="trigger-timing"&gt;&lt;a href="#trigger-timing" class="header-anchor"&gt;&lt;/a&gt;Trigger Timing
&lt;/h3&gt;&lt;p&gt;Compression can be triggered when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Tokens exceed 70% to 85% of model context limit&lt;/li&gt;
&lt;li&gt;Single tool output exceeds threshold&lt;/li&gt;
&lt;li&gt;Tool call rounds exceed threshold&lt;/li&gt;
&lt;li&gt;A task phase ends and a handoff is needed&lt;/li&gt;
&lt;li&gt;User explicitly requests &lt;code&gt;/compact&lt;/code&gt; or equivalent command&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="compression-order"&gt;&lt;a href="#compression-order" class="header-anchor"&gt;&lt;/a&gt;Compression Order
&lt;/h3&gt;&lt;p&gt;Recommended order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Remove obviously low-value tool output&lt;/li&gt;
&lt;li&gt;Keep the last N complete conversation rounds&lt;/li&gt;
&lt;li&gt;Generate structured summaries for older messages&lt;/li&gt;
&lt;li&gt;Rebuild context with summary + rules + recent rounds&lt;/li&gt;
&lt;li&gt;Record metrics: pre/post token count, dropped message count, kept tool rounds&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="risk-control"&gt;&lt;a href="#risk-control" class="header-anchor"&gt;&lt;/a&gt;Risk Control
&lt;/h3&gt;&lt;p&gt;The most common failure is not “insufficient compression,” but “loss of critical facts.”&lt;/p&gt;
&lt;p&gt;Especially avoid:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Losing explicit user constraints&lt;/li&gt;
&lt;li&gt;Losing file paths&lt;/li&gt;
&lt;li&gt;Losing the latest error message&lt;/li&gt;
&lt;li&gt;Losing failed attempts that should not be repeated&lt;/li&gt;
&lt;li&gt;Turning assumptions into facts&lt;/li&gt;
&lt;li&gt;Mixing completed tasks with pending tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I prefer to keep explicit state labels in summaries:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Done] Fixed login form validation
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Failed attempt] Direct schema change breaks legacy API
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Pending confirmation] Whether to keep legacy export format
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Next] Run pnpm test for auth module verification
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="my-takeaway"&gt;&lt;a href="#my-takeaway" class="header-anchor"&gt;&lt;/a&gt;My Takeaway
&lt;/h2&gt;&lt;p&gt;Context compression is fundamentally an agent memory-management and handoff system. Claude Code-style compression is better for full development-context retention. Gemini CLI-style compression is better for high-density state snapshots. Tool-message trimming is the most direct way to reduce token noise.&lt;/p&gt;
&lt;p&gt;If I were implementing a stable agent compression module, I would prioritize this combination:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Keep recent conversation rounds intact
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ Trim outdated tool messages
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ LLM structured summary
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ File state snapshot
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ Current plan and TODO list
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ Compression metrics and observability logs
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The final objective is not the shortest context. It is that after compression, the agent still knows: what the user wants, what the project is, what has been done, what has failed, where it stopped, and what should happen next.&lt;/p&gt;</description></item><item><title>Agent_Harness Engineering</title><link>https://xedczq.cn/en/post/agent_harness%E5%B7%A5%E7%A8%8B/</link><pubDate>Tue, 19 May 2026 11:29:42 +0800</pubDate><guid>https://xedczq.cn/en/post/agent_harness%E5%B7%A5%E7%A8%8B/</guid><description>&lt;h1 id="what-harness-engineering-actually-is"&gt;&lt;a href="#what-harness-engineering-actually-is" class="header-anchor"&gt;&lt;/a&gt;What Harness Engineering Actually Is
&lt;/h1&gt;&lt;p&gt;My conclusion after reading these articles side by side:&lt;/p&gt;
&lt;p&gt;Harness Engineering is not just about writing better prompts. It is about engineering all the capabilities around the model into an iterative system, so an agent can produce stable and verifiable outcomes during long-running tasks.&lt;/p&gt;
&lt;p&gt;One-line summary:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Agent = Model + Harness
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Harness = State management + Tooling + Constraints + Feedback loops + Execution orchestration
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The model provides intelligence. The harness makes that intelligence usable, controllable, and repeatable.&lt;/p&gt;
&lt;h2 id="shared-takeaways-across-the-articles"&gt;&lt;a href="#shared-takeaways-across-the-articles" class="header-anchor"&gt;&lt;/a&gt;Shared Takeaways Across the Articles
&lt;/h2&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Theme&lt;/th&gt;
 &lt;th&gt;Common Ground&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Definition of harness&lt;/td&gt;
 &lt;td&gt;Not the model itself, but surrounding code, configuration, process, tools, and validation mechanisms&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Goal&lt;/td&gt;
 &lt;td&gt;Reduce supervision cost, improve first-pass correctness, and support long-running execution&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Core method&lt;/td&gt;
 &lt;td&gt;Turn repeated failure modes into engineered assets: rules, tools, tests, and loops&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Main long-task challenge&lt;/td&gt;
 &lt;td&gt;Limited context windows, session interruption, state drift, and premature “done” claims&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Solution direction&lt;/td&gt;
 &lt;td&gt;Incremental task decomposition, state handoff, automated checks, observability, and continuous correction&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="5-core-components-my-practical-view"&gt;&lt;a href="#5-core-components-my-practical-view" class="header-anchor"&gt;&lt;/a&gt;5 Core Components (My Practical View)
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Task scaffolding&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Clear decomposition strategy (one feature at a time)&lt;/li&gt;
&lt;li&gt;Clear Definition of Done (DoD) to avoid “looks finished” outputs&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;State and memory&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Recoverable state: progress files, commit notes, change logs&lt;/li&gt;
&lt;li&gt;Reliable handoff between sessions instead of relying on model guessing&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Tools and environment&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Fast deterministic tools for agents (tests, lint, screenshots, logs)&lt;/li&gt;
&lt;li&gt;Self-serve context access instead of manual copy/paste&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Feedback and sensors&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Computational sensors: lint/typecheck/unit/e2e (fast, deterministic)&lt;/li&gt;
&lt;li&gt;Reasoning sensors: LLM review/semantic QA (slower, costlier, but useful for semantics)&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Scheduling and governance&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;After failure, do not only retry; improve capability&lt;/li&gt;
&lt;li&gt;Accumulate reusable rules in templates (&lt;code&gt;AGENTS.md&lt;/code&gt;, docs, checklists)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="practical-harness-workflow-for-normal-webcoding-users"&gt;&lt;a href="#practical-harness-workflow-for-normal-webcoding-users" class="header-anchor"&gt;&lt;/a&gt;Practical Harness Workflow for Normal WebCoding Users
&lt;/h2&gt;&lt;p&gt;This is my compressed version for individual developers. You do not need multi-agent orchestration to start.&lt;/p&gt;
&lt;h3 id="step-0-define-done-first"&gt;&lt;a href="#step-0-define-done-first" class="header-anchor"&gt;&lt;/a&gt;Step 0: Define “Done” First
&lt;/h3&gt;&lt;p&gt;Create a one-page &lt;code&gt;SPEC.md&lt;/code&gt; for each feature:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;User scenario&lt;/li&gt;
&lt;li&gt;Input and output&lt;/li&gt;
&lt;li&gt;Acceptance criteria&lt;/li&gt;
&lt;li&gt;Failure scenarios&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Without this, agents tend to produce “confident but misaligned” output.&lt;/p&gt;
&lt;h3 id="step-1-create-minimal-harness-files"&gt;&lt;a href="#step-1-create-minimal-harness-files" class="header-anchor"&gt;&lt;/a&gt;Step 1: Create Minimal Harness Files
&lt;/h3&gt;&lt;p&gt;At least these 4 files:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AGENTS.md&lt;/code&gt;: repository rules (commands, directory conventions, no-touch zones, commit style)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TASKS.md&lt;/code&gt;: feature backlog with &lt;code&gt;todo/doing/done&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PROGRESS.md&lt;/code&gt;: what was done, what is unfinished, next step&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CHECKLIST.md&lt;/code&gt;: unified acceptance checks (build, test, UI, performance, security)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="step-2-one-feature-per-iteration"&gt;&lt;a href="#step-2-one-feature-per-iteration" class="header-anchor"&gt;&lt;/a&gt;Step 2: One Feature Per Iteration
&lt;/h3&gt;&lt;p&gt;Execution pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pick one item from &lt;code&gt;TASKS.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Give the agent a bounded task&lt;/li&gt;
&lt;li&gt;Avoid “build the entire site in one go” requests&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This sharply reduces context chaos and regressions.&lt;/p&gt;
&lt;h3 id="step-3-let-the-agent-change-then-prove"&gt;&lt;a href="#step-3-let-the-agent-change-then-prove" class="header-anchor"&gt;&lt;/a&gt;Step 3: Let the Agent Change, Then Prove
&lt;/h3&gt;&lt;p&gt;Require the agent to output every round:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Files changed&lt;/li&gt;
&lt;li&gt;Why each change was made&lt;/li&gt;
&lt;li&gt;Commands executed&lt;/li&gt;
&lt;li&gt;Passed/failed checks&lt;/li&gt;
&lt;li&gt;Risk and rollback points&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This converts hidden reasoning into auditable execution traces.&lt;/p&gt;
&lt;h3 id="step-4-two-layer-validation-computational-first"&gt;&lt;a href="#step-4-two-layer-validation-computational-first" class="header-anchor"&gt;&lt;/a&gt;Step 4: Two-Layer Validation (Computational First)
&lt;/h3&gt;&lt;p&gt;Run at least:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;npm run lint
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;npm run &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;npm run build
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For frontend UI changes, also add:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Key path screenshot checks&lt;/li&gt;
&lt;li&gt;Manual critical interaction checklist&lt;/li&gt;
&lt;li&gt;Responsive checks on main breakpoints&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Rule: pass deterministic checks first, then do semantic review.&lt;/p&gt;
&lt;h3 id="step-5-convert-every-failure-into-harness-assets"&gt;&lt;a href="#step-5-convert-every-failure-into-harness-assets" class="header-anchor"&gt;&lt;/a&gt;Step 5: Convert Every Failure into Harness Assets
&lt;/h3&gt;&lt;p&gt;When agent output fails, do not only patch the immediate bug:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If it is a rule issue, add it to &lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;If it is repeated execution, script it&lt;/li&gt;
&lt;li&gt;If it is quality drift, add it to &lt;code&gt;CHECKLIST.md&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Goal: prevent the same class of errors from recurring.&lt;/p&gt;
&lt;h3 id="step-6-force-handoff-for-long-tasks"&gt;&lt;a href="#step-6-force-handoff-for-long-tasks" class="header-anchor"&gt;&lt;/a&gt;Step 6: Force Handoff for Long Tasks
&lt;/h3&gt;&lt;p&gt;If work spans more than one context window, generate a handoff containing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Current goal&lt;/li&gt;
&lt;li&gt;Completed work&lt;/li&gt;
&lt;li&gt;Remaining work&lt;/li&gt;
&lt;li&gt;Blockers&lt;/li&gt;
&lt;li&gt;First step for next round&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Store it in &lt;code&gt;PROGRESS.md&lt;/code&gt; or planning files, not only in chat history.&lt;/p&gt;
&lt;h3 id="step-7-run-a-release-grade-loop-before-merge"&gt;&lt;a href="#step-7-run-a-release-grade-loop-before-merge" class="header-anchor"&gt;&lt;/a&gt;Step 7: Run a Release-Grade Loop Before Merge
&lt;/h3&gt;&lt;p&gt;Before merge, run one unified cycle:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Regression checks&lt;/li&gt;
&lt;li&gt;Critical user-path smoke tests&lt;/li&gt;
&lt;li&gt;Quick performance and error-log scan&lt;/li&gt;
&lt;li&gt;Agent self-review plus human spot-check&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This prevents “local pass, system-level failure.”&lt;/p&gt;
&lt;h3 id="step-8-weekly-harness-cleanup"&gt;&lt;a href="#step-8-weekly-harness-cleanup" class="header-anchor"&gt;&lt;/a&gt;Step 8: Weekly Harness Cleanup
&lt;/h3&gt;&lt;p&gt;Weekly maintenance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Remove stale rules&lt;/li&gt;
&lt;li&gt;Fix broken scripts&lt;/li&gt;
&lt;li&gt;Merge duplicate constraints&lt;/li&gt;
&lt;li&gt;Refresh docs index&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Harness is also code. Without maintenance, it decays.&lt;/p&gt;
&lt;h2 id="minimum-viable-harness-mvp-for-individuals"&gt;&lt;a href="#minimum-viable-harness-mvp-for-individuals" class="header-anchor"&gt;&lt;/a&gt;Minimum Viable Harness (MVP) for Individuals
&lt;/h2&gt;&lt;p&gt;If you want the fastest starting point, do this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Write 20-50 lines of hard rules in &lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Ask the agent to do only one feature per iteration&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;lint/test/build&lt;/code&gt; every round&lt;/li&gt;
&lt;li&gt;Update &lt;code&gt;PROGRESS.md&lt;/code&gt; each round&lt;/li&gt;
&lt;li&gt;Convert repeated failures into rules or scripts&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These five actions are usually enough to move from “using agents by feel” to “compounding engineering productivity.”&lt;/p&gt;
&lt;h2 id="my-practical-conclusion"&gt;&lt;a href="#my-practical-conclusion" class="header-anchor"&gt;&lt;/a&gt;My Practical Conclusion
&lt;/h2&gt;&lt;p&gt;Harness Engineering answers one core question:&lt;/p&gt;

 &lt;blockquote&gt;
 &lt;p&gt;When an agent fails, do you supervise it repeatedly, or convert that failure into system capability?&lt;/p&gt;

 &lt;/blockquote&gt;
&lt;p&gt;The first consumes human time. The second compounds.&lt;/p&gt;
&lt;p&gt;For normal webcoding users, the key is not the fanciest model, but:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Do you have executable rules?&lt;/li&gt;
&lt;li&gt;Do you have automated feedback?&lt;/li&gt;
&lt;li&gt;Do you convert failures into deterministic advantages for the next run?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is the real value of harness engineering.&lt;/p&gt;
&lt;h2 id="references"&gt;&lt;a href="#references" class="header-anchor"&gt;&lt;/a&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;a class="link" href="https://openai.com/index/harness-engineering/" target="_blank" rel="noopener"
 &gt;Harness engineering: leveraging Codex in an agent-first world&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" target="_blank" rel="noopener"
 &gt;Effective harnesses for long-running agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;a class="link" href="https://www.anthropic.com/engineering/harness-design-long-running-apps" target="_blank" rel="noopener"
 &gt;Harness design for long-running application development&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LangChain: &lt;a class="link" href="https://www.langchain.com/blog/the-anatomy-of-an-agent-harness" target="_blank" rel="noopener"
 &gt;The Anatomy of an Agent Harness&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Mitchell Hashimoto: &lt;a class="link" href="https://mitchellh.com/writing/my-ai-adoption-journey" target="_blank" rel="noopener"
 &gt;My AI Adoption Journey&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Martin Fowler: &lt;a class="link" href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html" target="_blank" rel="noopener"
 &gt;Harness Engineering - first thoughts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Martin Fowler: &lt;a class="link" href="https://martinfowler.com/articles/harness-engineering.html" target="_blank" rel="noopener"
 &gt;Harness engineering for coding agent users&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>AI Resume Analysis: Knowledgebase Module</title><link>https://xedczq.cn/en/post/aiinterview_knowledgebase/</link><pubDate>Fri, 15 May 2026 21:55:13 +0800</pubDate><guid>https://xedczq.cn/en/post/aiinterview_knowledgebase/</guid><description>&lt;h2 id="knowledgebase-module-design-and-implementation"&gt;&lt;a href="#knowledgebase-module-design-and-implementation" class="header-anchor"&gt;&lt;/a&gt;Knowledgebase Module Design and Implementation
&lt;/h2&gt;&lt;p&gt;This note records how I implemented the &lt;code&gt;Knowledgebase&lt;/code&gt; module in the &lt;code&gt;interview-guide&lt;/code&gt; project. The goal is to connect document upload, vectorization, RAG query, and session association into a sustainable knowledge service workflow.&lt;/p&gt;
&lt;h2 id="module-capability-overview"&gt;&lt;a href="#module-capability-overview" class="header-anchor"&gt;&lt;/a&gt;Module Capability Overview
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Document management: supports upload, download, deletion, categorization, keyword search, and statistics.&lt;/li&gt;
&lt;li&gt;Vectorization capability: stores vectors with &lt;code&gt;pgvector&lt;/code&gt;, and processes chunking/storage through async tasks.&lt;/li&gt;
&lt;li&gt;RAG Q&amp;amp;A: supports both non-streaming and streaming (SSE) multi-knowledgebase query.&lt;/li&gt;
&lt;li&gt;Session coordination: automatically removes associated session references when deleting a knowledgebase to reduce inconsistency risk.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="state-transitions"&gt;&lt;a href="#state-transitions" class="header-anchor"&gt;&lt;/a&gt;State Transitions
&lt;/h2&gt;&lt;h3 id="diagram-1-knowledgebase-main-state-machine"&gt;&lt;a href="#diagram-1-knowledgebase-main-state-machine" class="header-anchor"&gt;&lt;/a&gt;Diagram 1: KnowledgeBase Main State Machine
&lt;/h3&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Call POST /api/knowledgebase/upload to upload file"] --&gt; B["File validation + type detection + dedup check"]

B --&gt; C{"Is file duplicated (fileHash exists)?"}
C --&gt;|Yes| D["Return existing knowledgebase record\nduplicate=true\nno vectorization triggered"]
C --&gt;|No| E["Parse text content + upload file to storage"]

E --&gt; F["Save KnowledgeBaseEntity\ninitial vectorStatus=PENDING"]
F --&gt; G["Send vectorization task to Redis Stream"]

G --&gt; H["VectorizeStreamConsumer consumes task"]
H --&gt; I["markProcessing\nvectorStatus=PROCESSING"]

I --&gt; J["vectorizeAndStore\nchunk text and write to pgvector"]
J --&gt; K{"Did vectorization succeed?"}

K --&gt;|Yes| L["markCompleted\nvectorStatus=COMPLETED\nvectorError=null"]
K --&gt;|No| M{"retryCount &lt; 3 ?"}
M --&gt;|Yes| N["Requeue task (retry+1)"]
N --&gt; H
M --&gt;|No| O["markFailed\nvectorStatus=FAILED\nwrite vectorError"]

P["Call POST /api/knowledgebase/{id}/revectorize"] --&gt; Q["Reset status to PENDING\nclear vectorError"]
Q --&gt; G

R["Call DELETE /api/knowledgebase/{id} to delete knowledgebase"] --&gt; S["Remove RAG session associations"]
S --&gt; T["Delete vector data (best effort) + delete storage file (best effort)"]
T --&gt; U["Delete knowledgebase DB record\nlifecycle ends"]&lt;/pre&gt;&lt;h3 id="diagram-2-chunked-knowledgebase-vectorization-flow"&gt;&lt;a href="#diagram-2-chunked-knowledgebase-vectorization-flow" class="header-anchor"&gt;&lt;/a&gt;Diagram 2: Chunked Knowledgebase Vectorization Flow
&lt;/h3&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Knowledgebase upload succeeds"] --&gt; B["Save knowledgebase record vectorStatus=PENDING"]
B --&gt; C["Send vectorization task to Redis Stream"]
C --&gt; D["VectorizeStreamConsumer starts polling"]
D --&gt; E["Read one message: kbId + content + retryCount"]
E --&gt; F["Set status to PROCESSING"]
F --&gt; G["Execute vectorizeAndStore"]

G --&gt; H["Delete old vectors for this kbId"]
H --&gt; I["Text chunking via TokenTextSplitter"]
I --&gt; J["Add metadata kb_id to each chunk"]
J --&gt; K["Batch call vectorStore.add to write vectors"]

K --&gt; L["Set status to COMPLETED"]
L --&gt; M["ACK message"]

G --&gt; N{"Processing exception?"}
N --&gt;|Yes| O{"retryCount &lt; 3"}
O --&gt;|Yes| P["retryCount+1 and requeue"]
P --&gt; M
O --&gt;|No| Q["Set status to FAILED and record error"]
Q --&gt; M&lt;/pre&gt;&lt;h2 id="key-api-design"&gt;&lt;a href="#key-api-design" class="header-anchor"&gt;&lt;/a&gt;Key API Design
&lt;/h2&gt;&lt;h3 id="get-apiknowledgebaselist-get-knowledgebase-list-status-filter--sorting"&gt;&lt;a href="#get-apiknowledgebaselist-get-knowledgebase-list-status-filter--sorting" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/list&lt;/code&gt; Get Knowledgebase List (Status Filter + Sorting)
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;listKnowledgeBases&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sortBy&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;knowledgeBaseRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByVectorStatusOrderByUploadedAtDesc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorStatus&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;knowledgeBaseRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAllByOrderByUploadedAtDesc&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sortEntities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sortBy&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiknowledgebaseid-get-knowledgebase-detail"&gt;&lt;a href="#get-apiknowledgebaseid-get-knowledgebase-detail" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/{id}&lt;/code&gt; Get Knowledgebase Detail
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getKnowledgeBase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;knowledgeBaseRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="delete-apiknowledgebaseid-delete-knowledgebase"&gt;&lt;a href="#delete-apiknowledgebaseid-delete-knowledgebase" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;DELETE /api/knowledgebase/{id}&lt;/code&gt; Delete Knowledgebase
&lt;/h3&gt;&lt;p&gt;Core flow:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;deleteService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteKnowledgeBase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;knowledgeBaseRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByKnowledgeBaseIds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;vectorService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteByKnowledgeBaseId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;storageService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteKnowledgeBase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStorageKey&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;knowledgeBaseRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Removes RAG session associations first, then deletes vectors/storage files, then DB record.&lt;/li&gt;
&lt;li&gt;Vector/storage deletion failures are logged as &lt;code&gt;warn&lt;/code&gt; and do not block the main delete flow.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiknowledgebasequery-non-streaming-qa-multi-knowledgebase"&gt;&lt;a href="#post-apiknowledgebasequery-non-streaming-qa-multi-knowledgebase" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/knowledgebase/query&lt;/code&gt; Non-Streaming Q&amp;amp;A (Multi-Knowledgebase)
&lt;/h3&gt;&lt;p&gt;Rate limits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GLOBAL/IP: 10 each&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;queryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;queryKnowledgeBase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;answerQuestion&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;countService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateQuestionCounts&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;vectorService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;knowledgeBaseIds&lt;/code&gt; and &lt;code&gt;question&lt;/code&gt; are required.&lt;/li&gt;
&lt;li&gt;If no hit, returns fixed fallback text: &amp;ldquo;No information retrieved&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;If hit exists, builds context + prompts and calls default ChatClient for answer generation.&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;QueryResponse(answer, primaryKbId, kbNamesStr)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiknowledgebasequerystream-streaming-qa-sse-multi-knowledgebase"&gt;&lt;a href="#post-apiknowledgebasequerystream-streaming-qa-sse-multi-knowledgebase" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/knowledgebase/query/stream&lt;/code&gt; Streaming Q&amp;amp;A (SSE, Multi-Knowledgebase)
&lt;/h3&gt;&lt;p&gt;Rate limits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GLOBAL/IP: 5 each&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;queryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;answerQuestionStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kbIds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;countService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateQuestionCounts&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;vectorService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;normalizeStreamOutput&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Returns &lt;code&gt;Flux&amp;lt;String&amp;gt;&lt;/code&gt; (&lt;code&gt;text/event-stream&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Empty input or no hit returns fallback text stream directly.&lt;/li&gt;
&lt;li&gt;Both stream-time and outer exceptions are downgraded to safe fallback output.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apiknowledgebasecategories-get-all-category-names"&gt;&lt;a href="#get-apiknowledgebasecategories-get-all-category-names" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/categories&lt;/code&gt; Get All Category Names
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllCategories&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Result&amp;lt;List&amp;lt;String&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apiknowledgebasecategorycategory-get-knowledgebase-list-by-category"&gt;&lt;a href="#get-apiknowledgebasecategorycategory-get-knowledgebase-list-by-category" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/category/{category}&lt;/code&gt; Get Knowledgebase List by Category
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;listByCategory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Result&amp;lt;List&amp;lt;KnowledgeBaseListItemDTO&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apiknowledgebaseuncategorized-get-uncategorized-knowledgebase-list"&gt;&lt;a href="#get-apiknowledgebaseuncategorized-get-uncategorized-knowledgebase-list" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/uncategorized&lt;/code&gt; Get Uncategorized Knowledgebase List
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;listByCategory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Current implementation reuses category-query path and distinguishes uncategorized by specific category value.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="put-apiknowledgebaseidcategory-update-knowledgebase-category"&gt;&lt;a href="#put-apiknowledgebaseidcategory-update-knowledgebase-category" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;PUT /api/knowledgebase/{id}/category&lt;/code&gt; Update Knowledgebase Category
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateCategory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;category&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Queries by &lt;code&gt;id&lt;/code&gt; first and throws business exception if not found.&lt;/li&gt;
&lt;li&gt;Updates &lt;code&gt;category&lt;/code&gt; and persists record when found.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiknowledgebaseupload-upload-knowledgebase-file-multipart"&gt;&lt;a href="#post-apiknowledgebaseupload-upload-knowledgebase-file-multipart" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/knowledgebase/upload&lt;/code&gt; Upload Knowledgebase File (multipart)
&lt;/h3&gt;&lt;p&gt;Parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;file&lt;/code&gt; (required)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;name&lt;/code&gt; (optional)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;category&lt;/code&gt; (optional)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Rate limits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GLOBAL/IP: 3 each&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;uploadService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uploadKnowledgeBase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;findByFileHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fileHash&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Validate file presence and size (max 50MB).&lt;/li&gt;
&lt;li&gt;Validate type by MIME + extension whitelist (PDF/DOCX/DOC/TXT/MD).&lt;/li&gt;
&lt;li&gt;Compute &lt;code&gt;SHA-256&lt;/code&gt; for dedup check.&lt;/li&gt;
&lt;li&gt;Parse text content; fail directly on empty text.&lt;/li&gt;
&lt;li&gt;Upload file to RustFS (S3-compatible), generate &lt;code&gt;fileKey/fileUrl&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Save &lt;code&gt;KnowledgeBaseEntity&lt;/code&gt; with initial vector status &lt;code&gt;PENDING&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Enqueue async vectorization task to Redis Stream (&lt;code&gt;knowledgebase:vectorize:stream&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Return &lt;code&gt;knowledgeBase + storage + duplicate=false&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="get-apiknowledgebaseiddownload-download-original-knowledgebase-file"&gt;&lt;a href="#get-apiknowledgebaseiddownload-download-original-knowledgebase-file" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/{id}/download&lt;/code&gt; Download Original Knowledgebase File
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEntityForDownload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;downloadFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ResponseEntity&amp;lt;byte[]&amp;gt;&lt;/code&gt; (with &lt;code&gt;Content-Disposition&lt;/code&gt; and &lt;code&gt;Content-Type&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apiknowledgebasesearchkeyword-keyword-search-knowledgebase"&gt;&lt;a href="#get-apiknowledgebasesearchkeyword-keyword-search-knowledgebase" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/search?keyword=...&lt;/code&gt; Keyword Search Knowledgebase
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiknowledgebasestats-get-knowledgebase-statistics"&gt;&lt;a href="#get-apiknowledgebasestats-get-knowledgebase-statistics" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/knowledgebase/stats&lt;/code&gt; Get Knowledgebase Statistics
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;listService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStatistics&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;KnowledgeBaseStatsDTO&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiknowledgebaseidrevectorize-manual-re-vectorization"&gt;&lt;a href="#post-apiknowledgebaseidrevectorize-manual-re-vectorization" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/knowledgebase/{id}/revectorize&lt;/code&gt; Manual Re-Vectorization
&lt;/h3&gt;&lt;p&gt;Rate limits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GLOBAL/IP: 2 each&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;uploadService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;revectorize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Query knowledgebase by &lt;code&gt;id&lt;/code&gt;, throw exception if missing.&lt;/li&gt;
&lt;li&gt;Download source file from object storage and re-parse text.&lt;/li&gt;
&lt;li&gt;Fail directly if parsing fails or returns empty text.&lt;/li&gt;
&lt;li&gt;Reset vector status to &lt;code&gt;PENDING&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Enqueue vectorization task to Redis Stream.&lt;/li&gt;
&lt;li&gt;Return success immediately; frontend polls status afterward.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="async-vectorization-processing-flow-core-implementation"&gt;&lt;a href="#async-vectorization-processing-flow-core-implementation" class="header-anchor"&gt;&lt;/a&gt;Async Vectorization Processing Flow (Core Implementation)
&lt;/h2&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// 1) Delete old vectors&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;deleteByKnowledgeBaseId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;knowledgeBaseId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// 2) Text chunking (default no overlap)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;textSplitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// 3) Add metadata (kb_id)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMetadata&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;kb_id&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;knowledgeBaseId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// 4) Batch vector write (DashScope batch &amp;lt;= 10)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;batchCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MAX_BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MAX_BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;totalChunks&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;subList&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="summary"&gt;&lt;a href="#summary" class="header-anchor"&gt;&lt;/a&gt;Summary
&lt;/h2&gt;&lt;p&gt;The core value of the &lt;code&gt;Knowledgebase&lt;/code&gt; module is connecting file asset management with retrieval-augmented Q&amp;amp;A. For me, the real value is not just successful upload, but making sure documents reliably enter the vectorization pipeline and finally provide reusable, traceable knowledge support in Q&amp;amp;A scenarios.&lt;/p&gt;</description></item><item><title>AI Resume Analysis: Voice Interview Module</title><link>https://xedczq.cn/en/post/aiinterview_voiceinterview/</link><pubDate>Thu, 14 May 2026 22:34:43 +0800</pubDate><guid>https://xedczq.cn/en/post/aiinterview_voiceinterview/</guid><description>&lt;h2 id="voiceinterview-module-design-and-implementation"&gt;&lt;a href="#voiceinterview-module-design-and-implementation" class="header-anchor"&gt;&lt;/a&gt;VoiceInterview Module Design and Implementation
&lt;/h2&gt;&lt;p&gt;This note records how I implemented the &lt;code&gt;VoiceInterview&lt;/code&gt; module in the &lt;code&gt;interview-guide&lt;/code&gt; project. The core goal is to make voice interviews deliver a complete experience of real-time interaction, resumable sessions, and traceable evaluation.&lt;/p&gt;
&lt;h2 id="module-capability-overview"&gt;&lt;a href="#module-capability-overview" class="header-anchor"&gt;&lt;/a&gt;Module Capability Overview
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Real-time voice interaction: built on &lt;code&gt;WebSocket + Qwen3 Voice Model&lt;/code&gt; (shared API key for ASR/TTS/LLM).&lt;/li&gt;
&lt;li&gt;Streaming experience optimization: sentence-level concurrent TTS, generation/synthesis/playback in parallel, first-packet latency around 200ms.&lt;/li&gt;
&lt;li&gt;Server-side VAD: automatic segmentation with real-time subtitles (including intermediate results).&lt;/li&gt;
&lt;li&gt;Echo protection: supports manual submission to avoid AI playback being captured as user input.&lt;/li&gt;
&lt;li&gt;Session continuity: supports pause/resume and multi-turn context memory, with auto-pause on timeout.&lt;/li&gt;
&lt;li&gt;Observability metrics: Micrometer metrics for TTS/ASR latency, session duration, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="state-transitions"&gt;&lt;a href="#state-transitions" class="header-anchor"&gt;&lt;/a&gt;State Transitions
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Create Session&lt;br/&gt;POST /api/voice-interview/sessions"] --&gt; B["IN_PROGRESS"]

B --&gt; C{"Session Events"}
C -- "Pause / Timeout" --&gt; D["PAUSED"]
D -- "Resume" --&gt; B

C -- "End Interview" --&gt; E["COMPLETED"]
E --&gt; F["evaluateStatus = PENDING"]
F --&gt; G["evaluateStatus = PROCESSING"]

G --&gt; H{"Evaluation Result"}
H -- "Success" --&gt; I["EVALUATED&lt;br/&gt;evaluateStatus = COMPLETED"]
H -- "Failure" --&gt; J["evaluateStatus = FAILED"]

B --&gt; K["DELETE /api/voice-interview/sessions/{id}"]
D --&gt; K
E --&gt; K
I --&gt; K
J --&gt; K&lt;/pre&gt;&lt;h2 id="key-api-design"&gt;&lt;a href="#key-api-design" class="header-anchor"&gt;&lt;/a&gt;Key API Design
&lt;/h2&gt;&lt;h3 id="post-apivoice-interviewsessions-create-voice-interview-session"&gt;&lt;a href="#post-apivoice-interviewsessions-create-voice-interview-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/voice-interview/sessions&lt;/code&gt; Create Voice Interview Session
&lt;/h3&gt;&lt;p&gt;Controller entry:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;VoiceInterviewController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@Valid&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@RequestBody&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CreateSessionRequest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Core call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fallback &lt;code&gt;skillId&lt;/code&gt; (use default skill when missing).&lt;/li&gt;
&lt;li&gt;Fallback &lt;code&gt;llmProvider&lt;/code&gt; (use default provider when empty).&lt;/li&gt;
&lt;li&gt;Build &lt;code&gt;VoiceInterviewSessionEntity&lt;/code&gt; (phase switches, difficulty, resume ID, JD text, planned duration, etc.).&lt;/li&gt;
&lt;li&gt;Default &lt;code&gt;userId = &amp;quot;default&amp;quot;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Set initial phase (the first enabled one in &lt;code&gt;intro/tech/project/hr&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Persist to &lt;code&gt;voice_interview_sessions&lt;/code&gt; and cache in Redis (with TTL).&lt;/li&gt;
&lt;li&gt;Return &lt;code&gt;SessionResponseDTO&lt;/code&gt; (session ID, status, phase, config, etc.).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apivoice-interviewsessionssessionid-get-session-detail-by-id"&gt;&lt;a href="#get-apivoice-interviewsessionssessionid-get-session-detail-by-id" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/voice-interview/sessions/{sessionId}&lt;/code&gt; Get Session Detail by ID
&lt;/h3&gt;&lt;p&gt;Controller call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSessionDTO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Read Redis first, then DB fallback.&lt;/li&gt;
&lt;li&gt;Build &lt;code&gt;SessionResponseDTO&lt;/code&gt; when found.&lt;/li&gt;
&lt;li&gt;Return unified error when not found: &lt;code&gt;Session not found: {sessionId}&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apivoice-interviewsessionssessionidend-end-session-and-trigger-async-evaluation"&gt;&lt;a href="#post-apivoice-interviewsessionssessionidend-end-session-and-trigger-async-evaluation" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/voice-interview/sessions/{sessionId}/end&lt;/code&gt; End Session and Trigger Async Evaluation
&lt;/h3&gt;&lt;p&gt;Controller call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;endSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;End + evaluation logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setEndTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setCurrentPhase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setEvaluateStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PENDING&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceEvaluateStreamProducer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendEvaluateTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;redisService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;streamAdd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streamKey&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;buildMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;AsyncTaskStreamConstants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STREAM_MAX_LEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;API returns &lt;code&gt;Result.success()&lt;/code&gt; immediately without waiting for evaluation completion.&lt;/li&gt;
&lt;li&gt;Frontend polls &lt;code&gt;GET /api/voice-interview/sessions/{sessionId}/evaluation&lt;/code&gt; for progress.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="put-apivoice-interviewsessionssessionidpause-pause-session"&gt;&lt;a href="#put-apivoice-interviewsessionssessionidpause-pause-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;PUT /api/voice-interview/sessions/{sessionId}/pause&lt;/code&gt; Pause Session
&lt;/h3&gt;&lt;p&gt;Core call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;pauseSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Only &lt;code&gt;IN_PROGRESS&lt;/code&gt; sessions can be paused.&lt;/li&gt;
&lt;li&gt;Set status to &lt;code&gt;PAUSED&lt;/code&gt;, record reason, update &lt;code&gt;updatedAt&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Persist DB and sync Redis cache.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="put-apivoice-interviewsessionssessionidresume-resume-session"&gt;&lt;a href="#put-apivoice-interviewsessionssessionidresume-resume-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;PUT /api/voice-interview/sessions/{sessionId}/resume&lt;/code&gt; Resume Session
&lt;/h3&gt;&lt;p&gt;Core call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;resumeSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Only &lt;code&gt;PAUSED&lt;/code&gt; sessions can be resumed.&lt;/li&gt;
&lt;li&gt;After resume, status becomes &lt;code&gt;IN_PROGRESS&lt;/code&gt; without resetting phase/progress.&lt;/li&gt;
&lt;li&gt;Persist DB, sync Redis, and return latest &lt;code&gt;SessionResponseDTO&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apivoice-interviewsessions-get-session-list-filter-by-useridstatus"&gt;&lt;a href="#get-apivoice-interviewsessions-get-session-list-filter-by-useridstatus" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/voice-interview/sessions&lt;/code&gt; Get Session List (Filter by userId/status)
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllSessions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByUserIdAndStatusOrderByUpdatedAtDesc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;statusEnum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Result&amp;lt;List&amp;lt;SessionMetaDTO&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="delete-apivoice-interviewsessionssessionid-delete-voice-interview-session"&gt;&lt;a href="#delete-apivoice-interviewsessionssessionid-delete-voice-interview-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;DELETE /api/voice-interview/sessions/{sessionId}&lt;/code&gt; Delete Voice Interview Session
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Validate session existence.&lt;/li&gt;
&lt;li&gt;Delete session and related data (messages/evaluation, depending on repository implementation).&lt;/li&gt;
&lt;li&gt;Clear Redis cache.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apivoice-interviewsessionssessionidmessages-get-conversation-history"&gt;&lt;a href="#get-apivoice-interviewsessionssessionidmessages-get-conversation-history" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/voice-interview/sessions/{sessionId}/messages&lt;/code&gt; Get Conversation History
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getConversationHistoryDTO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Return:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Result&amp;lt;List&amp;lt;VoiceInterviewMessageDTO&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="get-apivoice-interviewsessionssessionidevaluation-get-async-evaluation-status-and-result"&gt;&lt;a href="#get-apivoice-interviewsessionssessionidevaluation-get-async-evaluation-status-and-result" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/voice-interview/sessions/{sessionId}/evaluation&lt;/code&gt; Get Async Evaluation Status and Result
&lt;/h3&gt;&lt;p&gt;Implementation highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Validate session first (throw &lt;code&gt;VOICE_SESSION_NOT_FOUND&lt;/code&gt; if missing).&lt;/li&gt;
&lt;li&gt;Read &lt;code&gt;evaluateStatus&lt;/code&gt; and &lt;code&gt;evaluateError&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If status is &lt;code&gt;COMPLETED&lt;/code&gt;, load evaluation details:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEvaluation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Return &lt;code&gt;VoiceEvaluationStatusDTO&lt;/code&gt; (includes status and result when completed).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apivoice-interviewsessionssessionidevaluation-manually-trigger-async-evaluation"&gt;&lt;a href="#post-apivoice-interviewsessionssessionidevaluation-manually-trigger-async-evaluation" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/voice-interview/sessions/{sessionId}/evaluation&lt;/code&gt; Manually Trigger Async Evaluation
&lt;/h3&gt;&lt;p&gt;Processing logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEvaluation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;voiceInterviewService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;triggerEvaluation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Rules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If already &lt;code&gt;COMPLETED&lt;/code&gt;: return existing evaluation result directly.&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;PENDING/PROCESSING&lt;/code&gt;: return current status without duplicate triggering.&lt;/li&gt;
&lt;li&gt;For other triggerable states: enqueue evaluation task and return &lt;code&gt;PENDING&lt;/code&gt;, then frontend continues polling.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="summary"&gt;&lt;a href="#summary" class="header-anchor"&gt;&lt;/a&gt;Summary
&lt;/h2&gt;&lt;p&gt;The key value of the &lt;code&gt;VoiceInterview&lt;/code&gt; module is not just making voice interaction work, but making the entire real-time pipeline and session lifecycle robustly connected. For me, only when the full chain (create, pause, resume, end, evaluate) works reliably can voice interviews become a truly evolvable product capability.&lt;/p&gt;</description></item><item><title>AI Resume Analysis: Interview Schedule Module</title><link>https://xedczq.cn/en/post/aiinterview_interviewschedule/</link><pubDate>Thu, 14 May 2026 17:10:42 +0800</pubDate><guid>https://xedczq.cn/en/post/aiinterview_interviewschedule/</guid><description>&lt;h2 id="interviewschedule-module-design-and-implementation"&gt;&lt;a href="#interviewschedule-module-design-and-implementation" class="header-anchor"&gt;&lt;/a&gt;InterviewSchedule Module Design and Implementation
&lt;/h2&gt;&lt;p&gt;This note records how I implemented the &lt;code&gt;InterviewSchedule&lt;/code&gt; module in the &lt;code&gt;interview-guide&lt;/code&gt; project. The goal is to integrate invitation parsing, record management, status maintenance, and reminder coordination into one stable and maintainable workflow.&lt;/p&gt;
&lt;h2 id="module-capability-overview"&gt;&lt;a href="#module-capability-overview" class="header-anchor"&gt;&lt;/a&gt;Module Capability Overview
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Invitation parsing: dual-channel parsing with rule engine + AI, supports Feishu/Tencent Meeting/Zoom text formats, automatically extracts company, role, interview time, and meeting link.&lt;/li&gt;
&lt;li&gt;Calendar management: supports day/week/month view, drag-and-drop adjustment, and list view collaboration.&lt;/li&gt;
&lt;li&gt;Status maintenance: supports manual status updates and scheduled auto-expiration.&lt;/li&gt;
&lt;li&gt;Reminder mechanism: supports configurable reminders to reduce missed interviews.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="state-transitions"&gt;&lt;a href="#state-transitions" class="header-anchor"&gt;&lt;/a&gt;State Transitions
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Call POST /api/interview-schedule/parse to parse invitation text"] --&gt; B{"Did rule parsing succeed?"}
B --&gt;|Yes| C["Return ParseResponse\nparseMethod = rule"]
B --&gt;|No| D["Call LLM parsing"]
D --&gt; E{"Did AI parsing succeed?"}
E --&gt;|Yes| F["Return ParseResponse\nparseMethod = ai"]
E --&gt;|No| G["Return parse failure\nsuccess = false"]

H["Call POST /api/interview-schedule to create record"] --&gt; I["create(): force status = PENDING"]
I --&gt; J["Write to DB\nstatus: PENDING"]

J --&gt; K["Call GET /api/interview-schedule or /{id} to query record"]

J --&gt; L["Call PUT /api/interview-schedule/{id} to update base info"]
L --&gt; M["Only update company/role/time fields\nwithout changing status"]
M --&gt; J

J --&gt; N["Call PATCH|PUT /api/interview-schedule/{id}/status?status=..."]
N --&gt; O["updateStatus(): entity.setStatus(status)"]

O --&gt; P{"Target status"}
P --&gt;|COMPLETED| Q["Status -&gt; COMPLETED"]
P --&gt;|CANCELLED| R["Status -&gt; CANCELLED"]
P --&gt;|RESCHEDULED| S["Status -&gt; RESCHEDULED"]
P --&gt;|PENDING| T["Status -&gt; PENDING"]

Q --&gt; U["Record can still be rewritten via status API"]
R --&gt; U
S --&gt; U
T --&gt; U
U --&gt; N

J --&gt; V["Scheduled task ScheduleStatusUpdater\nruns every hour"]
V --&gt; W{"Condition met?\nstatus=PENDING and interviewTime &lt; now"}
W --&gt;|Yes| X["Batch update to CANCELLED"]
W --&gt;|No| Y["No change"]

X --&gt; R
Y --&gt; J

J --&gt; Z["Call DELETE /api/interview-schedule/{id}"]
Z --&gt; AA["Delete record (lifecycle ends)"]&lt;/pre&gt;&lt;h2 id="key-api-design"&gt;&lt;a href="#key-api-design" class="header-anchor"&gt;&lt;/a&gt;Key API Design
&lt;/h2&gt;&lt;h3 id="post-apiinterview-scheduleparse-parse-interview-invitation-text"&gt;&lt;a href="#post-apiinterview-scheduleparse-parse-interview-invitation-text" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview-schedule/parse&lt;/code&gt; Parse Interview Invitation Text
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;parseService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRawText&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSource&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;tryRuleParsing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;parseWithAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Rule parsing handles structured patterns from Feishu/Tencent/Zoom first.&lt;/li&gt;
&lt;li&gt;AI parsing acts as a fallback channel for non-standard text.&lt;/li&gt;
&lt;li&gt;Input boundary constraints and prompt-injection protection are applied before AI parsing.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiinterview-schedule-create-interview-record"&gt;&lt;a href="#post-apiinterview-schedule-create-interview-record" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview-schedule&lt;/code&gt; Create Interview Record
&lt;/h3&gt;&lt;p&gt;Purpose:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Allows users to directly create an interview schedule record from manual input.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Request body (core fields):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;public&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CreateInterviewRequest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@NotBlank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Company name cannot be empty&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;companyName&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@NotBlank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Position cannot be empty&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@NotNull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Interview time cannot be empty&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@com.fasterxml.jackson.annotation.JsonFormat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;yyyy-MM-dd&amp;#39;T&amp;#39;HH:mm[:ss]&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;LocalDateTime&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;interviewTime&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;interviewType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// ONSITE, VIDEO, PHONE&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;meetingLink&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;roundNumber&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;interviewer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterview-scheduleid-get-interview-record-by-id"&gt;&lt;a href="#get-apiinterview-scheduleid-get-interview-record-by-id" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview-schedule/{id}&lt;/code&gt; Get Interview Record by ID
&lt;/h3&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Controller receives &lt;code&gt;id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;scheduleService.getById(id)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Service queries repository for one record and throws business exception if not found&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;Result&amp;lt;InterviewScheduleDTO&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterview-schedule-get-interview-record-list"&gt;&lt;a href="#get-apiinterview-schedule-get-interview-record-list" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview-schedule&lt;/code&gt; Get Interview Record List
&lt;/h3&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Controller accepts optional filters: &lt;code&gt;status/start/end&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;scheduleService.getAll(status, start, end)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Service queries by conditions and converts to DTO&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;Result&amp;lt;List&amp;lt;InterviewScheduleDTO&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="put-apiinterview-scheduleid-update-interview-record"&gt;&lt;a href="#put-apiinterview-scheduleid-update-interview-record" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;PUT /api/interview-schedule/{id}&lt;/code&gt; Update Interview Record
&lt;/h3&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Controller receives &lt;code&gt;id + CreateInterviewRequest&lt;/code&gt; (with &lt;code&gt;@Valid&lt;/code&gt; validation)&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;scheduleService.update(id, request)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Service loads existing record, updates fields, and saves&lt;/li&gt;
&lt;li&gt;Returns updated &lt;code&gt;Result&amp;lt;InterviewScheduleDTO&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="delete-apiinterview-scheduleid-delete-interview-record"&gt;&lt;a href="#delete-apiinterview-scheduleid-delete-interview-record" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;DELETE /api/interview-schedule/{id}&lt;/code&gt; Delete Interview Record
&lt;/h3&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Controller receives &lt;code&gt;id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;scheduleService.delete(id)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Service deletes when found, throws exception when missing&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;Result&amp;lt;Void&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="patchput-apiinterview-scheduleidstatus-update-interview-status"&gt;&lt;a href="#patchput-apiinterview-scheduleidstatus-update-interview-status" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;PATCH/PUT /api/interview-schedule/{id}/status&lt;/code&gt; Update Interview Status
&lt;/h3&gt;&lt;p&gt;API implementation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;/{id}/status&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;RequestMethod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PATCH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;RequestMethod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PUT&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kd"&gt;public&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;InterviewScheduleDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;updateStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@PathVariable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Long&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;InterviewStatus&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Update interview status: ID={}, status={}&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;InterviewScheduleDTO&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;dto&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dto&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Core call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduleService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="summary"&gt;&lt;a href="#summary" class="header-anchor"&gt;&lt;/a&gt;Summary
&lt;/h2&gt;&lt;p&gt;The core value of the &lt;code&gt;InterviewSchedule&lt;/code&gt; module is connecting invitation understanding with interview process management. For me, this layer is what enables frontend calendar interaction, reminder strategy, and downstream interview evaluation to form a continuous user experience, instead of scattering information across chats and manual notes.&lt;/p&gt;</description></item><item><title>AI Resume Analysis: Interview Module</title><link>https://xedczq.cn/en/post/aiinterview_interview/</link><pubDate>Thu, 14 May 2026 15:00:53 +0800</pubDate><guid>https://xedczq.cn/en/post/aiinterview_interview/</guid><description>&lt;h2 id="interview-mock-interview-module-design-and-implementation"&gt;&lt;a href="#interview-mock-interview-module-design-and-implementation" class="header-anchor"&gt;&lt;/a&gt;Interview Mock Interview Module Design and Implementation
&lt;/h2&gt;&lt;p&gt;This note records how I implemented the &lt;code&gt;Interview&lt;/code&gt; module in the &lt;code&gt;interview-guide&lt;/code&gt; project, including the core APIs and evaluation pipeline. The main goal is to build a complete closed loop for question generation, answering, evaluation, and report export, while keeping text interviews and voice interviews aligned under the same evaluation logic.&lt;/p&gt;
&lt;h2 id="module-capability-overview"&gt;&lt;a href="#module-capability-overview" class="header-anchor"&gt;&lt;/a&gt;Module Capability Overview
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Skill-driven question generation: supports 10+ interview tracks (Java backend, major-company tracks, frontend, Python, algorithms, system design, test development, AI Agent, etc.). Each track is defined by &lt;code&gt;SKILL.md&lt;/code&gt; for scope and difficulty distribution.&lt;/li&gt;
&lt;li&gt;Historical question deduplication: previously asked questions in historical sessions are excluded during session creation to reduce repeated assessment.&lt;/li&gt;
&lt;li&gt;Interview stage duration linkage: after total duration changes, each stage (self-introduction, technical assessment, project deep-dive, reverse Q&amp;amp;A) is auto-allocated by ratio.&lt;/li&gt;
&lt;li&gt;Intelligent follow-up flow: supports multi-round follow-up configuration (default: 1 round) to simulate realistic interview interactions.&lt;/li&gt;
&lt;li&gt;Unified evaluation engine: text and voice interviews share the same evaluation architecture (batch evaluation + structured output + summarization + fallback).&lt;/li&gt;
&lt;li&gt;Report export: supports asynchronous generation and export of PDF interview reports.&lt;/li&gt;
&lt;li&gt;Interview center: unified entry for continue/restart/history operations.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="core-state-flow"&gt;&lt;a href="#core-state-flow" class="header-anchor"&gt;&lt;/a&gt;Core State Flow
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Call POST /api/interview/sessions to create session"] --&gt; B{"Any unfinished session\nand forceCreate != true?"}
B --&gt;|Yes| C["Return existing session"]
B --&gt;|No| D["Generate questions and save session"]

D --&gt; E["Session state: CREATED\nCache in Redis + persist in DB"]
C --&gt; E

E --&gt; F["Call GET /api/interview/sessions/{sessionId}/question"]
F --&gt; G{"Is current state CREATED?"}
G --&gt;|Yes| H["Switch to IN_PROGRESS"]
G --&gt;|No| I["Keep current state"]
H --&gt; J["Return current question"]
I --&gt; J

J --&gt; K["Call POST /api/interview/sessions/{sessionId}/answers to submit answer"]
K --&gt; L["Save answer"]
L --&gt; M{"Any next question?"}
M --&gt;|Yes| N["currentIndex + 1\nState remains IN_PROGRESS"]
M --&gt;|No| O["Switch state to COMPLETED"]

N --&gt; F
O --&gt; P["Set evaluateStatus to PENDING"]
P --&gt; Q["Send evaluation task to Redis Stream"]

R["Call POST /api/interview/sessions/{sessionId}/complete for early submit"] --&gt; O

Q --&gt; S["Evaluation consumer processes task"]
S --&gt; T["evaluateStatus = PROCESSING"]
T --&gt; U{"Evaluation successful?"}
U --&gt;|Yes| V["Save evaluation report"]
V --&gt; W["Session state = EVALUATED\nevaluateStatus = COMPLETED"]
U --&gt;|No| X{"Retry count &lt; 3 ?"}
X --&gt;|Yes| Q
X --&gt;|No| Y["evaluateStatus = FAILED\nRecord evaluateError"]

Z["Call DELETE /api/interview/sessions/{sessionId}"] --&gt; AA["Delete DB session and answers"]
AA --&gt; AB["Session ended"]&lt;/pre&gt;&lt;h2 id="key-api-design"&gt;&lt;a href="#key-api-design" class="header-anchor"&gt;&lt;/a&gt;Key API Design
&lt;/h2&gt;&lt;h3 id="get-apiinterviewsessions-list-interview-sessions"&gt;&lt;a href="#get-apiinterviewsessions-list-interview-sessions" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions&lt;/code&gt; List Interview Sessions
&lt;/h3&gt;&lt;p&gt;Purpose:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Used by the interview history page, returns session list in reverse creation order.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="post-apiinterviewsessions-create-interview-session"&gt;&lt;a href="#post-apiinterviewsessions-create-interview-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview/sessions&lt;/code&gt; Create Interview Session
&lt;/h3&gt;&lt;p&gt;Rate limiting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Global limit + IP limit (5)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getHistoricalQuestions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skillId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findTop10ByResumeIdAndSkillIdOrderByCreatedAtDesc&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findTop10BySkillIdOrderByCreatedAtDesc&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;questionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateQuestionsBySkill&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionCache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveSession&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveSession&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterviewsessionssessionid-get-session-info"&gt;&lt;a href="#get-apiinterviewsessionssessionid-get-session-info" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/{sessionId}&lt;/code&gt; Get Session Info
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionCache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;restoreSessionFromDatabase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterviewsessionssessionidquestion-get-current-question"&gt;&lt;a href="#get-apiinterviewsessionssessionidquestion-get-current-question" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/{sessionId}/question&lt;/code&gt; Get Current Question
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCurrentQuestionResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;getCurrentQuestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;getOrRestoreSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;If session is in &lt;code&gt;CREATED&lt;/code&gt; state, return question by &lt;code&gt;currentIndex&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiinterviewsessionssessionidanswers-submit-answer-and-move-forward"&gt;&lt;a href="#post-apiinterviewsessionssessionidanswers-submit-answer-and-move-forward" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview/sessions/{sessionId}/answers&lt;/code&gt; Submit Answer and Move Forward
&lt;/h3&gt;&lt;p&gt;Rate limiting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Global limit (10)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;submitAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Updates answer, session state, cache, and DB.&lt;/li&gt;
&lt;li&gt;If this is the last question:&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateEvaluateStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;AsyncTaskStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluateStreamProducer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendEvaluateTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="post-apiinterviewsessionssessionidanswers-save-draft-answer-no-progress"&gt;&lt;a href="#post-apiinterviewsessionssessionidanswers-save-draft-answer-no-progress" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview/sessions/{sessionId}/answers&lt;/code&gt; Save Draft Answer (No Progress)
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Syncs both Redis and DB.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="post-apiinterviewsessionssessionidcomplete-early-submit"&gt;&lt;a href="#post-apiinterviewsessionssessionidcomplete-early-submit" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;POST /api/interview/sessions/{sessionId}/complete&lt;/code&gt; Early Submit
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;completeInterview&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionCache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateSessionStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SessionStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Persists DB status.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluateStreamProducer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendEvaluateTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterviewsessionsunfinishedresumeid-find-unfinished-session"&gt;&lt;a href="#get-apiinterviewsessionsunfinishedresumeid-find-unfinished-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/unfinished/{resumeId}&lt;/code&gt; Find Unfinished Session
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findUnfinishedSessionOrThrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;findUnfinishedSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionCache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findUnfinishedSessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findUnfinishedSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterviewsessionssessionidreport-generate-interview-evaluation-report"&gt;&lt;a href="#get-apiinterviewsessionssessionidreport-generate-interview-evaluation-report" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/{sessionId}/report&lt;/code&gt; Generate Interview Evaluation Report
&lt;/h3&gt;&lt;p&gt;Core logic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluateInterview&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;unifiedEvaluationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;evaluateInBatches&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;summarizeBatchResults&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;structuredOutputInvoker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(...);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;securedSystemPrompt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;systemPromptWithFormat&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ANTI_INJECTION_INSTRUCTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Uses anti-injection instruction to reduce prompt contamination risk from user input.&lt;/p&gt;
&lt;h3 id="get-apiinterviewsessionssessioniddetails-get-interview-detail"&gt;&lt;a href="#get-apiinterviewsessionssessioniddetails-get-interview-detail" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/{sessionId}/details&lt;/code&gt; Get Interview Detail
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;historyService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getInterviewDetail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;interviewPersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findBySessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="get-apiinterviewsessionssessionidexport-export-interview-report-as-pdf"&gt;&lt;a href="#get-apiinterviewsessionssessionidexport-export-interview-report-as-pdf" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;GET /api/interview/sessions/{sessionId}/export&lt;/code&gt; Export Interview Report as PDF
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;historyService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exportInterviewPdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;interviewPersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findBySessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;pdfExportService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exportInterviewReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="delete-apiinterviewsessionssessionid-delete-interview-session"&gt;&lt;a href="#delete-apiinterviewsessionssessionid-delete-interview-session" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;DELETE /api/interview/sessions/{sessionId}&lt;/code&gt; Delete Interview Session
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteSessionBySessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findBySessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sessionRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="evaluation-engine-implementation-highlights"&gt;&lt;a href="#evaluation-engine-implementation-highlights" class="header-anchor"&gt;&lt;/a&gt;Evaluation Engine Implementation Highlights
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;A single evaluation pipeline supports both text and voice interviews, reducing branch complexity.&lt;/li&gt;
&lt;li&gt;Batch-first then summarize strategy balances long-context stability and structured output quality.&lt;/li&gt;
&lt;li&gt;Anti-injection prompt composition is applied to reduce malicious-input interference.&lt;/li&gt;
&lt;li&gt;In failure scenarios, unified invoker + fallback fields avoid hard report failures.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="summary"&gt;&lt;a href="#summary" class="header-anchor"&gt;&lt;/a&gt;Summary
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;Interview&lt;/code&gt; module now covers the full workflow from session creation, dynamic question generation, answer progression, asynchronous evaluation, to report export. For me, the key value is separating interview process management from evaluation result production into two evolvable layers, so future changes to question strategy or model upgrades can stay controlled.&lt;/p&gt;</description></item><item><title>AI Resume Analysis: Resume Module</title><link>https://xedczq.cn/en/post/aiinterview_resume/</link><pubDate>Thu, 14 May 2026 11:31:10 +0800</pubDate><guid>https://xedczq.cn/en/post/aiinterview_resume/</guid><description>&lt;h2 id="resume-module-design-and-implementation"&gt;&lt;a href="#resume-module-design-and-implementation" class="header-anchor"&gt;&lt;/a&gt;Resume Module Design and Implementation
&lt;/h2&gt;&lt;p&gt;This note records the core design, API responsibilities, async processing pipeline, and practical considerations of the &lt;code&gt;Resume&lt;/code&gt; module in the &lt;code&gt;interview-guide&lt;/code&gt; project.&lt;/p&gt;
&lt;h2 id="module-capabilities"&gt;&lt;a href="#module-capabilities" class="header-anchor"&gt;&lt;/a&gt;Module Capabilities
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Multi-format parsing: supports &lt;code&gt;PDF&lt;/code&gt;, &lt;code&gt;DOCX&lt;/code&gt;, &lt;code&gt;DOC&lt;/code&gt;, &lt;code&gt;TXT&lt;/code&gt;, and &lt;code&gt;MD&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Async processing: uses &lt;code&gt;Redis Stream&lt;/code&gt; for asynchronous resume analysis with status tracking.&lt;/li&gt;
&lt;li&gt;Stability: built-in auto-retry on analysis failure (up to 3 times) + duplicate detection based on file hash.&lt;/li&gt;
&lt;li&gt;Report export: supports one-click export of AI analysis results as a structured PDF report.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="core-status-flow"&gt;&lt;a href="#core-status-flow" class="header-anchor"&gt;&lt;/a&gt;Core Status Flow
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
A["Call /api/resumes/upload"] --&gt; B["Validate file and type"]
B --&gt; C{"Is duplicate resume?"}

C --&gt;|Yes| D["Return historical result or status (duplicate=true)"]
C --&gt;|No| E["Parse text + upload object storage + save ResumeEntity"]

E --&gt; F["Set analyzeStatus = PENDING"]
F --&gt; G["Send Redis Stream analyze task"]

G --&gt; H{"Task queued successfully?"}
H --&gt;|No| I["Set FAILED (queue failed)"]
H --&gt;|Yes| J["Consumer pulls task"]

J --&gt; K["Set PROCESSING"]
K --&gt; L["Call ResumeGradingService for AI analysis"]

L --&gt; M{"Any exception in this round?"}
M --&gt;|No| N["Save analysis result"]
N --&gt; O["Set COMPLETED"]

M --&gt;|Yes| P{"retryCount &lt; 3 ?"}
P --&gt;|Yes| Q["retryCount + 1, requeue task"]
Q --&gt; J
P --&gt;|No| R["Set FAILED (final failure)"]

S["Manual retry /api/resumes/{id}/reanalyze"] --&gt; T["Set PENDING and requeue"]
T --&gt; J&lt;/pre&gt;&lt;h2 id="key-api-design"&gt;&lt;a href="#key-api-design" class="header-anchor"&gt;&lt;/a&gt;Key API Design
&lt;/h2&gt;&lt;h3 id="apiresumesupload-upload-resume-async-analysis"&gt;&lt;a href="#apiresumesupload-upload-resume-async-analysis" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/upload&lt;/code&gt; Upload Resume (Async Analysis)
&lt;/h3&gt;&lt;p&gt;Rate limit strategy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Global limit: &lt;code&gt;@RateLimit(dimension = RateLimit.Dimension.GLOBAL, count = 5)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;IP limit: &lt;code&gt;@RateLimit(dimension = RateLimit.Dimension.IP, count = 5)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Entry call:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;uploadService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uploadAndAnalyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Processing flow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Basic file validation&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;fileValidationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;validateFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MAX_FILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;Resume&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Includes: null check, file size limit, and logging.
2. File type detection&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;contentType&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;parseService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;detectContentType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Supports: &lt;code&gt;PDF&lt;/code&gt;, &lt;code&gt;DOCX&lt;/code&gt;, &lt;code&gt;DOC&lt;/code&gt;, &lt;code&gt;TXT&lt;/code&gt;, &lt;code&gt;MD&lt;/code&gt;.
3. Duplicate file detection&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findExistingResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Internal flow:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fileHash&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fileHashService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;calculateHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumeRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByFileHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fileHash&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ol start="4"&gt;
&lt;li&gt;Resume parsing and text cleaning&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;parseService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parseResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Parse to plain text using &lt;code&gt;Apache Tika&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;textCleaningService.cleanText(content)&lt;/code&gt; to reduce excessive line breaks and token usage&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="5"&gt;
&lt;li&gt;File storage (unstructured data)&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;storageService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uploadResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;storageService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getFileUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fileKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Uploads to &lt;code&gt;RustFS/MinIO&lt;/code&gt; for unstructured file storage.
6. Metadata persistence&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;resumeText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fileKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fileUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ol start="7"&gt;
&lt;li&gt;Send async analysis task&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;analyzeStreamProducer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendAnalyzeTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;savedResume&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getId&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;resumeText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Uses &lt;code&gt;Redis Stream&lt;/code&gt; as the message queue
8. Return upload response&lt;br&gt;
Frontend checks subsequent APIs for async processing status.&lt;/p&gt;
&lt;h3 id="apiresumes-get-resume-list"&gt;&lt;a href="#apiresumes-get-resume-list" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes&lt;/code&gt; Get Resume List
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;historyService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllResumes&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumePersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAllResumes&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Current issue:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;User-level isolation is not implemented yet, so it currently returns the full list.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="apiresumesiddetail-get-resume-detail"&gt;&lt;a href="#apiresumesiddetail-get-resume-detail" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/{id}/detail&lt;/code&gt; Get Resume Detail
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;historyService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getResumeDetail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumePersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumeRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="apiresumesidexport-export-analysis-report-as-pdf"&gt;&lt;a href="#apiresumesidexport-export-analysis-report-as-pdf" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/{id}/export&lt;/code&gt; Export Analysis Report as PDF
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;historyService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exportAnalysisPdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumePersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumePersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLatestAnalysisAsDTO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;pdfExportService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exportResumeAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;analysisDTO&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="apiresumesid-delete-resume"&gt;&lt;a href="#apiresumesid-delete-resume" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/{id}&lt;/code&gt; Delete Resume
&lt;/h3&gt;&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;deleteService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;storageService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStorageKey&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;interviewPersistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteSessionsByResumeId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;persistenceService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deleteResume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="apiresumesidreanalyze-reanalyze-resume"&gt;&lt;a href="#apiresumesidreanalyze-reanalyze-resume" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/{id}/reanalyze&lt;/code&gt; Reanalyze Resume
&lt;/h3&gt;&lt;p&gt;Rate limit strategy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Global limit: &lt;code&gt;@RateLimit(dimension = RateLimit.Dimension.GLOBAL, count = 2)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;IP limit: &lt;code&gt;@RateLimit(dimension = RateLimit.Dimension.IP, count = 2)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Call chain:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;uploadService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;reanalyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;resumeRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;analyzeStreamProducer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendAnalyzeTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resumeId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;resumeText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then update and persist status in the processing step.&lt;/p&gt;
&lt;h3 id="apiresumeshealth-health-check"&gt;&lt;a href="#apiresumeshealth-health-check" class="header-anchor"&gt;&lt;/a&gt;&lt;code&gt;/api/resumes/health&lt;/code&gt; Health Check
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-java" data-lang="java"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For service liveness checks.&lt;/p&gt;
&lt;h2 id="stability-design-points"&gt;&lt;a href="#stability-design-points" class="header-anchor"&gt;&lt;/a&gt;Stability Design Points
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Async decoupling: upload and analysis are separated to improve responsiveness.&lt;/li&gt;
&lt;li&gt;Auto-retry: failed analysis retries up to 3 times to reduce transient failures.&lt;/li&gt;
&lt;li&gt;Hash-based dedup: &lt;code&gt;SHA-256&lt;/code&gt; content hash avoids repeated analysis of identical files.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="summary"&gt;&lt;a href="#summary" class="header-anchor"&gt;&lt;/a&gt;Summary
&lt;/h2&gt;&lt;p&gt;The &lt;code&gt;Resume&lt;/code&gt; module already forms a complete loop: upload, parse, async analyze, export, and delete. The current implementation is stable enough for iterative feature expansion and production hardening.&lt;/p&gt;</description></item></channel></rss>