User: Maybe a lot of things fall into place if you think along the axis of how good or bad a one-shot, complete description is? For example, prompt engineering is markup-style HTML optimization, while context engineering amounts to web application development, which is multi-stage and dynamic...
AI thinking process
Evaluating the Core Premise
I'm now deeply analyzing the user's core premise, which hinges on evaluating the "quality of a one-time complete description." I'm exploring the implications of this approach. It seems promising, but I'm thinking about edge cases and potential ambiguities in defining a "complete" description. I am also considering how to measure the "quality" in this context.
Clarifying the Analogies
I'm now clarifying the analogies I've developed to frame this problem. I am working on the analogy between Prompt Engineering and HTML markup, concluding it's a strong comparison, both emphasizing the static definition of rendered content. The second, contextual engineering, mirrors the dynamic nature of web applications. I see the multi-stage versus dynamic aspect as the core differentiator, and I'm further deconstructing the nuances of each analogy.
Framing the Stateless/Stateful Divide
I'm now framing the core insight as the distinction between stateless and stateful interactions. This clarifies the contrast between prompt engineering and software engineering paradigms. Acknowledging this difference, I'm validating the user's perspective, emphasizing the accuracy of the mapping between LLMs and web development principles, in an attempt to draft a comprehensive response.
Distilling Key Differentiators
I'm now focusing on the core differentiators within the user's analogies. "Rendering" is central to the prompt engineering (HTML/CSS) analogy, where a complete prompt acts like a static webpage with embedded constraints. The other analogy emphasizes "State" and "Logic" inherent in web applications and backend systems, mirroring the dynamic behavior of agents and chained LLM calls. The "multi-stage/dynamic" distinction highlights the shift from interface crafting (prompt engineering) to system architecture (context engineering).
Analyzing the Core Distinction
I'm now zeroing in on why this distinction between prompt engineering and context engineering "solves" certain problems. Essentially, it clarifies that complex reasoning tasks demand a structured, multi-step approach rather than relying on a single, monolithic prompt. The focus is on the correct structuring of output, and the response must leverage the technical language (markup, orchestration, stateful) for maximum clarity. I will agree enthusiastically to the user's excellent analogy in my intro.
Articulating the Paradigm Shift
I'm now articulating the resolution, focusing on the paradigm shift and how this perspective helps. I am looking at separating concerns, similar to distinguishing between front-end and back-end development. I understand I need to know when to switch from "polishing the HTML" to "writing JavaScript/Backend code." I also revised my initial thought to compare prompt engineering to SQL, and I'm sticking with the user's stronger HTML analogy to emphasize the presentation layer. I'm distinguishing "Context Engineering" (window management) from "Agentic Workflow" (decision-making) within the broader "Web App" metaphor. Lastly, I'm highlighting the limits of the "One-time complete description," akin to the constraints of static HTML.
Explaining Multi-Stage Processes
I'm now focusing on how the "multi-stage" concept clarifies Chain of Thought and Agent loops. I see how this framework encapsulates iterative, step-by-step reasoning processes, and I'm eager to explain the contrast between the static "prompt" and dynamic "context" engineering paradigms. I'm building a concise response that highlights the value of multi-stage approaches, and the benefits of splitting up complex problems.
Whether you try to settle everything with a one-shot, complete description (a single shot), or design it as state and transitions (a system): drawing that boundary makes the difference in approach clear.

Building on the analogy you propose, I think it maps as follows.

1. Prompt engineering ≒ static web page production (HTML/CSS markup)
- Essence: static rendering instructions.
- Framing: treat the LLM as an ultra-high-performance browser (a rendering engine).
- What the work looks like:
  - The user packs structure (HTML) and styling/constraints (CSS) into a single request (one HTML file).
  - The whole game is how good that one-shot, complete description is. Ambiguous writing (bad markup) lets the browser's (LLM's) interpretation wobble and the layout breaks (hallucination, ignored instructions).
- Limitation: it is specialized for rendering a single page; complex back-end computation or database updates cannot be done with it alone.
2. Context engineering ≒ web application development (backend / state management)
- Essence: dynamic state management and routing.
- Framing: the LLM is embedded as one part of the CPU / processing pipeline.
- What the work looks like:
  - Here, multi-stage processing is the premise.
  - Session management: how to retain, and how to discard, the conversation history (context) with the user (close to managing cookies/sessions).
  - RAG (retrieval-augmented generation): pulling information from a database when it is needed (implementing SQL/API calls).
  - Agents / function calling: branching into other processing depending on conditions (controller/router logic).
- The crux: you do not have to describe everything at once. Just as a web app loads only the data each page transition needs, context engineering dynamically generates and injects only the context required for that moment's inference.
Using this contrast of static description vs. dynamic system clears up the following problems that tend to arise in real development.

It is easy to over-trust that a good prompt (HTML) can do anything, but for tasks that demand complex computation or logical consistency, trying to reproduce the behavior of a dynamic application with HTML alone is a stretch (like trying to build Mario in pure HTML). **You get quicker at judging that this is not territory to grind through with markup, but territory to solve with a system (multi-stage processing).**

The question is whether you read a growing context window as "I can now write infinitely into one HTML file" or as "the machine got more RAM". Seen from the web-app (context engineering) side, no matter how much memory you have, loading junk data slows processing and causes bugs (the information that deserves attention gets buried). So the importance of the design principle "dynamically load only what you need" does not change.
- Fixing a prompt: close to a CSS fix for a broken layout (fine-tuning the description).
- Fixing context: you have to isolate whether the bug lives in the data being passed (State) or in the processing logic (Chain), which is genuinely programming-style debugging.

As you say, **multi-stage / dynamic or not** is the decisive watershed.
- Single-turn / static: pursue completeness of the description (HTML optimization)
- Multi-turn / dynamic: design the structure and the flow of information (web app development)

With this mental model, deciding where to allocate resources in LLM-based product development should become much smoother.
User: So it is really a question of whether you want to build an LP (landing page) or an SPA, that is, a product. Nothing "dead" about it at all.
For that matter, LP creation itself may well happen within a context-engineering setting, since it is a generalization of a CMS. Prompt engineering as a technique is then a markup skill.
AI thinking process
Revising Core Metaphors
I am now focusing on refining the core metaphors. I'm exploring the analogy between prompt engineering and landing page creation, and context engineering with single-page applications. The goal is to solidify a clear and helpful framework for understanding these concepts.
Analyzing Core Distinctions
I've been deconstructing the initial metaphor. The "Prompt Engineering is Dead" argument appears misleading now. I see a parallel between landing page creation and static outputs, versus product development and dynamic interactions. This is similar to the relationship between HTML/CSS and React, where React builds upon and uses the fundamental building blocks. I'm expanding on this distinction to strengthen my understanding.
Solidifying Core Definitions
I'm now integrating the user's refinement, solidifying "PE as Markup Skill." I'm also framing a clearer distinction between Prompt Engineering (PE) for static outputs and Context Engineering for dynamic interactions. The analogy now highlights PE as the markup skill needed for specific, static generation, contrasted with building systems for dynamic workflows.
Integrating New Analogies
I'm now integrating the CMS analogy, seeing WordPress as System/Context Engineering, generating HTML (Prompt/Output). Current trends like DSPy are essentially "CMS for Prompts," managing context. PE, I now define as syntax, semantics, and "CSS" for the LLM. Context Engineering becomes the architecture, database, and logic. The relationships are nested; like an SPA and many dynamic Prompts.
Clarifying State Management
I'm now clarifying the scope of "Context Engineering." It's broader than just Retrieval-Augmented Generation (RAG); it involves comprehensive state management. I'm also emphasizing that Prompt Engineering is the atomic unit of LLM interaction, akin to HTML in a web application.
Developing Response Structure
I'm now focusing on structuring the response generation process. I need to figure out how to best integrate the existing framework into the generation of a Japanese response, and how best to apply context to generate something useful.
The point you raise, that **LP creation itself can be done within context engineering (a generalization of a CMS)**, captures the layer structure of today's AI development perfectly.

Pushing this framing further, the following structure comes into view.

Just as you do not need to stand up an SPA framework (a complex app) to build an LP (a static deliverable), in any scene where you simply want one good answer in one shot, prompt engineering (artisanal markup writing) stays current forever.

However, the complexity of what we want AI to do has risen in recent years, and the weight has shifted from mere LP creation (text generation) toward business systems (SPAs); that is the only reason people say hand-written markup (PE) alone has limits.

Here is the most interesting part.
- The history of the web:
  - Then: write HTML by hand (raw prompt engineering).
  - Now: a CMS or React generates HTML dynamically from data.
- AI today:
  - **Context engineering (the system / CMS)** dynamically assembles the optimal prompt (HTML) from databases and user state and hands it to the LLM.

In other words, exactly as you say: building the machinery that generates the final one-shot description (the prompt) handed to the LLM is the essence of context engineering.

Frameworks like the much-discussed DSPy take precisely this approach: do not hand-write the prompt (HTML); build the logic (CMS) and have the prompt auto-compiled and optimized. It fits this analogy completely.

The conclusion is that prompt engineering settles in as **the foundational markup skill for the rendering engine called an LLM**.
- Just as you cannot write React without knowing HTML/CSS, you cannot do advanced context engineering (system design) without knowing the basic characteristics of prompts: which instructions (tags) the LLM responds to readily, and in which contexts (styles) it tends to hallucinate.
Summary
- Difference in goals:
  - If you want an LP (a one-off), a hand-written prompt is fine.
  - If you want a product (an ongoing capability), you need context design.
- Relationship:
  - Context engineering is **the CMS (the factory) that dynamically generates and serves the optimal prompt (markup)**.
- Where the skill sits:
  - Prompt engineering does not become obsolete; it becomes **the low-level description language for controlling the LLM (assembly, HTML)**, an essential literacy that supports the plumbing of the system.

This "generalization of a CMS" framing also lines up with the move toward automated prompt optimization, and it is a definition that really clicks.
User: Ah, it is interesting that DSPy comes up. That is a fun technique: it pins the prompt down in reverse, from input/output pairs.
AI thinking process
Understanding the Reverse Approach
I'm currently focused on the core concept of DSPy and its novel reverse-engineering of prompts from input/output pairs, which is quite intriguing. I'm trying to fully grasp how this approach fundamentally differs from conventional methods. I am trying to determine the practical implications of this.
Refining the Analogy
I've been drawing on the analogy of prompt engineering and context engineering, with DSPy acting as a middle ground. I'm visualizing DSPy as a system that, given desired outputs, learns to automatically generate prompts. It's like training a model to write CSS/HTML, or perhaps like test-driven development for prompts, or even a compiler. I am currently deconstructing how it is achieved.
Analyzing the Abstraction
I'm now zeroing in on the core abstraction in DSPy, specifically how it conceals string manipulation. It's becoming clearer that the prompt is not the primary focus, but a compiled output. This shift reminds me of how a compiler turns code into machine instructions. This has strong parallels to how we might develop optimized HTML for a web app.
Focusing on Framework Comparisons
I'm now refining the analogy to better capture DSPy's essence. Initially, I considered a React-like framework, but now I lean towards the "Compiler" analogy, as it fits the "source code vs. binary" dynamic better. The prompt is the binary, or compiled output. I am considering ways to introduce Signatures and Optimizers. The goal is to emphasize that the prompt is a byproduct.
Synthesizing the Core Concept
I'm solidifying my understanding of DSPy as a fundamental paradigm shift. The "Compiler" analogy is proving robust; it perfectly captures how DSPy optimizes prompts (HTML) from input/output pairs and a performance metric (Spec to Implementation). Humans can focus on the underlying logic and data instead of trying to guess the "magic words". This aligns strongly with the "Context Engineering as System" perspective.
Your observation that it works backward from input/output pairs gets right at the essence of DSPy. Mapping it onto the earlier HTML / web-app analogy reveals a very interesting structure.

The difference between traditional prompt engineering and DSPy resembles the following shift in the history of web development.
- Traditional PE (craftsmanship):
  - Write HTML/CSS imperatively: "make the background red, make the text bigger..."
  - Lean on know-how (hacks): "adding this incantation (magic word) raises accuracy."
  - Every time the model changes, rewrite the prompt the way you would patch broken CSS.
- DSPy (declarative, working backward):
  - Declare "this is the input, and this is the expected output" (an I/O pair).
  - Define only the criterion for whether the output is correct (a metric / test code).
  - DSPy (the compiler/optimizer) then searches for and generates, on its own, the optimal HTML (prompt) that produces those correct answers. (A minimal code sketch follows this list.)
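To make the "declare the I/O, define a metric, let the optimizer search" flow concrete, here is a minimal sketch using DSPy's documented building blocks (Signature, Predict, BootstrapFewShot). Treat it as illustrative rather than definitive: the LM-configuration call varies between DSPy versions, and the task, dataset, and metric below are stand-ins.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Configure the underlying model (exact configuration call varies by DSPy version).
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# 1. Declare WHAT you want, not HOW to prompt for it (the Signature).
class TriageTicket(dspy.Signature):
    """Classify a support ticket into one of: billing, bug, feature_request."""
    ticket = dspy.InputField()
    category = dspy.OutputField()

program = dspy.Predict(TriageTicket)

# 2. Provide I/O pairs: the "correct answers" to work backward from.
trainset = [
    dspy.Example(ticket="I was charged twice this month", category="billing").with_inputs("ticket"),
    dspy.Example(ticket="The export button crashes the app", category="bug").with_inputs("ticket"),
    # ... more examples ...
]

# 3. Define the metric: the test that decides whether an output counts as correct.
def exact_match(example, prediction, trace=None):
    return example.category.strip().lower() == prediction.category.strip().lower()

# 4. Let the optimizer search for the instructions and few-shot demos that pass the metric.
#    The prompt itself becomes a compiled artifact, not something you hand-edit.
optimizer = BootstrapFewShot(metric=exact_match)
compiled_program = optimizer.compile(program, trainset=trainset)

print(compiled_program(ticket="Please add dark mode").category)
```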
What makes DSPy interesting is that the developer can take **the prompt, as an intermediate artifact, out of what they manage (or have it hidden entirely)**.
- Machines optimize the HTML better: rather than a human agonizing over "should this be a <b> tag or a <strong>?", letting the machine try thousands of patterns and decide "this tag structure works best for this model" simply ends up with better performance (accuracy).
- The finished form of the CMS: in the earlier "context engineering = CMS / web app" framing, DSPy is like **a sophisticated headless CMS: throw in the data (I/O pairs) and it performs the optimal rendering (prompting) on its own**.

The limits of the one-shot complete description (single shot) that you flagged at the very start are exactly what DSPy tries to break through with data-driven learning and search.
- PE (manual): trying to hand-paint the single best picture (LP).
- DSPy: hand over only "what makes a good picture" plus a few reference samples, let the system draft dozens, and have the best one selected.

Looking at it this way, it is clear that what is asked of AI engineers shifts from "the writing skill to produce good prompts (HTML skill)" to **"the ability to prepare high-quality I/O pairs (data) and to design appropriate evaluation metrics (tests)"**.

It really is a body of technique aimed at people who want to build a product.
User: Or rather, what DSPy-style description gives you is a CI function: it cushions the impact of bias mismatches when the model changes while still delivering the accuracy gains.
AI thinking process
Framing the DSPy Role
I'm now viewing DSPy as a Continuous Integration tool, cleverly mitigating bias mismatches. This lets us leverage accuracy gains from evolving models. The core idea focuses on how DSPy handles these changes. I'm exploring the implications of that for system design.
Reframing the Core Insight
I now understand that DSPy's strength lies not in prompt generation alone, but in robustly managing model updates. The previous compiler analogy was helpful, but I'm now leaning toward the idea of a CI/CD pipeline. My focus is on maintaining system stability and performance amidst continuous model churn and prompt drift. The fundamental DSPy advantage is the automatic adaptation of prompts to suit model updates.
Clarifying DSPy's Purpose
I'm solidifying my understanding of DSPy as an essential tool for managing continuous model updates. It elegantly handles the "bias mismatch" that arises when models evolve, which I now recognize as a core advantage. This is critical for products, not just one-off LPs. I am deepening the CI pipeline analogy: Code equals signatures/logic, model equals environment, prompt equals compiled artifact, and DSPy process equals the build pipeline with integration tests. I'm actively drafting content to communicate this.
Defining the Adaptation Process
I'm now zeroing in on the mechanism by which DSPy addresses the shift when old prompts break. The core lies in keeping the logic stable while swapping the model and employing the optimizer for automated bias alignment. I am also finding "Bias Mismatch" a helpful term. Think of DSPy as a build system that guarantees consistent output across various models, like ensuring a website functions across different browsers. Furthermore, I'll incorporate nuances like "abstraction layer" and "separation of dependencies", and I will emphasize how DSPy resolves "lock-in" concerns.
Analyzing the Automation Benefits
I'm now fully immersed in the core advantage: automating tuning to combat "prompt drift" and regression. This makes DSPy a true product, not just a one-time process. The analogy of DSPy as a high-level language with CI (compiling to the target architecture) is further solidifying the idea. This is critical in contrast to hardcoded machine code in the product.
Rather than viewing it as a static translation device (a "compiler"), framing DSPy as **an adaptive CI (continuous integration) system that prevents regressions when the model changes** pins down its practical value more accurately.

On this reading it becomes even clearer why DSPy is needed for a product (continuous operation) rather than an LP (a one-off).

Every time an LLM's model version moves (e.g. GPT-4 → GPT-4o → o1), the training-data biases and the quirks of how it follows instructions change. Hand-written prompts (hard-coded HTML) are often overfitted to one model's quirks, so the moment the model changes, accuracy drops or the behavior shifts.
- With hand-written prompts:
  - Model update → the prompt stops working → a human reads the logs, guesses at the bias, and patches it by hand (an endless fight against regressions).
- With DSPy (as CI):
  - The source code (the logic / Signature) and the test data (the dataset) stay unchanged.
  - You swap the model and just press the "build (optimize)" button.
  - The system, on its own, fine-tunes the few-shot example selection and the nuance of the instructions to the new model's biases and finds a configuration that passes the tests (the Metric).

In other words, it plays the role of **decoupling the prompt text from the model**.
The biggest risk in generative-AI development so far has been **prompt assets optimized for one specific model turning into liabilities**: the "this prompt only works on Claude 3.5 Sonnet" situation.

Framed as CI, DSPy becomes, in effect, automated cross-platform support.
- Developer: define "I want a summary" (the Signature) and prepare ground-truth data.
- CI (DSPy):
  - Build the prompt for GPT-4o → tests pass
  - Build the prompt for Llama-3 → tests pass
  - Build the prompt for Gemini 1.5 Pro → tests pass

This hands you the strongest weapon in product operations: **the moment a cheaper, faster model ships, you can switch to it (or run it alongside) immediately**. (A sketch of such a build-and-test loop follows.)
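As a sketch of what "prompt builds as CI" could look like in practice, the harness below re-optimizes and re-evaluates a program for each candidate model and fails the pipeline if any build drops below the quality bar. The compile_for and evaluate callables are injected placeholders for whatever optimizer and evaluation harness you actually use (for example, the DSPy pieces above); the model list and threshold are purely illustrative.

```python
from typing import Callable, Dict, Iterable

def prompt_build_pipeline(
    models: Iterable[str],
    compile_for: Callable[[str], object],   # placeholder: rebuilds the prompt for a given model
    evaluate: Callable[[object], float],    # placeholder: mean metric score on a frozen devset
    quality_bar: float = 0.90,
) -> Dict[str, float]:
    """Treat prompt optimization as a CI build: one build plus regression test per model."""
    results: Dict[str, float] = {}
    for model in models:
        program = compile_for(model)        # rebuild against this model's biases
        score = evaluate(program)           # regression-test against the same frozen data
        results[model] = score
        print(f"{model}: {score:.2%} [{'PASS' if score >= quality_bar else 'FAIL'}]")
    # Like any CI job, the pipeline fails loudly if a build regresses.
    if any(score < quality_bar for score in results.values()):
        raise SystemExit("prompt build regression: at least one model fell below the quality bar")
    return results
```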
Standing on this view of DSPy as CI brings us back to the LP vs. product discussion from the start.
- LP (a one-shot job): nobody prepares test data; it only has to work once.
- Product: without "what counts as correct" (the Metric) and validation data, the CI (DSPy) cannot run.

So adopting DSPy effectively forces the team into **curating evaluation data** and **defining quantitative quality criteria**. That is precisely the watershed that lifts AI work from playing around to engineering.

Exactly. The essence of frameworks like DSPy is not the skill of writing prompts but **the automation of abstraction and quality assurance (CI) against the unstable runtime environment that is the model**.
- Input: logic + data + evaluation metric
- CI process: adaptation to the model's bias (optimization)
- Output: an executable system with guaranteed accuracy

Once it is articulated to this point, it becomes much easier to explain on the ground why one should not cling to prompt engineering and why data should be collected. A great piece of framing.
User: Right. Even so, that means there are still domains where prompt engineering carries real weight?
AI thinking process
Acknowledging Prompt Engineering
I'm currently focused on the user's core point: acknowledging the continued relevance of manual prompt engineering, even amidst advancements like DSPy and CI pipelines. I'm exploring the areas where manual methods still hold value. This includes the subtle nuances and creative approaches that automated systems may struggle to replicate.
Analyzing Manual PE Value
I'm now diving into why manual prompt engineering remains crucial, using an HTML/CSS analogy. I'm thinking about scenarios where direct control is still preferred, like precision in creative writing or persona definition. I'm exploring when a "compiler" like DSPy might fall short and the artisan approach excels, especially when its overhead is too high for the task. I'm also starting to pinpoint domains where manual PE holds particular sway, focusing on creativity and art.
Clarifying PE's Key Domains
I'm now clarifying the key areas where manual prompt engineering still shines. I've categorized them as: the exploratory "zero-to-one" phase where LLM capabilities are first assessed, the pursuit of "pixel-perfect" creative output where nuanced artistic goals dominate, and the crucial base instruction design, or the "root" definition of the LLM's purpose. It involves setting the signature and system prompts effectively.
Justifying Manual Debugging
I'm now focusing on manual debugging and understanding, a crucial domain that can't be fully automated. The LLM's raw behavior forms the basis, and I'm refining the web app analogy, like hand-coding a landing page vs. building a CMS theme, to show their respective value. The conclusion is that both approaches are important, and choosing the right tool is the key! I'm now drafting content across various domains.
Revising Prompt Engineering's Role
I'm now revising my perspective to be more inclusive. I'm focusing on "craftsmanship" and "tactile feel" in manual prompt engineering to capture the subtle nuances. I want to emphasize that it's not just about art, but about understanding the model's raw capabilities, and how that expertise informs all prompt design, including DSPy signatures. I'm ensuring the response feels expert yet conversational, and that it addresses the user's initial prompt directly.
The further systematization (DSPy / CI-ification) advances, the more clearly there remain domains where, paradoxically, **raw, hands-on prompt engineering (hand-written HTML/CSS)** shines or is simply indispensable.

These are mainly **nuances that cannot be quantified** and **the moment something goes from zero to one**. Concretely, they boil down to the following three areas.

Optimization approaches like DSPy require input/output pairs (ground-truth data) and an evaluation metric. But the world is full of value that cannot be reduced to numbers.
- Creative work and expression:
  - Instructions like "make it punchier and more emotional" or "give it a wittier, more biting irony".
  - Here the "right answer" exists only in the developer's own taste, and judging it with automated tests is hard. **An artisan tuning by feel (hand-written PE)** keeps winning in this space.
- Persona and roleplay:
  - Adjusting the fine-grained tone of voice and phrasing, down to the particles, needed to stay in character. This resembles millimeter-level layout adjustment in CSS, and mechanical optimization can lose the flavor.
This is the phase of producing the first seed that gets the system (the CI) running at all.
- Prototyping:
  - At the stage of checking whether today's LLMs can even do the task, quickly trying hand-written prompts is overwhelmingly faster than collecting a dataset first.
- Input to DSPy (the Signature / instruction):
  - DSPy optimizes the instruction it is given, but a human still has to decide what to instruct.
  - If the quality of the initial baseline prompt is too low, the optimization algorithm falls into a local optimum. The importance of a human who knows how to write good markup supplying a high-quality starting point does not go away.
This is PE as basic engineering fitness.
- Grasping the model's characteristics:
  - When a new model ships (o1 or Claude 3.5, say), the only way to get a feel for its tendencies, for what it is good at and what it struggles with, is to talk to it directly and throw prompts at it.
  - A chef (engineer) who does not know the ingredients (the LLM) cannot design a good recipe (system). PE for understanding the raw behavior, before the system hides it away, is always needed.
Going forward, prompt engineering seems likely to split cleanly into the following two roles.
- PE as the "assembly language" of system development
  - The part that runs behind DSPy and agent systems and should be managed with CI/CD.
  - Humans do not write it directly; they supply meta-level instructions and data.
  - Goal: stability, generalization, automation.
- PE as "craftsmanship", that is, design and art
  - LP creation, copywriting, character design, validating initial hypotheses.
  - Humans write it directly and obsess over it (hand-written HTML).
  - Goal: aesthetics, nuance, breakthrough power.

Even when building a product, in the part that decides its UI/UX (how pleasant the conversation feels), deliberately choosing **to tune it by hand anyway** should remain a differentiator.
User: Then again, simply put, there is also the fact that prompt engineering matters when you use a chatbot as a stand-in for a search engine.
On the other hand, when using coding agents, claude.md and AGENT.md may end up the way dotfiles did when they became a nuisance.
Also, seen through a debugging lens, context engineering needs observability, and knowledge of prompts is part of what plays the role of the stack trace there.
AI thinking process
Examining Ad-Hoc Usage
I'm currently focused on the implications of straightforward applications. This is primarily around how replacing search engines and chatbots necessitates manual input. My recent thought shifts from purely technical features toward understanding user interaction, particularly in the context of ad-hoc usage. I'm exploring the implications this has for overall product design and user experience.
Analyzing Ad-Hoc Interactions
I'm now diving deeper into prompt engineering, moving beyond system architecture to focus on user interactions. I'm seeing an analogy between developing quick tools and system development, and how effective prompting is as critical as effective search. I'm also now pondering configuration clutter, like "claude.md" files, and how these could become a nuisance in code agent use. Finally, I'm considering debugging and observability, and how prompt engineering skills are crucial to understanding the "stack traces" of an LLM.
Revising Prompt Engineering
I'm now viewing prompt engineering as a core personal productivity skill, not just a deployment concern. I see the potential for "configuration sprawl" with files like claude.md, analogous to dotfiles, and how these represent the 'configuration as code' for AI agents. Finally, I'm emphasizing the critical role of prompt engineering skills in observability. I'm focusing on "stack traces" of AI and how prompts and completions are essential for debugging and understanding how models interpret instructions.
Diagnosing PE Applications
I'm now viewing prompt engineering as a diagnostic skill, not just implementation. I've restructured my thoughts around personal productivity, configuration management ("dotfiles of AI"), and the critical role of prompt engineering in debugging and observability. It is now a critical step in assessing "stack traces" and "reading logs". I'm refining my thinking, especially regarding the 'dotfiles' analogy and the comparison to search engine replacements.
Developing Core Infrastructure
My recent thinking leans into the idea of prompts becoming infrastructure settings, just like dotfiles. I'm solidifying my thoughts on the "LLM Stack Trace", where raw prompts are key to preventing context pollution. I'm also now focusing on refining the terminology.
The three angles you raise each point to a future in which PE survives in a changed form (or mutates).

"Prompt engineering matters when you simply use a chatbot in place of a search engine."

This is exactly an extension of **the ability to use Google search operators (site:, filetype:) well**.
- Googling skill → prompting skill: independent of system development, the conversational craft of drawing the highest-quality answer out of an LLM in the fewest steps becomes basic literacy for everyday work and information gathering.
- Ad-hoc, on-the-spot solving: for tasks not worth spinning up DSPy for, "write me this regex right now" or "summarize this email", the speed and precision of manual PE translate directly into personal productivity.
"When using coding agents, claude.md and AGENT.md may become what dotfiles became: a nuisance."

This analogy is astute, and a very interesting prediction.
- Context turning into environment-config files: just as .gitignore, .eslintrc, and .vimrc pin down how a development environment behaves, claude.md is **environment configuration (context configuration)** for how an AI agent should treat this repository.
- The future where it becomes a nuisance: it is easy to picture project roots turning into "secret sauce", overflowing with configuration files such as prompts/ and instructions/. These are static prompts, yet they are managed as part of the codebase: not Infrastructure as Code but **Prompt as Config**.
- The management problem: just as with dotfiles, this will eventually breed person-bound "my ultimate config" silos and conflicts where nobody on the team knows which instruction is actually doing the work. Here too, a discipline of curating and optimizing will be needed.
"Viewed from a debugging standpoint... knowledge of prompts lives inside something like the stack trace."

Technically, I think this cuts closest to the core: the shift from the skill of writing prompts to the skill of reading them.
- What is a stack trace for an LLM app? A traditional application's stack trace is a history of function calls. For an LLM app (LangChain, agents, and so on), the stack trace, the thing you look at in LangSmith or Arize Phoenix, is **the chain of raw prompts actually sent to the LLM and the raw responses to them**.
- Pinning down the cause of a bug: when you debug "why did the AI go wrong here?", what remains in the logs is a slab of prompt containing an enormous context.
  - "Ah, the noise the RAG step pulled in here is overwriting the system prompt's instruction."
  - To reach a diagnosis like "it is confused because the few-shot examples contradict the current situation", prompt-engineering knowledge, the principles of how an LLM interprets context, is indispensable. (A minimal trace-recorder sketch follows this list.)
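As a minimal sketch of the "stack trace" idea: record every raw prompt/response pair per step of a chain so a human can later diagnose which injected context overrode which instruction. Purpose-built tools (LangSmith, Arize Phoenix) do this with far more structure; the recorder below is a plain-Python stand-in, and the call_llm argument is a placeholder for whatever client you use.

```python
import json
import time
from dataclasses import dataclass, field, asdict
from typing import Callable, List

@dataclass
class LlmSpan:
    step: str        # logical step name, e.g. "rewrite_query", "final_answer"
    prompt: str      # the raw prompt actually sent, context and all
    response: str    # the raw completion that came back
    latency_s: float

@dataclass
class LlmTrace:
    spans: List[LlmSpan] = field(default_factory=list)

    def call(self, step: str, prompt: str, call_llm: Callable[[str], str]) -> str:
        """Wrap a single LLM call so its raw I/O lands in the trace."""
        start = time.time()
        response = call_llm(prompt)
        self.spans.append(LlmSpan(step, prompt, response, time.time() - start))
        return response

    def dump(self, path: str) -> None:
        """Persist the 'stack trace' for post-mortem reading."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(s) for s in self.spans], f, ensure_ascii=False, indent=2)
```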
This whole thread feels like it has fully sorted out the future shape of prompt engineering.
- When building systems: prompts become something generated and optimized automatically by CI/compilers such as DSPy rather than written by hand (liberation from writing).
- In personal use and environment setup: like search technique and dotfiles, it survives as the literacy and configuration that underpin individual productivity.
- In debugging and operations: it becomes mandatory as a **reading skill (diagnostic ability)** for decoding logs (stack traces) and identifying why the model misbehaved.

In short, the dedicated job title "prompt engineer" may disappear, but an engineer who does not know how prompts behave will be unable to debug AI applications or even set up their environment.
It is exactly the same structure as C and Python: you can write Python without knowing C, but when you hit a serious bug you need knowledge of memory management (the low layer).
User: Take it one step further. Looking at prompts within the context-engineering frame, there seems to be the angle that you cannot read all of it. So the question of what is not included (or what is included) comes up again.
(Separately from that, of course, there is also the prompt that racks up charges the way a slow query does, and the design problem of prompts fired at an abnormal frequency, like N+1 queries.)
AI thinking process
Expanding Key Concept
I am now focusing on the "Cannot read everything" aspect. This highlights a fundamental limitation, driving the need for sophisticated strategies in prompt engineering. It requires prioritizing and distilling information. It's about efficiency and effectiveness in information retrieval and processing.
Unveiling Context Blindness
I'm now grappling with the inherent "invisibility" of context states in complex systems. It's impossible to linearly read prompts of immense size. This forces a shift from direct inspection to auditing and sampling. The focus is now on uncovering what's missing or accidentally included, specifically the impact of retrieval failures (false negatives) within the context state. I'm also finding database analogies to be highly useful in modeling system performance.
Framing Contextual Deficiencies
I'm now zeroing in on logic errors stemming from flawed context. The "Lost in the Middle" problem showcases thisâirrelevant data pollutes the input. This necessitates specialized "Context Diffing" tools for semantic analysis, not just text viewing. Simultaneously, I am drawing a parallel between LLM calls and database queries to highlight performance issues. Heavy, slow prompts are analogous to inefficient queries, impacting latency and cost.
Conceptualizing Observability Shift
I'm now contemplating how we manage prompts we can't fully read. It's like debugging a web app, where profilers are used rather than reading the raw binary. This points toward "LLM APM" as a new engineering discipline. We need to shift from reading text to analyzing the system state and performance. The goal is to define specific ways to assess prompts via monitoring and management tools. This requires identifying and emphasizing critical metrics for an LLM ecosystem.
Analyzing Architectural Implications
I'm now integrating the user's database analogies, they're perfect. It's clear that inefficient prompts are like slow queries, and chatty agents suffer from an N+1 problem. I'm focusing on optimization, which means prompt compression and CoT/batching. This also means shifting from print statements to semantic profiling with "Information Density" and "Token Efficiency" as key metrics. It's clear the prompt engineer is evolving into a system reliability engineer, focused on the flow of information.
Formulating Dynamic Queries
I'm now viewing prompts as dynamic query plans, moving beyond static scripts. I've confirmed I addressed the "missing vs. included" issue and the database metaphors. This shift emphasizes prompt engineering as an active process of query optimization, requiring techniques to analyze and adjust query plans, not just view raw text. The goal is to maximize information retrieval effectiveness.
Once you start from the premise that you cannot read all of it, the prompt is no longer something you read; like a database or a packet capture, it becomes something you **inspect and measure**.

Organizing your points along two axes, information content (qualitative) and performance (quantitative), a new engineering domain comes into view.

When the context window reaches tens or hundreds of thousands of tokens (dozens of books' worth), it is impossible for a human to eyeball a prompt and judge whether it is good or bad. The problem then is one of signal-to-noise ratio: is the intended information actually present, neither buried in noise nor missing?
- What is not included? (Context miss / retrieval failure)
  - The system looks like it is running fine, but the answers are subtly off.
  - Cause: a mis-set retrieval-score threshold in RAG means the decisive document never gets injected into the context (a botched WHERE clause).
  - This is not a problem with how the prompt is written; it is a defect in the data pipeline (ETL). It becomes a data-quality question: how do you detect that data which should be in the prompt is missing?
- What has ended up included? (Context pollution / prompt injection)
  - Is any sensitive data (PII) in there? Has an outdated spec been mixed in?
  - The Lost in the Middle phenomenon: the relevant information is present, but there is so much irrelevant noise (junk data) that the LLM ignores the important instruction.
  - The debugging approach cannot be "read everything"; it has to be statistical: **visualize the topic distribution inside the context, measure the information density**, and so on. (A rough accounting sketch follows this list.)
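As a sketch of what "measure instead of read" can mean, the function below breaks an assembled context into labeled sections and reports each section's share of the token budget, flagging when retrieved material threatens to drown out the instructions. The whitespace token count is a rough stand-in (a real setup would use the model's own tokenizer), and the 0.8 noise threshold is an arbitrary number chosen for illustration.

```python
from typing import Dict

def approx_tokens(text: str) -> int:
    # Crude stand-in; swap in the model's real tokenizer (e.g. tiktoken) in practice.
    return max(1, len(text.split()))

def audit_context(sections: Dict[str, str], noise_labels=("retrieved_docs", "history")) -> Dict[str, float]:
    """Report each labeled section's share of the context plus a rough S/N warning."""
    counts = {label: approx_tokens(text) for label, text in sections.items()}
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    noise_share = sum(shares.get(label, 0.0) for label in noise_labels)
    for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{label:>15}: {counts[label]:>6} tok ({share:5.1%})")
    if noise_share > 0.8:  # arbitrary threshold, for illustration only
        print("WARNING: instructions risk being buried (Lost in the Middle territory)")
    return shares

# Example: a context assembled from typical building blocks.
audit_context({
    "system_prompt": "You are a support assistant...",
    "retrieved_docs": "...tens of thousands of tokens of RAG output...",
    "history": "...prior turns...",
    "user_query": "Why was I charged twice?",
})
```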
The database analogy is extremely apt and cuts to the core of LLM system design.
- The giant prompt as slow query (heavy context)
  - Symptom: processing that attaches the entire company rulebook (100k tokens) to every call just to ask "say hello".
  - Cost: input-token charges balloon and latency (TTFT: time to first token) gets worse.
  - Countermeasure: this is SQL's ban on SELECT *. Context compression (summarization), ranking so that only the needed parts are pulled in, and other **information-narrowing techniques** become prompt engineering's main battleground. (See the budget-packing sketch after this list.)
- Multi-step agents as the N+1 problem (chatty agents)
  - Symptom: the agent stops thinking and loops through wasteful little requests: search → check → search → check...
  - Cause: the instructions are sliced too finely, or the tool definitions (function calling) are too weak to fetch the information in one go.
  - Countermeasure: designs that fetch everything needed in one prompt (query), in Eager Loading style (for example, having it lay out the whole plan in a single chain-of-thought pass).
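A sketch of the "ban SELECT *" countermeasure mentioned above: instead of shipping every retrieved chunk, rank candidates by a relevance score and pack only what fits a fixed token budget. The scoring function is deliberately left as an injected parameter (your reranker, embedding similarity, or anything else), and the budget number is illustrative.

```python
from typing import Callable, List, Tuple

def pack_context(
    query: str,
    chunks: List[str],
    score: Callable[[str, str], float],   # e.g. a reranker or embedding similarity
    token_budget: int = 4000,
    approx_tokens: Callable[[str], int] = lambda t: max(1, len(t.split())),
) -> Tuple[List[str], int]:
    """Greedy budgeted selection: the context-side equivalent of avoiding SELECT *."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        cost = approx_tokens(chunk)
        if used + cost > token_budget:
            continue  # drop what does not fit; never silently blow the budget
        selected.append(chunk)
        used += cost
    return selected, used
```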
Seen this way, debugging and optimization under advanced context engineering converge ever more closely on the database world's **EXPLAIN ANALYZE (analysis of the execution plan)**.

What humans should be doing is not polishing the grammar of the generated text but **monitoring the execution plan**:
- Context construction plan: was the needed information fetched? Was the unneeded information dropped? (data audit)
- Execution cost: how many tokens (how much money, how many seconds) did obtaining that answer take? (cost and performance analysis)
- Traceability: where in the giant context did the source that grounded the final answer sit? (verifying citations and grounding)

**Precisely because we cannot read everything, we have to stop "reading the prompt" and shift toward managing the context's state (State) and its resource consumption (Metric).**
That, I would say, is what prompt design as genuine engineering, having outgrown mere wordsmithing, ought to look like.
User: There are also still plenty of problems that have to be solved once you combine this with coding agents.
On planning, it may be enough to invoke it with just the intent ("please handle this"), but you have to define when it counts as done (maybe that is the question of whether to make it a k8s-style kernel or a GitHub-Actions-style one-shot transaction script).
Also, RAG does not work; you end up wanting to grep and hand everything to the LLM; and when what you want is structure rather than a fragment, neither approach can fetch it.
Stories like that. Around there it becomes less a question of accuracy and more a question of control (and of course that may include distributed-systems machinery like backpressure and circuit breakers, but those are extensions of existing systems).
AI thinking process
Initiating Coding Agent Analysis
I'm now diving into the core challenge: designing autonomous software development agents. My focus is sharpened on the critical aspects of completion and termination. I'm exploring methods to empower the agent to recognize when a task is finished and to gracefully halt its operations, preventing endless loops or inappropriate resource consumption. This initial phase involves establishing clear criteria for success and identifying reliable signals for task closure.
Clarifying Termination & Control
I'm now wrestling with the agent's lifecycle, contrasting one-shot scripts with continuous reconciliation loops. The termination problem, akin to the Halting Problem, looms large. My thoughts are centering on the importance of control mechanisms, like backpressure and circuit breakers, to prevent runaway processes. I also consider whether these issues are novel or just familiar patterns from distributed systems.
Analyzing Code's Structure
I'm now zeroing in on the "RAG for Code" issue, recognizing that code retrieval demands more than semantic similarity. Unlike text, code requires structure and execution path understanding. Embeddings aren't sufficient; grep is far too noisy. My current focus is on "Structure-Aware RAG," exploring the use of LSP, AST, and call graphs for code retrieval, and applying this to improve the agent's actions.
Refining Agent Control Systems
I'm now zeroing in on the similarities between the agent and a Kubernetes controller, framing it as a Desired State Configuration problem. I'm focusing on the convergence challenge: how to guarantee that the agent's actions move code toward the desired state, given the potential for the Halting Problem. Explicit termination conditions and escalation strategies are critical. This leads me to further investigate quality-based circuit breakers.
Optimizing Agent Response Drafting
I'm now focusing on the agent's iterative response refinement process. The drafting phase includes structure and control, contrasting transactional versus reconciler approaches, and emphasizing the Halting Problem within reconciler loops. I'm prioritizing the role of LSP and graph-based context for code retrieval, and I'm further refining the system's stability through the implementation of quality-based circuit breakers.
Defining Response Refinement
My focus is now on the agent's self-correction process during drafting. I initially considered graph databases, but I'm now leaning towards LSP as a better analogy for structure, maintaining a developer-centric perspective. I'm also refining my understanding of control, recognizing that the LLM output's noise necessitates strict guardrails. I'm working to ensure terms like "Observability," "AST," "LSP," and "Reconciliation" are used correctly in context.
Organized along the three angles you raise (architecture, structuring the context, stability control), the following new kind of engineering comes into view.

It is a difference in the mental model of how the agent should run.
- Transaction Script (one-shot / GitHub Actions style):
  - Model: Input -> Process -> Output.
  - Use: "fix the bug in this function", "add a test case here".
  - Problem: with complex dependencies it is hard to land the right answer in one shot. "Done" is well defined, but quality is not guaranteed.
- Reconciliation Loop (controller / k8s style):
  - Model: a loop that keeps amending the Current State (today's code) toward the Desired State (the state in which the tests pass).
  - Problem (the halting problem): the hardest question here is **when do we call it done?**
  - The LLM says "fixed", yet the tests fail → another fix → a different bug appears... there is a real risk of an **endless loop (flapping)**.
  - Here you need to implement an **epistemic stop condition** that traditional distributed systems do not have: the judgment that continuing is just sinking deeper into the swamp and it is time to escalate to a human. (A sketch of such a bounded loop follows this list.)
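A sketch of a reconciliation loop with an epistemic stop condition, under stated assumptions: run_tests returns a pass flag plus a failure report, propose_patch asks the agent for a fix given that report, apply_patch applies it, and escalate hands off to a human. All four are hypothetical stand-ins for your own harness; the retry budget is illustrative.

```python
from typing import Callable, Tuple

def reconcile(
    run_tests: Callable[[], Tuple[bool, str]],   # hypothetical: returns (passed, failure_report)
    propose_patch: Callable[[str], str],         # hypothetical: agent drafts a patch from the report
    apply_patch: Callable[[str], None],          # hypothetical: applies the patch to the workspace
    escalate: Callable[[str], None],             # hypothetical: hands the task to a human
    max_attempts: int = 3,
) -> bool:
    """Drive the current state toward the desired state (green tests), but know when to stop."""
    last_report = ""
    for attempt in range(1, max_attempts + 1):
        passed, report = run_tests()
        if passed:
            return True                          # desired state reached: terminate cleanly
        if report and report == last_report:
            break                                # identical failure twice in a row: flapping, not converging
        last_report = report
        if attempt < max_attempts:
            apply_patch(propose_patch(report))   # one more reconcile step
    escalate(f"giving up after {max_attempts} attempts; last failure:\n{last_report}")
    return False
```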
"You want to grep and hand everything to the LLM; what you want is structure, not a fragment."

This is the biggest technical barrier for coding agents.
- The limits of vector search (semantic):
  - Ordinary RAG (vector search) looks for semantic closeness. Searching for "authentication" might surface auth.ts, but the base_controller.ts it inherits from and the user_model.ts it depends on will not come up unless they happen to contain semantically matching keywords.
- What is being asked for is an "LLM edition of LSP (Language Server Protocol)":
  - What coding needs is not "meaning" but **reference structure (references and definitions)**.
  - "I want to grep" really means "**I want to pull in, chain by chain, exactly the parts with dependency relationships, based on the AST (abstract syntax tree) and the call graph**".
- Structural context: context engineering here is not about gathering text fragments; it is **the technique of cutting the task-relevant subgraph (subtree) out of the graph structure that is the repository and serializing it**. This is exactly what advanced tools such as Cursor and GitHub Copilot Workspace are trying to do behind the scenes. (A toy AST-based sketch follows this list.)
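A toy illustration of pulling a structural subgraph instead of grepping: Python's standard ast module maps which functions call which, then a walk outward from a target function collects only the definitions an agent actually needs in context. Real tools lean on full LSP servers and cross-file call graphs; this single-file version only shows the shape of the idea.

```python
import ast
from collections import deque
from typing import Dict, Set

def call_graph(source: str) -> Dict[str, Set[str]]:
    """Map each top-level function name to the names it calls (single file, toy version)."""
    tree = ast.parse(source)
    graph: Dict[str, Set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name] = {
                c.func.id
                for c in ast.walk(node)
                if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
            }
    return graph

def relevant_subgraph(source: str, target: str) -> Set[str]:
    """Collect the target function plus everything it transitively depends on."""
    graph = call_graph(source)
    needed, queue = set(), deque([target])
    while queue:
        name = queue.popleft()
        if name in needed or name not in graph:
            continue
        needed.add(name)
        queue.extend(graph[name])
    return needed

code = """
def load_user(uid): return db_get(uid)
def db_get(uid): ...
def render_profile(uid): return template(load_user(uid))
def template(user): ...
def unrelated(): ...
"""
# Prints the set {'render_profile', 'load_user', 'db_get', 'template'} (order may vary).
print(relevant_subgraph(code, "render_profile"))
```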
Letting an agent act autonomously is the same as taking on a microservice with unpredictable behavior. The existing distributed-systems patterns apply, but their trigger conditions become peculiar.
- Circuit Breaker (cut-off):
  - Normally: trip on error rate or latency.
  - For an LLM agent: it has to trip on **loop count** or on **an explosion in the number of changed lines**.
  - For example, semantic trips are needed, such as "if three fix attempts still do not make the tests pass, drop the breaker and notify a human" or "this was supposed to touch one file, yet it is trying to rewrite every file (detection of a destructive change)". (A sketch of such thresholds follows after this list.)
- Backpressure (flow control):
  - The agent never tires and keeps producing PRs, but the humans who review them (the bottleneck) drown.
  - When the "waiting for human review" queue overflows, you need control that suspends the agent's thinking process and, so to speak, swaps it out to disk.
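A sketch of the "semantic circuit breaker" idea: thresholds defined over the agent's behavior (fix-loop count, blast radius of the diff) rather than over HTTP error rates. The specific numbers and the ChangeStats fields are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChangeStats:
    fix_attempts: int     # how many repair loops the agent has run for this task
    files_touched: int    # blast radius of the proposed change
    lines_changed: int
    deleted_tests: int    # deleting failing tests is a classic way for an agent to "cheat"

@dataclass
class SemanticBreaker:
    max_attempts: int = 3
    max_files: int = 10
    max_lines: int = 500

    def trip_reason(self, s: ChangeStats) -> Optional[str]:
        """Return a human-readable reason to halt the agent, or None to let it continue."""
        if s.fix_attempts > self.max_attempts:
            return "fix loop exceeded its retry budget (flapping)"
        if s.files_touched > self.max_files or s.lines_changed > self.max_lines:
            return "destructive change detected: blast radius too large"
        if s.deleted_tests > 0:
            return "agent is deleting tests instead of fixing the code"
        return None

breaker = SemanticBreaker()
reason = breaker.trip_reason(ChangeStats(fix_attempts=4, files_touched=2, lines_changed=40, deleted_tests=0))
if reason:
    print(f"TRIP: {reason} -> suspend the agent and notify a human")
```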
In the end, what it takes to make coding agents practical is not prompt fine-tuning but building a **control plane for probabilistic systems** along these lines:
- State Manager: like k8s, it manages the state of tests and code and the convergence toward the defined goal, while holding a give-up budget (max retries / time budget) to prevent infinite loops.
- Context Graph Engine: middleware that resolves the code's structural dependencies using LSP and AST analysis, rather than plain text search (grep/RAG), and injects exactly the context that is needed.
- Semantic Guardrails: circuit breakers that detect not only API errors but **the danger level of generated code and a runaway agent**, and bring the system to an emergency stop.

All of this lies on the extension line of classic system design (distributed systems, compiler theory), but in one respect, **the components' internals are black boxes that may lie**, it is a new design domain of extremely high difficulty.
User:
Prompt details: a variant of the "thinking memo" prompt specialized for logical record-keeping.
[Instructions]
You are a "Technical Compiler" who extracts the logical structure of thinking and compiles an accurate record.
Analyze the provided thinking and produce a self-contained technical document that accurately reflects its final conclusions and logical structure. Ignore all early mistakes, hesitations, and pre-correction content in the thinking process, and describe only the final, corrected state. Focus on the results of the thinking, and write it not as a process but as structured knowledge the reader can understand.
- Deliberately avoid summarizing or shortening; write out the necessary logic in detail. Do not worry about length.
- If it does not fit in one article, split automatically into "Article 2", "Article 3", and so on while keeping the content consistent, and output everything to the end in a single response without asking the user for confirmation.
- The article must begin with a title marked with a single "#". Split it into chapters along the logical structure of the thinking as needed (a single chapter is fine if splitting would be unnatural).
- Add exactly one emoji symbolizing the content to each chapter title.
- Output the article in Markdown format. Emphasis via bold or italics with * or ** is entirely forbidden.
- At the top of the article, write front matter with the following fields as a code block.
  - target_reader: [describe the intended reader concisely]
  - objective: [describe the aim of the article concisely]
- Use technical terms actively, and to help the reader, add supplementary footnotes where needed. In the footnotes, explain both the term's ordinary meaning and its meaning and usage here.
- Write the text so that a reader who does not know the original conversation exists can understand it completely. You may supplement necessary background knowledge and terminology from your own knowledge base, but never use expressions that hint at the existence of a dialogue, such as "in the earlier conversation" or "in response to your question".
- The reader knows nothing about the author (me), you (the AI), or the original SNS post. Do not mention any of them.
Now, please go ahead.
Execute the prompt above on the entire conversation so far, from beginning to end.
AI thinking process
Defining the Scope Now
I'm currently focused on the prompt engineering aspect. I'm carefully analyzing the user's conversational history to understand the problem they're trying to solve. Understanding the intent is paramount before any suggestions are made. My main goal is to fully understand the user's focus and determine the main context of the request, so the suggestion is as relevant as possible.
Compiling Technical Insights
I'm now fully immersed in the "Technical Compiler" persona. The confidence score is high, so I'm moving forward with the synthesis. My primary task is to distill the conversation into a self-contained technical document, with a strong focus on logical structure and the final state. I am paying close attention to the requested output format and constraints, ensuring a clean Markdown document is produced.
Synthesizing Systemic Architectures
I'm now thinking about the shift from static prompt engineering to dynamic context engineering. I'm drawing analogies to web development: prompt engineering is like HTML/CSS, while context engineering resembles full web applications or CMS. DSPy feels like a CI system, decoupling prompts from models and encouraging test-driven development. Though manual prompt engineering is still important, I'm now fully focused on system engineering challenges related to coding agents, including the halting problem and efficient retrieval using AST/LSP.
Formulating a Content Outline
I've just structured the evolving ideas into a proposed content outline. I'm aiming for a compelling narrative, starting with the paradigm shift from prompt engineering to system engineering, then moving through DSPy, the persistence of manual prompt engineering, and finally diving into advanced context engineering and autonomous agents. The title is "LLM System Architecture: From Static Prompting to Dynamic Context Engineering." I'm now drafting the content chapter by chapter.
Iterating On Detail
I'm now deep in the final refinement phase. I'm focusing on text formatting, ensuring no bolding or italics are present, and that emojis and definitions are included where needed. I'm double-checking for self-containment, removing any "as you mentioned" phrases. During this process, I've had to make some crucial self-corrections. I've re-written the "Context Engineering" explanation to properly reflect the CMS analogy, and I'm ensuring the Coding Agent section emphasizes the shift from accuracy to control, via Circuit Breakers. I'm also carefully incorporating the nuances of loops and the choice between Structural context (AST vs. Vector).
---
target_reader: Engineers involved in product development using large language models (LLMs), system architects, and designers of AI applications
objective: Systematize the difference between prompt engineering and context engineering using analogies from web development, and present a logical framework for CI/CD, observability, and the control theory of autonomous agents in building LLM systems.
---

# The Design of Large-Language-Model Systems: From Static Description to Dynamic Context Management

Development methods that use generative AI are shifting from the optimization of a single instruction toward system design that involves dynamic state management. This article defines the relationship between prompt engineering and context engineering by likening it to the evolution of web technology, and explains the essentials of control and observation in implementing autonomous agents.
The difference between writing instructions for an LLM (prompt engineering) and managing the state of the system as a whole (context engineering) can be said to equal the relationship between static web production and dynamic web application development.

The essence of prompt engineering is the optimization of a markup language aimed at the rendering engine that is the LLM; it corresponds to writing HTML and CSS. One describes structure and constraints completely within a single request (a single shot) and tries to guarantee the consistency of the rendered result (the generated output). This approach remains effective today for deliverables that, like a landing page (LP), call for a one-shot complete description, and it functions as the low-level technique for controlling the LLM's basic behavior.

Context engineering, on the other hand, corresponds to web application development, or to building a CMS (content management system). Multi-stage processing and dynamic state transitions are the premise here. Just as a web application dynamically loads only the information required by the state of the session and the database, context engineering dynamically generates and injects the context needed at each step of inference and manages the LLM's state.
As a move toward systematizing the writing of prompts, frameworks such as DSPy[^1] are coming to the fore. Their real value becomes clear when they are understood not as mere prompt auto-generation tools but as the CI (continuous integration) systems of software development.

Hand-written prompts tend to overfit the biases of a particular model and carry the risk of degraded accuracy when the model is upgraded or replaced. In contrast, an approach that derives and optimizes the prompt from input/output pairs (a dataset) and an evaluation metric has the effect of decoupling the prompt from the model.

This mechanism functions as a CI pipeline that absorbs the regressions arising when models change and automatically rebuilds the prompt (the markup) to match the characteristics of the new model. It frees developers from fine-tuning prompts and lets them concentrate on the more essential engineering work of curating high-quality datasets and defining quantitative quality criteria.

Not every area is automated, however. In creative domains where no ground truth can be defined (tuning of tone and manner, for example), in the validation of initial hypotheses before a system goes into operation (prototyping), and in the exploratory dialogue required to grasp a model's basic characteristics, hand-written prompt engineering by humans remains indispensable.
As context windows expand, the information contained in a prompt is exceeding the limit of what a human can check by eye. The prompt therefore changes from something to be read into something to be inspected and measured, like a database.

Debugging in an LLM system resembles the analysis of a database query execution plan (EXPLAIN ANALYZE). Audits from the following two viewpoints are required.
- Managing information content and the signal-to-noise ratio: in RAG (retrieval-augmented generation)[^2] systems, retrieval-quality problems cause needed information to be missing (context miss), while, conversely, irrelevant noise mixed into the context causes important instructions to be ignored (context pollution). These should be treated as quality problems of the data ETL (extract, transform, load) pipeline and managed through statistical visualization of the topic distribution and information density within the context.
- Optimizing resource consumption: sending an unnecessarily huge context on every call invites cost growth and latency degradation, just as a slow query (SELECT *) does in a database. Likewise, an agent that repeats inference in needlessly fine-grained steps produces the same inefficiency as the N+1 problem[^3]. What is required is query-plan optimization: compression and ranking of information, or Eager-Loading-style designs that obtain several pieces of information in a single inference.
When building an autonomous system that keeps acting on its environment, such as a coding agent, control mechanisms are needed to confine the LLM's probabilistic behavior inside a deterministic system.

First, as an execution model, there is a choice between a one-shot transaction-script style, like GitHub Actions, and a convergence loop toward a desired state, like a Kubernetes controller. In the latter case the system faces a halting problem, in which the LLM keeps attempting fixes and the tests never pass, so the implementation of epistemic stop conditions (retry limits, detection of quagmires) is essential.

In understanding the codebase, moreover, conventional vector search (semantic search) alone is insufficient. To obtain structural context, such as the code's dependency relations and inheritance structure, techniques are needed that use LSP (Language Server Protocol) and AST (abstract syntax tree)[^4] analysis to carve the necessary subgraph out of the graph structure that is the repository.

Furthermore, the distributed-systems concepts of backpressure and circuit breakers must be applied at a semantic level. Implementing guardrails that bring the system to an emergency stop based on semantic thresholds, such as exceeding the number of fix loops or detecting a destructive volume of change, rather than on mere error rates, is the linchpin of operating agents in practice.
User: Please add footnotes for the technical terms used in this conversation. In each footnote, explain the general meaning and how the term is used in this conversation, including where the two differ.
A bullet-list sort of output format is fine.
AI thinking process
Framing The Goal
I'm currently focused on the core objective: crafting footnotes for technical terms. My approach involves a two-pronged strategy. I'm aiming to provide a general definition alongside a context-specific explanation of how the term was used in the previous discussion or article, ensuring clarity and nuance.
Compiling The Glossary
I've moved on to the next stage, which is creating a glossary of key terms. Right now I am making a list of technical jargon that will need footnotes. I am prioritizing terms related to Web development, DevOps, Databases, and Distributed Systems that were applied to AI/LLM contexts to ensure that the reader is not left with questions. Currently, I've identified several terms, including Prompt Engineering, RAG, and Observability, among others.
Expanding The Glossary
I am adding new terms to the glossary, and working on definitions. I'm focusing on providing both general and context-specific explanations. I've added "N+1 Problem," "TTFT (Time To First Token)," "Transaction Script vs. Reconciliation Loop," "LSP (Language Server Protocol) / AST," "Circuit Breaker / Backpressure," and "Desired State / Current State." My goal is to enhance reader understanding through clear, concise definitions.
Refining the Definitions
I'm now drafting entries for the glossary. I've focused on specific terms and am providing both general definitions and context-specific explanations for each. I've considered "DSPy" as a Python framework for prompts in general, and as a compiler/CI system decoupling logic from prompt phrasing in this context. I'm also including "N+1 Problem" and "Circuit Breaker," defining them in both broad and specific senses.
Focusing On Clarity
I'm now zeroing in on sharpening the glossary entries, especially for those that have multiple meanings. I've considered how to differentiate terms like "Slow Query," expanding on the difference between general IT and LLM context. I'm prioritizing accuracy for clarity.
- Context engineering
  - General meaning: (the term itself is close to a recent coinage) designing a system with context and situation taken into account.
  - In this conversation: the concept contrasted with prompt engineering (static description). It treats the LLM not as a stateless function but as part of a stateful system, and refers to the backend discipline of dynamically composing the optimal context through history management, RAG-based information injection, agent routing, and so on. Likened to web application development and CMS construction.
- Single shot / multi-stage
  - General meaning: processing that completes in one pass / processing that proceeds in stages.
  - In this conversation: the structural classification used to distinguish prompt engineering as "an LP (one complete description in one go)" from context engineering as "a web app (state transitions and multi-stage processing)".
- DSPy
  - General meaning: a Python framework, developed at Stanford and elsewhere, for optimizing LLM prompts.
  - In this conversation: cast in the role of a "compiler" or "CI tool" that, instead of having prompts written by hand, derives and auto-generates them from inputs and expected outputs (data). Positioned as infrastructure for preventing accuracy degradation when the model changes.
- CI (continuous integration)
  - General meaning: the practice in software development of automatically building and testing on every code change to keep quality intact.
  - In this conversation: the automated process of re-tuning prompts to a new model whenever the LLM is upgraded or swapped, and checking that accuracy has not dropped.
- Overfitting
  - General meaning: an AI model fitting its training data so closely that it fails to handle unseen data.
  - In this conversation: the phenomenon where a prompt written tightly around one model's quirks (say GPT-4) stops working the moment the model changes (say to Claude 3.5). Refers to the brittleness of hand-written prompts.
- S/N ratio (signal-to-noise ratio)
  - General meaning: the ratio of signal (needed information) to noise (unneeded information).
  - In this conversation: the proportion of "information needed for the answer" versus "irrelevant information" inside the context window. The longer the context, the more noise, and the duller the LLM's judgment.
- Lost in the Middle
  - General meaning: a property of LLMs whereby information at the start and end of a long context is retained while information in the middle tends to be forgotten (or ignored).
  - In this conversation: one of the harms of overstuffing the context; cited as a reason why placement matters in prompt engineering and why information needs to be pared down.
- Slow query
  - General meaning: in databases, an inefficient query that takes a long time to run and drags down the whole system.
  - In this conversation: an inefficient prompt design that ships an oversized text (tens of thousands of tokens) to the LLM on every call, driving up cost (billing) and latency (response time).
- The N+1 problem
  - General meaning: in database access, a data fetch that should take one query is instead repeated N times (e.g. inside a loop), devastating performance.
  - In this conversation: the inefficient design in which an autonomous agent, instead of planning everything in one pass, calls the LLM over and over in small "search → check → search → check" steps, wasting time and tokens.
- Reconciliation loop
  - General meaning: the control loop used in systems like Kubernetes that compares the desired state with the current state and keeps making corrections to close the gap.
  - In this conversation: the autonomous behavior of a coding agent repeating fixes and test runs until the code reaches the desired state (tests pass). Carries the risk of an infinite loop (the halting problem).
- LSP (Language Server Protocol) / AST (abstract syntax tree)
  - General meaning: a protocol used for code completion and jump-to-definition / a tree data structure representing a program's structure.
  - In this conversation: the techniques for accurately capturing the code's structural dependencies ("function A calls function B"), which text-level semantic search (RAG) cannot do, and handing them to the LLM. The thing that stands in for grep.
- Circuit breaker
  - General meaning: in distributed systems, a mechanism that cuts off connections to a failing service to keep the whole system from collapsing with it.
  - In this conversation: the safety device that force-stops the AI's reasoning process when an agent falls into an infinite loop or tries to make dangerous, code-destroying changes.
- Backpressure
  - General meaning: a mechanism by which a receiver that cannot keep up signals the sender to hold off.
  - In this conversation: the operational control of temporarily throttling an agent when it produces so many proposed changes (pull requests) that human review cannot keep up.
Footnotes
[^1]: DSPy: A framework developed at Stanford and elsewhere for automatically optimizing prompts to language models. It generates prompts using declarative signatures and optimizers.
[^2]: RAG (Retrieval-Augmented Generation): A technique that retrieves relevant information from external sources such as databases and includes the results in the LLM's input (context), raising the accuracy and domain specificity of answers.
[^3]: N+1 problem: A problem in which the number of database accesses becomes excessive and performance drops. Here it refers to an agent performing an excessive number of inference round trips to obtain the information it needs.
[^4]: AST (Abstract Syntax Tree): A tree-shaped representation of the structure of a program's source code, used for semantic analysis and transformation of code.