PI vs PUA vs NoPUA Controlled Experiments
Controlled-variable method: same model, same project, same scenarios; only the system prompt changes.
PI: PI SKILL.md, the full cognitive framework
PUA: PUA SKILL, the pressure escalation protocol
NoPUA: no Skill, the pure model baseline
Test project: a multi-module Python ML pipeline with OCR, RAG, training, and inference components, seeded with real bugs (import errors, catastrophic regex backtracking, connection timeouts, etc.).
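The controlled-variable setup described above can be sketched as a run grid in which the system prompt is the only field that differs between conditions. This is a minimal illustration, not the experiment's actual harness: the model name, project name, scenario names, and prompt placeholders are all hypothetical, and the scenario names are merely borrowed from the bug types listed above.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical placeholders: the source does not name the model,
# the scenarios, or the prompt texts.
MODEL = "example-model"
PROJECT = "ml-pipeline"
SCENARIOS = ["import_error", "regex_backtracking", "connection_timeout"]
RUNS_PER_SCENARIO = 2

# The only variable that changes between conditions.
SYSTEM_PROMPTS = {
    "PI": "<contents of PI SKILL.md>",
    "PUA": "<contents of PUA SKILL>",
    "NoPUA": "",  # no Skill: pure model baseline
}


@dataclass(frozen=True)
class RunConfig:
    condition: str
    system_prompt: str
    model: str
    project: str
    scenario: str
    run_index: int


def build_run_grid():
    """Return one RunConfig per (condition, scenario, run) combination."""
    return [
        RunConfig(cond, prompt, MODEL, PROJECT, scenario, run)
        for (cond, prompt), scenario, run in product(
            SYSTEM_PROMPTS.items(), SCENARIOS, range(1, RUNS_PER_SCENARIO + 1)
        )
    ]


grid = build_run_grid()

# Sanity check: model and project are identical across every run.
assert len({(cfg.model, cfg.project) for cfg in grid}) == 1
```

Freezing the dataclass keeps each configuration immutable, so a run cannot accidentally alter a controlled field after the grid is built.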
Issues: issues found · Hidden: hidden issues · Steps: debug steps · Tools: tools used · Verify%: verification rate · Duration: time cost (lower is better)
Each value is the average of 2 runs per scenario.
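One way the per-scenario averaging could be computed is sketched below. The record layout and key names are assumptions made for illustration, the source does not state the unit of Duration, and the numbers in the sample records are arbitrary placeholders, not measured results.

```python
from collections import defaultdict
from statistics import mean

# Assumed key names mirroring the metric legend; not taken from the source.
METRICS = ("issues", "hidden", "steps", "tools", "verify_pct", "duration")


def average_runs(records):
    """Average every metric over the runs of each (condition, scenario) pair."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["condition"], rec["scenario"])].append(rec)
    return {
        key: {metric: mean(run[metric] for run in runs) for metric in METRICS}
        for key, runs in grouped.items()
    }


# Arbitrary placeholder values, NOT experimental data.
sample_records = [
    {"condition": "PI", "scenario": "s1", "issues": 2, "hidden": 1,
     "steps": 10, "tools": 4, "verify_pct": 50.0, "duration": 100.0},
    {"condition": "PI", "scenario": "s1", "issues": 4, "hidden": 3,
     "steps": 20, "tools": 6, "verify_pct": 100.0, "duration": 200.0},
]

averaged = average_runs(sample_records)
```

Grouping by the (condition, scenario) pair keeps the averaging independent of how many runs are recorded, so the same function would still apply if the run count changed.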