The first installable fix for academic citation hallucination: a Claude Code workflow after 146,932 hallucinated citations

The scale of the problem

Zhao et al. have just completed a large-scale audit:

111 million references
2.5 million papers (arXiv, bioRxiv, SSRN, PMC)
146,932 hallucinated citations (in 2025 alone)
85.3% of hallucinated citations in preprints survived into the formally published version

This is not an edge case. It is a systemic problem.

Academic Research Skills

A Claude Code skills suite created by Cheng-I Wu (GitHub: Imbad0202), and the first installable workflow that wires the fix into the paper pipeline itself.

Created 2026-02-26
v3.7.0 released 2026-05-05
6.7k+ stars
License: CC BY-NC 4.0 (source-available, not OSI open source)

Four skills

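Skill	Agents	Modes	Responsibilities	Data access
deep-research	13	7	Literature review, fact-checking, systematic review, Socratic question framing	raw
academic-paper	12	10	Drafting, revision, citation checking, format conversion, AI disclosure statement	redacted
academic-paper-reviewer	7	6	Multi-perspective peer review (editor-in-chief + 3 dynamic reviewers + devil's advocate)	verified_only
academic-pipeline	4	-	Orchestrates all of the above	-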

10-stage pipeline

Research → Write → [Stage 2.5 Integrity Gate] → Peer Review → Revision → Re-review (max 2) → [Stage 4.5 Final Integrity Gate] → Format Conversion → Final Output → Process Summary

Stage 2.5 and Stage 4.5 are the load-bearing structure: blocking gates, not silent flags.
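To make the ordering concrete, here is a minimal Python sketch of a stage runner in which the two gates stop the run instead of letting it continue. The stage names come from the pipeline line above; `run_pipeline`, `GateBlocked`, and the `gate_passes` input are hypothetical names for illustration, not the suite's actual implementation.

```python
# Stage names as listed in the pipeline description above.
STAGES = [
    "Research",
    "Write",
    "Stage 2.5 Integrity Gate",
    "Peer Review",
    "Revision",
    "Re-review (max 2)",
    "Stage 4.5 Final Integrity Gate",
    "Format Conversion",
    "Final Output",
    "Process Summary",
]
GATES = {"Stage 2.5 Integrity Gate", "Stage 4.5 Final Integrity Gate"}


class GateBlocked(Exception):
    """Hypothetical error: an integrity gate refused to let the run continue."""


def run_pipeline(gate_passes):
    """Walk the stages in order.

    gate_passes maps a gate name to True/False (an assumed input standing in
    for the real integrity check). A gate that is missing or False blocks.
    """
    completed = []
    for stage in STAGES:
        if stage in GATES and not gate_passes.get(stage, False):
            raise GateBlocked(f"{stage} blocked the run after {len(completed)} stages")
        completed.append(stage)
    return completed
```

A failed Stage 2.5 gate means Peer Review never starts; the draft has to be repaired and the run restarted rather than carried forward with a warning attached.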

How the integrity gates work

Based on the 7-mode failure-mode checklist from Lu et al. (Nature 651:914-919, "The AI Scientist"):

implementation bugs
hallucinated results
shortcut reliance
bug-as-insight reframing
methodology fabrication
frame-lock
citation hallucinations

When a failure is suspected, the gate blocks pipeline progression instead of silently flagging it.
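The checklist can be pictured as a simple all-or-nothing check. The sketch below is an assumption about the shape of such a gate, not the suite's code: the seven mode names are taken from the list above, while `FailureMode` and `integrity_gate` are hypothetical names.

```python
from enum import Enum


class FailureMode(Enum):
    # The seven failure modes named in the checklist above.
    IMPLEMENTATION_BUGS = "implementation bugs"
    HALLUCINATED_RESULTS = "hallucinated results"
    SHORTCUT_RELIANCE = "shortcut reliance"
    BUG_AS_INSIGHT = "bug-as-insight reframing"
    METHODOLOGY_FABRICATION = "methodology fabrication"
    FRAME_LOCK = "frame-lock"
    CITATION_HALLUCINATIONS = "citation hallucinations"


def integrity_gate(suspected):
    """Return (passed, blocking_modes) for a set of suspected FailureMode values.

    Any single suspected mode blocks; nothing is downgraded to a warning.
    """
    blocking = [mode for mode in FailureMode if mode in suspected]
    return (not blocking, blocking)
```

For example, `integrity_gate({FailureMode.CITATION_HALLUCINATIONS})` returns a failed result with that one mode listed, which a caller would treat as a stop signal.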

v3.7.3: a direct response to Zhao et al.

Three-Layer Citation Emission

Each visible citation is followed by a hidden anchor marker:

<!--ref:slug--><!--anchor:<kind>:<value>-->

kind = quote / page / section / paragraph / none

Quote anchors are capped at 25 words
Emitting none triggers a hard-gate refusal by the finalizer
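The marker rules above lend themselves to a small validator. This is a sketch under stated assumptions: the regular expression is inferred from the marker template shown above, the value after `none:` is assumed to be empty, and `check_anchors` is a hypothetical helper rather than the finalizer's real code.

```python
import re

# Pattern inferred from the template <!--ref:slug--><!--anchor:<kind>:<value>-->.
MARKER_RE = re.compile(
    r"<!--ref:(?P<slug>[\w.-]+)--><!--anchor:(?P<kind>[a-z_]+):(?P<value>.*?)-->"
)
VALID_KINDS = {"quote", "page", "section", "paragraph", "none"}
MAX_QUOTE_WORDS = 25


def check_anchors(text):
    """Return a list of (slug, problem) pairs; an empty list means no refusal."""
    problems = []
    for match in MARKER_RE.finditer(text):
        slug, kind, value = match["slug"], match["kind"], match["value"]
        if kind not in VALID_KINDS:
            problems.append((slug, f"unknown anchor kind: {kind}"))
        elif kind == "none":
            # Per the rule above, a 'none' anchor is refused outright.
            problems.append((slug, "anchor kind 'none' is refused"))
        elif kind == "quote" and len(value.split()) > MAX_QUOTE_WORDS:
            problems.append((slug, "quote anchor exceeds 25 words"))
    return problems
```

Usage: running `check_anchors` on a draft containing `<!--ref:lee-2024--><!--anchor:none:-->` reports that citation, while a `page` anchor or a quote of 25 words or fewer passes.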

Contamination Signals

preprint_post_llm_inflection: dated 2024 or later and hosted on a preprint server (10 servers, including arXiv and bioRxiv)
semantic_scholar_unmatched: no match from the Semantic Scholar API

Both are advisory annotations, not blocking.
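A rough Python sketch of how the two signals could be computed for one reference follows. It assumes the Semantic Scholar lookup has already happened and is passed in as a boolean, and it lists only the two preprint servers named above (the remaining eight are not given in this post). `contamination_signals` is a hypothetical function name.

```python
# Only two of the ten servers are named in the text; the rest are unknown here.
PREPRINT_SERVERS = {"arxiv", "biorxiv"}
LLM_INFLECTION_YEAR = 2024


def contamination_signals(year, venue, semantic_scholar_match):
    """Return advisory signal names for one reference.

    semantic_scholar_match is an assumed, precomputed boolean: True if a
    lookup found the reference. The result annotates; it never blocks.
    """
    signals = []
    if year >= LLM_INFLECTION_YEAR and venue.lower() in PREPRINT_SERVERS:
        signals.append("preprint_post_llm_inflection")
    if not semantic_scholar_match:
        signals.append("semantic_scholar_unmatched")
    return signals
```

Because the signals are advisory, a caller would attach the returned list to the reference for a human to review rather than raise an error.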

Material Passport

The format for the literature corpus passed between skills:

CSL-JSON authors, year, title
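Source pointers back to the user's knowledge base
Since v3.6.5: corpus-first, search-fills-gap. The user's corpus is pre-screened first, then external databases are searched to fill the gaps.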
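The corpus-first, search-fills-gap order can be sketched as below. The entries use standard CSL-JSON keys (`author`, `issued`, `title`); the `source_pointer` key, the title-substring matching, and the `external_search` callback are assumptions made for illustration, since the post does not specify the passport's actual schema or matching logic.

```python
def corpus_first(queries, user_corpus, external_search):
    """Pre-screen the user's corpus, then search externally only for the gaps.

    user_corpus: list of CSL-JSON-style dicts.
    external_search: hypothetical callable, query -> list of entries.
    Returns (selected_entries, gap_queries).
    """
    selected, gaps = [], []
    for query in queries:
        hits = [e for e in user_corpus if query.lower() in e["title"].lower()]
        if hits:
            selected.extend(hits)
        else:
            gaps.append(query)
    for query in gaps:
        selected.extend(external_search(query))
    return selected, gaps


# A passport-style entry: CSL-JSON fields plus an assumed source pointer.
example_entry = {
    "id": "doe-2021",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2021]]},
    "title": "Citation integrity in preprints",
    "source_pointer": "notes/doe-2021.md",  # hypothetical key and path
}
```

The design point is that external search runs only for queries the user's own corpus cannot answer, so material the user has already read and vetted takes priority over freshly retrieved results.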

Usage

Installation (Claude Code v3.7.0+)

/plugin marketplace add Imbad0202/academic-research-skills
/plugin install academic-research-skills

Three entry points

/ars-full: the full 10-stage pipeline
/ars-plan: planning guided by the Socratic Mentor (exploratory/goal-oriented)
Single-skill invocation: skips the orchestrator

Cost

15,000 words / 60 references / Opus 4.7: ~$4-6
Cross-model verification: +$0.60-1.10
Full run: >200K input + 100K output tokens
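As a quick worked example, the figures above can be combined into a total range and a rough per-reference cost. This assumes cross-model verification is purely additive to the base run, which the post implies with the "+" but does not state outright.

```python
# Amounts in US cents to avoid floating-point rounding.
BASE_RUN_CENTS = (400, 600)      # "~$4-6" from the figures above
CROSS_MODEL_CENTS = (60, 110)    # "+$0.60-1.10" from the figures above
REFERENCES = 60


def total_range(base, extra):
    """Add two (low, high) ranges endpoint by endpoint."""
    return (base[0] + extra[0], base[1] + extra[1])


def fmt(cents):
    """Format a cent amount as dollars with two decimals."""
    return f"${cents / 100:.2f}"


low, high = total_range(BASE_RUN_CENTS, CROSS_MODEL_CENTS)
print(f"with cross-model verification: {fmt(low)} to {fmt(high)}")
print(f"per reference: {fmt(low / REFERENCES)} to {fmt(high / REFERENCES)}")
```

Under that assumption a verified run lands at roughly $4.60 to $7.10 for the 60-reference example.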

An honest audit

The maintainer has documented the failure openly: 21 of the 68 references in the showcase paper slipped through three rounds of integrity checks.

That candor is the strongest evidence that the gates work, and it is also why the verdict is not "Production Ready".

Current limitations

Limitation	Details
License	CC BY-NC 4.0, which blocks commercial use
Platform lock-in	Claude Code-first; Cursor/OpenCode not covered
Gate leakage	21/68 references slipped through
L3 audit	v3.7.3 anchors are partly advisory; full claim-faithfulness is deferred to v3.8

Resources

GitHub: https://github.com/Imbad0202/academic-research-skills
Author: Cheng-I Wu (Imbad0202)
Audit paper: Zhao et al., arXiv:2605.07723
Codex version: Imbad0202/academic-research-skills-codex