Author:严灿平
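RAG Application Development and Optimization Based on Large Models: Building Enterprise-Grade LLM Applications (基于大模型的 RAG 应用开发与优化:构建企业级 LLM 应用) is a comprehensive professional guide to developing RAG applications on top of large language models. The book has three parts: Preliminaries, Fundamentals, and Advanced Topics. Preliminaries helps you build a basic understanding of large models and RAG and guides you through setting up a basic RAG development environment. Fundamentals focuses on the core elements and stages of classic RAG application development, walks through the development of the key modules, and analyzes the underlying technical principles, laying a solid foundation for the deeper material that follows. Advanced Topics focuses on higher-level modules and techniques, especially optimization strategies and their implementation for enterprise-grade RAG applications, and explores several new RAG workflows and paradigms, so that you can follow the latest developments in RAG and acquire end-to-end RAG development skills. The book is intended for developers, researchers, and product managers interested in large models and RAG, and for anyone who wants to understand and master RAG application development. Whether you are a beginner entering the AI field or already have some background, you will find material suited to you.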
No part of this book may be reproduced or plagiarized in any form without permission. All rights reserved; infringement will be prosecuted.

Cataloging in Publication (CIP) data
基于大模型的 RAG 应用开发与优化 : 构建企业级 LLM 应用 / 严灿平著. -- 北京 : 电子工业出版社, 2024. 11. -- ISBN 978-7-121-49038-5
Ⅰ. TP391
China National Archives of Publications and Culture CIP data approval no. 2024GA0760

Editor in charge: 石悦
Printing:
Binding:
Published and distributed by: 电子工业出版社 (Publishing House of Electronics Industry), P.O. Box 173, Wanshou Road, Haidian District, Beijing; postal code 100036
Format: 720×1000 1/16; printed sheets: 32.75; word count: 536 thousand characters
Edition: 1st edition, November 2024
Printing run: 1st printing, November 2024
Price: 139.00 yuan

If a book purchased from 电子工业出版社 is defective, please exchange it at the bookstore where you bought it. If the bookstore is out of stock, contact the publisher's distribution department; contact and mail-order telephone: (010) 88254888, 88258888.
For quality complaints, email zlts@phei.com.cn; to report piracy or infringement, email dbqq@phei.com.cn.
For inquiries about this book: faq@phei.com.cn.
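Preface

Large language models (LLMs, also called large models), with their outstanding natural language processing capabilities, are leading a new wave of change in artificial intelligence (AI). As an important branch and form of large-model applications, retrieval-augmented generation (RAG) shows great promise in intelligent search, question answering, customer service, data analysis, AI agents, and many other fields.

RAG can be simple. Its basic technical principle can be described in a few sentences, and with a low-code development platform or a mature large-model application framework you can build a demo-ready prototype in minutes. RAG can also be complex. When a RAG application actually goes into production, especially in an enterprise environment where business requirements and data complexity grow by orders of magnitude and engineering requirements such as accuracy and availability are stricter, you may find a huge gap between the prototype and the production application. You will run into problems that never show up in a prototype demo: diverse data formats, inaccurate retrieval, inconsistent model output, unpredictable user questions, insufficient end-to-end response performance, and so on.

For most developers, therefore, efficiently designing, developing, deploying, and optimizing "production-ready" enterprise-grade RAG applications remains challenging. I sincerely hope this book can serve as a modest first step for enthusiastic developers who want to explore the world of large-model applications, offering a reasonably detailed guide to developing RAG applications and helping them ride the waves of this technological change.

The book is based on Python, the language of choice for AI development, and uses LlamaIndex, a mainstream development framework focused on RAG, as its base framework. The rich tooling and strong community support of both provide excellent conditions for RAG development and greatly reduce the time spent "reinventing the wheel". Note that although the development techniques and examples are presented with Python and LlamaIndex, the vast majority of the ideas, principles, architectures, and optimization methods for RAG in this book are general, and you can implement the same functionality with other languages and frameworks.

Of course, as technology advances and applications deepen, new theories, methods, and technologies keep emerging. I sincerely hope this book can serve as a starting point that sparks your interest in and curiosity about large-model application development. I also look forward to more scholars and experts working in this field, jointly advancing the adoption and evolution of large-model applications and contributing more wisdom and strength to the future of AI.

严灿平
July 2024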
Contents

Preliminaries

Chapter 1 Understanding Large Models and RAG
  1.1 A First Look at Large Models
    1.1.1 The Large-Model Era: The Explosion of Generative AI Applications
    1.1.2 The Continuing Evolution of Large-Model Applications
    1.1.3 Are Large Models Omnipotent?
  1.2 Understanding RAG
    1.2.1 Why RAG Is Needed
    1.2.2 A Simple RAG Scenario
  1.3 The Technical Architecture of RAG Applications
    1.3.1 The Classic Architecture and Workflow of RAG Applications
    1.3.2 Challenges Facing RAG Applications
    1.3.3 The Evolution of RAG Application Architecture
  1.4 Two Topics About RAG
    1.4.1 Choosing Between RAG and Fine-Tuning
    1.4.2 RAG and Large Models with Very Long Context Understanding
Chapter 2 Setting Up a RAG Development Environment
  2.1 Two Ways to Develop RAG Applications
    2.1.1 Using a Low-Code Development Platform
    2.1.2 Using a Large-Model Application Development Framework
  2.2 Preparing the RAG Development Environment
    2.2.1 Hardware Recommendations
    2.2.2 Base Large Models
    2.2.3 Embedding Models
    2.2.4 Python Virtual Environment
    2.2.5 Python IDE and Development Plugins
    2.2.6 Vector Databases
    2.2.7 The LlamaIndex Framework
  2.3 Conventions for This Book's Development Environment
[Summary of Preliminaries]

Fundamentals

Chapter 3 Getting Started with RAG Application Development
  3.1 Developing the Simplest RAG Application
    3.1.1 Developing with Native Code
    3.1.2 Developing with the LlamaIndex Framework
    3.1.3 Developing with the LangChain Framework
  3.2 How to Trace and Debug RAG Applications
    3.2.1 Using LlamaDebugHandler
    3.2.2 Using Third-Party Tracing and Debugging Platforms
  3.3 Preparation: Core Components for RAG Development with the LlamaIndex Framework
Chapter 4 Models and Prompts
  4.1 Large Models
    4.1.1 The Role of Large Models in RAG Applications
    4.1.2 The Unified Interface of Large-Model Components
    4.1.3 Using Large-Model Components Standalone
    4.1.4 Using Large-Model Components in Integration
    4.1.5 Understanding and Setting Large-Model Parameters
    4.1.6 Custom Large-Model Components
    4.1.7 Using Large-Model Components from the LangChain Framework
  4.2 Prompts
    4.2.1 Using Prompt Templates
    4.2.2 Changing the Default Prompt Templates
    4.2.3 Changing Prompt Template Variables
  4.3 Embedding Models
    4.3.1 The Role of Embedding Models in RAG Applications
    4.3.2 The Interface of Embedding-Model Components
    4.3.3 Using Embedding-Model Components Standalone
    4.3.4 Using Embedding-Model Components in Integration
    4.3.5 Understanding and Setting Embedding-Model Parameters
    4.3.6 Custom Embedding-Model Components
Chapter 5 Data Loading and Splitting
  5.1 Understanding Two Concepts: Document and Node
    5.1.1 What Are Document and Node
    5.1.2 A Deeper Look at Document and Node
    5.1.3 A Deeper Look at Node Metadata
    5.1.4 Creating Document Objects
    5.1.5 Creating Node Objects
    5.1.6 Generating and Extracting Metadata
    5.1.7 A First Look at the IndexNode Type
  5.2 Data Loading
    5.2.1 Loading from a Local Directory
    5.2.2 Loading Data from the Network
  5.3 Data Splitting
    5.3.1 How to Use Data Splitters
    5.3.2 Common Data Splitters
  5.4 Data Ingestion Pipelines
    5.4.1 What Is a Data Ingestion Pipeline
    5.4.2 Transformations for Data Ingestion Pipelines
    5.4.3 Custom Transformations
    5.4.4 Using Data Ingestion Pipelines
  5.5 The Complete Picture of the Data Loading Stage
Chapter 6 Data Embedding and Indexing
  6.1 Understanding Embeddings and Vectors
    6.1.1 Generating Vectors Directly with a Model
    6.1.2 Generating Vectors with a Transformation
  6.2 Vector Stores
    6.2.1 Simple Vector Store
    6.2.2 Third-Party Vector Stores
  6.3 Vector Store Index
    6.3.1 Building a Vector Store Index from a Vector Store
    6.3.2 Building a Vector Store Index from a List of Nodes
    6.3.3 Building a Vector Store Index Directly from Documents
    6.3.4 A Deeper Look at Vector Store Index Objects
  6.4 More Index Types
    6.4.1 Document Summary Index
    6.4.2 Object Index
    6.4.3 Knowledge Graph Index
    6.4.4 Tree Index
    6.4.5 Keyword Table Index
Chapter 7 Retrieval, Response Generation, and RAG Engines
  7.1 Retrievers
    7.1.1 Quickly Constructing a Retriever
    7.1.2 Understanding Retrieval Modes and Retrieval Parameters
    7.1.3 A First Look at Recursive Retrieval
  7.2 Response Generators
    7.2.1 Constructing a Response Generator
    7.2.2 Response Generation Modes
    7.2.3 Response Generator Parameters
    7.2.4 Implementing a Custom Response Generator
  7.3 RAG Engines: Query Engines
    7.3.1 Two Ways to Construct Built-in Query Engines
    7.3.2 A Deeper Look at the Internal Structure and Operation of Query Engines
    7.3.3 Custom Query Engines
  7.4 RAG Engines: Chat Engines
    7.4.1 Two Ways to Construct a Chat Engine
    7.4.2 A Deeper Look at the Internal Structure and Operation of Chat Engines
    7.4.3 Understanding the Different Chat Modes
  7.5 Structured Output
    7.5.1 Using the output_cls Parameter
    7.5.2 Using Output Parsers
[Summary of Fundamentals]

Advanced Topics

Chapter 8 Advanced RAG Engine Development
  8.1 Pre-Retrieval Query Transformations
    8.1.1 Simple Query Transformation
    8.1.2 HyDE Query Transformation
    8.1.3 Multi-Step Query Transformation
    8.1.4 Sub-Question Query Transformation
  8.2 Post-Retrieval Processors
    8.2.1 Using Node Postprocessors
    8.2.2 Implementing a Custom Node Postprocessor
    8.2.3 Common Predefined Node Postprocessors
    8.2.4 Rerank Node Postprocessors
  8.3 Semantic Routing
    8.3.1 Understanding Semantic Routing
    8.3.2 Query Engines with Routing
    8.3.3 Retrievers with Routing
    8.3.4 Using Standalone Selectors
    8.3.5 Multi-Select Router Query Engines
  8.4 SQL Query Engines
    8.4.1 Using the NLSQLTableQueryEngine Component
    8.4.2 Query Engines Based on Real-Time Table Retrieval
    8.4.3 Using the SQL Retriever
  8.5 Multimodal Document Processing
    8.5.1 Multimodal Document Processing Architecture
    8.5.2 Parsing Documents with LlamaParse
    8.5.3 Handling Tables in Multimodal Documents
    8.5.4 Basic Applications of Multimodal Large Models
    8.5.5 Handling Images in Multimodal Documents
  8.6 Query Pipelines: Orchestrating Graph-Based RAG Workflows
    8.6.1 Understanding Query Pipelines
    8.6.2 Two Ways to Use Query Pipelines
    8.6.3 A Deeper Look at the Internals of Query Pipelines
    8.6.4 Implementing and Plugging In Custom Query Components
Chapter 9 Developing Data Agents
  9.1 A First Look at Data Agents
  9.2 Constructing and Using Agent Tools
    9.2.1 A Closer Look at Tool Types
    9.2.2 Function Tools
    9.2.3 Query Engine Tools
    9.2.4 Retrieval Tools
    9.2.5 Query Plan Tools
    9.2.6 On-Demand Loading Tools
  9.3 Developing an Agent Directly with Function Calling
  9.4 Developing Agents with Framework Components
    9.4.1 Using OpenAIAgent
    9.4.2 Using ReActAgent
    9.4.3 Developing Agents with the Low-Level API
    9.4.4 Developing Agents with Tool Retrieval
    9.4.5 Developing Agents with Context Retrieval
  9.5 Finer-Grained Control over Agent Execution
    9.5.1 Running an Agent Step by Step
    9.5.2 Adding Human Interaction to Agent Execution
Chapter 10 Evaluating RAG Applications
  10.1 Why RAG Applications Need Evaluation
  10.2 Evaluation Criteria and Metrics for RAG Applications
  10.3 Evaluation Process and Methods for RAG Applications
  10.4 Evaluating Retrieval Quality
    10.4.1 Generating a Retrieval Evaluation Dataset
    10.4.2 Running the Retrieval Evaluation Program
  10.5 Evaluating Response Quality
    10.5.1 Generating a Response Evaluation Dataset
    10.5.2 Single-Response Evaluation
    10.5.3 Batch Response Evaluation
  10.6 Evaluation Based on Custom Criteria
Chapter 11 Common Optimization Strategies for Enterprise-Grade RAG Applications
  11.1 Choosing an Appropriate Chunk Size
    11.1.1 Why Chunk Size Matters
    11.1.2 Evaluating Chunk Size
  11.2 Separating Retrieval-Stage Chunks from Generation-Stage Chunks
    11.2.1 Why Separation Is Needed
    11.2.2 Common Separation Strategies and Their Implementation
  11.3 Optimizing Retrieval over Knowledge Bases with Large Document Sets
    11.3.1 Metadata Filtering + Vector Retrieval
    11.3.2 Summary Retrieval + Content Retrieval
    11.3.3 Multi-Document Agentic RAG
  11.4 Using Advanced Retrieval Methods
    11.4.1 Fusion Retrieval
    11.4.2 Recursive Retrieval
Chapter 12 Building End-to-End Enterprise-Grade RAG Applications
  12.1 Key Considerations for Production RAG Applications
  12.2 End-to-End Enterprise-Grade RAG Application Architecture
    12.2.1 Data Storage Layer
    12.2.2 AI Model Layer
    12.2.3 RAG Workflow and API Module
    12.2.4 Front-End Application Module
    12.2.5 Back-Office Management Module
  12.3 End-to-End Full-Stack RAG Application Cases
    12.3.1 A Simple Full-Stack RAG Query Application
    12.3.2 An End-to-End Chat Application Based on Multi-Document Agents
Chapter 13 Principles and Implementation of New RAG Paradigms
  13.1 Self-Correcting RAG: C-RAG
    13.1.1 The Motivation Behind C-RAG
    13.1.2 How C-RAG Works
    13.1.3 Implementing C-RAG
  13.2 Self-Reflective RAG: Self-RAG
    13.2.1 The Motivation Behind Self-RAG
    13.2.2 How Self-RAG Works
    13.2.3 Implementing Self-RAG
    13.2.4 Optimizing Self-RAG
  13.3 Retrieval-Tree RAG: RAPTOR
    13.3.1 The Motivation Behind RAPTOR
    13.3.2 How RAPTOR Works
    13.3.3 Implementing RAPTOR
[Summary of Advanced Topics]
Preliminaries
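Chapter 1 Understanding Large Models and RAG

There is no doubt that large models and generative AI (Gen-AI) have been the most closely watched computing technologies in the global tech community since 2023. RAG has in turn become a large-model application paradigm and architecture that is repeatedly discussed and studied, and it is currently the most mature class of application-layer solutions in generative AI.

Before we dig into how to develop and optimize RAG applications, this chapter briefly introduces the background and evolution of RAG, which will help you understand and build RAG applications in depth.

1.1 A First Look at Large Models

1.1.1 The Large-Model Era: The Explosion of Generative AI Applications

The hottest phenomenon-level information technology application of the past two years is without question ChatGPT, which burst onto the scene from the U.S. company OpenAI. It not only set the world record for the shortest time to reach 100 million users, but also triggered a "war of a hundred models", even a "war of a thousand models", across the tech industry. It led AI in a big stride into its 2.0 era and sketched for us a future of more powerful artificial general intelligence (AGI).

Why did large models suddenly become so popular? Looking back at the history of computing, quite a few revolutionary technologies have appeared suddenly. Blockchain and the metaverse, for example, were once the darlings of many entrepreneurs and technology observers, but the research enthusiasm they stirred up falls far short of that stirred up by large models, and the current wave is far from over. One possible reason lies in the application layer: large models have brought applications that deliver real value and productivity, applications that are close to ordinary people and have an extremely low barrier to use (see Figure 1-1). Although the underlying technology comes from complex deep learning and neural network models, at the application layer it is presented to ordinary people in an extremely simple form.

Figure 1-1

The large models we talk about today, whether models that output text or diffusion models for text-to-image or text-to-video, are better at reasoning and creation than earlier decision-oriented AI, which focused on discovering hidden patterns or learning human visual and language processing abilities. Applications built on the generative capability of large models are therefore called generative AI applications. The content these applications produce with large models, whether text, images, or audio and video, and whether free-form natural language or structured information in a specified format, can all be called AI-generated content (AIGC).

1.1.2 The Continuing Evolution of Large-Model Applications

However advanced the technical principles and underlying algorithms, the value of large models can only be realized in real application scenarios. Large models need to show value not only in personal applications but also in enterprise applications, which demand stronger engineering capabilities, where they must be applied at scale and form a virtuous cycle.

The original and best-known form of large-model application is a natural language chatbot like the first version of ChatGPT. After rapid development, generative AI applications have far surpassed that initial form in number, form, creativity, and the domains they empower.

Personal applications (To C): People can use large models in many scenarios, such as chatbots (see Figure 1-2), precise search, translation, document writing, assisted document reading, virtual role-play, multimedia content creation, design assistance, and assisted code generation (see Figure 1-3). Some of these applications appear as standalone tools, while others are integrated into general-purpose software as embedded AI assistants (AI Copilots).

Figure 1-2

Figure 1-3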
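Enterprise applications (To B): Constrained by the harsher business environment and higher engineering requirements of enterprise scenarios, many applications are still at the prototype and validation stage. Even so, we see more and more experimental applications in intelligent customer service, online consulting, intelligent marketing, interactive data analysis, intelligent enterprise search, and enhanced robotic process automation (RPA).

AI capabilities based on large models can be delivered in many forms. For example, we can simply use a large model's application programming interface (API) to embed AI capabilities into existing software functions and processes. Still, the form that best demonstrates the value of large models remains standalone AI-native applications and tools. Note that many existing large-model-based AI applications have gone far beyond the simple chatbots represented by the first version of ChatGPT (see Figure 1-4). More of them are AI agents (commonly called AI Agents or AI Assistants, see Figure 1-5) with self-planning and memory, the ability to use external tools or plugins, and even self-reflection and error correction. RAG, which we are about to study in depth, can be either a standalone application form or a common design paradigm or architecture relied on when developing more complex AI agents.

Figure 1-4

Figure 1-5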
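1.1.3 Are Large Models Omnipotent?

Since large models are already so powerful, able to understand human language and even the world humans see, and equipped with strong reasoning and output capabilities, does this mean we can race into the AGI era? The answer is clearly no. The underlying technical principles of large models bring a revolutionary improvement in natural language understanding and processing, but they also bring some inherent "diseases" that are hard to cure.

1. Knowledge timeliness

A large model is a neural network with a massive number of parameters (typically ranging from a few million to hundreds of billions) that has learned a massive amount of human knowledge. Pre-training and fine-tuning a model with so many parameters costs a great deal of money, resources, and time, so the iteration cycle of a large model usually ranges from a few days to several months. Commercial general-purpose large models also have to undergo various safety tests and risk assessments. As a result, large models suffer from a knowledge lag (timeliness) problem: the knowledge a model has mastered may well be out of date, and it cannot answer questions about anything after its training cutoff (see Figure 1-6).

Figure 1-6

2. The "black box" problem: outputs that are hard to explain

An important reason large models are easy to use is their "black box" mode of operation. Apart from the input prompt (usually just called the Prompt), you neither need to care about nor can observe the model's internal reasoning, decision, and output process. This lowers the barrier for users, but in some deeper application scenarios it can confuse users or make debugging difficult for application developers. For example, in critical scenarios where you need to debug and control the model's output precisely, you may find that beyond modifying the Prompt and a few simple inference parameters, most of the time you depend on a bit of "luck"; in other words, there is a great deal of randomness.

In 2023, a research team at Anthropic (the company that develops the Claude models) published a research report, "Towards Monosemanticity: Decomposing Language Models With Dictionary Learning". The report showed encouraging progress on the interpretability of the neural network "black box", but also revealed that achieving interpretability on larger-scale language models still faces great challenges in techniques, methods, and tooling.

3. Output uncertainty

If you have used an AI chatbot based on a large model, you will know this well: large-model output has a great deal of randomness and uncertainty. This is not always a bad thing. As noted earlier, the strength of large models lies precisely in the reasoning and generation that distinguish them from earlier AI models: they can produce varied and creative content from your Prompt. In some scenarios this is exactly what is needed, such as idea generation or social-media content creation (see Figure 1-7). But it becomes a challenge in scenarios that need more deterministic and predictable results (see Figure 1-8), for example when a smart-home application must understand the user's intent precisely, or when consistent output is needed during development to make debugging and troubleshooting easier.

Figure 1-7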
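The randomness described here comes from sampling: at each step the model picks the next token from a probability distribution rather than following a fixed rule. The sketch below is a toy illustration only, not how any particular model or API is implemented. The token scores in `TOY_LOGITS`, the function names, and the use of Python's `random` module are all assumptions made for this example. It shows two things: dividing the scores by a sampling temperature changes how spread out the choices are, and fixing the random seed makes a run repeatable.

```python
import math
import random


def sample_next_token(logits, temperature, rng):
    """Sample one token from toy logits after temperature scaling."""
    if temperature <= 0:
        # Treat zero temperature as greedy decoding: always pick the top token.
        return max(logits, key=logits.get)
    # Softmax over logits divided by the temperature (shifted for stability).
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Hypothetical next-token scores for one fixed context (illustrative values).
TOY_LOGITS = {"sunny": 2.0, "cloudy": 1.5, "rainy": 0.5}


def generate(temperature, seed, n=8):
    """Draw n tokens for the same context using a seeded random generator."""
    rng = random.Random(seed)  # a fixed seed makes the draws repeatable
    return [sample_next_token(TOY_LOGITS, temperature, rng) for _ in range(n)]


if __name__ == "__main__":
    print(generate(temperature=1.0, seed=1))
    print(generate(temperature=1.0, seed=1))  # same seed, same output
    print(generate(temperature=1.0, seed=2))  # different seed, output may differ
    print(generate(temperature=0.0, seed=3))  # greedy: always "sunny"
```

In this toy, a lower temperature concentrates probability on the highest-scoring token, while a higher one flattens the distribution. Real deployments add further sources of variation, so a fixed seed should be read as improving reproducibility rather than guaranteeing it.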
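Figure 1-8

The root cause of output uncertainty is that a large model is essentially a probabilistic output model based on the statistical regularities of what it has learned, and it is a nonlinear model. This means that even for the same context, it may choose a different next token at different times, because what the model has learned is a probability distribution over multiple possible outputs rather than an explicit rule such as "if the previous token is X, the next token is Y". Although large models provide parameters such as "temperature" to control randomness to some extent, these parameters cannot eliminate it completely. In later model updates, OpenAI also introduced a seed parameter, which is used to produce reproducible output as far as possible for identical input.

4. The well-known "hallucination" problem

This is a familiar, classic problem of large models. It refers to cases where a large model, when trying to generate content or answer a question, produces output that is not entirely correct or is simply wrong, what is commonly described as "talking nonsense with a straight face". This is called the "hallucination" problem. Hallucinations can take the form of misstated or fabricated facts, faulty complex reasoning, or inadequate handling of complex contexts. The main causes of hallucination are as follows.

(1) Bias in training knowledge. The massive amount of knowledge fed to a large model during training may contain erroneous, outdated, or even biased information. Once the model has learned this information, it may, in its future output ...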