LLM Development Tools

May 31, 2023

Motivation

How would you build the following requirements?

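Business (Models, Prompt)
- Multiple models (gpt-3.5, gpt-4, chatglm-6b, chatglm-6b-ft)
- Multiple business scenarios (operations push, personalized push, Weibo region, commercial comments, news comments, ...)
- Different business × different model = different prompt
- Sensitive-word risk control
- Standardized JSON output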
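
The business requirements above can be sketched in plain Python, without any framework. This is only an illustration: the `PROMPTS` table, the `ops_push` business key, the word list, and all function names are hypothetical and not taken from any library.

```python
import json

# Hypothetical prompt table: one template per (business, model) pair.
PROMPTS = {
    ("ops_push", "gpt-3.5"): (
        "Write a push notification for: {brief}\n"
        "Reply with a JSON object containing the keys 'title' and 'body'."
    ),
    ("ops_push", "chatglm-6b"): (
        "Task: push notification.\nBrief: {brief}\n"
        "Output JSON only, with the keys 'title' and 'body'."
    ),
}

# Placeholder word list; a real deployment would load its own.
SENSITIVE_WORDS = {"forbidden"}


def build_prompt(business, model, **fields):
    """Look up the template for this (business, model) pair and fill it in."""
    template = PROMPTS.get((business, model))
    if template is None:
        raise KeyError(f"no prompt registered for {business} x {model}")
    return template.format(**fields)


def contains_sensitive(text):
    """Return True if any listed word appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(word in lowered for word in SENSITIVE_WORDS)


def parse_json_output(raw, required_keys):
    """Parse model output as JSON and check that the expected keys exist."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

Keying the table on the pair makes the "business × model" growth explicit: every new model adds one row per business, which is the bookkeeping a framework is expected to take over.
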
Web page (Memory)
- Multi-turn conversation
- Cap on memory size (by number of turns, or by tokens [full text, summary, ...])
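
A minimal sketch of the memory cap, assuming a whitespace split as a stand-in for a real tokenizer. The `ConversationMemory` class and `count_tokens` helper are hypothetical names. Only the "full text" variant is shown; a summary variant would replace the dropped turns with a model-written summary instead of discarding them.

```python
from collections import deque


def count_tokens(text):
    # Assumption: whitespace split stands in for a real tokenizer.
    return len(text.split())


class ConversationMemory:
    """Keeps recent (user, assistant) turns under a turn and/or token cap."""

    def __init__(self, max_turns=None, max_tokens=None):
        self.max_turns = max_turns
        self.max_tokens = max_tokens
        self.turns = deque()

    def add_turn(self, user, assistant):
        self.turns.append((user, assistant))
        self._trim()

    def _total_tokens(self):
        return sum(count_tokens(u) + count_tokens(a) for u, a in self.turns)

    def _trim(self):
        # Cap by number of turns: drop the oldest turns first.
        if self.max_turns is not None:
            while len(self.turns) > self.max_turns:
                self.turns.popleft()
        # Cap by tokens: drop the oldest turns, but always keep the newest one.
        if self.max_tokens is not None:
            while len(self.turns) > 1 and self._total_tokens() > self.max_tokens:
                self.turns.popleft()

    def as_prompt(self):
        """Render the remembered turns as text to prepend to the next request."""
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)
```
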
External data (Indexes)
- Car blog post generation needs to reference the specifications of each car model (pdf, md, txt, ...)
Task decomposition (Chain)
- Three-stage generation of a car blog post
  1. brief => opening
  2. brief + opening => middle
  3. opening + middle => ending
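
The three stages can be sketched as three sequential calls, each feeding its output into the next prompt. Here `llm` is a hypothetical callable that takes a prompt string and returns the model's text; the prompt wording and the `generate_post` name are illustrative assumptions, not part of any framework.

```python
from typing import Callable, Dict


def generate_post(brief: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Generate a post in three chained steps: opening, middle, ending."""
    # Step 1: brief => opening
    opening = llm(f"Brief:\n{brief}\n\nWrite the opening of a car blog post.")
    # Step 2: brief + opening => middle
    middle = llm(
        f"Brief:\n{brief}\n\nOpening:\n{opening}\n\n"
        "Write the middle section of the post."
    )
    # Step 3: opening + middle => ending (the brief is not passed again)
    ending = llm(
        f"Opening:\n{opening}\n\nMiddle:\n{middle}\n\n"
        "Write the ending of the post."
    )
    return {"opening": opening, "middle": middle, "ending": ending}
```

Passing `llm` in as a parameter keeps the chain independent of the model, so the same three steps can run against any of the models listed earlier.
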
Logging (Callbacks)
- Process logging, token usage statistics, ...
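
One way to picture the callback requirement is a wrapper that notifies registered observers after each model call. This is a sketch only: `StatsCallback`, `on_llm_end`, and `call_with_callbacks` are hypothetical names, and token counts use a whitespace split as a stand-in for a real tokenizer.

```python
import time


class StatsCallback:
    """Accumulates call count, rough token counts, and elapsed time."""

    def __init__(self):
        self.calls = 0
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.seconds = 0.0

    def on_llm_end(self, prompt, completion, elapsed):
        # Assumption: whitespace split approximates the token count.
        self.calls += 1
        self.prompt_tokens += len(prompt.split())
        self.completion_tokens += len(completion.split())
        self.seconds += elapsed


def call_with_callbacks(llm, prompt, callbacks):
    """Call the model, then report the call to every registered callback."""
    start = time.perf_counter()
    completion = llm(prompt)
    elapsed = time.perf_counter() - start
    for callback in callbacks:
        callback.on_llm_end(prompt, completion, elapsed)
    return completion
```
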
……

Tools

Current open-source tools include llama_index, langchain, semantic-kernel, and others.

Tool	GitHub stars
langchain	43.6k
LlamaIndex (GPT Index)	16.3k
semantic-kernel	9.6k

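LangChain is a framework for developing LLM applications. It makes LLM development standardized and fast, much like Spring Boot for Java, Laravel for PHP, or Django for Python.
LlamaIndex (GPT Index) is a data framework for LLM applications.
- Why was GPT Index renamed to LlamaIndex?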
  
  Funny that we had just rebranded our tool from GPT Index to LlamaIndex about a week ago to avoid potential trademark issues with OpenAI, and turns out Meta has similar ideas around LLM+llama puns :). Must mean the name is good though! Also very excited to try plugging in the LLaMa model into LlamaIndex, will report the results. https://news.ycombinator.com/item?id=34928386
semantic-kernel (SK) is a lightweight SDK that integrates large language models (LLMs) with conventional programming languages.

Differences

Q: Llamaindex vs langchain, which one should be used?

Web	LLM
Spring Boot	langchain
Elasticsearch-Lucene, MyBatis-MySQL	LlamaIndex (GPT Index)-Faiss

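It depends on your end goal. If you mainly need an intelligent search tool, LlamaIndex is a great fit; if you want to build a ChatGPT clone that can create plugins, that is a different matter entirely. LangChain lets you use multiple ChatGPT instances, give them memory, and even plug in multiple LlamaIndex instances. With LangChain you can build agents that do more than one thing, for example executing Python code while also searching Google. In short, LlamaIndex is an intelligent storage mechanism, while LangChain is a tool for combining multiple tools.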

Q: Differences between semantic-kernel and langchain? #936