# LLM-Embedder [[paper](https://arxiv.org/abs/2310.07554)]

This is the codebase for LLM-Embedder, a unified embedding model that comprehensively supports the retrieval-augmentation needs of large language models, including knowledge retrieval, memory retrieval, exemplar retrieval, and tool retrieval. It is fine-tuned on 6 tasks:

- *Question Answering (qa)*
- *Conversational Search (convsearch)*
- *Long Conversation (chat)*
- *Long-Range Language Modeling (lrlm)*
- *In-Context Learning (icl)*
- *Tool Learning (tool)*

## Roadmap

- Details about how to fine-tune the LLM-Embedder are [here](docs/fine-tune.md).
- Details about how to evaluate different retrievers on various retrieval-augmented scenarios are [here](docs/evaluation.md).

## Usage

### Using `FlagEmbedding`

```
pip install -U FlagEmbedding
```

```python
from FlagEmbedding import FlagModel

INSTRUCTIONS = {
    "qa": {
        "query": "Represent this query for retrieving relevant documents: ",
        "key": "Represent this document for retrieval: ",
    },
    "icl": {
        "query": "Convert this example into vector to look for useful examples: ",
        "key": "Convert this example into vector for retrieval: ",
    },
    "chat": {
        "query": "Embed this dialogue to find useful historical dialogues: ",
        "key": "Embed this historical dialogue for retrieval: ",
    },
    "lrlm": {
        "query": "Embed this text chunk for finding useful historical chunks: ",
        "key": "Embed this historical text chunk for retrieval: ",
    },
    "tool": {
        "query": "Transform this user request for fetching helpful tool descriptions: ",
        "key": "Transform this tool description for retrieval: ",
    },
    "convsearch": {
        "query": "Encode this query and context for searching relevant passages: ",
        "key": "Encode this passage for retrieval: ",
    },
}

# Define queries and keys
queries = ["test query 1", "test query 2"]
keys = ["test key 1", "test key 2"]

# Encode for a specific task (qa, icl, chat, lrlm, tool, convsearch)
task = "qa"

# Load model (automatically uses GPUs)
model = FlagModel(
    'BAAI/llm-embedder',
    use_fp16=False,
    query_instruction_for_retrieval=INSTRUCTIONS[task]['query'],
    passage_instruction_for_retrieval=INSTRUCTIONS[task]['key'],
    devices=['cuda:0'],
)

query_embeddings = model.encode_queries(queries)
key_embeddings = model.encode_corpus(keys)

similarity = query_embeddings @ key_embeddings.T
print(similarity)
# [[0.8971, 0.8534]
#  [0.8462, 0.9091]]
```
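Beyond scoring a handful of pairs with a dense matrix product, the same embeddings can back a vector index for larger corpora. The sketch below is an illustration rather than part of the official examples: it assumes the optional `faiss-cpu` package (`pip install -U faiss-cpu`) and reuses `model`, `queries`, and `keys` from the snippet above. Because the snippet scores pairs with a raw inner product, an inner-product index reproduces the same ranking.

```python
import faiss  # assumed extra dependency: pip install -U faiss-cpu
import numpy as np

# Reuse `model`, `queries`, and `keys` from the FlagEmbedding snippet above.
key_embeddings = np.asarray(model.encode_corpus(keys), dtype="float32")
query_embeddings = np.asarray(model.encode_queries(queries), dtype="float32")

# Flat (exact) inner-product index; FAISS expects float32 inputs.
index = faiss.IndexFlatIP(key_embeddings.shape[1])
index.add(key_embeddings)

# Retrieve the top-2 keys for each query.
scores, ids = index.search(query_embeddings, 2)
for query, row_ids, row_scores in zip(queries, ids, scores):
    print(query, "->", [(keys[i], round(float(s), 4)) for i, s in zip(row_ids, row_scores)])
```

For larger corpora, an approximate index such as `faiss.IndexHNSWFlat` can be swapped in behind the same `add`/`search` interface.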
### Using `transformers`

```
pip install -U transformers
```

```python
import torch
from transformers import AutoTokenizer, AutoModel

INSTRUCTIONS = {
    "qa": {
        "query": "Represent this query for retrieving relevant documents: ",
        "key": "Represent this document for retrieval: ",
    },
    "icl": {
        "query": "Convert this example into vector to look for useful examples: ",
        "key": "Convert this example into vector for retrieval: ",
    },
    "chat": {
        "query": "Embed this dialogue to find useful historical dialogues: ",
        "key": "Embed this historical dialogue for retrieval: ",
    },
    "lrlm": {
        "query": "Embed this text chunk for finding useful historical chunks: ",
        "key": "Embed this historical text chunk for retrieval: ",
    },
    "tool": {
        "query": "Transform this user request for fetching helpful tool descriptions: ",
        "key": "Transform this tool description for retrieval: ",
    },
    "convsearch": {
        "query": "Encode this query and context for searching relevant passages: ",
        "key": "Encode this passage for retrieval: ",
    },
}

# Define queries and keys
queries = ["test query 1", "test query 2"]
keys = ["test key 1", "test key 2"]

# Load model
tokenizer = AutoTokenizer.from_pretrained('BAAI/llm-embedder')
model = AutoModel.from_pretrained('BAAI/llm-embedder')

# Add instructions for a specific task (qa, icl, chat, lrlm, tool, convsearch)
instruction = INSTRUCTIONS["qa"]
queries = [instruction["query"] + query for query in queries]
keys = [instruction["key"] + key for key in keys]

# Tokenize sentences
query_inputs = tokenizer(queries, padding=True, return_tensors='pt')
key_inputs = tokenizer(keys, padding=True, return_tensors='pt')

# Encode
with torch.no_grad():
    query_outputs = model(**query_inputs)
    key_outputs = model(**key_inputs)
    # CLS pooling
    query_embeddings = query_outputs.last_hidden_state[:, 0]
    key_embeddings = key_outputs.last_hidden_state[:, 0]
    # Normalize
    query_embeddings = torch.nn.functional.normalize(query_embeddings, p=2, dim=1)
    key_embeddings = torch.nn.functional.normalize(key_embeddings, p=2, dim=1)

similarity = query_embeddings @ key_embeddings.T
print(similarity)
# [[0.8971, 0.8534]
#  [0.8462, 0.9091]]
```

### Using `sentence-transformers`

```
pip install -U sentence-transformers
```

```python
from sentence_transformers import SentenceTransformer

INSTRUCTIONS = {
    "qa": {
        "query": "Represent this query for retrieving relevant documents: ",
        "key": "Represent this document for retrieval: ",
    },
    "icl": {
        "query": "Convert this example into vector to look for useful examples: ",
        "key": "Convert this example into vector for retrieval: ",
    },
    "chat": {
        "query": "Embed this dialogue to find useful historical dialogues: ",
        "key": "Embed this historical dialogue for retrieval: ",
    },
    "lrlm": {
        "query": "Embed this text chunk for finding useful historical chunks: ",
        "key": "Embed this historical text chunk for retrieval: ",
    },
    "tool": {
        "query": "Transform this user request for fetching helpful tool descriptions: ",
        "key": "Transform this tool description for retrieval: ",
    },
    "convsearch": {
        "query": "Encode this query and context for searching relevant passages: ",
        "key": "Encode this passage for retrieval: ",
    },
}

# Define queries and keys
queries = ["test query 1", "test query 2"]
keys = ["test key 1", "test key 2"]

# Load model
model = SentenceTransformer('BAAI/llm-embedder', device="cpu")

# Add instructions for a specific task (qa, icl, chat, lrlm, tool, convsearch)
instruction = INSTRUCTIONS["qa"]
queries = [instruction["query"] + query for query in queries]
keys = [instruction["key"] + key for key in keys]

# Encode
query_embeddings = model.encode(queries)
key_embeddings = model.encode(keys)

similarity = query_embeddings @ key_embeddings.T
print(similarity)
# [[0.8971, 0.8534]
#  [0.8462, 0.9091]]
```

## Contact

If you have any questions or suggestions related to this project, feel free to open an issue or pull request. You can also email Peitian Zhang (namespace.pt@gmail.com).

## Citation

If you find this repository useful, please consider giving it a star ⭐ and a citation:

```
@misc{zhang2023retrieve,
      title={Retrieve Anything To Augment Large Language Models},
      author={Peitian Zhang and Shitao Xiao and Zheng Liu and Zhicheng Dou and Jian-Yun Nie},
      year={2023},
      eprint={2310.07554},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}
```