## Get better answers by using state-of-the-art components

In summary, this section walks you through the following steps:

- Switch from LLM-based retrieval to vector-based retrieval.
- Set up API keys for the embedder, vector store, and reranker.
- Index the codebase.
- Enjoy the improved chat experience.
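Once the API keys are exported, the steps above boil down to two commands. The sketch below assumes that `sage-index` and `sage-chat` take the repository name as a positional argument and uses `huggingface/transformers` as an example repository; verify the exact invocation against the tool's help output.

```shell
# Assumes the provider API keys (embedder, vector store, reranker)
# are already exported in the environment.

# 1. Build the vector index for the repository.
sage-index huggingface/transformers

# 2. Chat against the index you just built.
sage-chat huggingface/transformers
```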
The `text-embedding-3-small` model achieves the highest quality and has the fastest batch embedding API. Below you will find instructions for each provider:
**Embedders:**

- OpenAI
- Google Gemini
- Voyage

**Vector stores:**

- Pinecone

**Rerankers:**

- NVIDIA (`nvidia/nv-rerankqa-mistral-4b-v3`)
- Voyage
- Cohere
- Jina

**LLMs:**

- OpenAI
- Anthropic
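Whichever combination you choose, each provider needs its API key exported before indexing or chatting. A minimal sketch, assuming conventional environment-variable names; verify the exact names against each provider's setup instructions:

```shell
# Provider API keys (variable names are assumptions; check each
# provider's documentation for the exact names it expects).
export OPENAI_API_KEY="your-openai-key"        # embedder (and LLM, if using OpenAI)
export PINECONE_API_KEY="your-pinecone-key"    # vector store
export NVIDIA_API_KEY="your-nvidia-key"        # reranker, if using NVIDIA
export ANTHROPIC_API_KEY="your-anthropic-key"  # LLM, if using Anthropic
```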
By default, `sage` uses the `text-embedding-3-small` model from OpenAI. However, you can overwrite these defaults via command-line flags:
- `--embedding-provider`. Pass this flag to both `sage-index` and `sage-chat`.
- `--reranker-provider`. Pass this flag to `sage-chat`.
- There is also a flag, passed to both `sage-chat` and `sage-index`, that allows you to specify an existing index to reuse. By default, `sage` uses the repository name (e.g. `huggingface/transformers`) for the namespace, but you can customize it via this flag.
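As an illustration of overriding the defaults, a hypothetical run that swaps in different providers might look like the following. The provider identifiers (`voyage`, `cohere`) and the positional repository argument are assumptions; check `sage-index --help` and `sage-chat --help` for the accepted values.

```shell
# Index with a non-default embedder. Because --embedding-provider must be
# passed to both commands, repeat it at chat time with the same value.
sage-index huggingface/transformers --embedding-provider=voyage

# Chat with the matching embedder plus a non-default reranker.
sage-chat huggingface/transformers \
  --embedding-provider=voyage \
  --reranker-provider=cohere
```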