Ask HN: What is your set up for using LLMs for documentation question answering?

manibatra · on April 24, 2023

Depends on the complexity of the requirements. I have built a couple:

- https://crimson-glade-1527.section.app/ : to talk to www.section.io docs. I used plain Python and Gradio. The chatbot doesn't have memory, and the embeddings just live in a CSV file. It has limited functionality but does its job well. https://github.com/manibatra/sectiongpt - consider this toy code which was hacked together in a few hours.

- Also building https://www.everbility.com : using Langchain and Pinecone. I often found Langchain to be a bit of an overkill, with its many abstractions, if your use case is just sending one-off prompts. It is powerful, though, for document ingestion and agents (which is something I plan to use it more for). When using Langchain you will often have to debug a bunch of edge cases which is understandable since
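
For anyone curious what "a CSV for embeddings" can look like in practice, here is a minimal sketch. It is not the code from the sectiongpt repo; the column names, the helper names (`save_embeddings`, `load_embeddings`, `top_k`) and the two-dimensional vectors are all made up for illustration. In a real setup the vectors would come from whatever embedding model you use.

```python
import csv
import io
import math


def cosine(a, b):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def save_embeddings(rows, fh):
    """Write (text, vector) pairs as CSV; the vector is space-separated."""
    writer = csv.writer(fh)
    writer.writerow(["text", "embedding"])  # hypothetical column names
    for text, vec in rows:
        writer.writerow([text, " ".join(repr(v) for v in vec)])


def load_embeddings(fh):
    """Read the CSV back into a list of (text, vector) pairs."""
    reader = csv.DictReader(fh)
    return [
        (row["text"], [float(v) for v in row["embedding"].split()])
        for row in reader
    ]


def top_k(query_vec, rows, k=3):
    """Brute-force search: score every row and keep the k best."""
    scored = [(cosine(query_vec, vec), text) for text, vec in rows]
    return sorted(scored, reverse=True)[:k]


# Toy data: an in-memory buffer stands in for a file on disk.
buf = io.StringIO()
save_embeddings(
    [("deploy docs", [1.0, 0.0]), ("billing docs", [0.0, 1.0])],
    buf,
)
buf.seek(0)
rows = load_embeddings(buf)
best = top_k([0.9, 0.1], rows, k=1)
```

Brute-force scoring like this is usually fine for a few thousand chunks; a vector database starts to pay off when the corpus gets much larger or needs filtering.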

A random hack I found for improving the quality of answers by a big factor was creating embeddings for entire documents and also for individual sections of each document. This resulted in the vector search being very accurate for both big-picture and granular questions.
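
A rough sketch of that two-level indexing idea, under some loud assumptions: `embed` here is a stand-in that just counts words so the example runs without an API key, and `build_index` / `search` are hypothetical helper names. Swap `embed` for a real embedding model in practice.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding (word counts). Assumption: replace with a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_index(docs):
    """docs maps doc_id -> list of section texts.

    Each document gets one entry for its full text plus one entry per section.
    """
    index = []
    for doc_id, sections in docs.items():
        full_text = " ".join(sections)
        index.append(
            {"doc": doc_id, "level": "document", "text": full_text,
             "vec": embed(full_text)}
        )
        for i, section in enumerate(sections):
            index.append(
                {"doc": doc_id, "level": "section", "section": i,
                 "text": section, "vec": embed(section)}
            )
    return index


def search(index, query, k=3):
    """Rank document-level and section-level entries together."""
    q = embed(query)
    return sorted(index, key=lambda e: cosine(q, e["vec"]), reverse=True)[:k]


docs = {
    "caching": ["Set cache headers on the origin.",
                "Purge the cache with the API."],
    "billing": ["Invoices are issued monthly.",
                "Update your credit card in settings."],
}
index = build_index(docs)
hits = search(index, "purge cache api", k=2)
```

Because both levels sit in the same index, a narrow question tends to land on a section entry while a broad one can match the whole-document entry; the `level` field lets you decide how much text to pass to the prompt.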

battybro0034 · on April 24, 2023

I was thinking of using Pinecone and Langchain.

I have one question though: in Pinecone, can I make multiple indexes in one pod? I was thinking of building a SaaS app on top of it and giving access to others, and was wondering how that would work.

melovy · on April 25, 2023

You might want to check out MyScale, a SaaS product with multiple vector index support and Langchain integration: https://python.langchain.com/en/latest/ecosystem/myscale.htm...

manibatra · on April 24, 2023

You can only have one index in a pod, but you can have multiple namespaces in an index, and a vector search is constrained to one namespace at a time: https://docs.pinecone.io/docs/namespaces
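
To make the namespace idea concrete for the SaaS case, here is a toy in-memory model. It deliberately does not use the Pinecone client: `ToyIndex` and its methods are hypothetical and only mimic the behaviour described above (one index, many namespaces, each query limited to a single namespace), with one namespace per customer as an assumed layout.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyIndex:
    """Hypothetical stand-in for a single vector index with namespaces."""

    def __init__(self):
        # namespace -> {item_id: vector}
        self._namespaces = defaultdict(dict)

    def upsert(self, namespace, item_id, vector):
        """Store a vector under one namespace (e.g. one per tenant)."""
        self._namespaces[namespace][item_id] = vector

    def query(self, namespace, vector, top_k=3):
        """Search only within the given namespace, never across them."""
        items = self._namespaces.get(namespace, {})
        scored = [(cosine(vector, v), item_id) for item_id, v in items.items()]
        return [item_id for _, item_id in sorted(scored, reverse=True)[:top_k]]


index = ToyIndex()
index.upsert("tenant-a", "a-doc-1", [1.0, 0.0])
index.upsert("tenant-a", "a-doc-2", [0.0, 1.0])
index.upsert("tenant-b", "b-doc-1", [1.0, 0.0])

# Even though tenant-b has an identical vector, it is not visible here.
results_a = index.query("tenant-a", [1.0, 0.0], top_k=5)
```

The point of the sketch is the isolation: every read and write carries a namespace, so one customer's query cannot return another customer's vectors even though everything lives in one index.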