Generative models can take a while to return a full result, so it is useful to leverage token streaming and let the answer appear in the UI as it is generated. Here is how you can build such a text-streaming frontend for your LLM with Python, FastAPI, and JavaScript.
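As a minimal sketch of the idea: the backend exposes the model's output as a generator, and FastAPI's `StreamingResponse` forwards each token to the browser as soon as it is produced. The `generate_tokens` helper below is hypothetical; a real backend would pull tokens from the model instead of splitting a ready-made string.

```python
def generate_tokens(text):
    """Simulate an LLM emitting its answer token by token.
    (Hypothetical helper: a real backend would stream tokens from the model.)"""
    for token in text.split():
        yield token + " "

# A FastAPI endpoint would wrap such a generator in a StreamingResponse,
# letting the frontend render each chunk as it arrives, e.g.:
#
#   from fastapi.responses import StreamingResponse
#
#   @app.get("/generate")
#   def generate():
#       return StreamingResponse(generate_tokens(answer), media_type="text/plain")

# Consuming the generator piece by piece is what the browser would do
# with a fetch() reader; joining the chunks reconstructs the full answer.
streamed = "".join(generate_tokens("Token streaming shows partial results early"))
print(streamed)
```

On the frontend, the JavaScript side reads the response body incrementally (for example with `fetch()` and `response.body.getReader()`) and appends each decoded chunk to the page.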
RAG: Question Answering On Domain Knowledge With Semantic Search And Generative AI
Answering questions based on domain knowledge (like internal documentation, contracts, books, etc.) is challenging. In this article we explore an