No Chunks, No Embeddings: OpenAI’s Index‑Free Long RAG


Summary

The video introduces OpenAI's new multi-agent retrieval-augmented generation (RAG) system, which simplifies chunking by using long-context models such as GPT-4.1 for index-free retrieval. It details a document processing phase in which relevant chunks are selected, divided into sub-sections, and assessed for relevance at multiple depths to generate accurate answers. The architecture breakdown covers the use of long-context language models such as GPT-4.1 and o4-mini for reasoning, verification, and answer validation. It explains the implementation of a recursive decomposition function for splitting documents, selecting chunks, and validating responses with accuracy checks and confidence scoring. Finally, it discusses the system's estimated costs, applications to legal documents, benefits in terms of latency and scalability, and the potential use of knowledge graphs for adding relevant information based on user queries.


Introduction to Retrieval Augmented Generation System

OpenAI introduces a new multi-agent retrieval-augmented generation (RAG) system that simplifies the chunking strategy by performing index-free retrieval with long-context models such as GPT-4.1.

Document Processing Phase

Details the document processing phase, in which the system selects relevant chunks, divides them into sub-sections, and assesses their relevance at multiple depths, controlled by the user, to generate accurate answers.
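As a rough illustration of the splitting step, the sketch below divides a long document into roughly equal sub-sections that a long-context model can then judge. The chunk count and the character-based splitting are illustrative assumptions, not values confirmed in the video.

```python
# Minimal sketch of the splitting step: divide a long document into roughly
# equal sub-sections. The 20-way split and character-based slicing are
# placeholder choices for illustration.

def split_into_chunks(text: str, num_chunks: int = 20) -> list[str]:
    """Split `text` into roughly equal character-based sub-sections."""
    chunk_size = max(1, len(text) // num_chunks)
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```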

Architecture Breakdown

Breaks down the architecture, which uses long-context language models such as GPT-4.1 and o4-mini for reasoning, verification, and answer validation when generating accurate responses.
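The sketch below shows how such a two-model setup might look with the OpenAI Python SDK: a long-context model drafts an answer from the selected text, and a second model verifies it. The prompts, the 0-100 confidence convention, and the exact division of labour between GPT-4.1 and o4-mini are assumptions for illustration, not the prompts used in the video.

```python
# Hedged sketch of the two-model architecture: one call drafts an answer,
# a second call validates it and reports a confidence score.
from openai import OpenAI

client = OpenAI()

def draft_answer(question: str, context: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4.1",  # long-context reasoning model named in the video
        messages=[{
            "role": "user",
            "content": f"Answer using only this text:\n{context}\n\nQuestion: {question}",
        }],
    )
    return resp.choices[0].message.content

def verify_answer(question: str, context: str, answer: str) -> str:
    resp = client.chat.completions.create(
        model="o4-mini",  # smaller model used here for verification/validation
        messages=[{
            "role": "user",
            "content": (
                "Check whether the answer is supported by the text. "
                "Reply with a confidence score from 0 to 100 and a short reason.\n"
                f"Text:\n{context}\n\nQuestion: {question}\nAnswer: {answer}"
            ),
        }],
    )
    return resp.choices[0].message.content
```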

System Workflow and Chunking Process

Describes the workflow of processing documents into chunks, drilling down to individual sentences, and generating responses by breaking sub-chunks down further to pinpoint the specific passages that answer the query.
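A minimal sketch of the selection step in that workflow, assuming the model is shown numbered chunks and asked to return the indices worth keeping as a JSON list; the prompt and reply format are illustrative, not taken from the video.

```python
# Sketch of the chunk-selection step: show each chunk with an index and let
# the long-context model reply with the indices worth drilling into.
import json
from openai import OpenAI

client = OpenAI()

def select_relevant_chunks(question: str, chunks: list[str]) -> list[int]:
    numbered = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks))
    resp = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{
            "role": "user",
            "content": (
                "Which numbered sections could help answer the question? "
                "Reply with a JSON list of indices, e.g. [0, 3].\n\n"
                f"Question: {question}\n\nSections:\n{numbered}"
            ),
        }],
    )
    return json.loads(resp.choices[0].message.content)
```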

Implementation Details and Recursive Division

Illustrates the implementation details: a recursive decomposition function splits documents into chunks and selects the relevant ones, followed by a validation pass that checks accuracy and assigns a confidence score.
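Putting the pieces together, a hedged sketch of such a recursive decomposition might look like the following. It reuses the split_into_chunks, select_relevant_chunks, draft_answer, and verify_answer helpers sketched in the sections above; the depth limit and fan-out are the user-controlled knobs, and the values here are placeholders.

```python
# Sketch of the recursive decomposition: repeatedly split the remaining text,
# keep only the chunks the model judges relevant, and recurse until the span
# is small enough (or the depth limit is hit), then draft and validate an answer.

def recursive_retrieve(question: str, text: str,
                       depth: int = 0, max_depth: int = 2) -> str:
    if depth >= max_depth or len(text) < 2_000:
        # Small enough span or depth limit reached: answer and validate.
        answer = draft_answer(question, text)
        report = verify_answer(question, text, answer)
        return f"{answer}\n\nValidation: {report}"

    chunks = split_into_chunks(text, num_chunks=10)
    keep = select_relevant_chunks(question, chunks)

    # Recurse only into the chunks the model judged relevant.
    selected = "\n".join(chunks[i] for i in keep)
    return recursive_retrieve(question, selected, depth + 1, max_depth)
```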

Cost Analysis and Use Cases

Analyzes the system's estimated fixed and variable costs, its applications to legal documents, the trade-offs in different scenarios, and its benefits, including latency and scalability considerations.
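For a back-of-the-envelope sense of the variable cost, one can multiply the tokens read on each pass by a per-token price. The token counts and per-million-token prices in the sketch below are placeholders for illustration, not figures quoted in the video.

```python
# Rough variable-cost estimate for one query. Treats every pass as re-reading
# the full document, so this is an upper bound; later passes actually read less.

def estimate_query_cost(doc_tokens: int, passes: int, output_tokens: int,
                        price_in_per_m: float, price_out_per_m: float) -> float:
    input_cost = doc_tokens * passes * price_in_per_m / 1_000_000
    output_cost = output_tokens * price_out_per_m / 1_000_000
    return input_cost + output_cost

# Example: a 200k-token document, 3 selection/answer passes, ~2k output tokens,
# with placeholder prices of $2 / $8 per million input / output tokens.
print(estimate_query_cost(200_000, 3, 2_000, price_in_per_m=2.0, price_out_per_m=8.0))
```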

Knowledge Graphs and Information Editing

Discusses using knowledge graphs to add relevant information based on the user's query, the implications for search depth and legal-document processing, and the potential to improve relationship tracking and citations.
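A toy sketch of that knowledge-graph idea: look up entities mentioned in the query and prepend their known relationships to the retrieval context. The graph, entities, and relations below are invented purely for illustration.

```python
# Minimal sketch: a dict-based graph of (relation, object) facts per entity,
# used to enrich the query context before retrieval. Entirely illustrative.
knowledge_graph = {
    "Acme Corp": [("subsidiary_of", "Globex"), ("governed_by", "Delaware law")],
    "Globex": [("headquartered_in", "Springfield")],
}

def expand_query_context(query: str) -> str:
    facts = []
    for entity, relations in knowledge_graph.items():
        if entity.lower() in query.lower():
            facts += [f"{entity} {rel} {obj}" for rel, obj in relations]
    return "\n".join(facts)

print(expand_query_context("Who owns Acme Corp?"))
```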
