RAGFlow
This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasoning workflows. By integrating document intelligence with advanced retrieval pipelines, the platform enables the creation of grounded, verifiable responses supported by traceable citations.
The platform distinguishes itself through deep document understanding and sophisticated knowledge orchestration. It supports complex document parsing, including the extraction of tables and images, and utilizes graph-based indexing to enhance reasoning over large document collections. Users can configure multiple recall strategies and fused re-ranking to optimize retrieval accuracy, while the system maintains context through multi-turn dialogue management and flexible tool-use frameworks.
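As an illustration of what fusing several recall strategies can look like, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge ranked lists. This is a generic technique, not necessarily the method the platform uses; the function name, the chunk IDs, and the constant `k = 60` are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked lists of chunk IDs into one fused ranking.

    Each list contributes 1 / (k + rank) for every chunk it contains,
    so chunks ranked highly by several strategies rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from two recall strategies over the same dataset.
keyword_hits = ["c2", "c1", "c4"]
vector_hits = ["c1", "c3", "c2"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for chunk_id, score in fused:
    print(f"{chunk_id}: {score:.4f}")
```

Chunks found by both strategies (`c1`, `c2`) outrank chunks found by only one, which is the behavior a fused re-ranking step is meant to provide.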
The architecture is built on a modular, containerized microservice foundation that supports both local inference engines and external language model APIs. It includes asynchronous task processing for document ingestion and indexing, ensuring system responsiveness during heavy workloads. The platform also provides a standardized interface for model abstraction, allowing for seamless integration with existing language model ecosystems.
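The idea of asynchronous ingestion can be sketched with a small in-process work queue. This is only a conceptual model under stated assumptions: the platform's actual task system is not described here, and `parse_document`, `worker`, and `ingest` are hypothetical names invented for the example.

```python
import asyncio


async def parse_document(doc_id: str) -> str:
    """Stand-in for real parsing and indexing work (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as real I/O-bound work would
    return f"{doc_id}:indexed"


async def worker(queue: asyncio.Queue, results: list[str]) -> None:
    """Pull document IDs off the queue until cancelled."""
    while True:
        doc_id = await queue.get()
        try:
            results.append(await parse_document(doc_id))
        finally:
            queue.task_done()


async def ingest(doc_ids: list[str], concurrency: int = 2) -> list[str]:
    """Process documents in the background with a fixed number of workers."""
    queue: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    for doc_id in doc_ids:
        queue.put_nowait(doc_id)
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(concurrency)]
    await queue.join()  # wait until every queued document is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(sorted(asyncio.run(ingest(["doc-a", "doc-b", "doc-c"]))))
```

Because callers only enqueue work and the workers drain it independently, a burst of uploads does not block whatever is submitting them.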
Developers can interact with the platform through a comprehensive suite of RESTful endpoints and Python client libraries, which cover the full lifecycle of agents, datasets, and knowledge graphs. The system is designed for flexible deployment, offering configurable environment settings and support for custom containerized environments to facilitate local development and infrastructure portability.
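A minimal sketch of calling one of these endpoints from Python with only the standard library follows. The path `/api/v1/agents`, its query parameters, and the `Authorization: Bearer` header come from the platform's agent-listing endpoint. The base URL, the API key value, the default parameter values, and the lowercase encoding of `desc` are assumptions for illustration. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9380"  # assumption: replace with your server address
API_KEY = "ragflow-example-key"     # hypothetical placeholder, not a real key


def build_list_agents_request(
    page: int = 1,
    page_size: int = 30,          # assumed default, for illustration only
    orderby: str = "create_time",  # assumed value, for illustration only
    desc: bool = True,
    base_url: str = BASE_URL,
    api_key: str = API_KEY,
) -> Request:
    """Build (but do not send) a GET request that lists agents."""
    query = urlencode(
        {
            "page": page,
            "page_size": page_size,
            "orderby": orderby,
            "desc": str(desc).lower(),  # assumption: booleans sent as true/false
        }
    )
    return Request(
        f"{base_url}/api/v1/agents?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


request = build_list_agents_request()
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request)` would send it to a running server. Check the response format against the API reference rather than assuming it.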
Features
- Chat Assistants - **POST** `/api/v1/chats` Creates a chat assistant. Request headers: `'content-Type: application/json'`, `'Authorization: Bearer <YOU
- Retrieval-Augmented Generation Platforms - A comprehensive environment for building, managing, and deploying knowledge-based AI applications with advanced document parsing and retrieval capabilities.
- AI Agent Frameworks - A set of tools for defining autonomous agents that utilize custom knowledge bases, memory, and external tools to execute multi-step workflows.
- Grounded Answer Generation - The platform produces responses with traceable citations and visual chunking to reduce hallucinations and facilitate human verification of generated content.
- RAG Workflow Orchestrators - The platform coordinates retrieval workflows using configurable models, multiple recall strategies, and fused re-ranking to ensure seamless integration with business logic.
- RAG Workflows - Coordinates multi-stage recall, re-ranking, and citation-based generation to produce grounded, verifiable responses from indexed datasets.
- Agent Management APIs - **GET** `/api/v1/agents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={agent_name}&id={agent_id}` Lists agents.
- Document Knowledge Extraction - The platform processes unstructured data using deep document understanding to support unlimited tokens and complex formats for high-quality information retrieval.
- Chat Assistant Configurations - The platform enables deploying conversational agents that leverage indexed knowledge bases to provide context-aware responses to user queries through defined interaction patterns.
- Knowledge Dataset Managers - The platform organizes datasets by uploading, parsing, and indexing documents to create a structured knowledge base for retrieval-augmented generation tasks.
- Semantic Search Engines - The platform executes semantic searches across indexed datasets to retrieve relevant information and document snippets for answering complex user questions.
- Agentic Tool-Use Frameworks - Enables autonomous agents to execute multi-step reasoning tasks by leveraging defined memory, knowledge bases, and external tool integrations.
- OpenAI-Compatible APIs - Create chat completion: **POST** `/api/v1/openai/{chat_id}/chat/completions` Creates a model response for a given chat conversation. DEPRECATED
- Graph-Based Knowledge Indexers - Constructs multi-layered knowledge graphs and hierarchical summaries to improve reasoning capabilities over complex document collections.
- LLM API Integrations - The platform supports connecting third-party applications via API endpoints to enable streaming outputs and multi-turn dialogues that maintain context for coherent query responses.
- Chat Assistant Management APIs - **GET** `/api/v1/chats?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&owner_ids={owner_id}&name={chat_name}&id={chat_id}` Lists chat assistants.
- Local LLM Configurations - The platform enables connecting local inference engines or OpenAI-compatible model providers through a unified configuration interface to execute language models within a private infrastructure.
- Document Chunking Strategies - The platform segments documents using intelligent, explainable templates to improve retrieval accuracy and data processing efficiency during knowledge base indexing.
- Knowledge Graph Construction - **POST** `/api/v1/datasets/{dataset_id}/run_graphrag` Constructs a knowledge graph from a specified dataset.
- Document Intelligence Engines - A specialized processing layer that extracts structured data, tables, and text from complex documents to facilitate high-accuracy information retrieval.
- Document Parsing Services - `DataSet.parse_documents(document_ids: list[str]) -> list[tuple[str, str, int, int]]` *Asynchronously* parses documents in the current dataset. This method encapsulates `async_parse_documents()`. It awaits the comp
- Knowledge Graph APIs - **GET** `/api/v1/datasets/{dataset_id}/knowledge_graph` Retrieves the knowledge graph of a specified dataset.
- Source-Based Execution Environments - The platform enables running the service directly from source code to facilitate real-time debugging, local development, and comprehensive testing of core application functionality.
- System Service Configurations - The platform enables defining system-level service configurations for API servers, database connections, object storage, and third-party authentication providers using YAML templates.
- System Settings Management - The platform provides tools to adjust environment parameters for managing application behavior, resource allocation, and backend operations for the underlying service architecture.
- Chat Assistant APIs - **GET** `/api/v1/chats/{chat_id}` Retrieves a specified chat assistant. Request headers: `'Authorization: Bearer <YOUR_API_K
- Agent Orchestration - The platform allows defining autonomous agents that utilize specific tools, memory, and knowledge to execute multi-step workflows and complex reasoning tasks.
- Knowledge Graph Orchestrators - A framework for constructing and querying relational data structures alongside vector-based search to improve reasoning and context-aware response generation.
- Document Parsing Pipelines - Extracts structured data from unstructured files using specialized OCR and layout analysis to enable high-fidelity knowledge retrieval.
- OpenAI-Compatible Inference Servers - A standardized API layer that exposes retrieval and generation capabilities through common interfaces for seamless integration with existing LLM ecosystems.
- Dataset Management APIs - **PUT** `/api/v1/datasets/{dataset_id}` Updates configurations for a specified dataset. Request headers: `'content-Type
- RESTful APIs - A complete reference for RAGFlow's RESTful API. Before proceeding, please ensure you have your RAGFlow API key ready for authentication.
- Model Abstraction Layers - Provides a standardized interface for integrating local inference engines and external LLM providers into retrieval workflows.
- Document Parsers - The platform provides fine-grained control for extracting text, images, and tables from documents to generate traceable answers with citations that minimize hallucinations in retrieval workflows.
- Knowledge Graph Deletion - **DELETE** `/api/v1/datasets/{dataset_id}/knowledge_graph` Removes the knowledge graph of a specified dataset.
- Document Management APIs - **PUT** `/api/v1/datasets/{dataset_id}/documents/{document_id}` Updates configurations for a specified document.
- Document Retrieval APIs - **GET** `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={time
- Document Parsing Controls - **DELETE** `/api/v1/datasets/{dataset_id}/chunks` Stops parsing specified documents. Request headers: `'cont
- Document Uploads - **POST** `/api/v1/datasets/{dataset_id}/documents` Uploads documents to a specified dataset. This endpoint supports three creation modes via the optional `type` query parameter: - `type=local` or omitted: Upload one or m
- Document Deletion APIs - **DELETE** `/api/v1/datasets/{dataset_id}/documents` Deletes documents by ID. Request headers: `'Content-
- Python SDKs - A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you have your RAGFlow API key ready for authentication.
- RAG Pipeline Optimizers - The platform accelerates document parsing and retrieval speed by tuning batch sizes, configuring external parsing services, or utilizing specialized OCR engines for data processing.
- Chat Management APIs - **DELETE** `/api/v1/chats/{chat_id}` Deletes a chat assistant by ID. Request headers: `'Authorization: Bearer <YOUR_API_K
- Dataset Management - **DELETE** `/api/v1/datasets` Deletes datasets by ID. Request headers: `'content-Type: application/json'`, `'Authorization: Bear
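
The knowledge graph endpoints listed above share one dataset-scoped path prefix, which the following sketch makes explicit. The three method and path pairs are taken from the list. The helper name, the dictionary keys, the base URL, the API key, and the dataset ID are hypothetical. Any request body the construction endpoint may expect is not covered here, and nothing is sent over the network.

```python
from urllib.request import Request


def knowledge_graph_requests(base_url: str, api_key: str, dataset_id: str) -> dict[str, Request]:
    """Build (but do not send) the build, fetch, and delete requests for a dataset's graph."""
    headers = {"Authorization": f"Bearer {api_key}"}
    root = f"{base_url}/api/v1/datasets/{dataset_id}"
    return {
        # POST .../run_graphrag constructs the knowledge graph.
        "build": Request(f"{root}/run_graphrag", headers=headers, method="POST"),
        # GET .../knowledge_graph retrieves it.
        "fetch": Request(f"{root}/knowledge_graph", headers=headers, method="GET"),
        # DELETE .../knowledge_graph removes it.
        "delete": Request(f"{root}/knowledge_graph", headers=headers, method="DELETE"),
    }


# Hypothetical values for illustration only.
requests_by_step = knowledge_graph_requests(
    base_url="http://localhost:9380",
    api_key="ragflow-example-key",
    dataset_id="example-dataset-id",
)
for step, req in requests_by_step.items():
    print(step, req.get_method(), req.full_url)
```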