7 repos

Document Processing Tools — Content Management & Publishing

We curate 7 GitHub repositories matching content management & publishing · Document Processing Tools. Refine with filters or upvote what's useful.

Document Processing Tools — Content Management & Publishing

We'll search the best matching repositories with AI.
  • avelino/awesome-go

    avelino/awesome-go

    165,543GitHub

    This project serves as a comprehensive language ecosystem index, functioning as a centralized, community-curated directory for the Go programming language. It organizes a vast landscape of software components, libraries, and development tools into a structured, navigable hierarchy, enabling developers to efficiently di

    Goawesomeawesome-listgo
  • jaywcjlove/awesome-mac

    jaywcjlove/awesome-mac

    99,007GitHub

    This project is a comprehensive, curated collection of software resources designed for the macOS ecosystem. It serves as a centralized directory for discovering applications across a wide range of functional domains, including professional development, system management, and personal productivity. The directory distin

    JavaScriptappappleapplication
  • oven-sh/bun

    oven-sh/bun

    87,491GitHub

    Bun is a high-performance runtime environment designed to execute JavaScript and TypeScript applications with minimal latency and high throughput. Built on a native core implemented in Zig, it provides a unified execution engine that leverages JavaScriptCore for efficient memory management and low-latency startup. The

    Zigbunbundlerjavascript
  • microsoft/markitdown

    microsoft/markitdown

    87,305GitHub

    This project is an AI-powered document processing engine designed to transform diverse file formats into structured Markdown. By leveraging multimodal language models, it performs complex layout analysis and semantic text extraction, allowing for the conversion of both unstructured files and scanned images into machine

    Pythonautogenautogen-extensionlangchain
  • Stirling-Tools/Stirling-PDF

    Stirling-Tools/Stirling-PDF

    74,357GitHub

    Stirling-PDF is a self-hosted document processing suite designed for secure, private file management. It functions as a comprehensive transformation engine that executes complex operations—such as merging, splitting, converting, and redacting documents—directly on the host machine. The platform provides both a browser-

    TypeScriptdockerhacktoberfestjava
  • infiniflow/ragflow

    infiniflow/ragflow

    73,425GitHub

    This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin

    Pythonagentagenticagentic-ai
  • tesseract-ocr/tesseract

    tesseract-ocr/tesseract

    72,460GitHub

    Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into d

    C++hacktoberfestlstmmachine-learning