Overview

What is InnoClaw?

InnoClaw is an AI-powered research assistant web application similar to Google NotebookLM. It lets you open server-side folders as workspaces, browse and manage files, and chat with AI grounded in your workspace files via RAG (Retrieval-Augmented Generation).

Key Features

  • Workspace Management — Map server folders as persistent workspaces

  • File Browser — Tree view with upload, create, edit, rename, and delete

  • GitHub Integration — Clone and pull GitHub repositories (including private ones)

  • RAG Chat — AI answers questions based on workspace file content with source citations

  • Note Generation — Auto-generate summaries, FAQs, briefs, timelines, and daily/weekly reports

  • Multi-LLM Support — Switch between OpenAI GPT, Anthropic Claude, and Google Gemini

  • Bilingual UI — Chinese / English toggle

  • Dark Mode — Light and dark theme support

  • Bot Integrations — Feishu (Lark) and WeChat Enterprise bot support

  • Agent Mode — Autonomous AI agent with tool usage (bash, file operations, kubectl, article search)

  • Skills — 206 built-in SCP scientific skills plus custom workflow templates

  • Article / Paper Search — Search arXiv and Hugging Face Daily Papers

  • Scheduled Tasks — Cron-based automated tasks (daily/weekly reports, git sync, source sync)

  • Dataset Management — Download and manage datasets from HuggingFace Hub and ModelScope

  • Cluster Integration — Kubernetes cluster management with Volcano GPU job submission

Architecture

The following diagram shows the high-level architecture of InnoClaw:

        graph TB
    subgraph Client["Browser Client"]
        UI["Next.js App Router<br/>(React + Tailwind CSS)"]
        I18N["next-intl<br/>(EN / ZH)"]
        Theme["next-themes<br/>(Light / Dark)"]
    end

    subgraph Server["Next.js Server"]
        API["API Routes"]
        RAG["RAG Pipeline"]
        DB["SQLite + Drizzle ORM"]
        FS["File System"]
        Git["Git Operations"]
        Scheduler["Task Scheduler"]
    end

    subgraph AI["AI Providers"]
        OpenAI["OpenAI API"]
        Anthropic["Anthropic API"]
        Gemini["Gemini API"]
        Embed["Embedding API"]
    end

    subgraph Bots["Bot Integrations"]
        Feishu["Feishu / Lark"]
        WeChat["WeChat Enterprise"]
    end

    subgraph External["External Services"]
        HF["HuggingFace Hub"]
        ModelScope["ModelScope"]
        ArXiv["arXiv"]
        K8s["Kubernetes Cluster"]
    end

    UI --> API
    API --> RAG
    API --> DB
    API --> FS
    API --> Git
    API --> Scheduler
    RAG --> Embed
    RAG --> DB
    API --> OpenAI
    API --> Anthropic
    API --> Gemini
    API --> HF
    API --> ModelScope
    API --> ArXiv
    API --> K8s
    Bots --> API
    

Tech Stack

Category

Technology

Framework

Next.js 16 (App Router) + TypeScript

UI

Tailwind CSS 4 + shadcn/ui + Radix UI

AI

Vercel AI SDK 6 + OpenAI + Anthropic + Gemini

Database

SQLite (better-sqlite3 + Drizzle ORM)

Vector Search

Pure JS cosine similarity (no extensions needed)

i18n

next-intl (Chinese / English)

Theme

next-themes

RAG Pipeline

The core feature of InnoClaw is RAG-based AI chat. The pipeline works in three stages:

        flowchart LR
    subgraph Index["Indexing (Sync)"]
        Files["Workspace Files"] --> Extract["Text Extraction"]
        Extract --> Chunk["Text Chunking"]
        Chunk --> Embedding["Vector Embedding"]
        Embedding --> Store["SQLite Storage"]
    end

    subgraph Query["Query (Chat)"]
        Question["User Question"] --> QEmbed["Question Embedding"]
        QEmbed --> Search["Cosine Similarity Search"]
        Store --> Search
        Search --> TopK["Top-K Relevant Chunks"]
    end

    subgraph Generate["Generation"]
        TopK --> Prompt["System Prompt + Chunks + Question"]
        Prompt --> LLM["LLM (GPT / Claude / Gemini)"]
        LLM --> Answer["Streaming Answer with Source Citations"]
    end
    
  1. Indexing — When the user clicks “Sync”, files are scanned, text is extracted, chunked, and embedded into vectors stored in SQLite.

  2. Retrieval — When the user asks a question, the question is embedded and compared against all stored vectors using cosine similarity to find the top 8 most relevant chunks.

  3. Generation — The relevant chunks and the user’s question are sent to the LLM, which generates a streaming answer with source citations.