BlockForge-Dev/ForgeRAG


ForgeRAG

Secure RAG infrastructure for document intelligence, citations, access control, evaluation, and observability.

ForgeRAG is not a "chat with PDF" toy. It is a production-style Retrieval Augmented Generation platform where an organization can upload documents, index them asynchronously, ask grounded questions, receive cited answers, restrict access by workspace and document permissions, inspect failures, and evaluate answer quality over time.

Problem

Teams want AI search over private documents, but a useful enterprise system needs more than a prompt and a file upload. It needs durable ingestion, tenant isolation, authorization, source citations, audit trails, evaluation, observability, and deployment discipline.

ForgeRAG is designed to show the infrastructure behind enterprise AI search:

  • Documents are ingested asynchronously.
  • Answers must be grounded in retrieved chunks.
  • Answers must include citations.
  • Retrieval must be observable.
  • Every query must be logged.
  • Ingestion failures must be visible.
  • Access control matters from the start.

Current Scope

The current implementation includes:

  • Go API service
  • Workspace owner registration
  • Workspace-scoped API key authentication
  • Role-based access control for document APIs
  • PDF, text, and markdown upload
  • Local document storage with checksum and file metadata
  • Durable ingestion jobs with worker leasing, retries, backoff, attempt history, and dead-letter handling
  • Worker-based text extraction for PDF, text, and markdown files
  • Cleaned document chunks with overlap, page metadata, and token estimates
  • Chunk embeddings generated during ingestion
  • Local deterministic embedding provider for development and tests
  • OpenAI-compatible embedding provider configuration
  • Tenant-filtered pgvector semantic search endpoint
  • Persisted document_chunks records with a workspace-scoped inspection endpoint
  • Postgres connection and readiness check
  • SQL migrations with pgvector enabled and HNSW indexing
  • Docker Compose stack
  • Structured JSON logs
  • Request IDs
  • Consistent API error responses
  • Workspace and tenant-scoped document metadata endpoints
  • Architecture and deployment documentation

Later milestones will add answer generation, citations, query history, evaluation, observability dashboards, and frontend screens.

Architecture

flowchart LR
    User[User or API Client] --> API[Go API Service]
    API --> DB[(Postgres + pgvector)]
    API --> Embed[Embedding Provider]
    API --> Storage[Document Storage]
    API --> Jobs[Durable Job Queue]
    Jobs --> Worker[Go Ingestion Worker]
    Worker --> Storage
    Worker --> Extract[Text Extraction]
    Worker --> Chunk[Chunking]
    Worker --> Embed
    Worker --> DB
    API --> LLM[Chat Model Provider]
    API --> Telemetry[Logs, Metrics, Traces]

Core Services

  • API service: authentication, workspaces, documents, search, and admin inspection; the ask and feedback endpoints are planned
  • Worker service: document extraction, chunking, embedding, indexing, retries, failure classification
  • Postgres: transactional data, pgvector semantic index, query history, audit logs
  • Storage: original uploaded documents and extracted artifacts
  • Evaluation runner (planned): repeatable quality tests for retrieval and answers
  • Dashboard (planned): document status, query history, citations, feedback, evaluation, health

Initial API

GET  /health
GET  /ready

POST /auth/register

POST /api/v1/workspaces

POST /api/v1/documents
GET  /api/v1/documents
GET  /api/v1/documents/{id}
GET  /api/v1/documents/{id}/chunks

POST /api/v1/search

GET  /api/v1/admin/jobs
GET  /api/v1/admin/jobs/{id}
POST /api/v1/admin/jobs/{id}/retry

Planned API:

POST /auth/login

GET    /workspaces/{id}
DELETE /documents/{id}

POST /ask

GET  /queries
GET  /queries/{id}
POST /queries/{id}/feedback

POST /eval/datasets
POST /eval/cases
POST /eval/runs
GET  /eval/runs/{id}

Data Model

Core tables:

users
workspaces
workspace_members
api_keys

documents
document_versions
document_files
document_chunks
document_permissions
document_collections
collection_members

jobs
job_attempts

rag_queries
rag_query_retrievals
rag_feedback

eval_datasets
eval_cases
eval_runs
eval_results

audit_logs

Failure Modes

ForgeRAG treats failures as product-visible infrastructure events rather than entries buried in logs.

Current ingestion and retrieval failure classes:

document_parse_failed
storage_failed
embedding_failed
chunk_persist_failed
search_failed
database_unavailable
job_lease_expired
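One way to make these classes usable by a worker is to model them as typed constants with a retry policy attached. The sketch below uses the class names from the list above; the FailureClass type, the Retryable method, and the choice of which classes count as retryable are assumptions for illustration, not ForgeRAG's actual policy.

```go
package main

import "fmt"

// FailureClass is a hypothetical typed wrapper around the failure class strings.
type FailureClass string

const (
	DocumentParseFailed FailureClass = "document_parse_failed"
	StorageFailed       FailureClass = "storage_failed"
	EmbeddingFailed     FailureClass = "embedding_failed"
	ChunkPersistFailed  FailureClass = "chunk_persist_failed"
	SearchFailed        FailureClass = "search_failed"
	DatabaseUnavailable FailureClass = "database_unavailable"
	JobLeaseExpired     FailureClass = "job_lease_expired"
)

// Retryable reports whether a failure is worth another attempt.
// Assumed policy: a document that cannot be parsed will not parse on a
// second try, while the other classes may be transient.
func (c FailureClass) Retryable() bool {
	switch c {
	case StorageFailed, EmbeddingFailed, ChunkPersistFailed,
		SearchFailed, DatabaseUnavailable, JobLeaseExpired:
		return true
	default:
		return false // includes DocumentParseFailed and unknown classes
	}
}

func main() {
	for _, c := range []FailureClass{DocumentParseFailed, EmbeddingFailed, JobLeaseExpired} {
		fmt.Printf("%s retryable=%t\n", c, c.Retryable())
	}
}
```

Treating unknown classes as non-retryable is a conservative default: an unclassified failure is surfaced once instead of looping through retries.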

Planned provider and answer failure classes:

provider_timeout
llm_failed
permission_denied
rate_limited
insufficient_context

Local Development

Requirements:

  • Docker
  • Go 1.24 or newer

Start the full local stack:

docker compose up --build

Run the API directly:

cd backend
go run ./cmd/api

Run the ingestion worker directly:

cd backend
go run ./cmd/worker

The API listens on http://localhost:8080 by default.

Health checks:

curl http://localhost:8080/health
curl http://localhost:8080/ready

Register an owner, workspace, and first API key:

curl -X POST http://localhost:8080/auth/register \
  -H "Content-Type: application/json" \
  -d "{\"email\":\"owner@example.com\",\"name\":\"Acme Owner\",\"workspace_name\":\"Acme Finance\"}"

The api_key.token value in the response is shown only once, so save it. Send it as a bearer token when calling the workspace APIs.

Create another workspace:

curl -X POST http://localhost:8080/api/v1/workspaces \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"name\":\"Acme Legal\"}"

Create a document metadata record:

curl -X POST http://localhost:8080/api/v1/documents \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"workspace_id\":\"<workspace_id>\",\"title\":\"Refund Policy\",\"source_type\":\"manual\"}"

Upload a PDF, text, or markdown document:

curl -X POST http://localhost:8080/api/v1/documents \
  -H "Authorization: Bearer <api_key_token>" \
  -F "workspace_id=<workspace_id>" \
  -F "title=Refund Policy" \
  -F "file=@./refund-policy.pdf"

Uploaded documents are stored locally under FORGERAG_STORAGE_DIR. The database records the file name, storage URI, content type, size, checksum, version, document status, and the queued ingestion job. The worker then resolves the stored file URI, extracts text, splits it into overlapping chunks, embeds each chunk, persists the document_chunks rows, and updates the document status based on the job result.
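The chunking step is easier to picture with a small example. The following is a minimal sketch of sliding-window chunking with overlap, assuming whitespace-separated words as the unit and the word count as a crude token estimate. The Chunk type, the chunkWords function, and its size and overlap parameters are hypothetical names, not ForgeRAG's actual chunker.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is a hypothetical in-memory shape for one chunk of a document.
type Chunk struct {
	Index         int
	Text          string
	TokenEstimate int // assumption: one token per word, a rough estimate only
}

// chunkWords splits text into windows of at most size words, where each
// window repeats the last overlap words of the previous one.
func chunkWords(text string, size, overlap int) ([]Chunk, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, fmt.Errorf("invalid chunking parameters: size=%d overlap=%d", size, overlap)
	}
	words := strings.Fields(text)
	step := size - overlap
	var chunks []Chunk
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, Chunk{
			Index:         len(chunks),
			Text:          strings.Join(words[start:end], " "),
			TokenEstimate: end - start,
		})
		if end == len(words) {
			break // the last window reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkWords("refunds are accepted within thirty days of the purchase date", 4, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%d (%d tokens): %s\n", c.Index, c.TokenEstimate, c.Text)
	}
}
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which tends to help retrieval at the cost of some duplicated text.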

Inspect extracted chunks for a document:

curl http://localhost:8080/api/v1/documents/<document_id>/chunks \
  -H "Authorization: Bearer <api_key_token>"

Search indexed chunks semantically:

curl -X POST http://localhost:8080/api/v1/search \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"query\":\"What is the refund window?\",\"top_k\":5}"

Inspect ingestion jobs as an admin or owner:

curl http://localhost:8080/api/v1/admin/jobs \
  -H "Authorization: Bearer <api_key_token>"

Embeddings

ForgeRAG defaults to FORGERAG_EMBEDDING_PROVIDER=local, which uses deterministic hash embeddings. This keeps local development, tests, and demos working without paid credentials.
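To show what "deterministic hash embeddings" can mean in practice, here is a minimal sketch of a feature-hashing embedder: each lowercase token is hashed into one of 1536 buckets and the vector is L2-normalized. This is an illustration under assumptions (FNV-1a hashing, whitespace tokenization, the hashEmbed name), not the actual local provider.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"strings"
)

const dims = 1536 // matches the vector(1536) column described in this README

// hashEmbed maps text to a fixed-size vector without any model or network call.
// The same input always yields the same output.
func hashEmbed(text string) []float32 {
	vec := make([]float32, dims)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		h := fnv.New32a()
		h.Write([]byte(tok))
		sum := h.Sum32()
		sign := float32(1)
		if sum&(1<<31) != 0 {
			sign = -1 // use the top bit as a sign to reduce bucket collisions piling up
		}
		vec[sum%dims] += sign
	}
	var norm float64
	for _, v := range vec {
		norm += float64(v) * float64(v)
	}
	if norm == 0 {
		return vec // empty input stays the zero vector
	}
	scale := float32(1 / math.Sqrt(norm))
	for i := range vec {
		vec[i] *= scale
	}
	return vec
}

func main() {
	a := hashEmbed("What is the refund window?")
	b := hashEmbed("What is the refund window?")
	same := true
	for i := range a {
		if a[i] != b[i] {
			same = false
		}
	}
	fmt.Println("dimensions:", len(a), "deterministic:", same)
}
```

Vectors like these capture token overlap, not meaning, so they are useful for exercising the ingestion and search pipeline but say little about real retrieval quality.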

For production-style embeddings, set:

FORGERAG_EMBEDDING_PROVIDER=openai
FORGERAG_EMBEDDING_MODEL=text-embedding-3-small
FORGERAG_OPENAI_API_KEY=<your_api_key>

The current pgvector column is vector(1536), so FORGERAG_EMBEDDING_DIMENSIONS must remain 1536 unless you add a matching database migration.

Environment

Copy .env.example and adjust values as needed.

Important variables:

FORGERAG_ENV
FORGERAG_HTTP_ADDR
FORGERAG_DATABASE_URL
FORGERAG_MIGRATIONS_DIR
FORGERAG_STORAGE_DIR
FORGERAG_MAX_UPLOAD_BYTES
FORGERAG_EMBEDDING_PROVIDER
FORGERAG_EMBEDDING_MODEL
FORGERAG_EMBEDDING_DIMENSIONS
FORGERAG_OPENAI_API_KEY
FORGERAG_OPENAI_BASE_URL
FORGERAG_WORKER_POLL_INTERVAL
FORGERAG_JOB_LEASE_DURATION
FORGERAG_JOB_RETRY_BACKOFF
FORGERAG_LOG_LEVEL

Security Model

Milestone 2 uses workspace-scoped API keys tied to users. Registration creates a user, workspace, owner membership, and API key in one transaction. Document and search APIs require:

  • Authorization: Bearer <api_key_token>
  • membership in the API key workspace
  • role checks for write operations
  • tenant-scoped document and retrieval queries
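The first and third requirements can be sketched as two small helpers: one that extracts the token from the Authorization header, and one that gates write operations on the caller's role. The owner and admin roles appear in this README; the member and viewer roles, the function names, and the rule that viewers cannot write are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>" value.
func bearerToken(header string) (string, error) {
	scheme, token, ok := strings.Cut(header, " ")
	token = strings.TrimSpace(token)
	if !ok || !strings.EqualFold(scheme, "Bearer") || token == "" {
		return "", errors.New("missing or malformed bearer token")
	}
	return token, nil
}

// Role is a workspace membership role.
type Role string

const (
	RoleOwner  Role = "owner"
	RoleAdmin  Role = "admin"
	RoleMember Role = "member" // hypothetical role
	RoleViewer Role = "viewer" // hypothetical role
)

// canWrite reports whether a role may perform write operations.
// Assumed policy: everyone except viewers and unknown roles can write.
func canWrite(role Role) bool {
	switch role {
	case RoleOwner, RoleAdmin, RoleMember:
		return true
	default:
		return false
	}
}

func main() {
	token, err := bearerToken("Bearer example-token")
	fmt.Println(token, err)
	fmt.Println("viewer can write:", canWrite(RoleViewer))
}
```

In a real handler the token would be looked up to find its user and workspace, and that workspace ID would then scope every document and retrieval query.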

Still to be added on the path to production:

  • document permissions
  • secret redaction
  • rate limiting
  • request and upload size limits

Evaluation

Planned evaluation dimensions:

  • answer correctness
  • groundedness
  • citation presence
  • retrieval relevance
  • refusal correctness
  • latency
  • token usage and cost

Observability

The API already emits structured request logs with request IDs. Worker job attempts record ingestion failures and retry behavior. Search responses include embedding, retrieval, and total latency fields. Later milestones add OpenTelemetry-style spans and metrics across:

  • HTTP requests
  • ingestion jobs
  • text extraction
  • chunking
  • embedding calls
  • vector search
  • LLM generation
  • database operations

Deployment Plan

Milestone 1 is Docker Compose ready. Production deployment will add:

  • GitHub Actions CI
  • migration step
  • container health checks
  • secrets management notes
  • backup and restore notes
  • cloud deployment guide
  • optional Prometheus and Grafana

Roadmap

  • Milestone 0: product boundary and architecture
  • Milestone 1: backend foundation
  • Milestone 2: auth, tenants, and workspaces
  • Milestone 3: document upload and storage
  • Milestone 4: async ingestion queue
  • Milestone 5: text extraction and chunking (implemented)
  • Milestone 6: embeddings and vector search (implemented)
  • Milestone 7: RAG answer generation with citations
  • Milestone 8: query history, audit logs, and feedback
  • Milestone 9: document-level access control
  • Milestone 10: evaluation system
  • Milestone 11: observability and tracing
  • Milestone 12: frontend dashboard
  • Milestone 13: DevOps and deployment
  • Milestone 14: security and reliability hardening
  • Milestone 15: public proof package
