ForgeRAG

Secure RAG infrastructure for document intelligence, citations, access control, evaluation, and observability.

ForgeRAG is not a "chat with PDF" toy. It is a production-style Retrieval-Augmented Generation (RAG) platform where an organization can upload documents, index them asynchronously, ask grounded questions, receive cited answers, restrict access by workspace and document permissions, inspect failures, and evaluate answer quality over time.

Problem

Teams want AI search over private documents, but a useful enterprise system needs more than a prompt and a file upload. It needs durable ingestion, tenant isolation, authorization, source citations, audit trails, evaluation, observability, and deployment discipline.

ForgeRAG is designed to show the infrastructure behind enterprise AI search:

Documents are ingested asynchronously.
Answers must be grounded in retrieved chunks.
Answers must include citations.
Retrieval must be observable.
Every query must be logged.
Ingestion failures must be visible.
Access control matters from the start.

Current Scope

The current implementation includes:

Go API service
Workspace owner registration
Workspace-scoped API key authentication
Role-based access control for document APIs
PDF, text, and markdown upload
Local document storage with checksum and file metadata
Durable ingestion jobs with worker leasing, retries, backoff, attempt history, and dead-letter handling
Worker-based text extraction for PDF, text, and markdown files
Cleaned document chunks with overlap, page metadata, and token estimates
Chunk embeddings generated during ingestion
Local deterministic embedding provider for development and tests
OpenAI-compatible embedding provider configuration
Tenant-filtered pgvector semantic search endpoint
Persisted document_chunks records with a workspace-scoped inspection endpoint
Postgres connection and readiness check
SQL migrations with pgvector enabled and HNSW indexing
Docker Compose stack
Structured JSON logs
Request IDs
Consistent API error responses
Workspace and tenant-scoped document metadata endpoints
Architecture and deployment documentation

Future milestones add answer generation, citations, query history, evaluation, observability dashboards, and frontend screens.

Architecture

flowchart LR
    User[User or API Client] --> API[Go API Service]
    API --> DB[(Postgres + pgvector)]
    API --> Embed[Embedding Provider]
    API --> Storage[Document Storage]
    API --> Jobs[Durable Job Queue]
    Jobs --> Worker[Go Ingestion Worker]
    Worker --> Storage
    Worker --> Extract[Text Extraction]
    Worker --> Chunk[Chunking]
    Worker --> Embed
    Worker --> DB
    API --> LLM[Chat Model Provider]
    API --> Telemetry[Logs, Metrics, Traces]

Core Services

API service: authentication, workspaces, documents, search, ask, feedback, admin inspection
Worker service: document extraction, chunking, embedding, indexing, retries, failure classification
Postgres: transactional data, pgvector semantic index, query history, audit logs
Storage: original uploaded documents and extracted artifacts
Evaluation runner: repeatable quality tests for retrieval and answers
Dashboard: document status, query history, citations, feedback, evaluation, health

Initial API

GET  /health
GET  /ready

POST /auth/register

POST /api/v1/workspaces

POST /api/v1/documents
GET  /api/v1/documents
GET  /api/v1/documents/{id}
GET  /api/v1/documents/{id}/chunks

POST /api/v1/search

GET  /api/v1/admin/jobs
GET  /api/v1/admin/jobs/{id}
POST /api/v1/admin/jobs/{id}/retry

Planned API:

POST /auth/login

GET    /workspaces/{id}
DELETE /documents/{id}

POST /ask

GET  /queries
GET  /queries/{id}
POST /queries/{id}/feedback

POST /eval/datasets
POST /eval/cases
POST /eval/runs
GET  /eval/runs/{id}

Data Model

Core tables:

users
workspaces
workspace_members
api_keys

documents
document_versions
document_files
document_chunks
document_permissions
document_collections
collection_members

jobs
job_attempts

rag_queries
rag_query_retrievals
rag_feedback

eval_datasets
eval_cases
eval_runs
eval_results

audit_logs

Failure Modes

ForgeRAG treats failures as product-visible infrastructure events, not hidden logs.

Current ingestion and retrieval failure classes:

document_parse_failed
storage_failed
embedding_failed
chunk_persist_failed
search_failed
database_unavailable
job_lease_expired

Planned provider and answer failure classes:

provider_timeout
llm_failed
permission_denied
rate_limited
insufficient_context
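
One way to keep these classes product-visible is to model them as typed constants and return them in a consistent error envelope. The sketch below is hypothetical: the retry policy in retryable and the apiError field names are assumptions, not the documented ForgeRAG response format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FailureClass is a machine-readable failure code from the lists above.
type FailureClass string

const (
	DocumentParseFailed FailureClass = "document_parse_failed"
	StorageFailed       FailureClass = "storage_failed"
	EmbeddingFailed     FailureClass = "embedding_failed"
	ChunkPersistFailed  FailureClass = "chunk_persist_failed"
	SearchFailed        FailureClass = "search_failed"
	DatabaseUnavailable FailureClass = "database_unavailable"
	JobLeaseExpired     FailureClass = "job_lease_expired"
)

// retryable encodes an ASSUMED policy: infrastructure failures may succeed
// on a later attempt, while a parse failure will not fix itself.
func retryable(c FailureClass) bool {
	switch c {
	case StorageFailed, EmbeddingFailed, ChunkPersistFailed,
		DatabaseUnavailable, JobLeaseExpired:
		return true
	}
	return false
}

// apiError is a hypothetical error envelope; field names are assumptions.
type apiError struct {
	Code      FailureClass `json:"code"`
	Message   string       `json:"message"`
	RequestID string       `json:"request_id"`
}

func main() {
	out, err := json.Marshal(apiError{
		Code:      EmbeddingFailed,
		Message:   "embedding provider returned an error",
		RequestID: "req-123",
	})
	if err != nil {
		fmt.Println("marshal:", err)
		return
	}
	fmt.Println(string(out), "retryable:", retryable(EmbeddingFailed))
}
```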

Local Development

Requirements:

Docker
Go 1.24 or newer

Start the full local stack:

docker compose up --build

Run the API directly:

cd backend
go run ./cmd/api

Run the ingestion worker directly:

cd backend
go run ./cmd/worker

The API listens on http://localhost:8080 by default.

Health checks:

curl http://localhost:8080/health
curl http://localhost:8080/ready

Register an owner, workspace, and first API key:

curl -X POST http://localhost:8080/auth/register \
  -H "Content-Type: application/json" \
  -d "{\"email\":\"owner@example.com\",\"name\":\"Acme Owner\",\"workspace_name\":\"Acme Finance\"}"

The api_key.token value is shown only once. Use it as a bearer token for workspace APIs.

Create another workspace:

curl -X POST http://localhost:8080/api/v1/workspaces \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"name\":\"Acme Legal\"}"

Create a document metadata record:

curl -X POST http://localhost:8080/api/v1/documents \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"workspace_id\":\"<workspace_id>\",\"title\":\"Refund Policy\",\"source_type\":\"manual\"}"

Upload a PDF, text, or markdown document:

curl -X POST http://localhost:8080/api/v1/documents \
  -H "Authorization: Bearer <api_key_token>" \
  -F "workspace_id=<workspace_id>" \
  -F "title=Refund Policy" \
  -F "file=@./refund-policy.pdf"

Uploaded documents are stored locally under FORGERAG_STORAGE_DIR. The database records the file name, storage URI, content type, size, checksum, version, document status, and the queued ingestion job. The worker resolves the stored file URI, extracts text, chunks it with overlap, embeds each chunk, persists document_chunks, and updates document status based on the job result.
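
The "chunks it with overlap" step can be sketched as a sliding window over words. This is a simplified illustration, not the worker's actual algorithm: the word-based window, the Chunk fields, and the four-characters-per-token estimate are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is a simplified stand-in for a document_chunks row.
type Chunk struct {
	Index         int
	Text          string
	TokenEstimate int
}

// chunkWords splits text into windows of `size` words in which consecutive
// windows share `overlap` words. It returns nil for invalid parameters.
func chunkWords(text string, size, overlap int) []Chunk {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	words := strings.Fields(text)
	step := size - overlap
	var chunks []Chunk
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		body := strings.Join(words[start:end], " ")
		chunks = append(chunks, Chunk{
			Index:         len(chunks),
			Text:          body,
			TokenEstimate: (len(body) + 3) / 4, // rough heuristic: ~4 bytes per token
		})
		if end == len(words) {
			break // the final window already reached the end of the text
		}
	}
	return chunks
}

func main() {
	for _, c := range chunkWords("a b c d e f g", 4, 1) {
		fmt.Printf("%d %q tokens~%d\n", c.Index, c.Text, c.TokenEstimate)
	}
}
```

With a window of 4 and an overlap of 1, the word "d" appears at the end of the first chunk and the start of the second, which is what preserves context across chunk boundaries.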

Inspect extracted chunks for a document:

curl http://localhost:8080/api/v1/documents/<document_id>/chunks \
  -H "Authorization: Bearer <api_key_token>"

Search indexed chunks semantically:

curl -X POST http://localhost:8080/api/v1/search \
  -H "Authorization: Bearer <api_key_token>" \
  -H "Content-Type: application/json" \
  -d "{\"query\":\"What is the refund window?\",\"top_k\":5}"

Inspect ingestion jobs as an admin or owner:

curl http://localhost:8080/api/v1/admin/jobs \
  -H "Authorization: Bearer <api_key_token>"

Embeddings

ForgeRAG defaults to FORGERAG_EMBEDDING_PROVIDER=local, which uses deterministic hash embeddings. This keeps local development, tests, and demos working without paid credentials.
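
To give a sense of what a deterministic hash embedding can look like, here is a minimal feature-hashing sketch in Go. It is an assumption about the general technique, not the local provider's actual implementation: each lowercased token is hashed with FNV-1a into one of `dims` buckets and the resulting vector is L2-normalized.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"strings"
)

// hashEmbed returns the same vector for the same text on every run.
// It returns nil when dims is not positive.
func hashEmbed(text string, dims int) []float32 {
	if dims <= 0 {
		return nil
	}
	vec := make([]float32, dims)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		h := fnv.New32a()
		h.Write([]byte(tok)) // hash.Hash writes never return an error
		vec[h.Sum32()%uint32(dims)]++
	}
	var sum float64
	for _, x := range vec {
		sum += float64(x) * float64(x)
	}
	if sum == 0 {
		return vec // empty input stays a zero vector
	}
	norm := float32(math.Sqrt(sum))
	for i := range vec {
		vec[i] /= norm
	}
	return vec
}

func main() {
	a := hashEmbed("Refund policy", 1536)
	b := hashEmbed("refund policy", 1536)
	same := true
	for i := range a {
		if a[i] != b[i] {
			same = false
		}
	}
	fmt.Println(len(a), same)
}
```

Vectors like this carry no semantic meaning beyond token overlap, which is why they suit tests and demos rather than real retrieval quality.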

For production-style embeddings, set:

FORGERAG_EMBEDDING_PROVIDER=openai
FORGERAG_EMBEDDING_MODEL=text-embedding-3-small
FORGERAG_OPENAI_API_KEY=<your_api_key>

The current pgvector column is vector(1536), so FORGERAG_EMBEDDING_DIMENSIONS must remain 1536 unless you add a matching database migration.

Environment

Copy .env.example and adjust values as needed.

Important variables:

FORGERAG_ENV
FORGERAG_HTTP_ADDR
FORGERAG_DATABASE_URL
FORGERAG_MIGRATIONS_DIR
FORGERAG_STORAGE_DIR
FORGERAG_MAX_UPLOAD_BYTES
FORGERAG_EMBEDDING_PROVIDER
FORGERAG_EMBEDDING_MODEL
FORGERAG_EMBEDDING_DIMENSIONS
FORGERAG_OPENAI_API_KEY
FORGERAG_OPENAI_BASE_URL
FORGERAG_WORKER_POLL_INTERVAL
FORGERAG_JOB_LEASE_DURATION
FORGERAG_JOB_RETRY_BACKOFF
FORGERAG_LOG_LEVEL

Security Model

Milestone 2 uses workspace-scoped API keys tied to users. Registration creates a user, workspace, owner membership, and API key in one transaction. Document and search APIs require:

Authorization: Bearer <api_key_token>
membership in the API key workspace
role checks for write operations
tenant-scoped document and retrieval queries

The production path still adds:

document permissions
secret redaction
rate limiting
request and upload size limits

Evaluation

Planned evaluation dimensions:

answer correctness
groundedness
citation presence
retrieval relevance
refusal correctness
latency
token usage and cost

Observability

The API already emits structured request logs with request IDs. Worker job attempts record ingestion failures and retry behavior. Search responses include embedding, retrieval, and total latency fields. Later milestones add OpenTelemetry-style spans and metrics across:

HTTP requests
ingestion jobs
text extraction
chunking
embedding calls
vector search
LLM generation
database operations

Deployment Plan

Milestone 1 is Docker Compose ready. Production deployment will add:

GitHub Actions CI
migration step
container health checks
secrets management notes
backup and restore notes
cloud deployment guide
optional Prometheus and Grafana

Roadmap

Milestone 0: product boundary and architecture
Milestone 1: backend foundation (implemented)
Milestone 2: auth, tenants, and workspaces (implemented)
Milestone 3: document upload and storage (implemented)
Milestone 4: async ingestion queue (implemented)
Milestone 5: text extraction and chunking (implemented)
Milestone 6: embeddings and vector search (implemented)
Milestone 7: RAG answer generation with citations
Milestone 8: query history, audit logs, and feedback
Milestone 9: document-level access control
Milestone 10: evaluation system
Milestone 11: observability and tracing
Milestone 12: frontend dashboard
Milestone 13: DevOps and deployment
Milestone 14: security and reliability hardening
Milestone 15: public proof package
