
feat: security contacts worker [CM-1297] #4283

Open
mbani01 wants to merge 19 commits into main from feat/security_contacts_worker


mbani01 commented Jun 30, 2026

Contributor

This pull request introduces support for processing and storing security contacts for repositories, including database schema changes, new worker service setup, configuration, and supporting code. It also updates dependencies and environment files as needed. Below are the most important changes grouped by theme:

Security Contacts Feature Implementation

  • Adds a new database table security_contacts and several related columns to the repos table to store security contact information and metadata.
  • Implements the processSecurityContactsBatch activity and supporting extractor logic to fetch and process security contacts, including HTTP helpers and parsing utilities. [1] [2]
  • Adds a new security-contacts-worker service with Docker Compose configuration, build integration, and npm scripts for running and developing the worker. [1] [2] [3]

Configuration and Environment

  • Adds new environment variables for configuring the security contacts worker (interval, user agent, concurrency, timeouts, batch size) in both composed and local environment files, and exposes them through a new getSecurityContactsConfig function. [1] [2] [3]

Dependency and Codebase Maintenance

  • Updates and adds dependencies for js-yaml and its types in both the lockfile and package.json to support new parsing needs. [1] [2] [3] [4] [5] [6] [7]
  • Refactors the InstallationPool class into its own file to be reused and removes its inline definition from the enricher loop. [1] [2] [3]

These changes lay the groundwork for collecting, processing, and storing security contact information for repositories in a scalable and configurable way.


Note

Medium Risk
Large new outbound-ingestion surface (GitHub + registries) that stores contact emails/URLs and mutates packages DB schema; mitigations include SSRF guards, rate limiting, and non-destructive updates on partial failures.

Overview
Adds a security contacts ingestion path for GitHub repos tied to critical packages: new security_contacts rows (channel, value, role, score, confidence, provenance) plus repos policy fields (PVR, policy URLs, contacts_last_refreshed).

A new security-contacts-worker registers a daily Temporal schedule (0 6 * * *) and runs ingestSecurityContacts, which batches repos (daily vs weekly cadence), runs tiered extractors in parallel, then reconciles, scores, and writes results. Sources include SECURITY-INSIGHTS, GitHub PVR, SECURITY_CONTACTS, security.txt (homepage), SECURITY.md, and registry manifests (npm, PyPI, Maven, Cargo, NuGet, RubyGems, Composer). GitHub calls use a shared installation pool with rate-limit handling; registry calls require SECURITY_CONTACTS_USER_AGENT.
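To make the daily vs weekly cadence concrete, here is a minimal sketch of a staleness check. The `Cadence` type, the `isDue` function, and the thresholds are assumptions for illustration only; the overview does not say how a repo is assigned a cadence or how staleness is computed in the PR.

```typescript
// Hypothetical cadence labels; the PR only mentions "daily vs weekly".
type Cadence = 'daily' | 'weekly'

const DAY_MS = 24 * 60 * 60 * 1000

// Returns true when a repo should be picked up by the next batch.
// A repo that was never refreshed (null) is always due.
function isDue(lastRefreshed: Date | null, cadence: Cadence, now: Date): boolean {
  if (lastRefreshed === null) return true
  const maxAgeMs = cadence === 'daily' ? DAY_MS : 7 * DAY_MS
  return now.getTime() - lastRefreshed.getTime() >= maxAgeMs
}
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to unit test against a fixed schedule time such as 06:00 UTC.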

Failure behavior is conservative: any extractor failure skips destructive updates and only bumps contacts_last_refreshed; successful runs replace contacts in a transaction while policy columns use COALESCE so partial runs do not clear prior values. InstallationPool is extracted from the github enricher for reuse.
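The failure rule described above can be sketched as a pure decision function. This is an illustration under assumptions, not the PR's code: `ExtractorOutcome`, `PolicyFields`, `WritePlan`, `planWrite`, and the field names are hypothetical, and `coalesce` only mimics SQL `COALESCE(new, old)` in memory.

```typescript
// Hypothetical shapes for illustration.
type ExtractorOutcome = { name: string; ok: boolean }
type PolicyFields = { pvrEnabled: boolean | null; policyUrl: string | null }

type WritePlan =
  | { kind: 'refresh-only' } // only bump contacts_last_refreshed
  | { kind: 'replace'; policy: PolicyFields } // replace contacts, merge policy columns

// Mirrors SQL COALESCE(next, prev): keep the prior value when the new one is null.
function coalesce<T>(next: T | null, prev: T | null): T | null {
  return next !== null ? next : prev
}

function planWrite(
  outcomes: ExtractorOutcome[],
  prev: PolicyFields,
  next: PolicyFields,
): WritePlan {
  // Any extractor failure means the result set may be incomplete,
  // so nothing destructive is planned.
  if (outcomes.some((o) => !o.ok)) return { kind: 'refresh-only' }
  return {
    kind: 'replace',
    policy: {
      pvrEnabled: coalesce(next.pvrEnabled, prev.pvrEnabled),
      policyUrl: coalesce(next.policyUrl, prev.policyUrl),
    },
  }
}
```

Keeping the decision separate from the transaction makes the "never clear prior values on a partial run" property testable without a database.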

Deploy wiring adds Docker Compose, build scripts, env samples, js-yaml, and npm start/dev scripts on the packages worker task queue.

Reviewed by Cursor Bugbot for commit 2ece757. Bugbot is set up for automated code reviews on this repo.

mbani01 added 10 commits June 29, 2026 13:17
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
…CM-1243)

Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
…1243)

Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
mbani01 self-assigned this Jun 30, 2026
Copilot AI review requested due to automatic review settings June 30, 2026 18:04
Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts

Copilot AI left a comment


Pull request overview

This PR adds a security-contacts ingestion pipeline to packages_worker. It introduces a daily Temporal cron workflow that selects stale GitHub repos tied to critical packages, fans each repo out across six extractor families (SECURITY-INSIGHTS, GitHub PVR, SECURITY_CONTACTS, security.txt, SECURITY.md, and package-registry manifests), reconciles and scores the discovered contacts, and transactionally persists the top results into a new security_contacts table plus new policy columns on repos. It also wires up a dedicated security-contacts-worker entrypoint (bin, schedule, Docker Compose, build list, env/config), extracts the shared InstallationPool out of the enricher for GitHub App token round-robin, and adds js-yaml for SECURITY-INSIGHTS parsing.

Changes:

  • New security-contacts module: extractors, scoring, reconciliation, batch processing, transactional write, schedule, workflow, and activity.
  • DB migration adding security_contacts and repos policy/refresh columns; config + env vars for the worker.
  • Supporting refactors/deps: InstallationPool extracted to its own file; js-yaml added; new worker entrypoint, npm scripts, Compose service, and build wiring.

Reviewed changes

Copilot reviewed 35 out of 37 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| .../security-contacts/processBatch.ts | Orchestrates batch fetch, per-repo extractor fan-out, PVR veto, reconcile, write; defines a local concurrency helper |
| .../security-contacts/score.ts | Pure scoring (tier/channel/freshness/corroboration) and confidence bands |
| .../security-contacts/reconcile.ts | Dedupe, identity-link handle→email, sort, cap to 5 contacts |
| .../security-contacts/writeContacts.ts | Transactional replace of contacts + repo policy column refresh |
| .../security-contacts/types.ts | Shared types for contacts, provenance, policies, extractors |
| .../security-contacts/extractors/* | HTTP helpers, PVR, security.txt/md, SECURITY-INSIGHTS, SECURITY_CONTACTS, registry fetchers |
| .../security-contacts/extractors/registry/* | Per-ecosystem manifest fetchers (npm/pypi/maven/cargo/nuget/rubygems/composer) + purl parser |
| .../security-contacts/{workflows,schedule,activities,githubToken}.ts | Temporal workflow, cron schedule, activity, cached GitHub token pool |
| .../bin/security-contacts-worker.ts | New worker entrypoint (init → schedule → start) |
| .../enricher/installationPool.ts & runEnrichmentLoop.ts | Extracts InstallationPool to its own file and removes the inline copy |
| .../src/{config,activities,workflows/index}.ts | Registers config getter, activity, and workflow |
| backend/src/osspckgs/migrations/V1782950400__security_contacts.sql | New table, indexes, and repos columns |
| backend/.env.dist.*, scripts/..., package.json, pnpm-lock.yaml | Env vars, Compose service, build list, npm scripts, js-yaml dependency |
Files not reviewed (1)
  • pnpm-lock.yaml: Generated file
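The reconcile step summarized in the file list (dedupe, sort, cap to 5) could look roughly like the sketch below. The `ScoredContact` shape, the normalization rule (trim plus lowercase), and the "keep the higher score" tie-break are assumptions; the handle→email identity linking mentioned in the summary is left out because the page gives no detail on it.

```typescript
// Hypothetical contact shape; the PR's real types live in security-contacts/types.ts.
type ScoredContact = { channel: string; value: string; score: number }

// The summary says results are capped to 5 contacts.
const MAX_CONTACTS = 5

function reconcile(contacts: ScoredContact[]): ScoredContact[] {
  const best = new Map<string, ScoredContact>()
  for (const c of contacts) {
    // Assumed normalization: same channel and case-insensitive value are one contact.
    const key = `${c.channel}:${c.value.trim().toLowerCase()}`
    const seen = best.get(key)
    if (seen === undefined || c.score > seen.score) best.set(key, c)
  }
  // Highest score first, then keep only the top MAX_CONTACTS.
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, MAX_CONTACTS)
}
```

Lowercasing is reasonable for email addresses but lossy for URL paths, so a real implementation would likely normalize per channel.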


Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts Outdated
Comment thread services/apps/packages_worker/src/security-contacts/score.ts
Comment thread services/apps/packages_worker/src/security-contacts/extractors/securityMd.ts Outdated
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Copilot AI left a comment


Pull request overview

Copilot reviewed 35 out of 37 changed files in this pull request and generated 5 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts Outdated
Comment thread services/apps/packages_worker/src/security-contacts/extractors/securityTxt.ts Outdated
Comment thread services/apps/packages_worker/src/security-contacts/reconcile.ts
Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts Outdated
Comment thread services/apps/packages_worker/src/security-contacts/score.ts
mbani01 added 2 commits July 1, 2026 12:46
…nstants

Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Copilot AI review requested due to automatic review settings July 1, 2026 12:34

Copilot AI left a comment


Pull request overview

Copilot reviewed 35 out of 37 changed files in this pull request and generated 3 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts
Comment thread services/apps/packages_worker/src/security-contacts/score.ts
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Comment thread services/apps/packages_worker/src/security-contacts/githubToken.ts
mbani01 requested a review from themarolt July 1, 2026 14:11
Copilot AI review requested due to automatic review settings July 1, 2026 14:16

Copilot AI left a comment


Pull request overview

Copilot reviewed 35 out of 37 changed files in this pull request and generated 5 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts
Comment thread services/apps/packages_worker/src/config.ts
Comment thread services/apps/packages_worker/src/security-contacts/score.ts
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Copilot AI review requested due to automatic review settings July 1, 2026 14:50

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


Reviewed by Cursor Bugbot for commit 082f6a5.

Comment thread services/apps/packages_worker/src/security-contacts/processBatch.ts

Copilot AI left a comment


Pull request overview

Copilot reviewed 35 out of 37 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment on lines +112 to +117
const declaredAt: string | undefined =
  typeof root.header?.['last-updated'] === 'string'
    ? root.header['last-updated']
    : typeof root.header?.['last-reviewed'] === 'string'
      ? root.header['last-reviewed']
      : undefined
Comment on lines +14 to +16
function isBlockedHost(h: string): boolean {
  return h === 'localhost' || h === '::1' || h === '0.0.0.0' || h.startsWith('127.')
}
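The check quoted above covers loopback names and addresses. For comparison only, and not as the PR's implementation, a stricter variant might also reject private and link-local IPv4 ranges. `parseIpv4` and `isBlockedHostSketch` are hypothetical names, and the sketch ignores IPv6 beyond `::1`, alternative IP encodings (decimal, hex), and whatever a hostname resolves to at request time.

```typescript
// Parses a dotted-quad IPv4 literal into its four octets, or returns null.
function parseIpv4(host: string): number[] | null {
  const parts = host.split('.')
  if (parts.length !== 4) return null
  const octets: number[] = []
  for (const p of parts) {
    if (!/^\d{1,3}$/.test(p)) return null
    const n = Number(p)
    if (n > 255) return null
    octets.push(n)
  }
  return octets
}

// Hypothetical stricter host check: loopback, "this network", RFC 1918
// private ranges, and link-local (which includes cloud metadata addresses).
function isBlockedHostSketch(host: string): boolean {
  const h = host.toLowerCase()
  if (h === 'localhost' || h === '::1') return true
  const ip = parseIpv4(h)
  if (ip === null) return false // not an IPv4 literal; DNS is out of scope here
  const a = ip[0]
  const b = ip[1]
  return (
    a === 0 ||
    a === 127 ||
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  )
}
```

Parsing the octets also avoids a side effect of prefix matching: a public hostname such as `127.example.com` starts with `127.` but is not an IP literal.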
Comment on lines +49 to +54
export function getSecurityContactsConfig() {
  return {
    // Sent on all registry calls; crates.io rejects requests without an identifying UA.
    userAgent: requireEnv('SECURITY_CONTACTS_USER_AGENT'),
  }
}
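As a rough illustration of the fail-fast pattern in the quoted getter, the sketch below validates required and optional variables at load time. It is not the PR's code: this `requireEnv` takes the environment as an argument (the quoted one takes only a name), `positiveIntEnv` and `loadSecurityContactsConfig` are hypothetical, and `SECURITY_CONTACTS_CONCURRENCY` with a default of 4 is an invented name and value. Only `SECURITY_CONTACTS_USER_AGENT` appears in the source.

```typescript
type Env = Record<string, string | undefined>

// Throws when a required variable is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Returns the fallback when unset; rejects values that are not positive integers.
function positiveIntEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name]
  if (raw === undefined || raw.trim() === '') return fallback
  const parsed = Number(raw)
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`)
  }
  return parsed
}

function loadSecurityContactsConfig(env: Env) {
  return {
    userAgent: requireEnv(env, 'SECURITY_CONTACTS_USER_AGENT'),
    // Hypothetical variable name and default, for illustration only.
    concurrency: positiveIntEnv(env, 'SECURITY_CONTACTS_CONCURRENCY', 4),
  }
}
```

Taking the environment as a parameter keeps the loader testable; in a worker it would typically be called once with `process.env` at startup.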
Comment on lines +94 to +97
export function scoreContact(
  contact: RawContact,
  now: Date = new Date(),
): { score: number; confidence: ConfidenceBand } {
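The signature above injects `now` with a default, which keeps freshness scoring deterministic under test. The sketch below shows one way a function with that shape could combine tier, freshness, and corroboration. Every weight, threshold, field name, and band label here is an assumption made for illustration; none of them come from the PR.

```typescript
// Hypothetical types and weights; the PR's real definitions are not shown on this page.
type ConfidenceBand = 'high' | 'medium' | 'low'
type RawContact = { tier: 1 | 2 | 3; declaredAt?: string; corroboratedBy: number }

const MS_PER_DAY = 24 * 60 * 60 * 1000
const TIER_BASE: Record<1 | 2 | 3, number> = { 1: 60, 2: 40, 3: 20 }

function scoreContactSketch(
  contact: RawContact,
  now: Date = new Date(),
): { score: number; confidence: ConfidenceBand } {
  let score = TIER_BASE[contact.tier]

  // Freshness bonus: declared or reviewed within the last year.
  if (contact.declaredAt !== undefined) {
    const declared = Date.parse(contact.declaredAt)
    if (!Number.isNaN(declared)) {
      const ageDays = (now.getTime() - declared) / MS_PER_DAY
      if (ageDays >= 0 && ageDays <= 365) score += 20
    }
  }

  // Corroboration bonus: 10 points per extra source, at most two sources counted.
  score += Math.min(Math.max(contact.corroboratedBy, 0), 2) * 10

  const confidence: ConfidenceBand = score >= 80 ? 'high' : score >= 50 ? 'medium' : 'low'
  return { score, confidence }
}
```

A caller in a test would pass a fixed date, for example `scoreContactSketch(contact, new Date('2026-07-01T00:00:00Z'))`, while production code relies on the default.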
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
4 participants