A self-hostable research and evidence collection archive
Latest commit 10152c765c by Nic Weyand, 2026-05-27 13:16:15 -04:00: Merge pull request 'Frontend overhaul + auto-update pipeline' (#1) from feat/archive-frontend-overhaul into main

argand-archive

A comprehensive, open-source template for building forensic-grade archival sites. Ships with automated news monitoring, evidence preservation, LLM-powered editorial pipelines, and full-text search — all local-first, no cloud dependencies.

Built for journalists, researchers, human rights organizations, and anyone who needs to document events with verifiable evidence that can't be scrubbed or tampered with.

Quick Start

git clone https://git.argand.org/nicweyand/argand-archive my-archive
cd my-archive
pnpm install
# Edit site.config.ts with your site details
# Add markdown files to content/entries/
pnpm dev     # Development server at localhost:5173
pnpm build   # Production build to build/
pnpm preview # Preview production build

What Ships

Display & Navigation

  • Full-text search — Argand Mini Search (WASM, client-side, instant)
  • Map — Self-contained SVG choropleth built from entry location data (no Leaflet/OSM tiles, so it is CSP- and Tor-safe)
  • Timeline — Day-based chronological timeline with filters and view modes
  • Statistics — SVG chart dashboard (severity, category, temporal, tags)
  • Citations — Chicago, APA, Bluebook, BibTeX, RIS export per entry
  • Collections — Curated entry groupings
  • Tag definitions — Described tags with index page
  • Download — Site archive as tar.gz + ZIP with SHA-256 checksums
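
To make the per-entry citation export concrete, here is a minimal sketch that turns a few frontmatter fields into a BibTeX record. The EntryMeta type, the toBibtex function, the key format, and the choice of @misc are assumptions made for this example, not the template's actual exporter.

```typescript
// Hypothetical subset of an entry's metadata used for citation export.
interface EntryMeta {
  slug: string;
  title: string;
  date: string; // ISO date, e.g. "2025-01-01"
  author: string;
  url: string;
}

// Escape a few characters that are special inside a BibTeX field value.
function escapeBibtex(value: string): string {
  return value.replace(/([{}%&#_$])/g, "\\$1");
}

// Build a @misc record; the citation key is the slug stripped to alphanumerics.
function toBibtex(entry: EntryMeta): string {
  const key = entry.slug.replace(/[^A-Za-z0-9]/g, "");
  const year = entry.date.slice(0, 4);
  return [
    `@misc{${key},`,
    `  title = {${escapeBibtex(entry.title)}},`,
    `  author = {${escapeBibtex(entry.author)}},`,
    `  year = {${year}},`,
    `  url = {${entry.url}},`,
    `}`,
  ].join("\n");
}
```

The other listed formats (Chicago, APA, Bluebook, RIS) could follow the same pattern: one pure function per format over the same entry metadata.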

Data Feeds & API

  • RSS / JSON / CSV — Standard export feeds
  • Researcher API — Structured JSON endpoints (/api/entries, /api/stats, /api/export/bibtex, /api/export/ris)

Automated News Pipeline (news-pipeline/)

A 7-stage LLM-powered editorial pipeline that monitors news sources, corroborates facts, and publishes entries — all via local Ollama models:

  1. Intake — RSS feeds, GDELT news API, social media, watchlist monitoring, CC-News
  2. Triage — Small model classifies relevance (biased toward inclusion)
  3. Extract — Mid-range model extracts structured claims from articles
  4. Verify — Large model corroborates across multiple independent sources with circular source detection
  5. Draft — Large model generates publication-ready markdown
  6. Review — Different model performs 5 editorial checks (factual, sources, schema, legal, style)
  7. Publish — Validation battery, random deep audit, deploy

Configure in news-pipeline/pipeline.config.mjs. Run with pnpm pipeline.
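
The staged flow above can be pictured as a chain of functions in which any stage may drop an item. The sketch below is a simplified, synchronous illustration; the PipelineItem shape, the runPipeline helper, the stub stages, and the two-source threshold are all hypothetical, and real stages would call local Ollama models asynchronously.

```typescript
// Hypothetical item shape passed between stages.
interface PipelineItem {
  url: string;
  corroboratingSources: number;
  notes: string[];
}

// A stage returns the (possibly enriched) item, or null to drop it.
type Stage = { name: string; run: (item: PipelineItem) => PipelineItem | null };

interface RunResult {
  item: PipelineItem | null; // null when a stage rejected the item
  trail: string[]; // names of the stages attempted, in order
}

// Pass one item through the stages in order, stopping at the first rejection.
function runPipeline(stages: Stage[], input: PipelineItem): RunResult {
  const trail: string[] = [];
  let current = input;
  for (const stage of stages) {
    trail.push(stage.name);
    const next = stage.run(current);
    if (next === null) return { item: null, trail };
    current = next;
  }
  return { item: current, trail };
}

// Stub stages standing in for three of the seven real ones.
const exampleStages: Stage[] = [
  // Triage is biased toward inclusion, so this stub keeps everything.
  { name: "triage", run: (item) => item },
  // Assumed rule: require at least two independent sources.
  { name: "verify", run: (item) => (item.corroboratingSources >= 2 ? item : null) },
  { name: "draft", run: (item) => ({ ...item, notes: [...item.notes, "drafted"] }) },
];
```

Recording the trail alongside the result is one way to support the per-entry audit trail mentioned under verification transparency.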

Evidence Preservation (evidence-keeper/)

Forensic-grade evidence capture and cryptographic proof:

  • Capture tools — WARC (wget), HTML (monolith), media (yt-dlp) — auto-detected, progressive enhancement
  • Cryptographic proofs — SHA-256 hashing + RFC 3161 timestamps (FreeTSA) + Sigstore/Rekor transparency log
  • Account monitoring — RSS/Atom → ActivityPub → Bluesky AT Protocol → generic API → scraping (priority order)
  • Scrub detection — Monitors watchlisted URLs for changes, classifies severity, submits to Wayback Machine on deletion
  • Storage adapters — Local filesystem + rsync replication (extensible interface for S3, SFTP, etc.)
  • GPG-signed manifests — Batch audit trail, publishable to git

Configure in evidence-keeper/evidence.config.mjs. Run with pnpm evidence.
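
One reading of the account-monitoring priority order is a fallback chain: try the preferred method first and move down the list only when a method is unavailable. The sketch below illustrates that reading; the Method names, the Adapter signature, and monitorAccount are hypothetical and not taken from evidence-keeper.

```typescript
// Monitoring methods in the priority order listed above.
type Method = "rss" | "activitypub" | "bluesky" | "api" | "scrape";
const PRIORITY: Method[] = ["rss", "activitypub", "bluesky", "api", "scrape"];

// Hypothetical adapter: returns post URLs, or null when the method
// is not available for the given account.
type Adapter = (account: string) => string[] | null;

// Try each configured method in priority order; return the first that yields a result.
function monitorAccount(
  account: string,
  adapters: Partial<Record<Method, Adapter>>,
): { method: Method; posts: string[] } | null {
  for (const method of PRIORITY) {
    const adapter = adapters[method];
    if (!adapter) continue;
    const posts = adapter(account);
    if (posts !== null) return { method, posts };
  }
  return null;
}
```

Keeping scraping last means the most fragile method is used only when no structured feed exists.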

LLM Analysis Tools

On-demand analysis powered by local Ollama models:

  • Pattern detection — Cross-entry pattern analysis, escalation trends, clustering (pnpm analyze:patterns)
  • Legal framework mapping — Maps entries to applicable statutes, treaties, case law (pnpm analyze:legal)
  • Source reliability scoring — Tracks publisher accuracy over time (pnpm analyze:sources)

Editorial & Collaboration

  • Decap CMS — Web-based editor at /admin/ with Forgejo/Gitea OAuth
  • Multi-editor workflow — Draft → Review → Published status, editorial dashboard, Decap editorial workflow (branch-per-draft + PR review)
  • FOIA tracking — Track FOIA requests and legal holds per entry
  • Verification transparency — Per-entry provenance display showing pipeline audit trail

Source Health & Quality

  • Source URL checker — Scans all entries, verifies every source URL, reports dead/redirected/live (pnpm check:sources)
  • Auto-repair: --fix follows redirects and searches Google News / DuckDuckGo for replacement URLs (tier-1 outlets only)
  • Source health dashboard: /source-health route showing URL status across all entries
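
The dead/redirected/live report can be thought of as a mapping from HTTP status codes to three buckets. The sketch below is an assumption about how such a classifier might look, not the checker's actual logic; it presumes the request was made without following redirects automatically, and it uses null for a network failure or timeout.

```typescript
type SourceStatus = "live" | "redirected" | "dead";

// Classify an HTTP status code; null stands for a network failure or timeout.
function classifyStatus(status: number | null): SourceStatus {
  if (status === null) return "dead";
  if (status >= 200 && status < 300) return "live";
  if (status >= 300 && status < 400) return "redirected";
  return "dead";
}

// Tally statuses across many source URLs, e.g. to feed a dashboard.
function summarize(statuses: (number | null)[]): Record<SourceStatus, number> {
  const counts: Record<SourceStatus, number> = { live: 0, redirected: 0, dead: 0 };
  for (const status of statuses) counts[classifyStatus(status)] += 1;
  return counts;
}
```

A real checker would likely treat some codes separately (for example 429 rate limiting) rather than counting them as dead.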

Resilience & Security

  • Threat model — Formal threat model document (docs/threat-model.md)
  • Deployment guide — Single server, multi-region, Tor hidden service (docs/deployment-guide.md)
  • Mirror support — Configure mirror URLs and .onion address in footer
  • Warrant canary — Optional canary URL in site config
  • Accessibility — pa11y-ci WCAG 2.1 AA testing (pnpm test:a11y)

Internationalization

  • i18n — Locale system with translation keys, RTL support, content language tagging
  • Extensible — Add locale files in src/lib/i18n/locales/
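
As a rough illustration of translation keys with a fallback and RTL handling, consider the sketch below. The locale tables, the key names, the t and textDirection helpers, and the fallback to English are all hypothetical; the template's real locale files live in src/lib/i18n/locales/ and may be structured differently.

```typescript
// Hypothetical locale tables keyed by translation key.
const locales: Record<string, Record<string, string>> = {
  en: { "nav.archive": "Archive", "nav.search": "Search" },
  ar: { "nav.archive": "الأرشيف" },
};

// Languages written right to left (illustrative, not exhaustive).
const RTL_LANGS = new Set(["ar", "fa", "he", "ur"]);

// Look up a key in the requested locale, then in English, then return the key itself.
function t(locale: string, key: string): string {
  return locales[locale]?.[key] ?? locales["en"]?.[key] ?? key;
}

// Value for the HTML dir attribute.
function textDirection(locale: string): "rtl" | "ltr" {
  return RTL_LANGS.has(locale) ? "rtl" : "ltr";
}
```

Falling back to the key itself keeps missing translations visible during development instead of rendering empty strings.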

Configuration

Everything is configured in site.config.ts:

export default {
  name: 'My Archive',
  tagline: 'Documenting what matters',
  description: '...',
  url: 'https://my-archive.org',
  author: 'Your Name',
  startDate: '2024-01-01',
  contentUnit: { singular: 'incident', plural: 'incidents', slug: 'incidents' },
  categories: { /* your categories */ },
  severityLevels: { /* your severity scale */ },
  theme: { /* colors */ },
  features: {
    search: true, citations: true, collections: false,
    map: true, timeline: true, statistics: true,
    download: true, tagDefinitions: true, evidence: true,
    rss: true, json: true, csv: true,
  },
  // nav, footer, i18n, resilience...
}

Features are opt-in. Disabled features don't appear in navigation and their routes return 404.
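
A minimal sketch of how this gating could work, assuming a flat map of feature flags like the one in site.config.ts. The NavItem shape and the visibleNav and routeStatus helpers are hypothetical names for illustration, not the template's code.

```typescript
// Hypothetical flat map of feature flags, as in the features block of site.config.ts.
type Features = Record<string, boolean>;

interface NavItem {
  label: string;
  href: string;
  feature?: string; // flag that controls this item; omitted for always-on pages
}

// Keep nav items whose feature is enabled; items without a feature are always shown.
function visibleNav(items: NavItem[], features: Features): NavItem[] {
  return items.filter((item) => item.feature === undefined || features[item.feature] === true);
}

// HTTP status a feature route should answer with.
function routeStatus(feature: string, features: Features): 200 | 404 {
  return features[feature] === true ? 200 : 404;
}
```

In a SvelteKit load function, the 404 case would typically be raised with the error helper from @sveltejs/kit rather than returned as a number.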

Content

Entries go in content/entries/ as markdown with YAML frontmatter:

---
title: "Entry Title"
date: 2025-01-01
lastUpdated: 2025-01-01
description: "One-line description"
summary: "2-3 sentence summary"
category: your_category
severity: your_level
ongoing: true
tags: [tag1, tag2]
sources:
  - url: https://reuters.com/article
    title: "Article title"
    publisher: Reuters
---

Your markdown content here.

Optional fields: location (lat/lng for map), timeline (event array), legalHold, foiaRequests, language, status (draft/review/published), relatedEntries.
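
To show the kind of checks a frontmatter validator performs, here is a dependency-free sketch. The template lists Zod in its stack; this example uses plain TypeScript so it runs on its own, and its rules (which fields are required, the URL check) are assumptions rather than the template's actual schema.

```typescript
// Fields shown in the example frontmatter, treated here as required (an assumption).
const REQUIRED = ["title", "date", "description", "summary", "category", "severity", "sources"] as const;

// Status values named in the optional-fields list.
const STATUSES = ["draft", "review", "published"];

// Return a list of human-readable problems; an empty list means the entry passed.
function validateEntry(fm: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of REQUIRED) {
    if (fm[field] === undefined) problems.push(`missing field: ${field}`);
  }
  const status = fm["status"];
  if (status !== undefined && !STATUSES.includes(String(status))) {
    problems.push(`invalid status: ${String(status)}`);
  }
  const sources = fm["sources"];
  if (Array.isArray(sources)) {
    sources.forEach((source, i) => {
      const url: unknown = source?.url;
      if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
        problems.push(`sources[${i}].url must be an http(s) URL`);
      }
    });
  }
  return problems;
}
```

Returning all problems at once, rather than failing on the first, makes it easier to fix an entry in a single pass.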

Scripts

Command                   Description
pnpm dev                  Development server
pnpm build                Production build
pnpm check                TypeScript/Svelte check
pnpm test                 Run tests
pnpm pipeline             Run news-pipeline
pnpm evidence             Run evidence-keeper
pnpm evidence:monitors    Run account monitors only
pnpm check:sources        Check all source URLs
pnpm check:sources:fix    Check + auto-repair dead URLs
pnpm analyze:patterns     LLM pattern detection
pnpm analyze:legal        LLM legal framework mapping
pnpm analyze:sources      Source reliability scoring
pnpm build:archive        Generate downloadable site archive
pnpm test:a11y            Accessibility testing (WCAG 2.1 AA)
pnpm quality              Full quality gate

Template Sync

Archives built from this template can pull upstream updates:

bash scripts/sync-from-template.sh

This merges template changes while preserving site-specific config, content, and customizations. A convenience script syncs all downstream archives at once: bash ~/RustroverProjects/sync-all-archives.sh.

Tech Stack

SvelteKit 2, Svelte 5, TypeScript, Tailwind v4, adapter-static, Zod, Vitest, Argand Mini Search WASM, Ollama (local LLM), gray-matter, rss-parser, pa11y-ci

Tests

176 tests across 21 test files:

  • News pipeline: 75 tests (queue, ollama, dedup, config, RSS, triage, extract, verify, draft, review, publish, integration)
  • Evidence keeper: 101 tests (tools, config, storage, hashing, timestamps, capture, scrub detection, monitors, integration)

Documentation

  • docs/threat-model.md — Formal threat model
  • docs/deployment-guide.md — Deployment and hardening guide
  • docs/editorial-workflow.md — Multi-editor workflow guide
  • docs/i18n.md — Internationalization guide
  • docs/api.md — Researcher API reference
  • docs/accessibility.md — Accessibility testing guide

License

Public domain. No copyright restrictions.