Paper Auto-Index Standard

AI Craftspeople Guild · April 2026 · ACG-STD-AUTOPARSE-2026

This document defines how Guild publications are automatically parsed, classified, and listed on the white papers page when submitted via Pull Request. The document uses its own format — if the parser can read this file, the standard is working.

How it works

An author writes a paper. They add a small metadata block at the top. They open a Pull Request. A GitHub Action reads the metadata, assigns an ID if needed, extracts keywords if missing, rebuilds the index page, and posts a summary on the PR. The author never touches the index page. The action is the librarian.

Part 1 — The frontmatter block

Every publication must include a metadata block. The fields are the same across file types; the enclosing syntax differs.

For HTML files

<!--
acg-paper:
  id: ACG-WP-005-2026
  type: white-paper
  title: "My Paper Title"
  author: "Jane Smith (Affiliation)"
  date: 2026-04-17
  status: published
  tags: [ai-safety, governance]
  abstract: "A one-paragraph description for the index card."
-->

For Markdown files

---
acg-paper:
  id: ACG-WP-005-2026
  type: white-paper
  title: "My Paper Title"
  author: "Jane Smith"
  date: 2026-04-17
  status: published
  tags: [ai-safety, governance]
  abstract: "A one-paragraph description for the index card."
---

For PDF files

Place a companion sidecar file next to the PDF, sharing its base name: my-paper.meta.yml alongside my-paper.pdf. The sidecar carries the same fields, plus a file: field pointing to the PDF.
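To illustrate how the enclosures described in this part might be located, here is a minimal Python sketch. It is not the Guild's parse-papers.js. The function names extract_block, parse_fields, and sidecar_for, and both regular expressions, are assumptions made for this example. parse_fields handles only the flat subset of YAML shown in the examples above and stands in for a real YAML parser.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed patterns: the standard names the enclosures, not the regexes.
HTML_BLOCK = re.compile(r"<!--\s*(acg-paper:.*?)\s*-->", re.DOTALL)
MD_BLOCK = re.compile(r"\A---\s*\n(acg-paper:.*?)\n---", re.DOTALL)


def extract_block(text: str, suffix: str) -> Optional[str]:
    """Return the raw acg-paper block from an .html or .md file, or None."""
    pattern = HTML_BLOCK if suffix == ".html" else MD_BLOCK
    match = pattern.search(text)
    return match.group(1) if match else None


def parse_fields(block: str) -> dict:
    """Parse the flat subset used in the examples (scalars and inline lists)."""
    fields = {}
    for line in block.splitlines()[1:]:  # skip the "acg-paper:" line itself
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            fields[key] = [t.strip() for t in value[1:-1].split(",") if t.strip()]
        else:
            fields[key] = value.strip('"')
    return fields


def sidecar_for(pdf_path: str) -> Path:
    """Map my-paper.pdf to my-paper.meta.yml in the same directory."""
    pdf = Path(pdf_path)
    return pdf.with_name(pdf.stem + ".meta.yml")
```

Splitting each line on the first colon only means a quoted title that itself contains a colon stays intact. Anything beyond this flat shape (multi-line strings, nested maps) would need a full YAML parser.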

Part 2 — Field definitions

Required fields

If any of these is missing, the PR is rejected with a warning comment.

title — the name of the paper
author — who wrote it
type — what kind of publication it is
abstract — the description shown on the index card
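The required-field rule can be sketched in a few lines of Python. This is an illustrative check, not the Guild's implementation: the names missing_required and rejection_comment, and the wording of the comment, are hypothetical. The sketch also assumes that a field present but blank counts as missing, which the standard does not state.

```python
REQUIRED = ("title", "author", "type", "abstract")


def missing_required(meta: dict) -> list:
    """Return the required field names that are absent or blank."""
    return [name for name in REQUIRED if not str(meta.get(name) or "").strip()]


def rejection_comment(meta: dict):
    """Return a warning message for the PR, or None when nothing is missing."""
    missing = missing_required(meta)
    if not missing:
        return None
    # Hypothetical wording; the standard does not define the comment text.
    return "Paper rejected: missing required field(s): " + ", ".join(missing)
```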

Auto-filled fields

The action fills in any of these fields that the author leaves out.

id — auto-increment from the highest existing ACG-WP-NNN
date — set to today's date
status — defaults to draft
tags — auto-extracted from title and abstract against the vocabulary
slug — derived from the filename
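A minimal Python sketch of the auto-fill step follows. Several details are assumptions, not statements of the standard: the ID is taken to have the shape ACG-WP-NNN-YYYY (inferred from the example ACG-WP-005-2026), the slug is taken to be the filename without its extension, and the names next_id and auto_fill are hypothetical. Tag extraction is left out here because it depends on the vocabulary.

```python
import re
from datetime import date
from pathlib import Path

# Assumed ID shape: ACG-WP-NNN followed by a year, as in ACG-WP-005-2026.
ID_PATTERN = re.compile(r"^ACG-WP-(\d{3})")


def next_id(existing_ids, year: int) -> str:
    """Increment the highest existing ACG-WP number; other prefixes are ignored."""
    numbers = []
    for paper_id in existing_ids:
        match = ID_PATTERN.match(paper_id)
        if match:
            numbers.append(int(match.group(1)))
    number = max(numbers, default=0) + 1
    return f"ACG-WP-{number:03d}-{year}"


def auto_fill(meta: dict, filename: str, existing_ids, today: date) -> dict:
    """Return a copy of meta with id, date, status and slug filled in."""
    filled = dict(meta)
    if not filled.get("id"):
        filled["id"] = next_id(existing_ids, today.year)
    if not filled.get("date"):
        filled["date"] = today.isoformat()
    if not filled.get("status"):
        filled["status"] = "draft"
    filled["slug"] = Path(filename).stem  # assumption: filename minus extension
    return filled
```

Passing today in as a parameter, instead of calling date.today() inside the function, keeps the step deterministic and easy to test.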

Enum values

type: white-paper, position-paper, experimental, research-note, knowledge, standard.

status: draft, review, published, archived.
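The two enums lend themselves to a simple membership check. The Python sketch below is illustrative only: enum_errors is a hypothetical name, and the sketch assumes it runs after auto-fill, so a missing status is reported as an error and is not silently defaulted.

```python
TYPES = ("white-paper", "position-paper", "experimental",
         "research-note", "knowledge", "standard")
STATUSES = ("draft", "review", "published", "archived")


def enum_errors(meta: dict) -> list:
    """Return one message per field whose value is outside its enum."""
    errors = []
    for field, allowed in (("type", TYPES), ("status", STATUSES)):
        value = meta.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not one of {', '.join(allowed)}")
    return errors
```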

Part 3 — Tag vocabulary

Tags are lowercase kebab-case. The parser recognises the known tags listed here and auto-extracts them from the title and abstract when the author omits the tags field. Authors may add tags that are not on this list — the vocabulary grows organically.

Core: ai-safety, governance, testing, ethics, architecture, automation, calibration, peer-review, culture, epistemics, falsification, consciousness, harness, federation, healthcare, standards.

Technical: blockchain, konomi, isa-88, isa-95, packml, scada, opc-ua, guild-chain, guild-ops, webrtc, p2p, github-actions, ci-cd, indexing, white-papers.

Meta: knowledge-about-knowledge, systems-thinking, cognitive-apprenticeship, triad-engine, toast, occam.
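The standard does not spell out how tags are matched against the title and abstract. One possible rule, sketched below in Python, treats a tag as present when its words appear consecutively as whole words in the lowercased text. That matching rule and the name extract_tags are assumptions for illustration; the vocabulary list is copied from the three groups above.

```python
import re

# Known tags, copied from the Core, Technical and Meta groups.
VOCABULARY = [
    "ai-safety", "governance", "testing", "ethics", "architecture", "automation",
    "calibration", "peer-review", "culture", "epistemics", "falsification",
    "consciousness", "harness", "federation", "healthcare", "standards",
    "blockchain", "konomi", "isa-88", "isa-95", "packml", "scada", "opc-ua",
    "guild-chain", "guild-ops", "webrtc", "p2p", "github-actions", "ci-cd",
    "indexing", "white-papers",
    "knowledge-about-knowledge", "systems-thinking", "cognitive-apprenticeship",
    "triad-engine", "toast", "occam",
]


def extract_tags(title: str, abstract: str) -> list:
    """Return known tags whose words appear, in order, in the title or abstract."""
    # Lowercase and collapse every run of non-alphanumerics to one space,
    # so "AI-Safety", "AI safety" and "ai_safety" all normalise to "ai safety".
    text = re.sub(r"[^a-z0-9]+", " ", f"{title} {abstract}".lower()).strip()
    padded = f" {text} "
    return [tag for tag in VOCABULARY if f" {tag.replace('-', ' ')} " in padded]
```

A rule this simple misses inflected forms such as "governing" for governance; whether the Guild's parser does any stemming is not stated here.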

Part 4 — The parser

Scan. Find all files that contain paper metadata.
Parse. Extract the YAML block and validate required fields.
Fill. Auto-fill id, date, status, tags, slug.
Sort. Group by type (enum order), then newest first by date, then alphabetical by title.
Generate. Rebuild papers.json and white-papers.html.
Validate. Check that every paper URL resolves, IDs are unique, dates are ISO 8601, and enum values are valid.
Commit. Add the updated files to the PR branch.
Comment. Post a summary on the PR.
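The three-level ordering in the Sort step maps naturally onto a tuple sort key. The Python sketch below is one way to express it; sort_key and sort_papers are hypothetical names, and comparing titles case-insensitively is an assumption the standard does not make explicit.

```python
from datetime import date

TYPES = ("white-paper", "position-paper", "experimental",
         "research-note", "knowledge", "standard")
TYPE_ORDER = {name: position for position, name in enumerate(TYPES)}


def sort_key(paper: dict):
    """Type in enum order, then newest date first, then title A to Z."""
    published = date.fromisoformat(paper["date"])
    # Negating the ordinal day number turns ascending sort into newest-first.
    return (TYPE_ORDER[paper["type"]], -published.toordinal(), paper["title"].casefold())


def sort_papers(papers: list) -> list:
    """Return a new list in index order; the input list is not modified."""
    return sorted(papers, key=sort_key)
```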

Part 5 — The index card template

<article class="paper-card" data-type="{type}" data-tags="{tags joined by space}">
  <div class="paper-meta">
    <span class="paper-author">{author}</span>
    <span class="paper-date">{date formatted as Month YYYY}</span>
    <span class="paper-id">{id}</span>
  </div>
  <div class="paper-type-badge">{Type Label}</div>
  <h3>{title}</h3>
  <p class="paper-abstract">{abstract}</p>
  <div class="paper-tags">
    <span class="tag">{tag1}</span>
  </div>
  <a href="{slug}.html" class="paper-link">Read {Type Label} →</a>
</article>
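Filling the template could look like the Python sketch below. It is illustrative only: render_card and type_label are hypothetical names, and deriving the Type Label by title-casing the enum value ("white-paper" becomes "White Paper") is an assumption. Every interpolated value is HTML-escaped, since titles and abstracts are author-supplied text.

```python
from datetime import date
from html import escape

CARD = """<article class="paper-card" data-type="{type}" data-tags="{tags_attr}">
  <div class="paper-meta">
    <span class="paper-author">{author}</span>
    <span class="paper-date">{month_year}</span>
    <span class="paper-id">{id}</span>
  </div>
  <div class="paper-type-badge">{label}</div>
  <h3>{title}</h3>
  <p class="paper-abstract">{abstract}</p>
  <div class="paper-tags">
{tag_spans}
  </div>
  <a href="{slug}.html" class="paper-link">Read {label} →</a>
</article>"""


def type_label(paper_type: str) -> str:
    """Assumed mapping: 'white-paper' -> 'White Paper'."""
    return paper_type.replace("-", " ").title()


def render_card(paper: dict) -> str:
    """Fill the card template, escaping every author-supplied value."""
    spans = "\n".join(
        f'    <span class="tag">{escape(tag)}</span>' for tag in paper["tags"]
    )
    return CARD.format(
        type=escape(paper["type"]),
        tags_attr=escape(" ".join(paper["tags"])),
        author=escape(paper["author"]),
        # %B yields the English month name under Python's default C locale.
        month_year=date.fromisoformat(paper["date"]).strftime("%B %Y"),
        id=escape(paper["id"]),
        label=escape(type_label(paper["type"])),
        title=escape(paper["title"]),
        abstract=escape(paper["abstract"]),
        tag_spans=spans,
        slug=escape(paper["slug"]),
    )
```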

Part 6 — papers.json

papers.json is the machine-readable index, committed alongside white-papers.html. It is the source of truth: the HTML page is generated from this file.

[
  {
    "id": "ACG-WP-005-2026",
    "type": "white-paper",
    "title": "My Paper Title",
    "author": "Jane Smith",
    "date": "2026-04-17",
    "status": "published",
    "tags": ["ai-safety", "governance"],
    "abstract": "A one-paragraph description.",
    "slug": "my-paper",
    "url": "https://aicraftspeopleguild.github.io/my-paper.html"
  }
]
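Before an index like this is written, two of the checks named in the Validate step (unique IDs and ISO 8601 dates) can be run over the list in memory. The Python sketch below is a hedged illustration: index_errors and dump_index are hypothetical names, the two-space JSON indentation simply mirrors the example above, and the URL check is omitted because it needs network access.

```python
import json
from collections import Counter
from datetime import date


def index_errors(papers: list) -> list:
    """Report duplicate IDs and dates that are not ISO 8601 calendar dates."""
    errors = []
    counts = Counter(paper["id"] for paper in papers)
    errors += [f"duplicate id: {paper_id}" for paper_id, n in counts.items() if n > 1]
    for paper in papers:
        try:
            date.fromisoformat(paper["date"])
        except ValueError:
            errors.append(f"{paper['id']}: date is not ISO 8601: {paper['date']}")
    return errors


def dump_index(papers: list) -> str:
    """Serialise the index as indented JSON with a trailing newline."""
    return json.dumps(papers, indent=2, ensure_ascii=False) + "\n"
```

Keeping non-ASCII characters unescaped (ensure_ascii=False) keeps author names readable in diffs; that is a stylistic choice, not a requirement of the standard.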

Part 7 — The GitHub Action

See .github/workflows/paper-index.yml in this repo.

Part 8 — File layout

repo/
├── .github/
│   ├── workflows/paper-index.yml          the action
│   └── scripts/parse-papers.js            scanner + parser + generator
├── papers.json                            machine-readable index (generated)
├── white-papers.html                      human-readable index (generated)
├── paper-auto-index-standard.html         this paper (has frontmatter)
└── index.html                             site homepage (no frontmatter, ignored)
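Given this layout, a scanner has to pick up papers at the repo root and skip files such as index.html that carry no metadata. The Python sketch below shows one way that could work. It is an assumption-laden illustration, not parse-papers.js: scan is a hypothetical name, and detecting a paper by the literal text acg-paper: is a shortcut that a stricter scanner would replace with a check anchored to the enclosing comment or frontmatter delimiters.

```python
from pathlib import Path

MARKER = "acg-paper:"  # assumed detection rule: the literal key anywhere in the file


def scan(root: str) -> list:
    """Return names of root-level files that carry paper metadata, sorted."""
    found = []
    for path in sorted(Path(root).iterdir()):
        if not path.is_file():
            continue  # skips directories such as .github/
        if path.name.endswith(".meta.yml"):
            found.append(path.name)  # sidecar describing a PDF
        elif path.suffix in {".html", ".md"}:
            if MARKER in path.read_text(encoding="utf-8"):
                found.append(path.name)
    return found
```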

Part 9 — Author checklist

Write your paper as .html, .md, or .pdf.
Add the acg-paper: metadata block.
Fill in title, author, type, abstract.
Leave id blank — the bot assigns it.
Add tags if you want, or leave the field blank.
Place the file at the repo root.
Open a Pull Request.
The bot parses, indexes, and comments automatically.

You never edit white-papers.html by hand. You never assign your own ID. The bot is the librarian. You are the author.

Part 10 — Test spec

See tests/paper-index.test.yml — the canonical ACG-TEST test for the paper-index workflow.

Part 11 — Self-validation

This document has an acg-paper: frontmatter block at the top. If the parser processes this file, it appears in the index as Paper Auto-Index Standard (ACG-STD-AUTOPARSE-2026). The standard indexes itself. The format describes the format. The document is its own test case.