# Health Metrics Dashboard

A tool to upload blood test PDF reports, redact sensitive data, extract metrics via an LLM, and visualize results in a structured health dashboard.

## Purpose

Upload one or more blood test PDFs, optionally redact sensitive data, send the extracted text to an LLM (OpenAI, Anthropic, or Fireworks) for structured extraction, validate the output, and render a multi-category health dashboard with historical comparisons.

## Workflow

### Step 1: Upload PDFs

- User uploads one or more blood test PDF reports via a file input or drag-and-drop zone.
- Uploaded files are listed with filename and size.
- Files are read in the browser (no server upload).

### Step 2: Redact Sensitive Data

- PDF text is extracted first (via PDF.js) and shown in a plain-text view.
- The user selects text spans to redact; selected spans are replaced with `[REDACTED]` inline.
- Keyboard shortcut: press `R` to redact the current selection.
- Undo stack allows reverting individual redactions before confirming.
- This step is optional — a "Skip" button advances without redacting.
- The redacted plain text is what gets sent to the LLM — no PDF rendering or coordinate mapping involved.
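
The redact-and-undo behaviour described above can be sketched as plain string operations. This is a minimal sketch, not the tool's actual code: the names `redactRange` and `createRedactor` are hypothetical, and selection offsets are assumed to be character indices into the extracted text.

```js
// Hypothetical helper: replace the characters in [start, end) with the marker.
function redactRange(text, start, end) {
  return text.slice(0, start) + '[REDACTED]' + text.slice(end)
}

// Hypothetical undo stack: every redaction pushes the previous text,
// so undo() restores exactly one step at a time.
function createRedactor(initialText) {
  let text = initialText
  const history = []
  return {
    get text() { return text },
    redact(start, end) {
      if (end <= start) return text // empty selection: nothing to do
      history.push(text)
      text = redactRange(text, start, end)
      return text
    },
    undo() {
      if (history.length > 0) text = history.pop()
      return text
    },
  }
}
```

Storing whole snapshots keeps undo trivial; for lab reports of a few pages the memory cost should be negligible.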

### Step 3: Configure Provider

- User selects provider: Anthropic, OpenAI, or Fireworks.
- Provider selection is persisted to `localStorage` under `health_provider`.
- If no API key exists for the selected provider, an inline key entry section appears.
- Keys are stored in `localStorage` under `{provider}_api_key`.
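
A possible shape for the storage helpers behind this step is shown below. The function names are hypothetical, and the `storage` parameter is an assumption added so the helpers can be exercised outside a browser; the key names `health_provider` and `{provider}_api_key` come from the description above.

```js
const PROVIDERS = ['anthropic', 'openai', 'fireworks']

// `storage` defaults to the browser's localStorage (assumption: injectable for testing).
function saveProvider(provider, storage = globalThis.localStorage) {
  if (!PROVIDERS.includes(provider)) throw new Error(`Unknown provider: ${provider}`)
  storage.setItem('health_provider', provider)
}

function loadProvider(storage = globalThis.localStorage) {
  return storage.getItem('health_provider') // null when nothing was saved yet
}

function saveApiKey(provider, key, storage = globalThis.localStorage) {
  storage.setItem(`${provider}_api_key`, key.trim())
}

function loadApiKey(provider, storage = globalThis.localStorage) {
  return storage.getItem(`${provider}_api_key`) // null triggers the inline key entry
}
```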

### Step 4: LLM Extraction

- The extracted text (after any redaction) is sent to the selected LLM with the extraction prompt (see below), one call per file.
- The LLM returns a structured JSON object conforming to the metrics schema.
- Metrics not in the schema are surfaced separately so the user can consider extending the schema.

### Step 5: Validation

- The LLM response is validated against the schema using [Valibot](https://github.com/fabian-hiller/valibot).
- Validation errors are shown on the error screen.
- Valid entries are accepted and stored in the session.

### Step 6: Dashboard

- All validated measurement entries are rendered in the dashboard.
- The dashboard builds a historical record across uploads and page loads (see Session & Persistence below).

## Extraction Prompt

The prompt instructs the LLM to:

1. Treat the attached document(s) as blood test laboratory reports.
2. Extract every measurable metric it can identify.
3. Return a **JSON object** `{ "entries": [{ ... }] }` with exactly one entry — each file is sent in a separate LLM call.
4. Only include fields where a value was found — omit fields with no data entirely (reduces response size).
5. For each metric found that does **not** map to the schema, return it in a separate `unknown_metrics` array with: `name`, `value`, `unit`, and `raw_text`.
6. Return valid JSON only, no prose.

## Metrics Schema

```json
{
  "type": "object",
  "properties": {
    "date": { "type": "string", "format": "date" },

    "zinc": {
      "type": "object",
      "properties": {
        "umol_l": { "type": ["number", "null"] }
      }
    },

    "total_testosterone": {
      "type": "object",
      "properties": {
        "nmol_l": { "type": ["number", "null"] },
        "ng_ml": { "type": ["number", "null"] }
      }
    },

    "free_testosterone": {
      "type": "object",
      "properties": {
        "pmol_l": { "type": ["number", "null"] }
      }
    },

    "shbg": {
      "type": "object",
      "properties": {
        "nmol_l": { "type": ["number", "null"] }
      }
    },

    "tsh": {
      "type": "object",
      "properties": {
        "uU_ml": { "type": ["number", "null"] }
      }
    },

    "free_t4": {
      "type": "object",
      "properties": {
        "ng_dl": { "type": ["number", "null"] },
        "pmol_l": { "type": ["number", "null"] }
      }
    },

    "free_t3": {
      "type": "object",
      "properties": {
        "pg_ml": { "type": ["number", "null"] },
        "pmol_l": { "type": ["number", "null"] }
      }
    },

    "folate": {
      "type": "object",
      "properties": {
        "ng_ml": { "type": ["number", "null"] },
        "nmol_l": { "type": ["number", "null"] }
      }
    },

    "vitamin_b12": {
      "type": "object",
      "properties": {
        "pg_ml": { "type": ["number", "null"] },
        "pmol_l": { "type": ["number", "null"] }
      }
    },

    "vitamin_d_25_oh": {
      "type": "object",
      "properties": {
        "ng_ml": { "type": ["number", "null"] },
        "nmol_l": { "type": ["number", "null"] }
      }
    },

    "cholesterol": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "cholesterol_hdl_ratio": {
      "type": ["number", "null"]
    },

    "hdl": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "ldl": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "triglycerides": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "tg_hdl_ratio": {
      "type": ["number", "null"]
    },

    "free_androgen_index": {
      "type": "object",
      "properties": {
        "percent": { "type": ["number", "null"] }
      }
    },

    "luteinizing_hormone": {
      "type": "object",
      "properties": {
        "iu_l": { "type": ["number", "null"] }
      }
    },

    "fsh": {
      "type": "object",
      "properties": {
        "iu_l": { "type": ["number", "null"] }
      }
    },

    "potassium": {
      "type": "object",
      "properties": {
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "calcium": {
      "type": "object",
      "properties": {
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "magnesium": {
      "type": "object",
      "properties": {
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "creatinine": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "umol_l": { "type": ["number", "null"] }
      }
    },

    "gfr": {
      "type": "object",
      "properties": {
        "ml_min": { "type": ["number", "null"] }
      }
    },

    "bun": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "uric_acid": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "umol_l": { "type": ["number", "null"] }
      }
    },

    "asat_got": {
      "type": "object",
      "properties": {
        "u_l": { "type": ["number", "null"] }
      }
    },

    "alat_gpt": {
      "type": "object",
      "properties": {
        "u_l": { "type": ["number", "null"] }
      }
    },

    "gamma_gt": {
      "type": "object",
      "properties": {
        "u_l": { "type": ["number", "null"] }
      }
    },

    "bilirubin": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "umol_l": { "type": ["number", "null"] }
      }
    },

    "lipase": {
      "type": "object",
      "properties": {
        "u_l": { "type": ["number", "null"] }
      }
    },

    "alpha_amylase": {
      "type": "object",
      "properties": {
        "u_l": { "type": ["number", "null"] }
      }
    },

    "iron": {
      "type": "object",
      "properties": {
        "ug_dl": { "type": ["number", "null"] },
        "umol_l": { "type": ["number", "null"] }
      }
    },

    "transferrin": {
      "type": "object",
      "properties": {
        "g_l": { "type": ["number", "null"] }
      }
    },

    "transferrin_saturation": {
      "type": "object",
      "properties": {
        "percent": { "type": ["number", "null"] }
      }
    },

    "ferritin": {
      "type": "object",
      "properties": {
        "ng_ml": { "type": ["number", "null"] }
      }
    },

    "glucose": {
      "type": "object",
      "properties": {
        "mg_dl": { "type": ["number", "null"] },
        "mmol_l": { "type": ["number", "null"] }
      }
    },

    "total_protein": {
      "type": "object",
      "properties": {
        "g_l": { "type": ["number", "null"] }
      }
    },

    "albumin": {
      "type": "object",
      "properties": {
        "g_l": { "type": ["number", "null"] }
      }
    }
  },
  "required": ["date"]
}
```

Unknown metrics returned by the LLM are collected in:

```json
{
  "unknown_metrics": [
    { "name": "string", "value": "number|null", "unit": "string|null", "raw_text": "string|null" }
  ]
}
```

These are surfaced to the user with a prompt to consider extending the schema.

## Marker Definitions

All markers are defined in a flat `MARKER_DEFS` array. Each entry has:

| Field | Description |
|-------|-------------|
| `key` | Property key in the extracted JSON |
| `name` | Display name |
| `pu` | Primary unit key (e.g. `nmol_l`); `null` for scalar metrics |
| `pl` | Primary unit label (e.g. `nmol/l`) |
| `au` | Alt unit key (optional) |
| `al` | Alt unit label (optional) |
| `ref` | Reference range string for display |
| `cat` | Category name |
| `check(v)` | Returns `'ok'`, `'borderline'`, `'high'`, `'low'`, or `null` (for missing values) |

`CATEGORIES` is derived from `MARKER_DEFS` rather than declared separately. `CATEGORY_NAMES`, which is not shown here, is presumably the ordered list of category names:

```js
const CATEGORIES = CATEGORY_NAMES.map(name => ({ name, markers: MARKER_DEFS.filter(m => m.cat === name) }))
```
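
To make the field table and the derivation concrete, here is a small sketch with two marker entries. `CATEGORY_NAMES` is not defined in this document, so an ordered array of category names is assumed. The thresholds in `check` are placeholders for illustration only (the LDL bound mirrors the example note in the Key Highlights section; the SHBG range is made up).

```js
// Assumed: ordered list of category names (not defined in this document).
const CATEGORY_NAMES = ['Hormones', 'Cholesterol & Lipids']

// Illustrative entries only; thresholds are placeholders, not clinical guidance.
const MARKER_DEFS = [
  {
    key: 'shbg', name: 'SHBG', pu: 'nmol_l', pl: 'nmol/l',
    ref: '18-54', cat: 'Hormones',
    check: v => (v == null ? null : v < 18 ? 'low' : v > 54 ? 'high' : 'ok'),
  },
  {
    key: 'ldl', name: 'LDL', pu: 'mmol_l', pl: 'mmol/l', au: 'mg_dl', al: 'mg/dl',
    ref: '<3.37', cat: 'Cholesterol & Lipids',
    check: v => (v == null ? null : v < 3.37 ? 'ok' : 'high'),
  },
]

// Same derivation as above: group markers under each category, in list order.
const CATEGORIES = CATEGORY_NAMES.map(name => ({
  name,
  markers: MARKER_DEFS.filter(m => m.cat === name),
}))
```

Because each marker carries its own `cat`, adding a marker to `MARKER_DEFS` is enough for it to appear in the right dashboard group.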

### Metric Categories

| Category | Metrics |
|---|---|
| Hormones | Total Testosterone, Free Testosterone, SHBG, Free Androgen Index, LH, FSH |
| Thyroid | TSH, Free T4, Free T3 |
| Vitamins & Minerals | Zinc, Folate, Vitamin B12, Vitamin D (25-OH), Potassium, Calcium, Magnesium |
| Cholesterol & Lipids | Cholesterol, HDL, LDL, Triglycerides, Chol/HDL Ratio, TG:HDL Ratio |
| Metabolic & Kidney | Glucose, Creatinine, GFR, BUN, Uric Acid |
| Liver Function & Other | ASAT/GOT, ALAT/GPT, Gamma-GT, Bilirubin, Lipase, Alpha-Amylase, Iron, Transferrin, Transferrin Saturation, Ferritin, Total Protein, Albumin |

## Dashboard

### Page Structure (top to bottom)

1. **Header**: title + timespan summary + action buttons (Download CSV, Add report, Clear all)
2. **Markers**: All metrics grouped by category, each with sparkline and expandable history
3. **Timetable**: All measurement dates as columns, markers as rows, with category filter
4. **Key Highlights & Clinical Summary**: Non-OK markers with findings, status badges, and recommendations
5. **Unknown Metrics**: Metrics extracted by LLM but not in the schema
6. **Footer**: Disclaimer — "Not medical advice. Consult a physician for clinical decisions."

### Markers Section

Each marker row shows:
- Marker name + primary unit label
- Inline sparkline SVG (80×24 px) showing trend across all measurement dates
- Latest value (colored by status) + alt unit below if available
- Reference range (hidden on small screens)
- Status badge: `ok`, `~`, `high`, or `low`
- Chevron (`▸`/`▾`) indicating expand/collapse state

Clicking a row toggles an expand panel showing all historical measurements per date, plus the reference range. Expand state is tracked in a module-level `expandState` object (not a signal), so rows stay open across dashboard re-renders.

### Timetable

- daisyUI `table table-xs` with monospace font
- First column: marker name + unit; subsequent columns: one per measurement date (formatted as `Mon 'YY`)
- Category filter via a `<select>` with the `select-xs` class; changing it rebuilds the rows directly via `buildRows()`, without a signal
- Category group separator: bold top border on first row of each category, category name shown in the name cell
- Color legend below the table

### Key Highlights & Clinical Summary

- daisyUI `table table-sm` with monospace font
- Shows only non-OK markers from the latest entry
- Columns: Category, Finding, Status, Note
- Note column states the status factually with the reference range (e.g. "LDL is above the reference range (<3.37 mmol/l).")
- Status badges: green (Optimal/Normal), amber (Watch), red (Elevated/Low)
- If all markers are in range, shows a green "All measured markers within reference range." message

### Unknown Metrics Panel

- daisyUI `table table-xs`
- Columns: Name, Value, Unit, Raw text
- Accompanied by a note suggesting the user extend the schema

## Session & Persistence

A "session" spans multiple page loads. All validated measurement entries are persisted in `localStorage` under the key `health_metrics_entries` as a JSON array, sorted by date in ascending order.

- **PDFs are never stored** — only the validated, structured data extracted from them.
- **Each entry** is one measurement date's worth of data, conforming to the metrics schema.
- **On page load**, entries are read from `localStorage` and the dashboard renders immediately if any exist.
- **On new upload**, the newly extracted and validated entry is merged into the stored array (de-duplicated by date), then written back to `localStorage`.
- **Manual deletion**: not currently supported per-entry; use "Clear all" to wipe all measurements.
- **Clear all**: a destructive action to wipe `localStorage` and reset the dashboard, with a `confirm()` prompt.

The `localStorage` limit (~5 MB) is not a concern for this data shape: even dozens of blood test entries with ~40 numeric fields each amount to tens of kilobytes at most.

## UI Components

### Upload Screen

- Header with title
- Drag-and-drop zone or file picker (`accept=".pdf"`, multiple files allowed)
- File list with name, size, and remove button per file
- "Extract & Continue →" button — extracts PDF text, then advances to redaction
- If stored entries exist, a "Back to dashboard" link appears below

### Redaction Screen

- Header with title
- Extracted text in a scrollable `<pre>` with `select-text` enabled
- Search input + "Redact all (Enter)" button replaces every occurrence of the search term with `[REDACTED]`; Enter key triggers it from the input
- "Redact selection (R)" button + `R` keyboard shortcut replace selection with `[REDACTED]`
- "Undo" button to revert last redaction (covers both search-replace and selection-redact)
- "Skip" button to advance without redacting
- "Confirm & Continue →" button to finalize and advance to provider selection

### Configure Screen

- Header with title
- Provider toggle: Anthropic / OpenAI / Fireworks (active = `btn-neutral`, inactive = `btn-outline`)
- Selected provider stored in `localStorage` under `health_provider`
- If no API key is stored for the selected provider, an inline section with a password input and a "Save & Run →" button appears
- "Run Extraction →" button starts the extraction if a key is already stored

### Processing Screen

- Header with title
- Centered loading message with animated pulse bar

### Error Screen

- Header with title
- Error message in red
- Two recovery buttons:
  - "Try different provider →" → navigates to `configuring` state (redacted text preserved)
  - "← Start over" → navigates to `upload` state

### Dashboard Screen

See Dashboard section above.

## Dependencies

| Library | Version | How loaded | Purpose |
|---------|---------|------------|---------|
| @tailwindcss/browser | 4 | `<script src>` CDN | Utility CSS |
| daisyui | 5 | `<link>` CDN (`themes.css` + main) | Theme system + component classes |
| spellcaster | 6.0.0 | importmap (esm.sh) | Reactive signals + effects |
| pdfjs-dist | 4.9.155 | importmap (esm.sh) | PDF text extraction in browser |
| valibot | 1 | importmap (esm.sh) | Schema validation of LLM output |

daisyUI is loaded via two `<link>` tags, not importmap:
```html
<link href="https://cdn.jsdelivr.net/npm/daisyui@5/themes.css" rel="stylesheet" />
<link href="https://cdn.jsdelivr.net/npm/daisyui@5" rel="stylesheet" />
```

## Visual Language

- Minimal, text-first interface with almost no decorative chrome.
- Monospace-led typography (`ui-monospace`) applied globally via `body` style.
- daisyUI semantic color tokens (`bg-base-100`, `text-base-content`, etc.) throughout — no raw color values in layout.
- High whitespace density; structural clarity through alignment and spacing over cards/shadows.
- Motion: minimal — `.sparkline-dot` hover transitions, `.marker-row` background transitions.

### Theming (daisyUI v5)

- `data-theme` attribute on `<html>` controls the active theme (`light` or `dark`).
- Theme is always in sync with the system `prefers-color-scheme` — no manual override.
- An inline `<script>` in `<head>` sets `data-theme` before first paint to avoid FOUC.
- The module script adds a `change` listener on `matchMedia` to update `data-theme` live when the OS theme changes mid-session.
- Tailwind dark-mode variant: `@variant dark (&:where([data-theme="dark"], [data-theme="dark"] *))` — enables `dark:` utilities to respond to `data-theme="dark"` rather than a CSS class.

### Status Colors

| Status | Text color | Badge |
|--------|-----------|-------|
| `ok` | green-600 / dark:green-400 | green-100/green-700 |
| `borderline` | amber-500 | amber-100/amber-700 |
| `high` / `low` | red-500 | red-100/red-700 |
| `null` | base-content/50 | base-content/20 |

### Sparklines

- 80×24 px inline SVG per marker row.
- Polyline connecting all data points (subtle, 15% opacity).
- Colored dots: `#22c55e` (ok), `#f59e0b` (borderline), `#ef4444` (high/low), `#94a3b8` (no data).
- Latest measurement dot is rendered at radius 3; historical at radius 2.
- `.sparkline-dot` class has a `0.15s` radius transition for hover effects.

## Key Functions

### `extractTextFromPDFs(files)`
- Accepts a `FileList` and reads each file as an `ArrayBuffer`.
- Uses PDF.js `page.getTextContent()` per page.
- Joins pages within each file with `\n\n---\n\n`.
- Joins files with `====FILE====` separator so `runExtraction` can split them for separate LLM calls.
- Returns full combined string.
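
The separator scheme can be illustrated without PDF.js. In this sketch the input is assumed to be one array of page strings per file (in the tool these would come from `page.getTextContent()`); `combineExtractedText` and `splitFiles` are hypothetical names, and whether the real separator is padded with newlines is not stated here.

```js
const PAGE_SEP = '\n\n---\n\n'
const FILE_SEP = '====FILE===='

// filesPages: string[][] (one array of page texts per file; assumed shape).
function combineExtractedText(filesPages) {
  return filesPages.map(pages => pages.join(PAGE_SEP)).join(FILE_SEP)
}

// Inverse step before extraction: one chunk per file, empty chunks dropped.
function splitFiles(combined) {
  return combined.split(FILE_SEP).map(s => s.trim()).filter(Boolean)
}
```

Each chunk returned by `splitFiles` would then go into its own LLM call.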

### `buildPrompt(text)`
- Constructs the LLM user prompt.
- Inlines `SCHEMA_SUMMARY` (derived at runtime from `MARKER_DEFS`) and the extracted text.
- Instructs model to omit fields with no data (reduces response token count) and return JSON only.
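
One way the prompt assembly might look is sketched below. The prompt wording and the summary format are invented for illustration, and unlike the real `buildPrompt(text)` this version takes the schema summary as a second argument to stay self-contained.

```js
// One line per marker: "key: unit1, unit2", or "key: number" for scalar markers.
function buildSchemaSummary(markerDefs) {
  return markerDefs
    .map(m => `${m.key}: ${[m.pu, m.au].filter(Boolean).join(', ') || 'number'}`)
    .join('\n')
}

// Hypothetical wording; the instructions mirror the Extraction Prompt section.
function buildPrompt(text, schemaSummary) {
  return [
    'The following text is a blood test laboratory report.',
    'Extract every measurable metric and return a JSON object',
    '{"entries": [{...}]} with exactly one entry.',
    'The entry needs a "date" (YYYY-MM-DD). Known fields and unit keys:',
    schemaSummary,
    'Omit fields with no data. Put metrics that do not map to the schema in',
    '"unknown_metrics" as {name, value, unit, raw_text}.',
    'Return valid JSON only, no prose.',
    '',
    text,
  ].join('\n')
}
```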

### `callLLM(provider, apiKey, text)`
- Dispatches to the appropriate API based on `provider`:
  - `openai` → `gpt-4o` with `response_format: { type: 'json_object' }`
  - `anthropic` → `claude-opus-4-6`, `max_tokens: 4096`, requires `anthropic-dangerous-direct-browser-access: true` header
  - `fireworks` → `llama-v3p3-70b-instruct` via OpenAI-compatible endpoint, `max_tokens: 4096` (non-streaming cap)
- Returns raw response string.
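
The dispatch can be sketched as a pure request builder plus a thin `fetch` wrapper. The model ids and options come from the list above; the endpoints and header names follow the providers' public HTTP APIs as far as I know them. `buildLLMRequest` is a hypothetical helper, and this sketch assumes the third argument is the finished prompt.

```js
// Hypothetical helper: returns the URL, headers, and JSON body per provider.
function buildLLMRequest(provider, apiKey, prompt) {
  const json = { 'Content-Type': 'application/json' }
  const messages = [{ role: 'user', content: prompt }]
  switch (provider) {
    case 'openai':
      return {
        url: 'https://api.openai.com/v1/chat/completions',
        headers: { ...json, Authorization: `Bearer ${apiKey}` },
        body: { model: 'gpt-4o', messages, response_format: { type: 'json_object' } },
      }
    case 'anthropic':
      return {
        url: 'https://api.anthropic.com/v1/messages',
        headers: {
          ...json,
          'x-api-key': apiKey,
          'anthropic-version': '2023-06-01',
          'anthropic-dangerous-direct-browser-access': 'true',
        },
        body: { model: 'claude-opus-4-6', max_tokens: 4096, messages },
      }
    case 'fireworks':
      return {
        url: 'https://api.fireworks.ai/inference/v1/chat/completions',
        headers: { ...json, Authorization: `Bearer ${apiKey}` },
        body: {
          model: 'accounts/fireworks/models/llama-v3p3-70b-instruct',
          max_tokens: 4096,
          messages,
        },
      }
    default:
      throw new Error(`Unknown provider: ${provider}`)
  }
}

async function callLLM(provider, apiKey, prompt) {
  const { url, headers, body } = buildLLMRequest(provider, apiKey, prompt)
  const res = await fetch(url, { method: 'POST', headers, body: JSON.stringify(body) })
  if (!res.ok) throw new Error(`${provider} request failed: ${res.status}`)
  const data = await res.json()
  // Anthropic returns content blocks; the OpenAI-compatible APIs return choices.
  return provider === 'anthropic' ? data.content[0].text : data.choices[0].message.content
}
```

Keeping the request construction separate from the network call makes the per-provider differences easy to check without an API key.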

### `parseAndValidate(responseText)`
- Strips markdown code fences if present, then JSON-parses.
- Validates against `metricsSchema` (Valibot).
- Logs validation errors to console with full raw response.
- Returns `{ data, unknownMetrics, errors }`.
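
The fence-stripping and parsing part of this function could look like the sketch below. The Valibot validation step is deliberately left out, and the helper names `stripCodeFences` and `parseJSONResponse` are hypothetical.

```js
// Remove a leading fence (with optional language tag) and a trailing fence.
function stripCodeFences(s) {
  return s.trim().replace(/^`{3}[a-z]*\s*/i, '').replace(/\s*`{3}$/, '')
}

// Returns parsed JSON in `data`, or a readable message in `errors`.
function parseJSONResponse(responseText) {
  try {
    return { data: JSON.parse(stripCodeFences(responseText)), errors: [] }
  } catch (err) {
    return { data: null, errors: [`Invalid JSON: ${err.message}`] }
  }
}
```

Schema validation would then run on `data` only when `errors` is empty.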

### `getMarkerValue(entry, m)`
- Returns the primary unit value for a marker from a given entry.
- Handles scalar markers (direct number) and object markers (`entry[m.key][m.pu]`).

### `getMarkerAltValue(entry, m)`
- Returns the alt unit value for a marker (if `m.au` is defined).
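
A minimal sketch of both accessors follows. Returning `null` for missing data is an assumption; the marker fields `key`, `pu`, and `au` are those from the Marker Definitions table.

```js
function getMarkerValue(entry, m) {
  const raw = entry[m.key]
  if (raw == null) return null
  // Scalar markers (pu === null) store the number directly on the entry.
  if (m.pu === null) return typeof raw === 'number' ? raw : null
  return raw[m.pu] ?? null
}

function getMarkerAltValue(entry, m) {
  if (!m.au) return null // marker has no alt unit
  return entry[m.key]?.[m.au] ?? null
}
```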

### `statusTextCls(status)`
- Maps `ok`/`borderline`/`high`/`low`/`null` to Tailwind text color classes.

### `statusBadgeCls(status)`
- Maps status to badge background + text color classes.

### `statusLabelText(status)`
- Returns short display string: `ok`, `~`, `high`, `low`, or `—`.

### `findingBadgeCls(status)`
- Used in Clinical Summary for free-text status strings from the generated findings.
- Maps `optimal`/`normal` → green, `watch`/`monitoring`/`borderline` → amber, `elevated`/`low`/`critical` → red.

### `buildSparklineSVG(m, ents)`
- Builds an 80×24 inline SVG string for a marker across all entries.
- Normalizes y-axis to the observed value range.
- Returns placeholder SVG with `—` if no data exists.
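
The sparkline rules (80×24 canvas, faint polyline, status-colored dots, larger latest dot) can be sketched as a string builder. To stay self-contained this version takes precomputed points instead of `(m, ents)`; that signature, the 4 px padding, and dropping points without a value are assumptions.

```js
const DOT_COLORS = { ok: '#22c55e', borderline: '#f59e0b', high: '#ef4444', low: '#ef4444' }

// points: [{ value: number|null, status: string|null }] in date order (assumed shape).
function buildSparklineSVG(points) {
  const W = 80, H = 24, PAD = 4
  const open = `<svg width="${W}" height="${H}" viewBox="0 0 ${W} ${H}">`
  const pts = points.filter(p => p.value != null)
  if (pts.length === 0) {
    // Placeholder: a centered dash when the marker was never measured.
    return `${open}<text x="${W / 2}" y="${H / 2 + 4}" text-anchor="middle">\u2014</text></svg>`
  }
  const vals = pts.map(p => p.value)
  const min = Math.min(...vals)
  const span = Math.max(...vals) - min || 1 // flat series: avoid division by zero
  const xy = pts.map((p, i) => ({
    x: pts.length === 1 ? W / 2 : PAD + (i * (W - 2 * PAD)) / (pts.length - 1),
    y: H - PAD - ((p.value - min) / span) * (H - 2 * PAD), // larger value sits higher
    color: DOT_COLORS[p.status] ?? '#94a3b8',
  }))
  const coords = xy.map(q => `${q.x},${q.y}`).join(' ')
  const line = `<polyline fill="none" stroke="currentColor" stroke-opacity="0.15" points="${coords}"/>`
  const dots = xy
    .map((q, i) => {
      const r = i === xy.length - 1 ? 3 : 2 // latest measurement is emphasized
      return `<circle class="sparkline-dot" cx="${q.x}" cy="${q.y}" r="${r}" fill="${q.color}"/>`
    })
    .join('')
  return open + line + dots + '</svg>'
}
```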

### `downloadCSV(ents)`
- Converts all stored entries to a flat CSV (one row per date, one column per unit field).
- Both primary and alt unit columns are included for markers that have both.
- Triggers browser download of `health-metrics.csv`.
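
A sketch of the flattening step and the browser download is shown below. The column naming (`key_unit`, bare `key` for scalar markers) and the extra `markerDefs` parameter are assumptions. No quoting is applied because the cells are dates and numbers only.

```js
// One column per unit field; scalar markers (pu === null) get a single column.
function entriesToCSV(ents, markerDefs) {
  const cols = markerDefs.flatMap(m =>
    m.pu === null
      ? [{ header: m.key, get: e => e[m.key] }]
      : [m.pu, m.au].filter(Boolean).map(u => ({
          header: `${m.key}_${u}`,
          get: e => e[m.key]?.[u],
        }))
  )
  const cell = v => (v == null ? '' : String(v)) // missing values become empty cells
  const header = ['date', ...cols.map(c => c.header)].join(',')
  const rows = ents.map(e => [e.date, ...cols.map(c => cell(c.get(e)))].join(','))
  return [header, ...rows].join('\n')
}

// Browser-only: wrap the CSV in a Blob and click a temporary link.
function downloadCSV(ents, markerDefs) {
  const blob = new Blob([entriesToCSV(ents, markerDefs)], { type: 'text/csv' })
  const url = URL.createObjectURL(blob)
  const a = Object.assign(document.createElement('a'), { href: url, download: 'health-metrics.csv' })
  a.click()
  URL.revokeObjectURL(url)
}
```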

### `el(tag, attrs, ...children)`
- Minimal DOM element factory.
- Handles `className`, `innerHTML`, event listeners (`on*`), and arbitrary attributes.
- Accepts string children (converted to text nodes) and nested arrays.

### `solidBtn(label, onClick)` / `ghostBtn(label, onClick)`
- Convenience wrappers for daisyUI `btn btn-neutral` and `btn btn-ghost btn-sm` buttons.

### `fmtDateShort(dateStr)`
- Formats `YYYY-MM-DD` as `Mon 'YY` (e.g. `Jan '24`) for table column headers.
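
A minimal sketch, assuming the input is always a well-formed `YYYY-MM-DD` string:

```js
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

function fmtDateShort(dateStr) {
  // Split the string instead of using `new Date(...)` to avoid time-zone shifts.
  const [year, month] = dateStr.split('-')
  return `${MONTHS[Number(month) - 1]} '${year.slice(2)}`
}
```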

### `mergeEntry(ents, newEntry)`
- De-duplicates by `date`, then sorts by date ascending.
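
The merge can be sketched in a few lines. That a new entry replaces a stored one with the same date (rather than being merged field by field) is an assumption.

```js
function mergeEntry(ents, newEntry) {
  // Drop any stored entry for the same date so the new one replaces it.
  const others = ents.filter(e => e.date !== newEntry.date)
  // ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
  return [...others, newEntry].sort((a, b) => a.date.localeCompare(b.date))
}
```

The function returns a new array and leaves `ents` untouched, so the caller decides when to write the result back to `localStorage`.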

## User Workflows

### First-Time Upload Workflow

1. User opens tool.
2. User drags or selects one or more PDF blood test reports.
3. User optionally selects text and presses `R` to redact spans; clicks "Confirm & Continue →".
4. User selects provider; enters API key if not yet stored.
5. Tool sends extracted text + prompt to the LLM.
6. Loading screen shown while awaiting response.
7. Response is parsed and validated.
8. Dashboard renders with all metrics, grouped by category.
9. Unknown metrics are listed below the dashboard.

### Returning User Workflow

1. User opens tool; existing entries render the dashboard immediately.
2. User clicks "+ Add report" to upload new PDFs.
3. Steps 3–9 proceed as in the first-time workflow; API key entry is skipped if a key is already stored.

### Error Recovery Workflow

1. LLM call fails (e.g. wrong API key, rate limit, provider tier issue).
2. Error screen shows the error message.
3. User clicks "Try different provider →" to return to the configure screen with redacted text preserved — no need to re-upload PDFs.
4. Alternatively, "← Start over" returns to the upload screen.