Archive Identity Resolution

Plan: PLAN-1607 v1.2 G1
Status: Shipped (v1.2)


Overview

When a SharePoint farm is archived, Cesivi must preserve user identity data exactly as it existed at import time, even decades later. Two scenarios drive this feature:

  1. Source AD survives: Live IDP federation renders current names/emails. Good UX.
  2. User deleted from live IDP: Without a snapshot, the archive shows Unknown user (id=S-1-5-…) and the audit trail becomes unreadable.

Cesivi solves this with three-tier identity resolution and immutable frozen ACLs.


Resolution Tiers

| Tier | Source | Badge shown to user | Authorization |
| --- | --- | --- | --- |
| Live | Live IDP (AD / Entra) | None (plain name) | Live ACL |
| Snapshot | Import-time snapshot | (no longer in directory) | Frozen ACL |
| Unknown | No snapshot and no live hit | Unknown user (id=S-1-5-…) | Frozen ACL (if any) |

Resolution happens in order: Live → Snapshot → Unknown. The first hit wins.
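The ordering can be sketched as a small pure function. This is an illustration only, not Cesivi's resolver: the `Tier` enum, the `ResolvedIdentity` record, and the two lookup delegates are hypothetical stand-ins for the live IDP and the snapshot store.

```csharp
#nullable enable
using System;

public enum Tier { Live, Snapshot, Unknown }

// Hypothetical result shape; the real resolver's types may differ.
public sealed record ResolvedIdentity(Tier Tier, string DisplayText);

public static class TierResolution
{
    // Each delegate returns a display name, or null on a miss.
    public static ResolvedIdentity Resolve(
        string sid,
        Func<string, string?> liveLookup,
        Func<string, string?> snapshotLookup)
    {
        // Tier 1: the live IDP wins whenever it knows the SID.
        var live = liveLookup(sid);
        if (live is not null)
            return new ResolvedIdentity(Tier.Live, live);

        // Tier 2: fall back to the import-time snapshot and add the badge.
        var snapshot = snapshotLookup(sid);
        if (snapshot is not null)
            return new ResolvedIdentity(Tier.Snapshot, $"{snapshot} (no longer in directory)");

        // Tier 3: nothing is known, so only the raw SID can be shown.
        return new ResolvedIdentity(Tier.Unknown, $"Unknown user (id={sid})");
    }
}
```

Because the first hit returns immediately, the snapshot store is never consulted for users the live directory still knows.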

Every identity snapshot capture is recorded as an IdentitySnapshotCaptured event in the WORM audit log (v1.2+). See ARCHIVE_AUDIT.md.


Tier Rendering Examples

All three tiers are verified by ArchiveIdentityTierTests in Cesivi.Tests.WebUI/BrowserTests/ (PLAN-1608 Phase 13). Below is how each tier looks in DispForm.

Live tier

User is still present in the live IDP. No special markup is added.

<a href="/_layouts/15/userdisp.aspx?ID=5">Alice Smith</a>

Snapshot tier

User was deleted from the IDP but a snapshot was captured at import time. CSS class cesivi-identity-snapshot is applied. The muted badge appears inline.

<span class="cesivi-identity-snapshot">
  <a href="/_layouts/15/userdisp.aspx?ID=12">Bob Jones</a>
  <em>(no longer in directory)</em>
</span>

Unknown tier

No live IDP hit AND no snapshot. Only the raw ID is shown with a warning style. CSS class cesivi-identity-unknown is applied.

<span class="cesivi-identity-unknown">Unknown user (id=S-1-5-21-…)</span>

Components

Identity Snapshot Store (IIdentitySnapshotStore)

Persists immutable per-user snapshots captured at archive-import time.

Fields: SourceFarmId, Sid, Upn, DisplayName, Email, PrimaryGroups, CapturedAt, CapturedBy.

Write-once: TryAddAsync returns false when the (SourceFarmId, Sid) key already exists. There is no update path by design — snapshots are append-only.
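A minimal in-memory sketch of the write-once contract follows. It assumes a hypothetical `IdentitySnapshot` record built from the field list above and a case-sensitive key; the actual `IIdentitySnapshotStore` signatures are not shown in this document and may differ.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical record mirroring the documented fields.
public sealed record IdentitySnapshot(
    string SourceFarmId, string Sid, string Upn, string DisplayName,
    string Email, string[] PrimaryGroups, DateTimeOffset CapturedAt, string CapturedBy);

public sealed class InMemorySnapshotStoreSketch
{
    private readonly ConcurrentDictionary<(string FarmId, string Sid), IdentitySnapshot> _items = new();

    // Returns true only for the first writer of a given (SourceFarmId, Sid) key.
    public Task<bool> TryAddAsync(IdentitySnapshot snapshot, CancellationToken ct = default)
    {
        ct.ThrowIfCancellationRequested();
        // TryAdd is atomic and never overwrites an existing entry.
        bool added = _items.TryAdd((snapshot.SourceFarmId, snapshot.Sid), snapshot);
        return Task.FromResult(added);
    }

    // Returns the stored snapshot, or null when the key was never captured.
    public Task<IdentitySnapshot?> GetAsync(string farmId, string sid, CancellationToken ct = default)
    {
        ct.ThrowIfCancellationRequested();
        _items.TryGetValue((farmId, sid), out var found);
        return Task.FromResult(found);
    }
}
```

With no update or remove method on the type, a second capture for the same key cannot change what was recorded first.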

Backends: FileSystem (default), InMemory, SQLite, LiteDB.

Data path (FileSystem): {DataRootPath}/identity-snapshots/{safeFarmId}/{safeSid}.json
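How `safeFarmId` and `safeSid` are derived is not specified here. The sketch below assumes one plausible rule (keep ASCII letters, digits, `-` and `_`; replace everything else with `_`) purely to show how such a path could be assembled. `SnapshotPathSketch` and both method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class SnapshotPathSketch
{
    // Assumed sanitisation rule, not taken from Cesivi: anything outside
    // [A-Za-z0-9_-] becomes '_', which also neutralises "." and "/" in input.
    public static string ToSafeSegment(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Segment must not be empty.", nameof(value));

        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            bool allowed = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                        || (c >= '0' && c <= '9') || c == '-' || c == '_';
            sb.Append(allowed ? c : '_');
        }
        return sb.ToString();
    }

    // Mirrors {DataRootPath}/identity-snapshots/{safeFarmId}/{safeSid}.json.
    public static string BuildSnapshotPath(string dataRootPath, string farmId, string sid) =>
        Path.Combine(dataRootPath, "identity-snapshots",
                     ToSafeSegment(farmId), ToSafeSegment(sid) + ".json");
}
```

A replacement rule like this one can map two different inputs to the same segment, so a real implementation would need to account for collisions.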

Frozen ACL Store (IArchivedAclStore)

Persists frozen ACL records (role assignments) captured at import time.

Scope: "list" or "item". Each ACE has a SID + display name + list of role names.

Write-once: TryAddAsync returns false when the ACL ID already exists.

Backends: FileSystem (default), InMemory.

Data path (FileSystem): {DataRootPath}/archived-acls/{safeFarmId}/{scope}/{scopeId}.json

Federated Identity Lookup (IFederatedIdentityLookup)

Abstraction for live AD/Entra lookups. The default production registration is NullFederatedIdentityLookup, which always returns null, so resolution always falls through to the Snapshot tier (or to Unknown when no snapshot exists). A real AD/Entra adapter is a v1.x deliverable.

Archive Identity Resolver (IArchiveIdentityResolver)

Three-tier resolver: calls the federated lookup first (Live), falls back to snapshot store (Snapshot), falls back to Unknown.

Cache: 5-minute in-memory TTL per (farmId, sid) to avoid hammering live IDP on AllItems renders.
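A per-key TTL cache of this kind can be sketched as follows. The class name `TtlCacheSketch` and the injectable clock are assumptions made for illustration and testing; the resolver's real cache implementation is not described in this document.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

public sealed class TtlCacheSketch<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTimeOffset ExpiresAt)> _entries = new();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTimeOffset> _now;

    // The clock is injectable so expiry can be exercised without waiting.
    public TtlCacheSketch(TimeSpan ttl, Func<DateTimeOffset>? now = null)
    {
        _ttl = ttl;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> resolve)
    {
        var now = _now();

        // Serve the cached value while it is still fresh.
        if (_entries.TryGetValue(key, out var hit) && hit.ExpiresAt > now)
            return hit.Value;

        // Miss or expired: resolve again and restart the TTL window.
        var value = resolve(key);
        _entries[key] = (value, now + _ttl);
        return value;
    }
}
```

Keyed by `(farmId, sid)` with a 5-minute TTL, a list view that renders the same author on many rows would trigger at most one directory lookup per user per window. The trade-off is that any cached answer, including a Snapshot or Unknown result, can be served for up to the full TTL after the directory changes.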

Archive ACL Authorization Service (ArchivedAclAuthorizationService)

For archived list items: call HasReadAccessAsync(farmId, scope, scopeId, userSid) to check the frozen ACL. Returns false when no frozen ACL exists for the scope.
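The check can be sketched as below. Two things are assumptions rather than documented behaviour: the record shapes (`FrozenAce`, `FrozenAcl`) and the set of role names treated as granting read access, which here uses SharePoint's default permission level names. The sketch also matches SIDs directly and does not expand group membership.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes: each ACE has a SID, a display name, and role names.
public sealed record FrozenAce(string Sid, string DisplayName, IReadOnlyList<string> Roles);
public sealed record FrozenAcl(string SourceFarmId, string Scope, string ScopeId, IReadOnlyList<FrozenAce> Entries);

public static class FrozenAclCheckSketch
{
    // Assumed list of roles that imply read access.
    private static readonly HashSet<string> ReadGrantingRoles =
        new(StringComparer.OrdinalIgnoreCase) { "Read", "Contribute", "Edit", "Design", "Full Control" };

    public static bool HasReadAccess(FrozenAcl? acl, string userSid)
    {
        // No frozen ACL for the scope means access is denied.
        if (acl is null)
            return false;

        // Grant access when any ACE for this SID carries a read-granting role.
        return acl.Entries.Any(ace =>
            string.Equals(ace.Sid, userSid, StringComparison.OrdinalIgnoreCase) &&
            ace.Roles.Any(role => ReadGrantingRoles.Contains(role)));
    }
}
```

Denying by default when the ACL is missing matches the documented "returns false" behaviour and keeps an incomplete import from exposing archived content.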


Capture API Endpoints

POST /_api/archive/identity-snapshots

Capture a new identity snapshot. Requires site-admin. Idempotent on (sourceFarmId, sid).

{
  "sourceFarmId": "farm-contoso-2019",
  "sid": "S-1-5-21-12345-67890-11111-1001",
  "displayName": "Alice Smith",
  "upn": "alice.smith@contoso.com",
  "email": "alice.smith@contoso.com",
  "primaryGroups": ["Domain Users", "Finance"]
}

Returns 201 on first capture, 200 on duplicate.
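A client-side sketch of this call follows, assuming an `HttpClient` whose `BaseAddress` and site-admin credentials are configured elsewhere (authentication is not covered here). `SnapshotCapture`, `CaptureOutcome`, and `SnapshotCaptureClientSketch` are hypothetical names; only the route, payload fields, and status codes come from the text above.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical request type; serialised in camelCase to match the payload above.
public sealed record SnapshotCapture(
    string SourceFarmId, string Sid, string DisplayName,
    string Upn, string Email, string[] PrimaryGroups);

public enum CaptureOutcome { Created, AlreadyCaptured }

public static class SnapshotCaptureClientSketch
{
    // Maps the documented status codes: 201 on first capture, 200 on duplicate.
    public static CaptureOutcome Interpret(HttpStatusCode status) => status switch
    {
        HttpStatusCode.Created => CaptureOutcome.Created,
        HttpStatusCode.OK => CaptureOutcome.AlreadyCaptured,
        _ => throw new HttpRequestException(
            $"Unexpected status {(int)status} from snapshot capture.")
    };

    public static async Task<CaptureOutcome> CaptureAsync(
        HttpClient http, SnapshotCapture body, CancellationToken ct = default)
    {
        // PostAsJsonAsync uses System.Text.Json web defaults (camelCase names).
        using var response = await http.PostAsJsonAsync(
            "/_api/archive/identity-snapshots", body, ct);
        return Interpret(response.StatusCode);
    }
}
```

Because the endpoint is idempotent on (sourceFarmId, sid), an import tool can safely retry a capture after a timeout and treat `AlreadyCaptured` as success.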

GET /_api/archive/identity-snapshots/{farmId}/{sid}

Returns the snapshot for a specific SID. Returns 404 if not found.
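A lookup helper that turns the 404 into a null result might look like this. The response body is not documented here, so `SnapshotDto` assumes it mirrors the snapshot store's field list in camelCase JSON; that shape and the helper names are hypothetical.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;

// Assumed response shape, based on the stored snapshot fields.
public sealed record SnapshotDto(
    string SourceFarmId, string Sid, string? Upn, string? DisplayName,
    string? Email, string[]? PrimaryGroups, DateTimeOffset CapturedAt, string? CapturedBy);

public static class SnapshotLookupClientSketch
{
    // Escapes both route segments so unusual farm IDs cannot alter the path.
    public static string BuildPath(string farmId, string sid) =>
        $"/_api/archive/identity-snapshots/{Uri.EscapeDataString(farmId)}/{Uri.EscapeDataString(sid)}";

    public static async Task<SnapshotDto?> TryGetAsync(
        HttpClient http, string farmId, string sid, CancellationToken ct = default)
    {
        using var response = await http.GetAsync(BuildPath(farmId, sid), ct);

        // 404 is an expected outcome: no snapshot exists for this SID.
        if (response.StatusCode == HttpStatusCode.NotFound)
            return null;

        // Any other non-success status is treated as a real error.
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadFromJsonAsync<SnapshotDto>(cancellationToken: ct);
    }
}
```

Returning null for 404 keeps "no snapshot" distinct from transport or permission failures, which still surface as exceptions.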

GET /_api/archive/identity-snapshots?farmId={farmId}

Lists all snapshots for a farm. Requires site-admin.

POST /_api/archive/acls

Capture a frozen ACL record. Requires site-admin. Idempotent on AclId.

{
  "sourceFarmId": "farm-contoso-2019",
  "scope": "list",
  "scopeId": "a1a1a1a1-0000-0000-0000-000000000001",
  "entries": [
    { "sid": "S-1-5-21-...", "displayName": "Alice Smith", "roles": ["Read"] },
    { "sid": "S-1-5-21-...", "displayName": "Bob Jones",  "roles": ["Contribute"] }
  ]
}

Scope is "list" or "item".

Returns 201 on first capture, 200 on duplicate.


ControlCenter Dashboard

The Identity Resolution dashboard is at /Archive/Identity in the ControlCenter.

It shows:

  - Identity snapshots: count for the queried farm
  - Frozen ACL records: count for the queried farm
  - Store availability: whether snapshot and ACL stores are responding
  - Resolution tier table: explains what each tier does

Access it via the Identity Stats button on the Archive page, or navigate directly.


Production Federation

This is the #1 deployment step for production environments.

The default NullFederatedIdentityLookup always returns null, so every archived user falls through to the Snapshot tier (or to Unknown when no snapshot exists). That is intentional for development and for archives where the source AD no longer exists.

For production archives where the source directory is still live (e.g. AD on-premises or Microsoft Entra ID), register a real federation adapter so the Tier-1 (Live) path works and users see their current name without any badge.

Why you must enable federation in production

Without it:

  - Every user in an archived list/library shows the (no longer in directory) badge even when they are actively working in the company.
  - The ControlCenter Identity dashboard always shows Live tier: 0.

Wiring a real adapter

  1. Implement IFederatedIdentityLookup in Cesivi.Server.Services:
public class ActiveDirectoryLookup : IFederatedIdentityLookup
{
    public Task<FederatedIdentityResult?> LookupBySidAsync(string sid,
        CancellationToken ct = default)
    {
        // Query your AD / LDAP / Entra Graph API here.
        // Return null if the user is not found in the live directory.
        // Return FederatedIdentityResult with DisplayName + Email if found.
        throw new NotImplementedException("Replace with a real directory query.");
    }
}
  2. Register it in Cesivi.Server/Program.cs, replacing the null implementation:
// Before (development default):
builder.Services.AddSingleton<IFederatedIdentityLookup, NullFederatedIdentityLookup>();

// After (production):
builder.Services.AddSingleton<IFederatedIdentityLookup, ActiveDirectoryLookup>();

The resolver's 5-minute in-memory cache prevents hammering the IDP on every request.

Common adapters

| Source directory | Recommended approach |
| --- | --- |
| AD on-premises | Query by SID via System.DirectoryServices.AccountManagement (UserPrincipal.FindByIdentity with IdentityType.Sid) |
| Microsoft Entra ID (cloud) | Call GET /v1.0/users?$filter=onPremisesSecurityIdentifier eq '{sid}' via Microsoft Graph SDK |
| AD FS | Resolve via the backing AD as above |
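For the Entra row, the filter has to be URL-encoded before it is sent. The helper below is a sketch of building that request path; `GraphSidQuerySketch` is a hypothetical name, and the SID validation is an added precaution rather than something Cesivi is documented to do. Authentication and the call itself are left out.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GraphSidQuerySketch
{
    // SID strings look like S-1-5-21-...; rejecting anything else keeps
    // quotes and other OData syntax out of the filter expression.
    private static readonly Regex SidPattern =
        new(@"\AS-\d+(-\d+)+\z", RegexOptions.CultureInvariant);

    public static string BuildUsersBySidPath(string sid)
    {
        if (sid is null || !SidPattern.IsMatch(sid))
            throw new ArgumentException("Not a valid SID string.", nameof(sid));

        // Build the OData filter, then escape it as a single query value.
        var filter = $"onPremisesSecurityIdentifier eq '{sid}'";
        return "/v1.0/users?$filter=" + Uri.EscapeDataString(filter);
    }
}
```

An adapter built on this would return null from LookupBySidAsync when the response contains no users, so resolution falls through to the Snapshot tier.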

Configuration

Federation adapter

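See Production Federation above for step-by-step wiring. In short: implement IFederatedIdentityLookup and register it in Cesivi.Server/Program.cs in place of NullFederatedIdentityLookup.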

Storage backend

The default backend for both stores is FileSystem, using the same DataRootPath as other storage services. To use a different backend (see the Backends entry for each store under Components), change the corresponding store registration in Program.cs.


Troubleshooting

"Why is this user showing as Unknown?"

  1. The user's SID is not in the snapshot store for this farm. Run GET /_api/archive/identity-snapshots/{farmId}/{sid} to confirm; a 404 means no snapshot exists.
  2. The live IDP lookup also returned null (the default NullFederatedIdentityLookup always does).
  3. Fix: capture a snapshot via POST /_api/archive/identity-snapshots with the correct farm ID and SID.

"Why are snapshot counts showing as 0 in the dashboard?"

  1. No snapshots have been captured yet — the import tool (PLAN-A-3) is responsible for seeding them.
  2. The wrong farmId was queried — the farm ID must match exactly what was used when capturing snapshots.

"Why does a user still appear as (no longer in directory) even though they're back in AD?"

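The resolver caches each result for 5 minutes. Wait for the entry to expire, or restart the server to clear the resolution cache. If the badge persists, confirm that a real IFederatedIdentityLookup adapter is registered: with the default NullFederatedIdentityLookup, the Live tier never matches.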

"Why can't an admin edit an archived item even with the override header?"

The X-Cesivi-Archive-Override: true header bypasses the write gate (allowing the edit), but the ACL still applies. If the admin's SID isn't in the frozen ACL, the operation may be rejected at a different layer.

See also: Archive Admin Bundle — ControlCenter Quick Tour

See also: Archive Tools Operator Guide

See also: Tutorial G — SharePoint On-Premises Retirement Archive

See also: Cesivi Archive Variant A — Whitepaper

See also: Compliance Cookbook — HIPAA/GDPR/SOX/FRCP

See also: Archive Cluster Deployment Guide