Cesivi as a SharePoint Archive¶
Status: PREVIEW. This scenario is a strategic focus area, not yet a turnkey workflow. The pieces exist — content migration, multiple storage backends, full SharePoint API surface — but the end-to-end flow needs polishing before customers can rely on it. This document captures the scenario, the current state honestly, and the gap to "production-ready archive". Read it as a roadmap and a sales conversation, not as a step-by-step recipe.
Table of Contents¶
- Scenario Overview
- Why Cesivi Fits
- Two Variants of the Use Case
- What Works Today
- What's Missing for v1
- Reference Architecture
- Current Known Workflow (Preview)
- Storage Choices
- Operational Considerations
- Roadmap to Production-Ready
Scenario Overview¶
A customer needs long-term, guaranteed access to data that originated in real SharePoint — beyond the end-of-life of the source system, or as an offline guarantee while the source system is still alive.
Concretely, two real-world drivers:
- SharePoint On-Premises retirement. SharePoint Server (2013/2016/2019/SE) is being decommissioned. Compliance, legal, contractual, or operational requirements demand that the content remain queryable and openable for years to decades after the farm is shut down. A flat ZIP is not enough — auditors and end-users still need to open documents, follow links, and run searches the way they always did.
- SharePoint Online local backup. Microsoft's own retention guarantees on SPO are not sufficient for some customers (e.g. sovereignty requirements, ransomware recovery requirements, contractual local-copy obligations). A periodic, self-hosted, fully offline copy of the SPO tenant — with the same APIs the original tenant exposed — eliminates the dependency on Microsoft's availability and retention policies for the worst-case scenario.
In both cases the customer wants a read-mostly system that behaves like SharePoint — same URLs, same APIs, same client tooling — running on infrastructure they own.
Why Cesivi Fits¶
Cesivi is the only system that combines all of the following:
| Property | Why it matters for archival |
|---|---|
| ~99% client-perspective compatibility with real SharePoint | Existing client tooling (Office desktop, Word/Excel WOPI, custom integrations, browsers, PnP scripts, third-party tools) keeps working against the archive. No custom export viewer, no "sorry, that link no longer works". |
| Full REST + SOAP + CSOM + PnP PowerShell surface | All four canonical SharePoint API styles work. Power-users running their own PnP scripts on the original farm can keep using them against the archive. |
| Pluggable storage backends (FileSystem, Sqlite, SqlServer, MySql, LiteDb, PostgreSQL) | Pick the backend that matches the archival policy: FileSystem for cold/cheap/long-life, SqlServer/PostgreSQL for query-heavy retention, Sqlite for tiny footprint. Switch later via Cesivi.StorageConverter. |
| Vendor-independent, source-available | No dependency on Microsoft's continued operation. No license seat checks tied to a Microsoft product. Auditors can read the source. |
| Runs anywhere .NET 10 runs | Bare-metal Windows, Linux, IIS, Docker, Kubernetes (see HELM_CHART_GUIDE.md and KUBERNETES_DEPLOYMENT.md). On-premises sovereignty is straightforward. |
| Cluster-ready for HA archives | Phase 4 cluster work in flight (leader-elected Lucene writer + multi-node read paths) — an archive that needs to survive single-node failures will have a supported deployment topology. |
Most "SharePoint backup" tools are file-level archivers — they grab the documents and metadata but leave you with a flat structure that can no longer answer the original API calls. Cesivi keeps the API surface alive, which is the part that's hard to re-create later.
Two Variants of the Use Case¶
Variant A: On-Premises Retirement Archive¶
Scenario: Customer is shutting down SharePoint Server 2016/2019/SE. They need to keep the content live-queryable for compliance retention windows (often 7, 10, or 30+ years).
Pattern:
- Bulk one-shot import from the soon-to-be-retired farm into Cesivi, using Cesivi.MigrationTool.
- Validate that all critical user workflows still work against the Cesivi instance (open documents, search, follow links, run reports).
- Cut over DNS so the original SharePoint URL now points at Cesivi (or document a migration of bookmarks).
- Switch the archive to read-only mode (see What's Missing for v1).
- Decommission the source farm.
- Operate Cesivi as a long-term archive: periodic integrity checks, periodic backend snapshots, occasional storage-backend upgrades via Cesivi.StorageConverter.
Why Cesivi over alternatives: Microsoft's own "Retention" features require the customer to keep paying for SharePoint Server licensing; export-to-flat-files breaks every API consumer; commercial third-party archivers lock the customer into their own retrieval format. Cesivi keeps the original SharePoint contract intact on infrastructure the customer owns, with no per-seat cost.
Variant B: SharePoint Online Permanent Local Backup¶
Scenario: Customer is on SharePoint Online and needs an always-current local mirror — for ransomware recovery, sovereignty, contractual obligations, or "just in case Microsoft has a bad day".
Pattern:
- Initial bulk import from SPO via the same migration path.
- Scheduled incremental sync (see What's Missing for v1) — daily or hourly, picking up changed items.
- Local Cesivi instance runs alongside the live SPO tenant. End-users use SPO normally; the local Cesivi is a hot standby.
- In a disaster scenario (SPO outage, account lockout, ransomware on the SPO tenant, regulator request for sovereign retrieval), DNS or client config flips to Cesivi.
- After SPO recovers, decide whether to re-sync the gap from Cesivi back to SPO, or keep both running.
Why Cesivi over alternatives: Microsoft's "Backup" service is a SPO-internal restore — it doesn't help if SPO itself is unreachable. Third-party SPO backup tools store data in their own format, requiring their tooling to retrieve. Cesivi is a fully-functional second SharePoint, on infrastructure the customer owns, that any SP-aware client can talk to without modification.
What Works Today¶
These pieces are real, tested, and shipping — they form the foundation of the scenario:
Content migration (MIGRATION.md)¶
- Cesivi.MigrationTool reads from SharePoint Server (2016/2019/SE) and SharePoint Online via standard REST/CSOM/PnP.
- Migrates: site collections, webs, lists, libraries, items, file content + metadata, users, groups, permissions, role assignments, content types, site columns.
- Partially: user profiles, custom actions.
- One-shot bulk import is the proven path.
Storage flexibility (STORAGE_PROVIDERS.md, STORAGE_CONVERTER.md)¶
- Six storage backends, switchable per deployment.
- Cesivi.StorageConverter migrates content between backends without data loss, so a customer can start on FileSystem (cheap) and move to SqlServer later (or vice versa) as needs change.
API compatibility¶
- ~99% client-side compatibility with real SharePoint surface (CSOM_GUIDE.md, PNP_GUIDE.md, API_REFERENCE.md).
- 580/580 REST/SOAP tests passing.
- 224/224 PnP cmdlets implemented.
Permissions, search, content types¶
- SharePoint permissions model intact (PERMISSIONS_GUIDE.md).
- Lucene-based full-text search with KQL parsing.
- Content types, site columns, taxonomy.
Deployment options¶
- Direct .NET runtime, Docker, Kubernetes (HELM_CHART_GUIDE.md), IIS hosting.
- Production deployment checklist (PRODUCTION_DEPLOYMENT_CHECKLIST.md).
- Backup/restore via EXPORT_IMPORT.md.
Operational tooling¶
- Cesivi.ControlCenter: admin UI for runtime config and monitoring.
- Cesivi.StorageBrowser: direct content browsing/inspection in the storage backend.
- Observability (OBSERVABILITY_GUIDE.md).
What's Missing for v1¶
Honest list of gaps that prevent us from selling this as a turnkey archive today:
High priority — block the scenario¶
- Read-only / write-once archive mode. No site- or list-level "freeze this content, no further writes accepted" toggle exists. An archive that accepts new writes is a liability for compliance scenarios.
- Scheduled incremental sync from SPO/SP-OnPrem. Cesivi.MigrationTool is one-shot. Variant B needs a daemon that polls (or webhook-listens to) the source tenant and applies deltas. Reusing Cesivi's existing change-token machinery is the obvious path.
- Tooling currency. Cesivi.MigrationTool has not had functional changes since early March 2026; the Server has moved substantially since then. The tool needs a top-to-bottom audit before we can promise customers it works against the current Server. (Tracked in message_inject.md directive 2026-04-29.)
- Long-term integrity verification. No background job that walks the archive and verifies "every file's stored bytes still match the checksum recorded at import time". Required for any 7+ year retention story.
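As an illustration of what the missing integrity walker could look like, here is a minimal sketch. It assumes a FileSystem-style layout and a hypothetical manifest that maps relative paths to the SHA-256 digests recorded at import time. The manifest format and the function names are assumptions for illustration, not Cesivi APIs.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large documents are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored bytes against the digests recorded at import time.

    `manifest` is a hypothetical {relative_path: sha256_hex} mapping.
    """
    report = {"ok": [], "missing": [], "corrupt": []}
    for relative_path, expected in sorted(manifest.items()):
        candidate = root / relative_path
        if not candidate.is_file():
            report["missing"].append(relative_path)
        elif sha256_of(candidate) != expected:
            report["corrupt"].append(relative_path)
        else:
            report["ok"].append(relative_path)
    return report
```

A scheduled job could run `verify_archive` at a configurable cadence and raise an alert whenever `missing` or `corrupt` is non-empty.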
Medium priority — make the scenario respectable¶
- Audit log of archive operations. When was each item imported? Who triggered the import? What was the source URL/version? Required for legal-hold and discovery scenarios.
- Tamper-evident storage. Append-only audit log + hash-chained metadata so a customer can prove the archive hasn't been altered.
- WORM / document immutability beyond a simple read-only flag.
- Retention policies that enforce minimum-retention windows (you cannot delete this item before YYYY-MM-DD).
- Compliance reporting UI — what's in the archive, when did it arrive, what's its retention status.
- Round-trip back to SharePoint (export Cesivi content into a new SP farm). Partial today via Cesivi.MigrationTool but not validated as a supported path.
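The "append-only audit log + hash-chained metadata" idea from the tamper-evident storage item can be sketched in a few lines. This is a generic hash-chain illustration under assumed record fields (`prev`, `payload`, `hash`); it does not describe how Cesivi will store its audit log.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], payload: dict) -> dict:
    """Append a record whose hash covers the payload and the prior record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"prev": prev_hash, "payload": payload,
              "hash": entry_hash(prev_hash, payload)}
    log.append(record)
    return record


def verify_chain(log: list[dict]) -> bool:
    """Return True only if every record links to, and matches, its predecessor."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

Editing any earlier record breaks verification from that point onward. A chain on its own does not detect truncation of the tail or a complete rewrite, so the latest hash would also need to be anchored somewhere outside the archive.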
Lower priority — polish¶
- Differential snapshots. Roll back the archive to "as it looked on YYYY-MM-DD" without restoring a full backup.
- Legal-hold UI. Mark specific items as legally protected, surface them to compliance officers.
- Archive-specific dashboards in Cesivi.ControlCenter — ingest rate, integrity status, retention window violations.
- Encrypted-at-rest helpers beyond what the chosen storage backend offers natively.
Tracking¶
These gaps will be filed as plans under the P1 "complete in-progress features" track per the 2026-04-29 directive in _project/masterplan/message_inject.md. The first concrete step is the Cesivi.MigrationTool repair and audit, since every gap above depends on a working migration path.
Reference Architecture¶
┌──────────────────────────────────────────────────────────┐
│ SOURCE: SharePoint On-Prem (2016/2019/SE) or SP Online │
└──────────────────────────────────────────────────────────┘
│
│ Variant A: one-shot bulk migrate
│ Variant B: incremental sync (FUTURE)
▼
┌──────────────────────────────────────────────────────────┐
│ Cesivi.MigrationTool │
│ REST + CSOM + PnP PowerShell client against the source │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ Cesivi.Server │
│ • Same SP API surface (REST/SOAP/CSOM/PnP) │
│ • Auth (NTLM/Basic/Bearer/Forms/SAML/OIDC/LDAP) │
│ • Lucene search │
│ • Permissions model │
│ • Cluster-ready (HA archives) │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ STORAGE BACKEND (pick one — switchable later) │
│ FileSystem │ Sqlite │ SqlServer │ MySql │ LiteDb │ PostgreSQL │
└──────────────────────────────────────────────────────────┘
▲
│
│ StorageBrowser (admin)
│ StorageConverter (one-time migration)
│ ControlCenter (operational UI)
End-user clients connect to Cesivi.Server with the same URL/protocol they previously used against the original SharePoint:
- Web browser → Cesivi.WebUI (renders classic + modern SharePoint look)
- Office desktop apps → WOPI endpoints (OnlyOffice integration available)
- PnP PowerShell scripts → unchanged
- Custom REST/CSOM integrations → unchanged
Current Known Workflow (Preview)¶
Reminder: this is the path that would work today if Cesivi.MigrationTool is current. Treat the steps below as a working draft, not a guaranteed recipe. Validate each step against your actual environment and report breakages; this scenario is a strategic priority and gaps will be fixed.
Variant A: On-Premises Retirement Archive¶
1. Stand up Cesivi.Server in target environment
→ see DEPLOYMENT_GUIDE.md, choose storage backend per
"Storage Choices" below
2. Validate connectivity to source SharePoint farm
→ service account with read access to all site collections
in scope
3. Run Cesivi.MigrationTool in bulk mode
→ see MIGRATION.md
→ expect long runtime for large farms (TB-scale)
→ migration log lists every item imported
4. Validate the archive
→ spot-check critical site collections via browser
→ run a known PnP script against the archive
→ verify search returns expected results
5. Switch archive to read-only mode (BLOCKED — feature not yet in v1)
→ planned: per-site or per-list flag in admin UI
6. Cut over DNS / re-issue bookmarks
7. Decommission source farm
8. Operate archive
→ periodic backend snapshots
→ periodic integrity verification (BLOCKED — feature not yet in v1)
→ storage-backend upgrades via Cesivi.StorageConverter
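Step 4 (validate the archive) can be partly automated by comparing list item counts between the source farm and the archive. The sketch below uses the standard SharePoint REST pattern `_api/web/lists/getbytitle('…')?$select=ItemCount`. The `fetch_count` callable is a hypothetical, caller-supplied function that performs the authenticated GET and extracts `ItemCount`; authentication is deliberately left out.

```python
from typing import Callable
from urllib.parse import quote


def item_count_url(site_url: str, list_title: str) -> str:
    """Build the SharePoint REST URL that returns a list's ItemCount."""
    escaped = list_title.replace("'", "''")  # OData escapes ' by doubling it
    return (f"{site_url.rstrip('/')}/_api/web/lists/"
            f"getbytitle('{quote(escaped)}')?$select=ItemCount")


def compare_item_counts(
    source_site: str,
    archive_site: str,
    list_titles: list[str],
    fetch_count: Callable[[str], int],
) -> dict[str, tuple[int, int]]:
    """Return {list_title: (source_count, archive_count)} for every mismatch.

    `fetch_count` is a hypothetical helper supplied by the caller.
    """
    mismatches = {}
    for title in list_titles:
        source = fetch_count(item_count_url(source_site, title))
        archive = fetch_count(item_count_url(archive_site, title))
        if source != archive:
            mismatches[title] = (source, archive)
    return mismatches
```

An empty result only shows that counts agree; it complements the browser spot-checks and PnP script runs in step 4 and does not replace them.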
Variant B: SharePoint Online Local Backup¶
1. Stand up Cesivi.Server in target environment
2. Initial bulk import from SPO
→ Cesivi.MigrationTool against the SPO tenant
→ service principal with Sites.Read.All / Files.Read.All
3. Schedule incremental sync (BLOCKED — feature not yet in v1)
→ planned: daemon that polls source tenant on a cron, applies deltas
4. Operate normally — clients use SPO
5. In a disaster, flip clients to Cesivi
→ DNS, client config, or proxy depending on customer setup
6. After SPO recovers, decide on resync direction
→ SPO → Cesivi: keep the local mirror current
→ Cesivi → SPO: push back changes that happened during outage
(BLOCKED — round-trip not yet validated as a supported path)
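The planned incremental sync in step 3 amounts to a loop that remembers a change token, asks the source for everything after it, applies the deltas, and only then advances the token. The sketch below shows that shape with an in-memory stand-in for the archive. `Change`, `Mirror`, and `get_changes` are hypothetical names for illustration, not part of Cesivi.MigrationTool.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Change:
    item_id: str
    kind: str                      # "upsert" or "delete"
    payload: Optional[dict] = None


@dataclass
class Mirror:
    """In-memory stand-in for the local archive plus its sync bookmark."""
    items: dict = field(default_factory=dict)
    token: Optional[str] = None


def sync_once(
    mirror: Mirror,
    get_changes: Callable[[Optional[str]], tuple[list[Change], str]],
) -> int:
    """Apply one batch of deltas; advance the token only after all succeed."""
    changes, next_token = get_changes(mirror.token)
    for change in changes:
        if change.kind == "delete":
            mirror.items.pop(change.item_id, None)
        elif change.kind == "upsert":
            mirror.items[change.item_id] = change.payload
        else:
            raise ValueError(f"unknown change kind: {change.kind}")
    # Reached only if every change applied, so a failed batch is replayed.
    mirror.token = next_token
    return len(changes)
```

Because upserts and deletes are idempotent here, replaying a partially applied batch after a failure leaves the mirror in the same state as a clean run.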
Storage Choices¶
Quick guide for picking a backend per archive scenario:
| Backend | Best for | Notes |
|---|---|---|
| FileSystem | Cold/cheap/long-life. Petabyte-scale archives where you put the directory on object storage (S3, Azure Blob via FUSE, etc.) | Native Cesivi format, easy to inspect with normal file tools. Slowest for query-heavy access. |
| Sqlite | Small archives (single-server, < 50 GB), demo/eval | Single file, trivial backup. Not for clustered deployments. |
| SqlServer | Mid-size (10 GB – 10 TB) with query SLAs, HA via Always-On | Customer probably already has SQL Server licensing. Mature backup story. |
| PostgreSQL | Same as SqlServer but no Microsoft licensing dependency | Recommended for sovereignty / open-source-only customers. |
| MySql | Same shape as PostgreSQL | Lower preference unless customer already runs MySQL. |
| LiteDb | Single-file, embedded, .NET-native | Niche; Sqlite is usually a better choice. |
Switch later: Cesivi.StorageConverter migrates between any two. Plan the initial pick around current scale + budget; revisit at the 2-year, 5-year, 10-year retention checkpoints.
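For a first-pass recommendation during a customer conversation, the table can be reduced to a small decision helper. This is a deliberately simplified sketch: the function name and parameters are hypothetical, the thresholds are the ones from the table (with 10 TB taken as 10,000 GB), and MySql and LiteDb are left out because the table treats them as special cases.

```python
def suggest_backend(
    size_gb: float,
    *,
    clustered: bool = False,
    query_sla: bool = False,
    avoid_microsoft_licensing: bool = False,
) -> str:
    """Map rough archive characteristics to a starting backend."""
    # Mid-size archives with query SLAs: a relational backend.
    if query_sla and 10 <= size_gb <= 10_000:
        return "PostgreSQL" if avoid_microsoft_licensing else "SqlServer"
    # Small single-server archives and demos.
    if size_gb < 50 and not clustered:
        return "Sqlite"
    # Everything else defaults to the cold/cheap/long-life option.
    return "FileSystem"
```

Treat the result as a starting point only; the notes column and the customer's existing infrastructure should decide the final pick.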
Operational Considerations¶
Sizing¶
- The proof point for migration runs so far is a SharePoint On-Prem farm with 2 TB of content, 100k items, and 10k users. Larger farms have been demonstrated; document the specific sizing for each customer engagement.
- Storage overhead: depends on backend. FileSystem ~1.05× source; SqlServer ~1.2× source (indexes); add encrypted-at-rest overhead per backend.
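The overhead factors above translate directly into a capacity estimate. A minimal sketch, using only the two factors quoted in this section (other backends and encryption overhead are not covered, and the function name is an assumption):

```python
# Approximate overhead factors quoted in this section.
OVERHEAD_FACTOR = {"FileSystem": 1.05, "SqlServer": 1.20}


def estimated_footprint_tb(source_tb: float, backend: str) -> float:
    """Estimate archive size in TB from source size and backend overhead."""
    return round(source_tb * OVERHEAD_FACTOR[backend], 2)
```

For the 2 TB proof-point farm this gives roughly 2.1 TB on FileSystem and 2.4 TB on SqlServer, before any encrypted-at-rest overhead.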
Backup of the archive itself¶
- The archive is itself data — back it up.
- For FileSystem backend: standard file-level backup (snapshot, rsync, object-storage replication).
- For SQL backends: native backup tools.
- For all: periodic Cesivi.StorageBrowser audit + checksum verification once the integrity-check feature lands.
Authentication¶
- Most archive scenarios want the same auth as the original farm so users don't need to re-learn anything.
- See AUTHENTICATION.md, IDENTITY_PROVIDERS.md, NTLM_SETUP.md, OAUTH2_SETUP.md.
- Variant B (SPO mirror) often wants the same OIDC tenant so SPO and Cesivi share users.
Disaster recovery¶
- Cluster-mode Cesivi (Phase 4 in flight) lets you run the archive across two or more nodes. See MULTI_SERVER_DEPLOYMENT.md.
- For single-node deployments, ensure storage-backend snapshots are off-host (S3, Azure Blob, off-site SQL backup).
Retention compliance¶
- Until the audit-log + WORM features land, retention compliance has to be enforced outside Cesivi (e.g. by storing the backend on WORM-locked S3, by external audit logging of admin actions). Document this honestly to customers; do not claim Cesivi enforces compliance internally yet.
Roadmap to Production-Ready¶
Goal: turn this preview into a turnkey scenario customers can deploy without bespoke engineering. Tracked under the v1.0 push.
Phase 1 — Foundation (P1, near-term)¶
- Repair and audit Cesivi.MigrationTool against the current Server. Per-endpoint contract verification, end-to-end smoke test importing a real SP farm. (Already P1 per 2026-04-29 directive.)
- Read-only mode toggle. Site- and list-level. Server enforces; Modern UI shows a clear banner.
- Tools-smoke-test in CI. Catches future drift between Server and tools automatically. (Already P1 per 2026-04-29 directive.)
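To make the read-only toggle concrete, the sketch below shows one way server-side enforcement could decide whether to accept a request for a frozen site. It is an assumption-laden illustration, not the planned design: the function names are hypothetical, `path` is assumed to exclude the query string, and the allowlist of read-only POST endpoints is intentionally tiny. SharePoint's REST surface uses POST for some reads (for example `_api/contextinfo` and `_api/search/postquery`), so a real implementation would need a much fuller allowlist.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}
# POST endpoints that only read; a real server would need a fuller allowlist.
READ_ONLY_POST_SUFFIXES = ("/_api/contextinfo", "/_api/search/postquery")


def is_under(path: str, prefix: str) -> bool:
    """True if `path` equals `prefix` or lies beneath it (case-insensitive)."""
    path, prefix = path.lower().rstrip("/"), prefix.lower().rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def allow_request(method: str, path: str, frozen_prefixes: list[str]) -> bool:
    """Return False for write attempts against any frozen site prefix."""
    if not any(is_under(path, prefix) for prefix in frozen_prefixes):
        return True  # site is not frozen, nothing to enforce
    if method.upper() in SAFE_METHODS:
        return True
    return (method.upper() == "POST"
            and path.lower().endswith(READ_ONLY_POST_SUFFIXES))
```

Requests that tunnel MERGE or DELETE through POST are rejected by this check as well, because only the allowlisted POST endpoints pass.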
Phase 2 — Variant A turnkey (P1, mid-term)¶
- Long-term integrity verification job. Background walker, configurable cadence, surface violations in ControlCenter.
- Audit log of archive operations. Append-only, persisted alongside content.
- End-to-end Variant A walkthrough in _docs/tutorials/ (working TUTORIAL_* style, not preview).
Phase 3 — Variant B turnkey (P2, deferred to v1.x)¶
- Incremental sync daemon from SPO and SP On-Prem. Reuses change-token plumbing.
- Round-trip export back to SharePoint validated as a supported path.
- Compliance dashboard in ControlCenter (retention windows, integrity status, ingest rate).
Phase 4 — Compliance polish (v1.x)¶
- Tamper-evident storage (hash-chained audit log).
- WORM enforcement beyond simple read-only.
- Legal-hold workflow.
See Also¶
- MIGRATION.md — current migration tool capabilities
- STORAGE_PROVIDERS.md — choose a backend
- STORAGE_CONVERTER.md — switch backends later
- EXPORT_IMPORT.md — backup/restore the archive itself
- PRODUCTION_DEPLOYMENT_CHECKLIST.md — deploy with care
- MULTI_SERVER_DEPLOYMENT.md — HA archives
- AUTHENTICATION.md — match the source farm's auth
- PERMISSIONS_GUIDE.md — preserve the SharePoint permission model
- _project/masterplan/message_inject.md: current v1 directive that prioritises this scenario
Status note for the team: this document is intentionally honest about the gap. When customers ask "does Cesivi do SharePoint archival?", the right answer is "yes, that's a strategic priority — here's what works today and here's our roadmap", not "yes" full stop. Update this file as gaps close. Remove the PREVIEW banner only when Phase 1 + Phase 2 of the roadmap are complete.
Last Updated: 2026-04-29