Cesivi Server - Production Best Practices

Version: 1.1 Last Updated: 2026-03-28


Table of Contents

  1. Overview
  2. Server Configuration
  3. Security Hardening
  4. Performance Tuning
  5. Monitoring and Observability
  6. Scaling Recommendations
  7. Backup and Recovery
  8. Docker Deployment
  9. Operational Runbook
  10. Health Check Endpoints
  11. Summary

Overview

This guide covers production deployment best practices for Cesivi Server. Following these recommendations helps keep production deployments reliable, secure, and performant.

Production Readiness Checklist

Before deploying to production, verify:

  • [ ] Security configuration reviewed and hardened
  • [ ] SSL/TLS certificates configured
  • [ ] Authentication method chosen and configured
  • [ ] Storage backend selected (file system, LiteDB, or SQL Server)
  • [ ] Logging and monitoring configured
  • [ ] Backup strategy implemented
  • [ ] Health check endpoints accessible
  • [ ] Load testing completed
  • [ ] Rollback plan documented

Server Configuration

{
  "Cesivi": {
    "ServerUrl": "https://your-domain.com",
    "SiteCollections": [
      {
        "Name": "Default",
        "RootSite": {
          "Title": "Production Site",
          "Url": "/Default/RootSite"
        }
      }
    ],
    "MockDataPath": "/var/cesivi/data",
    "EnableDetailedErrors": false,
    "EnableRequestLogging": false,
    "MaxRequestBodySize": 104857600,
    "RequestTimeout": 120
  },
  "Authentication": {
    "DefaultScheme": "Bearer",
    "AllowAnonymous": false,
    "RequireHttps": true,
    "TokenLifetimeMinutes": 60,
    "AzureAd": {
      "Enabled": true,
      "TenantId": "mock-tenant-id",
      "ClientId": "your-client-id"
    }
  },
  "Storage": {
    "Provider": "FileSystem",
    "ConnectionString": "/var/cesivi/data",
    "EnableCaching": true,
    "CacheExpirationMinutes": 30
  },
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "Microsoft.AspNetCore": "Warning",
      "Cesivi": "Information"
    },
    "File": {
      "Path": "/var/log/cesivi/app.log",
      "MaxFileSizeMB": 100,
      "MaxFiles": 10
    }
  },
  "Kestrel": {
    "Limits": {
      "MaxConcurrentConnections": 200,
      "MaxConcurrentUpgradedConnections": 200,
      "MaxRequestBodySize": 104857600,
      "RequestHeadersTimeout": "00:01:00"
    }
  }
}

Environment Variables

For sensitive configuration, use environment variables:

# Required
export CESIVI_SERVER_URL="https://your-domain.com"
export CESIVI_DATA_PATH="/var/cesivi/data"

# Authentication
export CESIVI_AUTH_CLIENT_ID="your-client-id"
export CESIVI_AUTH_CLIENT_SECRET="your-client-secret"

# Database (if using SQL Server)
export CESIVI_CONNECTION_STRING="Server=db.example.com;Database=Cesivi;User Id=cesivi;Password=secure-password;"

# SSL/TLS
export ASPNETCORE_Kestrel__Certificates__Default__Path="/etc/cesivi/cert.pfx"
export ASPNETCORE_Kestrel__Certificates__Default__Password="cert-password"
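
A missing variable is easier to diagnose before the service starts than from a failed request. The following is a minimal sketch of a preflight check, assuming the variable names listed above. The `check_required_env` helper is hypothetical and not part of Cesivi Server.

```shell
#!/bin/bash
# Hypothetical preflight helper (not shipped with Cesivi Server).

# Report every named environment variable that is unset or empty.
# Returns 0 when all are set, 1 if any is missing.
check_required_env() {
    local name missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what the service inherits.
        if [ -z "$(printenv "${name}")" ]; then
            echo "Missing required environment variable: ${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

For example, `check_required_env CESIVI_SERVER_URL CESIVI_DATA_PATH || exit 1` at the top of a start script stops early when either required variable is missing.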

Systemd Service File (Linux)

# /etc/systemd/system/cesivi.service
[Unit]
Description=Cesivi Server - SharePoint Mock
After=network.target

[Service]
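# Type=notify requires the app to call UseSystemd() (Microsoft.Extensions.Hosting.Systemd).
# Without it, use Type=simple, or systemd will time out waiting for the ready signal.
Type=notify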
User=cesivi
Group=cesivi
WorkingDirectory=/opt/cesivi
ExecStart=/usr/bin/dotnet /opt/cesivi/Cesivi.Server.dll
Restart=always
RestartSec=10
KillSignal=SIGINT
SyslogIdentifier=cesivi
Environment=ASPNETCORE_ENVIRONMENT=Production
Environment=DOTNET_PRINT_TELEMETRY_MESSAGE=false

# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/cesivi/data /var/log/cesivi

# Resource limits
LimitNOFILE=65535
LimitNPROC=4096
MemoryMax=4G
CPUQuota=200%

[Install]
WantedBy=multi-user.target

IIS Configuration (Windows)

<?xml version="1.0" encoding="utf-8"?>
<!-- web.config -->
<configuration>
  <location path="." inheritInChildApplications="false">
    <system.webServer>
      <handlers>
        <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
      </handlers>
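      <!-- Set stdoutLogEnabled to true only while troubleshooting startup; stdout logs are not rotated or size-limited -->
      <aspNetCore processPath="dotnet"
                  arguments=".\Cesivi.Server.dll"
                  stdoutLogEnabled="false"
                  stdoutLogFile=".\logs\stdout"
                  hostingModel="InProcess">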
        <environmentVariables>
          <environmentVariable name="ASPNETCORE_ENVIRONMENT" value="Production" />
        </environmentVariables>
      </aspNetCore>
      <security>
        <requestFiltering>
          <requestLimits maxAllowedContentLength="104857600" />
        </requestFiltering>
      </security>
    </system.webServer>
  </location>
</configuration>

Security Hardening

SSL/TLS Configuration

Kestrel HTTPS Configuration:

{
  "Kestrel": {
    "Endpoints": {
      "Https": {
        "Url": "https://0.0.0.0:443",
        "Certificate": {
          "Path": "/etc/cesivi/cert.pfx",
          "Password": "certificate-password"
        },
        "SslProtocols": ["Tls12", "Tls13"]
      }
    }
  }
}

Nginx Reverse Proxy (recommended):

server {
    listen 443 ssl http2;
    server_name cesivi.example.com;

    ssl_certificate /etc/nginx/ssl/cesivi.crt;
    ssl_certificate_key /etc/nginx/ssl/cesivi.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;

    # Security headers
    add_header Strict-Transport-Security "max-age=63072000" always;
    add_header X-Content-Type-Options nosniff;
    add_header X-Frame-Options DENY;
    add_header X-XSS-Protection "1; mode=block";

    location / {
        proxy_pass http://localhost:5000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection keep-alive;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
        proxy_read_timeout 120s;
        proxy_send_timeout 120s;
    }
}

Authentication Security

Recommended Authentication Configuration:

{
  "Authentication": {
    "DefaultScheme": "Bearer",
    "AllowAnonymous": false,
    "RequireHttps": true,
    "TokenLifetimeMinutes": 60,
    "RefreshTokenLifetimeDays": 7,
    "PasswordPolicy": {
      "MinLength": 12,
      "RequireUppercase": true,
      "RequireLowercase": true,
      "RequireDigit": true,
      "RequireSpecialCharacter": true
    },
    "Lockout": {
      "MaxFailedAttempts": 5,
      "LockoutDurationMinutes": 15
    }
  }
}

IP Allowlisting (Optional)

For internal-only deployments:

{
  "Security": {
    "IpAllowList": [
      "10.0.0.0/8",
      "192.168.0.0/16",
      "172.16.0.0/12"
    ],
    "EnableIpAllowList": true
  }
}


Performance Tuning

Optimized Configuration

Based on the PLAN-179 performance optimization, which reduced item-creation latency by 96.7%:

{
  "Performance": {
    "EnableResponseCaching": true,
    "CacheDurationSeconds": 300,
    "EnableResponseCompression": true,
    "CompressionLevel": "Optimal",
    "BatchSize": 100,
    "MaxConcurrentOperations": 50
  },
  "Storage": {
    "EnableCaching": true,
    "CacheExpirationMinutes": 30,
    "MaxCacheItems": 10000,
    "EnableAsyncWrites": true
  },
  "Kestrel": {
    "Limits": {
      "MaxConcurrentConnections": 200,
      "MinRequestBodyDataRate": {
        "BytesPerSecond": 100,
        "GracePeriod": "00:00:10"
      },
      "MinResponseDataRate": {
        "BytesPerSecond": 100,
        "GracePeriod": "00:00:10"
      }
    }
  }
}

Performance Benchmarks (PLAN-179 Results)

Metric            Before    After     Improvement
Item Creation     459 ms    15 ms     96.7%
Items/Second      2.2       65        30x
1000-Item Batch   7.5 min   0.3 min   96%
Error Rate        76%       0%        100%
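
The Improvement column mixes two kinds of figures: relative reductions for latency and batch time, and a multiplier for throughput. As a quick check, the values can be reproduced from the Before and After columns:

```latex
\[
\text{latency reduction} = \frac{459 - 15}{459} \approx 0.967 = 96.7\%
\]
\[
\text{throughput gain} = \frac{65}{2.2} \approx 29.5 \approx 30\times
\]
\[
\text{batch time reduction} = \frac{7.5 - 0.3}{7.5} = 0.96 = 96\%
\]
```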

Batch Operation Guidelines

// Optimal batch sizes for different operations
const int ITEM_CREATION_BATCH = 100;      // 50-100 items per ExecuteQuery
const int ITEM_UPDATE_BATCH = 50;         // 50 items per batch for updates
const int FILE_UPLOAD_BATCH = 10;         // 10 files per batch
const int CONCURRENT_REQUESTS = 10;       // 10 concurrent users optimal
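
To plan an import around these batch sizes, it helps to know how many round trips a given item count needs. The sketch below is illustrative only; `batch_count` and `batch_ranges` are hypothetical helper names, and the actual batching happens in whatever client code issues the requests.

```shell
#!/bin/bash
# Illustrative helpers; the function names are hypothetical.

# Number of round trips needed to send TOTAL items in batches of SIZE
# (integer ceiling division).
batch_count() {
    local total=$1 size=$2
    echo $(( (total + size - 1) / size ))
}

# Print the 1-based "first last" item range covered by each batch.
batch_ranges() {
    local total=$1 size=$2 first=1 last
    while [ "${first}" -le "${total}" ]; do
        last=$(( first + size - 1 ))
        if [ "${last}" -gt "${total}" ]; then
            last=${total}
        fi
        echo "${first} ${last}"
        first=$(( last + 1 ))
    done
}
```

With the item-creation batch size of 100, `batch_count 1000 100` prints 10, and `batch_ranges 250 100` prints the ranges `1 100`, `101 200`, and `201 250`.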

Memory Management

{
  "GarbageCollection": {
    "Server": true,
    "Concurrent": true,
    "RetainVM": false
  }
}

Add to Cesivi.Server.csproj:

<PropertyGroup>
  <ServerGarbageCollection>true</ServerGarbageCollection>
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
</PropertyGroup>


Monitoring and Observability

Built-in Endpoints

Endpoint                Purpose              Response
/_health                Basic health check   {"status":"Healthy"}
/_health/ready          Readiness probe      HTTP 200 or 503
/_health/live           Liveness probe       HTTP 200 or 503
/_diagnostics/metrics   Prometheus metrics   Metrics text
/_diagnostics/health    Detailed health      Component status

Prometheus Metrics Configuration

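# prometheus.yml
scrape_configs:
  - job_name: 'cesivi'
    scrape_interval: 15s
    static_configs:
      # Scrape through the HTTPS reverse proxy. To scrape Kestrel directly,
      # target port 5000 and set scheme: http instead.
      - targets: ['cesivi.example.com:443']
    metrics_path: '/_diagnostics/metrics'
    scheme: https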

Grafana Dashboard

Key metrics to monitor:

# Request rate
rate(cesivi_http_requests_total[5m])

# Error rate
sum(rate(cesivi_http_requests_total{status=~"5.."}[5m])) / sum(rate(cesivi_http_requests_total[5m]))

# Response latency (p99)
histogram_quantile(0.99, rate(cesivi_request_duration_seconds_bucket[5m]))

# Active connections
cesivi_active_connections

# Memory usage
process_resident_memory_bytes{job="cesivi"}

Alerting Rules

# alerts.yml
groups:
  - name: cesivi
    rules:
      - alert: CesiviHighErrorRate
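        # Ratio of 5xx responses to all responses; 0.1 means 10% of requests are failing
        expr: sum(rate(cesivi_http_requests_total{status=~"5.."}[5m])) / sum(rate(cesivi_http_requests_total[5m])) > 0.1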
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on Cesivi Server"
          description: "Error rate is {{ $value | humanizePercentage }}"

      - alert: CesiviHighLatency
        expr: histogram_quantile(0.99, rate(cesivi_request_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High latency on Cesivi Server"
          description: "P99 latency is {{ $value }}s"

      - alert: CesiviDown
        expr: up{job="cesivi"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Cesivi Server is down"

Structured Logging

{
  "Serilog": {
    "Using": ["Serilog.Sinks.Console", "Serilog.Sinks.File"],
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Microsoft": "Warning",
        "System": "Warning"
      }
    },
    "WriteTo": [
      {
        "Name": "Console",
        "Args": {
          "outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {SourceContext} {Message:lj}{NewLine}{Exception}"
        }
      },
      {
        "Name": "File",
        "Args": {
          "path": "/var/log/cesivi/app-.log",
          "rollingInterval": "Day",
          "retainedFileCountLimit": 30,
          "outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {SourceContext} {Message:lj}{NewLine}{Exception}"
        }
      }
    ],
    "Enrich": ["FromLogContext", "WithMachineName", "WithEnvironmentName"]
  }
}

Scaling Recommendations

Hardware Requirements

Workload      CPU        RAM     Storage      Users
Development   2 cores    4 GB    20 GB SSD    1-5
Small         4 cores    8 GB    100 GB SSD   5-25
Medium        8 cores    16 GB   500 GB SSD   25-100
Large         16 cores   32 GB   1 TB NVMe    100-500

Horizontal Scaling

For high availability, deploy multiple instances behind a load balancer:

                    ┌─────────────────┐
                    │  Load Balancer  │
                    │  (nginx/HAProxy)│
                    └────────┬────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
        ┌─────▼─────┐  ┌─────▼─────┐  ┌─────▼─────┐
        │ Cesivi-1  │  │ Cesivi-2  │  │ Cesivi-3  │
        │ (Primary) │  │ (Standby) │  │ (Standby) │
        └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
              │              │              │
              └──────────────┼──────────────┘
                             │
                    ┌────────▼────────┐
                    │ Shared Storage  │
                    │ (NFS/SQL/Redis) │
                    └─────────────────┘

Load Balancer Configuration (nginx):

upstream cesivi_cluster {
    least_conn;
    server cesivi-1.internal:5000 weight=5;
    server cesivi-2.internal:5000 weight=5;
    server cesivi-3.internal:5000 weight=5;

    keepalive 32;
}

server {
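    listen 443 ssl http2;
    server_name cesivi.example.com;

    # ssl_certificate, ssl_certificate_key, and the other TLS settings
    # from the SSL/TLS Configuration section go here.
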
    location / {
        proxy_pass http://cesivi_cluster;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /_health {
        proxy_pass http://cesivi_cluster;
        proxy_connect_timeout 5s;
        proxy_read_timeout 5s;
    }
}

Database Backend for Scaling

For multi-instance deployments, use a shared database:

{
  "Storage": {
    "Provider": "SqlServer",
    "ConnectionString": "Server=db.example.com;Database=Cesivi;User Id=cesivi;Password=secure-password;MultipleActiveResultSets=true;Connection Timeout=30"
  }
}

Backup and Recovery

Backup Strategy

File System Storage:

#!/bin/bash
# backup-cesivi.sh
set -euo pipefail

BACKUP_DIR="/backup/cesivi"
DATA_DIR="/var/cesivi/data"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
ARCHIVE="${BACKUP_DIR}/cesivi-data-${TIMESTAMP}.tar.gz"

mkdir -p "${BACKUP_DIR}"

# Create backup
tar -czf "${ARCHIVE}" -C "${DATA_DIR}" .

# Verify backup integrity before pruning older backups
if ! tar -tzf "${ARCHIVE}" > /dev/null 2>&1; then
    echo "ERROR: Backup verification failed!" >&2
    exit 1
fi
echo "Backup completed successfully: cesivi-data-${TIMESTAMP}.tar.gz"

# Delete backups older than 7 days
find "${BACKUP_DIR}" -name "cesivi-data-*.tar.gz" -mtime +7 -delete

SQL Server Backup:

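-- Replace CesiviDB with the database named in your connection string.

-- Full backup
BACKUP DATABASE CesiviDB
TO DISK = '/backup/cesivi/CesiviDB_Full.bak'
WITH FORMAT, MEDIANAME = 'CesiviBackup', NAME = 'Full Backup';

-- Differential backup (daily)
BACKUP DATABASE CesiviDB
TO DISK = '/backup/cesivi/CesiviDB_Diff.bak'
WITH DIFFERENTIAL, MEDIANAME = 'CesiviBackup', NAME = 'Differential Backup';

-- Transaction log backup (needed for point-in-time recovery; requires the FULL recovery model)
BACKUP LOG CesiviDB
TO DISK = '/backup/cesivi/CesiviDB_Log.trn'
WITH NAME = 'Log Backup';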

Recovery Procedures

File System Recovery:

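#!/bin/bash
# restore-cesivi.sh
set -euo pipefail

BACKUP_FILE="${1:?Usage: $0 <backup-file.tar.gz>}"
DATA_DIR="/var/cesivi/data"

# Confirm the archive is readable before touching existing data
tar -tzf "${BACKUP_FILE}" > /dev/null

# Stop service
sudo systemctl stop cesivi

# Remove current data, including dotfiles, but keep the directory itself.
# (A quoted glob such as "${DATA_DIR}/*" is never expanded by the shell.)
sudo find "${DATA_DIR}" -mindepth 1 -delete

# Restore data
sudo tar -xzf "${BACKUP_FILE}" -C "${DATA_DIR}"

# Fix permissions
sudo chown -R cesivi:cesivi "${DATA_DIR}"

# Start service
sudo systemctl start cesivi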

Point-in-Time Recovery:

-- Restore to specific point in time
RESTORE DATABASE CesiviDB
FROM DISK = '/backup/cesivi/CesiviDB_Full.bak'
WITH NORECOVERY;

RESTORE LOG CesiviDB
FROM DISK = '/backup/cesivi/CesiviDB_Log.trn'
WITH STOPAT = '2026-01-23 14:30:00', RECOVERY;


Docker Deployment

Production Dockerfile

FROM mcr.microsoft.com/dotnet/aspnet:10.0-alpine AS runtime
WORKDIR /app

# Security: Run as non-root user
RUN addgroup -g 1000 cesivi && \
    adduser -u 1000 -G cesivi -s /bin/sh -D cesivi

# Copy application
COPY --chown=cesivi:cesivi publish/ .

# Create data directory
RUN mkdir -p /app/MockData && chown cesivi:cesivi /app/MockData

USER cesivi

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:5000/_health || exit 1

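# .NET 8+ base images listen on port 8080 by default; bind to 5000 to match EXPOSE and the health check
ENV ASPNETCORE_HTTP_PORTS=5000
EXPOSE 5000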
ENTRYPOINT ["dotnet", "Cesivi.Server.dll"]

Docker Compose (Production)

version: '3.8'

services:
  cesivi:
    # Pin a specific version tag in production rather than relying on latest
    image: cesivi/server:latest
    container_name: cesivi-server
    restart: unless-stopped
    ports:
      - "5000:5000"
    environment:
      - ASPNETCORE_ENVIRONMENT=Production
      - ASPNETCORE_URLS=http://+:5000
      - CESIVI_DATA_PATH=/app/MockData
    volumes:
      - cesivi-data:/app/MockData
      - ./appsettings.Production.json:/app/appsettings.Production.json:ro
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:5000/_health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '0.5'
          memory: 512M
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"

volumes:
  cesivi-data:
    driver: local

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cesivi-server
  labels:
    app: cesivi
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cesivi
  template:
    metadata:
      labels:
        app: cesivi
    spec:
      containers:
        - name: cesivi
          # Pin a specific version tag in production rather than relying on latest
          image: cesivi/server:latest
          ports:
            - containerPort: 5000
          env:
            - name: ASPNETCORE_ENVIRONMENT
              value: "Production"
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          livenessProbe:
            httpGet:
              path: /_health/live
              port: 5000
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /_health/ready
              port: 5000
            initialDelaySeconds: 5
            periodSeconds: 10
          volumeMounts:
            - name: data
              mountPath: /app/MockData
      volumes:
        - name: data
          persistentVolumeClaim:
            # With replicas > 1, this claim needs storage that supports ReadWriteMany
            claimName: cesivi-data-pvc

Operational Runbook

Startup Procedure

# 1. Verify prerequisites
echo "Checking prerequisites..."
dotnet --list-runtimes  # Should include Microsoft.AspNetCore.App 10.0+ (dotnet --version needs the SDK)
systemctl status nginx

# 2. Start the service
sudo systemctl start cesivi

# 3. Verify startup
sleep 5
curl -s http://localhost:5000/_health | jq .

# 4. Check logs for errors
sudo journalctl -u cesivi -n 50 --no-pager

Shutdown Procedure

# 1. Graceful shutdown
sudo systemctl stop cesivi

# 2. Verify process terminated
pgrep -f "Cesivi.Server" || echo "Process stopped successfully"

# 3. Backup data (optional)
./backup-cesivi.sh

Common Operations

Clear Cache:

curl -X POST "http://localhost:5000/_diagnostics/clearcache" -u admin:password

View Active Connections:

curl -s "http://localhost:5000/_diagnostics/metrics" | grep cesivi_active_connections

Restart Service:

sudo systemctl restart cesivi
sleep 5
curl -s http://localhost:5000/_health

View Recent Errors:

sudo journalctl -u cesivi -p err -n 20 --no-pager


Health Check Endpoints

Endpoint Details

Endpoint           Purpose                           Success             Failure
/_health           Overall health                    200 OK              503 Service Unavailable
/_health/live      Liveness (is process running?)    200 OK              503
/_health/ready     Readiness (can serve requests?)   200 OK              503
/_health/version   Application version               JSON with version   -

Health Check Response Format

{
  "status": "Healthy",
  "timestamp": "2026-01-23T14:30:00Z",
  "components": {
    "storage": {
      "status": "Healthy",
      "description": "File system storage operational"
    },
    "authentication": {
      "status": "Healthy",
      "description": "Authentication service operational"
    },
    "search": {
      "status": "Healthy",
      "description": "Lucene search engine operational"
    }
  },
  "version": "2.0.0",
  "uptime": "2.05:30:15"
}
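
When the overall status is not Healthy, the `components` object shows which dependency is responsible. The sketch below assumes `jq` is installed (the runbook already uses it) and that the response follows the format shown above. `unhealthy_components` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper; assumes jq is installed and the response
# follows the format shown above.

# Read a health response on stdin and print the name of each component
# whose status is anything other than "Healthy", one per line.
unhealthy_components() {
    jq -r '.components | to_entries[] | select(.value.status != "Healthy") | .key'
}
```

Pipe the output of whichever health endpoint returns the component breakdown into the function, for example `curl -s http://localhost:5000/_diagnostics/health | unhealthy_components`. Empty output means every component reported Healthy.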

Kubernetes Probes Configuration

livenessProbe:
  httpGet:
    path: /_health/live
    port: 5000
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /_health/ready
    port: 5000
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

startupProbe:
  httpGet:
    path: /_health
    port: 5000
  initialDelaySeconds: 0
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 30
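
These settings translate into approximate time budgets, ignoring per-probe timeouts. Kubernetes holds back the liveness and readiness probes until the startup probe succeeds, so the startup budget is the upper bound on boot time before the container is restarted:

```latex
\[
t_{\text{startup}} \approx \text{initialDelaySeconds} + \text{failureThreshold} \times \text{periodSeconds}
= 0 + 30 \times 5\,\text{s} = 150\,\text{s}
\]
\[
t_{\text{liveness}} \approx 3 \times 30\,\text{s} = 90\,\text{s}
\qquad
t_{\text{readiness}} \approx 3 \times 10\,\text{s} = 30\,\text{s}
\]
```

With these values, a hung container is restarted after roughly 90 seconds of failed liveness checks, and a pod that cannot serve requests is taken out of rotation after roughly 30 seconds.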

Summary

Key Recommendations

  1. Always use HTTPS in production
  2. Configure authentication - never allow anonymous access
  3. Enable monitoring - use Prometheus/Grafana for metrics
  4. Implement backups - daily automated backups with verification
  5. Use batch sizes of 50-100 items for optimal performance
  6. Deploy behind a reverse proxy (nginx/HAProxy)
  7. Set resource limits - prevent runaway memory/CPU usage
  8. Monitor health endpoints - integrate with alerting systems

Performance Targets

Metric                Target     Alert Threshold
Response Time (p99)   < 500ms    > 2s
Error Rate            < 0.1%     > 1%
Uptime                99.9%      < 99%
CPU Usage             < 70%      > 85%
Memory Usage          < 80%      > 90%
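
As an illustration of how a target and an alert threshold combine, the sketch below classifies an error rate using the Error Rate row. The `error_rate_status` name is hypothetical, and so is the "degraded" label for values between the target and the alert threshold, which the table leaves unnamed.

```shell
#!/bin/bash
# Hypothetical helper; thresholds come from the Error Rate row above.

# error_rate_status ERRORS TOTAL
# Prints "ok" below the 0.1% target, "alert" above the 1% threshold,
# and "degraded" in between (the middle band is an assumption).
error_rate_status() {
    awk -v errors="$1" -v total="$2" 'BEGIN {
        if (total <= 0) { print "ok"; exit }   # no traffic, nothing to report
        pct = 100 * errors / total
        if (pct > 1)        print "alert"
        else if (pct < 0.1) print "ok"
        else                print "degraded"
    }'
}
```

For example, 5 failed requests out of 1000 is 0.5%, which is above the target but below the alert threshold, so `error_rate_status 5 1000` prints `degraded`.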

Document Version: 1.1 Last Updated: 2026-03-28 Author: Cesivi Server Team