# Enterprise Setup

Deploy FAOS in your own infrastructure with complete control over data, security, and compliance. This guide covers self-hosted enterprise deployments.
## Deployment Options

### Cloud Deployment

Deploy FAOS on your cloud infrastructure:
| Provider | Deployment Method | Best For |
|---|---|---|
| AWS | EKS + RDS + S3 | Most enterprise deployments |
| Azure | AKS + Azure SQL + Blob Storage | Microsoft-centric organizations |
| GCP | GKE + Cloud SQL + Cloud Storage | Google Cloud users |
| Private Cloud | OpenShift, Rancher | Regulated industries |
### On-Premise Deployment

Deploy FAOS in your data center:

- **Kubernetes**: Recommended for production
- **Docker Compose**: Development and testing
- **VM-based**: Legacy infrastructure
## System Requirements

### Minimum Production Requirements

```yaml
Control Plane (3 nodes):
  CPU: 8 cores per node
  Memory: 32 GB per node
  Storage: 500 GB SSD per node
  Network: 10 Gbps

Worker Nodes (3+ nodes):
  CPU: 16 cores per node
  Memory: 64 GB per node
  Storage: 1 TB NVMe SSD per node
  Network: 10 Gbps
  GPU: Optional (for local LLM inference)

Database:
  Type: PostgreSQL 14+
  CPU: 16 cores
  Memory: 64 GB
  Storage: 2 TB SSD (RAID 10)
  Replicas: 1 primary + 2 standby

Cache:
  Type: Redis 7+
  CPU: 8 cores
  Memory: 32 GB
  Replicas: 3 (high availability)

Object Storage:
  Type: S3-compatible
  Capacity: 10 TB+ (expandable)
  IOPS: 10,000+

Load Balancer:
  Type: Layer 7 (Application)
  Throughput: 10 Gbps
  SSL Termination: Yes
```
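To sanity-check capacity planning, the minimums above can be totaled. A rough Python sketch, with figures copied from the requirements (GPUs, object storage, and load balancer excluded):

```python
# Tally the minimum production footprint listed above:
# 3 control-plane nodes, 3 worker nodes, one database, one cache.
specs = {
    "control_plane": {"count": 3, "cpu": 8,  "mem_gb": 32},
    "workers":       {"count": 3, "cpu": 16, "mem_gb": 64},
    "database":      {"count": 1, "cpu": 16, "mem_gb": 64},
    "cache":         {"count": 1, "cpu": 8,  "mem_gb": 32},
}

total_cpu = sum(s["count"] * s["cpu"] for s in specs.values())
total_mem = sum(s["count"] * s["mem_gb"] for s in specs.values())
print(f"Minimum footprint: {total_cpu} cores, {total_mem} GB RAM")
# → Minimum footprint: 96 cores, 384 GB RAM
```

That is roughly 96 cores and 384 GB of RAM before monitoring, load balancing, and headroom for upgrades.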
### Software Dependencies

**Operating System:**
- Ubuntu 22.04 LTS
- RHEL 8/9
- Amazon Linux 2

**Container Runtime:**
- Docker 24+
- containerd 1.6+

**Orchestration:**
- Kubernetes 1.28+
- Helm 3.12+

**Database:**
- PostgreSQL 14+ (with pgvector extension)
- Redis 7+

**Monitoring:**
- Prometheus
- Grafana
- ELK Stack or similar

**Backup:**
- Velero (Kubernetes)
- pg_dump + WAL archiving (PostgreSQL)
## Kubernetes Deployment

### Step 1: Prepare the Kubernetes Cluster

```bash
# Create namespace
kubectl create namespace faos-production

# Create secrets
kubectl create secret generic faos-secrets \
  --from-literal=db-password='your-db-password' \
  --from-literal=redis-password='your-redis-password' \
  --from-literal=jwt-secret='your-jwt-secret' \
  --from-literal=encryption-key='your-encryption-key' \
  -n faos-production

# Create config map
kubectl create configmap faos-config \
  --from-file=config.yaml \
  -n faos-production
```
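Keep in mind that `kubectl create secret` only base64-encodes each literal before storing it; Secrets are encoding, not encryption, which is why encryption-at-rest for etcd matters in enterprise deployments. A small Python sketch of the round trip:

```python
import base64

# What `kubectl create secret generic` stores for one of the literals above:
# the value, base64-encoded. Anyone who can read the Secret can decode it.
value = "your-db-password"
encoded = base64.b64encode(value.encode()).decode()
decoded = base64.b64decode(encoded).decode()
print(encoded)           # the form you see in `kubectl get secret -o yaml`
assert decoded == value  # trivially reversible
```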
### Step 2: Deploy Database

```yaml
# postgres-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: faos-production
spec:
  serviceName: postgres
  # Note: replicas of the plain postgres image are independent databases,
  # not a replicated cluster. For real HA, use an operator (e.g. CloudNativePG)
  # or the streaming-replication setup described under High Availability below.
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: db-password
            - name: POSTGRES_DB
              value: faos_production
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "16Gi"
              cpu: "4"
            limits:
              memory: "32Gi"
              cpu: "8"
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 500Gi
```
### Step 3: Deploy Redis

```yaml
# redis-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: faos-production
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          command: ["redis-server"]
          args: ["--requirepass", "$(REDIS_PASSWORD)", "--appendonly", "yes"]
          ports:
            - containerPort: 6379
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: redis-password
          volumeMounts:
            - name: redis-data
              mountPath: /data
          resources:
            requests:
              memory: "8Gi"
              cpu: "2"
            limits:
              memory: "16Gi"
              cpu: "4"
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
```
### Step 4: Deploy FAOS Application

```yaml
# faos-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: faos-api
  namespace: faos-production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: faos-api
  template:
    metadata:
      labels:
        app: faos-api
    spec:
      containers:
        - name: faos-api
          image: faos/api:latest
          ports:
            - containerPort: 8080
          env:
            # Passwords must be declared first: Kubernetes only expands a
            # $(VAR) reference when VAR appears earlier in this env list.
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: db-password
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: redis-password
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: jwt-secret
            - name: DATABASE_URL
              value: "postgresql://postgres:$(DB_PASSWORD)@postgres:5432/faos_production"
            - name: REDIS_URL
              value: "redis://:$(REDIS_PASSWORD)@redis:6379"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
---
apiVersion: v1
kind: Service
metadata:
  name: faos-api
  namespace: faos-production
spec:
  selector:
    app: faos-api
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
```
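One detail worth checking in the probe settings: the kubelet only restarts a container after `failureThreshold` consecutive liveness-probe failures (3 by default when unset), so detection of an unhealthy pod is not instant. A quick calculation:

```python
# Worst-case time for the kubelet to restart a container after /health
# starts failing, given the probe settings above and the Kubernetes
# default failureThreshold of 3 (not set explicitly in the manifest).
period_seconds = 10       # livenessProbe.periodSeconds above
failure_threshold = 3     # Kubernetes default when unset
worst_case = period_seconds * failure_threshold
print(f"Container restarted within ~{worst_case}s of /health going unhealthy")
# → Container restarted within ~30s of /health going unhealthy
```

If 30 seconds is too slow for your SLOs, lower `periodSeconds` or set `failureThreshold` explicitly rather than relying on the default.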
### Step 5: Deploy Ingress

```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: faos-ingress
  namespace: faos-production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/rate-limit: "100"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - faos.yourcompany.com
      secretName: faos-tls
  rules:
    - host: faos.yourcompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: faos-api
                port:
                  number: 80
```
### Step 6: Deploy with Helm

```bash
# Add FAOS Helm repository
helm repo add faos https://charts.faos.ai
helm repo update

# Create values file
cat > values-production.yaml <<EOF
replicaCount: 5

image:
  repository: faos/api
  tag: "1.0.0"
  pullPolicy: IfNotPresent

resources:
  limits:
    cpu: 4
    memory: 8Gi
  requests:
    cpu: 2
    memory: 4Gi

autoscaling:
  enabled: true
  minReplicas: 5
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: faos.yourcompany.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: faos-tls
      hosts:
        - faos.yourcompany.com

postgresql:
  enabled: true
  auth:
    username: faos
    database: faos_production
  primary:
    persistence:
      size: 500Gi
    resources:
      limits:
        cpu: 8
        memory: 32Gi

redis:
  enabled: true
  auth:
    enabled: true
  master:
    persistence:
      size: 100Gi
  replica:
    replicaCount: 2

monitoring:
  enabled: true
  prometheus:
    enabled: true
  grafana:
    enabled: true
EOF

# Install FAOS
helm install faos faos/faos \
  --namespace faos-production \
  --values values-production.yaml \
  --wait
```
## Configuration

### Environment Variables

```bash
# Core Configuration
FAOS_ENV=production
FAOS_LOG_LEVEL=info
FAOS_PORT=8080

# Database
DATABASE_URL=postgresql://user:pass@host:5432/faos_production
DATABASE_POOL_SIZE=100
DATABASE_MAX_OVERFLOW=50

# Redis
REDIS_URL=redis://:password@host:6379
REDIS_POOL_SIZE=50

# Security
JWT_SECRET=your-secret-key-here
ENCRYPTION_KEY=your-encryption-key-here
SESSION_TIMEOUT=3600

# LLM Configuration
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
LLM_TIMEOUT=30000
LLM_MAX_RETRIES=3

# Object Storage
S3_BUCKET=faos-production-storage
S3_REGION=us-east-1
S3_ACCESS_KEY_ID=...
S3_SECRET_ACCESS_KEY=...

# Monitoring
PROMETHEUS_ENABLED=true
METRICS_PORT=9090
TRACING_ENABLED=true
JAEGER_ENDPOINT=http://jaeger:14268/api/traces

# Features
MULTI_TENANCY=true
RATE_LIMITING=true
AUDIT_LOGGING=true
```
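The `DATABASE_URL` above is a standard connection URI, so it can be validated with any URL parser before a pod ever tries to connect. An illustrative Python check (credentials are placeholders):

```python
from urllib.parse import urlparse

# Parse the DATABASE_URL format shown above and check its pieces.
url = urlparse("postgresql://user:pass@host:5432/faos_production")

print(url.scheme)               # → postgresql
print(url.hostname, url.port)   # → host 5432
print(url.path.lstrip("/"))     # → faos_production (database name)
```

A startup-time check like this turns a vague "connection refused" into an immediate, specific configuration error.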
### Database Configuration

```yaml
# config/database.yaml
production:
  host: postgres.faos-production.svc.cluster.local
  port: 5432
  database: faos_production
  username: faos_user
  password: ${DB_PASSWORD}

  pool:
    min: 10
    max: 100
    idle_timeout: 10000
    connection_timeout: 5000

  ssl:
    enabled: true
    ca_cert: /etc/ssl/certs/postgres-ca.crt

  backup:
    enabled: true
    schedule: "0 2 * * *"  # 2 AM daily
    retention_days: 30
    storage: s3://backups/postgresql
```
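One thing to verify against this pool configuration: the `max: 100` cap is per application instance, so total connection demand across the 5 API replicas from Step 4 can far exceed PostgreSQL's default `max_connections` of 100. A back-of-the-envelope check (the replication/superuser overhead figure is an assumption for illustration):

```python
# Worst case: every API pod fills its pool at once.
pool_max_per_pod = 100   # pool.max in config/database.yaml above
api_replicas = 5         # replicas in the Step 4 Deployment
overhead = 10            # assumed slots for replication + superuser sessions

needed = pool_max_per_pod * api_replicas + overhead
default_max_connections = 100  # PostgreSQL default
print(f"max_connections must be >= {needed} "
      f"(default is {default_max_connections})")
```

If that total is uncomfortably high, lower the per-pod pool cap or put a connection pooler such as PgBouncer in front of PostgreSQL rather than raising `max_connections` indefinitely.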
## High Availability

### Multi-Region Setup

```mermaid
graph TB
    A[Global Load Balancer] --> B[Region: US-East]
    A --> C[Region: EU-West]
    A --> D[Region: APAC]
    B --> E[FAOS Cluster 1]
    C --> F[FAOS Cluster 2]
    D --> G[FAOS Cluster 3]
    E --> H[(Primary DB)]
    F --> I[(Replica DB)]
    G --> J[(Replica DB)]
    H -.Replication.-> I
    H -.Replication.-> J
```
### Database Replication

```yaml
# Primary Database (US-East)
postgresql:
  replication:
    mode: streaming
    synchronous_standby_names: "standby1,standby2"
    max_wal_senders: 10
    wal_keep_size: 1024

# Standby Databases
postgresql_standby:
  primary_conninfo: "host=primary port=5432 user=replicator"
  restore_command: "aws s3 cp s3://wal-archive/%f %p"
  archive_cleanup_command: "pg_archivecleanup /archive %r"
```
## Monitoring & Observability

### Prometheus Metrics

```yaml
# prometheus-config.yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'faos-api'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - faos-production
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: faos-api
```
### Grafana Dashboards

Import pre-built FAOS dashboards:

```bash
# Download dashboard definitions
curl -O https://grafana.faos.ai/dashboards/faos-overview.json
curl -O https://grafana.faos.ai/dashboards/faos-performance.json
curl -O https://grafana.faos.ai/dashboards/faos-agents.json

# Import to Grafana
kubectl create configmap grafana-dashboards \
  --from-file=faos-overview.json \
  --from-file=faos-performance.json \
  --from-file=faos-agents.json \
  -n faos-production
```
## Backup & Disaster Recovery

### Automated Backups

```bash
#!/bin/bash
# backup-faos.sh
set -euo pipefail

# Database backup (capture the filename so the upload targets exactly
# this dump, rather than globbing over older backups)
BACKUP_FILE="backup-$(date +%Y%m%d-%H%M%S).sql.gz"
pg_dump -h postgres -U faos faos_production | gzip > "$BACKUP_FILE"

# Upload to S3
aws s3 cp "$BACKUP_FILE" s3://faos-backups/postgresql/

# Kubernetes resources
kubectl get all -n faos-production -o yaml > k8s-backup.yaml
aws s3 cp k8s-backup.yaml s3://faos-backups/kubernetes/

# Redis snapshot
redis-cli --rdb /tmp/dump.rdb
aws s3 cp /tmp/dump.rdb s3://faos-backups/redis/

echo "Backup completed successfully"
```
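Because `backup-faos.sh` names dumps `backup-YYYYmmdd-HHMMSS.sql.gz`, a retention policy (30 days in the database configuration above) can be enforced from the filename alone. A hypothetical pruning helper, not part of FAOS:

```python
from datetime import datetime, timedelta

def backups_to_delete(names, now, retention_days=30):
    """Return backup filenames older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for name in names:
        # Filenames follow backup-YYYYmmdd-HHMMSS.sql.gz from backup-faos.sh
        stamp = name.removeprefix("backup-").removesuffix(".sql.gz")
        if datetime.strptime(stamp, "%Y%m%d-%H%M%S") < cutoff:
            expired.append(name)
    return expired

now = datetime(2025, 3, 1)
print(backups_to_delete(
    ["backup-20250101-020000.sql.gz", "backup-20250228-020000.sql.gz"], now))
# → ['backup-20250101-020000.sql.gz']
```

In practice you would run this (or an S3 lifecycle rule, which needs no code at all) against the `s3://faos-backups/postgresql/` listing.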
### Disaster Recovery Plan

- **Recovery Time Objective (RTO):** 1 hour
- **Recovery Point Objective (RPO):** 15 minutes

**Backup Schedule:**
- Database: Every 15 minutes (WAL archiving)
- Full Backup: Daily at 2 AM
- Redis: Hourly snapshots
- Configuration: On every change

**Testing Schedule:**
- DR Drill: Monthly
- Backup Restoration Test: Weekly
- Failover Test: Quarterly
## Scaling

### Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: faos-api-hpa
  namespace: faos-production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: faos-api
  minReplicas: 5
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
```
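The HPA sizes the Deployment with the formula `desired = ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the min/max bounds configured above. A small Python model of that calculation:

```python
import math

def desired_replicas(current, current_util, target_util, lo=5, hi=50):
    """Kubernetes HPA scaling formula with the bounds from the manifest above."""
    desired = math.ceil(current * current_util / target_util)
    return max(lo, min(hi, desired))

# CPU at 95% against a 70% target: scale out.
print(desired_replicas(5, 95, 70))    # → 7
# CPU at 30%: the formula suggests fewer pods, but minReplicas holds at 5.
print(desired_replicas(10, 30, 70))   # → 5
```

With two metrics configured (CPU and memory), the HPA evaluates the formula per metric and takes the larger result, so either pressure source can trigger a scale-out.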
## Troubleshooting

### Common Issues

**Pod not starting:**

```bash
# Check pod status
kubectl describe pod <pod-name> -n faos-production

# Check logs
kubectl logs -n faos-production <pod-name>

# Check events
kubectl get events -n faos-production --sort-by='.lastTimestamp'
```

**Database connection issues:**

```bash
# Test database connectivity
kubectl run -it --rm debug --image=postgres:14 --restart=Never -- \
  psql -h postgres -U faos -d faos_production

# Check connection pool
kubectl exec -it <api-pod> -- curl localhost:8080/metrics | grep db_pool
```

**Performance issues:**

```bash
# Check resource usage
kubectl top pods -n faos-production

# Check database performance
kubectl exec -it postgres-0 -- psql -U faos -c "
  SELECT * FROM pg_stat_activity WHERE state = 'active';
"
```
## Next Steps

- **Enterprise Security** - Harden your deployment
- **Enterprise Integrations** - Connect to enterprise systems
- **API Reference** - API documentation

*Enterprise-grade deployment with complete control over your AI agent infrastructure.*