Enterprise Setup

Deploy FAOS in your own infrastructure with complete control over data, security, and compliance. This guide covers self-hosted enterprise deployments.

Deployment Options

Cloud Deployment

Deploy FAOS on your cloud infrastructure:

Provider      | Deployment Method               | Best For
AWS           | EKS + RDS + S3                  | Most enterprise deployments
Azure         | AKS + Azure SQL + Blob Storage  | Microsoft-centric organizations
GCP           | GKE + Cloud SQL + Cloud Storage | Google Cloud users
Private Cloud | OpenShift, Rancher              | Regulated industries

On-Premise Deployment

Deploy FAOS in your data center:

  • Kubernetes: Recommended for production
  • Docker Compose: Development/testing
  • VM-based: Legacy infrastructure

System Requirements

Minimum Production Requirements

Control Plane (3 nodes):
CPU: 8 cores per node
Memory: 32 GB per node
Storage: 500 GB SSD per node
Network: 10 Gbps

Worker Nodes (3+ nodes):
CPU: 16 cores per node
Memory: 64 GB per node
Storage: 1 TB NVMe SSD per node
Network: 10 Gbps
GPU: Optional (for local LLM inference)

Database:
Type: PostgreSQL 14+
CPU: 16 cores
Memory: 64 GB
Storage: 2 TB SSD (RAID 10)
Replicas: 1 primary + 2 standby

Cache:
Type: Redis 7+
CPU: 8 cores
Memory: 32 GB
Replicas: 3 (high availability)

Object Storage:
Type: S3-compatible
Capacity: 10 TB+ (expandable)
IOPS: 10,000+

Load Balancer:
Type: Layer 7 (Application)
Throughput: 10 Gbps
SSL Termination: Yes
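Taken together, the minimums above can be totaled with a quick back-of-the-envelope calculation. A sketch (it assumes the database and cache figures are per instance, which the list leaves ambiguous):

```python
# Rough aggregate of the minimum production footprint listed above.
nodes = {
    "control_plane": {"count": 3, "cpu": 8,  "mem_gb": 32},
    "worker":        {"count": 3, "cpu": 16, "mem_gb": 64},
    "database":      {"count": 3, "cpu": 16, "mem_gb": 64},  # 1 primary + 2 standby
    "cache":         {"count": 3, "cpu": 8,  "mem_gb": 32},  # Redis HA replicas
}

total_cpu = sum(n["count"] * n["cpu"] for n in nodes.values())
total_mem = sum(n["count"] * n["mem_gb"] for n in nodes.values())

print(f"Total minimum footprint: {total_cpu} cores, {total_mem} GB RAM")
```

This gives roughly 144 cores and 576 GB of RAM before object storage and load balancers, which is useful when sizing a capacity request.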

Software Dependencies

Operating System:
- Ubuntu 22.04 LTS
- RHEL 8/9
- Amazon Linux 2

Container Runtime:
- Docker 24+
- containerd 1.6+

Orchestration:
- Kubernetes 1.28+
- Helm 3.12+

Database:
- PostgreSQL 14+ (with pgvector extension)
- Redis 7+

Monitoring:
- Prometheus
- Grafana
- ELK Stack or similar

Backup:
- Velero (Kubernetes)
- pg_dump + WAL archiving (PostgreSQL)
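Before installing, it can help to confirm the toolchain meets the minimums above. A minimal sketch of a version gate (the `installed` values are illustrative; feed it the real output of `kubectl version`, `helm version`, etc.):

```python
def meets_minimum(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '1.28.3' >= '1.28'."""
    def parse(v):
        return [int(p) for p in v.split(".")]
    a, b = parse(version), parse(minimum)
    # Pad the shorter list so '1.28' compares cleanly against '1.28.3'
    n = max(len(a), len(b))
    a += [0] * (n - len(a))
    b += [0] * (n - len(b))
    return a >= b

# Minimums from the dependency list above
minimums = {"kubernetes": "1.28", "helm": "3.12",
            "docker": "24", "postgresql": "14", "redis": "7"}

# Example versions reported by your environment (illustrative values)
installed = {"kubernetes": "1.29.1", "helm": "3.14.0",
             "docker": "25.0.2", "postgresql": "14.11", "redis": "7.2.4"}

for tool, minimum in minimums.items():
    ok = meets_minimum(installed[tool], minimum)
    print(f"{tool}: {installed[tool]} (min {minimum}) -> {'OK' if ok else 'UPGRADE'}")
```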

Kubernetes Deployment

Step 1: Prepare Kubernetes Cluster

# Create namespace
kubectl create namespace faos-production

# Create secrets
kubectl create secret generic faos-secrets \
  --from-literal=db-password='your-db-password' \
  --from-literal=redis-password='your-redis-password' \
  --from-literal=jwt-secret='your-jwt-secret' \
  --from-literal=encryption-key='your-encryption-key' \
  -n faos-production

# Create config map
kubectl create configmap faos-config \
  --from-file=config.yaml \
  -n faos-production
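The placeholder literals above should be replaced with strong random values. One way to generate them, as a sketch using Python's standard `secrets` module (the key lengths are illustrative, not FAOS requirements):

```python
import secrets

# URL-safe random string for JWT signing (48 random bytes -> 64 chars)
jwt_secret = secrets.token_urlsafe(48)

# Hex-encoded 32-byte key, e.g. suitable for AES-256
encryption_key = secrets.token_hex(32)

db_password = secrets.token_urlsafe(24)
redis_password = secrets.token_urlsafe(24)

for name, value in [("jwt-secret", jwt_secret), ("encryption-key", encryption_key),
                    ("db-password", db_password), ("redis-password", redis_password)]:
    print(f"{name}: {value}")
```

Pass the generated values to the `kubectl create secret` command above in place of the placeholder strings, and store them in your secret manager of record.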

Step 2: Deploy Database

# postgres-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: faos-production
spec:
  serviceName: postgres
  # NOTE: without a replication operator, these 3 replicas are 3 independent
  # databases. Use 1 replica here, or a PostgreSQL operator for real HA.
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: db-password
            - name: POSTGRES_DB
              value: faos_production
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "16Gi"
              cpu: "4"
            limits:
              memory: "32Gi"
              cpu: "8"
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 500Gi
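A StatefulSet's `serviceName` must point at an existing headless Service so each pod gets a stable DNS name (e.g. `postgres-0.postgres.faos-production.svc`). The manifest above does not include one; a minimal sketch (the redis StatefulSet in the next step needs an analogous Service on port 6379):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: faos-production
spec:
  clusterIP: None   # headless: gives each pod a stable per-pod DNS record
  selector:
    app: postgres
  ports:
    - port: 5432
```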

Step 3: Deploy Redis

# redis-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: faos-production
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          command: ["redis-server"]
          args: ["--requirepass", "$(REDIS_PASSWORD)", "--appendonly", "yes"]
          ports:
            - containerPort: 6379
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: redis-password
          volumeMounts:
            - name: redis-data
              mountPath: /data
          resources:
            requests:
              memory: "8Gi"
              cpu: "2"
            limits:
              memory: "16Gi"
              cpu: "4"
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi

Step 4: Deploy FAOS Application

# faos-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: faos-api
  namespace: faos-production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: faos-api
  template:
    metadata:
      labels:
        app: faos-api
    spec:
      containers:
        - name: faos-api
          image: faos/api:latest  # pin a specific version tag in production
          ports:
            - containerPort: 8080
          env:
            # The secret-backed variables must come first: Kubernetes only
            # expands $(VAR) references against variables declared earlier
            # in this list.
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: db-password
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: redis-password
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: faos-secrets
                  key: jwt-secret
            - name: DATABASE_URL
              value: "postgresql://postgres:$(DB_PASSWORD)@postgres:5432/faos_production"
            - name: REDIS_URL
              value: "redis://:$(REDIS_PASSWORD)@redis:6379"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
---
apiVersion: v1
kind: Service
metadata:
  name: faos-api
  namespace: faos-production
spec:
  selector:
    app: faos-api
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

Step 5: Deploy Ingress

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: faos-ingress
  namespace: faos-production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # ingress-nginx rate limiting: requests per second per client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - faos.yourcompany.com
      secretName: faos-tls
  rules:
    - host: faos.yourcompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: faos-api
                port:
                  number: 80

Step 6: Deploy with Helm

# Add FAOS Helm repository
helm repo add faos https://charts.faos.ai
helm repo update

# Create values file
cat > values-production.yaml <<EOF
replicaCount: 5

image:
  repository: faos/api
  tag: "1.0.0"
  pullPolicy: IfNotPresent

resources:
  limits:
    cpu: 4
    memory: 8Gi
  requests:
    cpu: 2
    memory: 4Gi

autoscaling:
  enabled: true
  minReplicas: 5
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: faos.yourcompany.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: faos-tls
      hosts:
        - faos.yourcompany.com

postgresql:
  enabled: true
  auth:
    username: faos
    database: faos_production
  primary:
    persistence:
      size: 500Gi
    resources:
      limits:
        cpu: 8
        memory: 32Gi

redis:
  enabled: true
  auth:
    enabled: true
  master:
    persistence:
      size: 100Gi
  replica:
    replicaCount: 2

monitoring:
  enabled: true
  prometheus:
    enabled: true
  grafana:
    enabled: true
EOF

# Install FAOS
helm install faos faos/faos \
  --namespace faos-production \
  --values values-production.yaml \
  --wait

Configuration

Environment Variables

# Core Configuration
FAOS_ENV=production
FAOS_LOG_LEVEL=info
FAOS_PORT=8080

# Database
DATABASE_URL=postgresql://user:pass@host:5432/faos_production
DATABASE_POOL_SIZE=100
DATABASE_MAX_OVERFLOW=50

# Redis
REDIS_URL=redis://:password@host:6379
REDIS_POOL_SIZE=50

# Security
JWT_SECRET=your-secret-key-here
ENCRYPTION_KEY=your-encryption-key-here
SESSION_TIMEOUT=3600

# LLM Configuration
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
LLM_TIMEOUT=30000
LLM_MAX_RETRIES=3

# Object Storage
S3_BUCKET=faos-production-storage
S3_REGION=us-east-1
S3_ACCESS_KEY_ID=...
S3_SECRET_ACCESS_KEY=...

# Monitoring
PROMETHEUS_ENABLED=true
METRICS_PORT=9090
TRACING_ENABLED=true
JAEGER_ENDPOINT=http://jaeger:14268/api/traces

# Features
MULTI_TENANCY=true
RATE_LIMITING=true
AUDIT_LOGGING=true
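The `LLM_TIMEOUT` and `LLM_MAX_RETRIES` settings above imply a retry loop around provider calls. A sketch of how such settings are typically consumed (the function names here are illustrative, not FAOS APIs):

```python
import os
import time

LLM_TIMEOUT_MS = int(os.environ.get("LLM_TIMEOUT", "30000"))
LLM_MAX_RETRIES = int(os.environ.get("LLM_MAX_RETRIES", "3"))

def call_with_retries(call, max_retries=LLM_MAX_RETRIES, base_delay=1.0):
    """Retry a flaky call with exponential backoff: 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Example: a call that fails twice with transient errors, then succeeds
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient provider error")
    return "ok"

print(call_with_retries(flaky, base_delay=0.01))
```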

Database Configuration

# config/database.yaml
production:
  host: postgres.faos-production.svc.cluster.local
  port: 5432
  database: faos_production
  username: faos_user
  password: ${DB_PASSWORD}

  pool:
    min: 10
    max: 100
    idle_timeout: 10000
    connection_timeout: 5000

  ssl:
    enabled: true
    ca_cert: /etc/ssl/certs/postgres-ca.crt

  backup:
    enabled: true
    schedule: "0 2 * * *"  # 2 AM daily
    retention_days: 30
    storage: s3://backups/postgresql
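One sanity check worth doing with these numbers: total client-side pool capacity must fit within PostgreSQL's `max_connections`. With `pool.max: 100` above and 5 API replicas, a quick computation (the `max_connections` value is an assumption about your server config; the PostgreSQL default is only 100):

```python
import math

api_replicas = 5            # replicaCount from the Helm values above
pool_max_per_replica = 100  # pool.max from config/database.yaml
max_connections = 500       # assumed PostgreSQL setting
reserve = 20                # headroom for superuser/maintenance connections

total_pool = api_replicas * pool_max_per_replica
budget = max_connections - reserve

print(f"peak client connections: {total_pool}, budget: {budget}")
if total_pool > budget:
    suggested = math.floor(budget / api_replicas)
    print(f"pool.max is too high; reduce to <= {suggested} per replica")
```

Remember that autoscaling raises `api_replicas` well above 5, so size the budget against `maxReplicas`, not the steady state.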

High Availability

Multi-Region Setup

graph TB
    A[Global Load Balancer] --> B[Region: US-East]
    A --> C[Region: EU-West]
    A --> D[Region: APAC]

    B --> E[FAOS Cluster 1]
    C --> F[FAOS Cluster 2]
    D --> G[FAOS Cluster 3]

    E --> H[(Primary DB)]
    F --> I[(Replica DB)]
    G --> J[(Replica DB)]

    H -.Replication.-> I
    H -.Replication.-> J

Database Replication

# Primary Database (US-East)
postgresql:
  replication:
    mode: streaming
    synchronous_standby_names: "standby1,standby2"
    max_wal_senders: 10
    wal_keep_size: 1024

# Standby Databases
postgresql_standby:
  primary_conninfo: "host=primary port=5432 user=replicator"
  restore_command: "aws s3 cp s3://wal-archive/%f %p"
  archive_cleanup_command: "pg_archivecleanup /archive %r"

Monitoring & Observability

Prometheus Metrics

# prometheus-config.yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'faos-api'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - faos-production
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: faos-api

Grafana Dashboards

Import pre-built FAOS dashboards:

# Download dashboard definitions
curl -O https://grafana.faos.ai/dashboards/faos-overview.json
curl -O https://grafana.faos.ai/dashboards/faos-performance.json
curl -O https://grafana.faos.ai/dashboards/faos-agents.json

# Import to Grafana
kubectl create configmap grafana-dashboards \
  --from-file=faos-overview.json \
  --from-file=faos-performance.json \
  --from-file=faos-agents.json \
  -n faos-production

# If Grafana runs with the dashboard sidecar, label the ConfigMap so it is discovered
kubectl label configmap grafana-dashboards grafana_dashboard="1" -n faos-production

Backup & Disaster Recovery

Automated Backups

#!/bin/bash
# backup-faos.sh -- assumes AWS CLI, kubectl, and DB credentials are configured
set -euo pipefail

TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_FILE="backup-${TIMESTAMP}.sql.gz"

# Database backup
pg_dump -h postgres -U faos faos_production | gzip > "${BACKUP_FILE}"

# Upload to S3 (use the exact filename rather than a glob, which could
# match stale backups left in the working directory)
aws s3 cp "${BACKUP_FILE}" s3://faos-backups/postgresql/

# Kubernetes resources
kubectl get all -n faos-production -o yaml > k8s-backup.yaml
aws s3 cp k8s-backup.yaml s3://faos-backups/kubernetes/

# Redis snapshot (authenticate with the password from faos-secrets)
redis-cli -a "${REDIS_PASSWORD}" --rdb /tmp/dump.rdb
aws s3 cp /tmp/dump.rdb s3://faos-backups/redis/

echo "Backup completed successfully"

Disaster Recovery Plan

Recovery Time Objective (RTO): 1 hour
Recovery Point Objective (RPO): 15 minutes

Backup Schedule:
Database: Every 15 minutes (WAL archiving)
Full Backup: Daily at 2 AM
Redis: Hourly snapshots
Configuration: On every change

Testing Schedule:
DR Drill: Monthly
Backup Restoration Test: Weekly
Failover Test: Quarterly
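The schedules above can be checked against the stated objectives: worst-case data loss is bounded by the longest interval between database backups, not the daily full backup, since archived WAL segments replay on top of it. A quick check of the arithmetic:

```python
rpo_minutes = 15           # stated Recovery Point Objective
wal_archive_interval = 15  # WAL archiving every 15 minutes
redis_snapshot_interval = 60  # hourly Redis snapshots

# Worst-case PostgreSQL data loss equals the WAL archiving interval
worst_case_loss = wal_archive_interval
print(f"PostgreSQL worst-case loss: {worst_case_loss} min "
      f"(RPO {rpo_minutes} min): "
      + ("OK" if worst_case_loss <= rpo_minutes else "MISSED"))

# Hourly Redis snapshots alone cannot meet a 15-minute RPO; acceptable
# only if cached data is rebuildable from PostgreSQL.
print(f"Redis worst-case loss: {redis_snapshot_interval} min")
```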

Scaling

Horizontal Pod Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: faos-api-hpa
  namespace: faos-production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: faos-api
  minReplicas: 5
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
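The autoscaler above sizes the Deployment with Kubernetes' standard formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds. A sketch of the math:

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=5, max_replicas=50):
    """Kubernetes HPA scaling formula, clamped to the configured bounds."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

# Example: 5 pods running at 95% CPU against the 70% target above
print(desired_replicas(5, 95, 70))   # scales out to 7

# Example: 10 pods at 30% CPU -> formula says 5, which is already the floor
print(desired_replicas(10, 30, 70))  # stays at 5
```

The `behavior` stanza then rate-limits how quickly the controller may move toward that target: scale-ups apply immediately, while scale-downs wait out a 5-minute stabilization window and shed at most 50% of pods per minute.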

Troubleshooting

Common Issues

Pod not starting:

# Check pod status
kubectl describe pod <pod-name> -n faos-production

# Check logs
kubectl logs -n faos-production <pod-name>

# Check events
kubectl get events -n faos-production --sort-by='.lastTimestamp'

Database connection issues:

# Test database connectivity
kubectl run -it --rm debug --image=postgres:14 --restart=Never -- \
psql -h postgres -U faos -d faos_production

# Check connection pool
kubectl exec -it <api-pod> -- curl localhost:8080/metrics | grep db_pool

Performance issues:

# Check resource usage
kubectl top pods -n faos-production

# Check database performance
kubectl exec -it postgres-0 -- psql -U faos -c "
SELECT * FROM pg_stat_activity WHERE state = 'active';
"

Enterprise-grade deployment with complete control over your AI agent infrastructure.