Scaling & Performance

All Eloquent backend services are stateless and can be scaled horizontally. The Helm chart exposes configurable resource requests and limits, plus optional Horizontal Pod Autoscaler (HPA) support.

Default Resource Profiles

Frontend Applications

resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
replicas: 2

Backend Services

Profile	CPU Request	Memory Request	CPU Limit	Memory Limit	Used By
Low	50m	128Mi	250m	256Mi	Lightweight processing services
Medium	100m	256Mi	500m	512Mi	Core platform services (AI, chat, workflows, entities)
High	250m	512Mi	1000m	1Gi	Data-intensive services (knowledge graph, search)

Infrastructure

Component	CPU Request	Memory Request	CPU Limit	Memory Limit
PostgreSQL	100m	256Mi	500m	512Mi
ClickHouse	200m	512Mi	1000m	2Gi
Redis	100m	128Mi	500m	256Mi
Message Queue	100m	256Mi	500m	512Mi

Autoscaling (HPA)

Horizontal Pod Autoscaling is disabled by default. Enable it per service in the Helm values:

backend:
  agentsService:
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 5
      targetCPUUtilizationPercentage: 75

frontend:
  eloquentApp:
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 5
      targetCPUUtilizationPercentage: 75

The generated HPA resources use the autoscaling/v2 API and, by default, scale on average CPU utilization, which Kubernetes measures as a percentage of each pod's CPU request.
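
For orientation, the agentsService values above would typically correspond to a HorizontalPodAutoscaler manifest along the following lines. This is a sketch rather than the chart's actual template output: the resource and Deployment name `agents-service` is an assumption, and the chart may add labels, annotations, or other fields.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agents-service          # assumed name; the chart's naming may differ
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: agents-service        # assumed Deployment name
  minReplicas: 2                # from autoscaling.minReplicas
  maxReplicas: 5                # from autoscaling.maxReplicas
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75   # from targetCPUUtilizationPercentage
```

After deployment, `kubectl get hpa` shows the current and target utilization for each autoscaled service, which is a quick way to confirm that the values took effect.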

Scaling Strategies

Horizontal Scaling

Because all backend services are stateless, increasing the replica count distributes load across more pods without any further configuration changes.

The API Gateway is the primary scaling target: it handles all inbound traffic and proxies requests to the backend services.

Vertical Scaling

For services under memory pressure (especially data-intensive services handling large embedding operations), increase resource limits in the Helm values for the specific service.
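
As an illustration, an override that gives one data-intensive service more memory than the High profile might look like the sketch below. The key `knowledgeGraphService` is hypothetical, as is the assumption that each backend service accepts a `resources` block in the same shape as the frontend example; check the chart's values file for the actual names. The numbers are illustrative, not tuned recommendations.

```yaml
backend:
  knowledgeGraphService:        # hypothetical service key
    resources:
      requests:
        cpu: 250m               # unchanged from the High profile
        memory: 1Gi             # raised from 512Mi
      limits:
        cpu: 1000m              # unchanged from the High profile
        memory: 2Gi             # raised from 1Gi
```

Raising the request together with the limit keeps scheduling decisions in line with what the pod is expected to use, since the Kubernetes scheduler places pods based on requests rather than limits.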

Database Scaling

PostgreSQL: increase PVC size and resource limits; consider connection pooling for high-concurrency workloads
ClickHouse: increase PVC size as the knowledge graph grows; columnar storage is efficient, but disk usage still grows with document ingestion
Redis: increase memory limits if cache eviction rates are high
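
A values override covering the items above could take roughly the following shape. Every key here (`postgresql`, `clickhouse`, `redis`, `persistence.size`) is an assumption modeled on common Helm chart conventions, not a confirmed name from the Eloquent chart, and the sizes are placeholders.

```yaml
postgresql:                     # assumed key
  resources:
    limits:
      cpu: 1000m                # default limit is 500m
      memory: 1Gi               # default limit is 512Mi
  persistence:
    size: 50Gi                  # placeholder PVC size
clickhouse:                     # assumed key
  persistence:
    size: 200Gi                 # placeholder PVC size
redis:                          # assumed key
  resources:
    limits:
      memory: 512Mi             # default limit is 256Mi
```

Note that growing an existing PVC only works when its StorageClass has `allowVolumeExpansion: true`; otherwise the volume has to be recreated or migrated.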

Database Optimization

For high-traffic deployments, consider:

Increasing max database connections per service
Using an external connection pooler (e.g., PgBouncer) for PostgreSQL
Monitoring connection pool saturation via service logs
Contacting the Eloquent team for performance profiling of specific services