Playing with Systems Designs

I have been playing with some systems designs to prepare for a project I am working on. It has been interesting to see how my design principles align with existing recommendations and documentation. For example, this great write-up by Donne Martin, tech lead at Facebook, reaffirmed what my experience at OzEmail/UUNet taught me in a production environment, and what I have observed during my support work.

Primarily for my own documentation/record keeping, the following design is what I will be using for my Skunkworks project:

Minimum Viable Product (MVP)

Systems

Apache HTTP + PHP-FPM -> MySQL
+ Redis
+ Meilisearch

Laravel 10.x, Let's Encrypt (certbot), Cloudflare/Bunny, Sydney-based VPS, Docker/Kubernetes

Product

Growth Path

Systems

  • implement S3 object storage, migrate all content to S3
  • implement HAProxy for load balancing, rate limiting, DDoS protection
  • implement Varnish, for HEAD, GET, PURGE
  • implement +1 queue worker, migrate all async and scheduled tasks to it
  • implement +1 web frontend, still on Apache + PHP-FPM
  • implement ProxySQL, listening on a writer port and a reader port, for query investigation and query caching
  • implement +1 API web frontend, modify HAProxy to use the API server, with the web frontend(s) as fallback
  • implement MySQL replication, with Master for RW and Slave for RO
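
To make the Varnish step concrete, here is a minimal sketch of how a caching proxy might treat the three methods listed: GET and HEAD are served from cache where possible, and PURGE evicts an entry. It is a toy model, not how Varnish itself is implemented; the TinyCache class, the origin callback and the "hit"/"miss" labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class TinyCache:
    """Toy stand-in for a caching proxy in front of the web frontend."""

    origin: Callable[[str], bytes]  # hypothetical fetcher for the frontend
    store: Dict[str, bytes] = field(default_factory=dict)

    def handle(self, method: str, path: str) -> Tuple[int, bytes, str]:
        """Return (status, body, cache_state) for a single request."""
        if method == "PURGE":
            # Evict the entry so the next GET/HEAD goes back to the origin.
            removed = self.store.pop(path, None) is not None
            return (200, b"", "purged" if removed else "absent")
        if method not in ("GET", "HEAD"):
            # Everything else (POST, PUT, ...) bypasses the cache.
            return (200, self.origin(path), "pass")
        state = "hit" if path in self.store else "miss"
        if state == "miss":
            self.store[path] = self.origin(path)
        # HEAD shares the cached object with GET but returns no body.
        body = self.store[path] if method == "GET" else b""
        return (200, body, state)


origin_calls = []


def fake_origin(path: str) -> bytes:
    """Pretend backend that records how often it is hit."""
    origin_calls.append(path)
    return b"page:" + path.encode()


cache = TinyCache(origin=fake_origin)
first = cache.handle("GET", "/about")
second = cache.handle("HEAD", "/about")
purged = cache.handle("PURGE", "/about")
third = cache.handle("GET", "/about")
```

In this sketch the origin is only contacted on a miss, which is the property that matters for offloading the web frontend(s).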

Product

Ingress Layer

Cloudflare / Bunny.net provides Application Delivery Network functionality: initially pull-based caching and SSL re-termination, scaling to a Web Application Firewall (WAF) as demand increases.

All HTTP-based services will be hosted behind a HAProxy instance, scaling out to a HAProxy cluster as demand increases. HAProxy will be configured to:

  • accept connections on HTTP (80) and HTTPS (443)
  • terminate SSL connections, utilising Let's Encrypt certificates (certbot)
  • health-check frontend layer services and redirect to an off-site “maintenance” page when a service is down
  • prioritize access to, and route to a dedicated backend, certain URL prefixes for admin, e.g. /admin
  • prioritize access to, and route to a dedicated backend, certain URL prefixes for CSAT, e.g. /register
  • prioritize access to, and route to a dedicated backend, certain URLs for premium (i.e. paying) users based on a cookie value
  • perform rate-limiting on certain URLs for DDoS protection, e.g. /register, /login, /api
  • forward OpenTelemetry collection traffic to the Jaeger collector(s)
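
The routing and rate-limiting rules above can be sketched in Python to show the intended decision logic. This is an illustration only, not HAProxy configuration: the pool names, the "tier" cookie and its "premium" value, and the limit and window figures are all assumptions that do not come from the design itself.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

# Hypothetical pool names; the design does not name its backends.
PREFIX_BACKENDS = [("/admin", "admin_pool"), ("/register", "csat_pool")]
RATE_LIMITED_PREFIXES = ("/register", "/login", "/api")


def has_prefix(path: str, prefix: str) -> bool:
    """Match whole path segments, so /administrator does not match /admin."""
    return path == prefix or path.startswith(prefix + "/")


def choose_backend(path: str, cookies: Dict[str, str]) -> str:
    """Pick a backend pool from the URL prefix, then from the cookie."""
    for prefix, backend in PREFIX_BACKENDS:
        if has_prefix(path, prefix):
            return backend
    if cookies.get("tier") == "premium":  # assumed cookie name and value
        return "premium_pool"
    return "web_pool"


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client per `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, path: str,
              now: Optional[float] = None) -> bool:
        if not any(has_prefix(path, p) for p in RATE_LIMITED_PREFIXES):
            return True  # only the listed prefixes are limited
        now = time.monotonic() if now is None else now
        recent = self.hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=10.0)
decisions = [limiter.allow("203.0.113.7", "/login", now=t)
             for t in (0.0, 1.0, 2.0, 10.0)]
```

Passing `now` explicitly keeps the example deterministic; in real use the limiter would fall back to the monotonic clock.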

Frontend Layer

The frontend layer will only accept connections from the ingress and management layers. Queue worker nodes will only accept connections from the management layer.

The frontend layer will initially be Apache + PHP-FPM, scaling to Apache HTTP + nginx with PHP-FPM. Initially, web, API and queue workers will run on a single instance and scale out as needed.

Additionally, the Jaeger collector(s) will live in the frontend layer.

Backend Layer

The backend layer will only accept connections from the frontend and management layers.

  • Queue worker(s)

Caching Layer

The caching layer will only accept connections from the frontend, backend, and management layers.

  • Redis

Database Layer

The database layer will only accept connections from the frontend, backend, and management layers.

  • ProxySQL
    • configured for RW port -> MySQL master
    • configured for RO port -> MySQL slave(s)
  • MySQL Master & Slave(s)
  • MongoDB (NoSQL)
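
As a rough illustration of the writer/reader port split, the sketch below shows how a client could decide which ProxySQL port a statement should go to. The port numbers 6401 and 6402 and the port_for helper are assumptions made for this example, and the regular expressions are deliberately simple; they are not a full SQL classifier.

```python
import re

WRITER_PORT = 6401  # assumed RW port, forwarded to the MySQL master
READER_PORT = 6402  # assumed RO port, forwarded to the MySQL slave(s)

# Statements that only read data. Anything else is treated as a write.
_READ_ONLY = re.compile(r"^\s*(SELECT|SHOW)\b", re.IGNORECASE)
# Locking reads must still go to the master.
_LOCKING = re.compile(r"\bFOR\s+(UPDATE|SHARE)\b", re.IGNORECASE)


def port_for(sql: str, in_transaction: bool = False) -> int:
    """Return the ProxySQL port a statement should be sent to."""
    if in_transaction:
        # Keep a whole transaction on one server for consistent reads.
        return WRITER_PORT
    if _READ_ONLY.match(sql) and not _LOCKING.search(sql):
        return READER_PORT
    return WRITER_PORT


read_port = port_for("SELECT id, name FROM users WHERE id = 1")
write_port = port_for("UPDATE users SET name = 'x' WHERE id = 1")
```

Defaulting to the writer port whenever a statement is not clearly read-only is the conservative choice: a misrouted read costs some master capacity, while a misrouted write would fail on a read-only slave.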

Management Layer

  • BusyBox
  • Prometheus
  • Grafana
  • Jaeger – OpenTelemetry
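
The access rules stated for each layer can be collected into a single lookup table, which makes them easy to check or to turn into firewall rules later. The sketch below is one possible encoding; the layer keys and the is_allowed helper are hypothetical names. No inbound rule is stated for the ingress and management layers, so denying by default is an assumption here, and the separate queue worker node rule is not modelled.

```python
from typing import Dict, FrozenSet

# Destination layer -> layers it accepts connections from.
ALLOWED_SOURCES: Dict[str, FrozenSet[str]] = {
    "frontend": frozenset({"ingress", "management"}),
    "backend": frozenset({"frontend", "management"}),
    "caching": frozenset({"frontend", "backend", "management"}),
    "database": frozenset({"frontend", "backend", "management"}),
}


def is_allowed(source: str, destination: str) -> bool:
    """True if the rules permit source -> destination.

    Destinations with no stated rule are denied (an assumption).
    """
    return source in ALLOWED_SOURCES.get(destination, frozenset())


ingress_to_frontend = is_allowed("ingress", "frontend")
ingress_to_database = is_allowed("ingress", "database")
```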