I have been playing with some system designs to prepare for a project I am working on. It has been interesting to see how my design principles align with existing recommendations and documentation. For example, this great write-up by Donne Martin, tech lead at Facebook, reaffirmed what my experience at OzEmail/UUNet taught me in a production environment, and what I have observed in my support work.
Primarily for my own documentation/record keeping, the following design is what I will be using for my Skunkworks project:
Minimum Viable Product (MVP)
Apache HTTP + PHP-FPM -> MySQL
Laravel 10.x, Let's Encrypt (certbot), Cloudflare/Bunny, Sydney-based VPS, Docker/Kubernetes
- implement S3 object storage, migrate all content to S3
- implement HAProxy for load balancing, rate limiting, DDoS protection
- implement Varnish, for HEAD, GET, PURGE
- implement +1 queue worker, migrate all async and scheduled tasks
- implement +1 web frontend, still on Apache + PHP-FPM
- implement ProxySQL, listening on a writer port and reader port, query investigation, query caching
- implement +1 API web frontend, modify HAProxy to use the API server, with the web frontend(s) as fallback
- implement MySQL replication, with Master for RW and Slave for RO
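The writer/reader split is easy to picture as a routing decision made per statement. The following is a minimal sketch of that idea only, not ProxySQL's actual rule engine: the port numbers, the `pick_port` helper and the list of read-only prefixes are assumptions made for illustration.

```python
# Hypothetical listener ports; the real values come from the proxy's own configuration.
WRITER_PORT = 6033  # assumed: forwards to the MySQL master (RW)
READER_PORT = 6034  # assumed: forwards to the MySQL slave(s) (RO)

# Statement types treated as safe to send to a replica (an illustrative list).
READ_ONLY_PREFIXES = ("select", "show", "describe", "explain")


def pick_port(sql: str, in_transaction: bool = False) -> int:
    """Return the port a statement should be sent to."""
    statement = sql.lstrip().lower()
    # Locking reads and anything inside an open transaction go to the writer,
    # so the statement sees (and locks) the authoritative rows.
    if in_transaction or "for update" in statement:
        return WRITER_PORT
    if statement.startswith(READ_ONLY_PREFIXES):
        return READER_PORT
    # Everything else (INSERT, UPDATE, DELETE, DDL, ...) is treated as a write.
    return WRITER_PORT
```

Defaulting to the writer for anything unrecognised is the conservative choice: a read sent to the master is only slower, while a write sent to a replica fails or diverges.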
Cloudflare / Bunny.net will provide Application Delivery Network functionality: initially pull-based caching and SSL termination, scaling to a Web Application Firewall (WAF) as demand increases.
All HTTP-based services will be hosted behind a HAProxy instance, scaling out to a HAProxy cluster as demand increases. HAProxy will be configured to:
- accept connections on HTTP (80) and HTTPS (443)
- terminate SSL connections, utilising Let's Encrypt certificates (certbot)
- health-check front-end layer services and redirect to an off-site “maintenance” page when a service is down
- prioritize and route certain URL prefixes to a dedicated backend for admin, e.g. /admin
- prioritize and route certain URL prefixes to a dedicated backend for CSAT, e.g. /register
- prioritize and route certain URLs to a dedicated backend for premium (i.e. paying) users, based on a cookie value
- rate-limit certain URLs for DDoS protection, e.g. /register, /login, /api
- route OpenTelemetry collection traffic to Jaeger collector(s)
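The routing and rate-limiting rules in this list can be sketched in a few lines to make the intended behaviour concrete. This is an illustration of the decision logic only, not HAProxy configuration: the backend names, the `tier` cookie, and the limits are all assumptions, and the limiter is a simple per-client sliding window.

```python
import time
from collections import defaultdict, deque

# URL prefixes that are rate-limited (taken from the list above).
RATE_LIMITED_PREFIXES = ("/register", "/login", "/api")


def choose_backend(path, cookies, frontend_healthy=True):
    """Pick a backend name for a request (all backend names are hypothetical)."""
    if not frontend_healthy:
        return "maintenance"  # off-site maintenance page
    if path.startswith("/admin"):
        return "admin"
    if path.startswith("/register"):
        return "priority"
    if cookies.get("tier") == "premium":  # assumed cookie name and value
        return "premium"
    return "default"


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent hits

    def allow(self, client_ip, path, now=None):
        # Paths outside the protected prefixes are never limited.
        if not path.startswith(RATE_LIMITED_PREFIXES):
            return True
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Note that plain prefix matching means /administrator would also match /admin; a real rule set would need to decide whether that is acceptable.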
The frontend layer will only accept connections from the ingress and management layers. Queue worker nodes will only accept connections from the management layer.
The frontend layer will initially be Apache + PHP-FPM, scaling to Apache HTTP + nginx with PHP-FPM. Initially, web, API and queue workers will run on a single instance and scale out as needed.
Additionally, Jaeger collector(s) will exist in the frontend layer.
The backend layer will only accept connections from the frontend and management layers.
- Queue worker(s)
The caching layer will only accept connections from the frontend, backend, and management layers.
The database layer will only accept connections from the frontend, backend, and management layers.
- configured for RW port -> MySQL master
- configured for RO port -> MySQL slave(s)
- MySQL Master & Slave(s)
- MongoDB (NoSQL)
- Jaeger (OpenTelemetry)
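The layer access rules described above amount to an allow-list of which layer may open connections to which. A minimal sketch of that table follows, useful as a checklist when writing firewall rules; the layer keys and the `may_connect` helper are names chosen for illustration, not part of any tool.

```python
# destination layer -> set of layers allowed to connect to it
ALLOWED_SOURCES = {
    "frontend": {"ingress", "management"},
    "queue_worker": {"management"},
    "backend": {"frontend", "management"},
    "caching": {"frontend", "backend", "management"},
    "database": {"frontend", "backend", "management"},
}


def may_connect(source: str, destination: str) -> bool:
    """Return True if `source` may open a connection to `destination`."""
    # Unknown destinations are denied by default.
    return source in ALLOWED_SOURCES.get(destination, set())
```

For example, `may_connect("ingress", "database")` is False: traffic from the ingress layer has to pass through the frontend before it can reach the data.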