Job Description
Johannesburg – Gauteng – South Africa
- Own uptime, performance, and monitoring for all production applications.
- Manage Heroku pipelines, CI/CD, review apps, and production environments.
- Operate Celery workers and queues, monitor health, and handle missed task check-ins.
- Define and track service level objectives (SLOs) for availability, latency, and task success rate.
- Maintain runbooks and a centralised wiki for incident response, and lead post-mortems.
- Run periodic disaster recovery drills and coordinate penetration tests.
- Keep environments current (Heroku stacks, Postgres/Redis versions, DigitalOcean (DO)/AWS base images).
- Manage daily backups and ensure restore tests and disaster recovery runbooks are in place.
- Standardise infrastructure (Terraform or scripts for DO/AWS; app.json for Heroku).
- Manage Cloudflare for DNS, edge security, and performance optimisation.
- Tune performance (DB indices, query optimisation, cache usage, Celery queue design).
- Optimise infrastructure costs across Heroku, DigitalOcean, and AWS.
- Maintain CI pipelines with type checking, linting, and security scanning.
- Enforce test coverage and automate deploy checks (smoke tests, migration health, error budgets).
- Support Developers with tooling for local/staging environments and build self-service dashboards (e.g., Celery queue status).
- Collaborate with Developers to streamline workflows and educate on secure coding practices.
- Own vulnerability management and dependency patching cadence.
- Manage access reviews, secrets, MFA/SSO, and enforce least-privilege IAM policies.
- Implement encryption for data at rest and in transit (e.g., S3 server-side encryption).
- Contribute evidence and responses for security questionnaires and SOC 2 audits.
- Maintain a security pack with architecture, sub-processors, and DR/backup processes.
- Configure Sentry ownership rules, Cron Monitors, and release health.
- Centralise metrics/logs (Heroku metrics, Papertrail, Sentry, APM, Prometheus/New Relic).
- Set up alerts on golden signals (latency, errors, traffic, saturation) and avoid alert fatigue.
- Conduct capacity planning and track resource usage trends.
- Evaluate and manage vendor relationships (e.g., Mailgun, Twilio) to ensure service level agreements (SLAs) and performance targets are met.
- Assess new tools/services to enhance platform capabilities (e.g., observability, security).
- Track costs, security posture, and integration quality for all third-party services.
- Cloud Infrastructure Management: 3+ years operating production apps on Heroku, AWS, DigitalOcean, or similar.
- CI/CD pipelines: Hands-on experience with GitHub Actions, Heroku CI, or equivalent; solid Git fundamentals.
- Monitoring & incident response: Experience with Sentry, Papertrail (or similar), logs, and uptime/performance dashboards.
- Security Fundamentals: Understanding of IAM, encryption in transit/at rest, MFA/SSO, and secure configuration practices.
- Disaster recovery & backups: Experience implementing and operating automated backups, restore testing, and writing/maintaining incident runbooks.
- Communication & collaboration: Ability to document processes clearly and work closely with Developers in a small team.
- Infrastructure as Code & automation: Experience with Terraform, Docker, or equivalent tooling.
- Asynchronous workloads: Familiarity with Celery, Redis, or other task queues and message brokers.
- Scaling & cost optimisation: Capacity planning, performance tuning, and managing infra spend.
- Compliance frameworks: Exposure to SOC 2, GDPR, or supporting client security questionnaires.
- Incident management: Participation in on-call rotations, leading post-mortems, or serving as incident commander.
- Certifications (AWS Certified DevOps Engineer, CKS, or equivalent).
- Proficiency in Python; familiarity with Django/Flask.
- Experience with DNS/CDN/edge security (e.g., Cloudflare).
- Observability platforms (Prometheus, Grafana, New Relic).
- Static analysis and code quality tools (mypy, Bandit, SonarQube).
- Prior exposure to multi-tenant SaaS environments.
How to Apply
Click “GO APPLY NOW” to visit the company’s application page.
Follow their instructions carefully.
JVR Jobs connects you with employers – we don’t process applications directly.