Site Reliability Engineer - Sydney Office - Permanent Full Time
A great opportunity for a Site Reliability Engineer to join our Platform Engineering team, a business unit within Domain that combines cloud-based DevOps, SRE, and Quality/Testing Engineering practices. Our aim is to enable product teams to focus on differentiating features for their unique customers by offering scalable solutions for common undifferentiated heavy lifting. We run a dual operating model: creating and iterating on platforms for general, scalable use, and running as an enabling team that embeds itself into product teams to uplift their capabilities.
The SRE team contributes to our central Observability platform and works closely with our product teams, helping them upskill on Incident Management and general Observability principles. This can mean auditing alerts, helping them write adequate runbooks, building tailored dashboards in our Platform to help them monitor SLOs/SLIs for their unique customer bases, and more. The SRE team, as part of the broader Platform Engineering team, also ensures that product teams align to our best practices by advocating for adoption of our platforms more generally. We attempt to balance in-place uplifts with the more significant benefit that comes when applications rely on our custom Compute, Observability, DevSecOps, Test and other platforms.
Who are you?
The ideal candidate will…
Gather project requirements from stakeholders
Design high-level schematics of the infrastructure, tools, and processes
Perform in-depth analysis of possible risks and their countermeasures
Calculate the potential cost of outages and plan for contingencies
Respond to incidents and prepare playbooks
Monitor systems and analyse their performance
Create and set up dashboards
Implement and manage monitoring and alerting
Manage application performance
Write code that scales, maintains, and monitors infrastructure
Prepare input for infrastructure, tooling, and workflow updates across the organisation
Attributes
Experience working in an SRE/DevOps team
Experience with logging and monitoring tools
Strong understanding of monitoring and alerting concepts
Comfortable with a range of languages (or keen to learn)
Experience with CDK or something similar
Commercial experience with AWS services
Familiarity with the following: PagerDuty, ELK, Atlassian, Nagios, OpenTelemetry
Why join us?
We’re the right size business for you to make a real impact, with a workplace culture where you can be you. Perks of the role include:
Flexibility tailored to you - so if you’ve recently made a sea change, work adjusted hours or like the idea of hybrid working, it’ll be perfect;
First-rate parental leave and wellbeing policies;
Access to Perkbox, giving you discounts across healthcare, entertainment, food, utilities and more;
Continuous opportunities to leap, learn and grow.
We don’t just talk, we do. Every day we solve property problems for Australians and beyond. We encourage our people to see the possibilities, and turn them into realities. That’s why we want you.
Who are we?
We shine a light on all things property. Our business aims to simplify the property journey for all involved, drawing on our expertise and our exclusive data.
Changing the way people engage with property requires a team of diverse thinkers.
What’s next?
One of our talent partners will give your application a good look and give you a call if it’s a good match, so apply now!
Don’t meet every single requirement? We’re committed to building an inclusive, diverse and supportive workplace, so if you’re excited about this role but your past experience doesn’t align perfectly, we encourage you to send in your application. You may just be the perfect candidate for this opportunity or another within the Domain Group.
