Site Reliability Engineer
Job type: Full Time · Department: Technology · Work type: On-Site
Sydney, New South Wales, Australia
Cover Genius is a Series E Insurtech that protects the global customers of the world’s largest digital companies including Booking Holdings, owner of Priceline, Kayak and Booking.com, Intuit, Hopper, Skyscanner, Ryanair, Turkish Airlines, Descartes ShipRush, Zip and SeatGeek. We’re also available at Amazon, Flipkart, eBay, Wayfair and SE Asia’s largest company, Shopee.
Our partners integrate with XCover, our award-winning insurance distribution platform, to embed protection for millions of customers worldwide each year. Our team and products have been recognized with dozens of awards, including by the Financial Times, which ranked Cover Genius as the #1 fastest growing company in APAC in 2020. Our diverse team across 20+ countries and many language groups commits itself to diverse cultural programs, in particular “CG Gives”, which makes social entrepreneurs out of us all and funds development initiatives in global communities.
Our People are Bold, Authentic, Purposeful and Inspired
Our People are not Perfect, Traditional, Complacent or Cautious
About the Role
As a Site Reliability Engineer on our Technology Team, you will own the reliable operation and continuous improvement of our production systems. Your primary purpose will be to ensure the seamless and secure functioning of our platforms and operations.
To drive success in this role, you will have a strong background in systems engineering and automation, with experience in release processes, observability, security, core network and infrastructure, and datastores and disaster recovery. You should possess excellent problem-solving skills, a keen attention to detail, and a proactive approach to identifying and mitigating potential issues.
As the Site Reliability Engineer, you will be responsible for:
Monitoring system health and ensuring operational stability and security
Automating and optimizing platform operations
Sharing ownership of production workloads with software engineering teams
Writing and maintaining technical documentation, including tutorials, guides, and blameless post-mortems
Designing and creating information dashboards based on logging and monitoring data
Collaborating with software engineers to drive automation, scalability, and efficiency across technology products and platforms
Regular collaboration with software engineering teams, security teams, and other relevant stakeholders will be key to keeping our production systems reliable and efficient.
Key Responsibilities
Analyze, test and modify systems to improve reliability and optimize performance, particularly at an architectural/infrastructure level
Apply AWS and GCP knowledge and skills to create & maintain cloud infrastructure for software projects
Develop and maintain observability tooling and dashboards
Implement automation tools, frameworks and CI/CD pipelines to reduce toil
Troubleshoot production issues and coordinate with the development team to streamline code deployments
Design, develop and implement software integrations
Collaborate with Software Engineers and other team members with the goal of improving engineering tools, systems, procedures and data security
Develop and maintain design and troubleshooting documentation and runbooks
Optimize and control costs of the company’s computing infrastructure
Skills & Experience:
What you will bring
Understanding of SRE Principles and best practices
Experience using & configuring modern observability tools such as Datadog, Elasticsearch, Prometheus, Grafana
Experience with container technology such as Docker, and ideally with using and managing Kubernetes clusters
Experience working with infrastructure & configuration as code tools such as Terraform, CloudFormation, Chef or Puppet
Comfortable scripting & developing internal tooling with Bash and at least one programming language (e.g. Python, Go)
Experience working with Linux
Solid understanding of networking and system architecture
Solid understanding of how to deploy, scale and monitor web applications and databases
Good knowledge of AWS and/or GCP platforms and associated best practices
Bachelor's degree in Computer Science/Engineering; a postgraduate degree and/or record of academic achievement is also desirable
What you will have
Strong communication and documentation skills
Curious and self-motivated learner
Professional approach
Good team member
Organisational and time management skills
Excellent attention to detail
Positive approach to change
Why Cover Genius?
Cover Genius not only cares about being the best in our industry; we also care about our team. We’re a business that understands life can be fluid, so we flex to provide an environment that suits it. What does that mean?
• Flexible Work Environment - our teams are hybrid. We work from home on Wednesdays and Thursdays and attend the office on Mondays, Tuesdays and Fridays, with flexibility around start/finish times. We also have the added bonus of one Wellness day a month.
• Employee Stock Options - we want our people to share in our success, so we reward them with ownership for their contribution in creating a world-class company.
• Work with like-minded people who are passionate about both the work we're doing and giving back. Our CG Gives program enables us all to become philanthropists through our peer recognition and rewards system.
• Social Initiatives - pictures speak a thousand words!
Sound interesting? If you think you have the best composition of the above, send us your resume and let's chat!
* Cover Genius promotes diversity and inclusivity. We don't tolerate discrimination, demeaning treatment of anyone, or harassment due to race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or any other legally protected status.
By submitting your application, you acknowledge that we may collect, store and process your personal data for recruitment purposes. To ensure a fair evaluation, we may use AI to assist in sorting applications, but all final decisions are made by our hiring team and no candidate dispositions are automated. We will keep your information on file for three years from the date of your application. For detailed information about how we handle your data and our use of AI, please review our full Privacy Policy.