Site Reliability Engineer

Job type: Full Time · Department: Technology · Work type: On-Site

Sydney, New South Wales, Australia

About the Company

Cover Genius is a Series E insurtech that protects the global customers of the world’s largest digital companies, including Booking Holdings (owner of Priceline, Kayak and Booking.com), Intuit, Hopper, Skyscanner, Ryanair, Turkish Airlines, Descartes ShipRush, Zip and SeatGeek. We’re also available at Amazon, Flipkart, eBay, Wayfair and SE Asia’s largest company, Shopee.

Our partners integrate with XCover, our award-winning insurance distribution platform, to embed protection for millions of customers worldwide each year. Our team and products have been recognized with dozens of awards, including by the Financial Times, which ranked Cover Genius as the #1 fastest-growing company in APAC in 2020. Our diverse team across 20+ countries and many language groups commits itself to diverse cultural programs, in particular “CG Gives”, which makes social entrepreneurs out of us all and funds development initiatives in global communities.

Our People are Bold, Authentic, Purposeful and Inspired  

Our People are not Perfect, Traditional, Complacent or Cautious 

About the Role

As a Site Reliability Engineer on our Technology Team, you will own the reliable operation and continuous improvement of our production systems. Your primary purpose will be to ensure the seamless and secure functioning of our platforms and operations.

To succeed in this role, you will have a strong background in systems engineering and automation, with experience in release processes, observability, security, core network and infrastructure, datastores, and disaster recovery. You should possess excellent problem-solving skills, keen attention to detail, and a proactive approach to identifying and mitigating potential issues.

As the Site Reliability Engineer, you will be responsible for:

  • Monitoring system health and ensuring operational stability and security

  • Automating and optimizing platform operations

  • Sharing ownership of production workloads with software engineering teams

  • Writing and maintaining technical documentation, including tutorials, guides, and blameless post-mortems

  • Designing and creating information dashboards based on logging and monitoring data

  • Collaborating with software engineers to drive automation, scalability, and efficiency across technology products and platforms

  • Collaborating regularly with software engineering teams, security teams, and other relevant stakeholders to ensure the reliability and efficiency of our production systems

Key Responsibilities

  • Analyze, test and modify systems to improve reliability and optimize performance, particularly at an architectural/infrastructure level

  • Apply AWS and GCP knowledge and skills to create & maintain cloud infrastructure for software projects

  • Develop and maintain observability tooling and dashboards

  • Implement automation tools, frameworks, and CI/CD pipelines to reduce toil

  • Troubleshoot production issues and coordinate with the development team to streamline code deployments

  • Design, develop and implement software integrations

  • Collaborate with Software Engineers and other team members with the goal of improving engineering tools, systems, procedures and data security

  • Develop and maintain design and troubleshooting documentation and runbooks

  • Optimize and control costs of the company’s computing infrastructure

Skills & Experience:

What you will bring

  • Understanding of SRE Principles and best practices

  • Experience using & configuring modern observability tools such as Datadog, Elasticsearch, Prometheus, Grafana

  • Experience with container technology such as Docker and, ideally, with using and managing Kubernetes clusters

  • Experience working with infrastructure & configuration as code tools such as Terraform, CloudFormation, Chef and Puppet

  • Comfortable scripting & developing internal tooling with Bash and at least one programming language (e.g. Python, Go)

  • Experience working with Linux

  • Solid understanding of networking and system architecture

  • Solid understanding of how to deploy, scale and monitor web applications and databases

  • Good knowledge of AWS and/or GCP platforms and associated best practices

  • Bachelor's degree in Computer Science/Engineering; a postgraduate degree and/or record of academic achievement is also desirable

What you will have

  • Strong communication and documentation skills

  • Curious and self-motivated learner

  • Professional approach

  • Good team member

  • Organisational and time management skills

  • Excellent attention to detail

  • Positive approach to change

Why Cover Genius?  

Cover Genius cares not only about being the best in our industry but also about our team. We’re a business that understands life can be fluid, so we flex to provide an environment that suits it. What does that mean?

• Flexible Work Environment - our teams are hybrid. We work from home on Wednesdays and Thursdays and attend the office on Mondays, Tuesdays and Fridays, with flexibility around start/finish times. We also have the added bonus of one Wellness day a month.

• Employee Stock Options - we want our people to share in our success, so we reward them with ownership for their contribution to creating a world-class company.

• Work with like-minded people who are passionate about both the work we're doing and giving back. Our CG Gives program enables us all to become philanthropists through our peer recognition and rewards system.

• Social Initiatives - pictures speak a thousand words!

Sound interesting? If you think you have the right combination of the above, send us your resume and let's chat!

* Cover Genius promotes diversity and inclusivity. We don't tolerate discrimination, demeaning treatment of anyone, or harassment due to race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or any other legally protected status.

By submitting your application, you acknowledge that we may collect, store and process your personal data for recruitment purposes. To ensure a fair evaluation, we may use AI to assist in sorting applications, but all final decisions are made by our hiring team and no candidate dispositions are automated. We will keep your information on file for three years from the date of your application. For detailed information about how we handle your data and our use of AI, please review our full Privacy Policy.
