SRE Engineer

  • Drive the Site Reliability Engineering (SRE) agenda to improve the availability, reliability, and performance of services

  • Drive observability for our applications

  • Drive, optimise and operate initiatives, for example the reduction of operational toil

  • Work with application teams to set up SLIs, SLOs and error budgets for their applications

  • Work with the enterprise team to deploy SRE enablers and initiatives

  • Experience in one or more of the following: JavaScript, Java and Python

  • Experience with APM systems such as ELK, Grafana, Prometheus, Dynatrace and AppDynamics

  • Understanding of key SRE concepts such as toil, SLIs, SLOs, error budgets, MTTD and MTTR

  • Strong interpersonal and communication skills, with the ability to build good working relationships with other technology teams through day-to-day support and project work