Observability Engineer / Site Reliability Engineer (SRE)
Responsibilities:
- Assess the current state of monitoring and observability across applications.
- Provide proactive alerting and visualization by building actionable dashboards and alerting strategies for applications.
- Establish and promote monitoring best practices.
- Act as an escalation point for incidents and provide strategic guidance and recommendations to engineering and operations teams.
- Use Infrastructure as Code (IaC) tools such as Terraform and scripting languages (Python, Bash, PowerShell) to automate monitoring setups.
Key Skills:
- Solid experience with APM, monitoring, observability, and event management tools, including Datadog, Dynatrace/AppDynamics, and Grafana.
- Experience with JSON and scripting languages such as Python, Bash, PowerShell, or JavaScript for task automation.
- Exposure to CI/CD pipelines and IaC (Infrastructure as Code).
- Strong analytical and problem-solving skills for diagnosing complex issues.
- Effective communication, individual leadership, and cross-functional team collaboration.
- Ability to think outside the box, sensitivity to business impact, and the self-awareness to refine processes.
- Proficiency in the broader aspects of monitoring and observability (APM, system monitoring, logs, tracing, visualization, reporting, and integration).
- Professional certification in Dynatrace/AppDynamics or ITIL is desirable.