Zabbix Administrator & Site Reliability Engineer
Provide administration, support, and operational management of the Zabbix monitoring platform, ensuring reliable monitoring, alerting, and observability across enterprise infrastructure and services.
- Provide Tier 1 support including user access management, alert triage, and incident response.
- Configure and maintain Zabbix servers, proxies, templates, hosts, triggers, dashboards, discovery rules, and integrations.
- Implement and support monitoring for servers, networks, applications, SNMP devices, syslog events, and service health metrics.
- Support 24x7 monitoring operations, platform availability, patching, upgrades, and deployments.
- Apply SRE practices to improve reliability, reduce alert noise, enhance monitoring quality, and support operational readiness.
- Perform capacity planning, performance analysis, and monitoring platform optimization.
- Maintain security controls including role-based access, credential management, audit compliance, and governance standards.
- Support production readiness activities including failover testing, change management, documentation, and disaster recovery planning.