and deployment pipelines. Familiarity with regulated workflows: ISO 27001, SOC 2, and GDPR aren't just abbreviations, and don't fill you with dread. Observability skills: you are well familiar with OpenTelemetry, Prometheus, Loki, and Grafana. CI/CD pipeline skills: you know what it takes to build templates and guardrails that allow the most junior developers to confidently push code, safely knowing …
best practices across cloud and network environments. Troubleshoot deployment and performance issues across multiple environments. Set up and maintain observability tools for logging, monitoring, and alerting (e.g., Prometheus, Grafana, Loki). Contribute to internal tooling to streamline development, testing, and operations workflows. Stay current with DevOps trends and recommend improvements to tools and processes. Required Qualifications: Bachelor's degree … to multi-cloud or hybrid cloud architectures. Tech Stack: Cloud: AWS, OCI; ZTN: Cloudflare; Application: Kong (API Gateway), Java Spring Boot, Python, Go, TypeScript; Monitoring: Prometheus Stack (Prometheus, Grafana, Loki); Compute: ECS, EC2, Lambda; Frontend: S3, CloudFront; Data: Glue, S3, PostgreSQL; CI/CD: GitHub Actions; IaC: Terraform, AWS SAM. Why Join Us? At Intelmatix, you'll work on …
EC2, RDS/Aurora, S3). Develop and maintain Infrastructure as Code using Terraform and configuration management with Ansible. Enhance monitoring, logging, and alerting using the Grafana stack (Prometheus, Loki, Tempo). Participate in incident management, root cause analysis, and post-incident reviews. Implement automation to reduce manual operational tasks and improve recovery time. Contribute to the definition and tracking … relevant to production workloads (EKS, EC2, RDS/Aurora, S3, IAM). Infrastructure as Code with Terraform and configuration management with Ansible. Strong experience with observability tools (Grafana, Prometheus, Loki, Tempo). Understanding of SRE concepts (SLIs, SLOs, error budgets, capacity planning). Comfortable working in incident and problem management processes. Strong GitOps mindset for managing platform and configuration …