Senior Application Support Engineer

This post is for our client.

Roles & Responsibilities

Production Monitoring & Incident Leadership

  • Oversee real-time monitoring of production systems using AWS CloudWatch, Datadog, and database logs.
  • Lead incident response calls, directing triage activities and coordinating cross-team communication for production application defects.
  • Quickly assess severity and business impact, ensuring timely escalation and clear stakeholder updates.
  • Drive or collaborate on post-incident reviews, including root-cause analysis, documentation, and implementation of corrective actions.

Advanced Troubleshooting & Platform Expertise

  • Perform in-depth investigations using database queries (PostgreSQL), log analysis, dashboards, and tracing tools.
  • Identify systemic issues and collaborate with backend, frontend, and data engineering teams to design permanent fixes.
  • Build tooling, scripts, and automated checks to reduce operational overhead and defects.
  • Develop and maintain internal knowledge bases and troubleshooting playbooks.

Operational Excellence & Reliability Improvements

  • Own production runbooks, monitoring strategies, alert thresholds, and operational best practices.
  • Identify inefficiencies in support workflows and propose improvements to tooling and processes.
  • Mentor junior engineers on triage processes, debugging methods, and the client's platform architecture.
  • Ensure alignment with compliance requirements (GDPR, HIPAA equivalents for EU operations, and security best practices).

AWS & Cloud Infrastructure

  • Work with DevOps and engineering teams to maintain the AWS infrastructure supporting the client's applications.
  • Contribute to performance tuning, cost optimization, and cloud resource hygiene.
  • Investigate incidents involving AWS components such as EC2, RDS, S3, ECS, Lambda, CloudWatch, and networking configurations.

Deployment & Release Management

  • Own or support simple deployment processes: validate deployments, perform smoke testing, and ensure rollback readiness.
  • Provide feedback to improve deployment automation and contribute to increasing deployment reliability and speed.
  • Provide guidance and review for changes introduced by development teams to ensure production readiness.

Qualifications

Must-Have

  • 5+ years in Application Support, Production Support, SRE, DevOps Support, or a similar role.
  • Strong SQL skills and experience querying relational databases at scale.
  • Expertise in logs, distributed systems debugging, and incident management practices.
  • Hands-on experience with AWS (CloudWatch, ECS, RDS, S3, Lambda, networking) from a monitoring perspective.
  • Experience with CI/CD pipelines and deployment processes.
  • Ability to lead technical discussions and make decisive calls during critical incidents.

Nice-to-Have

  • Experience working in healthcare, clinical research, or other regulated industries.
  • Familiarity with GDPR-compliant data handling and operational security practices.
  • Prior mentorship or team leadership experience.
  • Exposure to data pipelines and ETL operations.

Additional notes:

  • Hybrid (in office 2 days per week)
  • Ability to remain in a stationary position for extended periods of time. 
  • Ability to communicate information and ideas efficiently and accurately. 
  • Ability to operate a computer and view a screen for extended periods of time.
Company
ChiSquare Labs (UK)
Location
City of London, Greater London, UK
Posted