digital trends, challenges, solutions, market dynamics, competition, and peer group activities. Understanding of, and ability to articulate, the vision for modern engineering (e.g., agile, cloud-native, DevOps) and operations (e.g., observability, automated response, SRE), and to articulate a path toward a target operating model (people, process, and tools). Required Skills Leadership: Strong leadership skills are essential for guiding teams to …
results that matter. By taking advantage of all structured and unstructured data - securing and protecting private information more effectively - Elastic's complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI. What is The Role The Search Inference team is responsible for bringing performant, ergonomic, and cost-effective machine learning (ML) model inference …
ramp up in highly technical, ambiguous domains. Strong knowledge of REST APIs, distributed system design, and performance optimization. Experience with both SQL and NoSQL data stores, caching layers, and observability tooling (e.g., Prometheus, Datadog). Nice to have: Experience deploying or integrating LLMs or NLP models in production systems. Comfortable balancing short-term execution with long-term architectural thinking.
Top Skills' Details 1) Top Secret Clearance+ (Preference for SCI & CI Poly regardless of Agency) 2) Apache NiFi 3) Containerization Tools Job Description Job Title: DevSecOps Engineer Location: Reston, VA or Charleston, SC Clearance Required: TS/SCI Employment Type …
Description Job Title: DevSecOps Engineer Location: Reston, VA or Charleston, SC Clearance Required: TS/SCI Employment Type: Full-Time Company Overview Echelon Services LLC is a Native Hawaiian-Owned 8(a) small business that delivers mission-critical IT, cybersecurity …
AWS Cloud Engineer We’re seeking a Cloud Engineer to own and scale our AWS-based infrastructure, powering a platform used by millions of cybersecurity individuals. You’ll ensure performance, security, scalability, and cost-efficiency, while enabling fast, reliable deployments …
s about preparing." That's exactly our mission: to guide our clients towards anticipation, preparation, and success in a world that often settles for mere predictions. Through systematic applied observability, we aim to hand them a significant competitive advantage. By leveraging ultimate data insights, we aspire to refine decision-making and boost business operations, all unfolding in real time. And … insights gained from conferences, projects, literature, and market trends. Afterwards, you'll often find us enjoying some good food and drinks together! WHAT DOES YOUR JOB AS AN APPLIED OBSERVABILITY SPECIALIST AT ARCHERS LOOK LIKE As an Applied Observability Specialist, you collaborate closely with our community of integration experts to implement the Archers Connectivity Program for our industry-leading customers … while deep-diving into their unique requirements and objectives. You'll identify and analyze opportunities across various layers of an organization to enhance observability capabilities. You'll capture traces, logs, and metrics from integrated systems to develop appropriate alerting and visualization mechanisms to meet client needs. By designing and implementing effective monitoring and troubleshooting solutions, you ensure seamless integration …
Administer GitLab infrastructure for CI/CD processes. Operate and maintain Kafka clusters for real-time data pipelines. Diagnose and resolve issues across systems, networks, containers, and applications. Use observability tools (Grafana, Prometheus, Kibana, Elasticsearch) to monitor system health. Automate system management tasks using Ansible. Participate in an on-call rotation to support global operations. Required Skills & Experience: Strong hands … system optimization. Production-level experience managing Kubernetes clusters. Proficiency with GitLab for version control and CI/CD workflows. Solid understanding of Kafka in high-throughput environments. Experience with observability tools such as Grafana, Prometheus, Kibana, and Elasticsearch. Expertise in Ansible for automation and configuration management. Strong problem-solving skills across infrastructure layers (compute, network, OS, containers).
infrastructure components including AKS, managed identities, network controls, and secure storage. Manage infrastructure state with Terraform, integrating into our GitOps workflows. API and Automation Development (Java) Security & Compliance Engineering Observability and Incident Management Technical Environment Cloud: Microsoft Azure (AKS, Azure DevOps, storage, networking) Languages: Java (API/tooling), Bash, YAML, Go (optional) CI/CD: GitHub Actions, Argo CD, Terraform … Concourse (legacy) Observability: Datadog, custom metrics ingestion Source Control: GitHub Enterprise Orchestration: Kubernetes (Helm-based), Argo CD, Terraform What You’ll Bring: Deep experience designing and maintaining Azure-based infrastructure at scale. Solid engineering skills in Java, particularly for backend systems and automation tools. Proven ability to re-architect CI/CD systems, with hands-on experience in GitHub Actions … Background in financial services, trading infrastructure, or regulated environments. Experience with GitHub Enterprise, Argo CD patterns, or Kubernetes policy enforcement. Contributions to open-source tooling in CI/CD, observability, or platform engineering. This is a hands-on, high-leverage role for engineers who want to build resilient systems and own the tooling that powers real-time trading infrastructure. Apply …
spirit. Responsibilities: Define and enforce SLOs, SLIs, and error budgets across critical services. Develop and implement cloud infrastructure and tooling strategies. Enhance SRE practices across the organization. Implement robust observability (metrics, logs, and traces) using our observability tools. Guide the team in building automated, self-healing systems. Own and evolve incident response processes, including on-call practices and post-mortem … with AWS core services (EC2, EKS, RDS, S3, ALB/NLB, IAM, CloudWatch, etc.). Proficiency in Infrastructure as Code using Terraform and knowledge of GitOps workflows. Strong background in observability: metrics, visualization, logging, tracing. Understanding of automation, CI/CD pipelines, deployment automation, and release strategies. Experience with incident management, disaster recovery, root cause analysis, and post-incident reviews. Additional …
and hybrid retrieval mechanisms. Implement evaluation frameworks (BLEU, ROUGE, hallucination checks) to monitor answer quality. Deploy production systems on GCP (Cloud Run, Vertex AI, BigQuery, Pub/Sub). Own observability, IaC (Terraform), and CI/CD (GitHub Actions) pipelines. Collaborate with product, mobile, and clinical experts to ship weekly improvements. Ensure compliance with data privacy standards (GDPR, NHS DSPT). Who … or recommender systems at scale. Deep knowledge of embeddings, LLM-based retrieval, and vector similarity search. Hands-on with GCP (or AWS/Azure), Terraform, CI/CD, and observability. Strong communicator, product-minded, and thrives in fast-paced startup environments. UK-based and available to work 2–3 days per week in-office (London). Bonus Points Experience in healthcare …
platforms. Be part of a mission-led company delivering smart, sustainable solutions for thousands of users across the UK. Work in a forward-thinking engineering culture that embraces automation, observability and modern dev practices. Tech Stack You'll Work With Core: Java (essential), JavaScript or TypeScript (bonus) Performance Testing: Custom frameworks, traffic analysis, monitoring tools Testing Tools: Playwright, Cypress or … backend testing, APIs, CI/CD, and data-driven testing A collaborative mindset: someone who enjoys mentoring, problem-solving and working closely with devs and stakeholders Nice to Have Familiarity with observability tools, logging, and analysing system behaviour in production Experience with cloud environments (AWS preferred) and containerised apps (Docker or Kubernetes) Exposure to JavaScript/TypeScript and frontend automation tools Working Model Based …
platforms. Be part of a mission-led company delivering smart, sustainable solutions for thousands of users across the UK. Work in a forward-thinking engineering culture that embraces automation, observability and modern dev practices. Tech Stack You'll Work With Core: Java (essential), JavaScript or TypeScript (bonus) Performance Testing: Custom frameworks, traffic analysis, monitoring tools Testing Tools: Playwright, Cypress or … backend testing, APIs, CI/CD, and data-driven testing A collaborative mindset: someone who enjoys mentoring, problem-solving and working closely with devs and stakeholders Nice to Have Familiarity with observability tools, logging, and analysing system behaviour in production Experience with cloud environments (AWS preferred) and containerised apps (Docker/Kubernetes) Exposure to JavaScript/TypeScript and frontend automation tools Working Model Based …
tools to manage a large-scale, multi-vendor network with an emphasis on automation, telemetry, and model-driven infrastructure as code. Automate the full network lifecycle, including provisioning, configuration, observability, testing, troubleshooting, and capacity planning. Collaborate with architecture and design teams and the CTO office to implement new technologies that ensure scalability, efficiency, and operational resilience. Develop tools and platforms … that enhance the observability, reliability, and performance of the production network. Enhance existing monitoring and observability frameworks, integrating intelligent alerting and self-remediation capabilities to reduce manual intervention and improve incident response. Define and measure service-level objectives (SLOs) to track infrastructure performance and reliability. Write software utilizing orchestration systems to automate tasks and interact with other systems. Provide mentorship …
ll do: Build production-grade AI/ML systems (NLP, OCR, LLM-based). Develop and manage robust MLOps pipelines and infrastructure. Optimise models for performance, scalability, and cost. Deploy observability tools (OpenTelemetry, Grafana). Collaborate with software engineers to embed AI into the product using APIs. Create real-time, context-rich inference systems and retrieval-aware repositories. What you'll accomplish … in your first 90 days: Deliver a rule-evaluation microservice using JSONLogic/CEL + MongoDB. Integrate Pub/Sub sinks and land events in BigQuery. Launch observability stack with dashboards + alerting. We’re looking for someone smart, intense, focused, and ready to build things fast and well. Location: Remote or flexible (we work async and move quickly) Start …
Expertise with trading infrastructures and protocols (FIX, Market Data, Order Entry), and the demands of high-frequency and algorithmic trading environments. Proficiency in Linux systems, containers, and cloud-native observability stacks. Desirable Skills: Experience with Corvil, Pico tools, or similar for network telemetry ingestion. Exposure to observability platforms such as ITRS Geneos. Experience in data science integration, machine-learning models …
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Experience leading/managing junior engineers. Significant experience with Control Tower and deploying landing zones.