processes Strong Linux skills Create and maintain custom automation scripts using Python, Bash, or similar tools. Monitoring and Optimization: Monitor system health, resource usage, and costs to ensure high performance and operational efficiency. Apply cost optimization strategies and identify opportunities to streamline AWS operations. Standards and Best Practices: Develop and enforce operational standards and best practices for cloud deployments. … CloudFormation, AWS CDK. Familiarity with the Atlassian suite agile methodologies Experience scripting in Python, Bash, or other languages for cloud automation. Proven ability to implement cloud cost optimization and performance tuning. Strong communication and documentation skills. U.S. Citizenship with the ability to obtain a Public Trust clearance. Preferred Qualifications: AWS certifications such as AWS Solutions Architect Professional and/ More ❯
expert across a diverse, modern, and complex technology landscape, ensuring seamless support and smooth operations. You'll take charge of incident triage and resolution, lead system upgrades, and keep performance optimized through proactive monitoring and alerting. Beyond day-to-day support, you'll drive continuous improvements in processes and tools, collaborating with stakeholders to elevate operational excellence and keep … Plan and execute minor upgrades and configuration changes with minimal impact to the production environment. Maintain and optimise system configurations across legacy and modern applications to ensure their continued performance and reliability. System Monitoring & Performance: Maintain and improve logging, monitoring, and alerting systems. Define service-level objectives and indicators for business applications. Continuously review performance metrics against … SLO/SLIs and proactively address performance bottlenecks or underperforming systems. Manage system health checks, review logs, and act on warnings to avoid potential disruptions. Provide regular updates to Technology leadership members and key business stakeholders. Work alongside Engineering teams to enhance system resilience, focusing on high-availability, scalability, and reliability. Support and contribute to the development of runbooks More ❯
Annapolis Junction, Maryland, United States Hybrid / WFH Options
GTSC Talent Solutions
Tools Management Specialist System Engineer to support our customer in the Annapolis Junction, MD area. You will be involved in service monitoring and Management, continuously monitoring the availability and performance of Splunk, SCCM, Micro Focus, and 1E. You will manage and execute patching activities across the enterprise. You will work with performance analytics to aid you in … making recommendations for performance improvements and security enhancements. You will play a key role in issue remediation, and security compliance. You will also document your work and prepare and present reports to management. Mission: As a System Engineer on this government program, you will be supporting important systems to support efforts to keep our country safe. Requirements: U.S. Citizenship … cloud security principles and best practices Excellent communication skills, both verbal and written, with the ability to explain technical concepts to non-technical stakeholders. Experience with monitoring tools and performance tuning. Familiarity with security compliance standards and best practices. Patch Management experience. Good Problem Solving and Critical Thinking Skills Desired Requirements Experience with scripting and automation (e.g., PowerShell) Mission More ❯
deep SQL expertise with advanced Power BI capabilities to support mission-critical business logic, reporting systems, and strategic insights. You will work with complex hierarchical data models, optimize database performance, and develop impactful dashboards that empower stakeholders across the organisation. Key Responsibilities SQL Development & Database Administration Design and implement efficient SQL queries and database structures to support complex business … processes and hierarchical data relationships. Develop recursive queries and CTEs to traverse and manipulate hierarchical data. Optimise stored procedures and troubleshoot performance issues using execution plans and query tuning techniques. Collaborate with developers and analysts to translate business requirements into efficient database logic. Enforce data governance, integrity, and consistency across database systems. Power BI Analytics Design and maintain … Gather and translate complex business requirements into technical specifications. Promote best practices in data governance and maintain documentation for report logic and dashboard functionality. Optimise Power BI solutions for performance and scalability. Strategic & Cross-Functional Collaboration Support re-platforming initiatives with data-driven insights. Mentor junior analysts and contribute to a culture of continuous learning. Identify opportunities for new More ❯
Central London, London, United Kingdom Hybrid / WFH Options
Singular Recruitment
collaborate with some of the most well-known footballers in the industry. This position offers significant opportunities for professional growth within sports analytics and the potential to impact sports performance through advanced technology, making it an ideal setting for those passionate about leveraging cutting-edge technology to make meaningful contributions in the world of sports analytics. Key responsibilities for … years industry experience in a Data Engineer role and a strong academic background Python & SQL: Advanced-level Python for data applications and high proficiency SQL for complex querying and performance tuning. ETL/ELT Pipelines: Proven experience designing, building, and maintaining production-grade data pipelines using Google Cloud Dataflow (Apache Beam) or similar technologies. GCP Stack: Hands-on expertise … experience building real-time data pipelines. DevOps & MLOps: Knowledge of Infrastructure as Code (Terraform), MLOps principles, and containerization (Docker, Kubernetes). What They Offer Work that impacts elite football performance and club-wide success Access to real-world sports data and performance analytics Flexible working options (hybrid/remote depending on role) Opportunity to grow with a digital More ❯
Washington, Washington DC, United States Hybrid / WFH Options
Halvik
database, web, and GUI components. Perform logical and physical database designs, ensuring efficiency, scalability, and operational reliability. Review and evaluate existing database structures, providing recommendations for enhancements to improve performance and long-term sustainability. Develop and maintain internet applications in a Microsoft environment, leveraging SQL Server (including stored procedures), Visual Studio, and IIS. Conduct data analysis and transform business … least 5 years of experience in systems development, testing, and implementation of integrated solutions. Demonstrated expertise in SQL Server development, including stored procedures, logical/physical database design, and performance tuning. Experience with Visual Studio and IIS in a Microsoft web environment. Strong knowledge of data analysis, database development, and operations and maintenance activities. Solid understanding of data processing More ❯
Chantilly, Virginia, United States Hybrid / WFH Options
Gridiron IT Solutions
processes Strong Linux skills Create and maintain custom automation scripts using Python, Bash, or similar tools. Monitoring and Optimization: Monitor system health, resource usage, and costs to ensure high performance and operational efficiency. Standards and Best Practices: Develop and enforce operational standards and best practices for cloud deployments. Documentation and Support: Create and maintain comprehensive documentation for AWS infrastructure … as-code tools such as Terraform (preferred), CloudFormation, AWS CDK. Familiarity with Jira Experience scripting in Python, Bash, or other languages for cloud automation. Proven ability to implement cloud performance tuning. Strong communication and documentation skills. U.S. Citizenship with the ability to obtain a Public Trust clearance. Preferred Qualifications: AWS certifications such as AWS Solutions Architect Professional and/ More ❯
systems. They will work closely with clinical, operational, IT and BI colleagues to ensure data solutions meet the evolving needs of the organisation, including statutory reporting, service planning, and performance monitoring. This role requires a strong understanding of data architecture, ETL processes, and business intelligence tools, along with a commitment to data quality, security, and governance. The successful candidate … warehouse and its associated systems, ensuring they are robust, scalable, and aligned with the evolving needs of the organisation. Proactively identify, manage, and deliver development projectsto enhance the functionality, performance, and usability of the data warehouse and reporting infrastructure. Maintain and validate data systemsto ensure they remain accurate, reliable, and compliant with national and local information governance standards, including … of patient based information systems Experience of line management and providing support to other colleagues Knowledge Essential Strong proficiency in T-SQL for writing complex queries, stored procedures, and performance tuning. Familiarity with Azure Data Factory for ETL/ELT processes. Understanding of the importance of data quality, confidentiality and security and Information Governance principles Knowledge of Power BI More ❯
Essex Hybrid The Opportunity We are seeking a highly motivated and experienced Site Reliability Engineering (SRE) Manager to lead a team of SREs responsible for the reliability, scalability, and performance of our production systems. This role is pivotal in bridging the gap between development and operations, ensuring our systems are robust, automated, and continuously improving. You will be supporting … To Do: Team Leadership & Development Lead, mentor, and grow a high-performing team of SREs. Foster a culture of ownership, continuous learning, and operational excellence. Drive career development and performance management for team members. Reliability & Performance Own the availability, latency, performance, and capacity of critical services. Define and monitor SLAs, SLOs, and SLIs to ensure service reliability. … engineering, cloud platforms (AWS, Azure), and container orchestration (Kubernetes). Proficiency in monitoring, alerting, and incident management tools (Prometheus, Grafana, PagerDuty). Solid understanding of networking, distributed systems, and performance tuning. Excellent communication, leadership, and stakeholder management skills. Preferred Qualifications Experience in a high-scale, high-availability SaaS environment. Familiarity with security and compliance in cloud-native environments. Why More ❯
the roleTo apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. AccountabilitiesAvailability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from … recurring.Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomesIf the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment More ❯
the roleTo apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. AccountabilitiesAvailability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from … recurring.Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomesIf the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment More ❯
the roleTo apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. AccountabilitiesAvailability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from … recurring.Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomesIf the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures If managing a team, they define jobs and responsibilities, planning for the department's future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … improvements and escalate breaches of policies/procedures.. If managing a team, they define jobs and responsibilities, planning for the department’s future needs and operations, counselling employees on performance and contributing to employee pay decisions/changes. They may also lead a number of specialists to influence the operations of a department, in alignment with strategic as well More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an More ❯
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents … from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development … using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an More ❯
computer systems, including servers, storage area networks and network attached storage on five enclaves. • Perform SharePoint Server (2019) on-prem administration including installation, configuration, migration, and services management. • Provide performance monitoring and tuning, upgrades and patching, backup and recovery, as well as troubleshooting and problem resolution. • Develop and maintain schedules and plans for system updates and patching. • Audit More ❯