Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
The Channel Recruiter
requests, and ensuring customers receive an excellent level of service in line with SLAs. This is a varied role that covers End User Support, Infrastructure, Networking, Cloud Services, and proactive monitoring. While you don’t need to be an expert in every area, you should be eager to learn, adaptable, and passionate about delivering first-class technical support. Established … bring that positive impact home. What You’ll Do: L2 Service Desk Analyst Investigate and resolve incidents and service requests in line with ITSM processes. Respond to alerts from monitoring tools (e.g., Logic Monitor) and address underlying issues. Implement technical changes, preparing and presenting to CAB when required. Support patch management, event management, and proactivemonitoring activities. More ❯
Strong skills in scripting languages (e.g., Python, Bash) to automate repetitive tasks and knowledge of configuration management tools (e.g., Ansible, Puppet, Chef). Expertise in setting up and maintaining monitoring systems (e.g., Prometheus, Grafana). Some other highly valued skills may include: Experience with cloud platforms (e.g., AWS, Azure, Google Cloud). Knowledge of containerization and orchestration tools (e.g. … practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
Strong skills in scripting languages (e.g., Python, Bash) to automate repetitive tasks and knowledge of configuration management tools (e.g., Ansible, Puppet, Chef). Expertise in setting up and maintaining monitoring systems (e.g., Prometheus, Grafana). Some other highly valued skills may include: Experience with cloud platforms (e.g., AWS, Azure, Google Cloud). Knowledge of containerization and orchestration tools (e.g. … practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
Redis, GridGain, Apache Ignite Programming languages: Java, Python, Go Lang Container orchestration/Cloud platform: RedHat Openshift/AWS/Azure DevOps tools - Ansible, Chef, Kubernetes, GitLab SRE logging & Monitoring Tools - ELK stack, Grafana, Prometheus, Open Telemetry Other highly valued skills include: Strong understanding of Agile application development methodology. Strong knowledge of API development/principles Collaborating with the … practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
CI or Jenkinsto enable fast, secure, and reliable software delivery. o Champion Kubernetes-based platformsusingAmazon EKSandIstio Service Meshto build scalable, service-oriented architectures. o Drive observability and reliability engineeringthrough proactivemonitoring, alerting, and incident response strategies. o Mentor and guide DevOps engineers, fostering a culture of continuous improvement, automation, and operational excellence. o Collaborate cross-functionallywith development, security … experience. We're looking for someone with deep expertise in: oInfrastructure as Code: Terraform, CloudFormation o Security best practices: IAM, KMS, encryption in transit/at rest, DevSecOps o Monitoring & observability: Datadog, Prometheus, Grafana, ELK, or similar What You Bring o 6+ years in DevOps or platform engineering, with experience in a technical lead role. o Proven experience designing More ❯
practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactivemonitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts … to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure More ❯
Chesterfield, Derbyshire, East Midlands, United Kingdom
Baltic Apprenticeships
team as an IT Apprentice. Founded in 2009 and headquartered in Chesterfieldwith additional offices in London, Manchester, and SheffieldAAG supports businesses across the UK with tailored IT solutions, including proactive network monitoring, cyber security, cloud services, and 24/7 managed IT support. In this hands-on role, the apprentice will assist with password resets, hardware setups, software More ❯
all IT enquiries, both via phone and the internal helpdesk system Accurately logging all support calls into the helpdesk system, capturing relevant information and assigning appropriate priority levels Proactively monitoring the helpdesk system for new tickets, ensuring timely responses and follow-ups for support or additional details Troubleshooting and resolving issues related to hardware, networking, and software Keeping users … the ability to quickly learn and adapt to new technologies (full training provided as needed) Highly organised and methodical , with the ability to remain focused and efficient under pressure Proactive and adaptable , able to multi-task effectively and thrive in a collaborative team environment Flexible and reliable , with a willingness to support extended hours and on-call duties when More ❯