Site Leader
Weekday AI • Poland
Posted: April 1, 2026
Job Description
This role is for one of Weekday's clients.
Min Experience: 10 years
Location: Poland, Remote (Poland)
Job Type: Full-time
We are seeking a highly experienced and driven Site Leader with a strong background in Site Reliability Engineering (SRE) and Infrastructure to lead and scale our engineering operations. This role is ideal for a seasoned Engineering Manager who thrives at the intersection of leadership, system reliability, and large-scale infrastructure management. As a Site Leader, you will be responsible for building resilient systems, managing high-performing teams, and ensuring the availability, scalability, and performance of mission-critical platforms.
Key Responsibilities
- Lead and manage SRE and Infrastructure teams, driving operational excellence and fostering a culture of reliability and accountability.
- Define and execute the overall infrastructure and reliability strategy aligned with business goals.
- Oversee the design, deployment, and maintenance of scalable, highly available, and secure systems.
- Establish and monitor SLAs, SLOs, and SLIs, ensuring consistent service performance and uptime.
- Drive incident management processes, including root cause analysis, postmortems, and continuous improvement initiatives.
- Collaborate with product and engineering teams to embed reliability and scalability into the development lifecycle.
- Champion automation, observability, and proactive monitoring to minimize downtime and improve system health.
- Manage infrastructure costs, capacity planning, and resource optimization.
- Mentor and develop engineering managers and senior engineers, building a strong leadership pipeline.
- Ensure adherence to best practices in cloud infrastructure, DevOps, and security compliance.
Required Skills & Qualifications
- 10–15 years of experience in software engineering, infrastructure, or SRE, including at least 3 years in an Engineering Manager or leadership role.
- Proven expertise in Site Reliability Engineering (SRE) principles, including reliability, scalability, and fault tolerance.
- Strong experience with cloud platforms (such as AWS, GCP, or Azure) and modern infrastructure architectures.
- Deep understanding of infrastructure as code (Terraform, CloudFormation), CI/CD pipelines, and containerization technologies (Docker, Kubernetes).
- Demonstrated ability to lead and scale distributed engineering teams.
- Strong problem-solving skills with a focus on system-level thinking and root cause analysis.
- Experience with monitoring and observability tools such as Prometheus, Grafana, ELK stack, or similar.
- Excellent stakeholder management and communication skills, with the ability to influence cross-functional teams.
Preferred Qualifications
- Experience managing large-scale, high-traffic production systems.
- Background in DevOps transformation and cloud-native architecture.
- Familiarity with security best practices and compliance frameworks.