Senior Manager, Software Deployment and Site Reliability
Time type: full time
Posted on: January 23, 2024
Job requisition id: R3038
What we need
We are looking for a Senior Manager, Software Deployment and Site Reliability. You will lead a Software Deployment team, execute and maintain best practices, and drive improvements toward scale across the company based on feedback from customers and employees. You will be responsible for ensuring exemplary customer service, managing direct reports, using customer feedback to make data-driven recommendations that improve the customer experience, and ensuring a well-run software deployment team overall.
What we do
The Software Deployment and Site Reliability team is part of the Core Systems Software Engineering organization and is responsible for software and systems deployments, prompt resolution of critical issues, and scaling to many customers and sites. We solve some of the toughest operational challenges in some of the most sensitive, mission-critical automated warehouse solutions.
What you’ll do
Manage and coach direct reports, including reviewing metrics and performance development.
Adhere to our core KPIs and implement daily, weekly, and monthly performance tracking for agents.
Work with other managers to plan and schedule staffing and shifts based on forecasts and ticket volumes.
Recruit, interview, train, and onboard new employees.
Answer questions, assist with difficult technical customer situations, and handle escalated situations.
Understand the end-to-end process of Symbotic software and assist in triaging and resolving issues and project tasks.
Advocate for the customer in technical cross-functional initiatives and communicate the impact on customers and the team.
Understand and communicate with all departments to ensure alignment on messaging and customer and provider expectations.
Engage with the client project team and contribute to SOPs and workshops to define business requirements.
Own the standardization and adoption of monitoring tools for the infrastructure department, including the Platform, Database, Reliability, and Operations teams.
Lead efforts to significantly reduce software deployment time, perform multiple deployments in parallel for scale, and cut the time required to shut down or restart full structure systems by 50% or more.
Occasional travel of up to 15% may be required.
What you’ll need
Bachelor’s degree in Computer Science, Information Systems, or similar discipline.
10+ years of experience in software engineering focusing on software applications, rapid response, customer-enabling engineering, deployment, support, and automation, with at least 5 years of management experience.
Comfortable with contact center software, building interfaces, trend reports, and presenting findings.
Strong ability to problem-solve technical issues and handle multiple high-priority tasks.
Data-driven and comfortable translating customer responses into actionable items.
Excellent verbal and written communication skills – both understanding the customer and working cross-functionally internally.
Self-starter, quick learner, and calm under pressure.
Experience in observability/monitoring and SRE is required.
Experience managing quality assurance for object-oriented, microservices-based development environments on Kubernetes/Linux systems is required.
Experience with RabbitMQ, Kafka, and Redis is a plus.
Experience with log collection and storage (Splunk, Datadog, Sumo Logic, Loki, ELK) is a plus.
Available to work and support ad hoc events.