Data Center Capacity Manager
Meta is seeking a leader and people manager who will drive excellence in engineering, analytics, planning and operations to ensure successful delivery and performance of our production infrastructure capacity. Strategic thinking, technical depth, and proven business acumen are essential in this role. Depth and breadth of knowledge in managing large operations, with a strong emphasis on supporting people and culture, are core competencies of this individual.
The Capacity Team is responsible for managing the growth and life-cycle of computing resources and data center capacity at Meta, as part of a global production footprint. Collectively and globally, the Data Center Capacity team plans and builds one of the largest Internet services in the world, with tens of billions of user requests, tens of exabytes of data, and thousands of gigabits per second of network flow, while maintaining operational and resource allocation efficiency. The Data Center Capacity Manager leads the team that drives planning, ownership and delivery of capacity within our data center locations. Strong project management experience and infrastructure engineering knowledge are required, as this team manages complex infrastructure projects within the data center, often in parallel, including capacity planning, installations, retrofits, and service migrations. This person should also have strong people management, leadership and people development experience. This role will interact closely with many cross-functional partners, including capacity and performance engineers, data scientists, optimization and process engineers, capacity planners, supply chain, logistics, finance, data center construction, facility operations, security, network engineering, network operations, hardware engineering, NPI, software engineering and systems & tooling engineering.
Data Center Capacity Manager Responsibilities:
- Lead the Data Center Capacity Team that drives both strategic and tactical data center capacity planning, scaling, and management in a production Meta data center, as part of one of the largest infrastructure fleets in the world.
- Deliver complex data center infrastructure capacity, taking into account the interdependencies of production resiliency, power, cooling, network, server and application layers, throughout the data center life-cycle.
- Leverage deep technical expertise to develop new data center capacity strategies that focus on scaling capacity, driving improvements in reliability, and achieving cost and operational efficiency, including reductions in deployment times, improvements in data center power utilization, service placement optimizations, and lifecycle total cost of ownership savings.
- Own data center capacity deployment and engineering strategies at your data center site, including direction for lifecycle management, optimal hardware spread, and buffer management, ensuring that software platform resiliency and service delivery goals are achieved.
- Hire, build and lead a diverse, world-class data center operations and engineering team, developing both the technical capabilities and leadership qualities of engineers through mentorship, guidance and career development.
- Encourage your team to develop the next-generation data-driven systems to automate capacity life-cycle management at scale, across a global fleet. Define the project roadmap to automate and improve our at-scale capacity management solutions.
- Partner with Data Science, Capacity Planning, Performance Engineering, Supply Chain, Logistics, Data Center Construction, Facility Operations, Network Engineering, Network Operations, Security, Hardware Engineering, and Software Engineering to optimally scale Meta infrastructure.
- Partner with Meta hardware and software engineering teams and vendors to help resolve complex technical issues that affect Meta's computing infrastructure.
- Ensure excellence in operational delivery through partnership with both remote and local peer organizations to support ongoing business growth.
- Understand team and data center performance through data trending, analysis and interpretation of systemic issues that impact fleet uptime and utilization. Ensure delivery of root cause analysis of complex technical and engineering issues and drive resolution.
- Build cross-functional relationships and influence policies and procedures to improve regional/global data center operations.
- Create and drive a culture of safety, ownership, innovation, collaboration and accountability.
- Be forward-thinking by understanding data center growth and identifying scaling issues before they occur.
Data Center Capacity Manager Qualifications:
- BS, BA, or BEng or commensurate experience.
- Substantial experience in managing technical teams and leading engineers and technicians.
- Project management and delivery experience.
- 4+ years of experience managing technical teams, leading, training and mentoring engineers.
- 5+ years of experience in a combination of capacity planning, demand and supply management, production planning, operations planning or infrastructure management.
- Proven track record setting goals, tracking progress, and growing individuals.
- Knowledge of enterprise level networking, server and storage installs.
- Experience in the application of data-driven continuous improvement through Lean Six Sigma data and process analysis, visualization and modeling using Excel, Tableau and Minitab.
- Comprehensive cross-discipline engineering and technical knowledge of data center infrastructure systems and applications, or of a directly comparable industry such as pharmaceuticals, nuclear, or large-scale manufacturing.
- Experience in communicating with and managing cross-functional relationships across stakeholders and presenting to senior executives.
- English as a working language. Effective communication skills.
- Agile, PRINCE2 or PMP certifications.