We are looking for an experienced Site Reliability Engineer Architect to join our Infrastructure Engineering group. We strive to build the most reliable and performant systems for our customers and engineers.
We are looking for an Architect who has experience with, and a passion for, designing and running complex, large-scale services on one or more public cloud platforms. This role requires collaboration with the Okta Software Engineering and Site Reliability Engineering teams to provide solutions that improve their productivity as they build, manage, and run their teams’ services on the Okta infrastructure with high availability, reliability, and performance. The ideal candidate welcomes this challenge and enjoys seeing their designs run at scale with automation, testing, and tuning. If you exemplify the ethic of "If you have to do something more than once, automate it," we want to hear from you!
What You'll Do:
- Lead initiatives to build Okta's production infrastructure with a focus on automation and scale for multiple public clouds
- Promote and apply best practices for building scalable and reliable services across engineering
- Be a subject matter expert with public cloud infrastructure and how Okta services can run on them efficiently and at scale
- Design, build, run and monitor Okta's production infrastructure
- Drive initiatives to evolve our current platform to increase efficiency and keep it in line with current standards and best practices
- Respond to production incidents and determine how we can prevent them in the future
- Identify and automate manual processes
- Be a Product Owner for the infrastructure roadmap and prioritized backlog
- Bring clarity to design and solution discovery processes and guide teams on how to solve complex problems in as simple a way as possible
- Lead development and deliver solutions that serve as a model for others with regard to execution, quality, scalability, operability, and maintainability
- Define the technical vision and architecture, and effectively drive towards this vision
- Communicate and collaborate across levels, functions and organizational boundaries
- Mentor and coach junior engineers to leverage their full potential
Qualifications for the role:
- Track record of leading successful large-scale infrastructure projects
- 8+ years of experience designing and running large-scale solutions on public cloud
- 2+ years of experience with Docker, Kubernetes or cloud-managed Kubernetes, and service mesh
- Possess knowledge in network and edge technologies
- Demonstrate strong Linux fundamentals
- 3+ years of experience with automating systems and infrastructure via Terraform
- Experience automating and running large scale production services in public cloud providers
- Ability to code to a good standard in a programming language, using standard software development practices like unit testing and iterative development
- Experience working with Agile methodologies
- Champion excellent documentation and communication skills, with the ability to influence others
Education and Training:
- B.S. in Computer Science (a plus) or relevant experience
Okta is rethinking the traditional work environment, giving our employees the flexibility to be the most creative and successful versions of themselves, no matter where they are located. We enable a flexible approach to work, meaning you can work from the office or from home, regardless of where you live. Okta invests in the best technologies and provides flexible benefits and collaborative work environments and experiences, empowering employees to work productively in the setting that best suits their needs. Find your place at Okta: https://www.okta.com/company/careers/
Okta is an equal opportunity employer.