Site Reliability Engineer, Infrastructure (Remote)
Location: Remote - US only
Job Type: Full time
In Meraki SRE we build the highly scalable cloud infrastructure that supports millions of Meraki devices worldwide. Meraki’s customer base has grown by a factor of 2-3 every year, and our cloud serves more than 8 billion HTTP requests per day across ten data centers! Our customers depend on our products to run their critical infrastructure of network switches (now including Cisco Catalyst in addition to the Meraki switches), security appliances, wireless APs, security cameras and sensors. In addition, Meraki offers a range of SaaS solutions on its Dashboard that greatly improve the insight and experience of IT departments around the globe.
The Infrastructure SRE team is responsible for the compute, storage and security underpinning Meraki's cloud in ten data centers worldwide. Meraki's high growth rate means our processes must be automatic and efficient, never driven manually. Automation, monitoring and a keen eye for technical debt are key.
As a member of the team, you will craft and develop the global infrastructure which supports our cloud; this might mean deploying new infrastructure management technologies at scale, writing code and using workflow orchestrators to improve our provisioning and decommissioning processes, or building models to predict business demand. You will also work closely with our vendors and internal Datacenter Operations team. We follow the *nix way (build large systems out of small components that each do one job and do it well), run Debian and Ubuntu, automate tedious tasks and work almost entirely with infrastructure-as-code.
This role is to support a specific customer, and there is a 24/7 on-call requirement as part of a rotation. You will work with your team to deliver technical projects to support the wider business while spending a portion of your time working cross-team to support this critical customer.
- Deploying and running IaaS solutions that let teams run seamlessly between our private cloud and AWS.
- Using and running a workflow orchestrator (e.g. Luigi, Apache Airflow or Argo) to continuously reduce any manual work required to run our infrastructure.
- Security hardening for services, packages and processes, including OS images.
- Building an automatic service lifecycle platform to coordinate the full lifecycle of all infrastructure (server, storage, network and site).
- Deploying comprehensive monitoring tools to provide insight into the performance and reliability of our infrastructure.
- Automating testing infrastructure to accelerate the velocity at which we can deploy changes.
You are an ideal candidate if you:
- Enjoy and have experience leading large technical projects - particularly working with cloud systems, networking, distributed systems, or data processing frameworks (ETL pipelines).
- Have at least 2 years of experience.
- Have experience running and/or developing highly scalable IaaS solutions.
- Have experience with automation tools such as Ansible and Terraform as well as container technologies (Docker or similar).
- Script or code in one or two languages such as Ruby, Scala or Python. Enjoy digging into other people’s source code (even if you don't know the language) in search of the root cause of a problem. Instinctively write code to deploy and automate infrastructure.
Bonus points for:
- Exciting personal projects or contributions to open-source.
- Experience working with ETL pipelines.
- Experience deploying HA services on Kubernetes (K8s).
- Experience working with Cellery.
Cisco Covid-19 Vaccination Policy
The health and safety of Cisco's employees, customers, and partners is a top priority. Our goal is to protect and mitigate the spread of COVID-19 infection for strong business resiliency during the pandemic. Therefore, Cisco may require new hires to be fully vaccinated against COVID-19 if the role requires business-related travel, meeting with customers/partners (including visiting third-party sites on behalf of Cisco), attending trade events, and Cisco office entry, unless otherwise prohibited by applicable law, and in countries where COVID-19 vaccination is legally required. The company will consider legally required accommodations/exceptions for medical, religious, and other reasons as per the requirements of the role and in accordance with applicable law. Additional information will be provided to candidates about the requirements and accommodation process at the offer time based on region.
Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.
At Cisco Meraki, we’re challenging the status quo with the power of diversity, inclusion, and collaboration. When we connect different perspectives, we can imagine new possibilities, inspire innovation, and release the full potential of our people. We’re building an employee experience that includes appreciation, belonging, growth, and purpose for everyone.