Senior site reliability engineer (UK)

Technology Interest: Cloud and Data Center, Networking, Software Development, Testing

Area of interest: Engineer - Software

Job type: Professional

The Meraki cloud serves millions of customer devices from 8 datacentres around the world. As a Senior Site Reliability Engineer on the Observability team you will be responsible for designing useful, scalable and secure monitoring systems that make sure we stay online. You’re passionate about data, and about using automation to raise the bar.

In this role you will join a small engineering team that is based out of our office in London, UK. You will lead the design, development and operational aspects of the monitoring, log/event collection, and metric processing systems which support our private cloud. We believe in automating manual tasks with the right tools.

As SREs at Meraki, we are responsible for building and scaling the cloud that supports millions of Meraki devices across the world. Meraki’s customer base has grown by a factor of 2-3 every year, and we now serve more than 4 billion HTTP requests per day across six datacentres. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras. We embrace the *nix way, automate away tedious tasks and build infrastructure as code.

Example projects of a Senior Site Reliability Engineer (Observability):

  • Lead the discussion around our Graphite architecture to handle the next five years of metric growth.

  • Design and build ElasticSearch clusters holding 10-1000TB of data, for a variety of use cases.

  • Gather requirements for, design and build an alerting system that lets developers construct alerts from multiple data sources and alerting workflows.

  • Develop comprehensive meta-monitoring tools that provide new insights into our complex event and metric pipelines.

  • Write libraries and APIs that provide a simple, unified interface to other developers when they use our monitoring, logging and event processing systems.

  • Automate cluster scaling so monitoring resources can be requested and automatically deployed.

You are an ideal candidate if you:

  • Have 6+ years' experience designing, deploying and operating mid- to large-scale enterprise or cloud environments.

  • Have 3+ years' experience scripting or coding with languages like Ruby, Scala, Python, or Bash.

  • Fearlessly dive into other people's source code to solve a problem.

  • Know your way around *nix systems. We run Debian.

  • Consult with other teams on how they can better monitor their services, and evangelise best practice.

  • Automate all the things.

  • Care about and empathise with the customer experience, and have experience supporting an externally-facing production environment, ideally in a team that follows the sun.

  • Bonus points for experience with: ElasticSearch, Logstash, Kibana, Graphite, Grafana, statsd, collectd, Snowflake, Ansible, Ruby.


Keywords: Observability, Monitoring, SRE, Site Reliability Engineering, DevOps, ElasticSearch, Logstash, Kibana, ELK, Grafana, Graphite, statsd, collectd, Snowflake, Ansible, Ruby.

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.
