Site Reliability Engineer Jobs - VP of Site Reliability Engineering, 16037

at ExecuNet
Location Redwood City, CA
Date Posted September 2, 2019
Job Type Full-time


The Site Reliability Engineering (SRE) team is responsible for running the application and data processing platforms on the Company's global private cloud. While raising the bar on cloud service objectives, the SRE team strives to continually optimize and automate our 24x7 service delivery, which supports Financial Wellness, Wealth, Credit and Developer API capabilities for millions of end users daily. Our financial cloud is constantly growing and evolving in all dimensions: product capabilities, customers and users, countries supported, and delivery automation.
The SRE team is also heavily involved in R&D activities, and plays a key role in new product delivery in partnership with Product, Engineering, and Professional Services teams.

The Company is looking for an energetic, technically skilled, and highly customer-focused VP to manage SRE across all product portfolios. The ideal candidate is a skilled, results-oriented leader who will work closely with cross-functional teams to architect, design, implement and launch new products and services that drive new revenue opportunities at scale. Responsibilities in this position range from driving continual improvement in application monitoring and resiliency capabilities to running projects that increase efficiency in the core platform, while fostering a DevOps culture and mindset as a leader in an ever-growing organization.

You will be part of one of the top technical teams within the Company, a team strongly focused on delivering new products and revenue to production, problem management, supporting a technical shift to CI/CD deployments, and solving complex technical challenges through innovative solutions. This is a great opportunity to further your career in a rapidly growing business unit in Envestnet while building your skills on the latest application technologies and cloud infrastructure.

You will have:

  • Bachelor’s degree in Computer Science, Engineering or Business IT
  • 20+ years of overall technology industry experience
  • 15 years of leadership experience managing technical delivery, DevOps or SRE teams
  • High energy and enthusiasm for learning and for solving problems in complex systems
  • Experience running 24x7 production support at high scale
  • Experience managing and developing large teams in multiple geographic locations
  • Experience running a modern SRE organization, coupled with experience running ITIL processes
  • A DevOps mindset and a focus on delivery automation
  • Strong technical architecture skills
  • Exceptional communication, presentation and customer interaction skills

You will definitely possess these technical skills:

  • Proficiency with developing applications on Linux
  • Extensive experience with Java/J2EE applications
  • Strong SQL knowledge and experience with high-scale RDBMS applications (Oracle preferred)
  • MongoDB or equivalent NoSQL
  • Running monitoring architectures including New Relic, Splunk and Sensu
  • Messaging architectures (IBM MQ and Kafka)
  • Strong general networking skills
  • Big data technologies (Hadoop, Spark) a plus

Nice to have:

  • Performance Engineering experience
  • Data Analytics technology delivery experience

What you will bring:

You are a professional who enjoys people, possesses a great attitude, and is a self-starter with the desire to learn new technologies and systems while continually improving teams. You possess excellent verbal, written, presentation and customer communication skills. You have exceptional leadership, analytical and problem-solving capabilities, coupled with strong business acumen and a results-oriented approach to decision making. You are an industry expert in Site Reliability Engineering who fosters a DevOps culture and has a proven track record of building and running high-performing teams.

Performance objectives:

  1. Exceed monthly Service Delivery SLOs including uptime, restoration times and monitoring detection times
  2. Manage service problems and continually drive down incident volumes
  3. Partner with business unit Product, Engineering and Business teams to design, build and deliver new revenue-generating capabilities with no production disruption
  4. Support migration of application components to Docker containers and CI/CD delivery
  5. Drive service optimization and cost-saving projects
  6. Continually grow the technical and professional capabilities of the SRE team

How your lofty goals will be translated into specific actions and short-term goals:

  1. Within the first 30 days, you will learn the Company's technical systems at a high level and start managing the day-to-day activities of the SRE team
  2. By the end of the first quarter, you will have participated in the full release of a new product feature from architecture to design, engineering, and production
  3. By the end of the first six months, you will have acquired the Company business and technical knowledge necessary to drive the forward-looking vision for the SRE department