Date Posted: June 6, 2019
Site Reliability Engineer
Location: Philadelphia, PA 19103
- This is an opportunity to work in a department that provides back-end services to teams whose applications are used by tens of millions of people.
- As a member of the SRE team, you will partner with development teams to increase the deployment speed, reliability, and availability of applications consumed internally by other development and operations teams. The right candidate will be passionate about automation, reduction of toil, and the measurement of all things (logging, alerting, health checks).
- Our team has a 'Cloud First' approach, so the candidate must understand the inner workings of the cloud from the standpoint of reliability, distributed communication, and security.
- A background in development is preferred, since the right candidate should have a mindset of automating things programmatically. We do not fix things once; we fix things for good.
- Our team is responsible for on-call support of the applications we manage.
- Contribute to a team that is at the forefront of the SRE practice at Comcast
- Ideate, design, engineer, and implement systems and solutions at a scale spanning regions, providers, and business verticals
- Improve service reliability through effective use of monitoring, alerting, break-fix, blameless post-mortems, and engineering of long-term fixes
- Uncover sources of toil and promote their automation by persuading your teammates to adopt the effective solutions you propose
- Provide documentation that allows a reasonably skilled but inexperienced individual to make it through an on-call rotation without help from a teammate