Date Posted: October 9, 2019
Ref ID: 04310-9502228386
Classification: Site Reliability Engineer
The position requires expertise across different knowledge domains including networking, coding, persistent storage, and continuous delivery and deployment.
• Directly influence the architectural decisions for structuring, managing, monitoring, and securing the Company's worldwide cloud platforms by leveraging your first-hand experience and your knowledge of AWS Cloud best practices
• Design, build, and maintain production systems with high efficiency, availability, and fault tolerance using a combination of software and systems engineering practices
• Engage in and improve the whole life cycle of services from inception and design, through deployment, operation and refinement (continuous improvement)
• Quantify the performance of services by defining KPIs for applications and their supporting infrastructure; measure and monitor application usage, availability, latency, and overall system health
• Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity
• Create and maintain complete and accurate documentation for operational audits, including security and compliance audits
• Participate in sustainable incident response and blameless postmortems
• Address production issues and provide Tier 3 on-call support