Site Reliability Engineer Jobs - Lead SRE, 15721

at Hunter Technical Resources
Location Atlanta, GA
Date Posted July 9, 2019
Category Default
Job Type Full-time

Description

Job Description:

  • Responsible for managing teams of IT resources in dynamic, evolving, complex environments across multiple platforms with numerous dependencies. 
  • Responsible for establishing end-to-end monitoring and alerting on all critical aspects of all systems, to ensure SLAs are met and to receive proactive notification of possible issues
  • Have a customer-centric focus while working with internal and external stakeholders
  • Responsible for blameless postmortems and proactive identification of potential outages, both of which factor into iterative improvement
  • Automate build, packaging, testing and deployment processes using tools and/or scripts 
  • Implement best practices and standards in CI/CD pipeline automation
  • Review deployment procedures and execute deployments that ensure environment reproducibility, integrating gating standards to ensure quality production releases
  • Emphasizing automation, implement processes, procedures and best practice guidelines for code management 
  • Responsible for ensuring the success of frequent releases to the SDLC environments
  • Strong communication and team building skills 
  • Ability to work in a team environment or independently with little guidance based on assignment 
  • Guide and mentor lead and junior engineers/developers in automating infrastructure and application builds and deployments

Qualifications 

  • Installing, maintaining, and administering software on Linux and Windows servers
  • Experience designing and deploying multi-data-center, large-scale web applications
  • Work closely with dev and ops teams to build highly available, cost-effective systems
  • Create new tools and scripts designed for auto-remediation of incidents 
  • Design/Implement containers/applications in scalable HA/DR multi-tier cloud environments, including new system design, documentation, implementation, and deployment 
  • Experience providing L3 technical support for production 24x7x365
  • Experience monitoring infrastructure and applications, maximizing system uptime and availability, and ensuring functional and performance SLAs
  • Hands-on experience configuring and administering SCM (Git, SVN), build tools (CMake, Makefiles, Maven), Nexus, CI (Jenkins), and CD automation tools
  • Deploying and automating applications into public (AWS, GCP, Azure) and private cloud environments 
  • Scripting experience using languages such as Python, Ruby, Bash, and Korn shell
  • Experience with multiple database technologies (Oracle, PostgreSQL, MongoDB, Cassandra, etc.)
  • Knowledge of cloud infrastructure as a service (IaaS), platform as a service (PaaS), and microservices

Competencies 

  • Strong analytical and problem-solving skills
  • Strong leadership, unquestioned integrity, accountability, relationship management and ability to influence 
  • Ability to communicate effectively at all levels of the organization 
  • High sense of ownership and accountability 
  • Ability to interpret data and to communicate effectively in both technical and user-friendly language 
  • Ability to effectively prioritize and execute tasks in a high-pressure environment 

Professional Experience 

  • Bachelor’s degree or equivalent and 5+ years of related experience 
  • Strong knowledge of common Unix and Linux operating systems
  • Knowledge of best practices and principles.