Site Reliability Engineer Jobs - Software Engineer Lead - L2 Engineering, 15321

at AIC (part of ACS Group)
Location Lewisville, TX
Date Posted May 8, 2019
Category Default
Job Type Full-time

Description

A Fortune 100 company based in Plano, TX is looking for an SRE with Big Data and DevOps experience for a contract-to-hire role. If you are interested in learning more, please send resume to:

We are looking to develop a core set of data management capabilities to drive consistency across each line of business. This data platform will be deployed on premises and, longer term, in the public cloud. The initial focus is on sourcing, storing, enriching, and making available information to support internal management reporting, external regulatory reporting, and machine learning and other data analysis applications.
We are seeking an experienced software engineering lead for our global Site Reliability Engineering (SRE) team, which supports our Big Data platform. This individual will be expected to lead a team of software engineers who will grow into subject matter experts, work with functional application development teams, and partner with infrastructure engineers and production support analysts to determine requirements for designing and developing automation, SDLC, and development environment testing and integration tools.
The SRE team runs, maintains, and improves the Big Data platform against established Service Level Objectives by applying software engineering practices. It is responsible for the availability, performance, change management, monitoring, and capacity management of its services, with special emphasis on automating the processes and workload that support them. The SRE team is also responsible for the operational support of the Big Data infrastructure, with emphasis on feeding outage, issue, and incident data into a design and SDLC feedback loop to ensure maximum automation and outage avoidance.
Key responsibilities of this role include:

  • Engage with development teams throughout the incident life cycle and ensure lessons learned are translated into automated or process-adjustment responses, helping develop software for reliability and scale with minimal refactoring or changes
  • Code, test and deliver software to automate manual operational work
  • Troubleshoot incidents, participate in blameless post-incident evaluations and ensure permanent closure of incidents
  • Identify application patterns and analytics in support of better service level objectives
  • Analyze self-healing and resiliency patterns and contribute to software which can use these outcomes
  • Implement best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting

Key Qualifications include:

  • Bachelor's Degree in Computer Science, Engineering or Business
  • Strong knowledge and experience in DevOps and Agile teams
  • Strong knowledge and experience across multiple platforms, including Cloud architecture
  • Knowledge of or experience in Hadoop environment administration, release deployments to HBase, supervising Hadoop jobs, and performing cluster coordination services is preferred
  • Knowledge of Unix/Linux administration, Unix scripts, and platform-level orchestration scripting. Should be knowledgeable about automating the build and deployment process.
  • Knowledge of Python
  • Knowledge of DB technologies (Oracle, MS SQL DB, Sybase, etc.)
  • Familiarity with the Control-M and AutoSys job schedulers
  • Knowledge and experience in web-based applications and architecture (certificates, IIS, web services)
  • Knowledge of Git, Bitbucket, Jenkins, Sonar, Splunk, Maven, AIM, and Continuous Delivery tools
  • Knowledge of load balancing, IP, and DNS
  • Knowledge of cloud (private cloud, public cloud, etc.); working experience with cloud environments such as AWS is a plus
  • Basic knowledge of Java, JavaScript, and RESTful APIs
  • Ability to work directly with AD, Business and Operators
  • Excellent communication skills, both written and oral, appropriately scaled for technical or business audiences
  • Excellent interpersonal skills, team player
  • Strong analysis, research, investigation and evaluation skills, with a structured approach to problem solving
  • Ability to work and effectively prioritize in a highly dynamic work environment that includes a global focus

U.S. citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to sponsor at this time.
At present, no Corp to Corp relationships will be considered for this opportunity.

AIC is an Equal Opportunity Employer
