Senior Site Reliability Engineer

Job Locations US-UT-Salt Lake City | US-WI-Madison
Req No.
Regular Full-Time

Summary of Major Responsibilities

An Exact Sciences SRE is responsible for architecting, automating, and maintaining the cloud infrastructure. SREs are not operators; they are experienced software engineers with a focus on operations. The SRE will work closely with other Software Engineers, Network Engineers, Database Engineers, and Product Managers to analyze system and network loads, address stability and performance challenges, and collaborate with others to operate various systems. The SRE performs ongoing application support by diagnosing and resolving issues, maintaining applications, and evaluating and recommending options for improving performance, security, networking, monitoring, scaling, high availability, disaster recovery, build pipelines, and the provisioning and configuration of cloud infrastructure. This also includes streamlining processes to increase system scalability and reliability, improve efficiency, and minimize errors.

Essential Duties and Responsibilities

  • Ability to work with and use Amazon Web Services or other Cloud technology platforms
  • Write automation code for provisioning and operating infrastructure at scale
  • Experience with configuration automation tools such as Chef, Salt, Ansible, Puppet, or similar
  • Strong verbal and written communication, time management, and organizational skills
  • Ability to automate tasks with Python, Ruby, Bash, Java, JavaScript or similar
  • Experience with both Linux and Windows operating systems
  • Familiarity with Datadog or similar cloud monitoring solutions
  • Strong understanding of PostgreSQL or other SQL database technology
  • Understanding of security and encryption best practices
  • Design, build, maintain, and scale production services and server farms across multiple data centers for complex and data-intensive cloud services
  • Design and enhance software architecture to improve scalability, service reliability, capacity, and performance
  • Work with development and IT teams to ensure that applications fit well within the infrastructure and that scalability and reliability are designed and implemented from the ground up. Work with QA to build pipelines and automation for delivering and deploying applications to production
  • Participate in an on-call rotation supporting the infrastructure
  • Roll up your sleeves to troubleshoot incidents, formulate and test hypotheses, and narrow down possibilities to find the root cause
  • Write postmortem reviews and remediation recommendations
  • Identify trends before they become problems; respond to automated system alerts, effectively troubleshoot system errors and work incidents to return systems to normal operating conditions
  • Author and update high-quality documentation of all relevant specifications, systems and procedures
  • Other duties as assigned


Qualifications

  • College diploma in CS/Engineering/Sciences or equivalent experience
  • 5-10+ years of experience with design capabilities using modern technologies
  • Track record in successfully addressing performance, scalability and latency challenges
  • Experience in developing systems architecture


Exact Sciences is an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to age, color, creed, disability, gender identity, national origin, protected veteran status, race, religion, sex, sexual orientation, and any other status protected by applicable local, state or federal law. The Company’s affirmative action program is available to any applicant or employee for inspection upon request.


