Data Operation Engineer at GSK, Stevenage/Remote, 6 Months, £500-£549 per day

£500 - £549 per day
  • Contract Spy
  • Stevenage, UK
  • May 09, 2022
6 Months or more

Contract Description

Position: Data Operation Engineer

Location: Stevenage / Hybrid

Pay Rate: £549.12 Per Day / Via Umbrella

Duration: 6 Months

Number of Positions: 3-4

Start: ASAP

 

The mission of the Data Science and Data Engineering (DSDE) organization within GSK Pharmaceuticals R&D is to get the right data, to the right people, at the right time. The Data Framework and Ops organization ensures we can do this efficiently, reliably, transparently, and at scale through the creation of a leading-edge, cloud-native data services framework. We focus heavily on developer experience, on strong, semantic abstractions for the data ecosystem, on professional operations and aggressive automation, and on transparency of operations and cost.

We are looking for a skilled Data Ops Engineer II to join our growing team. The Data Ops team accelerates biomedical and scientific data product development and ensures consistent, professional-grade operations for the Data Science and Engineering organization by building templated projects (code repository plus DevOps pipelines) for various Data Science/Data Engineering architecture patterns in the challenging biomedical data space.  A Data Ops Engineer II knows the metrics desired for their tools and services and iterates to deliver and improve on those metrics in an agile fashion.

Delivering the right data to the right people at the right time requires the design and implementation of data flows and data products that leverage internal and external data assets and tools to drive discovery and development. This is a key objective for the Data Science and Data Engineering (DSDE) team within GSK's Pharmaceutical R&D organisation. There are five key drivers for this approach, closely aligned with GSK's corporate priorities of Innovation, Performance and Trust:

 

  • Automation of end-to-end data flows: faster, reliable ingestion of high-throughput data in genetics, genomics and multi-omics to extract value from investments in new technology (instrument to analysis-ready data in <12h)
  • Enabling governance by design of external and internal data: with engineered, practical solutions for controlled use and monitoring
  • Innovative disease-specific and domain-expert-specific data products: to enable computational scientists and their research unit collaborators to reach key insights faster, leading to faster biopharmaceutical development cycles
  • Supporting end-to-end code traceability and data provenance: increasing assurance of data integrity through automation and integration
  • Improving engineering efficiency: extensible, reusable, scalable, updateable, maintainable, virtualized, traceable data and code, driven by data engineering innovation and better resource utilization

A Data Ops Engineer II is a highly technical individual contributor, building modern, cloud-native systems for standardizing and templatizing data engineering, such as:

  • Delivering declarative components for common data ingestion, transformation and publishing techniques
  • Implementing data governance aligned to modern standards
  • Partnering with DSDE data engineering teams to advise on implementation and best practices
  • Writing cloud Infrastructure-as-Code
  • Orchestrating services and flows

A Data Ops Engineer II develops robust, modularised data components to enable delivery of a suite of high-performing, high-impact biomedical and scientific data ops products and services, demonstrating software engineering and quality coding practices applied to modern data problems.

 

Responsibilities include:   

  • Develop and support delivery of high-performing, high-impact data ops products and services, from a loosely defined data engineering problem or requirement 
  • Partner with Tech where modifications to underlying tools (e.g. infrastructure as code, cloud ops, devops, logging / alerting, ...) are needed to serve new use cases, and to ensure operations are planned in advance  
  • Write high-quality code along with proper unit, functional, and integration tests for code and services to ensure quality
  • Stay up to date with developments in the open-source community around DevOps, data engineering, data science, and similar tooling

 

The DSDE team is built on the principles of ownership, accountability, continuous development, and collaboration. We hire for the long term, and we're motivated to make this a great place to work. Our leaders will be committed to your career and development from day one. 

 

Basic Qualifications:

We are looking for professionals with these required skills to achieve our goals:

  • Master's degree in Computer Science with a focus in Data Engineering, DataOps, DevOps, MLOps, Software Engineering, etc., plus 2 years' work experience (or a PhD)
  • Experience with DevOps tools and concepts (e.g. Jira, GitLab / Jenkins / CircleCI / Azure DevOps / …)
  • Experience with common distributed data tools in a production setting (e.g. Spark, Kafka)
  • Experience with search / indexing systems (e.g. Elasticsearch) 
  • Experience in Python, Scala, Go, and/or C++ 
  • Metrics-first mindset 

Preferred Qualifications:

If you have the following characteristics, it would be a plus:

  • Experience with agile software development 
  • Experience building and designing a DevOps-first way of working 
  • Experience building reusable components on top of the CNCF ecosystem including Kubernetes (or similar ecosystem)