Data Engineer at UKRI, Remote/Swindon, 3 Months initial, £Contract Rate

Contract Description

UK Research and Innovation (UKRI) is the national funding agency investing in science and research in the UK.

 

UKRI invests £8 billion of taxpayers’ money each year into research and innovation and the people who make it happen. They work across a huge range of fields – from biodiversity conservation to quantum computing, and from space telescopes to innovative health care. They give everyone the opportunity to contribute and to benefit, bringing together people and organisations nationally and globally to create, develop and deploy new ideas and technologies.

 

The Data Engineer will be responsible for requirements analysis and solution design for data platforms, ETL, integration and analysis. Our toolset includes AWS-hosted data stores (PostgreSQL, MySQL, an S3-based data lake, Athena), integration services (AWS API Gateway, Lambda functions), ETL services (AWS Glue, Step Functions, AWS Batch), infrastructure management tools such as Terraform, and extensive use of SQL and Python.

 

As a Data Engineer, your main responsibilities will be to:

 

Infrastructure Management:


    1. Set up and manage data infrastructure, including clusters, servers, and cloud-based resources using Terraform.
    2. Monitor and optimize system performance, troubleshoot issues, and ensure system availability.

 

Data Architecture and Design:


    1. Design and implement scalable and efficient data pipelines, databases, and data warehouses.
    2. Collaborate with the data visualization and analyst teams to understand data requirements and translate them into technical specifications.

 

Data Processing:


    1. Develop and maintain ETL (Extract, Transform, Load) processes for ingesting data from various sources into the data infrastructure.
    2. Optimize data processing and storage for performance and cost-effectiveness.

 

AI:


    1. Build scalable inference pipelines and AI APIs.
    2. Implement MLOps workflows for model versioning, monitoring, and retraining.
    3. Collaborate with data scientists to productionize machine learning solutions.
    4. Evaluate and optimize AI model performance and operational efficiency.

 

Database Management:


    1. Manage and maintain databases, ensuring data integrity, security, and availability.
    2. Implement database schema changes and optimizations as needed.

 

Collaboration:


    1. Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to meet data requirements.
    2. Communicate effectively with stakeholders to gather requirements and provide updates on data engineering projects.

 

 

Essential:

 

  • Comfortable working in an Agile, rapidly changing environment.
  • Data Engineering experience, with strong Python skills.
  • Experience with infrastructure deployment tools such as Terraform, AWS CDK or AWS CloudFormation.
  • Experience developing API-based data integration.
  • Excellent analytical skills in working with structured and unstructured datasets.
  • Excellent SQL experience on various platforms (SQL, PostgreSQL, PL/SQL, etc.).
  • Experience with several of MySQL, Oracle, SQL, PostgreSQL, RDS, Aurora, Athena or other similar large-scale database technologies.
  • Exposure to, knowledge of, and/or experience working with AI technologies.

 

Desirable:

 

  • Experience working with AI frameworks such as LangChain / LlamaIndex.
  • Experience with vector databases (Pinecone, Weaviate, FAISS).