Site Name: UK - Hertfordshire - Stevenage, UK - London - Brentford, USA - Pennsylvania - Philadelphia, USA - Pennsylvania - Upper Providence
Posted Date: Aug 17 2021

The mission of the Data Science and Data Engineering (DSDE) organization within GSK Pharmaceuticals R&D is to get the right data, to the right people, at the right time. The Data Framework and Ops organization ensures we can do this efficiently, reliably, transparently, and at scale through the creation of a leading-edge, cloud-native data services framework. We focus heavily on developer experience, on strong, semantic abstractions for the data ecosystem, on professional operations and aggressive automation, and on transparency of operations and cost.

We are looking for a skilled Data Ops Engineer II to join our growing team. The Data Ops team accelerates biomedical and scientific data product development and ensures consistent, professional-grade operations for the Data Science and Engineering organization by building templated projects (code repository plus DevOps pipelines) for various Data Science/Data Engineering architecture patterns in the challenging biomedical data space. A Data Ops Engineer II knows the metrics desired for their tools and services and iterates to deliver and improve on those metrics in an agile fashion.
A Data Ops Engineer II is a highly technical individual contributor, building modern, cloud-native systems for standardizing and templatizing data engineering, such as:
- Standardized physical storage and search / indexing systems
- Schema management (data + metadata + versioning + provenance + governance)
- API semantics and ontology management
- Standard API architectures
- Kafka + standard streaming semantics
- Standard components for publishing data to file-based, relational, and other sorts of data stores
- Metadata systems
- Tooling for QA / evaluation
- Audit as a Service

Additional responsibilities include:
- Given a well-specified data framework problem, implement end-to-end solutions using appropriate programming languages (e.g. Python, Scala, or Go), open-source tools (e.g. Spark, Elasticsearch, ...), and cloud vendor-provided tools (e.g. Amazon S3)
- Leverage tools provided by Tech (e.g. infrastructure as code, Cloud Ops, DevOps, logging / alerting, ...) in delivery of solutions
- Write proper documentation in code as well as in wikis/other documentation systems
- Write fantastic code along with the proper unit, functional, and integration tests for code and services to ensure quality
- Stay up to date with developments in the open-source community around data engineering, data science, and similar tooling

The DSDE team is built on the principles of ownership, accountability, continuous development, and collaboration. We hire for the long term, and we're motivated to make this a great place to work. Our leaders will be committed to your career and development from day one.

Why you?
Basic Qualifications:
We are looking for professionals with these required skills to achieve our goals:
- Master's in Computer Science with a focus in Data Engineering, DataOps, DevOps, MLOps, or Software Engineering and 2+ years of experience, OR a PhD in Computer Science
- Demonstrated experience with software engineering (testing, documentation, software development lifecycle, source control, …)
- Experience with DevOps tools and concepts (e.g. Jira, GitLab / Jenkins / CircleCI / Azure DevOps / …)
- Experience with common distributed data tools in a production setting (Spark, Kafka, etc.)
- Experience with basics of search engines/indexing (e.g. Elasticsearch, Lucene)
- Demonstrated experience in writing Python, Scala, Go, and/or C++

Preferred Qualifications:
If you have the following characteristics, it would be a plus:
- Comfort with specialized data architecture (e.g. optimizing physical layout for access patterns, including bloom filters, optimizing against self-describing formats such as ORC or Parquet, etc.)
- Experience with the CNCF ecosystem / Kubernetes
- Comfort with search/indexing systems (e.g. Elasticsearch)
- Experience with schema tools/schema management (Avro, Protobuf)

Why GSK?
Our values and expectations are at the heart of everything we do and form an important part of our culture. These include Patient focus, Transparency, Respect, and Integrity, along with Courage, Accountability, Development, and Teamwork. As GSK focuses on our values and expectations and a culture of innovation, performance, and trust, the successful candidate will demonstrate the following capabilities:
- Operating at pace and agile decision making - using evidence and applying judgement to balance pace, rigour and risk.
- Committed to delivering high-quality results, overcoming challenges, focusing on what matters, execution.
- Continuously looking for opportunities to learn, build skills and share learning.
- Sustaining energy and wellbeing.
- Building strong relationships and collaboration, honest and open conversations.