Lead Software Engineer- Data Engineer

Clarivate is a global leader in trusted and transformative intelligence, providing enriched data, insights, analytics, and workflow solutions across knowledge, research, and innovation.
Data
Staff Software Engineer
Hybrid
5+ years of experience
Healthcare · Enterprise SaaS

Description For Lead Software Engineer- Data Engineer

Clarivate is seeking a Lead Data Engineer to join its Content Tech Big Data Engineering Team in India. The role works with Real World Data using current big data technologies. It involves building and scaling high-value medical data capabilities as part of a 20+ member engineering team, reporting to the Director of Technology.

The ideal candidate will play a key role in developing big data platforms and implementing the data platform strategy in the cloud. They will work with stakeholders including Analytics teams, Application teams, Enterprise Solutions teams, and external partners. The role combines technical expertise in data engineering with healthcare domain knowledge.

The position follows a hybrid working model with full-time hours (40 hours per week). The successful candidate will build data pipelines, implement ETL processes, and work with technologies including Python, Spark, AWS, and healthcare data systems. It suits someone passionate about big data and healthcare analytics who wants to make a meaningful impact on customer outcomes.

Key technologies include Python, PySpark, AWS, AWS Glue, EMR, Delta Lake, and various database systems. The role requires strong technical skills combined with business acumen and excellent communication abilities. Clarivate offers a high-energy, innovative, fast-paced Agile culture where you can contribute to transformative healthcare solutions.

Responsibilities For Lead Software Engineer- Data Engineer

  • Build data platforms, data pipelines, and data transformation capabilities
  • Implement the data platform strategy in the cloud
  • Drive rapid prototyping and development with Product and Technical teams
  • Extract, transform, and load data using the Apache suite, SQL, Python, and AWS
  • Create and support batch and real-time data pipelines
  • Conduct functional and non-functional testing
  • Evaluate existing applications in order to update them and add new features

Requirements For Lead Software Engineer- Data Engineer

  • Bachelor's Degree or equivalent in computer science, software engineering, or related field
  • At least 5 years of relevant experience
  • Experience with Python, PySpark, AWS, AWS Glue, EMR and Delta Lake
  • Knowledge of ETL and database systems (Postgres, Oracle, Snowflake/Databricks)
  • Experience handling large volumes of data and building data pipelines
  • Knowledge of Agile/SDLC methodologies
  • Experience with data warehouse / BI projects in the healthcare domain
  • Strong oral and written communication skills
