
Software Engineer II (Python and Data pipelines)

A platform creating a world of stories and knowledge through Everand, Scribd, and Slideshare products.
$103,500 - $196,000
Data
Mid-Level Software Engineer
Remote
1,000 - 5,000 Employees
4+ years of experience
Enterprise SaaS · Education

Description For Software Engineer II (Python and Data pipelines)

Scribd, a leading platform for digital content sharing, is seeking a Software Engineer II specializing in Python and data pipelines to join its ML Data Engineering team. The role is central to managing and processing hundreds of millions of documents and billions of images across the company's platforms: Everand, Scribd, and Slideshare. It offers an opportunity to work with diverse datasets, including UGC documents, ebooks, and audiobooks, at very large scale. The tech stack includes Python, Scala, Ruby on Rails, Airflow, Databricks, Spark, and various AWS services.

The role combines data engineering expertise with the challenge of building robust systems for content discovery and metadata management. Scribd offers a flexible work environment through its Scribd Flex program, competitive compensation including equity, and comprehensive benefits. The company fosters a culture of curiosity, boldness, and customer-first thinking, which suits engineers passionate about working with data at scale.

Last updated 15 days ago

Responsibilities For Software Engineer II (Python and Data pipelines)

  • Design and develop data pipelines to extract, enrich, and process metadata from millions of documents, images, and other content types
  • Collaborate with cross-functional teams, including ML engineers and product managers, to deliver scalable, efficient, and reliable metadata solutions
  • Build and maintain systems that operate at a massive scale, handling hundreds of millions of documents and billions of images
  • Optimize and refactor existing systems for performance, scalability, and reliability
  • Ensure data accuracy, integrity, and quality through automated validation and monitoring
  • Participate in code reviews, ensuring best practices are followed and maintaining high-quality standards in the codebase
  • Manage and maintain data pipelines, security, and infrastructure

Requirements For Software Engineer II (Python and Data pipelines)

  • 4+ years of experience in backend software engineering, with hands-on work in developing data pipelines and building and deploying your own infrastructure
  • Proficient in one or more programming languages, such as Python, Ruby, or similar
  • Experience working with a public cloud provider (AWS, Azure, or Google Cloud)
  • Hands-on experience with building, deploying, and optimizing solutions using ECS, EKS, or AWS Lambda
  • Experience with queueing and streaming technologies like SQS, Sidekiq, Kafka, or Kinesis
  • Experience working with systems at scale, such as external APIs and data transformations
  • Proven ability to test and optimize systems for performance and scalability
  • Bachelor's degree in Computer Science or equivalent professional experience

Benefits For Software Engineer II (Python and Data pipelines)

  • Healthcare Insurance Coverage (Medical/Dental/Vision): 100% paid for employees
  • 12 weeks paid parental leave
  • Short-term/long-term disability plans
  • 401k/RSP matching
  • Onboarding stipend for home office peripherals + accessories
  • Tuition Reimbursement
  • Learning & Development programs
  • Quarterly stipend for Wellness, Connectivity & Comfort
  • Mental Health support & resources
  • Free subscription to Scribd + gift memberships for friends & family
  • Referral Bonuses
  • Book Benefit
  • Sabbaticals
  • Company wide events
  • Team engagement budgets
  • Vacation & Personal Days
  • Paid Holidays (+ winter break)
  • Flexible Sick Time
  • Volunteer Day
