Software Development Engineer, Open Data Analytics Engines - Spark Engine Performance

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
$151,300 - $261,500
Backend
Senior Software Engineer
5+ years of experience
AI · Enterprise SaaS

Description For Software Development Engineer, Open Data Analytics Engines - Spark Engine Performance

The analytics team is looking for an experienced engineer to join the core engines team. Athena and EMR are services that our customers use to run large-scale analytics, leveraging open source engines like Trino and Spark. The analytic engine organization makes significant modifications to these engines so that they run in serverless environments with better performance and scalability than what is available in open source. In the last 3 years, we have improved engine performance by a factor of 5 by making changes to the optimizer, query runtime and storage connectors. We have also made significant changes to the compiler to enable enterprise features like fine-grained access control.

As an engineer on the engines team, you will help with the design and implementation of key features and performance optimizations for query engines. You will work on further improving the Spark engine, making deep changes throughout the query engine codebase.

Key responsibilities:

  • Hands-on development for core components of the query engine
  • Design and develop solutions and algorithms to improve the performance of Spark
  • Design and develop improvements to the AWS Runtime for Spark's integration with other AWS services
  • Manage complex project deliverables and research projects with deadlines
  • Ensure data consistency and durability with breakthrough performance and scalability
  • Interact and partner with the open source community
  • Be a point of contact for challenging customer issues around query engine problems

The team is composed of engineers who are either passionate about engine internals or have several years of experience (including PhDs). Our managers are also extremely technical, having worked many years as developers. We manage our backlog with a research approach, where ideas are prototyped, validated with real data, and then implemented. We value work-life balance and see our work as a marathon, not a sprint.

Last updated 9 months ago