How to leverage side projects as a Data Engineer?

Mid-Level Data Engineer at Taro Community, a year ago

I'm not too happy with my current job: career growth prospects feel dim and the work is not challenging enough. Given this, my best career move is to find a job at a different company.

Since the work at my job is not good, I want to use side projects (which Taro tells me have a ton of benefits) to get "equivalent" job experience, so that when I interview at other companies I can draw on strong side projects to do well.

More specifically, most data engineering jobs want experience with Airflow and PySpark. My current job only has me doing SQL stuff, so I want a side project that leverages airflow/spark.

Problem: If I focus on just building apps/web apps to get users, that won't help my data engineering career, because most apps don't need a full PySpark pipeline with Airflow and whatnot.

Conversely, if I focus on building an awesome data engineering pipeline, I'm likely not solving any real-world problem and will have 0 users, but the skills will help.

Problem 2: If I focus on consumer apps, I'll have to learn React (which I 100% don't use at work) and spend time on backend work that isn't helping me grow as a DE, because I read on Taro that you should be going deep, not wide.

Discussion

(8 comments)
  • 1
    Tech Lead/Manager at Meta, Pinterest, Kosei
    a year ago

    My current job only has me doing SQL stuff, so I want a side project that leverages airflow/spark

    This doesn't answer your question about doing a side project, but can you shoehorn your current job to be more amenable to the stuff you want to work on?

    If your company is big enough, it should have a lot of diversity of technologies and people. Can you find the areas that are interesting and work on those?

    The benefit of "double dipping" your day job with side projects is that you'll learn about how to use the tech in a production environment, with people who have more experience than you. This will be much faster for you to ramp up and learn transferable skills for the next job.

    I'd pursue this route even if it's not the perfect tech alignment, e.g. Hadoop instead of Spark.

    The other option is to make open-source contributions, if you can find an approachable tool or application that uses these data analytics pipelines.

    • 0
      Tech Lead @ Robinhood, Meta, Course Hero
      a year ago

      +1: Just getting better scope at your current job would be way better than trying to figure out a meaningful side project

      You don't need your manager for this either. Go around your organization to data engineers you trust and ask them if they have anything meaningful on their backlog. If they're senior/staff level, they almost certainly do. From there, just pick up those tasks and do them. This situation is the best as there's mutual benefit:

      1. You get meaningful work
      2. They get a worker bee to delegate to who helps them scale (they can claim some portion of your impact, since they were the ones who gave you the task)
    • 0
      Mid-Level Data Engineer
      Taro Community
      a year ago

      Unfortunately I can't. I work for a pure classic finance company (not a hedge fund) where I do mostly analytics/engineering. Most of the work is writing SQL (+ managing the DB) -> extracting insights -> making PowerPoint slides/dashboards.

      We don't have the volume of data to justify Spark, and we don't have any production models, so we don't need Airflow. It's not something I can push within the org. It's also not a tech company, and I'm part of an analytics division.

      The ceiling for this analytics/engineering job is pretty low, which is why I want to upskill to get the higher-paying data engineering roles.

    • 1
      Tech Lead @ Robinhood, Meta, Course Hero
      a year ago

      I see, the answer here is pretty obvious then hehe: [Course] Ace Your Tech Interview And Get A Job As A Software Engineer

      I wouldn't gatekeep your job search on lacking skills. That will lead to you studying and building side projects forever. I recommend just applying a lot and seeing where that takes you. You can study and build in whatever free time remains.

      The important part about applying (especially if you apply a lot) is that you'll get data. If you're noticing common themes across rejections, you can target your outside studying to fill that specific gap. Many engineers don't realize that interviewing is actually a data collection exercise in disguise 🕵️

    • 1
      Mid-Level Data Engineer
      Taro Community
      a year ago

      Thanks Alex, it helps. Do you have any general advice for people trying to get more interviews for roles where user count is not a very applicable measure of impact, i.e. roles in infra/backend/data?

      Also, I get that having users makes a project more impressive, which would lead to more interviews. But why are projects with users better? Isn't a project with users just giving signal about my marketing/sales/product skills and not my engineering skills? Especially if you're building a utility or one-time-use app, e.g. a background remover.

  • 0
    Tech Lead @ Robinhood, Meta, Course Hero
    a year ago

    Side projects are admittedly very tricky as a Data Engineer as the role is inherently not-user-facing. It is hard to make something that is easily shareable and gets tons of users.

    I think a better path might be to do open-source contributions and help build up industry-leading data engineering libraries/components. Here's a good list: https://github.com/gunnarmorling/awesome-opensource-data-engineering

    From there, follow the advice here: [Course] Become An Open Source Master

    • 1
      Mid-Level Data Engineer [OP]
      Taro Community
      a year ago

      Thanks, Alex. The issue with open source is that (from my limited experience) most projects are at the infra layer, while most jobs expect you to be a pro at the application layer, that is, building PySpark pipelines and using Airflow and so on. I feel like there isn't much overlap between the infra layer and the application layer?

      I've briefly worked on open source for ML, and a lot of it required knowing things at the compiler layer or the low-level PyTorch internals. But when I train an ML model, that level of depth and the problems faced there don't seem to translate much.

      In other words, working on ML/data infra is very different from applying ML/data infra, and I'm not sure working on open source is the best path to better data engineering skills.

    • 1
      Tech Lead @ Robinhood, Meta, Course Hero
      a year ago

      Hmm, then maybe you can build a side project just for the learning as you mentioned? It won't help you get interviews as it likely won't get users, but learning is still valuable (and it can help you pass interviews once you get them).

      The other option is to find additional scope at your current role as I talked about in my reply to Rahul on this thread.