How to remove yourself from being a bottleneck?

Anonymous User at Taro Community
2 years ago

Editor's Choice

Mentorship

Due to unforeseen circumstances over the past 6-8 months, I've been the most senior engineer on my team (I have a total of just ~2.7 YOE). My team consists of ~12 SDE 1s (new hires) and 2 SDE 2s (the other SDE 2 was promoted very recently). My manager does a great job filling the role of senior engineer, which takes a bit of pressure off me.

However, out of necessity I've ended up being the SME for all the services owned by our team. This leads to everyone reaching out to me for help with their queries. I try to document some of these and add them to the wikis so that they are easily accessible to others next time. However, when it comes to certain tickets and issues, I end up having to pick the task up myself (my manager does not ask me to, but at the same time I know that for someone else the ramp-up time required to fix the issue would be too high).

I recently tried to reduce this (~2 months ago). This led to our overall ticket health getting worse, and I had to start looking into the tickets myself again and guiding each on-call cycle with the right action items for them.

This involves me helping them do the following:

Prioritize the correct tickets to look into for the on-call cycle.
Suggest a potential fix for each ticket so that they know where to look.

As a result, it ends up taking 6+ hours weekly to keep this running. I don't really mind doing this; however, I don't feel like this is a scalable solution, and I would eventually like to scale down and have my team be self-sufficient.

What's the best way to go about this without affecting my team's ticket health?

2.7K Views

4 Comments

Discussion

(4 comments)

33
Steve Huynh
•Principal Software Engineer at Amazon
2 years ago
I'm interpreting what you're saying as follows: you are effectively acting as a senior engineer by being the SME on your team's services and doing things like fielding queries from other teams, and this is leading to you having to choose between acting in this capacity and addressing your team's ticket queue.

The simple answer is that a senior engineer is expected to do both at a high level. There are three paths forward:

Work extra time. This isn't sustainable in the long-run, will lead to burnout, and potentially will introduce performance problems for you down the line.

Transition away from being the SME. I'm going to assume you want to be a senior so we'll rule this out.

Get more efficient, deep dive on root causes, and drive long-term fixes. It seems that you've started doing this on the SME side by creating documentation and wikis. Realize that this type of work is long-term and pays dividends in the future. You should apply this thinking to your team's ticket queue. If your team keeps getting the same class of ticket, spend your oncall time and spare cycles working on fixing the root causes. Make sure your ticket queue doesn't boil over, but prioritize fixing bad alarming, getting finer-grained metrics, and addressing code-level problems rather than only focusing on ticket-level resolution. This may require some extra time and effort in the short term, but it won't be permanent. Eventually, like the documentation work you are doing, things should get better over time and there will be less on your plate.

Realize that the way out of this is by taking a longer-term perspective and by making sure your day-to-day is chipping away at making yourself and your team's processes more efficient.

Hope that helps.

-Steve
16
Kuan Peng
•Senior Software Engineer [L5] at Google
2 years ago
Building off what Steve wrote (drive long-term fixes), perhaps you don't have to solve this problem entirely on your own either, but can take a directive role instead.

You can work with your manager to come up with a list of consistently recurring issues. Your manager should really want to solve this issue as well, because they can't afford to have a single point of failure on the team. You can ask yourself and your manager questions to figure out what the main contributor is and address it systematically.

For example:

Deployments of new code frequently cause outages => Are they catchable by tests? If so, maybe help focus the team on adding tests to their code changes. Or is it more that the underlying systems are not decomposed well enough and thus depend on each other's internal workings?

Lots of false alarms => Are you alerting on the correct metrics? How can you tune your monitoring solution?

Oncall engineers don't know how to triage issues => Has everyone been sufficiently trained? Is there an oncall playbook? Can you have someone shadow you every time?

Once you have a list of potential deeper issues and possible solutions, prioritize them. You can then either work on the priority solutions yourself, or delegate to a fellow engineer (maybe one of the recently promoted folks?) and you can guide their work.
13
Alex Chiou
•Tech Lead @ Robinhood, Meta, Course Hero
2 years ago
All the tactical advice here is fantastic, but I have one that's more high-level and people-oriented: Take on a right-hand (a "second-in-command" among the SDEs after you).

In other words, take another engineer on the team (probably the other SDE 2), and start deeply mentoring them. Your goal is to effectively "upload" all the awesome stuff in your brain into theirs. You want to make it so that if you're ever gone for a prolonged period of time, they can hold down the fort. Tactically, here's what this mentor <> mentee relationship should look like:

A weekly 30-minute 1 on 1 - It may need to be more in the beginning, either twice a week or a 45-minute to 1 hour long meeting.

Give them clear tracks of ownership - You want to be able to tell this person something like, "My goal is to have you own the entire system health of our automated testing suite." Framing the mentee's contribution in this way will empower them to take on more responsibility and offload more stuff from your plate (which is more impact for them and better peace of mind for you, it's a win-win!). As SDE 2s, both of you are pushing to lead entire areas to get to SDE 3, so doing this is a beautiful example of making incentives align. Talk to your manager about what makes the most sense to hand off to this mentee, so that both of you still have sufficient scope after the partition.

Start pushing them to uplift SDE 1s - This is when the benefits of mentorship start getting crazy powerful: Your mentee becomes a mentor, and you are now mentoring engineers indirectly through them. Make this mentee observe how you are teaching other engineers on the team, and encourage them to start doing the same after 2-3 months. Your goal is to get to a place where if someone asks you for help, you can start referring your mentee in instead for a large portion of cases.

As a rapidly growing SDE 2, you have hit a wall I have seen so many mid-level engineers run into: You simply can't do it all. Small to medium-sized answers like updating wikis and holding office hours can help here and there, but at the end of the day, you need to start seriously raising up your peers in order to truly scale yourself.

This was the case for most engineers at Meta going from E4 to E5 (mid-level -> senior) and especially for E5 to E6 (senior -> staff). It was effectively a requirement for engineers targeting these promotions to become a major mentoring force on their teams.

For excellent resources on how to be an effective mentor, check these out:

[Case Study] Mentoring Junior SWEs [E3] to Senior [E5] In Just 2.5 Years At Meta

"How can I help juniors?"

[Case Study] Becoming A Tech Lead Again In Just 1 Month After Joining Robinhood From Meta
7
Brad Messer
•Senior Software Engineer at IBM
2 years ago
I've been through this myself, to the point that I haven't really gotten to do much coding in the last few months. My job, though, is to make the team better and to take on the grunt work when necessary to keep everyone helping at a high rate. In my case, people around me are working ~70hr weeks and I'm working ~55 max, with a lot of my days finishing up midday simply because there is no work left to do. I scale through others by asking them to do the hard work I don't know how to do, and then I jump in to ease the pressure on my juniors by spending quality time in the trenches. Once one trench gets cleaned up, we as a team can be redeployed to help with other trenches. As stated before, document a lot and keep a long-term focus. Focus on teaching them as much as possible and encouraging them to help each other grow; otherwise the situation becomes untenable very quickly and you start getting turnover.