LinkedIn Site Reliability Engineer Interview Experience

The interview process is fairly standard for this type of company: an initial screening, two rounds of phone interviews, and a final on-site interview. In my case the on-site interview was replaced by a virtual interview due to COVID-19.

PROS:

The coding portions of the interview were relevant to the job and not just toy examples.
The turnaround time after each interview was extremely quick.
Most people were easy to talk to, which made each interview feel relatively quick.
You set your own pace for the interview rounds; some of my rounds were more than a month apart.
Recruiters requested your feedback after each round.
The interviews were easier than expected, except for the last round. The recruiter gave less guidance on study material for that round than for the first two, so I spent time studying irrelevant material.

CONS:

The interview questions covered a wide range of topics, from low-level details like shared library loading to high-level system design for video streaming services, rather than focusing on specific areas.
You are handed off to a different recruiter before the final round.
Based on browsing this site, the company appears to use the same questions for every candidate, which makes you feel like a "cog in the wheel" with no real input. The lack of tailored questions suggests little interest in a candidate's individual strengths and weaknesses beyond what the interviewers already expect to see.
Interviewers do not review your LinkedIn profile or resume until the interview itself. Consequently, questions regarding your background are either stock or ad-libbed, with little indication that interviewers are interested in your history or interests.
The company relies too heavily on its name to justify salary and quality-of-life compromises, especially for those relocating from more affordable areas. The benefits offered were not substantially different from those of my current employer.
There is a disconnect between recruiters and interviewers. For example, I repeatedly stated that this would be a domain and role change for me, and that I was unsure about the fit despite my willingness to learn and my extensive experience in other tech domains. The recruiter assured me that extensive training would be provided. However, the final feedback cited experience as a reason for not getting the job and favored someone who could "hit the ground running," even though the position was being held open indefinitely and a training period was planned.
Feedback after the last interview was almost non-existent. In previous rounds I spoke with the recruiter directly, but feedback for the final round was delivered via voicemail, and my own feedback was not solicited.

TIP:

Go through the second Google SRE book, The Site Reliability Workbook.

How do you make a variable in a shell script available after the script exits (assuming the shell script was sourced)?

How do you change the priority of a running process?
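
On the command line the usual answer is renice. As a minimal sketch of the same idea in Python, assuming a Unix-like system: os.setpriority adjusts the nice value of a process that is already running. The function name renice below is only illustrative, not something from the interview.

```python
import os


def renice(pid: int, niceness: int) -> int:
    """Set the nice value of an already-running process.

    Returns the nice value in effect afterwards. A higher nice value means
    a lower scheduling priority. Raising your own niceness needs no special
    privileges; lowering it usually requires root.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

For example, renice(os.getpid(), 10) lowers the priority of the current process. Changing a process owned by another user typically fails with PermissionError unless you are root.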

Coding test: Parse a (syslog) file to get various fields from the logs and message counts. Associate counts with the processes that logged them.
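
A rough sketch of how this could be approached, not the solution the interviewers expected. It assumes the traditional syslog layout "Mon DD HH:MM:SS host process[pid]: message"; the exact format used in the interview is not something I can reproduce, and the field names are my own.

```python
import re
from collections import Counter

# Assumed layout: "Mar  3 14:02:11 web01 sshd[1234]: message text"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def count_by_process(lines):
    """Count how many messages each process logged, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            counts[fields["process"]] += 1
    return counts
```

In an interview setting you would pass an open file object to count_by_process, since it only needs an iterable of lines and never loads the whole file into memory.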

Describe how SSH works.

Describe how curl works. What happens when you call the command? Describe the process of loading libraries, parsing arguments, DNS resolution, etc.

You have gigabytes of data that need to be periodically synced from a producer to a large number of consumers. How do you approach it? Hint: the data set isn't necessarily entirely new each time it needs to be synced, so only sync the data that has changed.
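
One way to illustrate the "only sync what changed" hint is a per-chunk hash manifest. This is a simplified sketch under my own assumptions (fixed-size chunks, data held in memory, hypothetical function names), not the answer the interviewers were looking for.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # deliberately tiny for illustration; real chunks would be far larger


def manifest(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size chunk of the data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old_manifest: List[str], new_data: bytes,
                   chunk_size: int = CHUNK_SIZE) -> Dict[int, bytes]:
    """Producer side: return only the chunks whose digest differs from the consumer's."""
    delta = {}
    for index, start in enumerate(range(0, len(new_data), chunk_size)):
        chunk = new_data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if index >= len(old_manifest) or old_manifest[index] != digest:
            delta[index] = chunk
    return delta


def apply_delta(old_data: bytes, delta: Dict[int, bytes], new_length: int,
                chunk_size: int = CHUNK_SIZE) -> bytes:
    """Consumer side: patch the changed chunks into the old copy."""
    chunks = [old_data[i:i + chunk_size] for i in range(0, len(old_data), chunk_size)]
    for index, chunk in sorted(delta.items()):
        if index < len(chunks):
            chunks[index] = chunk
        else:
            chunks.append(chunk)
    # Truncate in case the new data set is shorter than the old one.
    return b"".join(chunks)[:new_length]
```

Fixed-size chunks handle in-place changes well, but an insertion near the start shifts every later chunk. Tools such as rsync use a rolling checksum to cope with that, and with many consumers you would also want to discuss fan-out (for example, consumers relaying data to each other) rather than having every consumer pull from the producer.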

You take over a new service and discover it has no monitoring. What monitoring would you put in place within the first week to ensure the service is working? Within the first month? How do you monitor failures which are local to a region?

You will be asked to role-play a scenario where the number of registrations for a service has dropped to 0 for the past 6 or so hours, setting off an alert. You will have to walk through the incident response and escalation. You will be asked to write simple reports that are suitable for giving high-level status to a manager.

You will be shown several architecture diagrams and asked various questions, like "what happens when database X goes down?", or "How to speed up requests from service Y?". Caching plays a big role in almost all responses.
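
Since caching came up in almost every answer, here is a small sketch of the kind of read-through cache with a time-to-live that those answers tend to describe. The class name, the loader callback, and the injectable clock are my own illustrative choices, not anything shown in the interview diagrams.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Read-through cache: serve from memory until an entry expires, then reload."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader      # slow backend lookup, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: the backend is not touched
        value = self._loader(key)  # cache miss or expired entry: go to the backend
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a cache like this in front of a database, requests for keys that are still cached keep succeeding while the database is down, which is the sort of trade-off (staleness versus availability) worth talking through for each diagram.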

You will be asked to do live troubleshooting of an Apache (httpd) web service. The recruiter will not give you many details, so it's easy to study the wrong thing here. It turned out that you need to be familiar with the httpd config file and Aliases. You need to know how to change Linux filesystem permissions, but you can ignore the fact that you are running on Red Hat; you won't need to touch SELinux permissions. Watch out for one problem involving two nearly identical file names, where one contains a hyphen and the other a Unicode dash character. They look very similar in many fonts. Make sure you know how to get a simple GDB backtrace. You will be asked to debug a segfault and work around it (via a simple file rename).
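
The hyphen versus Unicode dash trap is easy to miss by eye. As a small illustration of how to make it visible (my own sketch, with a made-up file name, not part of the interview environment), you can list every non-ASCII character in a name together with its Unicode name:

```python
import unicodedata


def suspicious_characters(name: str):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (index, char, unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(name)
        if ord(char) > 127
    ]
```

A plain ASCII name returns an empty list, while a name containing a look-alike dash reports its position and a name such as EN DASH, which tells you immediately which of the two files the config is really pointing at.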

You will have to perform a code review of several pieces of code. Focus on logic errors, not stylistic issues. I don't remember all the code samples, but one was about doing file backups, where they manually implemented extension parsing and copied over ".1" files to ".2", etc. without ensuring the order of the copy.
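
To show why the order matters in the backup sample: if ".1" is copied to ".2" before the old ".2" has been moved out of the way, the older backup is lost. Below is a sketch of a rotation that processes the oldest file first. It is my own reconstruction of the idea, not the code from the interview, and it renames the numbered backups instead of copying them.

```python
import os
import shutil


def rotate_backups(path: str, keep: int = 3) -> None:
    """Shift path.1 -> path.2, path.2 -> path.3, ... and copy path to path.1.

    The loop runs from the highest number down, so every file is moved out
    of the way before its slot is overwritten. Whatever sat in path.<keep>
    is replaced and therefore dropped.
    """
    for n in range(keep - 1, 0, -1):
        source = f"{path}.{n}"
        if os.path.exists(source):
            os.replace(source, f"{path}.{n + 1}")
    if os.path.exists(path):
        shutil.copy2(path, f"{path}.1")
```

Running the loop in ascending order instead would overwrite ".2" with ".1" first and then copy that same content on to ".3", which is the kind of logic error the review was looking for.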

Site Reliability Engineer Interview Experience - Sunnyvale, California
