How can I safely plan a difficult project for which I have little context?

Question

Context:-

My team is going to work on a new project which involves upgrading a service and migrating/enabling all of the dependents to use the new service.
This service provides business-critical functionality for our teams, and the new version attempts to solve a lot of high-impact pain points with the previous version.
We have just inherited this service and none of us have worked with it or any of its dependents before. We have some support from the previous team that worked on this project but only in a consultation capacity.
This is a project that has been attempted multiple times by various teams over the years - unsuccessfully or with little progress. My perception is that it's going to be a difficult project with low-moderate chances of success.
There is a lot of documentation but most of it is somewhat outdated. There are a lot of PRs as well, but these are for the unsuccessful attempts so I'm not sure how useful it would be to go through them.
The plans for the previous attempts only had internal milestones for the team and a single big-bang completion milestone for stakeholders.

Questions:-

How can I identify smaller, independent high-level milestones that are relevant for external stakeholders?
How can I come up with broad estimates and capacity requirements for the external and internal milestones if I'm not clear on what these milestones would require for completion?
How can I think about de-risking this attempt of the project and improving the probability of success?

Siddharth Srivastava · Accepted Answer

First of all, thanks for the detailed question. It's a scenario I can completely relate to, and I am facing something similar right now. I will attempt to answer your questions, but in a different order from the one you provided.

(3) How can I think about de-risking this attempt of the project and improving the probability of success?

If the upgrade has failed a few times in the past, I would start by asking the past owners about the blockers they faced and documenting them. Have conditions changed since then, such that it makes sense to attempt this project again? Either way, understanding the nature of the blockers and the possible paths forward will help. A couple of examples I can think of:

A common scenario I have seen in upgrade projects is that a dependency has not been upgraded or won't work with another upgraded component, i.e. we depend on x-1.0 and need to upgrade y-1.0 to y-2.0, but x-1.0 won't work with y-2.0. In such a case, we would need to ask the owners of x-1.0 to upgrade it, or pull it into our scope (perhaps fork it and make the changes ourselves). If we are lucky, the dependency is something minor and it's easier to write our own version of it than to upgrade the whole thing.
Sometimes the upgrade is not backwards compatible and the consumers of the service would break. There are two paths here: can we make the service backwards compatible, or can the consumers make changes on their side? If there are external consumers then the latter might not be an option, unless the change on their side is just upgrading to a new client.

The answers to the above questions (and the ones you come up with for your own project) can help in estimating the risk and making a more informed guess at the chance of success. The core idea is "why did the previous attempts fail, and what has changed?"
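
To make the first scenario above concrete, here is a minimal sketch of recording what the previous owners tell you as a compatibility table and listing the dependencies that would block the upgrade. The table contents, the dependency "z", and the function name find_blockers are all hypothetical; only x-1.0, y-1.0 and y-2.0 come from the example above.

```python
# Hypothetical data: for each version of the service being upgraded ("y"),
# the dependency versions that are known to work with it.
COMPATIBLE = {
    "y-1.0": {"x": {"1.0"}, "z": {"3.1"}},
    "y-2.0": {"x": {"2.0"}, "z": {"3.1", "4.0"}},
}


def find_blockers(target, installed):
    """Return the installed dependencies not known to work with `target`."""
    supported = COMPATIBLE[target]
    blockers = {}
    for name, version in installed.items():
        # A dependency missing from the table is treated as a blocker too,
        # because nobody has confirmed that it works.
        if version not in supported.get(name, set()):
            blockers[name] = version
    return blockers


installed = {"x": "1.0", "z": "3.1"}
print(find_blockers("y-2.0", installed))  # {'x': '1.0'}
```

Each entry in the result is a conversation to have: ask the owners to upgrade it, pull it into scope, or replace it.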

(2) How can I come up with broad estimates and capacity requirements for the external and internal milestones if I'm not clear on what these milestones would require for completion?

The only way to remove ambiguity from a complex task is to break the complex task/milestone into smaller tasks. Again, going back to the previous owners and looking up the documentation would help. For milestones that are so ambiguous that we are unable to break them into simpler tasks, a strategy I would use is to try (or seek someone's help) to correlate the previous owners' PRs to the milestones, and figure out how the previous owners sought to break the problem down. A rule of thumb that my team has followed in the past is that if the estimate for a task is more than 2 days, it probably needs to be broken down further (of course, there can be exceptions). This exercise sucks, but things generally suck more without these finer estimates.
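
As a rough illustration of that rule of thumb, the sketch below flags tasks whose estimate exceeds 2 days so they can be split further. The task names and estimates are made up, and the function name needs_breakdown is only an assumption for the example.

```python
MAX_TASK_DAYS = 2  # rule-of-thumb threshold; adjust to your team's habits


def needs_breakdown(estimates, max_days=MAX_TASK_DAYS):
    """Return the names of tasks estimated above `max_days`, sorted by name."""
    return sorted(name for name, days in estimates.items() if days > max_days)


# Hypothetical tasks with estimates in days.
estimates = {
    "audit dependents of the service": 5,
    "stand up new version in pre-prod": 2,
    "write client migration guide": 1.5,
    "migrate first dependent": 4,
}

print(needs_breakdown(estimates))
# ['audit dependents of the service', 'migrate first dependent']
```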

(1) How can I identify smaller, independent high-level milestones that are relevant for external stakeholders?

For this part, I would approach the stakeholders (perhaps by cutting a low/medium-severity ticket if there are too many of them) and ask them two things:

How much time do they need for testing in pre-prod before they are confident about releasing the changes to prod?
If the output from your system changes in any way (e.g. your service creates csv files in an external stakeholder's storage and, after the upgrade, the csv file sizes have doubled), will their systems break, and how much time would they need to confirm it?

The answers to these questions would help in coming up with timelines and might pave the way for more questions (which lead to finer milestones).
