I am currently doing some ETL (extract, transform, load), but the transform step takes a minute or two to run. I am having issues with the "load" part of my script, where I need to update a DB with the transformed data.
Pain point: every run makes me wait a minute or two for the transform output, which really slows down my coding velocity and feedback loop because most of my time is spent waiting for the code to run.
What kind of mocking infrastructure do you have? If you know that the "Load" part is broken, just isolate it. Instead of waiting for "Extract" and "Transform" every time, plug some hard-coded data into the "Load" part and see how it breaks.
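To make the "hard-coded data" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `load` function, the `results` table, the row fields, and the in-memory SQLite database stand in for whatever your real script and DB look like.

```python
import sqlite3

# Hypothetical stand-in for the real transform output: a few hand-written rows.
FAKE_TRANSFORMED_ROWS = [
    {"id": 1, "name": "alice", "score": 9.5},
    {"id": 2, "name": "bob", "score": 7.0},
    {"id": 3, "name": None, "score": 0.0},  # edge case: missing name
]


def load(conn, rows):
    """Hypothetical 'load' step: insert or replace rows in a results table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(id INTEGER PRIMARY KEY, name TEXT, score REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO results (id, name, score) "
        "VALUES (:id, :name, :score)",
        rows,
    )
    conn.commit()


def run_load_in_isolation():
    """Run only the load step against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, FAKE_TRANSFORMED_ROWS)
    return conn.execute(
        "SELECT id, name, score FROM results ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_load_in_isolation():
        print(row)
```

This runs in well under a second because extract and transform never execute. If your real load only breaks on particular rows, copy a few of those rows into the hard-coded list.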
If you can't mock it, then try to shrink the input. If the "Load" part is really buggy, it will probably break both when you have 1 million rows of data and when you have just 1,000 rows of data. Slash your data set down by 90% or 99% and see what happens.
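Shrinking the input can be as simple as sampling the extracted rows before they reach the transform. The sketch below assumes the extracted data fits in a list; `shrink` and its parameters are hypothetical names, not part of any library.

```python
import random


def shrink(rows, keep_one_in=100, seed=0):
    """Return a reproducible random sample holding about 1/keep_one_in of rows.

    A fixed seed gives the same sample on every run, so a failure seen once
    can be reproduced on the next run.
    """
    rows = list(rows)
    if not rows:
        return []
    keep = max(1, len(rows) // keep_one_in)  # always keep at least one row
    return random.Random(seed).sample(rows, keep)


if __name__ == "__main__":
    extracted = list(range(1_000_000))  # placeholder for real extracted rows
    small = shrink(extracted, keep_one_in=1000)
    print(len(small))
```

With `keep_one_in=100` you keep about 1% of the rows, and with `keep_one_in=10` about 10%. If the bug disappears on the small sample, that is useful information too: it suggests the failure depends on specific rows or on volume.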
At a high level, the process for debugging is like this:
Our debugging masterclass covers this in far more depth. Check it out: [Masterclass] How To Become A Debugging Master And Fix Issues Faster
I agree with breaking down the process into granular pieces and focusing on fixing the relevant, breaking portion.
Here are some things you can try: