I'm a Data Scientist looking to switch companies and move to a role closer to ML/LLMs. My plan is to build a side project fine-tuning LLMs to familiarize myself with the field and leverage that experience on my resume. I was wondering if anyone here has experience building similar projects or went through a similar learning process. It would be very helpful to get some insights on skill acquisition and finding a job in this area. Here are some examples of the advice I'm looking for, but please feel free to share other aspects as well. Anything will be greatly appreciated:
Thanks in advance and please feel free to share your thoughts!
I would treat this like any other ML project. I would build an end-to-end app where you:
1. Run a Prefect (or other orchestration) job to fetch data from some source (Reddit, Twitter, ...)
2. Put it in S3
3. Read from S3 and write a processing job to fine-tune an LLM in SageMaker
4. Deploy on a SageMaker inference endpoint
5. Wrap it with a Flask app
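To make the shape of that pipeline concrete, here is a rough skeleton of the five stages using only the Python standard library. Everything in it is a hypothetical stand-in: `FAKE_BUCKET` plays the role of S3, `fine_tune` and `deploy` stand in for the SageMaker training job and endpoint, and `handle_request` is where a Flask route would go. None of these names come from a real SDK; the idea is just that each stub gets swapped for the real service once the flow works end to end.

```python
import json
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for an S3 bucket: object key -> bytes.
FAKE_BUCKET: Dict[str, bytes] = {}


def fetch_posts() -> List[dict]:
    """Stage 1: placeholder for the scheduled job that pulls raw data."""
    return [
        {"id": "a1", "text": "How do I fine-tune a model?"},
        {"id": "b2", "text": "What is RAG?"},
    ]


def upload_raw(posts: List[dict], key: str = "raw/posts.json") -> str:
    """Stage 2: placeholder for writing the raw dump to object storage."""
    FAKE_BUCKET[key] = json.dumps(posts).encode("utf-8")
    return key


def fine_tune(key: str) -> dict:
    """Stage 3: placeholder for the training job; returns a fake 'model'."""
    posts = json.loads(FAKE_BUCKET[key].decode("utf-8"))
    return {"name": "demo-model", "n_examples": len(posts)}


def deploy(model: dict) -> Callable[[str], str]:
    """Stage 4: placeholder for the inference endpoint."""
    def predict(prompt: str) -> str:
        return f"[{model['name']}] echo: {prompt}"
    return predict


def handle_request(endpoint: Callable[[str], str], body: dict) -> dict:
    """Stage 5: placeholder for the web handler wrapping the endpoint."""
    return {"completion": endpoint(body["prompt"])}


def run_pipeline() -> Callable[[str], str]:
    """Chain stages 1-4 and return the callable 'endpoint'."""
    key = upload_raw(fetch_posts())
    model = fine_tune(key)
    return deploy(model)


if __name__ == "__main__":
    endpoint = run_pipeline()
    print(handle_request(endpoint, {"prompt": "hello"}))
```

Keeping each stage behind a small function like this makes it easier to replace one piece at a time (for example, real storage first, then real training) without rewriting the rest.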
Here are my general thoughts on ML roles for LLMs, since I see a lot of people interested in this.
In my opinion, from an ML/Data Science perspective, an LLM is just like any other model. It's similar to how you would use BERT for something, except that it's like BERT on steroids.
I would also suggest reevaluating your motivation for pursuing an LLM-driven field. This masterclass on choosing a good company/team advises that things like product, technical, level, and comp are generally overrated. At the end of the day, an LLM is just another model from a company perspective.
Again, in my opinion, the reason there is a lot of hype right now is that the barrier to entry for LLMs is pretty low. Most (>75%) of the applications, which tend to be low-hanging fruit, are already being grabbed. These can be built with the ChatGPT API, with out-of-the-box tools to fine-tune and deploy LLMs, or with another set of tools for RAG. All of this is software heavy rather than ML-expertise heavy. (AWS also has a lot of built-in tools that simplify deploying LLMs, since they want to streamline and productize it as much as possible.)
The only place where you truly need someone with data science skills is in companies that are building novel/SOTA LLMs (e.g. phind.ai, magic.ai), and those usually need PhDs.
Also, 90% of the hard part of fine-tuning an LLM is the "getting the data to the LLM" part, i.e. the data engineering part.
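As a small illustration of what that data engineering tends to look like, here is a minimal sketch that turns raw scraped posts into deduplicated prompt/completion records in JSONL form. The `title`/`body` field names, the `min_chars` threshold, and the prompt/completion layout are all assumptions made for the example; the format a given fine-tuning tool expects may differ.

```python
import json
import re
from typing import Iterable, List


def clean_text(text: str) -> str:
    """Drop URLs and collapse runs of whitespace."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def to_training_records(posts: Iterable[dict], min_chars: int = 20) -> List[dict]:
    """Build prompt/completion pairs, skipping short and duplicate rows."""
    seen = set()
    records = []
    for post in posts:
        prompt = clean_text(post.get("title", ""))
        completion = clean_text(post.get("body", ""))
        # Skip rows with no prompt or a completion too short to learn from.
        if not prompt or len(completion) < min_chars:
            continue
        # Deduplicate on the cleaned, case-insensitive pair.
        key = (prompt.lower(), completion.lower())
        if key in seen:
            continue
        seen.add(key)
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    raw = [
        {"title": "What is  RAG?",
         "body": "Retrieval augmented generation, see https://example.com for details."},
        {"title": "What is RAG?",
         "body": "Retrieval augmented generation, see for details."},
        {"title": "Too short", "body": "ok"},
    ]
    print(to_jsonl(to_training_records(raw)))
```

Even in a toy version, most of the code is filtering and normalization rather than anything model-specific.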
There is also a ton of competition in this field, and it is incredibly saturated. There really aren't a ton of jobs where they want someone to fine-tune LLMs (source: https://youtu.be/5p248yoa3oE?si=FmANIzPN-_zMtXvB&t=758). Think about how many companies need AI/ML in the first place; it's not as many as need something like SWE. The number of companies that need LLMs is then a fraction of that. It is important to keep in mind that most data in companies is tabular, and supervised learning still accounts for the vast majority of the ML expertise needed.
Finally, the AI/LLM field is moving really fast. A lot of the LLM-specific expertise needed today is going to be obsolete in 6-12 months.
All this is to say: focus on the fundamentals of understanding how to frame, design, and solve problems, and less on trying to find the right space/tools/expertise.
I know this might have sounded a bit rough and may not exactly answer your question! If LLMs are something you're deeply passionate about, then by all means go for it! I just wanted to paint a more complete and realistic picture of the LLM landscape, since I've seen a lot of people trying to chase the hype or sell the hype.
I don't have much to add to Sai's great answer, but one thing I recommend is to find a partner to work with. The likelihood of completing the project is way higher if you have someone to learn from and keep you accountable.
Thanks Sai, that was extremely helpful! I definitely agree that knowing how to frame, design, and solve problems is more important, and that's what I'm aiming for. What would be a good way to show companies that I'm good at solving problems in general, rather than just being good at one tool? Would writing and publishing something that documents my thought process help? Or are there better ways to achieve this? Thanks again!