
Looking for advice on fine-tuning LLMs as a side project

Entry-Level Data Scientist at Flatiron Health, 8 months ago

I'm a Data Scientist looking to switch companies and move to a role closer to ML/LLMs. My plan is to build a side project fine-tuning LLMs to familiarize myself with the field and leverage that experience on my resume. I was wondering if anyone here has built similar projects or gone through a similar learning process. It would be very helpful to get some insights on skill acquisition and finding a job in this area. Here are some examples of the advice I'm looking for, but please feel free to share other aspects as well. Anything will be greatly appreciated:

  1. What are some good resources to learn about building LLMs? (Currently I'm mostly learning from Hugging Face, Reddit, and googling.)
  2. What's the best tech stack for building personal fine-tuned LLM projects? (I'm planning to use Runpod or a similar service like Vast for training and inference, but was wondering if there are better options.)
  3. I'm looking to get into an early-stage company in this field. What kind of project should I build to maximize my chances of getting into such companies? My plan right now is to fine-tune a model on literary works (novels, poems, prose, etc.), since training data is relatively abundant and it's aligned with my interests. Are there more impactful use cases (for job hunting) out there?
  4. What should I keep in mind when producing deliverables to better showcase my technical and learning abilities? I'm planning to write a series of blog/social media posts documenting my experience building this project. Is there anything in particular that would draw companies' attention?

Thanks in advance and please feel free to share your thoughts!


Discussion

(3 comments)
  • 7
    ML Engineer
    8 months ago

    I would treat this like any other ML project. I would build an end-to-end app where you:

    1. run a Prefect (or other orchestration) job to fetch data from some source (Reddit/Twitter/...)

    2. put it in S3

    3. read from S3 and write a processing job to fine-tune an LLM in SageMaker

    4. deploy it on a SageMaker inference endpoint

    5. wrap it with a Flask app

    In my opinion, from an ML/data science perspective, an LLM is just like any other model. It's similar to how you would use BERT for something, except that it's like BERT on steroids.
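
    To make the shape of steps 1 to 3 concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: `fetch_posts` returns hard-coded records instead of calling a real API, a local folder stands in for the S3 bucket, and the function names and the `prompt`/`completion` record layout are hypothetical. The fine-tuning job, endpoint, and web app (steps 3 to 5 proper) are left out.

```python
import json
import random
import tempfile
from pathlib import Path


def fetch_posts():
    """Step 1 stand-in: a real job would call a Reddit/Twitter API from an orchestrator."""
    return [
        {"prompt": f"Question {i}", "completion": f"Answer {i}"} for i in range(10)
    ]


def upload(records, bucket_dir, key):
    """Step 2 stand-in: write JSON Lines to a local folder that plays the role of S3."""
    path = Path(bucket_dir) / key
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path


def load_and_split(path, val_fraction=0.2, seed=0):
    """Step 3 (data half): read the records back and split them into train/validation."""
    with Path(path).open(encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(records)
    n_val = max(1, int(len(records) * val_fraction))
    return records[n_val:], records[:n_val]


def run_pipeline(bucket_dir):
    """Chain the stages; the fine-tuning job would start from the returned splits."""
    raw_path = upload(fetch_posts(), bucket_dir, "raw/posts.jsonl")
    return load_and_split(raw_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        train, val = run_pipeline(tmp)
        print(len(train), "train /", len(val), "validation records")
```

    Keeping each stage as its own function means the stand-ins can later be swapped for a real fetcher, a real bucket client, and a real training job without changing how the stages are chained.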

  • 3
    Tech Lead/Manager at Meta, Pinterest, Kosei
    8 months ago

    I don't have much to add to Sai's great answer, but one thing I recommend is to find a partner to work with. The likelihood of completing the project is way higher if you have someone to learn from and keep you accountable.

  • 0
    Entry-Level Data Scientist [OP]
    Flatiron Health
    8 months ago

    Thanks Sai, that was extremely helpful! I definitely agree that how you frame, design, and solve problems is more important, and that's what I'm aiming for. What would be a good way to show companies that I'm good at solving problems in general, rather than just being good with one tool? Would writing and publishing something that documents my thought process help? Or are there better ways to achieve this? Thanks again!
