Hello everyone, for Meal Forger, I'm looking to implement a feature that will allow you to take pictures of food and simply add them to your "pantry" to improve your search. Also, feel free to check it out. I'm currently using [...] for calorie calculation and validation. I'm specifically looking for something cost-effective or even free, or I could simply train a model myself, which could be an amazing learning opportunity.
Meal Forger
Ah. Adding a custom model will make the complexity of this project skyrocket. A custom model means you're taking care of:
... all of which are extremely time-consuming to get right.
I'd just go with GPT-4o mini; the token costs should be negligible if you're careful.
I would use Gemini 2.5 Flash. Google gives you 1,500 free requests per day.
Food image recognition is a problem that has been worked on for years, well before the age of LLMs. I would be surprised if there wasn't a good solution out there. The tricky part is finding one that doesn't melt your wallet.
My advice is to go through all the big consumer LLM chatbots (ChatGPT, Gemini, Claude, Meta AI, etc.), upload pictures of food (fridge, pantry, etc.), and ask each one to identify which ingredients are in the photo. Use the same test suite of 5-10 images for every chatbot so the results are comparable. For each chatbot, record how accurate it was (which ingredients it got right, which it missed, and which it made up) and compile that info into a spreadsheet. From there, look up the API costs and weigh them against the quality of each LLM. Your goal is a good balance of low cost and decent accuracy.
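If you'd rather script the scoring than tally it by hand, here's a rough sketch in Python of the comparison described above. It assumes you've already collected each model's ingredient lists (by hand or through its API) and written down the correct ingredients for each test photo. Everything in it is hypothetical: the file names, the model names, the answers, and especially the `cost_per_image` figures, which are placeholders you'd replace with numbers from each provider's pricing page. Precision, recall, and F1 are just one reasonable way to define "accuracy" here, not the only one.

```python
def normalize(items):
    """Lowercase and strip names so 'Eggs ' and 'eggs' count as the same item."""
    return {item.strip().lower() for item in items}


def evaluate(ground_truth, predictions):
    """Score one model over every test image.

    Returns (precision, recall, f1):
      precision = share of the model's guesses that were really in the photo
      recall    = share of the real ingredients that the model found
    """
    tp = fp = fn = 0
    for image, expected in ground_truth.items():
        exp = normalize(expected)
        pred = normalize(predictions.get(image, []))
        tp += len(exp & pred)   # correctly identified
        fp += len(pred - exp)   # made up by the model
        fn += len(exp - pred)   # missed by the model
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical test suite: what is actually in each photo.
ground_truth = {
    "fridge.jpg": ["milk", "eggs", "butter"],
    "pantry.jpg": ["rice", "pasta"],
}

# Hypothetical answers from two models, plus PLACEHOLDER costs per image.
models = {
    "model_a": {
        "cost_per_image": 0.002,  # placeholder, not a real price
        "predictions": {
            "fridge.jpg": ["Milk", "eggs", "cheese"],
            "pantry.jpg": ["rice", "pasta"],
        },
    },
    "model_b": {
        "cost_per_image": 0.001,  # placeholder, not a real price
        "predictions": {
            "fridge.jpg": ["milk"],
            "pantry.jpg": ["rice"],
        },
    },
}

# Build one row per model, best F1 first, ready to paste into a spreadsheet.
rows = []
for name, info in models.items():
    precision, recall, f1 = evaluate(ground_truth, info["predictions"])
    rows.append((name, precision, recall, f1, info["cost_per_image"]))
rows.sort(key=lambda row: row[3], reverse=True)

print("model,precision,recall,f1,cost_per_image")
for name, precision, recall, f1, cost in rows:
    print(f"{name},{precision:.2f},{recall:.2f},{f1:.2f},{cost}")
```

Looking at precision and recall separately is useful for a pantry feature: a model that invents ingredients (low precision) pollutes the pantry, while one that misses items (low recall) just means the user adds a few by hand.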
You definitely do not want to be training a model yourself. That would be a ton of work, and you would be reinventing the wheel (and probably ending up with a much worse one too).