"There are many things one needs to live a rich and fulfilled life (according to AI researchers). A good initialization [Mishkin and Matas, 2015], attention-based neural networks [Vaswani et al., 2017], and a good title for your research paper [Myself, just now], to name a few.
In this post, we discuss another piece of eternal wisdom from AI researchers: “less is more.” Specifically, how foundation models can be fine-tuned for new capabilities with small data, in many cases fewer than one thousand samples, and often outperform the same model fine-tuned on larger datasets. Meditate on that for a moment (suggested pose in figure above)."
I discuss "LoRA Land", a large-scale empirical study of fine-tuning 7B models to outperform GPT-4, and in the discussion section I make the case for the return of fine-tuning, i.e. what has changed in the past six months.
Seems topical given some recent front-page HN articles on fine-tuning. I discuss a large-scale empirical study from 2024 of fine-tuning 7B models to outperform GPT-4 and GPT-3.5-Turbo, as well as arguments for why fine-tuning is coming back into favor.
Hello Fellow Hackers, I wanted to share what my team is building. We released our open-source library for foundation model development in February and we're about to release our first Enterprise offering.
In brief, we've developed an easy-to-use platform for fine-tuning custom models. We automate data synthesis for judging and training, as well as the judge prompt itself. The end result is that model development times and costs are drastically cut!
Check out our Substack article above if you're interested in learning more or signing up for early access :)
This is a really powerful technique in general because it gives us some control over traditional PCG techniques! All you need is the right prompt and an evaluation metric - it could definitely apply to Voronoi maps.
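To make the "prompt plus evaluation metric" idea concrete, here is a minimal, hypothetical sketch of the generate-and-evaluate loop behind search-based PCG. The generator is a stand-in for whatever produces candidates (an LLM given the right prompt, or plain random sampling as here), and the metric is an illustrative spread score for Voronoi seed points; both function names and the metric choice are my own assumptions, not from any particular library.

```python
import random

def spread_metric(points):
    """Score a candidate seed set by its minimum pairwise distance.
    Larger = points are more evenly spread, a common quality proxy
    for Voronoi-based maps (avoids tiny sliver cells)."""
    best = float("inf")
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            best = min(best, (dx * dx + dy * dy) ** 0.5)
    return best

def generate_candidate(rng, n_points=8):
    """Stand-in for the generator (e.g. an LLM prompted for seed points);
    here we just sample uniformly in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n_points)]

def search(n_candidates=50, seed=0):
    """Generate-and-evaluate loop: keep the candidate the metric likes best."""
    rng = random.Random(seed)
    best_points, best_score = None, float("-inf")
    for _ in range(n_candidates):
        cand = generate_candidate(rng)
        score = spread_metric(cand)
        if score > best_score:
            best_points, best_score = cand, score
    return best_points, best_score

points, score = search()
```

The winning seed set could then be handed to something like `scipy.spatial.Voronoi` to build the actual map; swapping the random generator for an LLM and the spread score for a gameplay metric is what gives the controllability mentioned above.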
On a related note, I've started a blog on procedural content generation and GenAI content synthesis: https://gamedev.blog/. Would love any feedback / suggestions! I intend to cover Voronoi diagrams in the near future + a Python implementation and turning it into a 3D map with Unity
Unsloth doesn't have an official multi-GPU story: there are hacked-together solutions, but they're finicky even for smaller models.
In general, DeepSeek has very few resources on fine-tuning, and those get muddied further by people referring to the distills when they claim to be fine-tuning it.
There are quite a few differences between HuggingFace's Open Deep-Research and Zilliz's DeepSearcher.
I think the biggest one is the goal: HF's is to replicate the performance of Deep Research on the GAIA benchmark, whereas ours is to teach agentic concepts and show how to build research agents with open-source tools.
Also, we go into the design in a lot more detail than HF's blog post. On the design side, HF uses code writing and execution as a tool, whereas we use prompt writing and calling as a tool. We do an explicit breakdown of the query into sub-queries, sub-sub-queries, etc., whereas HF uses a chain of reasoning to decide what to do next.
I think ours is the better approach for producing a detailed report on an open-ended question, whereas HF's is better for answering a specific, challenging question in short form.
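The explicit sub-query breakdown described above can be sketched as a small recursive tree builder. This is an illustrative sketch only, not DeepSearcher's actual code: `llm` is a stub standing in for a real model call, and the fixed fan-out of two sub-queries per level is an assumption for brevity.

```python
def llm(prompt: str) -> str:
    """Stub LLM: in practice, call your model here (e.g. via an API)."""
    return "sub-question about: " + prompt[-40:]

def decompose(query: str, depth: int = 0, max_depth: int = 2) -> dict:
    """Recursively break a query into sub-queries, building a tree.
    Leaves are answered directly; inner nodes would be synthesized
    from their children when assembling the final report."""
    node = {"query": query, "children": []}
    if depth < max_depth:
        # Ask the model for sub-questions (fixed fan-out of 2 here).
        for i in range(2):
            sub = llm(f"Break down (part {i + 1}): {query}")
            node["children"].append(decompose(sub, depth + 1, max_depth))
    else:
        node["answer"] = llm(f"Answer concisely: {query}")
    return node

tree = decompose("How do research agents work?")
```

Walking the tree bottom-up and merging leaf answers is what turns this structure into a detailed long-form report, which is where the tree-shaped breakdown pays off over a single linear reasoning chain.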
"There are many things one needs to live a rich and fulfilled life (according to AI researchers). A good initialization [Mishkin and Matas, 2015], attention-based neural networks [Vaswani et al., 2017], and a good title for your research paper [Myself, just now], to name a few.
In this post, we discuss another piece of eternal wisdom from AI researchers: “less is more.” Specifically, how foundation models can be fine-tuned for new capabilities with small data, in many cases less than one-thousand samples, and often outperform the same model fine-tuned on larger datasets. Meditate on that for a moment (suggested pose in figure above)."