
I was about to say exactly this—it's not really that different from managing a bunch of junior programmers. You outline, they implement, and then you need to review certain things carefully to make sure they didn't do crazy things.

But yes, these juniors take minutes versus days or weeks to turn stuff around.


"But this logic breaks down for advanced models, and badly so. At high performance, fine-tuning isn’t merely adding new data — it’s overwriting existing knowledge. Every neuron updated risks losing information that’s already intricately woven into the network. In short: neurons are valuable, finite resources. Updating them isn’t a costless act; it’s a dangerous trade-off that threatens the delicate ecosystem of an advanced model."

Mainly including this article to spark discussion—I agree with some of this and not with all of it. But it is an interesting take.


I miss the entire Maker Movement moment. It isn’t actually super complex to make a simple circuit board that does some basic things for you.

It’s cheaper to just outsource making the board, but I made my own simple boards (etched using hobbyist CNC machines), and I even used a bunch of surface-mount parts with them.


How AI agents of the current era (with LLMs) differ from the history of reinforcement learning-based agents


ML researchers in statistics departments write stuff in R, which makes everyone scream. ML researchers absolutely do.

My point in the article was basically that the class was "indoctrinating" (too strong a word, but you get the point) future ML researchers into the superiority of using CUDA, and that NVIDIA spends company resources to keep doing so in these classes, year after year.


This hits the nail on the head. Nvidia got all the programmers excited about using their GPUs first and now they have all the software targeting their hardware.

Even if you could compile CUDA for Intel and AMD, it wouldn't perform well. When you program a GPU you aren't just writing task-specific code, you're also writing hardware-specific code. So having developer mindshare matters much more than having a nice programming language.

In ML many people write PyTorch and not CUDA. But even in ML, the choice of precision is driven by the data types Nvidia can deal with efficiently; this is a moat that has nothing to do with CUDA.
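
For what it's worth, here's a minimal sketch of what that looks like in practice (the model and sizes are made up, purely to illustrate that it's the dtype choice, not the kernel language, that is hardware-driven):

    import torch

    # float16/bfloat16 get picked largely because NVIDIA tensor cores handle
    # them efficiently; nothing here is CUDA-the-language, yet the hardware
    # still drives the choice.
    model = torch.nn.Linear(1024, 1024).cuda()
    x = torch.randn(8, 1024, device="cuda")

    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        y = model(x)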


> I dont understand this - arent almost all ML NN models built in pytorch, and arent these compiled / jit'd into a lower level format - and can we not have various backends/drivers for that, such as CUDA / ROCM / vnni ?

PyTorch already does. But if you're saying "NN" and "pytorch" that already means you're outside of the audience for CUDA I'm talking about in the article. My own stuff was usually Bayesian Hierarchical Models, which at least at the time made pytorch completely useless (that was nearly a decade ago though—maybe that specific use case improved).
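
To make "PyTorch already does" concrete: the backend is picked at runtime, and ROCm builds of PyTorch even reuse the "cuda" device string, so a toy sketch like this (illustrative only, nothing here is from the article) runs unchanged on NVIDIA, AMD, or CPU:

    import torch

    # PyTorch dispatches to whichever backend is available; ROCm builds
    # also expose themselves through the "cuda" device name.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(4, 4, device=device)
    print(x.device, (x @ x).sum().item())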

If you've tried to write actually new (or different enough) NNs or entirely different models, pytorch is too high-level, and sometimes even TF is too. Even aside from that, if you're a maintainer of BLAS or some specific library for sparse MM with very specific distributions that are optimized for it...

Anyway, those are the key cases, but even beyond them, if you've ever tried to do non-vanilla stuff with some of the higher-level libraries, nothing works as well as it should. You get random, inscrutable errors; they certainly exist on NVIDIA GPUs and things built on CUDA under the hood, but there are way, way fewer of them. For newer, custom stuff, it's not uncommon to hit numerical overflows or other completely breaking problems on alternative backends that simply don't happen, or work just fine, on the CPU or CUDA backends. Or the CUDA backend is just ridiculously faster. If you're doing something annoying, new, and complicated enough, there's no point in taking on the aggravation.

The people who write the stuff that is used in PyTorch or other libraries definitely write CUDA code (in C++ etc). And then the people who use PyTorch just build on top of that.
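
As a rough illustration of that split (the op and names here are hypothetical; the real built-in ops live in the ATen/cuBLAS/cuDNN layers and are far more involved), this is roughly the shape of how a compiled C++/CUDA kernel gets exposed to the Python layer:

    import torch
    from torch.utils.cpp_extension import load_inline

    # Library authors write the kernel in C++ (or CUDA via cuda_sources);
    # PyTorch users only ever see the generated Python binding.
    cpp_source = """
    torch::Tensor double_it(torch::Tensor x) { return x * 2; }
    """

    ext = load_inline(name="double_it_ext",
                      cpp_sources=cpp_source,
                      functions=["double_it"])
    print(ext.double_it(torch.ones(3)))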

I deliberately tried to keep it accessible so that non-technical (or just non-software) audiences could also get an intuition for why CUDA has such strong lock-in. Otherwise, the pushback I've often gotten is "just re-write it" or "it's just software," which, if it were really that simple, wouldn't leave people yelling so much at AMD across so many comments. Basically, it comes from people who can't fathom why software technical debt could ever be a thing; or, if it is, well, China has infinite money and time anyway.

A high-level analysis would say that Huawei, AMD, and Intel should all easily be able to invest enough to make this work and compete with CUDA to push their hardware platforms. The reality is that decentralized decision-making by users makes it an expensive, uncertain bet that people will actually adopt. Many of the lower-level, underlying libraries that things are built on, and the researchers doing bleeding-edge research, still have a huge amount of experience with, and code built on, CUDA.


Search engines have always been a bit weird. Do they have network effects? Why is it actually winner-take-all? I've had spirited conversations with a lot of different people and academics/microeconomists on the topic and I don't think anyone truly has a good conclusion. It doesn't naturally seem like it should be the case.

Anyway, I think your point here is interesting, and it was kind of the idea behind a lot of the "gather lots of data" startups. A lot of those failed in part because the frontier of AI is moving pretty quickly. You need a lot less data to do interesting things today than you did not that long ago. Because we've thrown more and more data at more and more compute, I think people don't appreciate how much we've truly progressed algorithmically. You need an order of magnitude less data to do the same thing for each "generation" of AI.

That frontier cuts against the ability to build a moat on user-generated data, so long as it's readily available or somewhat replicable. Your competitor is naturally going to have a cheaper time getting into market than you if they wait longer to do so.

However, this definitely does stand if your area truly is obscure (e.g. specific industry), annoying to gather data in (e.g. certain healthcare applications), or actually proprietary (e.g. your own device data with a different modality).

Not to put words in your mouth; I'm not saying you're claiming the latter here. Just making the distinction, since it's easy to imagine any data being a moat, which is a common mistake I see.


> Why is it actually winner-take-all?

The power of defaults, mostly.

The average user experience of picking up an internet-connected device has been very intentionally cultivated by Google. Whether you're in your browser or on your phone, Google has spent a lot of money building up Chrome as a browser ecosystem and Android on mobile, and paying off Apple on iPhones and competing browser vendors like Firefox, to guarantee that, whenever possible, Google is always the default search engine. The only non-Google default will typically be on Edge, which only has about 5-6% penetration. Since Google historically has always been the best search engine in the space, does not explicitly charge its users money, and (at least for average users) is really good at surfacing what they're looking for, most users feel no need to look elsewhere: the default just works, switching would demand effort, and Google is what they'd want anyway. The moat isn't big, but Google has put a ton of work into ensuring that any competing search engine requires an intentional, active choice by users to seek it out, usually for a worse experience.

At least until the recent AI play by Bing, this tiny moat was always sufficient, because if you start from scratch on search, you're essentially guaranteed to be worse, and all the other 'serious' offerings under the hood were weak alternatives: essentially one of "Bing search API wrappers" (worse results), "nation-state-actor search engines" (for most users, worse results), or "Google, but with some cursory privacy measures, a subscription fee, or filtration features" (which wasn't something most users cared about).

Recent chat AI represents a competing alternative to doing a search in the first place, which jeopardizes the "we have essentially all defaults and users can't be assed to switch to a worse search" barrier to entry that Google historically relies on, which is ringing alarm bells for them.


Agreed that it's not intuitive. My guess is that it's not a network effect in the traditional sense (where having more users makes the product more valuable), but rather that there is something about a product category being free that lends itself to being winner-take-all. Users are less motivated to comparison shop, and if any one company gets enough market share to be a default choice, then maybe it just snowballs. Like, if tissues were free, would Kleenex have a monopoly because everyone just reaches for it by default?


Potentially—I suppose free can lead to more justifiable laziness in finding different resources for different things.

The argument I generally hear from certain microeconomists is that they still expect there to be value in niches, given Google's highly general nature. If you're looking for super specific topics, it often doesn't perform extremely well. You'd find it valuable to go to a resource tailored for your area.

Anecdotal, but I've personally found it to be true—for specific hobbies, or for more "real" reviews, I search reddit. Except I use Google to search reddit, since reddit's search sucks, but still. Amazon or Etsy or whatever can be considered "search engines" for highly specific topics (purchases, and purchases of a specific type of product) and they do have success there too, but Google is still often the front-page to get people to those sites.

Maybe it's that Google is just a default "front page" and enough tech-non-savvy people use it to get to where they want to go (e.g. the classic "type Facebook into Google to get to Facebook") that it sticks. That's maybe the most compelling reason I've heard, but it's also somewhat unsatisfying (as well as precarious if the defaults ever change, but maybe that's true!).


> The argument I generally hear from certain microeconomists is that they still expect there to be value in niches, given Google's highly general nature. If you're looking for super specific topics, it often doesn't perform extremely well. You'd find it valuable to go to a resource tailored for your area.

I think that is true, but only within a very narrow band of topics that are broad enough to require search functionality (LexisNexis, WebMD, arXiv, etc.). I think most topic niches that I would be interested in are more often served by niche publications (i.e. I wouldn't need/want a search engine geared toward photography; I would mostly go to specific publications and sites that I trust).


> Why is it actually winner-take-all?

Is search winner takes all? Or is it advertising?


That's an interesting question since advertising certainly follows search dominance, but it doesn't necessarily follow the other way around. Google figured out how to monetize its dominance with advertising before it had the behemoth ad platforms they have today. It's pretty much the same with the popular social networks.

The answer (more logically) should kind of be neither. Advertising obviously has a lot of channels, and even though Google has both AdWords and its display advertising network, it doesn't follow that those really need to be the same provider... at least outside of more data to do more targeted ads. But, again, advertising dollars will follow platforms that price for ROI.

Better targeting mainly adds to the amount that Google and Facebook can charge for their ads and still have companies pay for them. It doesn't really add to their dominance directly (I say directly since, obviously, more money can buy more R&D/employees/regulatory capture/acquiring competitors/just-paying-for-dominance like with Google paying Apple. But that's all indirect).


Nothing prevents users from switching search engines. I switched mine! Advertisers on the other hand want as many eyeballs that fit their target profile as possible.


Once Google became a verb, well then... that's when they won?


Because the phrase is "Just google it" not "Just search-engine it"


Just search it.


I agree with various of the aspects you call out here. The specificity of the application is fairly key, whether it comes through proprietary data, application-specific work, or simply business lock-in.

Interesting hill analogy—I do broadly agree with the areas.


Good point—that is the point. When there's a hype cycle, people often check their normal business sense at the door in terms of customers, value generation, and defensibility. Are there any of these? No, but it's crypto. Or now AI.

I do go somewhat beyond that in pointing out exactly why most of these startups don't have defensibility.

Perhaps for some people it doesn't need to be said, but back when I wrote this... and now... the market seems to suggest that it isn't that obvious.

