
I don't get the "plagiarism/miscrediting" accusations. This was in the original PR (https://github.com/ggerganov/llama.cpp/pull/613):

> This PR was written in collaboration with @slaren. This PR is also rebased on PR #586 so please do not squash merge! Use either merge or rebase.

jart made sure that the other user got credit, in addition to making sure that their name was properly attributed in the commit log. Given all this, it feels like the drama--shouldn't exist? Like, if there's an issue with attribution, it's not because of bad faith, and I feel like a good-faith conversation could have just resolved this, instead of bringing in trolls.



That's not the original PR. jart was working on a malloc() approach that didn't work and slaren wrote all the code actually doing mmap, which jart then rebased in a random new PR, changed to support an unnecessary version change, magic numbers, a conversion tool, and WIN32 support when that was already working in the draft PR. https://archive.ph/Uva8c

This is the original PR: https://github.com/ggerganov/llama.cpp/pull/586.

Jart's archived comments:

"my changes"

"Here's how folks in the community have been reacting to my work."

"I just wrote a change that's going to let your LLaMA models load instantly..."

https://archive.ph/PyPFZ

"I'm the author"

https://archive.ph/qFrcY

"Author here..."

"Tragedy of the commons...We're talking to a group of people who live inside scientific papers and jupyer notebooks."

"My change helps inference go faster."

"The point of my change..."

"I stated my change offered a 2x improvement in memory usage."

https://archive.ph/k34V2

"I can only take credit for a 2x decrease in RAM usage."

https://archive.ph/MBPN0

"I just wrote a change that's going to let your LLaMA models load instantly, thanks to custom malloc() and the power of mmap()"

https://archive.ph/yrMwh

slaren replied to jart on HN asking her why she was doing and saying those things, and she didn't bother to reply to him, despite replying to others in that subthread within minutes. https://archive.ph/zCfiJ


Hmm, based on what you've quoted here and knowing nothing else but a few messages on AI Twitter I would invest in jart.

This is BillG-style product skill -- there is a ton of work that goes into representing a piece of software as something important and valuable that people should buy into.


Jart is a pretty exceptional engineer; even if she wrote this patch single-handedly, it would hardly be a footnote in her list of professional accomplishments. This is the author of Cosmopolitan libc, redbean and APE we're talking about, after all.

That being said, it's important to attribute work properly. It can be easy to mix things up (e.g. "my patch" is excusable), but repeatedly insisting on authorship when you're not the author of the change just seems disingenuous. I'm sure it was in good faith, but since they didn't address the issue or clear anything up, it's come to this.

Dramatic, and hardly the conclusion people wanted to the story of a free performance improvement. It's not entirely contrived though, and I think the maintainer handled this exceptionally well given the circumstances.


> This is the author of Cosmopolitan libc, redbean and APE we're talking about, after all.

Is it? If she so easily misrepresented slaren's work as hers in this case, what other work isn't actually attributable to jart?


I'm all for detracting from suspicious authors, but it's unlikely Justine just steals other people's code wholesale. She's been an active community member for a while, and wrote a lot of impressive software before LLMs and script kiddies democratized the whole process.

In this specific instance, jart had a communication error that she failed to clarify, and so things compounded from there. The part that she didn't author is clearly defined in Git, and the most-plausible explanation is an honest mistake. Assuming ill-intent requires you to ignore the original context of the disagreement and focus on the outrage, which pretty much says it all.

That being said, I'd love to hear what evidence you have to the contrary. Maybe you've got a link to an FTP server from 2001 with the Blinkenlights source code on it, I can't say for sure. A fraud probably doesn't write in-depth patch breakdowns on their personal blog for fun, though.


> > This PR was written in collaboration with @slaren. This PR is also rebased on PR #586 so please do not squash merge! Use either merge or rebase.

I read that PR (didn't click any links) and here on HN posted a "Great work" to jart. The reason I did that is precisely because those final lines in the PR came across as an upright acknowledgement that some people helped out. I also got the impression that jart was a co-owner of the project with all the "we"s that were thrown around.

If I were writing that PR, it would be something like "this PR consolidates slaren's mmap approach with additional work done for ... by myself". After hearing about the drama, actually reading slaren's PR, and reviewing jart's comments in the issues, the PR, and the HN show and tell, I am now convinced this is someone who wants to steal other people's thunder. Heck, even this front page article is yet another PR stunt. I suspect "faster fork of llama.cpp" posts will follow.

Georgi Gerganov remains for me the hacker hero here as far as LLMs are concerned -- mmap is kiddie stuff to be frank, but anyone who gets whisper and llama to work on my laptop with a handful of files (many thanks to you sir) has my technical respect. And I think he has made the right call regarding the project.


I think that Georgi regrets opening the project so widely to PRs; he was probably happier running it on his own.



