
https://www.bloomberg.com/news/articles/2024-07-01/dan-davie...

Dan Davies did a great interview on Odd Lots about this; he called them "accountability sinks".


I think about a third of the reason I get lead positions is because I'm willing to be an 'accountability sink', or the much more colorful description: a sin-eater. You just gotta be careful about what decisions you're willing to own. There's a long list of decisions I won't be held responsible for and that sometimes creates... problems.

Some of that is on me, but a lot is being taken for granted. I'm not a scapegoat, I'm a facilitator, and being able to say "I believe in this idea enough that if it blows up, you can tell people to come yell at me instead of at you" unblocks a lot of design and triage meetings.


what if people enjoy being in good shape and having engaging workouts in the home?


There are a lot of webcrawlers whose chief feature is turning the website into markdown. I don't quite understand what they're doing for me that's useful, since I can just do something like `markdownify(my_html)` or whatever. All this to say: I wouldn't find this useful, but clearly people think this is a useful feature as part of an LLM pipeline.


You don't want the footer or navigation in the output. Ideally you want the main content of the page, if it exists. How do you assign header level if they're only differentiated by CSS left-margin in a variety of units? How do you interpret documents that render properly but are hardly correct HTML?
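A toy stdlib-only sketch of the easy part of this: dropping `<nav>`/`<footer>` subtrees and mapping real `<h1>`-`<h6>` tags to `#` headings. It deliberately sidesteps the hard case raised above, where heading level only exists in CSS margins; a real converter needs heuristics (or rendering) for that.

```python
from html.parser import HTMLParser

class MainContentToMarkdown(HTMLParser):
    """Toy HTML-to-markdown converter: drops nav/footer/script/style
    subtrees, maps <h1>-<h6> to '#' headings, keeps other text as-is."""
    SKIP = {"nav", "footer", "script", "style"}

    def __init__(self):
        super().__init__()
        self.out = []        # collected markdown lines
        self.skip_depth = 0  # >0 while inside a skipped subtree
        self.heading = 0     # current heading level, 0 = body text

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag[0] == "h" and tag[1:].isdigit():
            self.heading = int(tag[1:])

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag[0] == "h" and tag[1:].isdigit():
            self.heading = 0

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth:
            return
        prefix = "#" * self.heading + " " if self.heading else ""
        self.out.append(prefix + text)

p = MainContentToMarkdown()
p.feed("<nav>Home | About</nav><h1>Title</h1>"
       "<p>Body text.</p><footer>(c) 2024</footer>")
markdown = "\n\n".join(p.out)
```

Note this relies on the page using semantic tags at all; the "renders properly but is hardly correct HTML" case is exactly where a one-liner like this falls apart.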


Thanks, I guess. None of that stuff seemed super useful to cut systematically, but I'm gonna run some tests.


Fractional AI | Founding Engineer | San Francisco | Onsite | Full-time

Who: We're Fractional AI -- the dev shop for difficult enterprise applications of genAI.

Looking for: You! Full-stack founding engineers (5+ years of experience, 150k-210k cash, meaningful equity, 99% healthcare premium coverage)

How: apply https://jobs.lever.co/fractional or reach out to me (ben@ fractional.ai)

Why we may be a great fit:

- High-impact AI projects - we work on companies' hardest problems to get genAI into production (in the same quarter you might build an AI phone agent for one customer and automate a complex workflow for another)

- Culture of a startup, but substantive problems to solve that impact millions of users

- Curious, humble, banter-y in-person team comprised of multi-time founders

Why we may not be the best fit:

- Excellence isn't really important to you (this is less of a 'move fast and break things' role, though we respect that ethos!)

- Predictability is important to you - we work across customers, tech stacks, industries, etc.

-Interacting with customers isn't your thing


Has anyone else found a good way to swap out models between companies? LangChain has made it very easy for us to swap between OpenAI/Anthropic etc.


The point is that you don’t need a framework for that; the APIs are already similar enough that it should be obvious how to abstract over them using whatever approach is natural in your programming language of choice.


I have a consumer app that swaps between the 5 bigs and wholeheartedly agree, except, God help you if you're doing Gemini. I somewhat regret hacking it into the same concepts as everyone else.

I should have built stronger separation boundaries with more general abstractions. It works fine, I haven't had any critical bugs / mistakes, but it's really nasty once you get to the actual JSON you'll send.

Google's was 100% designed by a committee of people who had never seen anyone else's API, and if they had, they would have dismissed it via NIH. (disclaimer: ex-Googler, no direct knowledge)
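For a concrete sense of the shape mismatch, here's a simplified sketch of translating OpenAI-style chat messages into Gemini's `contents`/`parts` layout. This covers plain text only; system instructions, tools, and multimodal parts are where it gets genuinely nasty.

```python
def to_gemini(messages):
    """Convert OpenAI-style chat messages to Gemini's request shape.
    Gemini uses the role "model" instead of "assistant" and nests
    each message's text inside a list of "parts"."""
    role_map = {"assistant": "model", "user": "user"}
    return {
        "contents": [
            {
                "role": role_map.get(m["role"], m["role"]),
                "parts": [{"text": m["content"]}],
            }
            for m in messages
        ]
    }

req = to_gemini([
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
])
```

Even in this trivial case you can see why bolting Gemini onto abstractions designed around the flat `messages` list requires a translation layer rather than a rename.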


Luckily, Google now supports using the OpenAI lib: https://cloud.google.com/vertex-ai/generative-ai/docs/multim...


> Google's was 100% designed by a committee of people who had never seen anyone else's API

Google made their API before the others had one, since they were the first to make these kinds of language models. It's just that it was an internal API before.


No.

That'd be a good explanation, but it's theoretical.

In practice:

A) there was no meaningful internal LLM API pre-ChatGPT. All this AI stuff was under lock and key until Nov 2022, then it was an emergency.

B) the bits we're discussing are OpenAI-specific concepts that could only have occurred after OpenAI's.

The API includes chat messages organized with roles, an OpenAI concept, and "tools", an OpenAI concept, both of which came well after the GPT API.

Initial API announcement here: https://developers.googleblog.com/en/palm-api-makersuite-an-...


Google started including LLM features in internal products in 2019 at the latest; I know because I worked there then. I can't remember exactly when they started having LLM-generated snippets and suggestions everywhere, but it was there at least since 2019. So they have had internal APIs for this for quite some time.

> All this AI stuff was under lock and key until Nov 2022

That is all wrong... Did you work there? What do you base this on? Google has been experimenting with LLMs internally ever since the original paper. I worked in search then, and I remember my senior manager saying it was the biggest revolution natural language processing had ever seen.

So even if Google added a few concepts from OpenAI, or renamed them, they still have had plenty of experience working with LLM APIs internally and that would make them want different things in their public API as well.


> LLM generated snippets and suggestions everywhere but it was there at least since 2019

Absolutely not. Note that ex. Google's AI answers are not from an LLM and they're very proud of that.

> So they have had internal APIs for this for quite some time.

We did not have internal or external APIs for "chat completions" with chat messages, roles, and JSON schemas until after OpenAI.

> Did you work there?

Yes

> What do you base this on?

The fact it was under lock and key. You had to jump through several layers of approvals to even get access to a standard text-completion GUI, never mind API.

> has been experimenting with LLMs internally ever since the original paper,

What's "the original paper"? Are you calling BERT an LLM? Do you think transformers implied "chat completions"?

> that would make them want different things in their public API as well.

It's a nice theoretical argument.

If you're still convinced Google had a conversational LLM API before OpenAI, or that we need to quibble everything because I might be implying Google didn't invent transformers, there's a much more damning thing:

The API is Gemini-specific and released with Gemini, ~December 2023. There's no reason for it to be so different other than NIH and proto-based thinking. It's not great. That's why ex. we see the other comment where Cloud built out a whole other API and framework that can be used with OpenAI's Python library.


>All this AI stuff was under lock and key until Nov 2022, then it was an emergency.

This is absolutely false, as the other person said. As one example: We had already built and were using AI based code completion in production by then.

Here's a public blog post from July, 2022: https://research.google/blog/ml-enhanced-code-completion-imp...

This is just one easy publicly verifiable example, there are others. (We actually were doing it before copilot, etc)


Pretending that was an LLM as it is understood today, or that whatever internal API was available for internal use cases is the same as today's public Gemini API for adding a "chat completion" to a "conversation" with messages, roles, and JSON schemas, is silly.


I'm glad you know exactly what happened, what it was capable of, etc, despite not working on it, and not asking a single question of those who did!

This follows right in line with the rest of your approach.

If you want to know things, it works better to ask questions than make assertions about what other people did or didn't do.

Nobody really cares about the opinions of those who can't be bothered to learn.


My understanding is that the original Gmail team actually invented modern LLMs in passing back in 2004, and it’s taken outsiders two decades to catch up because doing so requires setting up the Closure Compiler correctly.


Lol, sounds like you have more experience with other ex-Googlers doing this than I do. I'm honestly surprised; I didn't know there was a whole shell game to be played with "what's an LLM anyway" to justify "what's NIH? our API was designed by experienced experts".


Would recommend just picking up a gateway that you can deploy and act as an OpenAI compatible endpoint.

We built something like this for ourselves here -> https://www.npmjs.com/package/@kluai/gateway?activeTab=readm....

Documentation is a bit sparse, but TL;DR: deploy it in a Cloudflare Worker and now you can access about 15 providers (the ones that matter: OpenAI, Cohere, Azure, Bedrock, Gemini, etc.) all with the same API without any issues.


Wow; this is really nice work, I wish you deep success.


Coming back to write something more full-throated: Klu.ai is a rare thing in the LLM space: well thought out, has the ancillary tools you need, is beautiful, and isn't a giveaway from a BigCo that's a privacy nightmare. Ex. Cloudflare has some sort of halfway-similar nonsense that, in all seriousness, logs all inputs/outputs.

I haven't tried it out in code, it's too late for me and I'm doing native apps, but I can tell you this is a significant step up in the space.

Even if you don't use multiple LLMs yet, and your integration is working swell right now, you will someday. These will be commodities, valuable commodities, but commodities. It's better to get ahead of it now.

Ex. If you were using GPT-4 2 months ago, you'd be disappointed by GPT-4o, and it'd be an obvious financial and quality decision to at least _try_ Claude 3.5 Sonnet.

It's a weird one. Benchmarks great. Not bad. Pretty damn good. But ex. It's now the only provider I have to worry about for RAG. Prompt says "don't add footnotes, pause at the end silently, and I will provide citations", and GPT-4o does nonsense like saying "I am now pausing silently for citations: markdown formatted divider"


Using Llama Index for this via the `llama_index.core.base.llms.base.BaseLLM` interface. Using config files to describe the args to different models makes swapping models literally as easy as:

  chat_model:
    cls: llama_index.llms.openai.OpenAI
    kwargs:
      model: gpt-4

  chat_model:
    cls: llama_index.llms.gemini.Gemini
    kwargs:
      model_name: models/gemini-pro
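The loading side of such a config can be a few lines of dynamic import. This sketch uses a stdlib class as a stand-in for the `cls` path, since the mechanism is identical for any importable class (with Llama Index installed you would point it at e.g. `llama_index.llms.openai.OpenAI` as in the config above):

```python
import importlib

def load_from_config(cfg):
    """Instantiate the class named by cfg['cls'], passing cfg['kwargs'].
    Swapping models then means editing the config, not the code."""
    module_path, _, cls_name = cfg["cls"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), cls_name)
    return cls(**cfg.get("kwargs", {}))

# stand-in config entry; in practice "cls" names an LLM wrapper class
cfg = {"cls": "collections.Counter", "kwargs": {}}
model = load_from_config(cfg)
```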


Vercel AI SDK[1] shines in this aspect in JS ecosystem.

They have the concept of providers [2] and switching between them is easy as changing parameters of a function[3]

[1]:https://sdk.vercel.ai/docs/introduction

[2]: https://sdk.vercel.ai/docs/foundations/providers-and-models

[3]: https://sdk.vercel.ai/docs/ai-sdk-core/overview#ai-sdk-core



LiteLLM seemed to be the best approach for what I needed - simple integration with different models (mainly OpenAI and the various Bedrock models) and the ability to track costs / limit spending. It's working really well so far.


Didn't know about LiteLLM. That seems to be the right kind of middleware most people would need, instead of Langchain.


Use a consistent argument structure and make a simple class or function for each provider that translates that to the specific API calls. They are very similar APIs. Maybe select the function call based on the model name.
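A minimal sketch of that approach. The provider functions here are hypothetical stubs standing in for real adapters; actual ones would translate the shared request into each vendor's SDK call.

```python
from dataclasses import dataclass, field

@dataclass
class ChatRequest:
    """One consistent argument structure shared by every provider."""
    model: str
    messages: list = field(default_factory=list)  # [{"role": ..., "content": ...}]

# hypothetical adapters: each would map ChatRequest onto one vendor's API
def call_openai(req):
    return f"openai:{req.model}"

def call_anthropic(req):
    return f"anthropic:{req.model}"

# select the adapter based on the model name, as suggested
PROVIDERS = {"gpt": call_openai, "claude": call_anthropic}

def complete(req):
    for prefix, fn in PROVIDERS.items():
        if req.model.startswith(prefix):
            return fn(req)
    raise ValueError(f"no provider for model {req.model!r}")
```

Adding a provider is then one adapter function plus one dict entry, with no framework in between.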


Use OpenRouter. One OpenAI-like API, but lots of models.


The strategy design pattern would be suitable for this.


Openrouter maybe?


Just to echo what you are saying: I've read the first chapter, and I thought the thesis was interesting and the writing good, but I failed to be convinced because it makes a lot of classic scientific mistakes. Even though logical arguments are being made, there is no attempt not to overfit the data.

The author brings out a lot of stats ("smart high-schooler", "effective compute", "OOM", "test scores", "inference efficiency") but doesn't do a good job of explaining how he predicted these things beforehand (preregistering), how they will actually result in new technologies, or how we can extrapolate past the trend line.

Also in the unhobbling section "Tools: Imagine if humans weren’t allowed to use calculators or computers. We’re only at the beginning here, but ChatGPT can now use a web browser, run some code, and so on. "

This is so non-specific (because no one has really commercialized anything with this yet) that I worry we don't actually know if we can make the kinds of effective tools the author is talking about. Would love some feedback on these criticisms.


Also, one funny thing is that the author mentions power constraints but then doesn't calculate, for example, how many teraflops the US grid could actually support.


That's where the Trillion $ cluster comes in. It also includes building power plants, not just data centers


That’s around the time he says we should build 1200 shale wells in Pennsylvania.


At least in the US, the Fed has specifically had full employment as one of its goals, at least for the last few years, so I'm pretty skeptical of "corporations control the Fed" arguments.

Like, I agree there's a lot of regulatory capture in the US, but I just don't see any reason to think the Fed is responding to big business rather than its own preferences.


Ye olde Motte and Bailey.

There are in fact people who are saying it's only corporate inflation.

The real reason I'm skeptical of the greed argument is that it's just a vibes-based guess at what's happening, not based on any data or insight into the situation.


New York, NY | Remote or Local | Full Time

Stack: C, Java, RTOS, Python, Perl

Resume: http://careers.stackoverflow.com/benjaminkadish

Contact: [email protected]

I am looking to join an engineering team working on new technology (energy, cars, consumer electronics, etc.).

