Hacker News | sabaimran's comments

The death of privacy is definitely a more fundamental issue here, outside of politics. In an ideal world, private citizens, even if they have notorious views, have a right to privacy.


> In an ideal world, private citizens, even if they have notorious views, have a right to privacy.

Privacy is definitely important. In an ideal world we'd like to have as much of it as possible, but we are now in a world that's far from ideal.

> The death of privacy is definitely a more fundamental issue here,

Well, my prior comment was about personal security and why it's so easily, and on so many levels, jeopardized by the lack of privacy. In that light, privacy isn't the more fundamental issue for sure.

We know that "security through obscurity" isn't the best way to get there; it's a rough analogy, but it does apply here too. The fundamental issues are life, liberty and the pursuit of happiness, and we'd be in a very rough spot if high levels of secrecy were the only way to secure these.

Even under the best of conditions, secrecy has a limited reach and lifespan; it simply doesn't play well with liberty and happiness. So, there are more fundamental issues than privacy, and while more privacy is definitely an enhancer, it's just a knife in a gunfight.


I don't personally think there's that much value to this argument. Compare, for instance, the consumption of 1 hour of TV vs. 1 hour of GPT usage:

A single AI chat message can consume 0.34 watt-hours of energy (1). So, let's say a hundred messages in an hour (quite an aggressive session) would be 34 watt-hours of energy.

An LCD TV running for an hour consumes about 100 watt-hours of energy, depending on size, LED, vs. OLED etc. (2).
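As a quick check of the arithmetic above, here is a minimal sketch. The per-message and per-hour figures are the ones quoted in this comment; the constant names are purely illustrative.

```python
# Figures quoted in the comment above; the names are illustrative only.
WH_PER_CHAT_MESSAGE = 0.34   # watt-hours per AI chat message, per source (1)
MESSAGES_PER_HOUR = 100      # the "quite aggressive" session assumed above
TV_WH_PER_HOUR = 100         # rough LCD TV energy use over one hour, per source (2)

# Energy for one hour of heavy chat use, and how it compares to the TV.
chat_wh_per_hour = WH_PER_CHAT_MESSAGE * MESSAGES_PER_HOUR
ratio = chat_wh_per_hour / TV_WH_PER_HOUR

print(f"Chat session: {chat_wh_per_hour:.0f} Wh per hour")
print(f"TV:           {TV_WH_PER_HOUR:.0f} Wh per hour")
print(f"The chat session uses about {ratio:.0%} of the TV's energy")
```

This covers inference only; it says nothing about training energy or the screen used to type the messages.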

I think AI does help people do better research faster, which is a significant uplift to humanity, while I do not see anyone specifically curbing their TV usage. We should probably focus our efforts on helping people use AI better and meanwhile build more nuclear energy plants, imo.

(1): https://epoch.ai/gradient-updates/how-much-energy-does-chatg...

(2): https://santannaenergyservices.com/how-many-watts-does-a-tv-...

---

And then consider the amount of energy traditionally required by one human to do the same research tasks. Also quite significant.

I think we should be focused on making them more efficient, for sure! But I don't buy that the arguments based on energy consumption are very strong.


You are ignoring the energy cost to train models. AI is not just the surface layer of end user messages.

AI is also used far more often than TVs - if every app and device starts using it, that is constant AI messaging going on. So TVs aren't even the correct comparison, especially if AI starts to be used more to create content - there is then an AI energy cost to just watching that content. Even putting that aside, what screen are you looking at when making these queries to AI? Maybe a phone... but if not, you are burning the energy from both the large screen and the AI.

Even putting aside the poor comparison that TVs are, with today's energy production, the environmental damage from AI is unquestionable. Rather than asking whether or not that is OK, there are really 2 questions to answer:

1) What are the benefits of AI, specifically? Yeah, vague things like "research faster" are benefits, but you need to quantify them if you are going to make comparisons. And most AI usage is frivolous. Some AI usage is downright damaging, especially in creative industries. All of that needs to be balanced.

2) Can we change energy production to get off of fossil fuels? If we can do that, the damage of burning more energy decreases greatly.

My takeaway from this entire line of questioning is that we need to balance AI usage with renewable energy adoption, while keeping a strong eye on what we actually do with AI.


What a great way to put it in perspective, and shocking how low per query.

I expanded your comparison list (with ChatGPT). Pretty interesting.

Activity                             | Energy used (1 hr)
-------------------------------------|-------------------
Laptop                               | 50 Wh
Desktop + large monitor              | 200 Wh
Large screen TV                      | 200–300 Wh
Video game + large TV                | 300–500 Wh
Washing machine run (avg)            | 500–1,200 Wh
Dishwasher run (avg)                 | 1,200 Wh
Dryer run (electric)                 | 2,000–5,000 Wh
Tesla city driving (Model 3 est.)    | 14,000 Wh
Tesla highway driving (Model 3 est.) | 18,000 Wh


> I think AI does help people do better research faster, which is a significant uplift to humanity,

One must consider both sides. Better research also contributes (no question) to more consumerism and the furthering of technology, which also uses more fossil fuels. It's time we acknowledged that research isn't free, and all of the damage to the biosphere was mainly enabled by science.

And AI uses a huge ton of energy. For example, according to [1], "In Ireland, [...] electricity demand from data centres represented 17% of the country’s total electricity consumption for 2022". And we also have to consider the raw materials and mining used.

1. In Ireland, where the data centre market is developing rapidly, electricity demand from data centres represented 17% of the country’s total electricity consumption for 2022


Related, link to the GH repo: https://github.com/openai/harmony


Super excited to see these released!

Major points of interest for me:

- In the "Main capabilities evaluations" section, the 120b outperforms o3-mini and approaches o4 on most evals. The 20b model is also decent, passing o3-mini on one of the tasks.

- AIME 2025 is nearly saturated with large CoT

- CBRN threat levels kind of on par with other SOTA open source models. Plus, demonstrated good refusals even after adversarial fine tuning.

- Interesting to me how a lot of the safety benchmarking runs on trust, since methodology can't be published too openly due to counterparty risk.

Model cards with some of my annotations: https://openpaper.ai/paper/share/7137e6a8-b6ff-4293-a3ce-68b...


Do subagents run in parallel?


No way, because they share a filesystem.


> The central finding is that a 15% increase in solar generation across the U.S. is associated with an annual reduction of 8.54 million metric tons (MMT) of CO2, a significant step toward national climate goals.

Whoa, that's really cool.

You can see the paper along with figures & regional breakdowns here: https://openpaper.ai/paper/share/1d0c6956-4820-4ee2-ac1e-12c...


The US produced about 4.8 billion metric tons of CO2 in 2024 ( https://www.statista.com/statistics/183943/us-carbon-dioxide... )

The savings are minuscule, but important nonetheless. It just goes to show how much more solar is required.


Given that solar power is 4% of electricity generation, a 15% increase is roughly 0.5 percentage points of total generation. Roughly 30% of CO2 is from electricity generation, so the numbers all seem to make sense.

If you replace 0.5% of things that emit carbon with non-carbon sources it reduces carbon emissions by 0.5%.
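A rough sanity check of those numbers, as a sketch. The 4% solar share and 30% electricity share are the approximations from this comment, and the 4.8 billion tonne total is the figure quoted upthread. It also assumes, simplistically, that each unit of added solar displaces grid electricity of average carbon intensity.

```python
US_CO2_MMT = 4800          # ~4.8 billion metric tons in 2024 (quoted upthread)
ELECTRICITY_SHARE = 0.30   # rough share of CO2 from electricity (this comment)
SOLAR_SHARE = 0.04         # rough solar share of generation (this comment)
SOLAR_INCREASE = 0.15      # the 15% increase studied in the paper

# ASSUMPTION: added solar displaces average-intensity grid electricity.
added_share = SOLAR_SHARE * SOLAR_INCREASE          # fraction of total generation
electricity_co2_mmt = US_CO2_MMT * ELECTRICITY_SHARE
estimated_saving_mmt = electricity_co2_mmt * added_share

print(f"Added solar: {added_share * 100:.1f} percentage points of generation")
print(f"Estimated saving: {estimated_saving_mmt:.2f} MMT CO2 per year")
```

Under these assumptions the estimate comes to about 8.64 MMT, which is in the same range as the paper's 8.54 MMT.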


This falls for the "fallacy" of primary energy.

We need vastly less total primary energy to run a 90-95% efficient BEV compared to a 20-30% efficient ICE.

That is excluding the entire very inefficient supply chain to refine and transport the fuel to the ICE.


> We need vastly less total primary energy to run a 90-95% efficient BEV compared to a 20-30% efficient ICE.

While true, that requires actually transitioning to BEVs, which in turn requires having enough batteries to transition to BEVs.

Doing that in the USA (~290M vehicles, say 60kWh each, ~= 17.4TWh) is more than enough to provide the entire USA with several days worth of backup storage, even if the place somehow got a continent-wide version of a Dunkelflaute that wasn't merely "20% normal output" but "actually no output".

I am hopeful this will happen, but last I checked, it was further away than the PV itself is, what with the batteries needing replacement every few thousand cycles but the PV mostly lasting 25-35 years no problem.
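To make the fleet-storage arithmetic concrete, here is a small sketch. The vehicle count and pack size are the figures from this comment; the annual US electricity consumption of roughly 4,000 TWh is an assumed round number that does not appear in the thread, so treat the resulting number of days as illustrative only.

```python
VEHICLES = 290e6        # ~290M vehicles (from the comment)
PACK_KWH = 60           # assumed average pack size (from the comment)
US_ANNUAL_TWH = 4000    # ASSUMPTION: rough US electricity use per year

# Total storage if every vehicle carried such a pack (kWh -> TWh).
fleet_twh = VEHICLES * PACK_KWH / 1e9

# How long that storage would cover average electricity demand.
daily_twh = US_ANNUAL_TWH / 365
days_of_backup = fleet_twh / daily_twh

print(f"Fleet storage: {fleet_twh:.1f} TWh")
print(f"Days of average electricity demand covered: {days_of_backup:.1f}")
```

The number of days is sensitive to the consumption figure chosen, and to whether non-electric energy use is counted at all.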


For the net zero scenarios by the IEA one of the areas that is ahead of the curve is batteries.

The global battery manufacturing capacity reached 3 TWh in 2024.

With say an average lifetime of 15 years getting a bit over 1 TWh of new batteries per year for the car fleet seems easily feasible.
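The replacement-rate claim can be checked with a short sketch. The 17.4 TWh fleet size comes from the parent comment, and the lifetime and capacity figures are the ones stated here; nothing else is assumed.

```python
FLEET_TWH = 17.4           # US fleet storage estimated upthread (290M x 60 kWh)
LIFETIME_YEARS = 15        # average battery lifetime assumed in this comment
GLOBAL_CAPACITY_TWH = 3    # 2024 global manufacturing capacity cited above

# Steady-state production needed to keep the whole fleet supplied.
replacement_twh_per_year = FLEET_TWH / LIFETIME_YEARS
share_of_capacity = replacement_twh_per_year / GLOBAL_CAPACITY_TWH

print(f"Replacement need: {replacement_twh_per_year:.2f} TWh per year")
print(f"Share of 2024 global capacity: {share_of_capacity:.0%}")
```

That is 1.16 TWh per year, matching "a bit over 1 TWh", or a bit under 40% of the cited global capacity for the US fleet alone.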

Then please give us a source for your continental Dunkelflaute doomsday scenario so we can make sure it is a plausible scenario, and not made-up scary numbers.

Of course ignoring that you assume that we need to charge every single car to 100% every day.


> The global battery manufacturing capacity reached 3 TWh in 2024.

OK, that's better than I thought, I was led to believe it was 1 TWh in 2024.

> With say an average lifetime of 15 years getting a bit over 1 TWh of new batteries per year for the car fleet seems easily feasible.

I think that's optimistic; none of my (non-car) batteries have maintained significant capacity for that long. I think grid use will look more like phone or laptop use than like car use, with daily full cycles?

> Then please give us a source for your continental Dunkelflaute doomsday scenario so we can make sure it is a plausible scenario, and not made-up scary numbers.

I think you're misunderstanding me on this. I'm saying it's good enough even for a very weird and unusual condition far in excess of the normal talking points.

> Of course ignoring that you assume that we need to charge every single car to 100% every day.

No? I'm saying I expect an average car to get a 60kWh battery pack, and that there are 290 million vehicles in the USA, and that this makes a combined manufacturing requirement of sustaining a capacity of multiply-those-numbers-together storage. This says nothing about how often that storage will normally get charged, and instead I was saying how long this could power the USA for if discharged in a very weird condition.

Actual power consumption of those vehicles is tiny, something like 80% of mean consumption (in places where shaded parking isn't the norm) could be supplied by requiring their surfaces to be covered in PV.


These kinds of arguments don't really add up if you use some system thinking and extrapolate from current trends.

First, the US isn't northern Europe (where solar energy is very popular regardless). The southern half especially is more comparable to southern Europe or even North Africa. Places like Berlin are at 52 degrees latitude. You have to go deep into Canada to find cities at a similar latitude. Most of the US is below 49 degrees and gets decent amounts of sun. It's more than fine most of the year. If you regularly need to wear your sunglasses in January, you live in a place that can have solar power.

But sure, the Sun doesn't always shine and it gets grey and cloudy sometimes. Even in San Diego. But there are also wind and batteries. And people always forget that you can use cables to move energy around as well. A lot of cables aren't at their maximum capacity all of the time, so they can be used to move energy from where it isn't needed to charge batteries close to where it is needed later. San Diego is basically at the same latitude as places in Northern Africa that might end up supplying power via HVDC cables to Europe. The US can mix offshore wind on both coasts, solar across the south and its deserts, and hydro in mountainous regions, with lots of batteries. This is already very doable, and long term it only becomes more obvious as costs and efficiencies continue to improve.

Finally, modern batteries already last quite long. LFP and sodium ion are basically getting lifespans of 5000 or more cycles at this point. That's basically decades with normal usage and over a decade even with full daily cycling (which would be intensive usage).

Sodium ion means lots of dirt cheap batteries for storage and (small) vehicles. Basically it uses no rare materials and lasts a long time. It has the potential to cut the cost of batteries from close to $100/kWh to more like $10/kWh by some estimates. At $10/kWh, most households would be able to afford a 1 MWh battery, enough to power an inefficient US household for a month, and a more efficient household throughout even the longest imaginable type of Dunkelflaute. You can't quite get those yet of course, but at this point we have some reason to be optimistic about this being possible mid to long term at least.
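The cycle-life and cost figures above work out as follows, in a minimal sketch. All inputs are the numbers given in this comment; the $10/kWh price is the optimistic estimate it mentions, not a current market price.

```python
CYCLES = 5000               # cycle life cited for LFP / sodium ion
COST_NOW_USD_PER_KWH = 100  # "close to" today's cost, per the comment
COST_EST_USD_PER_KWH = 10   # the optimistic estimate, per the comment
HOME_BATTERY_KWH = 1000     # a 1 MWh household battery

# Lifetime if the battery does one full cycle every day.
years_at_daily_cycling = CYCLES / 365

# Price of the 1 MWh battery at each cost level.
cost_now_usd = HOME_BATTERY_KWH * COST_NOW_USD_PER_KWH
cost_est_usd = HOME_BATTERY_KWH * COST_EST_USD_PER_KWH

print(f"Lifetime at one full cycle per day: {years_at_daily_cycling:.1f} years")
print(f"1 MWh battery at $100/kWh: ${cost_now_usd:,}")
print(f"1 MWh battery at $10/kWh:  ${cost_est_usd:,}")
```

Daily full cycling gives about 13.7 years, consistent with "over a decade", and the 1 MWh battery drops from $100,000 to $10,000 at the estimated price.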

Add nuclear, hydro and geothermal to the mix and you have a lot of ways to generate and move around clean energy. That kind of system takes time to build, but there really aren't a whole lot of excuses not to.

This transition period has a lot of people looking in the rear view mirror being blind to the huge stuff that is clearly visible ahead at this point. There are a few wild cards that are interesting but not that essential. Like small reactors, fusion, etc. Nice but not really that essential.

The dunkelflaute is an interesting technical and infrastructure challenge that requires some out of the box thinking. But it's very solvable and it doesn't require any major new technology breakthroughs. We just need to do more of what we're already doing and preferably a bit cheaper. All very doable and within reach. And we have time to do it as our old infrastructure isn't magically about to disappear. Most of this stuff will be cost and economics driven.

Lots of countries that are ahead of the curve might be importing progressively less oil in the decades ahead. That means their trade balances shift and they start having economic growth and a competitive advantage.

IMHO countries that are lagging here will first fall behind, suffer the economic consequences for a while, and then fix it by compensating with massive investments. The US seems to be doing all the wrong things to set itself up for exactly that right now. Which is why I'm quite optimistic it will figure it out eventually.


I think you're arguing against something I didn't say?

The tech is great. I'm usually the one defending it, even. But you do actually have to build the factories. Which we (humanity, I'm not American) are, as fast as we can, but that's the trend-line to look at, not what the tech can ultimately do.

I mean, to one of your points, I'm one of the few people here who keeps saying that if China wanted to make it a strategic goal, they have the manufacturing capacity to put in a genuinely global power grid with 1Ω electrical resistance for a fairly low material cost (few hundred billion), what a shame about the geopolitical realities getting in the way of this…


Gigatonnes for short, as used in climate sciences.


I mean it's five basis points, it ain't nothin.

Put another way, if I could grease the right palms to shave commensurate minuscule savings off of the budget of ICE, it'd pay off my mortgage. Twentyfold.

Back to greenhouse gases, I'm no climatologist, but isn't it plausible the difference could, for instance, make or break one catastrophic wildfire across the western seaboard of North America?

Beware of statistic thinking in a stochastic world.


Fifty basis points.


Seems odd to state a percentage increase in solar to obtain an absolute decrease in CO2.


[flagged]


Solar is cheaper than oil, and oil is essentially never used for electricity.


Of the world's power, 35% is coal (solid oil), 20% gas/natural gas (gaseous oil), and 3% oil (Wikipedia definition). So 58% of our power is oil: https://en.wikipedia.org/wiki/List_of_countries_by_electrici...

Are you being particular about your definition of oil?

Light Oil (C4-C12 aka "Gasoline"(NA) "Petrol"(EU)) is used for personal generation and backup power systems.

Heavy oil (C9-C25 aka "Diesel") is regularly used for electricity, extensively used for backup power systems.


Redefining coal as "solid oil" and natural gas as "gaseous oil" is ludicrous. Coal, natural gas and oil are well-defined concepts that are not easily fungible in our energy infrastructure, so plopping them all together using your made-up language is silly.


GGP used oil to cover all fossil fuel based sources. GP decided to focus on the word choice rather than the intent. 58% of the world's power comes from fossil fuels.


I count three percent as negligible.


I don't think the Jevons paradox is applicable here? This is about solar becoming more efficient.

In any case, if the argument is that oil is going to be pumped regardless of how much it's actually used, can we not just save it for a rainy day, so to speak?


No, because we cannot store large amounts of oil. We can store a few weeks of oil, and that's it. That's why, for example, Putin burned it off: if he doesn't cut supply, he can't store it. But that isn't a Russian problem, that's a global problem. Losses through burnoff are typical in the industry, which is why equipment for large scale burnoff even exists: to handle various logistical problems. If oil can't be taken out of pumps or refineries, and it's not worth it to take production offline due to restart costs, they just burn it right there. For no useful work.


Why can't we store oil? Is it just a matter of we haven't built long-term storage yet due to not needing to, or is there something else?


We can store oil underground for millions of years.


... in the sense that we can disable some pumps, yes. If one party agrees to make less money and let everyone else have more, then we can store oil underground for millions of years. In other words: this is absolutely, utterly, completely and totally impossible. What always happens is oil becomes cheaper and all of it sells.


The reason this is a stupid argument is that solar power is significantly cheaper than fossil fuel power almost everywhere. And not in a "calculating all of the global impacts" way, in the very direct, greedy, "I want the cheapest electricity possible" way. "Whatabout"s with storage and time of day, etc. aren't necessary, battery tech is cheap and solar production is so cheap you can do inefficient things with it (panels at non-ideal angles to get more power at off peak hours) and still come out ahead.

I really doubt China is installing solar at insane rates to be nice to the world.


> Despite being sparse, NSA surpasses Full Attention baseline on average across general benchmarks, long-context tasks, and reasoning evaluation.

Isn't it very notable that the latency improvement didn't have a performance loss? I'm not super familiar with all the technical aspects, but that seems like it should be one of the main focuses of the paper.


The performance maintenance (or even improvement) isn't surprising - sparse attention can reduce noise by focusing only on relevant tokens. Traditional full attention dilutes focus by attending to everything equally, while NSA's pruning approach mimics how humans selectively process information.


Yes, that's what makes it so interesting and novel. You nailed it.


In things that I am comparatively good at (e.g., coding), I can see that it helps 'raise the ceiling' as a result of allowing me to complete more of the low level tasks more effectively. But it is true as well that it hasn't raised my personal bar in capability, as far as I can measure.

When it comes to things I am not good at, it has given me the illusion of getting 'up to speed' faster. Perhaps that's a personal ceiling raise?

I think a lot of these upskilling utilities will come down to delivery format. If you use a chat that gives you answers, don't expect to get better at that topic. If you use a tool that forces you to come up with answers yourself and get personalized validation, you might find yourself leveling up.


> When it comes to things I am not good at, it has given me the illusion of getting 'up to speed' faster. Perhaps that's a personal ceiling raise?

Disagree. It's only the illusion of a personal ceiling raise.

---

Example 1:

Alice has a simple, basic, text-only blog. She wants to update the styles on her website, but wants to keep her previous posts.

She does research to learn how to update a page's styles to something more "modern". She updates the homepage, post page, about page. She doesn't know how to update the login page without breaking it because it uses different elements she hasn't seen before.

She does research to learn what the new form elements are and, on the way, sees recommendations on how to build login systems. She builds some test pages to learn how to restyle forms and, while she's at it, also learns how to build login systems.

She redesigns her login page.

Alice believes she has raised the ceiling of what she can accomplish.

Alice is correct.

---

Example 2:

Bob has a simple basic text only blog. He wants to update the styles on his website, but wants to keep his previous posts.

He asks the LLM to help him update styles to something more "modern". He updates the homepage, post page, about page, and login page.

The login page doesn't work anymore.

Bob asks the LLM to fix it and after some back and forth it works again.

Bob believes he has raised the ceiling of what he can accomplish.

Bob is incorrect. He has not increased his own knowledge or abilities.

A week later his posts are gone.

---

There are only a few differences between both examples:

1. Alice does not use LLMs, but Bob does.
2. Alice knows how to redesign pages, but Bob does not.
3. Alice knows how login systems work, but Bob does not.

Bob simply asked the LLM to redesign the login page, and it did.

When the page broke, he checked that he was definitely using the right username and password but it still wasn't working. He asked the LLM to change the login page to always work with his username and password. The LLM produced a login form that now always accepted a hard coded username and password. The hardcoded check was taking place on the client where the username and password were now publicly viewable.

Bob didn't ask the LLM to make the form secure, he didn't even know that he had to ask. He didn't know what any of the footguns to avoid were because he didn't even know there were any footguns to avoid in the first place.

Both Alice and Bob started from the same place. They both lacked knowledge on how login systems should be built. That knowledge was known because it is documented somewhere, but it was unknown to them. It is a "known unknown".

When Alice learned how to style form elements, she also read links on how forms work, which led her to links on how login systems work. That knowledge for her went from a "known unknown" to a "known known" (knowledge that is known, that she now also knows).

When Bob asked the LLM to redesign his login page, at no point did the knowledge of how login systems work become a "known known" for him. And a week later some bored kid found the page, right-clicked on the form, clicked inspect, and saw a username and password to log in with.
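For contrast with the hardcoded client-side check, here is a minimal sketch of one common server-side pattern: store only a salted hash and compare in constant time. The function names and the iteration count are illustrative assumptions, not anything from Bob's or Alice's site, and this is not a complete login system (no sessions, rate limiting, or transport security).

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# ASSUMPTION: illustrative work factor; real deployments should follow
# current guidance for PBKDF2 iteration counts.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored on the server."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest server-side and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Hypothetical usage: the plaintext password never appears in page source.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

The key difference from Bob's page is where the check runs: nothing a visitor can see with "inspect" reveals the credentials.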


This is interesting, but I wonder how reliable this type of monitoring is really going to be in the long run. There are fairly strong indications that CoT adherence can be trained out of models, and there's already research showing that they won't always reveal their thought process in certain topics.

See: https://arxiv.org/pdf/2305.04388

On a related note, if anyone here is also reading a lot of papers to keep up with AI safety, what tools have been helpful for you? I'm building https://openpaper.ai to help me read papers more effectively without losing accuracy, and looking for more feature tuning. It's also open source :)


We need to think distinctly about which tasks are actually suitable for LLMs. Used poorly, they'll gut our ability to think carefully. The push, IMO, should be to use them for verification and clarification, not as replacements for understanding and creativity.

Example: Do the problem sets yourself. If you're getting questions wrong, dig deeper with an AI assistant to find gaps in your knowledge. Do NOT let the AI do the problem sets first.

I think it's similar to how we used calculators in school, in the 2010s at least. We learned the principles behind the formulae and how to work them out manually before calculators were introduced to abstract that work away.

I've let that core principle shape some of how we're designing our paper-reading assistant, but still thinking through the UX patterns -- https://openpaper.ai/blog/manifesto.


Agreed, and the analogy with calculators is very apt.

