I agree that after discussions with an LLM you may be led to novel insights.
However, such novel insights are not novel due to the LLM, but due to you.
The "novel" insights are either novel only to you, because they belong to something that you have not studied before, or they are novel ideas that were generated by yourself as a consequence of your attempts to explain what you want to the LLM.
It is very frequent for someone to be led to novel insights about something that he/she believed to already understand well, only after trying to explain it to another ignorant human, when one may discover that the previous supposed understanding was actually incorrect or incomplete.
The point is that the combined knowledge/process of the LLM and a user (which could be another LLM!) led to it walking the manifold in a way that produced a novel distribution for a given domain.
I talk with LLMs for hours out of the day, every single day. I'm deeply familiar with their strengths and shortcomings on both a technical and intuitive level. I push them to their limits and have definitely witnessed novel output. The question remains, just how novel can this output be? Synthesis is a valid way to produce novel data.
And beyond that, we are teaching these models general problem-solving skills through RL, and it's not absurd to consider the possibility that a good enough training regimen could impart deduction/induction skills into a model that are powerful enough to produce novel information even via means other than direct synthesis of existing information. Especially when given affordances such as the ability to take notes and browse the web.
> I push them to their limits and have definitely witnessed novel output.
I’m quite curious what these novel outputs are. I imagine the entire world would like to know of an LLM producing completely new, never-before-created outputs that no human has ever thought of before.
Here is where I get completely hung up. Take 2+2. An LLM has never had 2 groups of two items and reached the enlightenment of 2+2=4.
It only knows that because it was told that. If enough people start putting 2+2=3 on the internet, who knows what the LLM will spit out. There was that example a ways back where an LLM would happily suggest all humans should eat 1 rock a day. Amusingly, even _that_ wasn’t a novel idea for the LLM; it simply regurgitated what it scraped from a website about humans eating rocks. Which leads to the crux: how much patently false information have LLMs scraped and internalized as fact?
This is not a correct approximation of what happens inside an LLM. They form probabilistic logical circuits which approximate the world they have learned through training. They are not simply recalling stored facts. They are exploiting organically-produced circuitry, walking a manifold, which leads to the ability to predict the next state in a staggering variety of contexts.
It's not hard to imagine that a sufficiently developed manifold could theoretically allow LLMs to interpolate or even extrapolate information that was missing from the training data, but is logically or experimentally valid.
So you do agree that an LLM cannot derive math from first principles, or no? If an LLM had only ever seen 1+1=2 and that was the only math they were ever exposed to, along with the numbers 0-10, could an LLM figure out that 2+2=4?
I argue absolutely not. That would be a fascinating experiment.
Hell, train it on every 2-number addition combination of m+n where m and n can be any number between 1-100 (or 0-100 would be better) BUT 2, and have it figure out what 2+2 is.
I would probably change my opinion about “circuits”, which by the way really stretches the idea of a circuit. The “circuit” is just the statistically most likely series of tokens that you’re drawing pretend lines between. Sure, technically connect-the-dots is a circuit, but not in the way you’re implying, or that paper.
> If an LLM had only ever seen 1+1=2 and that was the only math they were ever exposed to, along with the numbers 0-10, could an LLM figure out that 2+2=4?
What? Of course not? Could you? Do you understand just how much work has gone into proving that 1 + 1 = 2? Centuries upon centuries of work, reformulating all of mathematics several times in the process.
> Hell, train it on every 2-number addition combination of m+n where m and n can be any number between 1-100 (or 0-100 would be better) BUT 2, and have it figure out what 2+2 is.
If you read the paper I linked, it shows how a constrained modular addition is grokked by the model. Give it a read.
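For what it's worth, the held-out-addition experiment you describe is cheap to run. Here is a minimal sketch (my own toy setup, not the one from the paper): a small MLP over one-hot-encoded operand pairs, trained on every a+b with 0 <= a, b <= 100 except (2, 2), then asked for 2+2. Whether it interpolates the missing fact is exactly the empirical question.

    # Toy version of the "hold out 2+2" experiment proposed above.
    # Assumptions (mine, not from the paper): a tiny MLP, one-hot operands,
    # full-batch Adam. Real grokking experiments use transformers on
    # modular addition, so treat this only as an illustrative sketch.
    import torch
    import torch.nn as nn

    N = 101                                   # operands 0..100
    pairs = [(a, b) for a in range(N) for b in range(N) if (a, b) != (2, 2)]

    def encode(batch):
        # one-hot encode both operands and concatenate them
        x = torch.zeros(len(batch), 2 * N)
        for i, (a, b) in enumerate(batch):
            x[i, a] = 1.0
            x[i, N + b] = 1.0
        return x

    X = encode(pairs)
    y = torch.tensor([a + b for a, b in pairs])          # sums 0..200

    model = nn.Sequential(nn.Linear(2 * N, 256), nn.ReLU(),
                          nn.Linear(256, 2 * N - 1))     # 201 possible sums
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(5000):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

    with torch.no_grad():
        print("model's answer for 2+2:",
              model(encode([(2, 2)])).argmax(dim=1).item())

If the prediction is 4, the network has generalized from the surrounding facts rather than recalled a stored one; if it isn't, that is an interesting data point too.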
> The “circuit” is just the statistically most likely series of tokens that you’re drawing pretend lines between.
That is not what ML researchers mean when they say circuit, no. Circuits are features within the weights. It's understandable that you'd be confused if you do not have the right prior knowledge. Your inquiries are good, but they should remain inquiries.
If you wish to push them to claims, you first need to understand the space better, understand what modern research does and doesn't show, and turn your hypotheses into testable experiments, collect and publish the results. Or wait for someone else to do it. But the scientific community doesn't accept unfounded conjecture, especially from someone who is not caught up with the literature.
That's wonderful, but you are ignoring that your kid comes built in with a massive range of biological priors, built by millions of years of evolution, which make counting natural and easy out of the box. Machine learning models have to learn all of these things from scratch.
And does your child's understanding of mathematics scale? I'm sure your 4-year-old would fail at harder arithmetic. Can they also tell me why 1+1=2? Like actually why we believe that? LLMs can do that. Modern LLMs are actually insanely good at not just basic algebra, but abstract, symbolic mathematics.
You're comparing apples and oranges, and seem to lack foundational knowledge in mathematics and computer science. It's no wonder this makes no sense to you. I was more patient about it before, but now this conversation is just getting tiresome. I'd rather spend my energy elsewhere. Take care, have a good day.
I hope you restore your energy, I had no idea this was so exhausting! Truly, I'll stop inflicting my projected lack of knowledge on you, sorry I tired you out!
Ah man, I was curious to read your response about priors.
> If an LLM had only ever seen 1+1=2 and that was the only math they were ever exposed to, along with the numbers 0-10, could an LLM figure out that 2+2=4?
Unless you locked your kid in a room since birth with just this information, it is not the same kind of setup, is it?
No, you were being arrogant and presumptuous, providing flawed analogies and using them as evidence for unfounded and ill-formed claims about the capabilities of frontier models.
Lack of knowledge is one thing, arrogance is another.
You could find a pre-print on Arxiv to validate practically any belief. Why should we care about this particular piece of research? Is this established science, or are you cherry-picking low-quality papers?
I don't need to reach far to find preliminary evidence of circuits forming in machine learning models. Here's some research from OpenAI researchers exploring circuits in vision models: https://distill.pub/2020/circuits/ Are these enough to meet your arbitrary quality bar?
Circuits are the basis for features. There is still a ton of open research on this subject. I don't care what you care about, the research is still being done and it's not a new concept.
The difference is that when you do not know how a problem can be solved, but you know that this kind of problem has been solved countless times earlier by various programmers, you know that it is likely that if you ask an AI coding assistant to provide a solution, you will get an acceptable solution.
On the other hand, if the problem you have to solve has never been solved before at a quality satisfactory for your purpose, then it is futile to ask an AI coding assistant to provide a solution, because it is pretty certain that the proposed solution will be unacceptable (unless the AI succeeds in duplicating the feat of a monkey that types out a Shakespearean text by typing randomly).
While I agree with the article, the reduction in the number of technical writers, driven by the belief that their absence can be compensated for by AI, is just the most recent step in a continuous degradation of technical documentation that has characterized the last 3 decades.
During the nineties of the last century I was still naive enough to believe that the great improvements in technology, i.e. the widespread availability of powerful word processors and of the Internet for extremely cheap distribution, would lead to an improvement in the quality of technical documentation and to easy access to it for everybody.
The reverse has happened: the quality of technical documentation has become worse and worse, with very rare exceptions, and access to much of what has remained has become very restricted, either by requiring NDAs or by requiring very high prices (e.g. big annual fees for membership in some industry standards organization).
A likely explanation for the worse and worse technical documentation is a reduction in the number of professional technical writers.
It is very obvious that the current management of most big companies does not understand at all the value of competent technical writers and of good product documentation, not only for their customers and potential customers, but also for their internal R&D and customer-support teams.
I have worked for several decades at many companies, very big and very small, on several continents, but unfortunately at only one of them was the importance of technical documentation well understood by the management, so the hardware and software developers had an adequate amount of time for writing documentation planned into their product-development schedules. Despite the fact that the project schedules at that company appeared to allocate much more time to "non-productive tasks" like documentation than at other places, in reality it was there that the R&D projects were completed the fastest and with the smallest delays beyond the initially estimated completion time, one important factor being that every developer understood very well what had to be done in the future, what had already been done, and why.
There are better tools for software developers now than in e.g. 1996, so the pace of writing software has indeed increased, but certainly there has not been any 100x speed up.
At best there may have been a doubling of the speed, though something like +50% is much more likely.
Between e.g. 1980 and 1995 the speed of writing documentation increased much faster than the speed of writing programs has ever increased, due to the generalization of the use of word processors on personal computers instead of typewriters.
Many software projects may be completed today much faster than in the past only because they do not start from zero, but are able to reuse various libraries or program components from past projects, so the part that is actually written now is very small. Using an AI coding assistant does exactly the same thing, except that it automates the search through past programs and it also circumvents the copyright barriers that would prevent the reuse of programs in many cases.
I'm talking about the features/hr. It's trivial now to spin up a website with login, search, commenting, notifications, etc. These used to be multi-week projects.
This is not writing something new from scratch, but just using an already existing framework, with minor customization for the new project.
Writing an essentially new program, which does something never accomplished before, proceeds barely faster today than what could be done in 1990, with a programming environment like those of Microsoft or Borland C/C++.
Avalanche transistors, like the tunnel diodes mentioned by another poster, had been widely used in the past for generating fast pulses.
However, nowadays it is difficult to find any bipolar transistors that are suitable to be operated in the avalanche mode or any tunnel diodes, because these were fabricated using older technologies that are not suitable for the semiconductor devices that are popular today, so most such fabrication lines have been closed, due to insufficient demand.
The characteristics of avalanche-mode operation were specified by the manufacturer for only a very few bipolar transistors, so for most designs using avalanche transistors, the transistor for each built device had to be cherry-picked by testing many transistors of a type known to include suitable ones.
Indeed, in the now distant past the application notes from companies like Linear Technology, and many others, were a treasure trove of information from which one could learn more about electronics than from university textbooks.
Sadly, such great technical documentation exists no more. The companies that make such products are no longer your business partners; they are adversarial entities whose only goal is to confuse and fool their customers into paying as much as possible for products whose quality is as low as possible.
Educating your customers about how to better use your products is no longer a business goal. Another current thread on HN is about the fear that the huge decline in the quality of technical documentation during the last 3 decades will be accelerated by the replacement of professional technical writers with AI.
The market has changed significantly; there's much less need for this kind of education for a 3-cent microcontroller.
I've found ADI still has some great educational material, although that's partly because they've been better at maintaining their webpages from the 90's and 00's, not because they're putting out much new material.
All the productivity enhancement provided by LLMs for programming is caused by circumventing the copyright restrictions of the programs on which they have been trained.
You and anyone else could have avoided spending millions for programmer salaries, had you been allowed to reuse freely any of the many existing proprietary or open-source programs that solved the same or very similar problems.
I would have no problem with everyone being able to reuse any program, without restrictions, but with these AI programming tools the rich are now permitted to ignore copyrights, while the poor remain constrained by them, as before.
The copyright for programs has caused a huge multiplication of the programming effort for many decades, with everyone rewriting again and again similar programs, in order for their employing company to own the "IP". Now LLMs are exposing what would have happened in an alternative timeline.
The LLMs have the additional advantage of fast and easy searching through a huge database of programs, but this advantage would not have been enough for a significant productivity increase over a competent programmer who would have searched the same database by traditional means to find reusable code.
> the rich are now permitted to ignore copyrights, while the poor remain constrained by them, as before.
Claude Code is $20 a month, and I get a lot of usage out of it. I don't see how cutting edge AI tools are only for the rich. The name OpenAI is often mocked, but they did succeed at bringing the cutting edge of AI to everyone, time and time again.
Intellectual property law is a net loss to humanity, so by my reckoning, anything which lets us all work around that overhead gets some extra points on the credit side of the ledger.
I agree in spirit, but in actual fact this subversion of intellectual property is disproportionately beneficial to those who can afford to steal from others and those who can afford to enforce their copyright, while disproportionately disadvantageous to those who can't afford to fend off a copyright lawsuit or can't afford to sue to enforce their copyright.
The GP can free-ride uncredited on the collective work of open source at their leisure, but I'm sure Disney would string me up by my earlobes if I released a copywashed version of Toy Story 6.
Then it really proves how much the economy would be booming if we abolished copyright, doesn't it? China ignores copyright too, and look at them surpassing us in all aspects of technology, while Western economies choose to sabotage themselves to keep money flowing upwards to old guys.
"Available for use" and "Automatically rewritten to work in your codebase fairly well" is very different, so copyright is probably not the blocker technically
22050 Hz is an ideal unreachable limit, like the speed of light for velocities.
You cannot make filters that would stop everything above 22050 Hz and pass everything below. You can barely make very expensive analog filters that pass everything below 20 kHz while stopping everything above 22 kHz.
Many early CD recordings used cheaper filters with a pass-band smaller than 20 kHz.
For 48 kHz it is much easier to make filters that pass 20 kHz and whose output falls gradually until 24 kHz, but it is still not easy.
Modern audio equipment circumvents this problem by sampling at much higher frequencies, e.g. at least 96 kHz or 192 kHz, which allows much cheaper analog filters that pass 20 kHz but which do not attenuate well enough the higher frequencies, then using digital filters to remove everything above 20 kHz that has passed through the analog filters, and then downsampling to 48 kHz.
The original CD sampling frequency of 44.1 kHz was very tight, despite the high cost of the required filters, because at that time, making 16-bit ADCs and DACs for a higher sampling frequency was even more difficult and expensive. Today, making a 24-bit ADC sampling at 192 kHz is much simpler and cheaper than making an audio anti-aliasing filter for 44.1 kHz.
The analog source is never perfectly limited to 20 kHz because very steep filters are expensive and they may also degrade the signal in other ways, because their transient response is not completely constrained by their amplitude-frequency characteristic.
This is especially true for older recordings, because for most newer recordings the analog filters are much less steep, but this is compensated by using a much higher sampling frequency than needed for the audio bandwidth, followed by digital filters, where it is much easier to obtain a steep characteristic without distorting the signal.
Therefore, normally it is much safer to upsample a 44.1 kHz signal to 48 kHz, than to downsample 48 kHz to 44.1 kHz, because in the latter case the source signal may have components above 22 kHz that have not been filtered enough before sampling (because the higher sampling frequency had allowed the use of cheaper filters) and which will become aliased to audible frequencies after downsampling.
Fortunately, you almost always want to upsample 44.1 kHz to 48 kHz, not the reverse, and this should always be safe, even when you do not know how the original analog signal had been processed.
yeah but you can record it at 96 kHz, then resample it perfectly to 44.1 (hell, even just 40) in the digital domain, then resample it back to 48 kHz before sending it to the DAC
If you have such a source sampled at a frequency high enough above the audio range, then through a combination of digital filtering and resampling you can obtain pretty much any desired output sampling frequency.
the point is that when downsampling from 48 to 44.1 you can do the filtering for "free", since the downsampling is being done digitally with an FFT anyway
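As a concrete illustration of the 96 kHz -> 44.1 kHz -> 48 kHz chain mentioned above, here is a minimal sketch (my own, with a made-up 1 kHz test tone) using scipy's polyphase resampler, which applies the anti-aliasing low-pass digitally as part of each rate change:

    # 96 kHz -> 44.1 kHz -> 48 kHz, with the anti-aliasing filtering done
    # digitally inside each polyphase resampling step.
    import numpy as np
    from scipy.signal import resample_poly

    fs_in = 96_000
    t = np.arange(fs_in) / fs_in                  # 1 second of signal
    x96 = 0.5 * np.sin(2 * np.pi * 1000 * t)      # 1 kHz test tone at 96 kHz

    x441 = resample_poly(x96, up=147, down=320)   # 44100/96000 = 147/320
    x48 = resample_poly(x441, up=160, down=147)   # 48000/44100 = 160/147

    print(len(x96), len(x441), len(x48))          # ~96000, ~44100, ~48000

The cheap analog filter only has to keep energy above 48 kHz out of the 96 kHz capture; everything between 20 kHz and 48 kHz is then removed by the digital filters inside the resampler.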
When passing through a material that is at thermal equilibrium, light is attenuated, because a part of it is absorbed. On the other hand, if the material is not at thermal equilibrium and you prepare it to have a negative temperature (by "pumping", so that more molecules/atoms/ions are in states with higher energy than they are in states with lower energy), then when light passes through the material it can be amplified (by stimulated emission), instead of being attenuated, like through a material at positive temperature.
Of course, in any laser/maser material only a very small fraction of its constituents have an energy distribution corresponding to a negative temperature; there is no known method that could force a whole piece of material to have a negative temperature, because the energy would redistribute spontaneously inside the material towards what corresponds to a positive temperature faster than energy could be supplied from the outside. Lasers use some special energy states that have a low probability of decaying spontaneously, so they persist long enough after pumping.
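The link between population inversion and "negative temperature" is just the two-level Boltzmann relation (a standard textbook result, stated here only to make the sign argument explicit):

    % Occupation ratio of two levels with E_2 > E_1 at temperature T:
    %   N_2 / N_1 = exp( -(E_2 - E_1) / (k_B T) )
    % Pumping produces inversion, N_2 > N_1, so the ratio exceeds 1.
    % Since E_2 - E_1 > 0, the exponent can only be positive if T < 0:
    \frac{N_2}{N_1} = \exp\!\left(-\frac{E_2 - E_1}{k_B T}\right) > 1
    \quad\Longleftrightarrow\quad T < 0 .

So "negative temperature" is simply the formal label for an inverted population; such a medium is "hotter" than any positive temperature in the sense that it gives net energy (amplification) to light passing through it.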
KDE 3.5 has been the best desktop environment for me (mainly due to its extreme customization facilities), far better than the contemporaneous Windows XP or Mac OS X, while the following KDE 4 was an unusable atrocious piece of garbage (despite having waited to make the transition to KDE 4 until it was claimed that all its initial bugs had been solved; when I tried it there were no bug problems, only bad design choices that could not be altered in any way).
For a few years I had kept the last KDE 3.5, but eventually I grew tired of solving compatibility problems with newer programs and I switched to XFCE.
I am still using it because I have never seen any reason to use anything else. There are a few KDE or Gnome applications that I use (for instance Okular or Kate), but I have not encountered any compatibility problems with them yet, so I have no need for one of the more bloated desktop environments.
I have been using Linux on a variety of laptops and desktops, all with XFCE and without problems. XFCE does not do much, but I do not want it to do more, it allows my GUIs to be beautiful and to reach maximum speed and it has decent customization facilities, which is very important for me, as I have never encountered any desktop environment where I can be content with its default configuration.
Whenever I happen to temporarily use some Windows version for some work-related activity, I immediately feel constrained in a straitjacket by the rigidity of the desktop environment, which does not allow me to configure it in a way that would please me and would not interfere with my work.
On my main desktop, and also on my mobile workstation laptop, I have used only NVIDIA GPUs for the last 20 years and I have never encountered even the slightest problem with them, at least not with XFCE, so I am always surprised when other users mention such problems, like another poster near this message.
Perhaps my lack of problems with NVIDIA may be explained by the fact that I am using Gentoo, so I always have up-to-date NVIDIA drivers, while the users of other distributions mention having some problems with updating the drivers.
Only in my latest desktop, which was assembled this summer, have I installed an Intel Battlemage GPU instead of an NVIDIA GPU, because the Intel GPU has increased its FP64 throughput while the NVIDIA GPUs have decreased theirs. Thus I hope that Intel will not abandon the GPU market, even if the intentions of their current CEO are extremely nebulous.
As an example of some very simple customizations, which are trivial on XFCE but surprisingly difficult on other desktop environments, I use a desktop with a completely blank, neutral grey background, without icons or any other visual clutter. I launch applications from a menu accessed with a right mouse click or with CTRL-ESC, and I have an auto-hiding taskbar for minimized applications and for a very small number of utilities, e.g. a clock/calendar and a clipboard manager. A few frequently used applications are bound to hot keys.
I wish that humans were not so easily duped into believing that things for which someone else uses the same name are really the same and things for which someone else uses different names are really different.
The Michelson–Morley experiment was indeed very important, but it did not prove in any way the non-existence of the ether. It only proved that the ether does not behave as it was previously supposed to, i.e. like the materials with which humans are familiar.
It does not matter at all what name is used for it (one may choose to call it "ether", "vacuum", "electromagnetic field", "force field" or anything else), but all of modern physics, since James Clerk Maxwell and William Thomson, is built on the assumption that space is not empty, but is completely filled with something that mediates all the interactions between things.
Only before the middle of the 19th century, the dominant theories of physics assumed the existence of true vacuum. The existence of true vacuum is possible only in the theories based on action at a distance, like the Newtonian theory of gravity or the electromagnetic theory of Wilhelm Eduard Weber, but not in field-based theories, like the electromagnetic theory of Maxwell or the gravitational theory of Einstein.
It is rather shameful for physics that the main result of the Michelson-Morley experiment has been the replacement of the word "ether" by "vacuum", as if a change of name would change the thing to which the name is applied, instead of focusing on a better understanding of the properties of the thing for which the name is used.
"Ether" is a hypothetical substance with certain properties. The Michelson-Morley experiment proved that no substance with those properties existed. There's something else with different properties, so it makes perfect sense to use a different name.
In the context of approximately everyone's education, the history goes like this: in the past, people believed there was something in empty space and used the name "ether" for it. You learn that, then you learn that MM showed there is no "something", no "ether", but that empty space is, in fact, empty, which is what "vacuum" means. And then, if you pay attention to or take any interest in the topic, you learn that there in fact is no pure vacuum, that there's always "something" in empty space.
The obvious question to ask at this point is, "so ether is back on the table?".
Turns out the mistake is, as GP said, thinking MM proved space is empty; it only disproved a particular class of substances with particular properties. But that's not how they tell you about it in school.
More specifically, MM showed that the Earth is not moving relative to a hypothetical medium through which electromagnetic waves propagate. So either the universe is geocentric or there is no such medium.
Another interpretation is that the apparatus and not just light is made from ether — and so the signal is lost because the measuring apparatus is also subject to the local distortion.
That interpretation is also consistent with LIGO: we can detect those ether disturbances because the distortion of our motion on the apparatus doesn’t cancel the signal in the same way.
QM posits that fields which constitute matter and fields which constitute EM are both manifestations of an underlying phenomenon — that’s the whole idea behind unification. (And we’ve already successfully unified such fields.) My comment is just applying that theory to interpreting the Michelson-Morley experiment.
Please don’t reply with such trite anti-scientific comments, which conflate actual scientific claims with nonsense.
If you have an actual objection, then you should present it. But argument by mockery because you fail to understand modern physics lowers the quality of the discussion.
Field theories are aether theories; as is GR. (Wilczek says as much.) You’re fixated on a particular model of aether, rather than addressing the broader concept. But that’s as illogical as me insisting atoms aren’t real because the Bohr model of electron shells was wrong.
The current aether for light is called “EM field”; matter is made of other fields demonstrated to unify at high energies by the LHC and similar experiments, within the standard model. But you knew all that.
You’re just pretending ignorance to avoid addressing my central thesis: fields are aethers.
Well, yes, of course. This is a discussion of Michelson-Morley interferometers. In that context the word "aether" has a very specific and well-established meaning, and it is not at all the same as a quantum field.
> fields are aethers
You are free to employ the Humpty Dumpty theory of language and redefine the word "aether" if you like. But in the context of Michelson-Morley interferometers, no, quantum fields are not "aethers". The whole notion of making the word "aether" plural in that context is non-sensical. In the context of Michelson-Morley interferometers there is only one aether: the luminiferous aether, a hypothetical physical substance that exists in three-dimensional space. Quantum fields are not even remotely like that. They are not physical. They do not exist in 3-D space. They cannot be directly measured. Physical objects are emergent properties of fields, but they are not "made of" fields. The constituents of a piece of lab equipment are particles, not fields.
It's more than just the lack of material. It demonstrates that light propagates in a specific way that is different from any ordinary material. Light moving in a vacuum is different from a baseball moving in a vacuum. The speed of light is independent of your own motion, which is not true of anything with mass.
> The Michelson–Morley experiment was indeed very important, but it did not prove in any way the non-existence of the ether. It only proved that the ether does not behave as it was previously supposed to, i.e. like the materials with which humans are familiar.
That's kind of like saying that our failure to observe invisible pink unicorns does not prove the non-existence of invisible pink unicorns, it just proves that invisible pink unicorns don't behave the way you expect them to.
Luminiferous ether was a specific hypothesis about how light works. It made a prediction, which turned out to be wrong, which falsified the theory. Whether you want to attach the description "proves the ether does not exist" or "proves the ether does not have the properties ascribed to it by the theory" is completely irrelevant.
> I thought the whole point was if it did exist the motion goes faster in one direction than the other.
No, the idea was that, in a space filled with the hypothetical ether, Earth's velocity through the ether should have been detectable by comparing light beams traveling in different directions.
The null result was very important -- it didn't prove the absence of an ether; it only showed that an ether wasn't a factor in light propagation.
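For reference, the effect the apparatus was looking for is the classical prediction to first order in v^2/c^2 (standard textbook figures, quoted here only for scale):

    % Round-trip time difference between the arm along the motion and the
    % arm across it, for arm length L and speed v through the ether:
    \Delta t \approx \frac{L}{c}\,\frac{v^2}{c^2}
    % Expected fringe shift on rotating the apparatus by 90 degrees:
    \Delta N \approx \frac{2L}{\lambda}\,\frac{v^2}{c^2}
    % With L ~ 11 m (via multiple reflections), v ~ 30 km/s (orbital speed),
    % lambda ~ 500 nm: Delta N ~ 0.4 fringe. The observed shift was far
    % smaller, hence the null result.

That expected shift of roughly 0.4 fringe is what failed to appear, no matter the orientation of the apparatus or the season.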