One of my favorite college courses was my digital logic class, where we went from logic gates to building a 4-bit microprocessor. It really gave me a deeper understanding of what computers are and how everything ultimately works together.
I took this class when I was already a dozen years into my software development career. It’s certainly not necessary to write software, but I’m really glad I know the underpinnings of computing.
I had a course like that in my CS programme, and even though we really, really didn't go deep, we wired our own super simple 8-bit processor, and wrote the microcode, and then wrote a program in assembly to do something.
Stepping through the fetch-execute cycle one cycle at a time, watching your CPU do its little thing, and then realizing that that is exactly what actual CPUs do, they're just doing it billions of times per second, that was amazingly eye-opening.
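For anyone who hasn't stepped through one by hand, a minimal sketch of that fetch-execute loop might look like this. The four-instruction set (LOAD, SUB, JNZ, HALT), the `run` function and the `COUNTDOWN` program are made up for illustration; they are not the design from the course described above.

```python
def run(program, max_cycles=1000):
    """Step a toy 8-bit accumulator machine; returns (accumulator, cycles used)."""
    memory = list(program)   # the program is simply the contents of memory
    pc = 0                   # program counter
    acc = 0                  # single 8-bit accumulator register
    for cycle in range(1, max_cycles + 1):
        opcode, operand = memory[pc]      # fetch the instruction at pc
        pc += 1                           # point at the next instruction
        if opcode == "LOAD":              # decode and execute
            acc = operand & 0xFF
        elif opcode == "SUB":
            acc = (acc - operand) & 0xFF  # wrap around like an 8-bit register
        elif opcode == "JNZ":
            if acc != 0:
                pc = operand              # a jump just overwrites the pc
        elif opcode == "HALT":
            return acc, cycle
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    raise RuntimeError("program did not halt")


# Hypothetical program: count down from 3 to 0, then stop.
COUNTDOWN = [("LOAD", 3), ("SUB", 1), ("JNZ", 1), ("HALT", 0)]

if __name__ == "__main__":
    print(run(COUNTDOWN))
```

A real CPU does the same fetch, decode, execute steps in hardware, just with binary opcodes instead of strings and tuples.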
In 1986, after quitting my PhD in computational molecular biology and starting out on a career as a software developer, I tried to figure out what the goal for the next 5 years was. I settled on satisfying myself that I understood (enough about) every level of the computer from semiconductor physics up to GUI/UX work. Luckily I had the semiconductor stuff already covered from my undergrad days and a friend from that time who was an EE major.
The rest took a bit longer than 5 years (maybe 7), and of course the learning didn't stop there. But I do remember thinking in the early 1990s "OK, I understand the entire stack at this point, what comes next?" My understanding had reached a level I was satisfied with - it would not have compared with someone who spent their entire working life focused on that level, but it was far, far beyond any normal computer user's understanding.
In the 2 decades and more since then, not a whole lot has changed that has invalidated the understanding I think I gained at the beginning. Microcode was a bit of a shock. The biggest shock has been watching a whole new generation of programmers who barely grasp the lower 80-90% of the system. Maybe that's good, but it doesn't feel obviously right.
It feels no less wrong than having, at best, a surface-level understanding of how serotonin / dopamine / norepinephrine receptors work and still being comfortable enough to drive a car at highway speeds, knowing full well that an error along this obscure signal processing pipeline is a potentially fatal swerve into oncoming traffic.
A couple of years ago I was in this position. I wanted to know how computers work and stumbled upon this resource. I can highly recommend the exercises given on Github by Xorpd. It really requires no prior knowledge and a lot of the exercises have answers now (still a work in progress though).
Alongside that, it might be convenient to have something like asmdebugger.com open.
Another book that really helped me with this was "Code: The Hidden Language of Computer Hardware and Software" by Charles Petzold. It explains how computers work from the light bulb/transistor level all the way up to the OS, the monitor and the keyboard.
I second Charles Petzold's book Code! It was my first technical book and probably the reason I'm going to school for computer science right now. I recommend it to anyone interested in learning how computers work, even non-techies!
I think there already is a subculture of programmers who value the lower resource use (i.e. power as well as memory/speed) and availability of free/open hardware that knowing how computers work affords. Whether that remains a niche or becomes more mainstream is an interesting question.
This reminds me of the nand2tetris course, which I think really helps in understanding the practical implementation of today's computers (https://www.nand2tetris.org)
The associated book ('The Elements of Computing Systems') was eye-opening for me. Not sure if 'today's computers' is quite right -- if I understand correctly, a lot of the simplicity has been optimised out of the hardware stack, to the point that even experts don't have a fully detailed understanding. Still really worthwhile to read and understand the basic principles, though.
I found that one of the most important insights when studying how computers work on a fundamental level is that, deep down, there isn't much difference between data and code. This realization can open the gates to looking at higher-level programming language concepts in a new light, even if one never delves into bare-metal engineering.
You can really take this two ways: one part being about von Neumann architectures, the other side culminating in Lisp. They’re fairly separate but both interesting views on computing. Of course, the opposite viewpoints also exist, as Harvard architectures and your favorite “blub” language are still used today…
This became apparent to me once I learned a Turing machine can compute anything λ calculus can and vice versa. Is this also a conclusion you can reach by thinking about computer hardware? Would love to hear more about the thought process.
I spent one of the most educational years of my life this last year reading "Designing Data-Intensive Applications", "CODE: The Hidden Language of Computer Hardware and Software", and "The Design of the UNIX Operating System".
While they haven't given me any expertise on how to actually use these technologies to solve intricate problems (only experience and true depth can do that), they have given me a near-complete picture of what ordinary computers do today. Nothing is magic. Everything is a logical operation.
I only regret I hadn't read these back when I was a teenager.
PS. Next thing I would like to learn is the general design of modern CPUs and branch prediction.
Designing Data-Intensive Applications is a must-read imo. The book is chock full of very straightforward explanations of what happens in data-related operations, from DB transactions to stream processing.
The most valuable thing(s) I got from it are clear explanations of Transactions, Isolation Levels, and examples of errors that occur due to asynchronous operations.
I do know how computers work, at the transistor level. But it boggles the mind to try to understand how a computer works from the transistor level to the point of where your mouse icon traverses across the screen. Trying to understand it at all levels will really give you a headache. Or at least, it does for me.
I tend to think that if more people understood how to program at this level, the weight of software and websites would be a lot less. I can't log in to my bank website without downloading 1 MB of crap. It's super easy to program by importing * but if you can write in assembly you can really optimize. It's just not worth the effort these days. So the improvement of Moore's law (RIP) ends up going to programmers' salaries instead of to performance improvements. Kind of an interesting way to think about it.
Amusingly, in high school my physics teacher told the class a story about how things had become so complex that no single person could understand everything about the classroom computer.
That was in 1978 and the computer was an Apple II.
It was entirely possible in those days to understand everything from transistors, to digital logic gates, to flip-flops, to clocks, to adders, to registers, to 8 bit computer chips, to assembler, to the primitive BASIC on an Apple II.
QM and classical physics courses (2 years worth) are required of every Caltech student, as they are considered foundational. EE and CS are optional, of course, but not a problem to take.
You can get a hand-wavey understanding quite easily.
Digital circuits use a fairly simple model, and that's not difficult to pick up.
Analog circuits can use a number of more complex models, and they're more of a challenge, depending on what you're trying to do.
Actually designing transistors - defining dopant levels, topology, and process steps - is a very specialised skill that very few people have.
And modern PC motherboards are marvels of electromagnetic engineering as well as electronic engineering. You have GHz signals running reliably on circuit boards with multiple layers. Getting those to work is another very specialised skill.
So it would be quite something to find someone who had a professional-level understanding of all the levels: from silicon chemistry and quantum physics at one extreme, to electromagnetic design, to all the different levels of electronic design, to binary machine language and processor architectures, to OS design and application software, to all the various data structures and encodings that make all of the above useful.
You really need a course in Quantum Mechanics to understand it. But a workable description is a couple of pages in most electronics textbooks. MIT covers it in their first course in electronics on YouTube.
True, but transistors do not behave in a Newtonian physics manner, whereas we do just fine understanding most physics using conventional Newtonian mechanics.
A generic course in QM won't really help you understand electronics any better. Perhaps if you have a background in electronics or if the course makes attempts to link QM with electronics but otherwise it's just relatively abstract physics.
I don't know any program of study that would get deep enough to reach semiconductor quantum tunneling (year 2/3 class) without having some foundational E&M to motivate the QM to begin with (pure math here being a more exceptional but rare case). I think the beauty of QM at that point is that it looks and feels a lot like differential field work of EM and Lagrangian mechanics but maps to quantized numerical spaces, where jumps and little shoots like that happen all the time (spontaneous crystal domain alignment in magnets for example). If you were able to get sufficiently skilled in intro QM, it would absolutely help study in E&M.
If your E&M questions are just node solving, linear algebra and difeq are all you need for E&M to solve "most" work. You do a lot of specialized versions of that in QM and, as mentioned above, the people doing the actual semiconductor design are so highly specialized that their skills are often not transferable to other problem domains and require fundamentally different approaches. Not sure any non-PhD study will really help one advance the field in a meaningful way. Study of QM, I would say, sufficiently enriches all physics verticals that if you are capable, it's worthy of individual pursuit, but it also makes total sense to study as part of a curriculum taking place after intro E&M.
Now that you mention it. After the QM course there was a solid state physics course where quantum effects show up and where it's applied to electronics/semiconductors.
> A generic course in QM won't really help you understand electronics any better.
Caltech had an elective course for freshmen about how semiconductors worked up through flip-flops and registers. I took it. It involved QM and yes, tunnelling. So it's doable for mortals.
Very interesting discussion thread re how much knowledge you need or can acquire on complex technology platforms. The goal is key here, I presume.
I read Charles Petzold's Code, worked through parts of SICP, developed a bit of high-level software, have a very good knowledge of chemistry, lithographic processes (and QM, half forgotten by now). And yet it is not enough to get a good understanding of the modern computer - EE, compilers, OS design, networks, ... wiring the whole thing together, et cetera.
I see the point in historical approaches, but it is quite time consuming. At the end, you will most likely attack problems at one to three levels in the "stack".
What would you consider to be an above average conceptual tool and skill set for the modern hacker?
Yes, I can see how it would help there. I was more thinking of the situation where you take a QM course and don't apply it to an electronics problem. I agree that it's doable for mere mortals, but it might take a fair bit of time!
The handwavy explanations you'll find in electronics textbooks are sufficient to enable you to effectively design electronic circuits around them. But personally, I found those explanations inadequate.
Kind of blew my mind when I learned they are analog... I previously thought of them as one of the most basic digital components, because they're used in every digital circuit.
My job for the past 30 years has been designing ASICs, so all the analog stuff is not my concern. Prior to that I designed at the TTL level. I'd spend months working on some project, thinking in terms of logic and timing, and eventually we'd have boards made. We'd stuff a few prototypes and set them up on the lab bench to commence debugging.
Every time at this phase I'd suffer the existential horror the first time I put an oscilloscope to the circuit -- the clock wasn't square! Signals would ring, and other signals which hadn't transitioned would sometimes bounce a bit. Setup and hold times could be pretty well estimated, but it was still an estimate. Then there were phantoms that might appear on the scope or real issues that would not show up on the scope because the probe lead was enough to make what was measured not the same as what was actually going on.
That reminds me of working on the stabilizer trim system on the 757. I got used to thinking of the parts as rubber, bending and twisting on the high loads an airplane has in flight. So much so that when the hardware was finally available, I was surprised at how rigid and immovable it felt under my hands with my puny human strength.
I am wondering about the reasons for these issues that you mentioned. Could you please mention them here? It might serve as a nice pointer for doing better hardware design debugging, I believe (I have close to zero experience in this, hence the interest).
Also, are there any formal systems that verify your TTL-level designs?
Sure. There are many reasons why a real circuit falls short of the idealized abstraction.
A truly squarewave clock, or any rising/falling edge that happens in zero time would have infinite bandwidth, and no real circuit can have infinite bandwidth because of capacitance, inductance, and resistance. Even though the power rails of a TTL chip are 0V and 5V, when a fast transition happens, the signal can experience reflections because the impedance of the driver and the copper trace are different, and because the end of the copper trace isn't terminated with impedance which matches the trace. As a result, the edge takes finite time to transition from 0V to 5V, but the signal can actually exceed 5V because of a reflection. Likewise, falling signals can go below 0V.
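To put rough numbers on the overshoot, here is a minimal sketch using the standard reflection coefficient for an impedance step, (Z_load - Z0) / (Z_load + Z0). The function names and the specific values (a 10 ohm driver, a 50 ohm trace, an unterminated far end) are assumptions for illustration, not figures from any real board, and the model is an ideal lossless line driven by a rising step only.

```python
import math


def reflection_coefficient(z_load, z0):
    """Fraction of an incident voltage wave reflected at an impedance step."""
    if math.isinf(z_load):       # unterminated (open) end reflects everything
        return 1.0
    return (z_load - z0) / (z_load + z0)


def load_voltage_steps(v_source, z_source, z0, z_load, bounces=6):
    """Voltage at the far end of the trace after each arrival of the wave."""
    gamma_load = reflection_coefficient(z_load, z0)
    gamma_source = reflection_coefficient(z_source, z0)
    wave = v_source * z0 / (z_source + z0)   # wave first launched into the trace
    v_load = 0.0
    steps = []
    for _ in range(bounces):
        v_load += wave * (1 + gamma_load)    # incident plus reflected wave at the load
        steps.append(v_load)
        wave *= gamma_load * gamma_source    # one round trip: load, then driver
    return steps


if __name__ == "__main__":
    # Assumed values: 5 V step, 10 ohm driver, 50 ohm trace, open far end.
    for v in load_voltage_steps(5.0, 10.0, 50.0, math.inf):
        print(f"{v:.2f} V")
```

With these assumed values the first arrival at the far end is about 8.3 V on a 5 V rail, then the voltage rings above and below 5 V before settling. With a 50 ohm driver and a 50 ohm termination the same function gives a single clean step with no ringing.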
Also, the signal at one end of a wire will not be the same as the signal at the other end of the wire. It isn't even just a simple time delayed version. Eventually they'll come to equilibrium if the signal is stable long enough.
OK, another defect. Imagine a TTL part with 8 buffers in it, say a 74ACT244. Say all the outputs are low, and at some moment seven of those inputs switch from low to high. The corresponding outputs must drive enough current to switch their loads from low to high as well. Each of those outputs draws current to charge its load from 0V to 5V, and all of the current flows through a single power pin and a single ground pin. Because of non-zero resistance and inductance, the voltage seen by the transistors on the chip is shifted slightly from the voltage seen on the PCB. That can cause the output which isn't changing to sink/source current (depending on the direction of the change), causing its output to bounce a bit. Imagine having four people on a trampoline, with one standing still and three jumping at the same time. That "standing still" person will still bounce up and down because of the others causing their common ground reference to shift.
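A back-of-the-envelope version of that shift, using only the inductive part (V = L * di/dt). The function name and the numbers (5 nH of pin inductance, a 24 mA current step per output, a 2 ns edge) are hypothetical, not taken from a 74ACT244 datasheet.

```python
def rail_shift_volts(n_outputs, pin_inductance_nh, current_step_ma, edge_time_ns):
    """Estimate the shift across a shared supply pin's inductance: V = L * di/dt."""
    total_step_amps = n_outputs * current_step_ma * 1e-3   # all outputs share one pin
    di_dt = total_step_amps / (edge_time_ns * 1e-9)        # amps per second
    return pin_inductance_nh * 1e-9 * di_dt


if __name__ == "__main__":
    # Seven outputs switching at once, assumed 5 nH pin, 24 mA each, 2 ns edge.
    print(f"{rail_shift_volts(7, 5.0, 24.0, 2.0):.2f} V")
```

With these assumed values the on-chip rail moves by roughly 0.4 V, which is enough to show up on the one output that was supposed to be sitting still.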
Another defect. A chip's spec sheet lists a bunch of requirements for proper operation and guarantees of the chip's behavior if the requirements are met. Say a flip-flop says it has a 5ns setup time, that is, the data in to the flop must be held stable for 5ns before the rising edge of the clock arrives. If this requirement isn't met, the chip's behavior is unspecified (it might capture it, it might not, it might capture it but it might take an arbitrarily long time to appear on the output, or the output could even oscillate). During the design phase one adds up the worst case propagation times for all paths from point A to point B and makes sure there is at least 5ns to spare to guarantee the requirements. But in real life, a clock trace runs all over the board and has many loads, each presenting some amount of capacitance. Kind of like relativity, different chips will see the rising edge at different times. Looking at a given chip, the clock edge and data inputs are each transitioning from high to low or low to high over the course of a nanosecond (and often more). At what point on that slope is the signal really a "0" or a "1"? Sometimes those signals aren't clean transitions: because of reflections, a signal might go from 0.7V before the clock edge, to -0.5V after the clock edge, to 2.2V (neither logical 0 nor 1), stay like that for a few ns, then move to 4.2V for a few ns, then 4.0V for a few ns, then 4.6V.
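The design-phase sum can be sketched as a simple slack calculation. The 5 ns setup time is from the example above; the 40 ns clock period, the clock-to-output delay, the per-gate delays and the skew figure are all made-up values, and `setup_slack_ns` is a hypothetical helper.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, path_delays_ns, setup_ns,
                   clock_skew_ns=0.0):
    """Worst-case setup slack for one register-to-register path, in ns.

    Positive slack means the data settles before the setup window opens.
    clock_skew_ns models the capturing chip seeing the clock edge early.
    """
    data_arrival = clk_to_q_ns + sum(path_delays_ns)        # worst-case arrival
    required = clock_period_ns - setup_ns - clock_skew_ns   # latest allowed arrival
    return required - data_arrival


if __name__ == "__main__":
    gates = [10.0, 9.0, 7.0]   # assumed worst-case delays along the path
    print(setup_slack_ns(40.0, 8.0, gates, 5.0))                     # on paper
    print(setup_slack_ns(40.0, 8.0, gates, 5.0, clock_skew_ns=2.0))  # with skew
```

With these assumed values the path passes on paper with 1 ns to spare, and 2 ns of clock skew across the board is enough to turn that into a violation.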
The most maddening is the fact you can't always trust what the oscilloscope tells you. The oscilloscope has finite bandwidth; 100MHz scopes were common in the 80s and early 90s when I was using them. The 200MHz scopes were a lot more expensive. You might think, hey my clock is only 25 MHz, 100 MHz is plenty ... but a fast signal edge contains harmonics that are a multiple of that 25 MHz base frequency, so the scope tends to act as a lowpass filter. The scope probe is typically more than a meter long, and there is a clip on the probe that must be attached to your circuit ground. That is a 2m loop of wire, and in the best case it would take a signal a few ns to travel the length of that wire, and any stray EM fields can induce signals on that probe that aren't in the circuit (though obviously the probe designers take pains to minimize such effects). When debugging and a signal looked unexpectedly weird, a common tactic to avoid sinking too much time chasing phantoms was to change the probe, change the channel, or simply waggle the lead wire around to test the sensitivity of the displayed waveform to stray capacitance and such.
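A rough way to see the lowpass effect, using two common rules of thumb rather than anything specific to a particular scope: rise time is about 0.35 / bandwidth, and the displayed rise time is about the root-sum-of-squares of the signal's and the scope's. The function names and the 2 ns edge are assumptions for illustration.

```python
import math


def square_wave_harmonics_mhz(fundamental_mhz, count):
    """First `count` harmonics of an ideal square wave (odd multiples only)."""
    return [fundamental_mhz * (2 * k + 1) for k in range(count)]


def scope_rise_time_ns(bandwidth_mhz):
    """Rule-of-thumb 10-90% rise time of a scope: about 0.35 / bandwidth."""
    return 0.35 / bandwidth_mhz * 1000.0


def displayed_rise_time_ns(signal_rise_ns, bandwidth_mhz):
    """Rule-of-thumb rise time shown on screen (root-sum-of-squares)."""
    return math.hypot(signal_rise_ns, scope_rise_time_ns(bandwidth_mhz))


if __name__ == "__main__":
    print(square_wave_harmonics_mhz(25, 4))   # harmonics of a 25 MHz clock
    print(displayed_rise_time_ns(2.0, 100))   # assumed 2 ns edge, 100 MHz scope
```

With a 25 MHz clock, the harmonics from the third onward (125 MHz and up) already sit beyond a 100 MHz scope's bandwidth, and an assumed 2 ns edge would be displayed as roughly 4 ns.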
Testbenches always had a few cans of refrigerant on hand. If you suspected a chip might be flaky, probe its output then spray it and see if the output is affected by the chip temperature.
Your explanation is quite amazing. In my undergrad, we had learned about transistors, TTL, flip-flops etc., but as individual building blocks. Your reply paints a very good picture of what can go wrong when they play together. Is there a book that you would recommend to learn more about this?
Also, honestly, I thought reflections would be negligible when we are dealing with mA and 0-5V ranges. But your reply makes me think that there is more to it.
Finally, thank you for your reply. I'm gonna come back to this answer in the future when I have a lab setup that is more than a soldering iron and some screwdrivers :)
Oh, I forgot to mention that when you probe a circuit, the presence of the probe can affect the operation of the circuit.
On RF designs it was not uncommon to find seemingly needless small-valued components at certain probe points. The idea was that these modeled the reactive load of a probe. When you needed to probe it you'd remove the components and attach your probe, and the circuit would see approximately the same load, so you could have more faith that what you were seeing more closely matched what was happening when you weren't looking.
I have seen such small valued components, looked their values up and thought, "this is insane".
(I wish I could say I thought, "I must be dumb, someone must be much smarter than I am", but if I hadn't embodied a certain kind of hubris I would not work in software.)
I don't know that there is anything truly digital in the universe, except perhaps quantized components like spin.
Digital computers are an abstraction implemented using error control and a number of other things, but they ultimately fail like analog objects.
Nowadays you'd have to know how a modern lithography machine works. There's a reason why ASML is a highly valuable company. You can perhaps understand the basic ideas, but to implement them is another matter entirely.
That's a surprising example. From reading his autobiography and other sources, it's my impression that by virtue of having designed it from the transistors to the software, Steve Wozniak did in fact grok it fully.
Wozniak did not design it at the transistor level; he used the MOS Technology 6502 microprocessor. Though I agree he could probably understand it at the transistor level if he wanted/needed to.
>So the improvement of Moore's law (RIP) ends up going to programmers' salaries instead of to performance improvements.
This statement doesn't really make sense. If programmers spent the time to optimize everything instead of gluing a bunch of inefficient libraries together, software would cost even more than it does now. So that's even higher salaries because there is even more business competition for the same pool of available programmer hours in the workforce (and that doesn't even take into account the fact that you would wipe out 90% of the programming workforce if writing assembly is required).
Moore's law means that you can afford inefficient code, so it's taking money from programmers and keeping it in the businesses that would have originally had to pay for a hand-tuned piece of code.
There is no "Moore's law for performance". Moore's law is a statement about the number of transistors on a chip, nothing more (Moore?). And sure enough, the Threadripper machine I built this year has about double the number of transistors as the Threadripper machine I built last year, which has about double the number of transistors of the one before that... (although this is partly a function of increasing budget; Moore's law specifies a two year doubling).
But sure, if you want to talk performance - the fastest CPU this year, the Threadripper 3990X, is over 100x faster than the fastest CPU of 2005, the Athlon 64 FX-57. That's about 7-8 doublings, or pretty much spot on.
For a very long time between 1970 and 2005 doubling the transistors halved the time taken for any task. We're well past this point.
If you don't think so, try to calculate a_{i+1} = rand_int() * a_i mod 2^32 on a machine from today and one from 2005. Then do the same with a machine from 2005 and 1990.
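A minimal sketch of that experiment, reading the recurrence as a_{i+1} = rand_int() * a_i mod 2^32 (the multiplication and the 2^32 are my reading of the formula). The point, as I understand it, is that every step needs the previous result, so extra cores and wider vector units have nothing to work on in parallel. The `chain` function and the step count are made up, and Python's interpreter overhead will dominate the timing, so a compiled version would be a fairer benchmark.

```python
import random
import time

MODULUS = 2 ** 32


def chain(a0, steps, rng):
    """Run a_{i+1} = rand_int() * a_i mod 2^32 for `steps` iterations."""
    a = a0
    for _ in range(steps):
        a = (rng.getrandbits(32) * a) % MODULUS   # needs the previous a: strictly serial
    return a


if __name__ == "__main__":
    rng = random.Random(12345)    # fixed seed so different machines do the same work
    start = time.perf_counter()
    result = chain(1, 1_000_000, rng)
    elapsed = time.perf_counter() - start
    print(f"result={result} time={elapsed:.3f}s")
```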
My CS systems PhD advisor used to say, 'you waste the abundant resource' in systems design. Lots of CPU cycles available, lots of bandwidth available (except when not!) - so waste it and save programmer time.
At least, in theory.
It's not about knowing assembly or not. It's resources, time, cost, money, ROI.
Even after knowing assembly, there's probably significant optimizations that can be done if you design chips and ISAs that are specific to a certain problem. There's "waste" at every stage of abstraction.
I had to build a one-time export yesterday. Am I gonna spend a lot of time optimizing the speed of it? Or am I going to make it work and get the export to the client faster?
It would've wasted my time and the clients money if I optimized the queries. Sure, the export itself took longer. But overall, the task was done quicker.
If the client wanted this export to run on their server frequently, I would've spent the time making sure the export itself runs faster.
It's especially relevant when thinking not at the level of computer systems but business systems - since for many tasks you anyway have a combined computer+people process that works for some goal, and there's a tradeoff between computers and people for choosing the level of automation you want.
For example, in your case of a one-time export it might also make sense to write code that does not meet all requirements and explicitly cannot handle certain edge cases, if it's more effective to fix these cases manually than to try to automate that.
> when technological progress or government policy increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the rate of consumption of that resource rises due to increasing demand
I think the problem is actually the opposite. There aren't enough abstractions that help programmers write good code. So when a programmer has to do something complex, they end up doing the easy thing which is just importing every library they need. If instead, there was a way to say "only import this when I'm doing this action" and there's no way for the programmer to explicitly say what to import, then that problem almost disappears.
1) In dynamic languages, it's not possible to detect whether a function is used or not in the general case. For example, consider string accesses on objects. If the compiler is not sophisticated enough to resolve the set of possible strings at compile time (or such analysis would unacceptably increase compile times), then you can't shake out unused methods on that object. [1]
2) For languages like C and C++, the compiler cannot tree shake because it only knows about a single file at a time (translation unit, to be precise). You would have to rely on link-time optimizations to effectively tree shake, but LTO is not well-supported by all toolchains.
Tree shaking also has a cost that I mentioned earlier -- it increases compile times. Both LTO and tree shaking in dynamic languages increase compile times superlinearly [citation needed] wrt. the size of your application. As other commenters have mentioned, it's better to avoid including unnecessary libraries in the first place.
[1] For the pedants: yes, I know resolving the possible set of values (stricter than "all possible values for that type") for a variable is undecidable in the general case.
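Point 1 can be illustrated with a small sketch. The `Exporter` class and `export` function are hypothetical; the thing to look at is the `getattr` call, whose argument is only known at run time.

```python
class Exporter:
    """Hypothetical library object with several independent methods."""

    def to_csv(self, rows):
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)

    def to_tsv(self, rows):
        return "\n".join("\t".join(str(cell) for cell in row) for row in rows)

    def to_lines(self, rows):
        return "\n".join(str(row) for row in rows)


def export(rows, fmt):
    # The method name is assembled from a run-time string. Unless a tool can
    # prove which values `fmt` may take, it must assume any of the to_*
    # methods could be called, so none of them can be removed as unused.
    method = getattr(Exporter(), "to_" + fmt)
    return method(rows)


if __name__ == "__main__":
    print(export([[1, 2], [3, 4]], "csv"))
```

If `fmt` comes from user input or a config file, a static tool has no way to tell that `to_lines` is never called, so it has to ship all three methods.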
You could perhaps better use a lazy-loading strategy instead of a static analysis. (But this would change the semantics in case of an existing language that allows side-effects while loading modules, unless you have a lazy strategy for them too ...; and then there are the errors you'd have to deal with)
To truly achieve the same thing as "tree shaking," the function call overhead would be abysmal. You'd have to check whether the module was loaded already (with synchronization if your program is multithreaded). For single threaded programs, you could avoid this by hot-patching your machine code, but there's no way around some synchronization barrier in multithreaded programs [1]. In JavaScript (or any language where you want to avoid sending a large bundle over the network), you'd incur the latency of a network request for the first call to any function.
You're right that people are already splitting apps into bundles, but that is usually done at the page level.
[1] You could probably avoid having to take a mutex after the first call to the function with self patching code, but that sounds incredibly ugly and could have other implications (self-modifying code could be detected as a virus; could be used as a gadget to exploit some other vulnerability).
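The per-call check described above can be sketched roughly as follows. `LazyModule` is a hypothetical wrapper, not an existing library class; it uses the standard `importlib.import_module` and a lock, and it loads a whole module on first use rather than individual functions.

```python
import importlib
import threading


class LazyModule:
    """Hypothetical wrapper that imports a module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None
        self._lock = threading.Lock()

    def _load(self):
        if self._module is None:              # this check runs on every access
            with self._lock:                  # synchronization for threads
                if self._module is None:      # re-check once the lock is held
                    self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # Only called for names not found on the wrapper itself,
        # i.e. everything except _name, _module and _lock.
        return getattr(self._load(), attr)


if __name__ == "__main__":
    lazy_json = LazyModule("json")   # nothing is imported by the wrapper yet
    print(lazy_json.dumps([1, 2]))   # first access triggers the import
```

Even in this small version, every access pays for the "is it loaded yet" check, and the first access pays for the lock and the load itself, which in a browser would be a network round trip.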
I am under the impression that JIT compilers do not modify the compiler’s own bytecode. They write generated code into a separate data region and mark that region as executable. If the code needs to be modified, then control transfers back to the compiler, which can mark the region as writable again.
The same can be done for a binary and its own code, but I wonder if it’s used as a signal in antivirus protections if done too often.
See the Elm to JS compiler, it does deep unused code elimination at the function level so you only ship the code you actually use (or the functions you actually use, even inside libraries).
The closure js compiler is also pretty good if you prefer writing js.
If we only start using better languages in our applications we could improve performance quite a bit.
Dead code elimination has been available in C compilers since... so long ago I can't remember. It's not that compilers can't do it, but much like the halting problem, they really can't catch it all.
I can’t tell if this is sarcastic or not. Tree shaking isn’t a silver bullet and really only works under specific circumstances. Better to not include the library to begin with.
> I tend to think that if more people understood how to program at this level, the weight of software and websites would be a lot less.
I’ve done assembler work, I’m perfectly competent at it. I would never try to build a website in anything low level like that. It’s optimizing for completely the wrong set of problems. The world has lots of network bandwidth, cpu cycles and volatile memory. You won’t find many situations where having a better performing website is worth sacrificing the savings in human labor you get from high level abstractions.
I have a degree in electronic systems engineering. At college I learned about both digital and analog circuit design and fabrication. Had my fair share of FPGA and Assembly programming. Built some microcontrolled printed circuit boards. In summary, I believe I have a reasonable understanding of how computers work underneath, and I'm not interested in using anything other than a high level programming language for consumer application development as well ;)
The big reason that everything is so slow is that we just can't standardize on things. For example assembler: There are many to choose from! So we solve this by building slow abstraction layers.
But we can't standardize because technology is improving so quickly. Any standard would be quickly obsolete.
If we ever standardize on one CPU, one screen size, one GPU, etc etc. Then we can make big gains by hyperfocusing on this.
> So the improvement of Moore's law (RIP) ends up going to programmers' salaries
If businesses demanded that everything be of optimal performance, developer salaries would most likely increase because that would increase demand for developer labour in the market without increasing its supply. And I know that many developers would be more than happy to do this (performance optimisation being one of these classic 'nerd sniping' tasks that really appeals to many programmers) but are simply not granted the time to.
And there is a reason that hardly anyone's boss is telling them to eke out every bit and clock cycle: they can't afford it, or at least don't want to pay to do that. So the "benefit" of doing this actually lands squarely on the business, if you want anywhere to look for wasted resources.
As someone who used to work in banking software I can tell you where the 1MB of crap comes from - many banks create their web apps out of ready to use components, which are designed to be completely independent and this brings a huge weight penalty.
Oh spare me. If you went out to write a secure banking website in assembly code you'd come back in 20 years with something even more bloated than we have now. Why? Because the first thing you'd write is a higher-level programming language suited to actually building websites. Then you'd write a web framework on top of that. Then you'd write a bunch of buggy, insecure crypto code. Et cetera et cetera until you've reinvented a much worse version of the stack we have now.
I'm not convinced. Doug Engelbart's lab did essentially that with a dozen or so engineers within 2-3 years before the famous 1968 demo. I'm not saying we should do that for a banking website, but IMO when it comes to Frontend development there is an opportunity to rethink and start over, maybe on top of wasm.
I have known from the transistor level through to the mouse moving on the screen, just not all at once. If I put the effort in then I know I could have a solid grasp of all stages at once but it would be a waste. If you're programming in assembly then you're dealing with instruction sets. You don't need to know how transistors make up a NAND gate to program with instructions, or even how gates fit together to make higher order components. You just need to know you have a processor and memory.
To write highest performance assembly code you need to understand more than the instruction set. You need to understand about cache hierarchy and how cache lines are filled, about microcode and how to fill the pipeline, including how to exploit hyperthreads, what the consequence of different workloads is on boost speeds, how the vectorized instructions map to hardware and when the trade-off is worth it, etc... Not to mention that you really also need to be a GPGPU expert so you can trade off work to the graphics hardware when necessary.
One can write assembly while knowing only a fraction of that stuff, but the compiler will probably outperform that kind of assembly, so there is little point aside from the fun of it.
What if? While I’d love to really know how computers work at this level of depth, I can’t help but feel like it’s a colossal waste of time these days if you already have other sharply tuned high level programming skills.
The ROI isn’t great, the time you spend learning this is time you could have spent going deeper into your chosen craft and becoming even more masterful.
By the time you do learn a dangerous level of assembly and reverse engineering knowledge, you’ll still have to fight for entry level gigs to get in the industry unless you’re really good at finding niche and high paying work... meanwhile people still want to pay you six figures+ for your decade or more of experience writing everyday software at scale.
Alas, whereas it once seemed like anything was possible, as I get older I come to realize the best use of my time is to further perfect the things I already know how to do well, so that one day I can truly become a sagacious master, or at the very least not fall behind my peers and become “outdated”.
Learning opportunities like this are inspiring, but ultimately just a passing curiosity. I hope someone young finds it well.
Agree. Yes, it’s extremely interesting. I look back fondly on my youthful hours/days/weeks spent tinkering with assembly.
And yet. To a junior dev who asked me for advice on this stuff today, I’d just say read Code: The Hidden Language of Computer Hardware and Software, and if you like that then work through Nand to Tetris, and then go back to whatever you need for your day job.
What I don’t need to know, and don’t at all, is how web APIs really work, what a kubernetes is, there is something called Elixir I think and it may or may not be related to some Etherium not-money, and have you heard about elastic searches? Because I haven’t.
But I do need to know extremely tight programming for tiny microcontrollers, how many cycles per instruction this or that will take, when pipelines get flushed, device drivers, and datasheets and silicon errata.
If you deal in higher level stuff, you likely use more ram for a text field than I may have on my entire chip.
The only reason I don't use less RAM for a text field is because it doesn't fucking matter, RAM is cheap and when you're building web applications your user is probably not going to run out of it, it's not even a thought. RAM sitting around doing nothing is a waste of money, might as well put it to use.
Yeah, although I think understanding how computers work at a basic level (the components of a CPU, what instructions are, caches, etc.) has a pretty good ROI because it makes it easy to understand some concepts at a higher level (concurrency for example), understand why some things work a certain way, think about some common performance optimisations, tradeoffs, etc.
I think that's true of most things—you don't need to (and likely cannot) know everything in real depth, but having some knowledge tells you where to look, what to look for, etc.
The CPU / RAM bottleneck isn't getting any faster. Latency is actually increasing with newer versions of DDR.
Most of your CPU time is spent waiting on memory. Compilers are hard-pressed to make you choose efficient data-structures. You just need to know the implementation details to get a responsive user experience.
I've tried to deep dive the modern PC a few times before and I always end up giving up because it quickly becomes a proprietary and undocumented minefield.
This is all totally correct, but why not transition into management and get even more money? Dealing with people is more future-proof than any computer programming
Looks pretty interesting. I took a computer architecture class in Jan-Feb this year, and it was one of my favorite things in a long time. I've written software for ~6 years but I didn't really know much about how CPUs work, what an instruction is, what registers are, etc. so I learned a lot.
I thought a lot of these details could be boring or hard to grasp, but it was the opposite—to me, things like how logic gates worked, how we remember values between computer cycles were all very fascinating, especially compared to the kind of work I've done for a long time.
I've learned a bit more about operating systems and databases since then, and I think knowing a bit of computer architecture helped me understand several topics in much better depth. So even if knowing these things doesn't have a very direct ROI in terms of getting a job, I think it's worth spending the time to learn at least the basics.
Some of you would be ashamed of the choices you make in typed languages...
Others would be horrified to find out what happens in the back end of not typed languages...
You would constantly find it amazing that single CPU core systems aren’t ever actually doing more than one thing at a time and the concurrency of moving the mouse on the screen and ANYTHING else the system is doing is just an illusion...
You wouldn’t shit on C language... at least less.
... IDK, I’m sure there are a ton more I’m not thinking about. I’ll leave it up to someone else to articulate better... something something FaceTime is usually a massive waste of technology.
These are bad musings from an embedded programmer who is dealing with a 2020 released chip and its 4KB of RAM.
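To make the single-core point above concrete, here is a toy sketch of time slicing. It is cooperative round-robin scheduling built on Python generators, which is an assumption made for brevity: a real OS preempts tasks with a timer interrupt rather than waiting for them to yield, and the task names are made up. The point is only that the "core" runs exactly one slice of one task at any moment, and the interleaving is what looks like concurrency.

```python
from collections import deque


def task(name, steps, log):
    """A 'process' that does one small unit of work, then yields the CPU."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run tasks one slice at a time on a single 'core' until all finish."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run exactly one time slice
            ready.append(current)  # not finished: back of the queue
        except StopIteration:
            pass                   # finished: drop it


log = []
run_round_robin([task("mouse", 3, log), task("download", 2, log)])
print(log)
# → ['mouse:0', 'download:0', 'mouse:1', 'download:1', 'mouse:2']
```

Nothing ever runs in parallel here; shrink the slices enough and the mouse still appears to move smoothly while the download proceeds.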
I don't think knowing how computers work stops 4MB of JS on websites, just that the kind of person inclined to learn about the foundations of computers is more likely to not want bloated websites in the first place, and is less likely to work as a web developer.
> You wouldn’t shit on C language... at least less.
Nope. If anything, while programming for microcontrollers I learned C is too remote from hardware and leaves too much implementation-defined. Struct padding? Implementation-defined. Bit field implementation? So implementation-defined you'll have to roll your own with shifts and masks. Accessing anything mapped to memory address? The pointers as defined in C standard aren't what you think, but fortunately the implementation often is. Even standard variable sizes are only 21 years old, which means that there is still legacy code that either doesn't use them or wraps them in layers of typedefs to work with toolchains or libraries that don't. And good luck if you have 24-bit registers.
Of course it's frustratingly low level at the same time. I don't know what could be a good choice, and C is what everyone already uses anyway.
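For anyone who hasn't had to do it, "roll your own with shifts and masks" looks roughly like the sketch below. It is written in Python only so it is easy to run; the same expressions carry over to C nearly unchanged. The register layout (a 3-bit MODE field at bits 4..6 of an 8-bit register) and the helper names are made up for illustration and do not come from any real datasheet.

```python
def field_mask(shift: int, width: int) -> int:
    """Mask with `width` one-bits starting at bit position `shift`."""
    return ((1 << width) - 1) << shift


def get_field(reg: int, shift: int, width: int) -> int:
    """Extract a field: mask off the other bits, then shift down."""
    return (reg & field_mask(shift, width)) >> shift


def set_field(reg: int, shift: int, width: int, value: int) -> int:
    """Return `reg` with the field replaced by `value`; other bits untouched."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the field")
    mask = field_mask(shift, width)
    return (reg & ~mask) | (value << shift)


# Hypothetical layout: MODE occupies bits 4..6 of an 8-bit register.
MODE_SHIFT, MODE_WIDTH = 4, 3

reg = 0b1000_0101
reg = set_field(reg, MODE_SHIFT, MODE_WIDTH, 0b110)
print(f"{reg:08b}")                            # → 11100101
print(get_field(reg, MODE_SHIFT, MODE_WIDTH))  # → 6
```

Unlike a C bit field, the position of every bit here is spelled out in the code, so it does not depend on how a particular compiler chooses to pack fields.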
> Struct padding? Implementation-defined. Bit field implementation? So implementation-defined you'll have to roll your own with shifts and masks.
You must be much more advanced than I am. Because in 20 years I’ve never had an issue with structure padding because I just assume it pads to n basetypes and pad them manually to 4byte intervals in 32bit.
As to portability, man, you must be moving from ARM to MIPS to x86 to 8051 to OLD stuff, I’ve never had an issue here either, I just don’t make assumptions of byte order if it might matter. I mean, code gets tested right? LE to BE or padding or whatever isn’t going to slip by me typically.
The problem with shifts and masks is they don’t typically “automatically” take advantage of the BMF cpu operations like structures definitely will.
IDK, like I said, you must be more advanced than I am. I’ve had none of these issues. Ignorance must be bliss! :)
Yes, my most interesting experience is from writing code that was portable between two TI DSPs. One had 16-bit char and 24-bit pointer, the other 8 and 32 bits. Of course, the struct padding, bit field packing, char signedness, etc were also different. By the way, their compilers were very good at two things: optimizing shifts and masks into bit access operations and sticking to the standard where it is the least intuitive. C was originally intended to be a high level systems programming language and this stuff would have been written in assembly, and it shows.
> You wouldn’t shit on C language... at least less.
Anyone who still thinks C is the way computers work is in for a surprise when they realize all modern x86 machines are just emulating x86 in microcode for backward compatibility reasons.
> If you knew how computers worked.... Websites wouldn’t be 4MB on the front page...
Websites are 4MB because they want a certain experience and a certain functionality, and developers assemble various common components and libraries to achieve this with the least amount of custom development.
A developer knowing assembler will not change the priorities of web site owners.
Sure. I’m working on an IoT-like product. I have extremely limited resources. A serialized packet going to the server can be no more than 20 bytes in this case, it needs to fit in a default size single BLE packet.
My app developers came back with a message ID field that uses a UUID. A 128-bit store, 16 of my possible 20 bytes.
The whole packet as they designed it was only 6x larger than could possibly fit.
They used int... not having any idea what an int64_t was or that because the value could only ever be 0-20 that a uint8_t was most appropriate. Oh but the language they’re using doesn’t have 8bit ints I guess so then it was a discussion of “what is bit shifting” and who even uses this weird operator >> and << !?
I have a hundred examples of the above. Embedded is a different world, I consider that “how computers work” territory.
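As a rough illustration of the kind of packing being asked for, here is a sketch using Python's standard `struct` module. Only the 20-byte limit, the 16-byte UUID and the 0-20 value range come from the comment above; the field layout, the field names and the choice of a wrapping 16-bit message ID are assumptions made up for the example, not the actual product's protocol.

```python
import struct
import uuid

BLE_PAYLOAD_MAX = 20  # the 20-byte limit mentioned above

# A UUID is 128 bits, so on its own it eats 16 of the 20 bytes.
UUID_SIZE = len(uuid.uuid4().bytes)

# Hypothetical compact layout, little-endian, no padding ("<"):
#   H  message id (uint16, wraps around)
#   B  level, 0..20 (uint8)
#   I  timestamp in seconds (uint32)
#   h  temperature in 0.01 degree steps (int16)
PACKET_FORMAT = "<HBIh"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)


def pack_reading(msg_id, level, timestamp, temp_centi):
    """Serialize one reading into the fixed-size layout above."""
    if not 0 <= level <= 20:
        raise ValueError("level must be in 0..20")
    return struct.pack(PACKET_FORMAT, msg_id & 0xFFFF, level, timestamp, temp_centi)


def unpack_reading(payload):
    """Inverse of pack_reading: returns (msg_id, level, timestamp, temp_centi)."""
    return struct.unpack(PACKET_FORMAT, payload)


packet = pack_reading(msg_id=70000, level=17, timestamp=1_600_000_000, temp_centi=-512)
print(UUID_SIZE, PACKET_SIZE)  # → 16 9
print(unpack_reading(packet))  # → (4464, 17, 1600000000, -512)
```

With explicit field widths the whole reading takes 9 bytes, less than the UUID alone, and leaves 11 bytes of the BLE packet free.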
Back when I was in electronics during our Digital circuits section we built all the logic from gates and such (TTL, CMOS) and then at the end we learned 8085 assembly using an "Emac 8085 primer trainer" which we all had to build ourselves. It had no battery backup (optional) and we were not using the serial output at all...so we lost any programs we made on power off. It was very interesting to learn and I feel fortunate that I understand computers on that level.
That being said...I don't really do code at all :) Most of what I do is on the hardware side of things.
I also recommend the book "But How Do It Know? - The Basic Principles of Computers for Everyone." It attempted to explain circuit-based logic gates, how you could use binary trees for all the basic math operations, and then for other characters, building up to more and more complex operations...
I had to stop and sit and think several times while reading that book, but it was a real thrill when I started to understand it, feeling that small flash of near-omniscience...
This looks pretty interesting. I didn't look into the course materials, but I wonder if someone learning "how computers work" would benefit from a casual introduction to a simple microprocessor hardware architecture.
For me, it was the Z80, because there happened to be a book at Radio Shack, published by Howard Sams, that was extremely well written. But since this is about x86 assembly, the old 8086 isn't all that bad in terms of grasping how the different kinds of instructions actually play out in the registers, memory and i/o busses, and so forth. And it's not so bad to learn about the features of modern processors in light of how they compare to early chips. For instance in the 8086, a pointer is not an abstract concept, but a real physical thing!
Despite modern microcontrollers being much more sophisticated than the Z80, that foundation has helped me learn and understand things like how to read the documentation for special function registers and advanced peripheral chips.
I think 6502 (commodore 64/apple 2 CPU) is perhaps the best introduction to asm. The instruction set is designed for people and while it only takes a day to learn, it takes a lifetime to master.
I've developed software for 6502 (the first processor I learned assembly language for, but not the first CPU I programmed at the machine code level), Z80, 6809, 68000 series (x00, x20, x30, x40), a variety of MIPS (R3000, R4000, R10000), ARM and a number of variations on it, x86 and all its variations, T800 Transputers, various 4-bit and 8-bit microcontrollers, and well over a dozen more CPU types over the years.
For real-world processors, I would say that the 68000 is probably the best introduction to assembly-level programming. Lots of registers, usually a flat and easily understood memory map, a clean instruction set, and instructions that have direct parallels to many of the basic operations you would find in higher level languages, e.g. multiply, divide, addition and subtraction of 32-bit numbers, easily understood loops, various conditionals and so forth. Certain early ARM variants would run a close second of "try this out, it's easy to understand." Though even then, I'd hesitate to introduce someone to ARM programming unless they had a real interest in learning assembly programming.
6502 is simple, I agree with that, but moving beyond the most basic of operations takes careful thought.
So for an introductory assembly language on a real world processor, I would have someone start with 68K. If they are a reasonably competent programmer, facile in their language of choice, they should be able to crank out 68K assembly that does useful work in a couple of days. Once they grasp that, it is possible to consider the 6502 or Z80 or 6809 as options for a less capable, but certainly more difficult, next step in assembly language.
I would strongly disagree that 6502 is designed for human readability or that it is the pinnacle of assembly language programming options. For real world processors, I would be hard pressed to rank it in the top 10 of "processor of choice to solve this task because of capability." Perhaps up to about 1982 it could be considered reasonably competent. But "pinnacle" is not a word I would ever use to describe that CPU no matter how fond my memories are of learning to program it or shipping my first video games on it. The 6502 CPU has a number of shortcomings that were clearly evident in the late 1970s when it was at peak popularity. The one thing it did have going for it was that it was cheap and that it was easy to get going on a board with some supporting chips.
Yeah, as someone who's taught an intro to computer architecture curriculum based on the 6502, I wouldn't recommend it. I ended up switching to MSP430 after the first run through.
Features of the 6502 ISA like register width being half of pointer width with no way to link a pair of registers together to a full address, relying on you to have to bounce back and forth through memory to construct a full pointer really obfuscated a lot of the concepts I wanted to teach. Yes, "zero page is the real registers", bla bla, but those extra hoops don't do any favors to someone who's just learning computer architecture.
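For readers who haven't met the 6502, the hoop-jumping described above can be modelled roughly as follows. This is a simplified Python model of the 6502's indirect indexed addressing, `LDA (zp),Y`, where a 16-bit pointer has to live as two bytes in zero-page memory because no register can hold it. The helper names and the addresses used are made up for illustration, and cycle counts and flags are ignored.

```python
memory = bytearray(0x10000)  # 64 KiB address space, 16-bit addresses


def store_pointer(zp_addr: int, pointer: int) -> None:
    """Write a 16-bit pointer as two bytes, low byte first (little-endian)."""
    memory[zp_addr] = pointer & 0xFF                      # low byte
    memory[(zp_addr + 1) & 0xFF] = (pointer >> 8) & 0xFF  # high byte, wraps within zero page


def load_indirect_indexed(zp_addr: int, y: int) -> int:
    """Model of LDA (zp),Y: read the pointer from zero page, add Y, load the byte."""
    lo = memory[zp_addr]
    hi = memory[(zp_addr + 1) & 0xFF]
    effective = ((hi << 8) | lo) + y
    return memory[effective & 0xFFFF]


# Put a value somewhere in memory, then reach it through a zero-page pointer.
memory[0x1234 + 3] = 0x42
store_pointer(0x20, 0x1234)
print(hex(memory[0x20]), hex(memory[0x21]))  # → 0x34 0x12
print(hex(load_indirect_indexed(0x20, 3)))   # → 0x42
```

The 8-bit registers only ever hold the index and the loaded byte; the pointer itself has to be assembled in memory one byte at a time, which is the extra step a beginner has to absorb before seeing what a pointer is.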
68k looks clean, but I've never had to use it in my career so I shy away from it for probably that reason only.
If you know even a modicum of assembly language you can grok 68K in under 20 minutes. It works exactly how you expect it to work. I want to load a register, I want to load a second register, I want to multiply them. The most remarkable caveat in 68K, that I can think of, off the top of my head, is "multiply and divide can be slower than you expect." Which is where the MMU or the x40 comes in.
The MSP430 is also a good learning processor, clean instruction set, not too much to take in at once, though there is mental overhead when you need to start being concerned with stack processing and also with some jump instruction limits.
> If you know even a modicum of assembly language you can grok 68K in under 20 minutes. It works exactly how you expect it to work. I want to load a register, I want to load a second register, I want to multiply them. The most remarkable caveat in 68K, that I can think of, off the top of my head, is "multiply and divide can be slower than you expect." Which is where the MMU or the x40 comes in.
Going to be real, I never liked the split A and D integer register files of the 68k. It always seemed like it would obfuscate one of the base ideas I want students to keep over their lives. That it's all memory, whether it's 'data', or pointers, or instructions, it's all just bits in memory. Having to load into one part of the processor to use it as an offset seemed antithetical to that goal.
> The MSP430 is also a good learning processor, clean instruction set, not too much to take in at once, though there is mental overhead when you need to start being concerned with stack processing
It's got clean push and pop instructions, and R1(SP) can otherwise be accessed just like a normal register in the file. What are you thinking about there?
> and also with some jump instruction limits.
Yeah, I just stuck RAM and ROM within the limit on the emulated system they use so that doesn't come up.
I will agree that the split Address and Data registers are probably not the best architectural decision the designers made. But in a pool of "be aware of these things", only having to deal with a half-dozen new concepts is a lot easier than the dozens of caveats and new concepts of many other processors. It is easier to teach "multiply is slow but this is how you do it" vs "here's a function you have to call, or need to write from scratch, to do a division."
Regarding the stack processing, I misspoke; I was recalling a different CPU, whereby, given an address in the R(0), it is possible to execute an instruction at a different memory address than the one the PC is currently pointing at, effectively giving you a second PC, which was frequently used to execute a function without jumping to the function. The easiest way to describe that concept is a mix-in.
I totally agree that it isn't a big deal in practice. It's just antithetical to one of the major theses of the course, that there is no difference between pointers and other forms of data, but instead those are abstractions that mainly exist at a higher level.
And lol, yeah, the execute indirect processors are super goofy and I wouldn't subject students to that right off the bat.
Cannot decide if sarcasm or serious. The PDP-11 computer was a good architecture for the time, but "most elegant modern architecture" is so dismissive of everything that has been developed in the past 50 years as to border on the absurd. The instruction set of the PDP-11 (MACRO-11, the programming language), and the CPU architecture that went with it, was kinda shit to be honest. It was groundbreaking at the time as many systems were, and firmed up many of the standards that we have today, but I wouldn't ever describe the CPU as elegant. Influential, perhaps.
The site implies assembly is how computers really work. IMO, assembly is just an incidental implementation detail of little note. High level language chips are possible, after all.