Michael Stonebraker wins Turing Award (newsoffice.mit.edu)
528 points by assface on March 25, 2015 | 74 comments


His creativity is amazing: founder of a number of database companies, including Ingres, Illustra, Cohera, StreamBase Systems, Vertica, VoltDB, and Paradigm4...at MIT, where he has been involved in the development of the Aurora, C-Store, H-Store, Morpheus, and SciDB systems [1]

And his students:

-Daniel Abadi (co-founder and Chief Scientist of Hadapt)

-Michael J. Carey (faculty at UC Irvine, formerly at U. Wisconsin Madison, NAE Member and ACM Fellow)

-Robert Epstein (founder and former VP of Engineering of Sybase)

-Diane Greene (co-founder and former CEO of VMWare)

-Paula Hawthorn (founder of Britton-Lee, formerly VP of Engineering of Informix)

-Marti Hearst (Professor at UC Berkeley)

-Gerald Held (former VP of Engineering of Oracle)

-Joseph M. Hellerstein (faculty at UC Berkeley)

-Anant Jhingran (VP and CTO for IBM's Information Management Division)

-Curt Kolovson (Sr. Staff Research Scientist at VMware)

-Clifford A. Lynch (executive director of the Coalition for Networked Information)

-Mike Olson (former CEO of Sleepycat Software and founding CEO of Cloudera)

-Margo Seltzer (Professor of Computer Science at Harvard, founder and former CTO of Sleepycat Software)

-Dale Skeen (founder of Tibco, founder and CEO of Vitria)

[1] http://en.wikipedia.org/wiki/Michael_Stonebraker


Don't forget his newest company - Tamr! http://www.tamr.com/


Here are the links to the top info pages for each of the projects you named:

https://news.ycombinator.com/item?id=9267844


For those who are unfamiliar with his work, he worked on Ingres and Postgres, which are the basis for PostgreSQL.

He worked on Aurora (http://homes.cs.washington.edu/~magda/aurora-medusa.pdf), which I don't think ever got a lot of commercial success, but was one of the earlier stream databases.

He also has some amazing work on non-traditional databases. For example, he worked on H-Store (http://hstore.cs.brown.edu/), which is now VoltDB, an in-memory distributed database.

And then C-Store was the basis for Vertica: http://en.wikipedia.org/wiki/C-Store

I'm pretty sure Margo Seltzer (of BerkeleyDB fame) was his student at some point too.

Truly amazing researcher - and I think an example to those who focus on a teeny tiny minor niche their entire career and never explore anything else.


Also his YouTube videos are highly worth watching:

https://www.youtube.com/results?search_query=michael+stonebr...


"On what hardware platform does Oracle run the best?"

...

"Exactly right, on a 35mm projector..."

hahaha


Fun Fact: Stonebraker used to party with Michael Jackson in the early 1980s when Thriller came out:

https://youtu.be/LCXWF-OEIkc?t=6m17s


Whoever it is, it's not Michael Jackson [1].

[1] http://mcs.open.ac.uk/mj665/

Whoosh.


Fun fact, it's Pete Carter as Michael Jackson https://youtu.be/LCXWF-OEIkc?t=639


He got me there for a moment. Pete Carter also says that Stonebraker wrote lyrics for "Beat It".


I can highly recommend his video "Everything you learned in your DBMS class is wrong" http://slideshot.epfl.ch/play/suri_stonebraker

It opened my eyes to why traditional block-based RDBMSs were definitely going to go away and be replaced by in-memory engines.


Aurora was commercialized as Streambase Systems (http://en.wikipedia.org/wiki/StreamBase_Systems), and was a reasonable success as well.


Also doing some really interesting work @ Tamr. Worth checking out.


and scidb.


[flagged]


I don't know about the episodes you're referring to, but it appears she wrote the log-structured filesystem for BSD (which builds on UFS, which was created by McKusick, a great filesystem for its day that ultimately needed improvements). It also looks like she founded Sleepycat, and knew a bunch about Berkeley DB, which hammered the filesystem.

"""She is the author of several widely-used software packages including database and transaction libraries and the 4.4BSD log-structured file system. Dr. Seltzer was a founder and CTO of Sleepycat Software, the makers of Berkeley DB and is now an Architect for Oracle Corporation."""

So, frankly, I think she probably is qualified to tell McVoy (not one of my heroes) and McKusick (one of my heroes) how to up their game.


I remember Berkeley DB for occasionally corrupting its databases.


Come on, /that/ downvote was unfair.

I can grudgingly accept that people downvote me for saying negative things about someone without linking to evidence (that really does exist, I swear!).

But Berkeley DB not being robust back in the day shouldn't be controversial at all. It's a simple fact.

And look: here's someone who agrees:

https://news.ycombinator.com/item?id=3614071

(this time it was a lot easier to google)


I downvoted you not because your statement was negative but because it was irrelevant. You jumped in to trash somebody tangentially mentioned apparently because you don't like their politics. When somebody gave you a substantive response, you ignored the meat and came back with personal trivia. Comments like these waste everybody's time.


I didn't downvote you myself, but I will say I have ... um... "some" experience with Berkeley DB, and it's pretty well designed for reliability. Obviously that doesn't mean nobody has ever experienced corruption, but it's been used heavily in production environments as the storage layer for many systems.


I didn't think you did and I apologize for not phrasing it better.

Berkeley DB is an old piece of software that has changed a lot over time, as have the environments it has to run in. I think the big change in robustness happened from 3.x to 4.x.

Edit: grammar.


Can you provide some evidence of this episode? Right now I'm reading her PhD thesis at Berkeley on filesystems and several of her first-author papers with McKusick. Her work cites McVoy's and says it's OK.


Maybe later. I took a quick google around before I posted but they are not as easy to google as they were the last time I looked at it, some years back.

I think I originally found them via McVoy's page back in 2000, and then followed links and googled a bit. There may also have been some discussion available through Deja News.

McVoy had written some interesting stuff even if his BitKeeper thingy looked horrible and very TacKy :)


If your assertions can't be backed up with data, I suggest you delete them. Otherwise you're just spreading FUD.


I became fascinated with what Stonebraker was saying back when I got into the biotech industry. We were dealing with huge amounts of data and the more I read from him the more it made sense to me, particularly when he talks about the lack of ACID in NoSQL stuff being a bad thing. I have a couple of projects involving VoltDB and SciDB on the back burner, and I plan on using VoltDB in future projects where possible and applicable. So far I am pretty convinced that they are much more useful than people understand.

If you haven't read up on either VoltDB or SciDB or Stonebraker himself, I highly suggest you do, as it might make you think twice about some of your current setups. Here are a few quotes for the fun of it:

"I think the biggest NoSQL proponent of non-ACID has been historically a guy named Jeff Dean at Google, who’s responsible for, essentially, most to all of their database offerings. And he recently … wrote a system called Spanner,” Stonebraker explained. “Spanner is a pure ACID system. So Google is moving to ACID and I think the NoSQL market will move away from eventual consistency and toward ACID.”

“My prediction is that NoSQL will come to mean not yet SQL,”

"You saw that they went for Cassandra for inbox search and HBase for messaging. The reason they're not doing that on MySQL is that sharding MySQL is a lot of effort and you have to apply that effort to each new project."

That should be enough to get your curiosity piqued.


A lot of that seems anecdotal at best. Now, I'm not one to argue with someone who has as much experience as Stonebraker, but it seems like he's looking at a few specific use cases, and formulating a broad opinion on NoSQL from them.

There are plenty of use cases where ACID compliance truly isn't needed.

Also, just because Google has one new database that features ACID compliance, does not mean that "Google is moving to ACID", it simply means that Google has identified a need for a portion of their data to be stored in an ACID compliant way.


I don't disagree, but I think what he is trying to convey is that there are a lot of places where ACID is needed but isn't being put into place. He's not arguing against non-ACID, he's saying people are using non-ACID systems where they shouldn't. It's a small but important distinction.


Very well deserved. I've looked over dozens of papers for relational DBs and every single one of them cites back to his foundational work. Congratulations Professor Stonebraker!


Seconded. He was on episode 199 of Software Engineering Radio two years ago (http://www.se-radio.net/2013/12/episode-199-michael-stonebra...), which I thoroughly enjoyed. It was so informative that I took a page of notes while listening! Really great stuff.


Agreed. I remember being very inspired by some of his papers when I started studying databases in grad school. His work along with all of those who developed the B-Tree and its variants[1] is really foundational to data storage and retrieval. He also helped to kick off the "NoSQL" and "NewSQL" movements with his C-Store paper.[2]

[1] Proposed by Bayer and McCreight, and independently developed by Chiat and Schwartz, and also by Cole, Radcliffe and Kaufman, improved by many including D. Knuth.

[2] C-Store: A Column Oriented DBMS. Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran and Stan Zdonik. VLDB, pages 553-564, 2005.


And yet:

> An adjunct professor of computer science and engineering at MIT

Adjunct? Does this mean something different at MIT? Or is it some form of convenience for Stonebraker?


This is what adjuncting is supposed to be for - folks with industry experience moonlight as professors. The benefit is not primarily salary but networking, knowledge sharing, and helping the next generation of industry professionals.

Now it's more of a way to have 2/3 or more of the department work for unliveable wages which allows for an ever-growing administrative overhead in colleges while tuitions double every decade.


As far as I know, not true at MIT: teaching is done by tenured or tenure-track professors, with exceptions like this, or SF author Joe Haldeman, that prove the rule.

But, yeah, I hear from lots of sources that adjuncts making peanuts have become the rule rather than the exception in general US academia, and there's no disputing how administrators are taking over higher education, now even desiring to wrest what little the faculty still control from them.


Management in all organizations should be automated by AI, with copious amounts of override buttons sprinkled throughout.

Education administration is like a thorn stuck in my mind, they make _more_ than everyone else and for the most part only act as a gas to support their own structure.


This is how every bureaucracy works, whether government, commercial or academic. Once the institution has enough income/cash flow for momentum, it attracts people who are expert at operating the machine itself, rather than expert in what the machine is supposed to be accomplishing. After the first one lands, they continue to accrete.

It's hard to recognize when it starts, but you'll know it's happened once you see a lot of people who are not connected with the apparent goal of the machine, and there are posters all over the place touting whatever programs the administrators have created to justify their existence, as well as packaged training programs from motivational/educational consultants (think Franklin Covey).


Ha, a fork bomb of Agent Smith crossed with Nancy in Program Outreach.

The accretion or calcification model of bureaucratic formation is compelling, something like how a coral reef grows. The randomized surface provides eddies and pockets of protection for other life to flourish, RFPs and SBIRs can nestle in a protected arena with low local competition.

I just realized that large, messy codebases also follow the reef model of bureaucracy. Hadoop is like that coral reef, providing nooks and crannies for optimizations and integrations to take hold. I used to imagine Hadoop as a whale fall [0], but it is more of a Mandelbulb. Had Hadoop not provided such a rich environment the secondary ecosystem wouldn't be as vibrant. Fail to Win?

I find management structures fascinating. Whenever I interact with one I probe it to see how much autonomy each individual in it has, what rules they can bend or not follow. Once the agents participating in the bureaucracy cannot bend the rules I think it will tend towards dystopia. Maybe 1984 isn't a warning against fascism, but against the natural tendency of all bureaucracies to only support themselves.

[0] http://en.wikipedia.org/wiki/Whale_fall

note: I might sound like the stereotype of a hackernews-bitcoin-libertarian, but I assure you my politics are much more nuanced than that. I don't think that bureaucracy as a structure is bad, but it needs to be managed with something akin to the voting logic in a triple redundant control circuit [1] [2]. Most bureaucracies exist within a positive feedback loop, which rewards them for growth instead of efficiency. It is like getting paid by LOC instead of 1/LOC or 1/runtime.

[1] ftp://ftp.unicauca.edu.co/Facultades/FIET/DEIC/Materias/Instrumentacion%20Industrial/Instrument_Engineers__Handbook_-_Process_Measurement_and_Analysis/Instrument%20Engineers'%20Handbook%20-%20Process%20Measurement%20and%20Analysis/1083ch1_10.pdf

[2] http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/1985002...


He was a professor at Berkeley for 29 years.

He's cashed out a number of successful businesses, so it's not likely that he needs money.

He's in his 70s. I would guess he still likes to teach and do research, but maybe not as a full-time professor.

Adjuncts don't have to go to committee meetings. :-)


Yup. In the case of people like Stonebraker -- or Butler Lampson, who's also officially adjunct at MIT -- Think of it as: You get an office, a community, resources with which to do research (and a framework within which to ask for grant/funding if you want), opportunities to advise students, the ability to teach classes when you want. In exchange, you don't get much money -- but also almost zero responsibilities unless you choose to assume them. (source: I'm a CS professor and I got my Ph.D. at MIT while both Stonebraker and Lampson were there.)


Adjunct definitely means something different at MIT. For instance, departments are only allowed to hire adjuncts up to 5% of the total normal faculty members of their department[1]. MIT EECS only has six [2]. In effect, this means that adjunct positions are only for really, really qualified people -- they're not there to fill out the teaching staff but to augment the experience of the rest of the faculty.

Adjunct professors can also supervise research, which I believe is uncommon at other institutions.

[1] http://web.mit.edu/policies/2/2.3.html#sub2

[2] Four listed as "adjuncts" and four "professors of the practice", which are equivalent per [1]: https://www.eecs.mit.edu/people/faculty-advisors


Most likely the latter. He was a full professor at Berkeley before that.


It means he doesn't have the commitments that come with being a full-time faculty member but that he can teach classes.


Absolutely. The man is outstandingly influential.


To know more about Professor Stonebraker, I can't recommend this excellent interview[0] by se-radio enough. It's easily one of the best interviews that I've heard on a podcast. Do check it out.

[0] - http://www.se-radio.net/2013/12/episode-199-michael-stonebra...



His newest venture, Tamr, is also really interesting. Unifying data with machine learning and human input. I recommend keeping an eye on this one.

www.tamr.com


http://webcache.googleusercontent.com/search?q=cache:SijAcDv...

Google cache in case you have trouble reaching the website (apparently under some understandably heavy load).


Stonebraker was tech advisor starting in 2001 for Addamark/Sensage, which developed a column-oriented/columnar DB for log aggregation/analysis for security/operations. Stonebraker's own C-Store and Vertica came later and were more fully featured. While Sensage's product was integrated into some HP offerings, HP unfortunately (my perspective) chose to buy rival Arcsight and then Vertica.

See discussion and comment from Adam Sah at http://radar.oreilly.com/2008/11/the-commoditization-of-mass... for more context for column-storage and log analysis.

edit: link to article.


Thx for the mention.

Mike was my thesis advisor at Cal, and had enormous influence on all sorts of things beyond databases, including (I believe) the founding of the CS department and the negotiation of how Ingres technology was spun off from Cal (which owns the IP), which became the prototype for how others would create companies like Inktomi and many more.


newsoffice.mit.edu is not enjoying all this traffic.

Article text: http://pastebin.com/MTdagueN



That's the first Turing winner I've met. Thanks to philg for bringing Professor Stonebraker in for his 3-day database course.

http://philip.greenspun.com/teaching/rdbms-iap-2015


His book "Readings in Database Systems" 4th edition http://www.amazon.com/Readings-Database-Systems-Joseph-Helle...


Keep in mind this book is really just a large collection of core papers. If you are looking for something more structured as a how-to of developing DBs this is useful as a reference but not the best introduction.


Could you recommend something more appropriate for someone who would like to explore the world of database implementations?


For an overview of basic concepts in DB implementation, I quite like Database Systems: The Complete Book (By Widom I think...). You could do a lot worse than An Introduction to Database Systems (CJ Date), although many dislike his opinionated style :-).

If you're talking about actually implementing a full transactional database system, strong foundational books are:

Transaction Processing: Concepts and Techniques (Gray and Reuter)

Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery (Vossen and Weikum)

Neither are exactly easy reading, but the concepts therein are really important.


+1 on all of those books. If you are interested in multi-dimensional indices (R-Trees, M-Trees, and many more exotic ones), you should check out "Foundations of Multidimensional and Metric Data Structures" by Hanan Samet. Very comprehensive! "Database Systems: The Complete Book" is fantastic (pick up the previous version, it is cheaper!) but it only touches on multidimensional indexing.

Also if you are interested in B-Trees start with "The Ubiquitous B-Tree" by Comer. [ http://dx.doi.org/10.1145/356770.356776 ].


"Architecture of Database Systems" by Hellerstein, Stonebraker, and Hamilton is a pretty good starting place: http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf


I like Silberschatz et al. Pretty rigorous but practical:

http://codex.cs.yale.edu/avi/db-book/db6/slide-dir/


This course reading contains a lot of information too: http://www.cs286.net/home/reading-list


The link can not be opened. Any mistake?



He also contributed 2 modules to the recent "Tackling the Challenges of Big Data" online course from MITx. Among other things, he did a very lucid roundup of legacy vs modern db systems.


Though most articles mention his open source contributions, the Wikipedia page says "PostgreSQL evolved from the Ingres project at the University of California, Berkeley. In 1982 the leader of the Ingres team, Michael Stonebraker, left Berkeley to make a proprietary version of Ingres"


I think I watched a talk by Michael a few years back about the basics of columnar databases but now I can't find it. I recognized the names of the companies he founded from the article. At least I think it was him that gave the talk! Does anyone know what I'm talking about?



He is very, very good. I took his undergrad database class a long time ago. His class was one of my favorites. What he taught I can still use today. I've just done a merge-join thing recently based on what I remembered from his class.
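
For anyone who hasn't run into it, a merge join can be sketched roughly like this. It's a minimal illustration in Python, not code from the class or from any real database engine; the name `merge_join`, the tuple layout (join key in the first field), and the sample rows are all assumptions made up for the example. Both inputs are assumed to be sorted on the key already.

```python
from typing import Any, Iterator, List, Tuple

Row = Tuple[Any, ...]


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[Row, Row]]:
    """Yield (left_row, right_row) pairs whose first field (the join key) matches.

    Assumes both inputs are already sorted by that first field.
    """
    i, j = 0, 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1  # left key is behind, advance left
        elif lkey > rkey:
            j += 1  # right key is behind, advance right
        else:
            # Find the end of the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Pair every left row carrying this key with the whole right run.
            while i < len(left) and left[i][0] == lkey:
                for k in range(j, j_end):
                    yield left[i], right[k]
                i += 1
            j = j_end


# Hypothetical sample data, sorted by the key in position 0.
users = [(1, "ann"), (2, "bob"), (4, "dee")]
orders = [(1, "book"), (2, "pen"), (2, "ink"), (3, "cup")]
pairs = list(merge_join(users, orders))
```

Since each input is walked once (apart from re-reading runs of duplicate keys), this tends to be attractive when the data is already sorted or indexed on the join key.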


Question about the Turing Award, something I wonder about whenever this comes up: how did Claude Shannon never win it?


Perhaps information theory was viewed as more EE than CS?


If he's only an adjunct, I wonder what you need to do to get full professorship at MIT.


The desire to put in that sort of work to the exclusion of building things and companies, it would appear in his case. He was a full professor at Berkeley. He's also getting old, past the normal age for an MIT professor to become Emeritus.


It's his choice to be an adjunct. Butler Lampson is one too.


well deserved. congratulations to Mr. Stonebraker

(among other things I was part of one of the first early adopter teams that used his Streambase product, while at Orbitz.)


He's the creator of postgres. Enough said!


The link is broken, change https to http.


Thanks. Fixed.



