There's something important here in that a public good like Metabrainz would be fine with the AI bots picking up their content -- they're just doing it in a frustratingly inefficient way.
It's a co-ordination problem: Metabrainz assumes good intent from bots, and has to lock down when they violate that trust. The bots have a different model -- they assume that the website is adversarially "hiding" its content. They won't believe a random site when it says "Look, stop hitting our API, you can pick up all of this data in one go, over in this gzipped tar file."
Or better still, this torrent file, where the bots would briefly end up improving the shareability of the data.
A lot of the complaints here about physics have to do with focusing so heavily for decades on string theory (or M-theory) which hasn't produced much in the way of practical results. At some point we have to quit throwing good money after bad and redirect funding towards other lines of inquiry.
I worked for Cycorp for a few years recently. AMA, I guess? I obviously won't give away any secrets (e.g. business partners, finer grained details of how the inference engine works), but I can talk about the company culture, some high level technical things and the interpretation of the project that different people at the company have that makes it seem more viable than you might guess from the outside.
There were some big positives. Everyone there is very smart and depending on your tastes, it can be pretty fun to be in meetings where you try to explain Davidsonian ontology to perplexed business people. I suspect a decent fraction of the technical staff are reading this comment thread. There are also some genuine technical advances (which I wish were more publicly shared) in inference engine architecture or generally stemming from treating symbolic reasoning as a practical engineering project and giving up on things like completeness in favor of being able to get an answer most of the time.
There were also some big negatives, mostly structural ones. Within Cycorp different people have very different pictures of what the ultimate goals of the project are, what true AI is, and how (and whether) Cyc is going to make strides along the path to true AI. The company has been around for a long time and these disagreements never really resolve - they just sort of hang around and affect how different segments of the company work. There's also a very flat organizational structure which makes for a very anarchic and shifting map of who is responsible or accountable for what. And there's a huge disconnect between what the higher ups understand the company and technology to be doing, the projects they actually work on, and the low-level day-to-day work done by programmers and ontologists there.
I was initially pretty skeptical of the continued feasibility of symbolic AI when I went in to interview, but Doug Lenat gave me a pitch that essentially assured me that the project had found a way around many of the concerns I had. In particular, they were doing deep reasoning from common sense principles using heuristics and not just doing the thing Prolog often devolved into where you end up basically writing a logical system to emulate a procedural algorithm to solve problems.
It turns out there's a kind of reality distortion field around the management there, despite their best intentions - partially maintained by the management's own steadfast belief in the idea that what Cyc does is what it ought to be doing, but partially maintained by a layer of people that actively isolate the management from understanding the dirty work that goes into actually making projects work or appear to. So while a certain amount of "common sense" knowledge factors into the reasoning processes, a great amount of Cyc's output at the project level really comes from hand-crafted algorithms implemented either in the inference engine or the ontology.
Also the codebase is the biggest mess I have ever seen by an order of magnitude. I spent some entire days just scrolling through different versions of entire systems that duplicate massive chunks of functionality, written 20 years apart, with no indication of which (if any) still worked or were the preferred way to do things.
It depends on what you're doing... Just for reference, here is a small showcase of the capabilities that I've trained on a 13 billion parameter llama2 fine tune (done with qlora).
One of my all time favorites. Can’t remember where I first read it (Quora?), but it’s currently my top Google hit for “balloon programmer project manager joke”. [0]
============
A man is flying in a hot air balloon and realizes he is lost. He reduces height and spots a man down below. He lowers the balloon further and shouts:
"Excuse me, can you help me? I promised my friend I would meet him half an hour ago, but I don't know where I am."
The man below says, "Yes, you are in a hot air balloon, hovering approximately 30 feet above this field. You are between 40 and 42 degrees North latitude, and between 58 and 60 degrees West longitude."
"You must be a programmer," says the balloonist.
"I am," replies the man. "How did you know?"
"Well," says the balloonist, "everything you have told me is technically correct, but I have no idea what to make of your information, and the fact is I am still lost."
The man below says, "You must be a project manager."
"I am," replies the balloonist, "but how did you know?"
"Well," says the man, "you don't know where you are or where you are going. You have made a promise which you have no idea how to keep, and you expect me to solve your problem. The fact is you are in the exact same position you were in before we met, but now it is somehow my fault."
"As you know people, as you learn about things, you realize that these generalizations we have are, virtually to a generalization, false. Well, except for this one, as it turns out. What you think of Oracle, is even truer than you think it is. There has been no entity in human history with less complexity or nuance to it than Oracle. And I gotta say, as someone who has seen that complexity for my entire life, it's very hard to get used to that idea. It's like, 'surely this is more complicated!' but it's like: Wow, this is really simple! This company is very straightforward, in its defense. This company is about one man, his alter-ego, and what he wants to inflict upon humanity -- that's it! ...Ship mediocrity, inflict misery, lie our asses off, screw our customers, and make a whole shitload of money. Yeah... you talk to Oracle, it's like, 'no, we don't fucking make dreams happen -- we make money!' ...You need to think of Larry Ellison the way you think of a lawnmower. You don't anthropomorphize your lawnmower, the lawnmower just mows the lawn, you stick your hand in there and it'll chop it off, the end. You don't think 'oh, the lawnmower hates me' -- lawnmower doesn't give a shit about you, lawnmower can't hate you. Don't anthropomorphize the lawnmower. Don't fall into that trap about Oracle."
-- Bryan Cantrill
https://www.youtube.com/watch?v=-zRN7XLCRhc
I contracted for a person who built a small but very profitable SaaS app using contractors.
There are some caveats though. The person has a smattering of technical knowledge which prevented him from being completely taken advantage of by contractors. And he was an effective manager (a rarity to be sure). And he knew exactly what he wanted and stayed on top of details of how it was implemented. Although not necessarily a "programmer", he also knew enough about using Git to review/merge changes. And perhaps most importantly he had the financial backing to complete the project.
I've also worked for other people who didn't have these qualities but figured they'd just hire some people to build some stuff without keeping on top of it or in many cases having a clear vision of what they wanted. Predictably these projects didn't turn out well for them.
V8 is a world-class compiler. Compared to Ruby, Python, Erlang, PHP and that ilk it is lightning quick. (PyPy is a good match for it, but it's not mainstream and you can't just use any module.) In terms of performance and popularity, V8/Node stand pretty much alone among dynamic, non-compiled environments.
JavaScript has a lighter-weight feel than Java and the JVM stack. Single threaded with a top-notch event reactor. You can write just a few lines in Node and have a web server serving some JSON.
There are a ton of JavaScript coders. There is something to be said for having a common technology throughout your stack, and your UI is almost certainly in JS.
It's easy to get something simple going quickly, there are tons of libraries, and you're going to need to write Go, Java or .NET to outperform it.
The big downsides, imho: V8 is designed for client-side stuff. Both it and the Dart VM are limited to relatively small memory footprints (think 1GB), which just isn't good for some workloads and problems. Likewise, a lot of server tasks can make good use of real threads, and you're out of luck with Node: you can have child processes and send messages, but if they have to share large amounts of data the marshaling becomes a bit of a bottleneck. If your work is database CRUD, or lightweight, or horizontally scalable, or able to be modeled as streams, then Node isn't that bad. JS feels kind of limited in how certain things are modeled; there isn't much data hiding or abstraction, and the extreme simplicity almost creates more complexity for some things. There are lots of different opinions on basic code behavior: some Node modules do work on import, some need explicit initialization, some alter prototypes, and the distinction seems to be lost on many JS devs. I've burnt hours debugging something because someone reordered the imports and it screwed up the init logic. And you'll completely screw yourself if you don't stay up to date with your dependencies; things change fast and sometimes they change a lot. Both Node and the Dart VM seem really well equipped for tooling-type things that are usually done in Bash, Python or Perl, but that doesn't seem to be as popular.
It's definitely not for everything but it's worth a look. It is really popular.
This looks interesting but if you like it I encourage you to check out https://serverless.com/
It has a very active community, is extremely powerful (because for AWS it is built on AWS CloudFormation), and very robust. I wrote my own framework and eventually moved to Serverless.
Not to mention it supports other languages (although the Python documentation is lacking... Java has good documentation though) and other providers (Azure & Google).
Edit:
For a quick comparison:
Dawson - 4 contributors on GitHub, and almost 650 commits
Serverless - over 200 contributors, and almost 6000 commits
I don't want to be the type of person that discourages innovation. Sometimes you need to innovate because the current state of affairs is broken. But in this case Serverless Framework is not broken... it is very very good.
Footnote: I'm not a contributor but I am a long time user. I don't have any personal stake in the Serverless project.
"Since weak use of Diffie-Hellman is widespread in standards and implementations, it will be many years before the problems go away, even given existing security recommendations and our new findings. In the meantime, other large governments potentially can implement similar attacks, if they haven’t already."
Can someone explain to me why this can't be fixed overnight? I'm no crypto expert, but
"If a client and server are speaking Diffie-Hellman, they first need to agree on a large prime number with a particular form."
why can't you just switch the large prime number and then continue on sending encrypted data?
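For anyone else wondering what "agree on a large prime" looks like mechanically, here is a toy sketch of textbook Diffie-Hellman in Python. The function name and the parameters are mine, purely for illustration; they are not from the article or from any real TLS stack, and the primes are far too small (or too arbitrary) for real use. All it shows is that the prime is a parameter both sides must share, and that the exchange itself works the same with a different one.

```python
import secrets


def dh_shared_secret(p: int, g: int) -> tuple[int, int]:
    """Run one toy Diffie-Hellman exchange over (p, g).

    Returns the key as computed by the client and by the server,
    so the caller can check that both sides agree.
    """
    # Each side picks a private exponent in the range [2, p - 2].
    a = secrets.randbelow(p - 3) + 2  # client's secret
    b = secrets.randbelow(p - 3) + 2  # server's secret

    # Public values sent over the wire.
    A = pow(g, a, p)  # client -> server
    B = pow(g, b, p)  # server -> client

    # Each side combines its own secret with the other's public value.
    client_key = pow(B, a, p)
    server_key = pow(A, b, p)
    return client_key, server_key


# Hypothetical toy parameters. g is not checked to be a generator,
# and neither prime has the "particular form" the quote mentions.
P_OLD, G_OLD = 23, 5
P_NEW, G_NEW = 2**127 - 1, 3  # 2**127 - 1 is a Mersenne prime

for p, g in [(P_OLD, G_OLD), (P_NEW, G_NEW)]:
    client_key, server_key = dh_shared_secret(p, g)
    print(p.bit_length(), "bit prime, keys match:", client_key == server_key)
```

In the sketch, swapping the prime is a one-line change. The quoted passage suggests the hard part is elsewhere: the weak parameters are baked into standards and implementations, and both ends of every connection have to move together.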