What you actually want in this context is some code that generates random deviates of probability distributions chosen randomly and a "guesser agent" that tries to guess which distribution was chosen. Then you can ask questions like,
> given some condition on a distribution of distributions, when do we feel that a guesser is taking too long to make a choice?
This is like a person who is taking too long to identify a color, or a baby deciding what kind of food it wants while we wait. For a certain interval the delay makes sense, but after a point it becomes pathological.
So for example if we have two distributions,
> uniform distribution on the unit interval [0,1]; uniform distribution on the interval [1,2]
then we get impatient with a guesser who takes longer than a single guess, since we know (with probability 1) that a single guess will do.
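A minimal sketch of the generator/guesser setup for this disjoint case (all names here are made up for illustration):

```python
import random

def choose_and_sample(n):
    """Secretly pick one of the two uniform distributions, then draw n deviates."""
    which = random.choice(["U[0,1]", "U[1,2]"])
    lo = 0.0 if which == "U[0,1]" else 1.0
    return which, [random.uniform(lo, lo + 1.0) for _ in range(n)]

def guess(observations):
    """The supports are disjoint (up to the measure-zero point x = 1),
    so a single observation decides the answer."""
    return "U[0,1]" if observations[0] < 1.0 else "U[1,2]"

truth, obs = choose_and_sample(1)
assert guess(obs) == truth  # correct with probability 1 after one draw
```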
Now, if we have two distributions that overlap, say the uniform distributions on [1,3] and [0,2], then we know (with probability 1) that the choice will eventually become certain, and we can quantify how long that is likely to take, but we can't put a fixed bound on how many observations any guesser will need. As soon as an observation leaves the interval (1,2), the guesser can state the answer with certainty.
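The waiting time for this pair can be quantified: under either distribution, each draw lands in the overlap [1,2] with probability 1/2, so the number of draws until a decisive observation is geometric with mean 2. A quick simulation sketch (not tied to any particular guesser):

```python
import random

def draws_until_decisive(max_draws=10_000):
    """Sample from U[1,3] (its overlap with U[0,2] is [1,2]); count draws
    until one falls outside (1, 2) and so identifies the distribution."""
    for k in range(1, max_draws + 1):
        x = random.uniform(1.0, 3.0)
        if not (1.0 < x < 2.0):
            return k
    return max_draws

random.seed(0)
trials = [draws_until_decisive() for _ in range(100_000)]
mean = sum(trials) / len(trials)
# Geometric with success probability 1/2, so the expected value is 2.
assert abs(mean - 2.0) < 0.05
```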
Now, things can get more interesting when the distributions are arranged in a hierarchy, say the uniform distributions on finite disjoint unions of intervals (a, b), where a < b are dyadic rationals with the same denominator when written in lowest terms.
If a guesser is forced to guess early, before becoming certain of the result, then we can compare ways to guess by computing how often they get the right answer.
Observations now give two types of information: some distributions can be eliminated with complete confidence (because there exists a positive epsilon such that the epsilon ball around an observation has probability zero under them), while for the others, Bayes' theorem can be used to update a distribution over distributions (or several such distributions) that drives a guessing algorithm. A guess is a statement of the form "all observations are taken from the uniform distribution on subset ___ of the unit interval".
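A sketch of both information channels, hard elimination by support and a Bayes update over the survivors, for a hypothetical finite family of uniform distributions (the names and intervals are invented for illustration):

```python
# A hypothetical finite family: uniform distributions on dyadic intervals.
FAMILY = {"U[0,1/2]": (0.0, 0.5), "U[1/4,3/4]": (0.25, 0.75), "U[1/2,1]": (0.5, 1.0)}

def update(posterior, x):
    """Eliminate distributions whose support excludes x (their likelihood is
    exactly zero), then apply Bayes' theorem to the rest."""
    new = {}
    for name, p in posterior.items():
        a, b = FAMILY[name]
        density = 1.0 / (b - a) if a <= x <= b else 0.0  # uniform pdf
        new[name] = p * density
    total = sum(new.values())
    return {name: w / total for name, w in new.items()}

posterior = {name: 1 / len(FAMILY) for name in FAMILY}  # uniform prior
for x in [0.3, 0.6]:
    posterior = update(posterior, x)
# 0.3 eliminates U[1/2,1]; 0.6 then eliminates U[0,1/2],
# leaving all posterior mass on U[1/4,3/4].
assert abs(posterior["U[1/4,3/4]"] - 1.0) < 1e-9
```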
Example: take the distributions on the unit interval given by the probability density functions 2x and 2 - 2x. Given a sequence of observations, we can ask: what is the probability that the first distribution was chosen?
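For this example the posterior has a closed form under a 50/50 prior: the likelihoods are the products of 2x_i and of 2 - 2x_i, and the factors of 2 cancel. A short sketch:

```python
from math import prod

def posterior_first(observations):
    """P(the density 2x was chosen | observations), under a 50/50 prior.
    The likelihood ratio reduces to prod(x_i) : prod(1 - x_i)."""
    a = prod(x for x in observations)
    b = prod(1.0 - x for x in observations)
    return a / (a + b)

# A single observation at x = 0.8 favors the density 2x:
assert abs(posterior_first([0.8]) - 0.8) < 1e-12
# An observation at exactly 1/2 is uninformative:
assert abs(posterior_first([0.5]) - 0.5) < 1e-12
```

Note that for a single observation x the posterior is simply x, since x/(x + (1 - x)) = x.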
The answers to these questions can be found in a book like Durrett's Probability: Theory and Examples.
How do people feel about a Zuckerberg presidency? As president, is Zuckerberg going to "fight the hate" in America? Is Bloomberg asking Zuckerberg to outline such an agenda?
> During a daylong brainstorming session, the group came up with a meme that subtly mocks people who blame minorities for the mundane frustrations of daily life, such as packed subway cars.
It seems that Bloomberg approves of hate speech in the form of mockery when it is directed at "people who blame minorities". It sounds like Bloomberg wants Zuckerberg to channel hate speech, not stamp it out.
What do you think? Is Zuckerberg going to take action against Baldauf for launching a hate campaign?
I doubt it.
Mockery is clearly a form of hatred, and Bloomberg wants Zuckerberg to permit this form of hatred while claiming to advocate a policy of retaliation against hate speech.
This is very misleading.
> Their mission was to come up with a social media campaign that might make Germans less susceptible to the wave of fake news and right-wing propaganda scapegoating Europe’s growing population of immigrants and refugees.
It appears that this is little more than an attempt to subvert German nationalism and German ethnic identity.
> you can hate people as long as you hate people I hate
Is it any wonder that the two - Bloomberg and Zuckerberg - find common cause in directing hatred toward ethnic Germans?
Can you think of any reason why this might be?
Well I can: the holocaust. I think these two want to create a culture of hatred directed at ethnic Germans, all while claiming to be against hatred in principle.
What do you think? Is this kind of misrepresentation acceptable? Or do you find it as repugnant as I do?
I wish HN would ban comments like the parent. It's such an old tactic for spreading hate, I don't need to describe it. It adds nothing to HN - the pseudo-rational is just old propaganda - and is in fact a great detriment. I can speak only for myself, but please take your hate elsewhere.
Yeah I get this feeling too, though I'm not sure it's motivated by the Holocaust (has Zuckerberg made a big deal of being Jewish?). At least not directly. So many of my left-leaning friends share these same ideas and rhetoric, even if they're against Israel.
Look at all the backlash AfD got for proposing benefits to families having children ... if they'd been citizens for at least 5 years. Things are very one sided, and even questioning that gets you labeled all sorts of things.
A FB presidency sounds extremely dystopian. Fortunately Zuck seems really unlikeable, and hopefully will not overcome that. OTOH FB is so powerful, that might not matter as much as it used to.
Not sure what can be done. Perhaps if Trump is ineffective enough Americans will get a real right wing leader?
> reproducibility is difficult in science generally but can be insanely difficult for machine learning
it is a computer algorithm, so by definition it is trivial to reproduce results, you just run the program again.
> I ran into this recently by accident when writing a simple RL example. With two weight matrices to learn, the first weight matrix was given correct gradients, the last weight matrix was only supplied with partial information. Surprise, still works, and I only discovered the bug _after_ submitting to OpenAI's Gym with quite reasonable results.
So you want to say that you coded a bug, but you don't have a method of testing whether you have a bug. Then you can't show that you coded a bug. And if you can't show that you coded a bug, you can't reproduce it.
So yes, reproducing a bug is difficult when you have no means of determining whether or not you have a bug...
...maybe you should look into choosing a means of determining whether or not you have a bug.
> it is a computer algorithm, so by definition it is trivial to reproduce results, you just run the program again.
Not really. Deep learning is still quite a lot of dark voodoo where random initialization and data shuffling can matter significantly. People also adapt hyperparameters manually during training, stop early with no clear metric, and don't share their code for preprocessing the data or even the exact architecture of the network.
It's certainly better than in other fields, but it's not trivial.
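As a toy illustration of why even "just run the program again" needs care: unless every source of randomness is seeded, two runs of the same code diverge. A sketch in plain Python (a real deep-learning stack would also have to pin framework and GPU nondeterminism, which this does not show):

```python
import random

def train_run(seed):
    """Stand-in for a training run whose result depends on random
    initialization and data shuffling."""
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, 1.0) for _ in range(4)]  # "initialization"
    data = list(range(10))
    rng.shuffle(data)                                   # "data shuffling"
    return sum(w * x for w, x in zip(weights, data))

# Same seed: bitwise-identical result. Different seed: generally not.
assert train_run(42) == train_run(42)
assert train_run(42) != train_run(43)
```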
This actually isn't true; you can invent organizations of the imagination and mirror those. This "imagination driven programming" is actually quite dangerous and tends to devolve into state secrets and so on since you are solving the problems created by people who "don't exist" but may in fact have lives that are SURPRISINGLY SIMILAR to the lives of people who are doing jobs that are better kept secret.
So you can get around the law...but only by risking the integration of something that should be kept secret into the organizational structure...which makes its way into your system. Most people aren't willing to cross that threshold.
So Conway's law is true for most people, just not all.
If you want to think about it from the adversarial point of view, you can say that all programs are designed to transform or destroy organizations; programs mirror organizational structures because people want to determine the "resonant frequency" of an organization and understand its "social vulnerabilities" in much the same way a physical structure has structural vulnerabilities.
One person or an AI can do such "imagination driven programming" to mirror any organisation. So I guess an organisation of n people can also mirror any organization they want, if that organization is more complex than theirs. An organization can also use some kind of obfuscator to change its program structure so that it does not mirror their organization, or so that it mirrors a less complex organization.
I don't understand how this is dangerous. Can you please give an example? I guess if you mimic a more complex organization it may be dangerous.
You have to say which set of programs you will run, and then you have to pick how you will describe a set of programs (since the name of a set of programs is just data).
Wikipedia lets malicious editors enter lies...these lies can be very difficult to detect, so you need smart people to see them. The hope is that because it is text, people know how to read and can use critical thinking...computers, on the other hand, do not do this with code; they apply a set of rules that governs what can and cannot be executed; in other words, the input either is or is not in the set of programs they will run.
Now, you could allow each article to name a way of translating between...and all of a sudden we are kicking a hornet's nest of code-breakers :D :D :D
So, the short answer is...YOU COULD DO IT...but why would you want to attract the attention of people whose JOB it is to figure out breakthroughs in code-breaking?
This gets into ZFC set theory and Tower of Babel and intelligence collection stuff. Essentially the problem is that if we want to define the boundary of what we will execute, we wind up getting marginally closer to state secrets.
And then, before you know it, somebody posted the code to crack the data-link to a drone in (redacted foreign country) and this caused the deaths of ___ soldiers.
So, you have to spend some time defining the territory before you let people wander aimlessly into danger.
And the activity looks very similar to a neuron firing! This is, in a sense, THE BASIC LEARNING UNIT, a.k.a. the neuron. If you think about the way water flows through a river (the water KEEPS ERODING THE RIVER! How can this possibly work?!) or the way a neuron counts the "votes" from other neurons like an automatic electronic computer, you can see similarities in scale!
Identity-fixing monotone maps of cancellative monoids (monotone in the prefix relation, a partial preorder). It is possible to use them as a model of abstract sequential computation, so they can be used for applications such as mathematical models of compilers, parsers, ...
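A toy instance, assuming "identity-fixing" means the identity element (the empty word) maps to itself: the free monoid of strings under concatenation is cancellative, and any character-at-a-time transducer is an identity-fixing map that is monotone in the prefix relation. That is one way such maps model sequential computation, e.g. a parser or compiler pass consuming input incrementally:

```python
def transduce(s):
    """A toy 'compiler pass' on the free monoid of strings under
    concatenation: it maps each character independently, so the empty
    word maps to itself (identity-fixing), and u being a prefix of v
    implies transduce(u) is a prefix of transduce(v) (prefix-monotone)."""
    return "".join(c.upper() * 2 for c in s)

# Identity-fixing: the monoid identity (the empty word) maps to itself.
assert transduce("") == ""
# Prefix-monotone: extending the input only extends the output.
assert transduce("abc").startswith(transduce("ab"))
```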
I would like to propose Wikipedia-style edit history and comment-deletion milestones for the HN comment system, and in addition a comment redaction facility that works like the redaction of classified documents.