Hacker News | topaz0's comments

For some people the relevant properties of "thing" include not needing overpowered hardware to run it comfortably. So "thing" does not just "exist", at least not in the form of Electron.

I dunno, I think the boosters show their hurt feelings here more often than the detractors do.

Lmfao. The front page is littered with whining about the craft from people who can’t argue coherently why I should go back to getting yelled at by a linter.

It’s all “I can’t think anymore” or “software bad now” followed by a critique of the industry circa 2015.

Most of the people making cool stuff with LLMs are making it, not writing blog posts hoping to be a thought leader.


If telling something else to make something for you is a craft, I'm an artisan for hiring a webdev to build my site.

Reading is fundamental

I think your intuition comes from the assumption that the experimental subjects are already coming to you in a random order. If that's the case, then you might as well assign the first half to control and the second half to treatment. To see the problem with poor randomization, you have to think about situations where there are (often unknown) biases or correlations in the order of the list that you're drawing from to randomize. Say you have an ordered list of 10 numbers, assigned 5 and 5 to control and (null) treatment groups. There are 252 assignments, which in theory should be equally likely. Assuming they all give different values of your statistic, you'll have 12 assignments with p <= .0476. If, say, you do the assignment from ~~a 256~~ an 8 bit random number such that 4 of the possible assignments are twice as likely as the others under your randomization procedure, the probability of getting one of those 12 assignments is something between .0469 and .0625, depending on whether the more-likely assignments happen to be among the 12 most extreme statistics, which is a difference of about 1/3 and could easily be the difference between "p>.05" and "p<.05". Again, if you start with your numbers in a random order, then this doesn't matter -- the biased assignment procedure will still give you a random assignment, because each initial number will be equally likely to be among the over-sampled or under-sampled ones.

Also worth noting that the situations where this matters are usually where your effect size is fairly small compared to the unexplained variation, so a few percent error in your p-value can make a difference.
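The arithmetic in the 10-number example can be checked with a short sketch. It assumes the biased procedure is an 8 bit number (256 values) mapped onto the 252 assignments so that 4 assignments receive two values each; the names below are made up for illustration.

```python
from fractions import Fraction
from math import comb

n_assignments = comb(10, 5)   # 252 ways to split 10 numbers into 5 and 5
n_extreme = 12                # the 12 most extreme assignments

# With equally likely assignments: 12/252, about 0.0476.
ideal = Fraction(n_extreme, n_assignments)

# Assumed biased procedure: 256 equally likely values spread over 252
# assignments, so 4 assignments get two values and the rest get one.
n_values = 256
n_doubled = n_values - n_assignments   # 4


def extreme_probability(doubled_in_extreme):
    """Chance of an extreme assignment when `doubled_in_extreme` of the
    doubled assignments are among the 12 extreme ones."""
    return Fraction(n_extreme + doubled_in_extreme, n_values)


low = extreme_probability(0)            # 12/256 = 0.046875
high = extreme_probability(n_doubled)   # 16/256 = 0.0625

print(float(ideal), float(low), float(high), high / low)
```

The ratio `high / low` is 4/3, which is the "about 1/3" difference, and the two bounds sit on either side of .05.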


> If, say, you do the assignment from a 256 bit random number such that 4 of the possible assignments are twice as likely as the others under your randomization procedure

Your numbers don't make sense. Your number of assignments is way fewer than 2^256, so the problem the author is (mistakenly) concerned about doesn't arise--no sane method would result in any measurable deviation from equiprobable, certainly not "twice as likely".

With a larger number of turkeys and thus assignments, the author is correct that some assignments must be impossible by a counting argument. They are incorrect that it matters--as long as the process of winnowing our set to 2^256 candidates isn't measurably biased (i.e., correlated with turkey weight ex television effects), it changes nothing. There is no difference between discarding a possible assignment because the CSPRNG algorithm choice excludes it (as we do for all but 2^256) and discarding it because the seed excludes it (as we do for all but one), as long as both processes are unbiased.


typo -- meant to say 8 bit random number i.e. having 256 possibilities, convenient just because the number of assignments was close to a power of 2. If instead you use a 248-sided die and have equal probabilities for all but 4 of the assignments, the result is similar but in the other direction. Of course there are many other more subtle ways that your distribution over assignments could go wrong, I was just picking one that was easy to analyze.
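The 248-sided die variant can be sketched the same way. This assumes the 4 left-over assignments get probability zero and the other 248 get one face each; the names are hypothetical.

```python
from fractions import Fraction
from math import comb

n_assignments = comb(10, 5)               # 252
n_extreme = 12                            # the 12 most extreme assignments
n_sides = 248
n_unreachable = n_assignments - n_sides   # 4 assignments can never come up


def extreme_probability(unreachable_in_extreme):
    """Chance of an extreme assignment when `unreachable_in_extreme` of the
    impossible assignments are among the 12 extreme ones."""
    return Fraction(n_extreme - unreachable_in_extreme, n_sides)


high = extreme_probability(0)              # 12/248, about 0.048
low = extreme_probability(n_unreachable)   # 8/248, about 0.032

print(float(low), float(high))
```

Here the error mostly pushes the probability of an extreme assignment below the nominal 12/252, rather than above it.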

Ah, then I see where you got 4 assignments and 2x probability. Then I think that is the problem the author was worried about and that it would be a real concern with those numbers, but that the much smaller number of possibilities in your example causes incorrect intuition for the 2^256-possibility case.

I think the intuition that everything will be fine in the 256 bit vs 300 bit case depends on the intuition that the assignments that you're missing will be (~close to) randomly distributed, but it's far from clear to me that you can depend on that to be true in general without carefully analyzing your procedure and how it interacts with the PRNG.

If you can find a case where this matters, then you've found a practical way to distinguish a CSPRNG seeded with true randomness from a stream of all true randomness. The cryptographers would consider that a weakness in the CSPRNG algorithm, which for the usual choices would be headline news. I don't think it's possible to prove that no such structure exists, but the world's top (unclassified) cryptographers have tried and failed to find it.

And worth noting that the "even when properly seeded with 256 bits of entropy" example in the article was intended as an extreme case, i.e. that many researchers in fact use seeds that are much less random than that.

We're talking about randomizing experiment participants into treatment and control group... What you're saying is equivalent to "I can use the same order of assignments for every experiment I do"...

I thought we were speaking more generally. In the specific case of assigning experimental subjects to groups at the time the experiment is performed, you'd want a "no tricks up my sleeve" number for the key, such as a hash of the date string in ISO standard format (i.e. yyyy-mm-dd). Your RNG is then AES_ECB( key=sha3(date_string), plaintext=serial_number ) using the serial number of the participant.

If you need to use a rejection method to achieve a uniform distribution you can do so via plaintext=( ( serial_number << 32 ) | sample_counter ).

By adhering to such a scheme it becomes extremely difficult for anyone to reasonably accuse you of underhanded tricks via RNG manipulation.
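A rough sketch of that scheme, using only the Python standard library. Since the standard library has no AES, this swaps in HMAC-SHA256 as the keyed function in place of AES_ECB, which is a substitution and not the construction described above. The function names, the 64-bit output width, and the per-participant coin flip in `assign_group` are all assumptions for illustration.

```python
import hashlib
import hmac


def date_key(date_string: str) -> bytes:
    """Key derived from an ISO date string such as '2024-01-15'."""
    return hashlib.sha3_256(date_string.encode("ascii")).digest()


def keyed_value(key: bytes, serial_number: int, sample_counter: int) -> int:
    """64-bit value for (serial_number, sample_counter).

    The input block is (serial_number << 32) | sample_counter, as in the
    comment; HMAC-SHA256 stands in for AES_ECB here.
    """
    if not 0 <= sample_counter < 2**32:
        raise ValueError("sample_counter must fit in 32 bits")
    if not 0 <= serial_number < 2**96:
        raise ValueError("serial_number must fit in 96 bits")
    block = ((serial_number << 32) | sample_counter).to_bytes(16, "big")
    digest = hmac.new(key, block, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def uniform_below(key: bytes, serial_number: int, n: int) -> int:
    """Integer in [0, n), using rejection to avoid modulo bias."""
    limit = (2**64 // n) * n   # largest multiple of n not exceeding 2**64
    counter = 0
    while True:
        value = keyed_value(key, serial_number, counter)
        if value < limit:
            return value % n
        counter += 1   # rejected: try the next sample for this participant


def assign_group(date_string: str, serial_number: int) -> str:
    """Hypothetical use: an independent coin flip per participant."""
    key = date_key(date_string)
    return "treatment" if uniform_below(key, serial_number, 2) else "control"


print(assign_group("2024-01-15", 7))
```

Anyone who knows the date and the serial number can recompute the assignment. Note that independent coin flips do not give equal group sizes; a balanced design would need a different use of `uniform_below`, such as a keyed shuffle.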


> if object X models object Y, then I’m going to say that X is Y

If you haven't read to the end of the post, you might be interested in the philosophical discussion it builds to. The idea there, which I subscribe to, is not quite the same as what you are saying, but related in a way, namely, that in the case that X models Y, the mathematician is only concerned with the structure that is isomorphic between them. But on the other hand, I think following "therefore X is Y" to its logical conclusion will lead you to commit to things you don't really believe.


> But on the other hand, I think following "therefore X is Y" to its logical conclusion will lead you to commit to things you don't really believe.

I would love to hear an example… but before you do, I’m going to clarify that my statement was expressing a notion of what “is” sometimes means to a mathematician, and caution that

1. This notion is contextual, that sometimes we use the word “is” differently, and

2. It requires an understanding of “forgetfulness”.

So if I say that “Cauchy sequences in Q is R” and “Dedekind cuts is R”, you have to forget the structure not implied by R. In a set-theoretic sense, the two constructions are unequal, because you constructed different sets.

I think this weird notion of “is” is the only sane way to talk about math. YMMV.


I think the problem with insisting on using "is" that way is that you then can't distinguish between two things you might reasonably want to express, i.e. "is isomorphic to"/"has the same structure as" and "refers to the same object". I totally agree that math is all about forgetting about the features of your objects that are not relevant to your problem (and in particular as the post argues things like R and C do not refer to any concrete construction but rather to their common structure), but if you want to describe that position you have to be able to distinguish between equality and isomorphism.

(Of course using "is" that way in informal discussion among mathematicians is fine -- in that case everyone is on the same page about what you mean by it usually)


> I think the problem with insisting on using "is" that way is that you then can't distinguish between two things you might reasonably want to express, i.e. "is isomorphic to"/"has the same structure as" and "refers to the same object".

It’s reasonable to want to express that difference in specific circumstances, but it would be completely unreasonable to make this the default.

For example, I can say that Z is a subset of Q, and Q is a subset of R. I can do this, but maybe you cannot—you’ve expressed a preference for a more rigid and inflexible terminology, and I don’t think you’re prepared to deal with the consequences.


Most commenters are talking about the first part of the post, which lays out how you might construct the complex numbers if you're interested in different properties of them. I think the last bit is the real interesting substance, which is about how to think about things like this in general (namely through structuralism), and why the observations of the first half should not be taken as an argument against structuralism. Very interesting and well written.

It is very reassuring to know that, on a post where I can essentially not even speak the language (despite a masters in engineering), HN is still just discussing the first paragraph of the post.

Maybe the bottom ~1/3, starting at "The complex field as a problem for singular terms", would be helpful to you. It gives a philosophical view of what we mean when we talk about things like the complex numbers, grounded in mathematical practice.

Notably, the real numbers are not symmetrical in this way: there are two square roots of 1, but one of them is equal to it and the other is not. (positive) 1 is special because it's the multiplicative identity, whereas i (and -i) have no distinguishing features: it doesn't matter which one you call i and which one you call -i: if you define j = -i, you'll find that anything you can say about i can also be shown to be true about j. That doesn't mean they're equal, just that they don't have any mathematical properties that let you say which one is which.
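A small numerical illustration of that symmetry, as a sketch only. It checks on a grid of sample values that swapping i and -i (complex conjugation) preserves sums and products, while the analogous swap of 1 and -1 on the reals (negation) does not preserve products. The sample grid and names are arbitrary choices.

```python
i = complex(0, 1)
j = -i   # the "other" square root of -1


def swap(z):
    """Exchange the roles of i and -i (complex conjugation)."""
    return z.conjugate()


# Small integer-valued samples, so the floating-point arithmetic is exact.
samples = [complex(a, b) for a in range(-3, 4) for b in range(-3, 4)]
reals = list(range(-3, 4))

both_square_to_minus_one = (i * i == -1) and (j * j == -1)
sums_preserved = all(
    swap(z + w) == swap(z) + swap(w) for z in samples for w in samples
)
products_preserved = all(
    swap(z * w) == swap(z) * swap(w) for z in samples for w in samples
)

# The analogous swap on the reals, x -> -x, exchanges 1 and -1 but breaks
# multiplication: (-x) * (-y) equals x * y, not -(x * y).
negation_preserves_products = all(
    (-x) * (-y) == -(x * y) for x in reals for y in reals
)

print(both_square_to_minus_one, sums_preserved,
      products_preserved, negation_preserves_products)
```

A finite check like this is not a proof, but it shows the contrast: nothing in the arithmetic tells i from j, whereas 1 and -1 are told apart by multiplication.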

Every father is a son to somebody...

Unfortunately, tech journalists' judgement of source credibility doesn't have a very good track record.

