Hacker News | clausecker's comments

You can do it like this, assuming A is the mask of newlines and B is the mask of non-spaces.

1. Compute M1 = ~A & ~B, which is the mask of all spaces that are not newlines.

2. Compute M2 = M1 + (A << 1) + 1. The addition carries through each run of leading spaces, so M2 has a bit set at the first non-space or newline after each newline, plus additional junk bits from the remaining spaces in each line.

3. Compute M3 = M2 & ~M1, which removes the junk bits, leaving only the first match in each section.

Here is what it looks like:

    10010000 = A
    01100110 = B
    00001001 = M1 = ~A & ~B
    00101010 = M2 = M1 + (A << 1) + 1
    00100010 = M3 = M2 & ~M1

Note that this code treats newlines as non-spaces, meaning if a line comprises only spaces, the terminating NL character is returned. You can have it treat newlines as spaces (meaning a line of all spaces is not a match) by computing M4 = M3 & ~A.
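
The steps above can be sketched in C as follows. This is only an illustration on 8-bit masks that matches the worked example; the function names are made up here, and a real implementation would presumably run on wider masks produced by SIMD comparisons.

```c
#include <stdint.h>

/* Bit i of each mask describes character i of an 8-character block.
 * Bit 0 (the rightmost digit in the listing above) is the first character.
 *   nl       = A, mask of newlines
 *   nonspace = B, mask of non-space characters
 * Returns M3: one set bit per line, at its first non-space or newline.
 * Function names are hypothetical, chosen for this sketch. */
uint8_t first_nonspace_mask(uint8_t nl, uint8_t nonspace)
{
    /* M1: spaces that are not newlines */
    uint8_t m1 = (uint8_t)(~nl & ~nonspace);
    /* M2: the carry ripples through each run of leading spaces */
    uint8_t m2 = (uint8_t)(m1 + (uint8_t)(nl << 1) + 1);
    /* M3: drop the leftover space bits */
    return (uint8_t)(m2 & ~m1);
}

/* M4 = M3 & ~A: treat newlines as spaces, so a line consisting only of
 * spaces produces no match. */
uint8_t first_nonspace_mask_skip_blank(uint8_t nl, uint8_t nonspace)
{
    return (uint8_t)(first_nonspace_mask(nl, nonspace) & ~nl);
}
```

Calling `first_nonspace_mask(0x90, 0x66)` with the values of A and B from the listing gives 0x22, which is the M3 row.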


Thanks for the suggestion. Indeed, that would be the first approach, and it is how I started. However, it does not consider the state (am I inside a 'statement' or not).

A statement here means the string from the first non-space until the next EOL or EOF.

The problem starts when you need to cover the "corner cases". Without the corner cases, the algorithm is not really an algorithm.


Do you mind trying out your solution? The code is in https://github.com/zokrezyl/yaal-cpp-poc Thanks a lot!

Obviously, if your solution gets close to the memory bandwidth limit, we will proudly mention it!


So I've thought about it and I don't really feel like spending more time to convince you that this works. If you have questions I am happy to answer them, but please write your own code.


It's fine, and thank you! I am playing around with the idea, and in theory all is good. The only thing is that things like "first non ..." often involve branching that defeats the CPU's branch prediction. That is why I kindly invited you to show it in code.


You can find the first set bit in an integer with a single machine instruction; it's completely branch-free. GCC has __builtin_ctz() for this. To get at all the matches, you'll either need to iterate over the set bits (so one branch per set bit) or use a compression instruction (requiring AVX-512) to turn the bit set into a set of integers.

That said, as you seem to actually want to do something with the results, you'll take a branch per match anyway, so I don't see the problem.
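
A minimal sketch of the "iterate over all set bits" variant, assuming GCC or Clang for `__builtin_ctz()`. The function name and the 32-bit mask width are choices made for this illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the index of every set bit of `mask` to `out`, lowest first,
 * and returns how many indices were written.
 * `out` must have room for 32 entries.
 * bit_positions is a hypothetical name used for this sketch. */
size_t bit_positions(uint32_t mask, unsigned out[32])
{
    size_t n = 0;

    while (mask != 0) {                            /* one branch per set bit */
        out[n++] = (unsigned)__builtin_ctz(mask);  /* index of lowest set bit */
        mask &= mask - 1;                          /* clear that bit */
    }

    return n;
}
```

Finding each bit is branch-free; the only branch is the loop condition, taken once per match.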


This does track the state. If you want to track it across multiple vectors of input, you'll need to carry it over manually.
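
One way to carry the state over by hand, sketched under some assumptions: masks are 64 bits wide, and the state is read as a single bit saying whether the next block starts while the search for the first non-space of a line is still pending. That bit replaces the constant + 1 in the M2 step. The struct and function names are invented for this example.

```c
#include <stdint.h>

/* carry is 1 if the next block begins while we are still looking for the
 * first non-space of a line, else 0. Initialise it to 1, because the
 * input begins at the start of a line.
 * line_state and first_nonspace_mask64 are hypothetical names. */
typedef struct {
    uint64_t carry;
} line_state;

uint64_t first_nonspace_mask64(line_state *st, uint64_t nl, uint64_t nonspace)
{
    uint64_t m1 = ~nl & ~nonspace;   /* M1: spaces that are not newlines */

    /* M2 = M1 + (A << 1) + carry-in, recording the carry out of bit 63 */
    uint64_t t  = m1 + (nl << 1);
    uint64_t c1 = t < m1;            /* first addition overflowed */
    uint64_t m2 = t + st->carry;
    uint64_t c2 = m2 < t;            /* second addition overflowed */

    /* The search is still pending in the next block if the carry ran off
     * the end (trailing leading-space run) or the block ended in a newline. */
    st->carry = c1 | c2 | (nl >> 63);

    return m2 & ~m1;                 /* M3 */
}
```

For example, if one block ends with a newline followed by a space and the next block starts with a space followed by a letter, the second call reports bit 1 of the second block as the first non-space.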


We do this with FreeBSD ports, but users don't have to clone the ports tree unless they want to modify ports or compile them with custom options.


Yeah, using it as an actual source repository sounds fine to me. I think the issue comes when users are expected to clone it just for installation.


And the same reason NVRAM was dead on arrival. No affordable dev systems meant that only enterprise software supported it.


That's more "load store architecture" than RISC. And by that measure, S/360 could be considered a RISC.



FreeBSD


ARM already has most stuff required for this on board. Two proprietary extensions are used by Rosetta: one emulates the parity (rarely used) and half-carry (obsolete) flags, which can also be emulated conventionally. The other implements TSO memory ordering, which can either be ignored or implemented with explicit barriers; some other chips apparently have a similar setting.

The other stuff is all present in ARMv8.5 I think.


It's very likely that there is some serious autovectorisation going on behind the scenes.


Best one was when gedit had the option to syntax highlight for a language named “Los.”


Not a bad name, to be honest!


Had the same feeling browsing through the Haskell package collection. Felt like an amalgamation of PhD theses, none of which were maintained after the author got his degree. Every single one a work of art, but most engineered so badly that you would only use them begrudgingly.


My impression of Rust crates is that most are developed because a standardized solution to the problem didn't exist or didn't meet the author's needs, so they built their own. Many are well designed, but were never used by enough people to become truly usable or robust before they were abandoned.

It seems like outside certain problem domains, there isn't any effort to pool resources to keep projects alive. The few I did find were forks of forks where each subsequent maintainer stopped responding to proposed changes.


haskell

