Isn't this actually called "implementation-defined" behaviour in the standard rather than "undefined" behaviour? I generally get a segfault for undefined behaviors.
It's undefined. That means the compiler and runtime can choose to segfault, do something that looks right, silently corrupt your program, launch the nuclear missiles, whatever. And there's no requirement that this behavior be repeatable, so the code can work with debug flags turned on but fail in a release build.
You can never rely on your compiler's implementation of undefined behavior.
> You can never rely on your compiler's implementation of undefined behavior.
I wish I could somehow get my coworker to understand this. He's mostly "Yeah, well, I know it's undefined, but there's really no way this could be anything else than <foo>. I know what the CPU does there."
I believe this is undefined because the standard doesn't specify what happens between sequence points, not because "a sequence point is implementation defined". I'm going from second-hand knowledge and it's been years since I looked at C (and never past a high school level of understanding).
proc main(max_a0: int): int =
  var a, longest, len, max_len: int
  for a0 in countup(1, max_a0):
    a = a0
    len = 0
    while a != 1:
      len += 1
      if a mod 2 != 0: a = 3*a + 1
      a = a div 2
    if len > max_len:
      max_len = len
      longest = a0
  return longest
# Main program starts here
echo(main(1000000))
The author intentionally chose the decomposed form. Indeed, all of these operations work in Python 3. Here:
Python 3.3.2+ (default, Oct 9 2013, 14:50:09)
[GCC 4.8.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> noel="noël"
>>> noel[::-1] # reverse
'lëon'
>>> noel[0:3] # first three characters
'noë'
>>> len(noel) # length
4
The point is, defining what a character is based on how it is displayed is flawed. Just precompose the string if you want and carry on. Like I said in my other comment, automatic conversion of decomposed -> precomposed wreaks havoc with Indian languages.
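To make the precompose-and-carry-on point concrete, here is a small sketch using Python's standard `unicodedata` module: NFC normalization collapses the decomposed e + combining diaeresis into the single code point ë, after which the slicing and reversing from the session above behave as expected.

```python
import unicodedata

# Decomposed form: 'e' followed by a combining diaeresis (U+0308), 5 code points.
decomposed = "noe\u0308l"
print(len(decomposed))  # 5

# Precomposing with NFC merges e + U+0308 into the single code point 'ë'.
precomposed = unicodedata.normalize("NFC", decomposed)
print(len(precomposed))   # 4
print(precomposed[::-1])  # 'lëon' -- the accent stays on the e when reversed
print(precomposed[:3])    # 'noë'
```

The same session typed into a terminal that precomposes input (as the one quoted above apparently did) would show length 4 from the start, which is why the quoted transcript already "works".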
Works as expected too in Scala, although it might be because the terminal does normalization.
scala> val noel = "Noël"
noel: String = Noël
scala> noel.reverse
res0: String = lëoN
scala> noel.take(3)
res1: String = Noë
scala> noel.length
res2: Int = 4
scala> import java.text.Normalizer
import java.text.Normalizer
scala> val nfdNoel = Normalizer.normalize(noel, Normalizer.Form.NFD)
nfdNoel: String = Noël
scala> nfdNoel.length
res3: Int = 5
scala> nfdNoel.reverse
res4: String = l̈eoN
scala> nfdNoel.take(3)
res5: String = Noe
The problem with an array of characters, as he mentions, is that it doesn't work properly in many use cases. If your array stores 16-bit code points, it breaks on code points outside the BMP (Java got bitten hard by that: a char used to be a character until Unicode introduced surrogate pairs); if it stores 32-bit code points, it's pretty wasteful in most cases. That's exactly why you'd want a string type that handles storage of a series of characters in an optimal fashion.
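Both halves of that trade-off are easy to demonstrate. A minimal sketch in Python, using the musical G clef (U+1D11E) as an example of a character outside the BMP: UTF-16 needs a surrogate pair for it, while UTF-32 spends four bytes on every character, ASCII included.

```python
# 'G clef' (U+1D11E) lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"
print(len(clef))                       # 1 code point in Python 3
print(len(clef.encode("utf-16-le")))   # 4 bytes: a surrogate pair
print(len(clef.encode("utf-32-le")))   # 4 bytes, same as every other character
print(len("abc".encode("utf-32-le")))  # 12 bytes for three ASCII characters
```

In a language whose `char` is a 16-bit code unit (pre-surrogate-aware Java), indexing into a string containing this character lands on half a surrogate pair, which is exactly the breakage described above.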
This article is mostly written from a European-language perspective. For Indic scripts, storing combining characters as separate code points is the right thing to do.
For example, कि (ki) is composed of क and ि. If I'm writing in an editor and I typed ku (कु) instead of ki (कि), pressing backspace should leave क rather than deleting the whole कि.
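The backspace behaviour described above falls out naturally from code-point storage. A small sketch in Python (the code points are KA U+0915 plus the vowel signs U+093F and U+0941):

```python
ki = "\u0915\u093F"  # कि : KA (U+0915) + vowel sign I (U+093F)
print(len(ki))       # 2 code points, though it renders as one glyph
print(ki[:-1])       # dropping the last code point leaves क, the base consonant

ku = "\u0915\u0941"  # कु : KA + vowel sign U (U+0941)
# An editor storing code points can fix कु into कि by replacing only the matra:
print(ku[:-1] + "\u093F")  # कि
```

A string type that eagerly normalized or grouped these into single "characters" would force the editor to delete and retype the whole syllable instead.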
I am very impressed with Anaconda's free edition. It works well on Linux and has more packages [2] than Enthought's free edition. Anaconda is the commercial offering from Travis Oliphant, the creator of NumPy.
Nimrod's gang (including Araq) are very friendly and welcoming.
#julia and #d are very quiet though (except for the bots).
And #emacs -- well, that's the one channel that's lenient towards off-topic chat!