Isn't this actually called "implementation-defined" behaviour in the standard rather than "undefined" behaviour? I generally get a segfault for undefined behaviors.
It's undefined. That means the compiler and runtime can choose to segfault, do something that looks right, silently corrupt your program, launch the nuclear missiles, whatever. And there's no requirement that this behavior be repeatable, so the code can work with debug flags turned on but fail in a release build.
You can never rely on your compiler's implementation of undefined behavior.
> You can never rely on your compiler's implementation of undefined behavior.
I wish I could somehow get my coworker to understand this. He's mostly "Yeah, well, I know it's undefined, but there's really no way this could be anything else than <foo>. I know what the CPU does there."
I believe this is undefined because the standard doesn't specify what happens between sequence points, not because "a sequence point is implementation defined". I'm going from second-hand knowledge and it's been years since I looked at C (and never past a high school level of understanding).
proc main(max_a0: int): int =
  var a, longest, len, max_len: int
  for a0 in countup(1, max_a0):
    a = a0
    len = 0
    while a != 1:
      len += 1
      if a mod 2 != 0: a = 3*a + 1
      a = a div 2
    if len > max_len:
      max_len = len
      longest = a0
  return longest
# Main program starts here
echo(main(1000000))
The author intentionally chose the decomposed form. Indeed, all of these operations work in Python 3. Here:
Python 3.3.2+ (default, Oct 9 2013, 14:50:09)
[GCC 4.8.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> noel="noël"
>>> noel[::-1] # reverse
'lëon'
>>> noel[0:3] # first three characters
'noë'
>>> len(noel) # length
4
The point is, defining what a character is based on how it is displayed is flawed. Just precompose the string if you want and carry on. Like I said in my other comment, automatic conversion of decomposed -> precomposed wreaks havoc with Indian languages.
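To make the precompose-and-carry-on point concrete, here is a small sketch using Python's standard `unicodedata` module: NFC normalization collapses the decomposed e + combining diaeresis into the single code point ë, after which the slicing and reversing from the session above behave as expected.

```python
import unicodedata

# Decomposed form: 'e' followed by a combining diaeresis (U+0308), 5 code points.
decomposed = "noe\u0308l"
print(len(decomposed))  # 5

# Precomposing with NFC merges e + U+0308 into the single code point 'ë'.
precomposed = unicodedata.normalize("NFC", decomposed)
print(len(precomposed))   # 4
print(precomposed[::-1])  # 'lëon' -- the accent stays on the e when reversed
print(precomposed[:3])    # 'noë'
```

The same session typed into a terminal that precomposes input (as the one quoted above apparently did) would show length 4 from the start, which is why the quoted transcript already "works".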
Works as expected too in Scala, although it might be because the terminal does normalization.
scala> val noel = "Noël"
noel: String = Noël
scala> noel.reverse
res0: String = lëoN
scala> noel.take(3)
res1: String = Noë
scala> noel.length
res2: Int = 4
scala> import java.text.Normalizer
import java.text.Normalizer
scala> val nfdNoel = Normalizer.normalize(noel, Normalizer.Form.NFD)
nfdNoel: String = Noël
scala> nfdNoel.length
res3: Int = 5
scala> nfdNoel.reverse
res4: String = l̈eoN
scala> nfdNoel.take(3)
res5: String = Noe
The problem with an array of characters, as he mentions, is that it doesn't work properly in many use cases. If your array stores 16-bit code points, it breaks on code points outside the BMP (Java got bitten hard by that: a char used to be a character until Unicode introduced surrogate pairs); if it stores 32-bit code points, it's pretty wasteful in most cases. That's exactly why you'd want a string type that handles storage of a series of characters in an optimal fashion.
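Both halves of that trade-off are easy to demonstrate. A minimal sketch in Python, using the musical G clef (U+1D11E) as an example of a character outside the BMP: UTF-16 needs a surrogate pair for it, while UTF-32 spends four bytes on every character, ASCII included.

```python
# 'G clef' (U+1D11E) lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"
print(len(clef))                       # 1 code point in Python 3
print(len(clef.encode("utf-16-le")))   # 4 bytes: a surrogate pair
print(len(clef.encode("utf-32-le")))   # 4 bytes, same as every other character
print(len("abc".encode("utf-32-le")))  # 12 bytes for three ASCII characters
```

In a language whose `char` is a 16-bit code unit (pre-surrogate-aware Java), indexing into a string containing this character lands on half a surrogate pair, which is exactly the breakage described above.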
This article is mostly written from a European-language perspective. For Indic scripts, storing combining characters as separate code points is the right thing to do.
For example, कि (ki) is composed of क and ि. If I'm writing in an editor and I typed ku (कु) instead of ki (कि), pressing backspace should leave क rather than deleting the whole कि.
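The backspace behaviour described above falls out naturally from code-point storage. A small sketch in Python (the code points are KA U+0915 plus the vowel signs U+093F and U+0941):

```python
ki = "\u0915\u093F"  # कि : KA (U+0915) + vowel sign I (U+093F)
print(len(ki))       # 2 code points, though it renders as one glyph
print(ki[:-1])       # dropping the last code point leaves क, the base consonant

ku = "\u0915\u0941"  # कु : KA + vowel sign U (U+0941)
# An editor storing code points can fix कु into कि by replacing only the matra:
print(ku[:-1] + "\u093F")  # कि
```

A string type that eagerly normalized or grouped these into single "characters" would force the editor to delete and retype the whole syllable instead.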
I am very impressed with Anaconda's free edition. It works well on Linux and has more packages [2] than Enthought's free edition. Anaconda is the commercial offering from Travis Oliphant, the creator of NumPy.
Nimrod's gang (including Araq) are very friendly and welcoming.
#julia and #d are very quiet though (except for the bots).
And #emacs -- well, that's the one channel that's lenient towards off-topic chat!