> #pip install python-levenshtein > from Levenshtein import distance Tha...

llimllib · on June 14, 2011

Why? It's not hard to write a quick and dirty levenshtein function, but when google gets it for me in 1 minute, why do it?

edit: just for fun, as a one-liner:

    def levenshtein(a,b): return sum(x!=y for x,y in zip(a, b))

edit 2: and just to clarify, I know this fails if the lengths of the input strings differ, they were guaranteed to be equal in my code.

ithayer · on June 14, 2011

Can you do it in linear time in the length of the input?

kilburn · on June 17, 2011

Find an actual (amortized) linear time implementation for this question in the following gist:

https://gist.github.com/1026638

drdo · on June 14, 2011

It's not hard to prove that that's impossible

amalcon · on June 14, 2011

It's possible if RAM isn't a constraint, using a bitset of all possible five-letter strings. The bitset would take up 4G of RAM (128G for extended ASCII, more for unicode).

In practice, that would actually be slower on this input, because of the cost of initializing the bitset. But that is not dependent on the input, so computational complexity is unaffected.

edit: Removed description. Yes, you're right, you can't even look at the whole input in constant time.

drdo · on June 14, 2011

This is a misunderstanding, you must have answered to my reply after ithayer edited his comment, the initial comment said "constant time".

drdo · on June 14, 2011

Didn't understand what you meant, care to elaborate?

ithayer · on June 14, 2011

:) I meant linear, but I'd review a proof :)