Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

There are features of some regular expressions for which the only known solution is backtracking. If you want those features then you "require backtracking".


Out of interest, what are some of these? I have a hard time believing that the implementors of the Perl regex engine chose to write it that way for no reason while the Thompson NFA figures are thrown about. I knew there must havevbeen something this 'implementation detail' was good for.


An easy example is matching palindromes. You simply can't match a palindrome by moving forward only; you have to go back and see if every letter matches. So, if you want to search for the longest palindrome in a string, you'll necessarily be doing a lot of backtracking.

There's no RE2-compatible regular expression for matching palindromes, but additional features as found in PCRE and similar "regex" engines can do it with backreferences or with look-around assertions. See http://stackoverflow.com/q/3746487 and http://stackoverflow.com/q/3664881 for two ways to write such a regex.


I believe backreferences require backtracking.


No, but variable size lookahead/behind do. This is because the engine has to go back if the remaining part of a regex fails. For some examples, see http://www.regular-expressions.info/recursebacktrack.html)

EDIT: you are correct, backreferences do require backtracking, my bad.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: