
URL parsing is a nightmare. There are lots of RFCs, but the real world is full of quirks. This will lead to edge-case differences between implementations.

Also, which URI standard is part of the spec? What if the URI standard evolves? It is a complex standard with multiple RFCs and a long history.

What about possibly different length restrictions in language-native URI types?

I have no stake in this though, feel free to ignore. I tend to see negatives first.



Yes, it's true that URLs are a nightmare, but we don't have anything better (yet). Once we do, I'll happily release v2 of the spec. For now, it follows the RFC.


XML tried to do that with the anyURI data type.

In XML/XSD 1.1 they gave up on it and now consider any string a valid anyURI.

I tried to implement the old XML types. I built a huge regex from the RFC, but it didn't seem to cover all cases.


TBH at this point it doesn't really matter what the URI specifications say. Somehow we're able to stuff our URLs into our browsers, web pages, package managers, REST APIs, mail clients etc and manage to get it working. It's useful enough that everyone uses it, so I'd be a fool to get stuck on the "correctness" of their specs. CTE uses the double-quote as a delimiter, so as long as the contents are percent-escaped for double-quote and your language's URL parser says "cool", it's acceptable.
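A rough sketch of that rule in Python, in case it helps. The helper name `escape_for_cte` is made up for illustration, and using `urllib.parse.urlsplit` plus a scheme check as the "parser says cool" test is my assumption, not something the spec mandates. It only produces the escaped contents; the surrounding delimiters are left to the encoder.

```python
from urllib.parse import urlsplit


def escape_for_cte(uri: str) -> str:
    """Hypothetical helper: return `uri` with literal double quotes percent-escaped.

    Assumption: "acceptable" means the standard-library parser does not reject
    the string and finds a scheme. Other languages' parsers may disagree.
    """
    # urlsplit raises ValueError on some malformed input (e.g. unbalanced IPv6 brackets).
    parts = urlsplit(uri)
    if not parts.scheme:
        raise ValueError(f"no scheme found in {uri!r}")
    # The double quote is the delimiter, so it must not appear raw in the contents.
    return uri.replace('"', "%22")


if __name__ == "__main__":
    print(escape_for_cte('http://example.com/?q="x"'))
```

Note how lenient this is: `urlsplit` accepts plenty of strings a strict RFC validator would reject, which is kind of the point.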



