Talk:Find URI in text

Unicode Chars

My hunch is just to leave Unicode characters alone. Converting them can be regarded as a separate step that happens before the URL is used. It depends on the purpose of extracting URLs from text. (Are they headed for a processing stage that deals with those characters fine?) --24.85.131.247 19:01, 3 January 2012 (UTC)

That's the intention exactly. Non-ASCII characters are mentioned because they should be included. A parser that only accepts legal characters would not do that. --eMBee 02:14, 4 January 2012 (UTC)
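
To make the difference concrete, here is a rough sketch in Python (not taken from any solution on the task page). The pattern names STRICT and PERMISSIVE, the helper functions, and the sample sentence are all made up for illustration. The strict pattern only accepts ASCII characters that RFC 3986 permits in a URI; the permissive one simply takes everything up to the next whitespace, so non-ASCII characters are left alone.

```python
import re

# Hypothetical strict pattern: a scheme, "://", then only ASCII characters
# that RFC 3986 permits in a URI (unreserved, reserved, and "%").
STRICT = re.compile(
    r"[a-z][a-z0-9+.-]*://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+",
    re.IGNORECASE,
)

# Hypothetical permissive pattern: a scheme, "://", then any run of
# non-whitespace characters, so non-ASCII characters are kept as-is.
PERMISSIVE = re.compile(r"[a-z][a-z0-9+.-]*://\S+", re.IGNORECASE)


def find_strict(text):
    """Return candidate URIs using only RFC 3986 ASCII characters."""
    return STRICT.findall(text)


def find_permissive(text):
    """Return candidate URIs, leaving non-ASCII characters untouched."""
    return PERMISSIVE.findall(text)


# Illustrative sample sentence containing a non-ASCII character.
sample = "see http://en.wikipedia.org/wiki/Erich_Kästner_(camera_designer) for details"

print(find_strict(sample))      # the match stops just before the "ä"
print(find_permissive(sample))  # the whole URL, "ä" included
```

The permissive version will also swallow trailing punctuation such as a full stop directly after the URL, so a real solution would still need some rule for that. Converting the non-ASCII characters (for example by percent-encoding) is deliberately left to whatever stage consumes the result.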

So, since spaces can be entered in a browser, they can be accepted as part of a URI, here? --Rdm 18:17, 5 January 2012 (UTC)