
Talk:URL decoding: Difference between revisions

:::::::::::::: I think you need to ignore LC_ALL/LANG (other than requiring that they have sensible values for your environment and OS), since their purpose is to be able to turn Unicode support on or off outside the context of the program. --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 18:28, 28 May 2015 (UTC)
:::::::::::::::Yes I would like to ignore locale settings entirely, believe me! Yet, it is the key to making the URL decode function work correctly as evidenced from our discussion. The alternative is to use [https://github.com/kevin-albert/awkserver/blob/master/src/core.awk this urldecode() function] which has an array of every potential character needing to be decoded. -- [[User:3havj7t3nps8z8wij3g9|3havj7t3nps8z8wij3g9]] ([[User talk:3havj7t3nps8z8wij3g9|talk]]) 01:40, 29 May 2015 (UTC)
 
:::::::::::::::: https://github.com/kevin-albert/awkserver/blob/master/src/core.awk should have the same issue with utf-8. If you tell awk that it should represent its characters using utf-8 then what you'll get from the urldecode will not be the right code points.
 
:::::::::::::::: Maybe think of this as UTF-8 code units, i.e. bytes (which are what the awk code emits, in either version), vs characters (which in UTF-8 can each be a sequence of one or more code units)? --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 09:13, 29 May 2015 (UTC)
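
:::::::::::::::: A minimal sketch of that distinction, in Python rather than awk so that the byte/character split is explicit. The function name <code>percent_decode_to_bytes</code> and the sample string <code>caf%C3%A9</code> are made up for illustration and are not taken from either awk version. The decoder emits one byte per <code>%XX</code> escape; whether two of those bytes count as two characters or one depends only on how the result is interpreted afterwards, which is the role the locale plays for awk.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def percent_decode_to_bytes(text: str) -> bytes:
    """Turn each %XX escape into exactly one byte; copy everything else as-is.

    Hypothetical helper for illustration only. It does not translate '+'
    into a space.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        if text[i] == "%" and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            out.append(int(pair, 16))  # one escape -> one byte
            i += 3
        else:
            out.extend(text[i].encode("utf-8"))  # literal character
            i += 1
    return bytes(out)


encoded = "caf%C3%A9"  # assumed sample input: "café" with é percent-encoded as UTF-8
raw = percent_decode_to_bytes(encoded)

# Byte-per-character view: every byte is treated as its own character.
per_byte = raw.decode("latin-1")

# UTF-8 view: the two bytes C3 A9 combine into the single character é.
as_utf8 = raw.decode("utf-8")

print(len(raw), len(per_byte), len(as_utf8))  # prints: 5 5 4
```

:::::::::::::::: The decoded bytes are identical in both views. Only the count of characters differs, which is why a decoder that emits one value per escape can look correct or incorrect depending on the encoding the surrounding program assumes.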