Terminal control/Unicode output

Revision as of 02:41, 12 September 2011

The task is to check that the terminal supports Unicode output before outputting a Unicode character. If the terminal supports Unicode, the program should output a Unicode delta (U+25B3). If it does not, an appropriate error should be raised.

AWK

<lang awk>#!/usr/bin/awk -f
BEGIN {
 unicodeterm=1   # Assume Unicode support
 if (ENVIRON["LC_ALL"] !~ "UTF") {
   if (ENVIRON["LC_ALL"] != "")
     unicodeterm=0    # LC_ALL is the boss, and it says nay
   else {
     # Check other locale settings if LC_ALL override not set
     if (ENVIRON["LC_CTYPE"] !~ "UTF") {
       if (ENVIRON["LANG"] !~ "UTF")
         unicodeterm=0    # This terminal does not support Unicode
     }
   }
 }

 if (unicodeterm) {
     # This terminal supports Unicode
     # We need a Unicode compatible printf, so we source this externally
     # printf might not know \u or \x, so use octal.
     # U+25B3 => UTF-8 342 226 263
     # The single quotes stop the shell from stripping the backslashes.
     system("/usr/bin/printf '\\342\\226\\263\\n'")
 } else {
     print "HW65001 This program requires a Unicode compatible terminal"|"cat 1>&2"
     exit 252    # Incompatible hardware
 }
}</lang>

UNIX Shell

This script only checks if the name of the locale contains "UTF-8". This often works because many UTF-8 locales have names like "en_US.UTF-8". This script will fail to recognize a Unicode terminal if:

The locale is a UTF-8 locale, but does not have "UTF-8" in its name.
The locale uses some other Unicode Transformation Format, such as GB18030.
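
The first limitation above can often be avoided by asking the system for the codeset of the current locale instead of inspecting the locale name. The following is a minimal sketch, assuming a POSIX shell and a working locale utility; the function names is_utf8_codeset and unicode_tty_charmap are hypothetical and not part of the original script. It still reports only what the locale settings claim, not what the terminal really does, and it does not address the second limitation.

```shell
# Hypothetical helper: succeed if the given codeset name means UTF-8.
# The comparison ignores case because systems spell the name
# differently (for example "UTF-8" or "utf8").
is_utf8_codeset() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
  UTF-8|UTF8)	return 0;;
  *)		return 1;;
  esac
}

# Hypothetical helper: query the codeset of the current locale.
# If the locale utility is missing or fails, the empty result is
# treated as "not UTF-8".
unicode_tty_charmap() {
  is_utf8_codeset "$(locale charmap 2>/dev/null)"
}
```

A caller could then write "if unicode_tty_charmap; then ..." in the same way as any other shell test.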

Works with: Bourne Shell

<lang bash>unicode_tty() {

 # LC_ALL supersedes LC_CTYPE, which supersedes LANG.
 # Set $1 to environment value.
 case y in
 ${LC_ALL:+y})		set -- "$LC_ALL";;
 ${LC_CTYPE:+y})	set -- "$LC_CTYPE";;
 ${LANG:+y})		set -- "$LANG";;
 y)			return 1;;  # Assume "C" locale not UTF-8.
 esac
 # We use 'case' to perform pattern matching against a string.
 case "$1" in
 *UTF-8*)		return 0;;
 *)			return 1;;
 esac

}

if unicode_tty; then

 # printf might not know \u or \x, so use octal.
 # U+25B3 => UTF-8 342 226 263
 printf "\342\226\263\n"

else

 echo "HW65001 This program requires a Unicode compatible terminal" >&2
 exit 252    # Incompatible hardware

fi</lang>
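
The comment in the script gives the UTF-8 encoding of U+25B3 as the octal bytes 342 226 263. The arithmetic behind those numbers can be sketched as follows, assuming a POSIX shell with hexadecimal constants in arithmetic expansion; utf8_octal3 is a hypothetical helper name, and it only handles code points from U+0800 to U+FFFF, which need three bytes.

```shell
# Hypothetical helper: print the UTF-8 bytes of a code point in the
# range U+0800..U+FFFF as octal escapes suitable for printf.
# Usage: utf8_octal3 0x25B3
utf8_octal3() {
  cp=$(($1))
  b1=$((0xE0 | (cp >> 12)))          # 1110xxxx: top 4 bits
  b2=$((0x80 | ((cp >> 6) & 0x3F)))  # 10xxxxxx: middle 6 bits
  b3=$((0x80 | (cp & 0x3F)))         # 10xxxxxx: low 6 bits
  # "\\" in the format prints one backslash; %o prints octal.
  printf '\\%o\\%o\\%o\n' "$b1" "$b2" "$b3"
}
```

Running "utf8_octal3 0x25B3" prints \342\226\263, which matches the escape sequence passed to printf in the script.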

The terminal might support UTF-8, but its fonts might not have every Unicode character. Unless they have U+25B3, the output will not look correct. Geometric shapes like U+25B3 (which is a triangle symbol, not the Greek letter delta) tend to be common, but some fonts might not have Chinese characters (for example), and almost no fonts have dead scripts such as Cuneiform.

ZX Spectrum Basic

<lang zxbasic>10 REM There is no Unicode delta in ROM
20 REM So we first define a custom character
30 FOR l=0 TO 7
40 READ n
50 POKE USR "d"+l,n
60 NEXT l
70 REM our custom character is a user defined d
80 PRINT CHR$(147): REM this outputs our delta
9500 REM data for our custom delta
9510 DATA 0,0,8,20,34,65,127,0</lang>