Variable size/Set

Task

Demonstrate how to specify the minimum size of a variable or a data type.

11l

In 11l, for base types, the size of a variable is determined by the type. There is some flexibility, as it is possible to choose the type according to the required size. For instance, for signed integers, we can choose among Int8, Int16, Int32 and Int64. The size of the variable of type “Int” is 32 or 64 bits according to the platform. The same exists for unsigned integers (Byte, UInt16, UInt32, UInt64) and for floating point numbers (Float32, Float64/Float).

360 Assembly

The 360 architecture data specifications are: <lang 360 Assembly>

Binary interger (H,F)

I2 DS H half word 2 bytes I4 DS F full word 4 bytes

Real (floating point) (E,D,L)

X4 DS E short 4 bytes X8 DS D double 8 bytes X16 DS L extended 16 bytes

Packed decimal (P)

P3 DS PL3 2 bytes P7 DS PL7 4 bytes P15 DS PL15 8 bytes

Zoned decimal (Z)

Z8 DS ZL8 8 bytes Z16 DS ZL16 16 bytes

Character (C)

C1 DS C 1 byte C16 DS CL16 16 bytes C256 DS CL256 256 bytes

Bit value (B)

B1 DC B'10101010' 1 byte

Hexadecimal value (X)

X1 DC X'AA' 1 byte

Address value (A)

A4 DC A(176) 4 bytes but only 3 bytes used

(24 bits => 16 MB of storage)

</lang>

6502 Assembly

Syntax will vary depending on the assembler and whether your program will run in RAM or ROM. For programs that execute from RAM such as those that are loaded from a disk, you can use the same syntax that you would use to define constants. When defining bytes as variables or constants, you do not prefix them with a # sign. Only in actual CPU instructions do you need to use a # to prevent the assembler from treating the numeric value as a memory location.

Since these are variables, the value given to them (in this case, 0) is the initial value, and can be changed later at runtime. If you don't care what the initial value is, some assemblers allow you to use a "?" where the 0s are.

<lang 6502asm>MyByte:

  byte 0       ;most assemblers will also accept DB or DFB

MyWord:

  word 0       ;most assemblers will also accept DW or DFW

MyDouble:

  dd 0</lang>

For programs that are executed solely from ROM, such as video game console cartridges, you won't be able to use the above method. The assembler can often use an enum or rsset directive to sequentially assign labels to a series of consecutive memory locations in the system's RAM.

<lang 6502asm>.rsset $00 ;starting at $0400, the following labels represent sequential memory locations of length ".rs n" VBlankFlag .rs 1 ;$00 soft_PPUCTRL .rs 1 ;$01 soft_PPUSTATUS .rs 1 ;$02 soft_SCROLL_X .rs 1 ;$03 soft_SCROLL_Y .rs 1 ;$04 temp_16 .rs 2 ;$05,$06 tempStack .rs 1 ;$07</lang>

Assemblers that don't have an enum or rsset directive can use the equ directive instead. This method lets you immediately see what each memory location actually is, but it makes it harder to insert a new one without having to redo all the numbering. Certain 6502 instructions rely on two memory addresses being consecutive.

<lang 6502asm>VBlankFlag equ $00 soft_PPUCTRL equ $01 soft_PPUSTATUS equ $02 soft_SCROLL_X equ $03 soft_SCROLL_Y equ $04 temp_16 equ $05 ;you have to keep track of spacing yourself in this method tempStack equ $07</lang>

While setting a variable's size is easy, getting it isn't possible without knowing it in advance. The CPU does not (and cannot) know the intended size of a variable. There's no enforcement of types whatsoever on the 6502; anything is fair game.

68000 Assembly

Translation of: 6502 Assembly

Syntax will vary depending on the assembler and whether your program will run in RAM or ROM. For programs that execute from RAM such as those that are loaded from a disk, you can use the same syntax that you would use to define constants. When defining bytes as variables or constants, you do not prefix them with a # sign. Only in actual CPU instructions do you need to use a # to prevent the assembler from treating the numeric value as a memory location.

Since these are variables, the value given to them (in this case, 0) is the initial value, and can be changed later at runtime. If you don't care what the initial value is, some assemblers allow you to use a "?" where the 0s are.

<lang 68000devpac>MyByte:

  DC.B 0    
  EVEN         ;you need this to prevent alignment problems if you define an odd number of bytes.

MyWord:

  DC.W 0       ;this takes up 2 bytes even though only one 0 was written

MyLong:

  DC.L 0       ;this takes up 4 bytes even though only one 0 was written</lang>

For programs that are executed solely from ROM, such as video game console cartridges, you won't be able to use the above method. The assembler can often use an enum or rsset directive to sequentially assign labels to a series of consecutive memory locations in the system's RAM. Assemblers that don't have an enum or rsset directive can use the equ directive instead. This method lets you immediately see what each memory location actually is, but it makes it harder to insert a new one without having to redo all the numbering. These variables are located in the heap and thus there is no need to use EVEN directives to align this data.

<lang 68000devpac> Cursor_X equ $100000 ;byte - only comments can tell you the intended variable size. Cursor_Y equ $100001 ;byte tempWord equ $100002 ;word - also occupies $100003 tempLong equ $100004 ;long - also occupies $100005,6,7</lang>

While setting a variable's size is easy, getting it isn't possible without knowing it in advance. The CPU does not (and cannot) know the intended size of a variable. There's no enforcement of types whatsoever on the 6502; anything is fair game.

8086 Assembly

Syntax will vary depending on the assembler you're using, but the following should apply to most assemblers.

<lang asm> .data ;data segment

TestValue_00 byte 0 ;an 8-bit variable TestValue_01 word 0 ;a 16-bit variable TestValue_02 dword 0 ;a 32-bit variable

.code

start:

mov dh, byte ptr [ds:TestValue_00] ;load the value stored at the address "TestValue_00" mov ax, word ptr [ds:TestValue_01] ;load the value stored at the address "TestValue_01"</lang>

For programs that are executed from ROM, such as those on video game cartridges, the above syntax can only be used to define constants. Defining variables will need to be done using equ directives, and you'll need to read the system's documentation to know where the heap is located. For example, the Bandai Wonderswan has its heap located starting at memory address 100h.

There is some enforcement of variable sizes, but it's not as strict as most other languages. The two commands above would not have worked had the wrong register sizes been used. However, using the wrong labels is perfectly legal in the eyes of most assemblers.

<lang asm>mov al, byte ptr [ds:TestValue_02]

even though this was listed as a dword in the data segment, the assembler lets us do this!</lang>

In fact, data sizes don't matter to the CPU all that much, as the following definitions are all equivalent: <lang asm>foo byte 0,0,0,0 bar word 0,0 baz dword 0</lang>

The data types are more for the programmer's convenience, so that it's clear how the data should be interpreted.

Data can be organized in a table for better readability; however it has no effect on how the data is structured apart from visually. (The data will be organized in a linear fashion regardless, so it makes no difference to the CPU). The following two structures are the same: <lang asm>Array1 byte 00,01,02,03

Array2 byte 00,01

      byte 02,03</lang>

If you have a variable of a pre-determined size that isn't 1, 2, or 4 bytes you can use the following to define it: <lang asm>BigNumber byte 256 dup (0) ;reserve 256 bytes, each equals zero</lang>

While it's easy to set a variable's size, getting it is impossible without knowing it in advance. Variables are nothing more than just a section of RAM; the CPU does not (and cannot) know how many bytes your variable is supposed to be.

Ada

<lang ada>type Response is (Yes, No); -- Definition of an enumeration type with two values for Response'Size use 1; -- Setting the size of Response to 1 bit, rather than the default single byte size</lang>

AutoHotkey

The documentation explains how the built-in function VarSetCapacity() may be used to do so.

BASIC

Numerical values and arrays generally are a fixed size. Strings are dynamically resized according to the data that they hold.

Variable sizes are in chunks relating to the type of data that they contain. There may also be additional bytes of storage in the variable table that do not show in the dimensions. Typically, strings are allocated in single characters (bytes), so C$(12) in the following example is stored as 12 bytes + additional bytes used for the header in the variable table. In some implementations of basic (such as those that support the storage of variable length strings in arrays), additional terminator characters (such as a trailing Ascii NUL) may also be included. In traditional basic, integers are typically 2 bytes each, so A%(10) contains 10 lots of 2 bytes (20 bytes in total) + additional bytes used for header data in the variable table. Floating point values are typically 8 bytes each, so B(10) holds 10 lots of 8 bytes (80 bytes in total) + additional bytes for header in the variable table:

<lang basic>10 DIM A%(10): REM the array size is 10 integers 20 DIM B(10): REM the array will hold 10 floating point values 30 DIM C$(12): REM a character array of 12 bytes</lang>

BBC BASIC

Works with: BBC BASIC for Windows

The only way to 'set' the size of a scalar numeric variable is to declare it with the appropriate type suffix: <lang bbcbasic> var& = 1 : REM Variable occupies 8 bits

     var% = 1 : REM Variable occupies 32 bits
     var  = 1 : REM Variable occupies 40 bits
     var# = 1 : REM Variable occupies 64 bits</lang>

If the task is talking about setting the size of a variable at run time that is only possible with strings, arrays and structures.

IS-BASIC

<lang IS-BASIC>100 DIM A(10) 110 NUMERIC ARRAY(1 TO 100) 120 NUMERIC MATRIX(1 TO 10,1 TO 10) 130 NUMERIC NR,X,Y 140 STRING NAME$(1 TO 100)*24 150 STRING LINE$*254 160 STRING CHAR$*1</lang>

C

Works with: C99

<lang c>#include <stdint.h>

int_least32_t foo;</lang>

Here foo is a signed integer with at least 32 bits. stdint.h also defines minimum-width types for at least 8, 16, 32, and 64 bits, as well as unsigned integer types.

<lang c>union u {

 int i;
 long l;
 double d;
 /* ... */

};</lang>

Here the use of union results in a datatype which is at least as large as the largest type. Unions are sometimes exploited to just meet a minimum size:

<lang c>union must_be_at_least_512_bytes {

 int interesting_datum;
 char padding[512];

};</lang>

Here, the application will never access padding nor store anything; the padding is there to make the type large enough to meet some requirement. For instance, so that some third party API function which fills in the object, doesn't write past the end of the memory, when the program is only interested in interesting_datum.

C++

Works with: C++11

or

Works with: Boost

<lang Cpp>#include <boost/cstdint.hpp>

boost::int_least32_t foo;</lang>

D

In D, any variables of static array of zero length has a size of zero. But such data is useless, as no base type element can be accessed. <lang d>typedef long[0] zeroLength ; writefln(zeroLength.sizeof) ; // print 0</lang> NOTE: a dynamic array variable's size is always 8 bytes, 4(32-bit) for length and 4 for a reference pointer of the actual storage somewhere in runtime memory.
The proper candidates of minimum size variable are empty structure, 1-byte size data type variable (include byte, ubyte, char and bool), and void, they all occupy 1 byte. <lang d>byte b ; ubyte ub ; char c ; bool t ;</lang> bool is logically 1-bit size, but it actually occupy 1 byte.
void can't be declared alone, but void.sizeof gives 1.
An empty structure is logically zero size, but still occupy 1 byte. <lang d>struct Empty { } writefln(Empty.sizeof) ; // print 1</lang>

Erlang

Variables and data type sizes are outside of the programmers control, with one exception: binary data. Here you can say exactly how many bits you want. Default is 8 bits so below the 0 is 8 bits and the 1 is 3 bits.

15> <<1:11>>.
<<0,1:3>>

ERRE

Numerical values and arrays generally are a fixed size. Strings are dynamically resized according to the data that they hold.

Variable sizes are in chunks relating to the type of data that they contain. There may also be additional bytes of storage in the variable table that do not show in the dimensions. Typically, in ERRE strings are allocated in single characters (bytes), so C$[12] in the following example is stored as 12 bytes + additional bytes used for the header in the variable table. Integers are typically 2 bytes each, so A%[10] contains 10 numbers of 2 bytes (20 bytes in total) + additional bytes used for header data in the variable table. Floating point values are typically 4 bytes each, so B[10] holds 10 numbers of 4 bytes (40 bytes in total) + additional bytes for header in the variable table:

<lang erre>DIM A%[10] ! the array size is 10 integers

DIM B[10] ! the array will hold 10 floating point values

DIM C$[12] ! a character array of 12 bytes</lang>

There is also "double" floating point values (8 bytes). Variables of this type use the suffix #.

Fortran

Since Fortran 90 each intrinsic data type (INTEGER, REAL, COMPLEX, LOGICAL and CHARACTER) has a KIND parameter associated with it that can be used to set the required level of precision. The actual values which these KIND parameters can take are not specified in the standard and are implementation-dependent. In order to select an appropriate KIND value that is portable over different platforms we can use the intrinsic functions SELECTED_REAL_KIND and SELECTED_INT_KIND.

The syntax of these functions are as follows:-

selected_real_kind(P, R), where P is the required number of significant decimal digits and R is the required decimal exponent range. At least one argument must be present. The return value is the kind type parameter for real values with the given precision and/or range. A value of -1 is returned if P is out of range, a value of -2 is returned if R is out of range and a value of -3 is returned if both P and R are out of range.

selected_int_kind(R), where R is the required decimal exponent range. The return value is the kind type parameter for integer values n such that -10^R < n < 10^R. A value of -1 is returned if R is out of range.

<lang fortran>program setsize implicit none

 integer, parameter :: p1 = 6
 integer, parameter :: p2 = 12
 integer, parameter :: r1 = 30
 integer, parameter :: r2 = 1000
 integer, parameter :: r3 = 2
 integer, parameter :: r4 = 4
 integer, parameter :: r5 = 8
 integer, parameter :: r6 = 16
 integer, parameter :: rprec1 = selected_real_kind(p1, r1) 
 integer, parameter :: rprec2 = selected_real_kind(p2, r1) 
 integer, parameter :: rprec3 = selected_real_kind(p2, r2) 
 integer, parameter :: iprec1 = selected_int_kind(r3) 
 integer, parameter :: iprec2 = selected_int_kind(r4)
 integer, parameter :: iprec3 = selected_int_kind(r5)
 integer, parameter :: iprec4 = selected_int_kind(r6) 
 
 real(rprec1)    :: n1
 real(rprec2)    :: n2
 real(rprec3)    :: n3
 integer(iprec1) :: n4
 integer(iprec2) :: n5
 integer(iprec3) :: n6
 integer(iprec4) :: n7
 character(30) :: form
 
 form = "(a7, i11, i10, i6, i9, i8)"
 write(*, "(a)") "KIND NAME   KIND NUMBER   PRECISION        RANGE "
 write(*, "(a)") "                          min   set     min     set"
 write(*, "(a)") "______________________________________________________"
 write(*, form) "rprec1", kind(n1), p1, precision(n1), r1, range(n1)
 write(*, form) "rprec2", kind(n2), p2, precision(n2), r1, range(n2)
 write(*, form) "rprec3", kind(n3), p2, precision(n3), r2, range(n3)
 write(*,*)
 form = "(a7, i11, i25, i8)"
 write(*, form) "iprec1", kind(n4), r3, range(n4) 
 write(*, form) "iprec2", kind(n5), r4, range(n5) 
 write(*, form) "iprec3", kind(n6), r5, range(n6)
 write(*, form) "iprec4", kind(n7), r6, range(n7)

end program</lang> Output

KIND NAME   KIND NUMBER   PRECISION        RANGE
                          min   set     min     set
______________________________________________________
 rprec1          1         6     6       30      37
 rprec2          2        12    15       30     307
 rprec3          3        12    18     1000    4931
 
 iprec1          1                        2       2
 iprec2          2                        4       4
 iprec3          3                        8       9
 iprec4          4                       16      18

Free Pascal

See also: Pascal

Only enumeration type definitions can have a minimum size:<lang pascal>type {$packEnum 4} enum = (x, y, z);</lang> Only a {$packEnum} of 1, 2, or 4 Bytes can be specified.

FreeBASIC

FreeBASIC variables have a fixed size (depending on their type) with four exceptions:

1. The size of the Integer and UInteger types depends on the underlying platform - 4 bytes for 32-bit and 8 bytes for 64-bit platforms.

2. The size of individual characters of the WString type depends on the operating system - 2 bytes for Windows and 4 bytes for Linux.

3. The length of variable length strings is determined at run time and can be changed.

4. The bounds of dynamic arrays are determined at run time and can be changed (using ReDim). However, the number of dimensions (which can be up to 8) must be specified at compile time and cannot be changed.

Variables of types 3. and 4. don't hold their data directly but instead contain a fixed length descriptor - 24 bytes for a string and between 64 and 232 bytes for an array depending on the number of dimensions. The descriptor contains, amongst other things, a pointer to where the actual data is stored.

Go

Translation of: Ada

For task interpretation this follows the spirit of the Ada example included by the task author. In it, an enumeration type is defined from enumeration values, then a storage size--smaller than the default--is specified for the type. A similar situation exists within Go. Defining types from values is called duck-typing, and the situation where a type smaller than the default can be specified exists when a variable is duck-typed from a numeric literal. <lang go>package main

import (

   "fmt"
   "unsafe"

)

func main() {

   i := 5   // default type is int
   r := '5' // default type is rune (which is int32)
   f := 5.  // default type is float64
   c := 5i  // default type is complex128
   fmt.Println("i:", unsafe.Sizeof(i), "bytes")
   fmt.Println("r:", unsafe.Sizeof(r), "bytes")
   fmt.Println("f:", unsafe.Sizeof(f), "bytes")
   fmt.Println("c:", unsafe.Sizeof(c), "bytes")
   iMin := int8(5)
   rMin := byte('5')
   fMin := float32(5.)
   cMin := complex64(5i)
   fmt.Println("iMin:", unsafe.Sizeof(iMin), "bytes")
   fmt.Println("rMin:", unsafe.Sizeof(rMin), "bytes")
   fmt.Println("fMin:", unsafe.Sizeof(fMin), "bytes")
   fmt.Println("cMin:", unsafe.Sizeof(cMin), "bytes")

}</lang> Output:

i: 4 bytes
r: 4 bytes
f: 8 bytes
c: 16 bytes
iMin: 1 bytes
rMin: 1 bytes
fMin: 4 bytes
cMin: 8 bytes

Haskell

<lang Haskell> import Data.Int import Foreign.Storable

task name value = putStrLn $ name ++ ": " ++ show (sizeOf value) ++ " byte(s)"

main = do

 let i8  = 0::Int8
 let i16 = 0::Int16
 let i32 = 0::Int32
 let i64 = 0::Int64
 let int = 0::Int
 task "Int8" i8
 task "Int16" i16
 task "Int32" i32
 task "Int64" i64
 task "Int" int

</lang>

Output:

Int8: 1 byte(s)
Int16: 2 byte(s)
Int32: 4 byte(s)
Int64: 8 byte(s)
Int: 8 byte(s)

Icon and Unicon

Icon and Unicon values are self-descriptive types subject to automatic garbage collection. As a result the opportunities for setting the sizes of the variables are limited.

strings are always variable in length with some fixed overhead
csets are a fixed size
tables and sets are variable in size and start empty
integers and reals are fixed sizes
records are a fized size
co-expressions vary in size based on the environment when they are created
file, window, and procedure references are all fixed in size
lists can be specified with a minimum size (see below):

<lang Icon> L := list(10) # 10 element list </lang>

J

<lang J>v=: </lang>

Here, v is specified to have a minimum size. In this case, the minimum size of the content is zero, though the size of the representation is somewhat larger.

Julia

<lang julia>types = [Bool, Char, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64]

for t in types

   println("For type ", lpad(t,6), " size is $(sizeof(t)) 8-bit bytes, or ",
       lpad(string(8*sizeof(t)), 2), " bits.")

end

primitive type MyInt24 24 end

println("\nFor the 24-bit user defined type MyInt24, size is ", sizeof(MyInt24), " bytes.")

</lang>

Output:

For type Bool size is 1 8-bit bytes, or 8 bits. For type Char size is 4 8-bit bytes, or 32 bits. For type Int8 size is 1 8-bit bytes, or 8 bits. For type UInt8 size is 1 8-bit bytes, or 8 bits. For type Int16 size is 2 8-bit bytes, or 16 bits. For type UInt16 size is 2 8-bit bytes, or 16 bits. For type Int32 size is 4 8-bit bytes, or 32 bits. For type UInt32 size is 4 8-bit bytes, or 32 bits. For type Int64 size is 8 8-bit bytes, or 64 bits. For type UInt64 size is 8 8-bit bytes, or 64 bits.

For the 24-bit user defined type MyInt24, size is 3 bytes.

Kotlin

In Kotlin (or any other language targetting the JVM) the size of variables is outside the programmer's control. The primitive types are either fixed in size or (in the case of Boolean) implementation dependent and the size of objects will depend not only on the aggregate size of their fields but also on any overhead or alignment padding needed.

If one wants a numeric type to be able to accomodate a certain size of number, then one can of course declare a variable of the appropriate type (up to 8 bytes) or use the BigInteger or BigDecimal types where more than 8 byte precision is required.

The following program shows the range of numbers which the primitive numeric types can accomodate to enable one to choose the appropriate type: <lang scala>// version 1.0.6

fun main(args: Array<String>) {

  /* ranges for variables of the primitive numeric types */
  println("A  Byte   variable has a range of :  ${Byte.MIN_VALUE} to ${Byte.MAX_VALUE}")
  println("A  Short  variable has a range of :  ${Short.MIN_VALUE} to ${Short.MAX_VALUE}")
  println("An Int    variable has a range of :  ${Int.MIN_VALUE} to ${Int.MAX_VALUE}")
  println("A  Long   variable has a range of :  ${Long.MIN_VALUE} to ${Long.MAX_VALUE}")
  println("A  Float  variable has a range of :  ${Float.MIN_VALUE} to ${Float.MAX_VALUE}")
  println("A  Double variable has a range of :  ${Double.MIN_VALUE} to ${Double.MAX_VALUE}")

}</lang>

Output:

A  Byte   variable has a range of :  -128 to 127
A  Short  variable has a range of :  -32768 to 32767
An Int    variable has a range of :  -2147483648 to 2147483647
A  Long   variable has a range of :  -9223372036854775808 to 9223372036854775807
A  Float  variable has a range of :  1.4E-45 to 3.4028235E38
A  Double variable has a range of :  4.9E-324 to 1.7976931348623157E308

Mathematica/Wolfram Language

Mathematica stores variables in symbols : e.g. variable 'A' containing integer 0 requires 24 bytes under Windows.

Modula-3

<lang modula3>TYPE UByte = BITS 8 FOR [0..255];</lang> Note that this only works for records, arrays, and objects. Also note that the size in bits must be large enough to hold the entire range (in this case, 8 bits is the correct amount for the range 0 to 255) or the compiler will error.

Nim

1) In Nim, for base types, the size of a variable is determined by the type. There is some flexibility, as it is possible to choose the type according to the required size. For instance, for signed integers, we can choose among int8, int16, int32 and int64. The size of the variable of type “int” is 32 or 64 bits according to the platform. The same exists for unsigned integers (uint8/byte, uint16, uint32, uint64) and for floating point numbers (float32, float64/float).

2) For structured types, the size is generally fixed, except for sequences and strings. Sequences and strings are managed via a descriptor whose size is fixed (eight bytes for now). The actual data are allocated on the heap. The true size of a sequence or a string is named its capacity. The capacity is managed at runtime but the user may indicate the initial size using the procedures initSeqOfCap or initStringOfCap. For instance, var s = initStringOfCap(20) allocates an empty string (whose length is 0) but with the capacity to accept 20 bytes without any reallocation. If the length exceeds the capacity, the reallocation algorithm increases the capacity to a new size.

3) Nim offers also the possibility to set the size of fields into an object. This is done with the pragma bitsize. For instance, we can specify:

<lang Nim>type

 MyBitfield = object
   flag {.bitsize:1.}: cuint</lang>

When Nim uses C as intermediate language, the pragma “bitsize” is translated in C bit fields. The previous example will produce something like that: <lang C>struct mybitfield {

 unsigned int flag:1;

};</lang>

ooRexx

ooRexx variables are all references to object instances, so the variables themselves have no settable or gettable size.

PARI/GP

<lang parigp>default(precision, 1000)</lang> Alternately, in the gp interpreter, <lang parigp>\p 1000</lang>

Pascal

Pascal discourages the programmer to think about specific internal memory structures. Therefore, there is no way to specify the size of any data type.

The GPC (GNU Pascal compiler), however, allows for integer types the specification of a minimum precision: <lang pascal>type correctInteger = integer attribute (size = 42);</lang> See also: Free Pascal

Perl

I suppose you could use vec() or similar to twiddle a single bit. The thing is, as soon as you store this in a variable, the SV (the underlying C implementation of the most simple data type) already takes a couple dozen of bytes.

In Perl, memory is readily and happily traded for expressiveness and ease of use.

Phix

Phix native numeric types are fixed size:
on 32 bit integers are 4 bytes and floats 8 bytes,
on 64 bit integers are 8 bytes and floats 10 bytes.

Note that native integers are always signed and one bit shy of a full machine word, ie
-1,073,741,824 to +1,073,741,823 (-#40000000 to #3FFFFFFF) on 32 bit, and
-4,611,686,018,427,387,904 to +4,611,686,018,427,387,903 (-#4000000000000000 to #3FFFFFFFFFFFFFFF) on 64 bit.

Sequences are 4 or 8 bytes per element, and can grow or shrink at will.
Strings are always one byte per character, ie ansi or utf-8, utf-16 and utf-32 are held as sequences.

When using mprf.e (aka gmp), variables can have any precision required, up to available memory.
mpz (integer) variables automatically grow as needed but can optionally be initialised with a minimum bitcount to avoid later reallocations.
mpfr (floating point) variables require the precision to be explicitly specified in binary bits, for example if you want PI to 120 decimal places:

Library: Phix/mpfr

with javascript_semantics
requires("1.0.0")
include mpfr.e
mpfr pi = mpfr_init(0,-121) -- 120 dp, +1 for the "3."
mpfr_const_pi(pi)
printf(1,"PI with 120 decimals: %s\n\n",mpfr_get_fixed(pi,120))

PicoLisp

In PicoLisp, all variables have the same size (a single cell). But it is possible to create a data structure of a given minimal size with the 'need' function.

PL/I

<lang pli> declare i fixed binary (7), /* occupies 1 byte */

       j fixed binary (15),     /* occupies  2 bytes */
       k fixed binary (31),     /* occupies  4 bytes */
       l fixed binary (63);     /* occupies  8 bytes */

declare d fixed decimal (1), /* occupies 1 byte */

       e fixed decimal (3),     /* occupies  2 bytes */
                                /* an so on ...      */
       f fixed decimal (15);    /* occupies  8 bytes */

declare b(16) bit (1) unaligned; /* occupies 2 bytes */ declare c(16) bit (1) aligned; /* occupies 16 bytes */

declare x float decimal (6), /* occupies 4 bytes */

       y float decimal (16),    /* occupies  8 bytes */
       z float decimal (33);    /* occupies 16 bytes */

</lang>

PureBasic

<lang PureBasic> EnableExplicit

Structure AllTypes

 b.b
 a.a
 w.w
 u.u
 c.c    ; character type : 1 byte on x86, 2 bytes on x64
 l.l
 i.i    ; integer type : 4 bytes on x86, 8 bytes on x64
 q.q
 f.f
 d.d
 s.s    ; pointer to string on heap : pointer size same as integer
 z.s{2} ; fixed length string of 2 characters, stored inline

EndStructure

If OpenConsole()

 Define at.AllTypes  
 PrintN("Size of types in bytes (x64)")
 PrintN("")
 PrintN("byte      = " + SizeOf(at\b))
 PrintN("ascii     = " + SizeOf(at\a))
 PrintN("word      = " + SizeOf(at\w))
 PrintN("unicode   = " + SizeOf(at\u))
 PrintN("character = " + SizeOf(at\c))
 PrintN("long      = " + SizeOf(at\l))
 PrintN("integer   = " + SizeOf(at\i))
 PrintN("quod      = " + SizeOf(at\q))
 PrintN("float     = " + SizeOf(at\f))
 PrintN("double    = " + SizeOf(at\d))
 PrintN("string    = " + SizeOf(at\s))
 PrintN("string{2} = " + SizeOf(at\z))
 PrintN("---------------")
 PrintN("AllTypes  = " + SizeOf(at))
 PrintN("")
 PrintN("Press any key to close the console")
 Repeat: Delay(10) : Until Inkey() <> ""
 CloseConsole()

EndIf

</lang>

Output:

Size of types in bytes (x64)

byte      = 1
ascii     = 1
word      = 2
unicode   = 2
character = 2
long      = 4
integer   = 8
quod      = 8
float     = 4
double    = 8
string    = 8
string{2} = 4
---------------
AllTypes  = 52

Python

For compatibility with the calling conventions of external C functions, the ctypes module has functions that map data types and sizes between Python and C:

ctypes type	C type	Python type
c_char	char	1-character string
c_wchar	wchar_t	1-character unicode string
c_byte	char	int/long
c_ubyte	unsigned char	int/long
c_short	short	int/long
c_ushort	unsigned short	int/long
c_int	int	int/long
c_uint	unsigned int	int/long
c_long	long	int/long
c_ulong	unsigned long	int/long
c_longlong	__int64 or long long	int/long
c_ulonglong	unsigned __int64 or unsigned long long	int/long
c_float	float	float
c_double	double	float
c_longdouble	long double	float
c_char_p	char * (NUL terminated)	string or None
c_wchar_p	wchar_t * (NUL terminated)	unicode or None
c_void_p	void *	int/long or None

Racket

Like many other highlevel languages, Racket doesn't have direct control on object sizes. More than that, objects are almost always references, so holding a vector or a list still starts from some object with pointers to the rest. It is possible, however, to create random ffi structs with some given length, by using something like (_array _byte N) and it's possible to add that to some other ffi type by wrapping it with such an array in a struct. But to create and manage chunks of memory, it's much better to just use malloc (which is also available via the ffi).

Raku

(formerly Perl 6) In Raku, normal user-facing types (Int, Rat, Str, Array, Hash) are all auto-sizing, so there is no need to specify a minimum size for them. (Floating point, known as "Num", defaults to a machine double.) For storage declarations, native storage types (starting with a lowercase letter) may also be specified, in which case the required bit size is part of the type name: int16, uint8 (aka "byte"), num32 (a "float"), complex64 (made of two num64's), etc. More generally, such types are created through an API supporting representational polymorphism, in this case, the NativeHOW representation, when provides methods to set the size of a type; the actual allocation calculation happens when such generic types are composed into a class instance representing the semantics of the effective type to the compiler and run-time system. But mostly this is not something users will concern themselves with directly.

By spec, arrays may be declared with dimensions of fixed size, but as of this writing, such arrays not yet implemented. An array of fixed size that returns elements of a native type will be stored compactly, and uses exactly the memory you'd think it should, (modulo alignment constraints between elements and any slop at the end due to your memory allocator).

REXX

In REXX, there are no minimums for variables holding character literals, so you just simply assign (set)
character strings (or numbers) to REXX variables.

Note that REXX stores all the values of variables as characters, and that includes numbers (all kinds),
booleans (logical), and labels (including subroutine/function names).

However, to insure that REXX can store numbers with a minimum size (amount of decimal digits),
the NUMERIC DIGITS nnn REXX instruction can be used. This will ensure that the decimal
number can be stored without resorting to exponential notation (although exponential notation
can be forced via the format BIF (Built In Function).

The default for numeric digits is 9 (decimal) digits.

There's effectively no limit for the precision [or length] for REXX numbers (except for memory),
but eight million is probably the practical limit. <lang rexx>/*REXX program demonstrates on setting a variable (using a "minimum var size".*/ numeric digits 100 /*default: 9 (decimal digs) for numbers*/

/*── 1 2 3 4 5 6 7──*/ /*──1234567890123456789012345678901234567890123456789012345678901234567890──*/

z = 12345678901111111112222222222333333333344444444445555555555.66 n =-12345678901111111112222222222333333333344444444445555555555.66

                                      /* [↑]  these #'s are stored as coded. */
                                      /*stick a fork in it,  we're all done. */</lang>

Scala

<lang Scala>/* Ranges for variables of the primitive numeric types */ println(s"A Byte variable has a range of : ${Byte.MinValue} to ${Byte.MaxValue}") println(s"A Short variable has a range of : ${Short.MinValue} to ${Short.MaxValue}") println(s"An Int variable has a range of : ${Int.MinValue} to ${Int.MaxValue}") println(s"A Long variable has a range of : ${Long.MinValue} to ${Long.MaxValue}") println(s"A Float variable has a range of : ${Float.MinValue} to ${Float.MaxValue}") println(s"A Double variable has a range of : ${Double.MinValue} to ${Double.MaxValue}")</lang>

See it running in your browser by Scastie (JVM).

Tcl

In Tcl, most values are (Unicode) strings. Their size is measured in characters, and the minimum size of a string is of course 0. However, one can arrange, via write traces, that the value of a variable is reformatted to bigger size. Examples, from an interactive tclsh session: <lang Tcl>% proc format_trace {fmt _var el op} {upvar 1 $_var v; set v [format $fmt $v]}

% trace var foo w {format_trace %10s} % puts "/[set foo bar]/" / bar/

% trace var grill w {format_trace %-10s} % puts "/[set grill bar]/" /bar /</lang>..or limit its size to a certain length: <lang Tcl>% proc range_trace {n _var el op} {upvar 1 $_var v; set v [string range $v 0 [incr n -1]]}

% trace var baz w {range_trace 2} % set baz Frankfurt Fr</lang>

TXR

This task has many possible interpretations in many contexts.

For instance, there is a buffer type. When we create a buffer, we specify its length. Optionally, we can also specify how much storage is actually allocated. This will prevent re-allocations if the length is increased within that limit.

Here, the buffer holds eight zero bytes, but 4096 bytes is allocated to it:

Another situation, in the context of FFI, is that some structure needs to achieve some size, but we don't care about all of its members. We can add anonymous padding to ensure that it meets the minimum size. For instance, suppose we want to call uname, and we only care about retrieving the sysname:

1> (with-dyn-lib nil
     (deffi uname "uname" int ((ptr-out (struct utsname
                                          (sysname (zarray 65 char))
                                          (nil (array 512 uint)))))))
** warning: (expr-1:2) defun: redefining uname, which is a built-in defun
#:lib-0172
2> (defvar u (new utsname))
u
3> (uname u)
0
4> u
#S(utsname sysname "Linux" nodename nil release nil version nil machine nil
           domainname nil)

We have specified a FFI definition for utsname which lays down the sysname member to the correct system-specific array size, and then a generous amount of padding: 512 unsigned integers.

Anonymous padding can be specified anywhere in a FFI structure by using the slot name nil. The corresponding space will be reserved in the structure using the type of that slot, but the slot will not participate in any data conversions. FFI will not fill in that area of the structure when preparing data, and will not extract anything from that area in the reverse direction.

The padding prevents the uname function from accessing beyond the end of the memory that is passed to it.

We can, of course, determine the exact size of struct utsname we can specify the padding such that we know for certain that it meets or exceeds the requirement.

Ursala

There is no way to set the minimum size of natural, integer, or rational numbers, but no need because they all have unlimited precision.

For (mpfr format) arbitrary precision floating point numbers, there are several mechanisms for setting the minimum precision, although not the exact amount of real memory used.

If it's initialized from a literal constant, the compiler infers the intended precision from the number of digits in the constant (or 160 bits, whichever is greater).
The library function mpfr..grow(x,n) returns a copy of x with its precision increased by n bits (padded with zeros).
The library function mpfr..shrink(x,n) returns a copy of x with its precision reduced by n bits, or to MPFR_PREC_MIN, whichever is greater.
Library functions such as mpfr..pi and mpfr..const_catalan take a natural number specifying the precision as an argument and return a constant with at least that precision.
If two numbers of unequal precision are combined using any binary operation from the mpfr library, the result is computed and allocated using the greater precision of the two.

The last feature eliminates the need for explicitly setting the precision of numbers having exact representations, albeit contrary to the convention in physical sciences. <lang Ursala>p = mpfr..pi 200 # 200 bits of precision

x = mpfr..grow(1.0E+0,1000) # 160 default precision, grown to 1160

y = mpfr..shrink(1.0+0,40) # 160 default shrunk to 120

z = mpfr..add(p,y) # inherits 200 bits of precision

a = # 180 bits (not the default 160) because of more digits in the constant

1.00000000000000000000000000000000000000000000000000000E0</lang>

Vlang

<lang vlang>fn main() {

   b := true
   i := 5   // default type is i32
   r := '5'
   f := 5.0  // default type is float64
   println("b: ${sizeof(b)} bytes")
   println("i: ${sizeof(i)} bytes")
   println("r: ${sizeof(r)} bytes")
   println("f: ${sizeof(f)} bytes")
   i_min := i8(5)
   r_min := `5`
   f_min := f32(5.0)
   println("i_min: ${sizeof(i_min)} bytes")
   println("r_min: ${sizeof(r_min)} bytes")
   println("f_min: ${sizeof(f_min)} bytes")

}</lang>

Output:

b: 1 bytes
i: 4 bytes
r: 16 bytes
f: 8 bytes
i_min: 1 bytes
r_min: 4 bytes
f_min: 4 bytes

Wren

In Wren, variables are always exactly 8 bytes in size. They either store their value directly (Num, Bool or Null) or a reference (a 64 bit pointer) to where an object is stored in memory.

Technically, a variable always contains an 8 byte double precision floating point value but uses a technique known as 'NaN tagging' to store bools, nulls or pointers as well as numbers.

So there is no such thing as a minimum size for a variable or scalar type; the size of the former is always 8 bytes and of the latter however many 8 byte values are needed to store its fields.

Wren has four built-in collection types: String, Range, Map and List. The size of the first two is determined by their currently assigned value which is immutable but the size of the last two can change dynamically as elements are added or removed. Of course, a variable which has been assigned values of these types is always exactly 8 bytes in size and holds a reference to where the actual object is stored on the heap,

The programmer cannot specify a minimum size for a Map but can specify a minimum size for a List - in effect the amount of heap storage required to store its elements. This can still be increased dynamically by adding futher elements to the List. Here's an example. <lang ecmascript>// create a list with 10 elements all initialized to zero var l = List.filled(10, 0) // give them different values and print them for (i in 0..9) l[i] = i System.print(l) // add another element to the list dynamically and print it again l.add(10) System.print(l)</lang>

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

XPL0

<lang XPL0>include c:\cxpl\codes; \intrinsic 'code' declarations string 0; \use zero-terminated strings char S,

    A(1),              \sets up an array containing one byte
    B(0);              \sets up an array containing no bytes

int I; [S:= ""; \a zero-length (null) string A:= Reserve(1); \sets up a 1-byte array at runtime B:= Reserve(0); \sets up a 0-byte array at runtime I:= I ! 1<<3; \stores a single 1 bit into an integer I:= I & ~(1<<29); \stores a 0 bit into bit 29 of the integer IntOut(0, I>>3 & 1); \displays value of bit 3 ]</lang>

Other than arrays and strings, variables are a fixed size. Integers are four bytes and reals are eight bytes.

Z80 Assembly

Translation of: 6502 Assembly

Syntax will vary depending on the assembler and whether your program will run in RAM or ROM. For programs that execute from RAM such as those that are loaded from a disk, you can use the same syntax that you would use to define constants. When defining bytes as variables or constants, you do not prefix them with a # sign. Only in actual CPU instructions do you need to use a # to prevent the assembler from treating the numeric value as a memory location.

Since these are variables, the value given to them (in this case, 0) is the initial value, and can be changed later at runtime. If you don't care what the initial value is, some assemblers allow you to use a "?" where the 0s are.

<lang Z80>MyByte:

  byte 0       ;most assemblers will also accept DB or DFB

MyWord:

  word 0       ;most assemblers will also accept DW or DFW

MyDouble:

  dd 0</lang>

For programs that are executed solely from ROM, such as video game console cartridges, you won't be able to use the above method. The assembler can often use an enum or rsset directive to sequentially assign labels to a series of consecutive memory locations in the system's RAM.

<lang z80>.rsset $C000 ;starting at $C000, the following labels represent sequential memory locations of length ".rs n" tempByte .rs 1 tempWord .rs 2 tempLong .rs 4</lang>

Assemblers that don't have an enum or rsset directive can use the equ directive instead. This method lets you immediately see what each memory location actually is, but it makes it harder to insert a new one without having to redo all the numbering. When loading from an immediate memory location to a 16-bit register pair (e.g. LD HL,($4000), the "low register" is loaded with the byte stored at the memory location specified by the operand, and the "high register" is loaded with the byte stored at the memory location after that. <lang Z80>tempByte equ $C000 tempWord equ $C001 ;high byte at $C002 tempLong equ $C003 ;you have to track spacing yourself with this method</lang>

While setting a variable's size is easy, getting it isn't possible without knowing it in advance. The CPU does not (and cannot) know the intended size of a variable. There's no enforcement of types whatsoever on the Z80, anything goes. If, for example, you executed LD HL,($4000) and $4000 was intended to be the storage location of an 8-bit value, you'd get the intended value in L and whatever happened to be after it in memory into H. Whether that's a problem or not is up to the programmer, not the CPU.

zkl

It is up to the object to decide on size. For example, Ints and Floats are 8 bytes, Strings are immutable and are sized when created. Mutable lists and dictionaries grow and shrink as needed. Some mutable types (such as Lists and Dictionaries) can take [programmer supplied] hints as to how big they might become.

ZX Spectrum Basic

<lang basic>10 DIM a$(10): REM This array will be 10 characters long 20 DIM b(10): REM this will hold a set of numbers. The fixed number of bytes per number is implementation specific 30 LET c=5: REM this is a single numerical value of fixed size</lang>