Data Type

In a computer, everything is a 0 or a 1 (a bit), and the basic unit of memory is the byte (8 bits). The common data sizes are:

  • Byte

  • Word (2 Bytes)

  • Double Word (4 Bytes)

  • Quad Word (8 Bytes)

  • Double Quad Word (16 Bytes)

Defining data

Let's first look at initialised (static) data.

To define data of a specific size, we use the d[format] form.

Example: to define a byte we use db (b for byte), for a word dw (w for word), for a double word dd (d for double word), and so on.

db 0x5 ; defines one byte with the value 0x5

db 0x5,0x7,0x8 ; three bytes defined at once

db 'a' ; character constants are also allowed

db 'hello' ; string constants are also allowed

dw 0x1234 ; stored as 0x34 0x12

dw 'a' ; 0x61 0x00; the 0x00 pads the character out to a full word

dw 'ab' ; 0x61 0x62 stored

dd 0x12345678 ; stored as 0x78 0x56 0x34 0x12
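The byte sequences in the comments above can be reproduced without an assembler. The following Python sketch is only an illustration: the helper `emit` and the `SIZES` table are hypothetical names, not part of NASM, and they simply mimic how a single db/dw/dd operand is laid out on a little-endian target.

```python
# Hypothetical helper (not part of NASM): mimic the bytes that one
# db/dw/dd operand occupies in memory on a little-endian target.
SIZES = {"db": 1, "dw": 2, "dd": 4}  # unit size in bytes per directive


def emit(directive, value):
    """Return the bytes a single operand would occupy in memory."""
    size = SIZES[directive]
    if isinstance(value, str):
        # Character and string constants are stored in source order,
        # then zero-padded up to a multiple of the unit size.
        raw = value.encode("ascii")
        return raw + b"\x00" * ((-len(raw)) % size)
    # Numeric constants are stored least significant byte first.
    return value.to_bytes(size, "little")


print(emit("db", 0x5).hex(" "))         # 05
print(emit("dw", 0x1234).hex(" "))      # 34 12
print(emit("dw", "a").hex(" "))         # 61 00
print(emit("dw", "ab").hex(" "))        # 61 62
print(emit("dd", 0x12345678).hex(" "))  # 78 56 34 12
```

Note that a string keeps its characters in source order; only numeric constants have their bytes reversed.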

Wait, what? 0x1234 is stored as 0x34 0x12 rather than 0x12 0x34, and likewise 0x12345678 is stored as 0x78 0x56 0x34 0x12.

The answer lies in endianness.

Endianness is the order in which the bytes of a multi-byte value are stored in memory.

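There are two types of endianness:

  • Little Endian: the least significant byte is stored at the lowest address

  • Big Endian: the most significant byte is stored at the lowest address

The Intel 32-bit architecture uses the little-endian format. Therefore, 0x1234 is stored as 0x34 0x12.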

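A visual representation of data stored in little-endian format in memory

Take 0x12345678: 0x12 is the most significant byte (MSB), followed by 0x34 and 0x56, and 0x78 is the least significant byte (LSB).

In little-endian order the least significant byte goes to the lowest address, so memory holds 0x78 0x56 0x34 0x12.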
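The same ordering can be checked with Python's built-in `int.to_bytes`. This is a minimal sketch for illustration, and "base" stands for a hypothetical starting address, not one from the program analysed here.

```python
value = 0x12345678

little = value.to_bytes(4, "little")  # least significant byte first
big = value.to_bytes(4, "big")        # most significant byte first

# Show which byte lands at each offset from a hypothetical base address.
for offset, byte in enumerate(little):
    print(f"base+{offset}: 0x{byte:02x}")

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# Reading the little-endian bytes back recovers the original value.
restored = int.from_bytes(little, "little")
print(hex(restored))    # 0x12345678
```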


Now let's look at uninitialised data, i.e. reserving memory.

buff resb 64 ; reserve 64 bytes; buff is the label for the reserved space

wordVar resw 1 ; reserve 1 word (2 bytes); wordVar is the label for the reserved space
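The amount of memory a reservation takes is simply the count multiplied by the unit size. As a rough sketch (the function `reserved_bytes` and the `UNIT` table are hypothetical names used only for this illustration):

```python
# Unit size in bytes for NASM's res* directives used in this section,
# plus resd/resq for double words and quad words.
UNIT = {"resb": 1, "resw": 2, "resd": 4, "resq": 8}


def reserved_bytes(directive, count):
    """Return how many bytes `directive count` sets aside."""
    return UNIT[directive] * count


print(reserved_bytes("resb", 64))  # 64
print(reserved_bytes("resw", 1))   # 2
```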

declare.nasm :

Compiling and Linking:

Analysis:

  • Set breakpoint at _start

  • run

  • Check the variables and their addresses

  • Analyse how the data is stored at those addresses.

->0x0804a000 decByte ; Here we are declaring a byte, so the next variable is at 0x0804a000 + 0x1

->0x0804a001 decByteSeries ; Here we are declaring 3 bytes, so the next variable is at 0x0804a001 + 0x3

->0x0804a004 decWord ; Here we are declaring a word (2 bytes), so the next variable is at 0x0804a004 + 0x2

->0x0804a006 decDouble ; Here we are declaring a double word (4 bytes), so the next variable is at 0x0804a006 + 0x4

->0x0804a00a message ; Here we are declaring a series of 15 bytes, so the next symbol is at 0x0804a00a + 0xf. 0xf in hex is 15 in decimal. hex(0xa + 0xf) = 0x19

->0x0804a019 __bss_start

->0x0804a019 _edata

->0x0804a01c reserveByte ; Here we reserved a buffer of 80 bytes, so the next variable is at 0x0804a01c + 0x50. 0x50 in hex is 80 in decimal. hex(0x1c + 0x50) = 0x6c. The 3-byte gap after 0x0804a019 is presumably alignment padding before the reserved data.

->0x0804a06c reserveWord

->0x0804a078 _end
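The address arithmetic in the listing above can be replayed mechanically. The sketch below is illustrative only: the function name `layout` is hypothetical, and the sizes are the ones stated in the listing, not read from the actual binary.

```python
DATA_START = 0x0804A000

# (name, size in bytes), as stated in the listing above
data_symbols = [
    ("decByte", 1),
    ("decByteSeries", 3),
    ("decWord", 2),
    ("decDouble", 4),
    ("message", 15),
]


def layout(start, symbols):
    """Place symbols back to back; return their addresses and the first free address."""
    addresses = {}
    addr = start
    for name, size in symbols:
        addresses[name] = addr
        addr += size
    return addresses, addr


addresses, data_end = layout(DATA_START, data_symbols)
for name, addr in addresses.items():
    print(f"0x{addr:08x} {name}")
print(f"0x{data_end:08x} end of initialised data")  # 0x0804a019

# The 80-byte buffer at reserveByte ends where reserveWord begins.
RESERVE_BYTE = 0x0804A01C
after_buffer = RESERVE_BYTE + 80
print(f"0x{after_buffer:08x} reserveWord")  # 0x0804a06c
```

The computed end of the initialised data matches the `_edata` address in the listing.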

We can check the values stored in variables in two ways:

  • &variable_name

  • Address_of_variable

Here we are examining a byte variable, so we use x/xb to display one byte (b) in hex format (x).

x/xb &VarName

x/xb Address

Similarly, for a series of bytes we use x/3xb to display 3 bytes in hex format.

x/3xb &varName

💡 Why is decByteSeries stored as 0xaa 0xbb 0xcc and not in little-endian order? Because the bytes were specified individually (0xaa, 0xbb, 0xcc), not as a single value (0xaabbcc).

In the decDouble variable, by contrast, the data is stored in little-endian order, because 0x12345678 was initialised as a single value.
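The difference between listing bytes one by one and giving one multi-byte value can be sketched in Python. The third case, writing the three bytes as a single double word, is a hypothetical contrast and does not appear in the program analysed here.

```python
# db 0xaa, 0xbb, 0xcc : three separate one-byte values, kept in source order
series = bytes([0xAA, 0xBB, 0xCC])

# dd 0x12345678 : one four-byte value, least significant byte first
whole = (0x12345678).to_bytes(4, "little")

# Hypothetical contrast: the same three bytes written as one value, dd 0xaabbcc
packed = (0xAABBCC).to_bytes(4, "little")

print(series.hex(" "))  # aa bb cc
print(whole.hex(" "))   # 78 56 34 12
print(packed.hex(" "))  # cc bb aa 00
```

A single byte has no internal order to reverse, which is why a list of db values always appears in memory exactly as written.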

For uninitialised variables, memory is reserved but nothing has been stored in it yet.
