ICode9

精准搜索请尝试: 精确搜索
首页 > 其他分享> 文章详细

重读&笔记系列-《深入理解计算机系统》第三章-part1,part2

2021-05-05 23:00:47  阅读:151  来源: 互联网

标签:goto Dest unsigned long part1 part2 重读 rax result


Machine Programming I

Machine Programming I: Basic

History of Intel processors and architectures

C, assembly, machine code

Definition

  • Architecture: also ISA instruction set architecture) The parts of a processor design that one needs to understand or write assembly/machine code.

    • Example: instruction set specification, registers.
  • MicroArchitecture: Implementation of the architecure.

    • Example: cache sizes and core frequency.
  • Code Forms:

    • Machine Code: The byte-level programs that a processor executes.
    • Assembly Code: A text representation of machine code
  • Example ISAs:

    • Intel: x86, IA32, Itanium, x86-64
    • ARM: Used in almost all mobile phones

Programmer-Visible State

  • PC: Program counter
    • Address of next instruction
    • Called “RIP” (x86-64)
  • Register file
    • Heavily used program data
  • Condition codes
    • Store status information about most recent architecture or logical operation
    • Used for conditional branching
  • Memory
    • Byte addressable array
    • Code and user data
    • Stack to support procedures

Turning C into Object Code

  • Code in file p1.c p2.c
  • Compile with command: gcc -Og p1.c p2.c -o p
    • Use basic optimizations(-Og) [New to recent version of GCC]
    • Put resulting binary in file p

Compiling Into Assembly

C Code (sum.c)

long plus(long x, long y);

void sumstore(long x, long y, long *dest)
{
    long t = plus(x, y);
    *dest = t;
}

Generated x86-64 Assembly

sumstore:
	pushq	%rbx
	movq	%rdx, %rbx
	call 	plus
	movq	%rax, (%rbx)
	popq	%rbx
	ret

Obtain(on shark machine) with command

gcc -Og -S sum.c

Produces file sum.s

Assembly Characteristics: Data Types

  • "Integers" data of 1, 2, 4 or 8 bytes
    • Data value
    • Addresses(untyped pointers)
  • Floating point data of 4, 8 or 10 bytes
  • Code: Byte sequences encoding series of instructions
  • No aggregate types such as arrays or structures
    • Just contiguously allocated bytes in memory

Assembly Characteristics: Operations

  • Perform arithmetic function on register or memory data

  • Transfer data between memory and register

    • Load data from memory into register
    • Store register data into memory
  • Transfer control

    • Unconditional jumps to/from procedures
    • Conditional branches

Object Code

  • Assembler
    • Translates .s into .o
    • Binary encoding of each instruction
    • Nearly-complete image of executable code
    • Missing linkages between code in different files
  • Linker
    • Resolves references between files
    • Combines with static run-time libraries
      • E.g., code for malloc, printf
    • Some libraries are dynamically linked
      • Linking occurs when program begin execution

Machine Instruction Example

  • C code

    *dest = t;
    
    • Store value t where designated by dest
  • Assambly

    movq %rax, (%rbx)
    
    • Move 8-byte value to memory

      • Quad words in x86-64 parlance
    • Operands:

      t: Register %rax

      dest: Register %rbx

      *dest: Memory M[%rbx]

  • Object Code

    0x40059e:	48 89 03
    
    • 3-byte instruction
    • Stored at address 0x40059e

Disassembling Object Code

  • Disassembled
0000000000400595 <sumstore>:
	400595:	53					push 	%rbx
	400596:	48 89 d3			mov		%rdx, %rbx
	400599:	e8 f2 ff ff ff		callq	400590 <plus>
	40059e:	48 89 03			mov		%rax, (%rbx)
	4005a1:	5b					pop		%rbx
	4005a2:	c3					retq
  • Disassembler

    objdump -d sum
    
    • Useful tool for examining object code
    • Analyzes bit pattern of series of instructions
    • Produces approximate rendition of assembly code
    • Can be run on either a.out or .o file

    Alternate Disassembly

    • Within gdb Debugger

      ​ gdb sum

      ​ disassemble sumstore

      • Disassemble procedure

        x/14xb sumstore

      • Examine the 14 bytes starting at sumstore

What Can be Disassembled?

  • Anything that can be interpreted as executable code
  • Disassembler examines bytes and reconstructs assembly

Assembly Basics: Registers, operands, move

Moving Data

  • Moving Data

    moveq Source, Dest:

  • Operand Types

    • Immediate: Constant integer data
      • Example: $0x400, $-533
      • Like C constant, but prefixed with ‘$’
      • Encoded with 1, 2, or 4 bytes
    • Register: One of 16 integer registers
      • Example: %rax, %r13
      • But %rsp reserved for special use
      • Others have special uses for particular instructions
    • Memory: 8 consecutive bytes of memory at address given by register
      • Simplest example: (%rax)
      • Various other “address modes”

moveq Operand Combinations

Source -> DestSrc, DestC Analog
Imm -> Regmoveq $0x4, %raxtemp = 0x4;
Imm -> Memmoveq %-147, (%rax)*p = -147;
Reg -> Regmoveq %rax, %rdxtemp2 = temp1;
Reg -> Memmoveq %rax, (%rdx)*p = temp;
Mem -> Regmoveq (%rax), rdxtemp = *p;

Simple Memory Addressing Modes

  • Normal ® Mem[Reg[R]]

    • Register R specifies memory address

    • Aha! Pointer dereferencing in C

      moveq (%rcx), %rax

  • Displacement D® Mem[Reg[R]+D]

    • Register R specifies start of memory region

    • Constant displacement D specifies offset

      moveq 8(%rbp), %rdx

Complete Memory Addressing Modes

  • Most General Form D(Rb, Ri, S) Mem[Reg[Rb]+S*Reg[Ri]+D]

    • D: Constant “displacement” 1, 2 or 4 bytes
    • Rb: Base register: Any of 16 integer registers
    • Ri: Index register: Any, except for %rsp
    • S: Scale: 1, 2, 4, or 8 (why these numbers?)
  • Special Cases

    ​ (Rb, Ri) Mem[Reg[Rb]+Reg[Ri]]

    ​ D(Rb, Ri) Mem[Reg[Rb]+Reg[Ri]+D]

    ​ (Rb, Ri, S) Mem[Reg[Rb]+S*Reg[Ri]]

Address Computation Instruction

  • leaq Src, Dst

    • Src is address mode expression
    • Set Dst to address denoted by expression
  • Uses

    • Computing addresses without a memory reference
      • E.g., translation of p = &x[i];
    • Computing arithmetic expressions of the form x + k*y
      • k = 1, 2, 4, or 8
  • Example

    long m12(long x) {
        return x*12;
    }
    
    leaq (%rdi, %rdi, 2), %rax 		# t <- x + x*2
    salq $2, %rax				 	# return t
    

Some Arithmetic Operations

  • Two Operand Instructions:

    FormatComputation
    addq Src, DestDest = Dest + Src
    subq Src, DestDest = Dest - Src
    imulq Src, DestDest = Dest * Src
    salq Src, DestDest = Dest << Src
    sarq Src, DestDest = Dest >> Src
    shrq Src, DestDest = Dest >> Src
    xorq Src, DestDest = Dest ^ Src
    andq Src, DestDest = Dest & Src
    orq Src, DestDest = Dest | Src
  • Watch out for argument order!

  • No distinction between signed and unsigned int (why?)

  • One Operand Instructions

    incq Dest Dest = Dest + 1

    decq Dest Dest = Dest - 1

    negq Dest Dest = -Dest

    notq Dest Dest = ~Dest

  • See book for more instructions

Summary

  • History of Intel processors and architectures
    • Evolutionary design leads to many quirks and artifacts
  • C, assembly, machine code
    • New forms of visible state: program counter, registers, …
    • Complier must transform statements, expressions, procedures into low-level instruction sequences
  • Assembly Basics: Register, operands, move
    • The x86-64 move instructions cover wide range of data movement forms
  • Arithmetic
    • C compiler will figure out different instruction combinations to carry out computation

Machine-Level Programming II: Control

Control: Condition codes

  • Information about currently executing program
    • Temporary data (%rax, …)
    • Location of runtime stack (%rsp)
    • Location of current code control point (%rip, …)
    • Status of recent tests (CF, ZF, SF, OF)

condition Code (Implicit Setting)

  • Single bit registers

    • CF Carry Flag (for unsigned)
    • SF Sign Flag (for signed)
    • ZF Zero Flag
    • OF Overflow Flag (for signed)
  • Implicitly set (think of it as side effect) by arithmetic operations

    • Example: addq Src, Dest -> t = a+b

    • CF set if carry out from most significant bit (unsigned overflow)

    • ZF set if t == 0

    • SF set if t < 0 (as signed)

    • Of set if two’s -complement (signed) overflow

      (a > 0 && b > 0 && t < 0) || (a < 0 && b < 0 && t >= 0)

  • Not set by leaq instruction

  • Explicit Setting by Compare Instruction

    • cmpq Src2, Src1

    • cmpq b, a like computing a-b without setting destination

    • CF set if carry out from most significant bit (used for unsigned comparisons)

    • ZF set if a == b

    • SF set if (a-b) < 0 (as signed)

    • OF set if two’s -complement (signed) overflow

      (a > 0 && b > 0 && (a-b) < 0) || (a < 0 && b < 0 && (a-b) >= 0)

  • Explicit Seting by Test instruction

    • testq Src2, Src1

      • testq b, a like computing a&b without setting destination
    • Sets condition codes based on value of Src1 & Src2

    • Useful to have one of the operands be a mask

    • ZF set when a&b == 0

    • SF set when a&b < 0

  • SetX Instructions

    • Set low-order byte of destination to 0 or 1 based on combinations of condition codes

    • Does not alter remaining 7 bytes

      SetXConditionDescription
      seteZFEqual / Zero
      setne~ZFNot Equal / Not Zero
      setsSFNegative
      setns~SFNonnegative
      setg~(SF^OF) & ~ZFGreater (Signed)
      setge~(SF^OF)Greater or Equal (Signed)
      setl(SF^OF)Less (Signed)
      setle(SF^OF) | ZFLess or Equal (Signed)
      setaCF&ZFAbove (unsigned)
      setbCFBelow (unsigned)

Reading Condition Codes (Cont.)

  • SetX Instructions:

    • Set single byte based on combination of condition codes
  • One of addressable byte registers

    • Does not alter remaining bytes

    • Typically use movzbl to finish job

      • 32-bit instruction also set upper 32bits to 0
      int gt (long x, long y) 
      {
          return x > y;
      }
      
      # %rdi is x, %rsi is y, %rax is Return value
      cmpq	%rsi, %rdi		# Compare x:y
      setg	%al			# Set when >
      movzbl	%al, %eax	# Zero rset of %rax
      ret
      

Conditional branches

Jumping

  • jX Instructions

    • Jump to different part of code depending on condition codes

      jXConditionDescription
      jmp1Unconditional
      jeZFEqual / Zero
      jne~ZFNot Equal / Not Zero
      jsSFNegative
      jns~SFNonnegative
      jg~(SF^OF) & ~ZFGreater (Signed)
      jge~(SF^OF)Greater or Equal (Signed)
      jl(SF^OF)Less (Signed)
      jle(SF^OF) | ZFLess or Equal (Signed)
      ja~CF & ZFAbove (unsigned)
      jbCFBelow (unsigned)

Conditional Branch Example (Old Style)

  • Generation

    shark> gcc -Og -S -fno-if-conversion control.c

    long absdiff (long x, long y)
    {
        long result;
        if (x > y)
            result = x-y;
        else
            result = y-x;
        return result;
    }
    
    # x in %rdi, y in %rsi
    absdiff:
    	cmpq	%rsi, %rdi		# x:y
    	jle		.L4
    	movq	%rdi, %rax
    	subq	%rsi, %rax
    	ret
    .L4:		# x <= y
    	movq	%rsi, %rax
    	subq	%rdi, %rax
    	ret
    

Expressing with Goto Code

  • C allows goto statement

  • Jump to position designated by label

    long absdiff (long x, long y)
    {
        long result;
        if (x > y)
            result = x-y;
        else
            result = y-x;
        return result;
    }
    
    long absdiff_j (long x, long y)
    {
        long result;
        int ntest = x <= y;
        if (ntest) goto Else;
        result = x-y;
        goto Done;
        Else:
            result = y-x;
        Done:
        	return result;
    }
    

General Conditional Expression Translation (Using Branches)

C code

val = Test ? Then_Expr : Else_Expr;

Goto Version

	ntest = !Test;
	if (ntest) goto Else;
	val = Then_Expr;
	goto Done;
Else:
	val = Else_Expr;
Done:
	...
  • Create separate code regions for then & else expressions
  • Execute appropriate one

Using Conditional Moves

  • Conditional Move Instructions

    • Instruction supports:

      if (Test) Dest <- Src

    • Supported in post-1995 x86 processors

    • GCC tries to use them

      • But, only when know to be safe
  • Why?

    • Branches are very disruptive to instruction flow through pipelines
    • Conditional moves do not require control transfer
val = Test ? Then_Expr : Else_Expr;
result = Then_Expr;
eval = Else_Expr;
nt = !Test;
if (nt) result = eval;
return result;

Conditional Move Example

long absdiff (long x, long y)
{
    long result;
    if (x > y)
        result = x-y;
    else
        result = y-x;
    return result;
}
# x in %rdi, y in %rsi
absdiff:
	movq	%rdi, %rax	# x
	subq	%rsi, %rax	# result = x-y
	movq	%rsi, %rdx
	subq	%rdi, %rdx	# eval = y-x
	cmpq	%rsi, %rdi	# x:y
	cmovle	%rdx, %rax	# if <=, result = eval
	ret

Bad Cases for Conditional Move

Expensive Computations

val = Test(x) ? Hard1(x) : Hard2(x);
  • Both values get computed
  • Only makes sense when computations are very simple

Risky Computations

val = p ? *p : 0;
  • Both values get computed
  • May have undesirable effects

Computations with side effects

val = x > 0 ? x*=7 : x+=3;
  • Both values get computed
  • Must be side-effect free

Loops

“Do-While” Loop Example

C code

long pcount_do (unsigned long x)
{
    long result = 0;
    do {
        result += x & 0x1;
        x >>= 1;
    } while (x);
    return result;
}

Goto Version

long pcount_goto (unsigned long x)
{
    long result = 0;
loop:
	result += x & 0x1;
    x >>= 1;
    if (x) goto loop;
    return result;
}
  • count number of 1’s in argument x (“popcount”)
  • Use conditional branch to either continue looping or to exit loop

“Do-While” Loop Compilation

# x in %rdi
movl	$0, %eax	# result = 0
.L2:
movq 	%rdi, %rax	# loop:
andl	$1, %edx	# t = x & 0x1
addq	%rdx, %rax	# result += t
shrq	%rdi		# x >>= 1
jne		L2			# if (x)
rep; ret

General “Do-While” Translation

C Code

do
	Body
	while (Test);

Goto Version

loop:
	Body
    if (Test)
    goto loop
  • Body: {

    ​ Statement1;

    ​ Statement2;

    ​ …

    ​ Statementn;

    }

General “While” Translation #1

  • “Jump-to-middle” translation
  • Used with -Og

While version

while (Test)
    Body

Goto Version

	goto test;
loop:
	Body
test:
	if (Test)
        goto loop;
done:

While Loop Example #1

C Code

long pcount_while (unsigned long x)
{
    long result = 0;
    while (x) {
        result += x & 0x1;
        x >>= 1;
    }
    return result;
}

Jump to Middle

long pcount_goto_jtm (unsigned long x)
{
    long result = 0;
    goto test;
loop:
    result += x & 0x1;
    x >>= 1;
test:
    if (x) goto loop;
    return result;
}
  • Compare to do-while version of function
  • Initial goto starts loop at test

General “While” Translation #2

While Version

while (Test)
    Body

Do-While Version

if (!Test)
    goto done;
do
    Body
    while (Test);
done:

Goto Version

if (!Test)
    goto done;
loop:
	Body
    if (Test)
        goto loop;
done:
  • “Do-while” conversion
  • Used with -O1

While Loop Example #2

C Code

long pcount_while (unsigned long x)
{
    long result = 0;
    while (x) {
        result += x & 0x1;
        x >>= 1;
    }
    return result;
}

Do-While Version

long pcount_goto_dw (unsigned long x)
{
    long result = 0;
    if (!x) goto done;
loop:
    result += x & 0x1;
    x >>= 1;
    if (x) goto loop;
done:
    return result;
}
  • Compare to do-while version of function
  • Initial conditional guards entrance(入口) to loop

“For” Loop Form

General Form

for (Init; Test; Update)
    Body
#define WSIZE 8*sizeof(int)
long pcount_for (unsigned long x)
{
    size_t i;
    long result = 0;
    for (i = 0; i < WSIZE; i++)
    {
        unsigned bit = (x >> i) & 0x1;
        result += bit;
    }
    return result;
}

Init

i = 0

Test

i < WSIZE

Update

i++

Body

{
    unsigned bit = (x >> i) & 0x1;
	result += bit;
}

“For” Loop -> While Loop

For Version

for (Init; Test; Update)
	Body

While Version

Init;
while (Test) {
    Body
    Update;
}

For-While Conversion

long pcount_for_while (unsigned long x)
{
    size_t i;
    long result = 0;
    i = 0;
    while (i < WSIZE)
    {
        unsigned bit = (x >> i) & 0x1;
        result += bit;
        i++;
    }
    return result;
}

“For” Loop Do-While Conversion

C Code

long pcount_for (unsigned long x)
{
    size_t i;
    long result = 0;
    for (i = 0; i < WSIZE; i++)
    {
        unsigned bit = (x >> i) & 0x1;
        result += bit;
    }
    return result;
}

Goto Version

long pcount_for (unsigned long x)
{
    size_t i;
    long result = 0;
    i = 0;	//Init
loop:
    {	//Body
        unsigned bit = (x >> i) & 0x1;	
        result += bit;
    }
    i++;	//Update
    if (i < WSIZE)	//Test
        goto loop;
done:
    return result;
}
  • Init test can be optimized away

Switch Statements

Switch Statement Example

long switch_eg (long x, long y, long z)
{
    long w = 1;
    switch (x) {
    case 1:
    	e = y*z;
        break;
   	case 2:
    	w = y/z;
        // Fall Through
    case 3:
        w += z;
        break;
    case 5:
    case 6:
        w -= z;
        break;
    default:
    	w = 2;
    }
}
  • Multiple case labels
    • Here: 5 & 6
  • Fall through cases
    • Here: 2
  • Missing cases
    • Here: 4

Jump Table Structure

Switch Form

switch (x) {
case val_0:
	Block 0  
case val_1:
	Block 1    
	...
case val_n-1:
	Block n-1     
}    

Jump Table

jtab:
	Targ0
	Targ1
	Targ2
	...
	Targn-1

Jump Targets

Targ0:
	Code Block 0
Targ1:
	Code Block 1
Targ2:
	Code Block 2
...
Targn-1:
	Code Block n-1

Translation (Extended C)

goto *JTab[x];

Switch Statement Example

long switch_eg (long x, long y, long z)
{
    long w = 1;
    switch (x) {
    	...
    }
    return w;
}

Setup:

# x in %rdi, y in %rsi, z in %rdx
switch_eg:
	movq	%rdx, %rcx
	cmpq	$6, %rdi	# x:6
	ja		.L8			# 使用无符号值比较,负数可以视为很大的值 Use default
	jmp		*.L4(, %rdi, 8)	# goto *JTab[x]

What range of values take default?

Jump table

.section	.rodata
	.align 8
.L4
	.quad	.L8	# x = 0
	.quad	.L3	# x = 1
	.quad	.L5	# x = 2
	.quad	.L9	# x = 3
	.quad	.L8	# x = 4
	.quad	.L7	# x = 5
	.quad	.L7	# x = 6

Code Blocks (x == 1)

switch (x) {
case 1:		// .L3
	w = y*z;
    break;
    ...
}
# x in %rdi, y in %rsi, z in %rdx
.L3
	movq	%rsi, %rax	# y
	imulq	%rdx, %rax	# y*z
	ret

Code Blocks (x == 2, x == 3)

switch (x) {
case 2:	
	w = y/z;
    //Fall Through
case 3:	
	w += z;
    break;
    ...
}
# x in %rdi, y in %rsi, z in %rdx
.L5						# Case 2
	movq	$rsi, $rax
	cqto
	idivq	$rcx		# y/z
	jmp		.L6			# goto merge
.L9						# Case 3
	movl	%1, %eax	# w = 1
.L6						# merge:
	addq	%rcx, %rax	# w += z
	ret

Code Blocks (x == 5, x == 6, default)

switch (x) {
case 5:		// .L7
case 6:		// .L7
	w -= z;
    break;
default: 	// .L8
    w = 2
}
# x in %rdi, y in %rsi, z in %rdx
.L7						# Case 5,6
	movql	$1, $eax	# w = 1
	subq	%rdx, %rax	# w -= z
	ret
.L8						# Default:
	movl	$2, %eax	# 2
	ret

Summary

  • C Control
    • if-then-else
    • do-while
    • while, for
    • switch
  • Assembler Control
    • Conditional jump
    • Conditional move
    • Indirect jump (via jump tables)
    • Compiler generates code sequence to implement more complex control
  • Standard Techniques
    • Loop converted to do-while or jump-to-middle form
    • Large switch statements use jump tables
    • Sparse switch statements may use decision trees (if-elseif-elseif-else)

标签:goto,Dest,unsigned,long,part1,part2,重读,rax,result
来源: https://blog.csdn.net/qq_38537503/article/details/116431733

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有