...

Source file src/github.com/twitchyliquid64/golang-asm/obj/ppc64/doc.go

Documentation: github.com/twitchyliquid64/golang-asm/obj/ppc64

     1  // Copyright 2019 The Go Authors. All rights reserved.
     2  // Use of this source code is governed by a BSD-style
     3  // license that can be found in the LICENSE file.
     4  
     5  /*
     6  Package ppc64 implements a PPC64 assembler that assembles Go asm into
     7  the corresponding PPC64 instructions as defined by the Power ISA 3.0B.
     8  
     9  This document provides information on how to write code in Go assembler
    10  for PPC64, focusing on the differences between Go and PPC64 assembly language.
    11  It assumes some knowledge of PPC64 assembler. The original implementation of
    12  PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but
    13  updates to the Go assembly language used mnemonics that are mostly similar if not
    14  identical to the PPC64 mneumonics, such as VMX and VSX instructions. Not all detail
    15  is included here; refer to the Power ISA document if interested in more detail.
    16  
    17  Starting with Go 1.15 the Go objdump supports the -gnu option, which provides a
    18  side by side view of the Go assembler and the PPC64 assembler output. This is
    19  extremely helpful in determining what final PPC64 assembly is generated from the
    20  corresponding Go assembly.
    21  
    22  In the examples below, the Go assembly is on the left, PPC64 assembly on the right.
    23  
    24  1. Operand ordering
    25  
    26    In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
    27    the first operand (left) is the target. The order of the remaining operands is
    28    not consistent: in general opcodes with 3 operands that perform math or logical
    29    operations have their operands in reverse order. Opcodes for vector instructions
    30    and those with more than 3 operands usually have operands in the same order except
    31    for the target operand, which is first in PPC64 asm and last in Go asm.
    32  
    33    Example:
    34      ADD R3, R4, R5		<=>	add r5, r4, r3
    35  
    36  2. Constant operands
    37  
    38    In Go asm, an operand that starts with '$' indicates a constant value. If the
    39    instruction using the constant has an immediate version of the opcode, then an
    40    immediate value is used with the opcode if possible.
    41  
    42    Example:
    43      ADD $1, R3, R4		<=> 	addi r4, r3, 1
    44  
    45  3. Opcodes setting condition codes
    46  
    47    In PPC64 asm, some instructions other than compares have variations that can set
    48    the condition code where meaningful. This is indicated by adding '.' to the end
    49    of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
    50    the opcode. The possible settings of the condition code depend on the instruction.
    51    CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
    52    vector instructions.
    53  
    54    Example:
    55      ANDCC R3, R4, R5		<=>	and. r5, r3, r4 (set CR0)
    56  
    57  4. Loads and stores from memory
    58  
    59    In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
    60    is a memory reference, then it is a store; when the target is a register and the
    61    source is a memory reference, then it is a load.
    62  
    63    MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.
    64  
    65    Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
    66    Adding 'U' to a load or store indicates an update of the base register with the offset.
    67    Adding 'BR' to an opcode indicates byte-reversed load or store, or the order opposite
    68    of the expected endian order. If 'BR' is used then zero extend is assumed.
    69  
    70    Memory references n(Ra) indicate the address in Ra + n. When used with an update form
    71    of an opcode, the value in Ra is incremented by n.
    72  
    73    Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
    74    loads or stores. Both forms are accepted. When used with an update then the base register
    75    is updated by the value in the index register.
    76  
    77    Examples:
    78      MOVD (R3), R4		<=>	ld r4,0(r3)
    79      MOVW (R3), R4		<=>	lwa r4,0(r3)
    80      MOVWZU 4(R3), R4		<=>	lwzu r4,4(r3)
    81      MOVWZ (R3+R5), R4		<=>	lwzx r4,r3,r5
    82      MOVHZ  (R3), R4		<=>	lhz r4,0(r3)
    83      MOVHU 2(R3), R4		<=>	lhau r4,2(r3)
    84      MOVBZ (R3), R4		<=>	lbz r4,0(r3)
    85  
    86      MOVD R4,(R3)		<=>	std r4,0(r3)
    87      MOVW R4,(R3)		<=>	stw r4,0(r3)
    88      MOVW R4,(R3+R5)		<=>	stwx r4,r3,r5
    89      MOVWU R4,4(R3)		<=>	stwu r4,4(r3)
    90      MOVH R4,2(R3)		<=>	sth r4,2(r3)
    91      MOVBU R4,(R3)(R5)		<=>	stbux r4,r3,r5
    92  
    93  4. Compares
    94  
    95    When an instruction does a compare or other operation that might
    96    result in a condition code, then the resulting condition is set
    97    in a field of the condition register. The condition register consists
    98    of 8 4-bit fields named CR0 - CR7. When a compare instruction
    99    identifies a CR then the resulting condition is set in that field
   100    to be read by a later branch or isel instruction. Within these fields,
   101    bits are set to indicate less than, greater than, or equal conditions.
   102  
   103    Once an instruction sets a condition, then a subsequent branch, isel or
   104    other instruction can read the condition field and operate based on the
   105    bit settings.
   106  
   107    Examples:
   108      CMP R3, R4			<=>	cmp r3, r4	(CR0 assumed)
   109      CMP R3, R4, CR1		<=>	cmp cr1, r3, r4
   110  
   111    Note that the condition register is the target operand of compare opcodes, so
   112    the remaining operands are in the same order for Go asm and PPC64 asm.
   113    When CR0 is used then it is implicit and does not need to be specified.
   114  
   115  5. Branches
   116  
   117    Many branches are represented as a form of the BC instruction. There are
   118    other extended opcodes to make it easier to see what type of branch is being
   119    used.
   120  
   121    The following is a brief description of the BC instruction and its commonly
   122    used operands.
   123  
   124    BC op1, op2, op3
   125  
   126      op1: type of branch
   127          16 -> bctr (branch on ctr)
   128          12 -> bcr  (branch if cr bit is set)
   129          8  -> bcr+bctr (branch on ctr and cr values)
   130  	4  -> bcr != 0 (branch if specified cr bit is not set)
   131  
   132  	There are more combinations but these are the most common.
   133  
   134      op2: condition register field and condition bit
   135  
   136  	This contains an immediate value indicating which condition field
   137  	to read and what bits to test. Each field is 4 bits long with CR0
   138          at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
   139          with these condition values:
   140  
   141          0 -> LT
   142          1 -> GT
   143          2 -> EQ
   144          3 -> OVG
   145  
   146  	Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.
   147  
   148      op3: branch target
   149  
   150    Examples:
   151  
   152      BC 12, 0, target		<=>	blt cr0, target
   153      BC 12, 2, target		<=>	beq cr0, target
   154      BC 12, 5, target		<=>	bgt cr1, target
   155      BC 12, 30, target		<=>	beq cr7, target
   156      BC 4, 6, target		<=>	bne cr1, target
   157      BC 4, 1, target		<=>	ble cr1, target
   158  
   159      The following extended opcodes are available for ease of use and readability:
   160  
   161      BNE CR2, target		<=>	bne cr2, target
   162      BEQ CR4, target		<=>	beq cr4, target
   163      BLT target			<=>	blt target (cr0 default)
   164      BGE CR7, target		<=>	bge cr7, target
   165  
   166    Refer to the ISA for more information on additional values for the BC instruction,
   167    how to handle OVG information, and much more.
   168  
   169  5. Align directive
   170  
   171    Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
   172    that the next instruction should be aligned to the specified value. Currently
   173    8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
   174    to align the code. That means in the case where the code is aligned to 4 but
   175    PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
   176    adding 3 NOPs.
   177  
   178    The purpose of this directive is to improve performance for cases like loops
   179    where better alignment (8 or 16 instead of 4) might be helpful. This directive
   180    exists in PPC64 assembler and is frequently used by PPC64 assembler writers.
   181  
   182    PCALIGN $16
   183    PCALIGN $8
   184  
   185    Functions in Go are aligned to 16 bytes, as is the case in all other compilers
   186    for PPC64.
   187  
   188  6. Shift instructions
   189  
   190    The simple scalar shifts on PPC64 expect a shift count that fits in 5 bits for
   191    32-bit values or 6 bit for 64-bit values. If the shift count is a constant value
   192    greater than the max then the assembler sets it to the max for that size (31 for
   193    32 bit values, 63 for 64 bit values). If the shift count is in a register, then
   194    only the low 5 or 6 bits of the register will be used as the shift count. The
   195    Go compiler will add appropriate code to compare the shift value to achieve the
   196    the correct result, and the assembler does not add extra checking.
   197  
   198    Examples:
   199  
   200      SRAD $8,R3,R4		=>	sradi r4,r3,8
   201      SRD $8,R3,R4		=>	rldicl r4,r3,56,8
   202      SLD $8,R3,R4		=>	rldicr r4,r3,8,55
   203      SRAW $16,R4,R5		=>	srawi r5,r4,16
   204      SRW $40,R4,R5		=>	rlwinm r5,r4,0,0,31
   205      SLW $12,R4,R5		=>	rlwinm r5,r4,12,0,19
   206  
   207    Some non-simple shifts have operands in the Go assembly which don't map directly
   208    onto operands in the PPC64 assembly. When an operand in a shift instruction in the
   209    Go assembly is a bit mask, that mask is represented as a start and end bit in the
   210    PPC64 assembly instead of a mask. See the ISA for more detail on these types of shifts.
   211    Here are a few examples:
   212  
   213      RLWMI $7,R3,$65535,R6 	=>	rlwimi r6,r3,7,16,31
   214      RLDMI $0,R4,$7,R6 		=>	rldimi r6,r4,0,61
   215  
   216    More recently, Go opcodes were added which map directly onto the PPC64 opcodes. It is
   217    recommended to use the newer opcodes to avoid confusion.
   218  
   219      RLDICL $0,R4,$15,R6		=>	rldicl r6,r4,0,15
   220      RLDICR $0,R4,$15,R6		=>	rldicr r6.r4,0,15
   221  
   222  Register naming
   223  
   224  1. Special register usage in Go asm
   225  
   226    The following registers should not be modified by user Go assembler code.
   227  
   228    R0: Go code expects this register to contain the value 0.
   229    R1: Stack pointer
   230    R2: TOC pointer when compiled with -shared or -dynlink (a.k.a position independent code)
   231    R13: TLS pointer
   232    R30: g (goroutine)
   233  
   234    Register names:
   235  
   236    Rn is used for general purpose registers. (0-31)
   237    Fn is used for floating point registers. (0-31)
   238    Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31)
   239    VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63)
   240    CTR represents the count register.
   241    LR represents the link register.
   242  
   243  */
   244  package ppc64
   245  

View as plain text