doc.go

Documentation: github.com/twitchyliquid64/golang-asm/obj/arm64

     1  // Copyright 2018 The Go Authors. All rights reserved.
     2  // Use of this source code is governed by a BSD-style
     3  // license that can be found in the LICENSE file.
     4  
     5  /*
     6  Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
     7  syntax, but we can still follow the general rules to map between them.
     8  
     9  Instruction mnemonics mapping rules
    10  
    11  1. Most instructions use width suffixes of instruction names to indicate operand width rather than
    12  using different register names.
    13  
    14    Examples:
    15      ADC R24, R14, R12          <=>     adc x12, x14, x24
    16      ADDW R26->24, R21, R15     <=>     add w15, w21, w26, asr #24
    17      FCMPS F2, F3               <=>     fcmp s3, s2
    18      FCMPD F2, F3               <=>     fcmp d3, d2
    19      FCVTDH F2, F3              <=>     fcvt h3, d2
    20  
    21  2. Go uses .P and .W suffixes to indicate post-increment and pre-increment.
    22  
    23    Examples:
    24      MOVD.P -8(R10), R8         <=>      ldr x8, [x10],#-8
    25      MOVB.W 16(R16), R10        <=>      ldrsb x10, [x16,#16]!
    26      MOVBU.W 16(R16), R10       <=>      ldrb x10, [x16,#16]!
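
As a rough illustration of rule 2, the following Go sketch chooses the suffix from the shape of a
GNU address operand. The helper name indexSuffix is hypothetical and the string matching is
deliberately naive; it does not reflect how the assembler itself parses operands.

```go
package main

import (
	"fmt"
	"strings"
)

// indexSuffix returns the Go opcode suffix for a GNU-syntax address operand.
// Hypothetical helper, for illustration only.
func indexSuffix(gnuAddr string) string {
	addr := strings.ReplaceAll(gnuAddr, " ", "")
	switch {
	case strings.HasSuffix(addr, "]!"):
		return ".W" // pre-index form, e.g. [x16,#16]!
	case strings.Contains(addr, "],"):
		return ".P" // post-index form, e.g. [x10],#-8
	default:
		return "" // base-only or plain offset addressing
	}
}

func main() {
	fmt.Println("MOVD" + indexSuffix("[x10],#-8"))  // MOVD.P
	fmt.Println("MOVB" + indexSuffix("[x16,#16]!")) // MOVB.W
	fmt.Println("MOVD" + indexSuffix("[x19,#384]")) // MOVD
}
```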
    27  
    28  3. Go uses a family of MOV instructions for loads and stores.
    29  
    30  64-bit variant ldr, str, stur => MOVD;
    31  32-bit variant str, stur, ldrsw => MOVW;
    32  32-bit variant ldr => MOVWU;
    33  ldrb => MOVBU; ldrh => MOVHU;
    34  ldrsb, sturb, strb => MOVB;
    35  ldrsh, sturh, strh =>  MOVH.
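
The table above can be sketched as a small Go lookup. The function goMove and its bits parameter
are assumptions made for this illustration; the width only matters for ldr, str and stur, which
are the mnemonics the list splits into 32-bit and 64-bit variants.

```go
package main

import "fmt"

// goMove maps a GNU load/store mnemonic and its register width in bits to
// the Go mnemonic listed in rule 3. Hypothetical helper, for illustration.
func goMove(gnu string, bits int) (string, bool) {
	switch gnu {
	case "ldr":
		switch bits {
		case 64:
			return "MOVD", true
		case 32:
			return "MOVWU", true
		}
	case "str", "stur":
		switch bits {
		case 64:
			return "MOVD", true
		case 32:
			return "MOVW", true
		}
	case "ldrsw":
		return "MOVW", true
	case "ldrb":
		return "MOVBU", true
	case "ldrh":
		return "MOVHU", true
	case "ldrsb", "strb", "sturb":
		return "MOVB", true
	case "ldrsh", "strh", "sturh":
		return "MOVH", true
	}
	return "", false // not covered by the list above
}

func main() {
	for _, c := range []struct {
		gnu  string
		bits int
	}{{"ldr", 64}, {"ldr", 32}, {"str", 32}, {"ldrsb", 32}} {
		name, ok := goMove(c.gnu, c.bits)
		fmt.Println(c.gnu, c.bits, name, ok)
	}
}
```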
    36  
    37  4. Go moves conditions into the opcode suffix, as in BLT.
    38  
    39  5. Go adds a V prefix for most floating-point and SIMD instructions, except cryptographic extension
    40  instructions and scalar floating-point instructions.
    41  
    42    Examples:
    43      VADD V5.H8, V18.H8, V9.H8         <=>      add v9.8h, v18.8h, v5.8h
    44      VLD1.P (R6)(R11), [V31.D1]        <=>      ld1 {v31.1d}, [x6], x11
    45      VFMLA V29.S2, V20.S2, V14.S2      <=>      fmla v14.2s, v20.2s, v29.2s
    46      AESD V22.B16, V19.B16             <=>      aesd v19.16b, v22.16b
    47      SCVTFWS R3, F16                   <=>      scvtf s16, w3
    48  
    49  6. Align directive
    50  
    51  Go asm supports the PCALIGN directive, which indicates that the next instruction should be aligned
    52  to a specified boundary by padding with NOOP instructions. The alignment value supported on arm64
    53  must be a power of 2 and in the range of [8, 2048].
    54  
    55    Examples:
    56      PCALIGN $16
    57      MOVD $2, R0          // This instruction is aligned to a 16-byte boundary.
    58      PCALIGN $1024
    59      MOVD $3, R1          // This instruction is aligned to a 1024-byte boundary.
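
The constraint on the alignment value can be expressed as a one-line check. This is only a sketch
of the stated rule (a power of 2 in the range [8, 2048]); the name validPCAlign is hypothetical and
is not taken from the assembler.

```go
package main

import "fmt"

// validPCAlign reports whether v is an alignment value that the text above
// describes as supported on arm64: a power of 2 within [8, 2048].
// Hypothetical helper, for illustration only.
func validPCAlign(v int64) bool {
	return v >= 8 && v <= 2048 && v&(v-1) == 0
}

func main() {
	for _, v := range []int64{16, 1024, 4, 24, 4096} {
		fmt.Println(v, validPCAlign(v))
	}
}
```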
    60  
    61  PCALIGN also changes the function alignment. If a function has one or more PCALIGN directives,
    62  its address will be aligned to the same or coarser boundary, which is the maximum of all the
    63  alignment values.
    64  
    65  In the following example, the function Add is aligned to 128 bytes.
    66    Examples:
    67      TEXT ·Add(SB),$40-16
    68      MOVD $2, R0
    69      PCALIGN $32
    70      MOVD $4, R1
    71      PCALIGN $128
    72      MOVD $8, R2
    73      RET
    74  
    75  On arm64, functions in Go are aligned to 16 bytes by default; PCALIGN can also be used to set the
    76  function alignment. Functions that need to be aligned should preferably use NOFRAME and NOSPLIT
    77  to avoid the impact of the prologues inserted by the assembler, so that the function address will
    78  have the same alignment as the first hand-written instruction.
    79  
    80  In the following example, PCALIGN at the entry of the function Add will align its address to 2048 bytes.
    81  
    82    Examples:
    83      TEXT ·Add(SB),NOSPLIT|NOFRAME,$0
    84        PCALIGN $2048
    85        MOVD $1, R0
    86        MOVD $1, R1
    87        RET
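
The effect of PCALIGN on function alignment can be sketched as taking a maximum. The helper
funcAlign is hypothetical, and it assumes that the 16-byte default acts as a lower bound when every
PCALIGN value in the function is smaller; the text above does not state that case explicitly.

```go
package main

import "fmt"

// defaultFuncAlign is the default arm64 function alignment named in the text.
const defaultFuncAlign = 16

// funcAlign returns the alignment of a function given the values of all
// PCALIGN directives in its body. Assumption: the default is kept as a floor.
// Hypothetical helper, for illustration only.
func funcAlign(pcaligns ...int) int {
	align := defaultFuncAlign
	for _, a := range pcaligns {
		if a > align {
			align = a
		}
	}
	return align
}

func main() {
	fmt.Println(funcAlign(32, 128)) // 128, as in the first Add example
	fmt.Println(funcAlign(2048))    // 2048, as in the second Add example
	fmt.Println(funcAlign())        // 16, no PCALIGN present
}
```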
    88  
    89  Special Cases.
    90  
    91  (1) umov is written as VMOV.
    92  
    93  (2) br is renamed JMP, blr is renamed CALL.
    94  
    95  (3) No need to add "W" suffix: LDARB, LDARH, LDAXRB, LDAXRH, LDTRH, LDXRB, LDXRH.
    96  
    97  (4) In Go assembly syntax, NOP is a zero-width pseudo-instruction that serves a generic purpose and is
    98  unrelated to any real ARM64 instruction. NOOP stands for the hardware nop instruction and is an alias of
    99  HINT $0.
   100  
   101    Examples:
   102      VMOV V13.B[1], R20      <=>      umov w20, v13.b[1]
   103      VMOV V13.H[1], R20      <=>      umov w20, v13.h[1]
   104      JMP (R3)                <=>      br x3
   105      CALL (R17)              <=>      blr x17
   106      LDAXRB (R19), R16       <=>      ldaxrb w16, [x19]
   107      NOOP                    <=>      nop
   108  
   109  
   110  Register mapping rules
   111  
   112  1. All basic register names are written as Rn.
   113  
   114  2. Go uses ZR as the zero register and RSP as the stack pointer.
   115  
   116  3. Bn, Hn, Dn, Sn and Qn registers are written as Fn in floating-point instructions and as Vn
   117  in SIMD instructions.
   118  
   119  
   120  Argument mapping rules
   121  
   122  1. The operands appear in left-to-right assignment order.
   123  
   124  Go reverses the arguments of most instructions relative to the GNU ARM64 syntax, so the destination comes last.
   125  
   126      Examples:
   127        ADD R11.SXTB<<1, RSP, R25      <=>      add x25, sp, w11, sxtb #1
   128        VADD V16, V19, V14             <=>      add d14, d19, d16
   129  
   130  Special Cases.
   131  
   132  (1) The following keep the same argument order as in the GNU ARM64 syntax: cbz, cbnz and some store
   133  instructions, such as str, stur, strb, sturb, strh, sturh, stlr, stlrb, stlrh, st1.
   134  
   135    Examples:
   136      MOVD R29, 384(R19)    <=>    str x29, [x19,#384]
   137      MOVB.P R30, 30(R4)    <=>    strb w30, [x4],#30
   138      STLRH R21, (R19)      <=>    stlrh w21, [x19]
   139  
   140  (2) MADD, MADDW, MSUB, MSUBW, SMADDL, SMSUBL, UMADDL, UMSUBL <Rm>, <Ra>, <Rn>, <Rd>
   141  
   142    Examples:
   143      MADD R2, R30, R22, R6       <=>    madd x6, x22, x2, x30
   144      SMSUBL R10, R3, R17, R27    <=>    smsubl x27, w17, w10, x3
   145  
   146  (3) FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS, FNMSUBD, FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>
   147  
   148    Examples:
   149      FMADDD F30, F20, F3, F29    <=>    fmadd d29, d3, d30, d20
   150      FNMSUBS F7, F25, F7, F22    <=>    fnmsub s22, s7, s7, s25
   151  
   152  (4) BFI, BFXIL, SBFIZ, SBFX, UBFIZ, UBFX $<lsb>, <Rn>, $<width>, <Rd>
   153  
   154    Examples:
   155      BFIW $16, R20, $6, R0      <=>    bfi w0, w20, #16, #6
   156      UBFIZ $34, R26, $5, R20    <=>    ubfiz x20, x26, #34, #5
   157  
   158  (5) FCCMPD, FCCMPS, FCCMPED, FCCMPES <cond>, <Fm>, <Fn>, $<nzcv>
   159  
   160    Examples:
   161      FCCMPD AL, F8, F26, $0     <=>    fccmp d26, d8, #0x0, al
   162      FCCMPS VS, F29, F4, $4     <=>    fccmp s4, s29, #0x4, vs
   163      FCCMPED LE, F20, F5, $13   <=>    fccmpe d5, d20, #0xd, le
   164      FCCMPES NE, F26, F10, $0   <=>    fccmpe s10, s26, #0x0, ne
   165  
   166  (6) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, $<imm>, $<nzcv>
   167  
   168    Examples:
   169      CCMP MI, R22, $12, $13     <=>    ccmp x22, #0xc, #0xd, mi
   170      CCMNW AL, R1, $11, $8      <=>    ccmn w1, #0xb, #0x8, al
   171  
   172  (7) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, <Rm>, $<nzcv>
   173  
   174    Examples:
   175      CCMN VS, R13, R22, $10     <=>    ccmn x13, x22, #0xa, vs
   176      CCMPW HS, R19, R14, $11    <=>    ccmp w19, w14, #0xb, cs
   177  
   178  (8) CSEL, CSELW, CSNEG, CSNEGW, CSINC, CSINCW <cond>, <Rn>, <Rm>, <Rd> ;
   179  FCSELD, FCSELS <cond>, <Fn>, <Fm>, <Fd>
   180  
   181    Examples:
   182      CSEL GT, R0, R19, R1        <=>    csel x1, x0, x19, gt
   183      CSNEGW GT, R7, R17, R8      <=>    csneg w8, w7, w17, gt
   184      FCSELD EQ, F15, F18, F16    <=>    fcsel d16, d15, d18, eq
   185  
   186  (9) TBNZ, TBZ $<imm>, <Rt>, <label>
   187  
   188  
   189  (10) STLXR, STLXRW, STXR, STXRW, STLXRB, STLXRH, STXRB, STXRH  <Rf>, (<Rn|RSP>), <Rs>
   190  
   191    Examples:
   192      STLXR ZR, (R15), R16    <=>    stlxr w16, xzr, [x15]
   193      STXRB R9, (R21), R19    <=>    stxrb w19, w9, [x21]
   194  
   195  (11) STLXP, STLXPW, STXP, STXPW (<Rf1>, <Rf2>), (<Rn|RSP>), <Rs>
   196  
   197    Examples:
   198      STLXP (R17, R19), (R4), R5      <=>    stlxp w5, x17, x19, [x4]
   199      STXPW (R30, R25), (R22), R13    <=>    stxp w13, w30, w25, [x22]
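
The ordering rules above can be sketched as a default reversal plus a small table of
permutations. The names specialOrder and toGoOrder are hypothetical, the table covers only three
of the special cases, and operands are left in their GNU spelling; stores and the other cases
listed above would need their own entries.

```go
package main

import "fmt"

// specialOrder[m][i] is the index in the GNU operand list of the i-th Go
// operand for mnemonic m. Only a few of the special cases are included.
var specialOrder = map[string][]int{
	"madd": {2, 3, 1, 0}, // GNU Rd, Rn, Rm, Ra        -> Go Rm, Ra, Rn, Rd
	"bfi":  {2, 1, 3, 0}, // GNU Rd, Rn, #lsb, #width  -> Go $lsb, Rn, $width, Rd
	"csel": {3, 1, 2, 0}, // GNU Rd, Rn, Rm, cond      -> Go cond, Rn, Rm, Rd
}

// toGoOrder reorders GNU operands into Go order: a table lookup for the
// listed special cases, plain reversal otherwise. Hypothetical helper.
func toGoOrder(mnemonic string, gnuOps []string) []string {
	out := make([]string, len(gnuOps))
	if perm, ok := specialOrder[mnemonic]; ok && len(perm) == len(gnuOps) {
		for i, src := range perm {
			out[i] = gnuOps[src]
		}
		return out
	}
	for i, op := range gnuOps {
		out[len(gnuOps)-1-i] = op
	}
	return out
}

func main() {
	fmt.Println(toGoOrder("add", []string{"x24", "x10", "x19"}))
	fmt.Println(toGoOrder("madd", []string{"x6", "x22", "x2", "x30"})) // [x2 x30 x22 x6]
	fmt.Println(toGoOrder("csel", []string{"x1", "x0", "x19", "gt"}))  // [gt x0 x19 x1]
}
```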
   200  
   201  2. Expressions for special arguments.
   202  
   203  #<immediate> is written as $<immediate>.
   204  
   205  Optionally-shifted immediate.
   206  
   207    Examples:
   208      ADD $(3151<<12), R14, R20     <=>    add x20, x14, #0xc4f, lsl #12
   209      ADDW $1864, R25, R6           <=>    add w6, w25, #0x748
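
To make the relationship between the two spellings concrete, the sketch below splits a Go
immediate into the 12-bit value and optional lsl #12 that the GNU syntax shows. The helpers
splitImm and gnuImm are hypothetical, and the sketch covers only this one immediate form; it says
nothing about what the assembler does with other constants.

```go
package main

import "fmt"

// splitImm splits v into a 12-bit immediate and a shift of 0 or 12.
// ok is false when v does not fit that form. Hypothetical helper.
func splitImm(v uint64) (imm12 uint64, shift uint, ok bool) {
	switch {
	case v < 1<<12:
		return v, 0, true
	case v&0xfff == 0 && v>>12 < 1<<12:
		return v >> 12, 12, true
	}
	return 0, 0, false
}

// gnuImm formats v the way the GNU column of the examples above does.
func gnuImm(v uint64) (string, bool) {
	imm, shift, ok := splitImm(v)
	if !ok {
		return "", false
	}
	if shift == 0 {
		return fmt.Sprintf("#%#x", imm), true
	}
	return fmt.Sprintf("#%#x, lsl #%d", imm, shift), true
}

func main() {
	s, _ := gnuImm(3151 << 12)
	fmt.Println(s) // #0xc4f, lsl #12
	s, _ = gnuImm(1864)
	fmt.Println(s) // #0x748
}
```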
   210  
   211  Optionally-shifted registers are written as <Rm>{<shift><amount>}.
   212  The <shift> can be <<(lsl), >>(lsr), ->(asr), @>(ror).
   213  
   214    Examples:
   215      ADD R19>>30, R10, R24     <=>    add x24, x10, x19, lsr #30
   216      ADDW R26->24, R21, R15    <=>    add w15, w21, w26, asr #24
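
The operator mapping can be sketched as a table-driven conversion from the Go spelling to the GNU
one. The helper gnuShifted and its prefix parameter (x or w, chosen by the caller from the
instruction width) are assumptions for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// shiftNames pairs each Go shift operator with its GNU shift name.
var shiftNames = []struct{ goOp, gnu string }{
	{"<<", "lsl"}, {">>", "lsr"}, {"->", "asr"}, {"@>", "ror"},
}

// gnuShifted converts a Go shifted-register operand such as "R19>>30" into
// GNU form, using prefix ("x" or "w") for the register. Hypothetical helper.
func gnuShifted(operand, prefix string) (string, bool) {
	for _, s := range shiftNames {
		reg, amount, found := strings.Cut(operand, s.goOp)
		if !found || !strings.HasPrefix(reg, "R") {
			continue
		}
		num := strings.TrimPrefix(reg, "R")
		return fmt.Sprintf("%s%s, %s #%s", prefix, num, s.gnu, amount), true
	}
	return "", false
}

func main() {
	s, _ := gnuShifted("R19>>30", "x")
	fmt.Println(s) // x19, lsr #30
	s, _ = gnuShifted("R26->24", "w")
	fmt.Println(s) // w26, asr #24
}
```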
   217  
   218  Extended registers are written as <Rm>{.<extend>{<<<amount>}}.
   219  <extend> can be UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW or SXTX.
   220  
   221    Examples:
   222      ADDS R19.UXTB<<4, R9, R26     <=>    adds x26, x9, w19, uxtb #4
   223      ADDSW R14.SXTX, R14, R6       <=>    adds w6, w14, w14, sxtx
   224  
   225  Memory references: [<Xn|SP>{,#0}] is written as (Rn|RSP); a base register and an immediate
   226  offset are written as imm(Rn|RSP); a base register and an offset register are written as (Rn|RSP)(Rm).
   227  
   228    Examples:
   229      LDAR (R22), R9                  <=>    ldar x9, [x22]
   230      LDP 28(R17), (R15, R23)         <=>    ldp x15, x23, [x17,#28]
   231      MOVWU (R4)(R12<<2), R8          <=>    ldr w8, [x4, x12, lsl #2]
   232      MOVD (R7)(R11.UXTW<<3), R25     <=>    ldr x25, [x7,w11,uxtw #3]
   233      MOVBU (R27)(R23), R14           <=>    ldrb w14, [x27,x23]
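
The three memory-reference spellings can be sketched with two small formatting helpers. Both
goMem and goMemIndexed are hypothetical names; they take register names already in Go form and do
no validation.

```go
package main

import "fmt"

// goMem formats a base register with an immediate offset in Go syntax.
// A zero offset is omitted, matching the (Rn|RSP) form. Hypothetical helper.
func goMem(base string, offset int64) string {
	if offset == 0 {
		return "(" + base + ")"
	}
	return fmt.Sprintf("%d(%s)", offset, base)
}

// goMemIndexed formats a base register with an offset register, which may
// itself carry a shift or extension such as "R12<<2". Hypothetical helper.
func goMemIndexed(base, index string) string {
	return fmt.Sprintf("(%s)(%s)", base, index)
}

func main() {
	fmt.Println(goMem("R22", 0))                // (R22)
	fmt.Println(goMem("R19", 384))              // 384(R19)
	fmt.Println(goMemIndexed("R4", "R12<<2"))   // (R4)(R12<<2)
	fmt.Println(goMemIndexed("R27", "R23"))     // (R27)(R23)
}
```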
   234  
   235  Register pairs are written as (Rt1, Rt2).
   236  
   237    Examples:
   238      LDP.P -240(R11), (R12, R26)    <=>    ldp x12, x26, [x11],#-240
   239  
   240  Register with arrangement and register with arrangement and index.
   241  
   242    Examples:
   243      VADD V5.H8, V18.H8, V9.H8                     <=>    add v9.8h, v18.8h, v5.8h
   244      VLD1 (R2), [V21.B16]                          <=>    ld1 {v21.16b}, [x2]
   245      VST1.P V9.S[1], (R16)(R21)                    <=>    st1 {v9.s}[1], [x16], x21
   246      VST1.P [V13.H8, V14.H8, V15.H8], (R3)(R14)    <=>    st1 {v13.8h-v15.8h}, [x3], x14
   247      VST1.P [V14.D1, V15.D1], (R7)(R23)            <=>    st1 {v14.1d, v15.1d}, [x7], x23
   248  */
   249  package arm64
   250