// lib/asm-docs.js - compiler-explorer

 function getAsmOpcode(opcode) {
     if (!opcode) return;
     switch (opcode.toUpperCase()) {
         case "AAA":
             return {
                 "url": "http://www.felixcloutier.com/x86/AAA.html",
                 "html": "<p>Adjusts the sum of two unpacked BCD values to create an unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAA instruction is only useful when it follows an ADD instruction that adds (binary addition) two unpacked BCD values and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result.</p><p>If the addition produces a decimal carry, the AH register increments by 1, and the CF and AF flags are set. If there was no decimal carry, the CF and AF flags are cleared and the AH register is unchanged. In either case, bits 4 through 7 of the AL register are set to 0.</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts the sum of two unpacked BCD values to create an unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAA instruction is only useful when it follows an ADD instruction that adds (binary addition) two unpacked BCD values and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result."
             };
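
         /*
Illustration only, not part of the lookup table. The adjustment described in
the AAA entry above can be sketched in plain JavaScript. The helper name `aaa`
and the `{ al, ah, af }` register object are assumptions made for this sketch;
the "AX + 106H" step follows the Operation pseudocode in the Intel manual, and
flags other than AF and CF are not modeled.

```javascript
// Hypothetical model of AAA. `regs` holds AL and AH (0-255) plus the AF flag
// left behind by the preceding ADD.
function aaa(regs) {
    let ax = ((regs.ah << 8) | regs.al) & 0xFFFF;
    const carry = (regs.al & 0x0F) > 9 || Boolean(regs.af);
    if (carry) {
        ax = (ax + 0x106) & 0xFFFF; // AL += 6, AH += 1
    }
    // Bits 4 through 7 of AL are always cleared.
    return { al: ax & 0x0F, ah: (ax >> 8) & 0xFF, af: carry, cf: carry };
}

// 9 + 8 leaves 0x11 in AL with AF set; AAA turns that into AH=1, AL=7.
const adjusted = aaa({ al: 0x11, ah: 0, af: true });
// 3 + 4 leaves 0x07 in AL with AF clear; nothing needs adjusting.
const untouched = aaa({ al: 0x07, ah: 0, af: false });
```

Here `adjusted` reads as the unpacked BCD digits 1 and 7 (decimal 17).
         */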

         case "AAD":
             return {
                 "url": "http://www.felixcloutier.com/x86/AAD.html",
                 "html": "<p>Adjusts two unpacked BCD digits (the least-significant digit in the AL register and the most-significant digit in the AH register) so that a division operation performed on the result will yield a correct unpacked BCD value. The AAD instruction is only useful when it precedes a DIV instruction that divides (binary division) the adjusted value in the AX register by an unpacked BCD value.</p><p>The AAD instruction sets the value in the AL register to (AL + (10 * AH)), and then clears the AH register to 00H. The value in the AX register is then equal to the binary equivalent of the original unpacked two-digit (base 10) number in registers AH and AL.</p><p>The generalized version of this instruction allows adjustment of two unpacked digits of any number base (see the \u201cOperation\u201d section below), by setting the <em>imm8</em> byte to the selected number base (for example, 08H for octal, 0AH for decimal, or 0CH for base 12 numbers). The AAD mnemonic is interpreted by all assemblers to mean adjust ASCII (base 10) values. To adjust values in another number base, the instruction must be hand coded in machine code (D5 <em>imm8</em>).</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts two unpacked BCD digits (the least-significant digit in the AL register and the most-significant digit in the AH register) so that a division operation performed on the result will yield a correct unpacked BCD value. The AAD instruction is only useful when it precedes a DIV instruction that divides (binary division) the adjusted value in the AX register by an unpacked BCD value."
             };

         case "AAM":
             return {
                 "url": "http://www.felixcloutier.com/x86/AAM.html",
                 "html": "<p>Adjusts the result of the multiplication of two unpacked BCD values to create a pair of unpacked (base 10) BCD values. The AX register is the implied source and destination operand for this instruction. The AAM instruction is only useful when it follows an MUL instruction that multiplies (binary multiplication) two unpacked BCD values and stores a word result in the AX register. The AAM instruction then adjusts the contents of the AX register to contain the correct 2-digit unpacked (base 10) BCD result.</p><p>The generalized version of this instruction allows adjustment of the contents of the AX to create two unpacked digits of any number base (see the \u201cOperation\u201d section below). Here, the <em>imm8</em> byte is set to the selected number base (for example, 08H for octal, 0AH for decimal, or 0CH for base 12 numbers). The AAM mnemonic is interpreted by all assemblers to mean adjust to ASCII (base 10) values. To adjust to values in another number base, the instruction must be hand coded in machine code (D4 <em>imm8</em>).</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts the result of the multiplication of two unpacked BCD values to create a pair of unpacked (base 10) BCD values. The AX register is the implied source and destination operand for this instruction. The AAM instruction is only useful when it follows an MUL instruction that multiplies (binary multiplication) two unpacked BCD values and stores a word result in the AX register. The AAM instruction then adjusts the contents of the AX register to contain the correct 2-digit unpacked (base 10) BCD result."
             };
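
         /*
Illustration only, not part of the lookup table. The generalized AAD and AAM
forms described in the two entries above take the number base as imm8. A
minimal sketch, assuming hypothetical helpers `aam` and `aad` that receive the
register values as plain numbers and do not model the status flags:

```javascript
// Hypothetical model of AAM: split AL into two unpacked digits of `base`.
function aam(al, base = 10) {
    if (base === 0) {
        throw new RangeError("#DE: an immediate value of 0 is not allowed");
    }
    return { ah: Math.floor(al / base) & 0xFF, al: al % base };
}

// Hypothetical model of AAD: fold two unpacked digits back into binary.
function aad(ah, al, base = 10) {
    return { ah: 0, al: (al + ah * base) & 0xFF };
}

const product = aam(7 * 9);                 // 63 becomes AH=6, AL=3
const binary = aad(product.ah, product.al); // and back to AL=63
const octal = aam(63, 8);                   // base 8: AH=7, AL=7
```
         */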

         case "AAS":
             return {
                 "url": "http://www.felixcloutier.com/x86/AAS.html",
                 "html": "<p>Adjusts the result of the subtraction of two unpacked BCD values to create a unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAS instruction is only useful when it follows a SUB instruction that subtracts (binary subtraction) one unpacked BCD value from another and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result.</p><p>If the subtraction produced a decimal carry, the AH register decrements by 1, and the CF and AF flags are set. If no decimal carry occurred, the CF and AF flags are cleared, and the AH register is unchanged. In either case, the AL register is left with its top four bits set to 0.</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts the result of the subtraction of two unpacked BCD values to create a unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAS instruction is only useful when it follows a SUB instruction that subtracts (binary subtraction) one unpacked BCD value from another and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result."
             };

         case "ADC":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADC.html",
                 "html": "<p>Adds the destination operand (first operand), the source operand (second operand), and the carry (CF) flag and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a carry from a previous addition. When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p><p>The ADC instruction does not distinguish between signed or unsigned operands. Instead, the processor evaluates the result for both data types and sets the OF and CF flags to indicate a carry in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p><p>The ADC instruction is usually executed as part of a multibyte or multiword addition in which an ADD instruction is followed by an ADC instruction.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Adds the destination operand (first operand), the source operand (second operand), and the carry (CF) flag and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a carry from a previous addition. When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format."
             };
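
         /*
Illustration only, not part of the lookup table. The ADC entry above mentions
multiword addition, where an ADD is followed by an ADC. A minimal sketch,
assuming 32-bit words and hypothetical helpers `addWithCarry` and `add64`;
only the carry flag is modeled:

```javascript
// Hypothetical 32-bit ADD/ADC model: returns the truncated sum and carry-out.
function addWithCarry(a, b, carryIn) {
    const full = (a >>> 0) + (b >>> 0) + (carryIn ? 1 : 0);
    return { sum: full >>> 0, cf: full > 0xFFFFFFFF };
}

// Adds two 64-bit values given as { lo, hi } pairs of 32-bit words.
function add64(x, y) {
    const low = addWithCarry(x.lo, y.lo, false);   // ADD: no carry-in
    const high = addWithCarry(x.hi, y.hi, low.cf); // ADC: consumes CF
    return { lo: low.sum, hi: high.sum, cf: high.cf };
}

// 0x00000001_FFFFFFFF + 0x00000000_00000001 = 0x00000002_00000000
const total = add64({ lo: 0xFFFFFFFF, hi: 1 }, { lo: 1, hi: 0 });
```
         */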

         case "ADCX":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADCX.html",
                 "html": "<p>Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the carry-flag (CF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of CF can represent a carry from a previous addition. The instruction sets the CF flag with the carry generated by the unsigned addition of the operands.</p><p>The ADCX instruction is executed in the context of multi-precision addition, where we add a series of operands with a carry-chain. At the beginning of a chain of additions, we need to make sure the CF is in a desired initial state. Often, this initial state needs to be 0, which can be achieved with an instruction to zero the CF (e.g. XOR).</p><p>This instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode.</p><p>In 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to addi-tional registers (R8-15). Using REX Prefix in the form of REX.W promotes operation to 64 bits.</p><p>ADCX executes normally either inside or outside a transaction region.</p>",
                 "tooltip": "Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the carry-flag (CF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of CF can represent a carry from a previous addition. The instruction sets the CF flag with the carry generated by the unsigned addition of the operands."
             };

         case "ADD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADD.html",
                 "html": "<p>Adds the destination operand (first operand) and the source operand (second operand) and then stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p><p>The ADD instruction performs integer addition. It evaluates the result for both signed and unsigned integer oper-ands and sets the OF and CF flags to indicate a carry (overflow) in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Adds the destination operand (first operand) and the source operand (second operand) and then stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format."
             };

         case "ADDPD":
         case "VADDPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDPD.html",
                 "html": "<p>Performs a SIMD add of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified. See Chapter 11 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1</em>, for an overview of SIMD double-precision floating-point operation.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD add of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand."
             };

         case "VADDPS":
         case "ADDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDPS.html",
                 "html": "<p>Performs a SIMD add of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified. See Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1</em>, for an overview of SIMD single-precision floating-point operation.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD add of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand."
             };

         case "ADDSD":
         case "VADDSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDSD.html",
                 "html": "<p>Adds the low double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the double-precision floating-point result in the destination operand.</p><p>The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. See Chapter 11 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a scalar double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Adds the low double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the double-precision floating-point result in the destination operand."
             };

         case "VADDSS":
         case "ADDSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDSS.html",
                 "html": "<p>Adds the low single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the single-precision floating-point result in the destination operand.</p><p>The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. See Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a scalar single-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:32) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Adds the low single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the single-precision floating-point result in the destination operand."
             };

         case "ADDSUBPD":
         case "VADDSUBPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDSUBPD.html",
                 "html": "<p>Adds odd-numbered double-precision floating-point values of the first source operand (second operand) with the corresponding double-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered double-precision floating-point values from the second source operand from the corresponding double-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified. See Figure 3-3.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Adds odd-numbered double-precision floating-point values of the first source operand (second operand) with the corresponding double-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered double-precision floating-point values from the second source operand from the corresponding double-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand."
             };
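
         /*
Illustration only, not part of the lookup table. The lane pattern described in
the ADDSUBPD entry above (even-numbered values subtract, odd-numbered values
add) can be sketched with plain arrays. The helper name `addsub` and the use
of JavaScript numbers as lanes are assumptions made for this sketch; index 0
stands for the lowest 64 bits.

```javascript
// Hypothetical lane model: even indices compute src1 - src2,
// odd indices compute src1 + src2.
function addsub(src1, src2) {
    return src1.map((value, lane) =>
        lane % 2 === 0 ? value - src2[lane] : value + src2[lane]);
}

// Two lanes model the 128-bit form, four lanes the VEX.256 form.
const xmm = addsub([1.5, 2.0], [0.5, 4.0]);
const ymm = addsub([8, 8, 8, 8], [1, 2, 3, 4]);
```

With these inputs `xmm` is `[1, 6]` and `ymm` is `[7, 10, 5, 12]`.
         */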

         case "VADDSUBPS":
         case "ADDSUBPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADDSUBPS.html",
                 "html": "<p>Adds odd-numbered single-precision floating-point values of the first source operand (second operand) with the corresponding single-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered single-precision floating-point values from the second source operand from the corresponding single-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified. See Figure 3-4.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Adds odd-numbered single-precision floating-point values of the first source operand (second operand) with the corresponding single-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered single-precision floating-point values from the second source operand from the corresponding single-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand."
             };

         case "ADOX":
             return {
                 "url": "http://www.felixcloutier.com/x86/ADOX.html",
                 "html": "<p>Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the overflow-flag (OF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of OF represents a carry from a previous addition. The instruction sets the OF flag with the carry generated by the unsigned addition of the operands.</p><p>The ADOX instruction is executed in the context of multi-precision addition, where we add a series of operands with a carry-chain. At the beginning of a chain of additions, we execute an instruction to zero the OF (e.g. XOR).</p><p>This instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode.</p><p>In 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to addi-tional registers (R8-15). Using REX Prefix in the form of REX.W promotes operation to 64-bits.</p><p>ADOX executes normally either inside or outside a transaction region.</p>",
                 "tooltip": "Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the overflow-flag (OF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of OF represents a carry from a previous addition. The instruction sets the OF flag with the carry generated by the unsigned addition of the operands."
             };
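
         /*
Illustration only, not part of the lookup table. The ADCX and ADOX entries
above describe two additions that differ only in which flag carries the chain:
ADCX reads and writes CF, ADOX reads and writes OF. A minimal sketch, assuming
32-bit operands and hypothetical helpers `adcx` and `adox` that pass a small
`{ cf, of }` flags object along:

```javascript
// Shared 32-bit unsigned add with carry-in; returns value and carry-out.
function add32(dest, src, carryIn) {
    const full = (dest >>> 0) + (src >>> 0) + (carryIn ? 1 : 0);
    return { value: full >>> 0, carry: full > 0xFFFFFFFF };
}

// Hypothetical ADCX model: the carry chain lives in CF, OF is left alone.
function adcx(dest, src, flags) {
    const r = add32(dest, src, flags.cf);
    return { value: r.value, flags: { cf: r.carry, of: flags.of } };
}

// Hypothetical ADOX model: the carry chain lives in OF, CF is left alone.
function adox(dest, src, flags) {
    const r = add32(dest, src, flags.of);
    return { value: r.value, flags: { cf: flags.cf, of: r.carry } };
}

// Both flags start cleared, as after an XOR.
const step1 = adcx(0xFFFFFFFF, 1, { cf: false, of: false }); // sets CF
const step2 = adox(5, 6, step1.flags);                       // ignores CF
const step3 = adcx(0, 0, step2.flags);                       // consumes CF
```

Because `step2` neither reads nor changes CF, the carry produced by `step1`
is still available to `step3`.
         */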

         case "AESDEC":
         case "VAESDEC":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESDEC.html",
                 "html": "<p>This instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>Use the AESDEC instruction for all but the last decryption round. For the last decryption round, use the AESDEC-CLAST instruction.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand."
             };

         case "VAESDECLAST":
         case "AESDECLAST":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESDECLAST.html",
                 "html": "<p>This instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand."
             };

         case "AESENC":
         case "VAESENC":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESENC.html",
                 "html": "<p>This instruction performs a single round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>Use the AESENC instruction for all but the last encryption rounds. For the last encryption round, use the AESENC-CLAST instruction.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs a single round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand."
             };

         case "AESENCLAST":
         case "VAESENCLAST":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESENCLAST.html",
                 "html": "<p>This instruction performs the last round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs the last round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand."
             };

         case "AESIMC":
         case "VAESIMC":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESIMC.html",
                 "html": "<p>Perform the InvMixColumns transformation on the source operand and store the result in the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory loca-tion.</p><p>Note: the AESIMC instruction should be applied to the expanded AES round keys (except for the first and last round key) in order to prepare them for decryption using the \u201cEquivalent Inverse Cipher\u201d (defined in FIPS 197).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Perform the InvMixColumns transformation on the source operand and store the result in the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory loca-tion."
             };

         case "AESKEYGENASSIST":
         case "VAESKEYGENASSIST":
             return {
                 "url": "http://www.felixcloutier.com/x86/AESKEYGENASSIST.html",
                 "html": "<p>Assist in expanding the AES cipher key, by computing steps towards generating a round key for encryption, using 128-bit data specified in the source operand and an 8-bit round constant specified as an immediate, and store the result in the destination operand.</p><p>The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Assist in expanding the AES cipher key, by computing steps towards generating a round key for encryption, using 128-bit data specified in the source operand and an 8-bit round constant specified as an immediate, and store the result in the destination operand."
             };

         case "AND":
             return {
                 "url": "http://www.felixcloutier.com/x86/AND.html",
                 "html": "<p>Performs a bitwise AND operation on the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result is set to 1 if both corresponding bits of the first and second operands are 1; otherwise, it is set to 0.</p><p>This instruction can be used with a LOCK prefix to allow it to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Performs a bitwise AND operation on the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result is set to 1 if both corresponding bits of the first and second operands are 1; otherwise, it is set to 0."
             };

         case "ANDN":
             return {
                 "url": "http://www.felixcloutier.com/x86/ANDN.html",
                 "html": "<p>Performs a bitwise logical AND of inverted second operand (the first source operand) with the third operand (the second source operand). The result is stored in the first operand (destination operand).</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Performs a bitwise logical AND of inverted second operand (the first source operand) with the third operand (the second source operand). The result is stored in the first operand (destination operand)."
             };
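
         /*
          Illustrative only, not part of the lookup table: a minimal sketch of the
          32-bit ANDN result described above (the inverted first source ANDed with
          the second source). The helper name andn32 is hypothetical, and flag
          updates are not modeled.

```javascript
// Hypothetical helper: models only the 32-bit ANDN result, not the flags.
function andn32(src1, src2) {
    // Invert the first source, AND it with the second source,
    // then coerce the result back to an unsigned 32-bit value.
    return (~src1 & src2) >>> 0;
}
```

          For example, andn32(0b1100, 0b1010) keeps only the bits that are set in
          the second source and clear in the first.
         */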

         case "ANDNPD":
         case "VANDNPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ANDNPD.html",
                 "html": "<p>Performs a bitwise logical AND NOT of the two or four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical AND NOT of the two or four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand."
             };

         case "VANDNPS":
         case "ANDNPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ANDNPS.html",
                 "html": "<p>Inverts the bits of the four packed single-precision floating-point values in the destination operand (first operand), performs a bitwise logical AND of the four packed single-precision floating-point values in the source operand (second operand) and the temporary inverted result, and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Inverts the bits of the four packed single-precision floating-point values in the destination operand (first operand), performs a bitwise logical AND of the four packed single-precision floating-point values in the source operand (second operand) and the temporary inverted result, and stores the result in the destination operand."
             };

         case "ANDPD":
         case "VANDPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ANDPD.html",
                 "html": "<p>Performs a bitwise logical AND of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical AND of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand."
             };

         case "VANDPS":
         case "ANDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ANDPS.html",
                 "html": "<p>Performs a bitwise logical AND of the four or eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical AND of the four or eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand."
             };

         case "ARPL":
             return {
                 "url": "http://www.felixcloutier.com/x86/ARPL.html",
                 "html": "<p>Compares the RPL fields of two segment selectors. The first operand (the destination operand) contains one segment selector and the second operand (source operand) contains the other. (The RPL field is located in bits 0 and 1 of each operand.) If the RPL field of the destination operand is less than the RPL field of the source operand, the ZF flag is set and the RPL field of the destination operand is increased to match that of the source operand. Otherwise, the ZF flag is cleared and no change is made to the destination operand. (The destination operand can be a word register or a memory location; the source operand must be a word register.)</p><p>The ARPL instruction is provided for use by operating-system procedures (however, it can also be used by applications). It is generally used to adjust the RPL of a segment selector that has been passed to the operating system by an application program to match the privilege level of the application program. Here the segment selector passed to the operating system is placed in the destination operand and the segment selector for the application program\u2019s code segment is placed in the source operand. (The RPL field in the source operand represents the privilege level of the application program.) Execution of the ARPL instruction then ensures that the RPL of the segment selector received by the operating system is no lower (does not have a higher privilege) than the privilege level of the application program (the segment selector for the application program\u2019s code segment can be read from the stack following a procedure call).</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not encodable in 64-bit mode.</p><p>See \u201cChecking Caller Access Privileges\u201d in Chapter 3, \u201cProtected-Mode Memory Management,\u201d of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>, for more information about the use of this instruction.</p>",
                 "tooltip": "Compares the RPL fields of two segment selectors. The first operand (the destination operand) contains one segment selector and the second operand (source operand) contains the other. (The RPL field is located in bits 0 and 1 of each operand.) If the RPL field of the destination operand is less than the RPL field of the source operand, the ZF flag is set and the RPL field of the destination operand is increased to match that of the source operand. Otherwise, the ZF flag is cleared and no change is made to the destination operand. (The destination operand can be a word register or a memory location; the source operand must be a word register.)"
             };

         case "BEXTR":
             return {
                 "url": "http://www.felixcloutier.com/x86/BEXTR.html",
                 "html": "<p>Extracts contiguous bits from the first source operand (the second operand) using an index value and length value specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH) beginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are extracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher order bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is cleared if no bits are extracted.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Extracts contiguous bits from the first source operand (the second operand) using an index value and length value specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH) beginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are extracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher order bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is cleared if no bits are extracted."
             };
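
         /*
          Illustrative only, not part of the lookup table: a minimal sketch of the
          32-bit BEXTR extraction described above, with START in bits 7:0 and
          LENGTH in bits 15:8 of the control operand. The helper name bextr32 is
          hypothetical, and flag updates are not modeled.

```javascript
// Hypothetical helper: models only the 32-bit BEXTR result, not the flags.
function bextr32(src, control) {
    const start = control & 0xff;           // bits 7:0: starting bit position
    const length = (control >>> 8) & 0xff;  // bits 15:8: maximum number of bits
    // A START beyond the operand size, or a zero LENGTH, extracts nothing.
    if (start >= 32 || length === 0) return 0;
    const shifted = src >>> start;          // drop the bits below START
    // A LENGTH covering the whole operand keeps everything that remains.
    if (length >= 32) return shifted >>> 0;
    const mask = 2 ** length - 1;           // LENGTH low-order one bits
    return (shifted & mask) >>> 0;
}
```

          For example, a control value of (8 << 8) | 4 extracts eight bits
          starting at bit 4 of the source.
         */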

         case "BLENDPD":
         case "VBLENDPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLENDPD.html",
                 "html": "<p>Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the double-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the double-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied."
             };

         case "VBLENDPS":
         case "BLENDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLENDPS.html",
                 "html": "<p>Packed single-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [7:0] determine whether the corresponding single-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the single-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Packed single-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [7:0] determine whether the corresponding single-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the single-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied."
             };

         case "BLENDVPD":
         case "VBLENDVPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLENDVPD.html",
                 "html": "<p>Conditionally copy each quadword data element of double-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each quadword element of the mask register.</p><p>Each quadword element of the destination operand is copied from:</p><p>The register assignment of the implicit mask operand for BLENDVPD is defined to be the architectural register XMM0.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute BLENDVPD with a VEX prefix will cause #UD.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte (imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (VLMAX-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.W must be 0, otherwise, the instruction will #UD.</p>",
                 "tooltip": "Conditionally copy each quadword data element of double-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each quadword element of the mask register."
             };

         case "VBLENDVPS":
         case "BLENDVPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLENDVPS.html",
                 "html": "<p>Conditionally copy each dword data element of single-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each dword element of the mask register.</p><p>Each dword element of the destination operand is copied from:</p><p>The register assignment of the implicit mask operand for BLENDVPS is defined to be the architectural register XMM0.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute BLENDVPS with a VEX prefix will cause #UD.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte (imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (VLMAX-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.W must be 0, otherwise, the instruction will #UD.</p>",
                 "tooltip": "Conditionally copy each dword data element of single-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each dword element of the mask register."
             };

         case "BLSI":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLSI.html",
                 "html": "<p>Extracts the lowest set bit from the source operand and sets the corresponding bit in the destination register. All other bits in the destination operand are zeroed. If no bits are set in the source operand, BLSI sets all the bits in the destination to 0 and sets ZF and CF.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Extracts the lowest set bit from the source operand and sets the corresponding bit in the destination register. All other bits in the destination operand are zeroed. If no bits are set in the source operand, BLSI sets all the bits in the destination to 0 and sets ZF and CF."
             };

         case "BLSMSK":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLSMSK.html",
                 "html": "<p>Sets all the lower bits of the destination operand to \u201c1\u201d up to and including the lowest set bit (=1) in the source operand. If the source operand is zero, BLSMSK sets all bits of the destination operand to 1 and also sets CF to 1.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Sets all the lower bits of the destination operand to \u201c1\u201d up to and including the lowest set bit (=1) in the source operand. If the source operand is zero, BLSMSK sets all bits of the destination operand to 1 and also sets CF to 1."
             };

         case "BLSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/BLSR.html",
                 "html": "<p>Copies all bits from the source operand to the destination operand and resets (=0) the bit position in the destination operand that corresponds to the lowest set bit of the source operand. If the source operand is zero, BLSR sets CF.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Copies all bits from the source operand to the destination operand and resets (=0) the bit position in the destination operand that corresponds to the lowest set bit of the source operand. If the source operand is zero, BLSR sets CF."
             };
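
         /*
          Illustrative only, not part of the lookup table: a minimal sketch of the
          32-bit results of BLSI, BLSMSK and BLSR as described in the three
          entries above, using the usual two's-complement identities. The helper
          names are hypothetical, and the ZF/CF updates are not modeled.

```javascript
// Hypothetical helpers: each models only the 32-bit result, not the flags.

// BLSI: isolate the lowest set bit; every other bit is zeroed.
function blsi32(src) {
    return (src & -src) >>> 0;
}

// BLSMSK: set all bits up to and including the lowest set bit.
// A zero source yields all ones.
function blsmsk32(src) {
    return (src ^ (src - 1)) >>> 0;
}

// BLSR: copy the source and clear its lowest set bit.
function blsr32(src) {
    return (src & (src - 1)) >>> 0;
}
```

          With a source of 0b10100, these return 0b00100, 0b00111 and 0b10000
          respectively.
         */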

         case "BOUND":
             return {
                 "url": "http://www.felixcloutier.com/x86/BOUND.html",
                 "html": "<p>BOUND determines if the first operand (array index) is within the bounds of an array specified by the second operand (bounds operand). The array index is a signed integer located in a register. The bounds operand is a memory location that contains a pair of signed doubleword-integers (when the operand-size attribute is 32) or a pair of signed word-integers (when the operand-size attribute is 16). The first doubleword (or word) is the lower bound of the array and the second doubleword (or word) is the upper bound of the array. The array index must be greater than or equal to the lower bound and less than or equal to the upper bound plus the operand size in bytes. If the index is not within bounds, a BOUND range exceeded exception (#BR) is signaled. When this exception is generated, the saved return instruction pointer points to the BOUND instruction.</p><p>The bounds limit data structure (two words or doublewords containing the lower and upper limits of the array) is usually placed just before the array itself, making the limits addressable via a constant offset from the beginning of the array. Because the address of the array already will be present in a register, this practice avoids extra bus cycles to obtain the effective address of the array bounds.</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "BOUND determines if the first operand (array index) is within the bounds of an array specified by the second operand (bounds operand). The array index is a signed integer located in a register. The bounds operand is a memory location that contains a pair of signed doubleword-integers (when the operand-size attribute is 32) or a pair of signed word-integers (when the operand-size attribute is 16). The first doubleword (or word) is the lower bound of the array and the second doubleword (or word) is the upper bound of the array. The array index must be greater than or equal to the lower bound and less than or equal to the upper bound plus the operand size in bytes. If the index is not within bounds, a BOUND range exceeded exception (#BR) is signaled. When this exception is generated, the saved return instruction pointer points to the BOUND instruction."
             };

         case "BSF":
             return {
                 "url": "http://www.felixcloutier.com/x86/BSF.html",
                 "html": "<p>Searches the source operand (second operand) for the least significant set bit (1 bit). If a least significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content of the source operand is 0, the content of the destination operand is undefined.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Searches the source operand (second operand) for the least significant set bit (1 bit). If a least significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content of the source operand is 0, the content of the destination operand is undefined."
             };

         case "BSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/BSR.html",
                 "html": "<p>Searches the source operand (second operand) for the most significant set bit (1 bit). If a most significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content of the source operand is 0, the content of the destination operand is undefined.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Searches the source operand (second operand) for the most significant set bit (1 bit). If a most significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content of the source operand is 0, the content of the destination operand is undefined."
             };
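
         /*
          Illustrative only, not part of the lookup table: a minimal sketch of the
          bit indices that BSF and BSR produce for a 32-bit source, built on
          Math.clz32. The helper names are hypothetical. Returning undefined for
          a zero source is a choice made here to mirror the "destination is
          undefined" wording; it is not what the hardware does.

```javascript
// Hypothetical helpers: each returns a bit index counted from bit 0.

// BSF: index of the least significant set bit.
function bsf32(src) {
    const value = src >>> 0;
    if (value === 0) return undefined;  // destination undefined for a zero source
    // Isolate the lowest set bit, then locate it by counting leading zeros.
    return 31 - Math.clz32(value & -value);
}

// BSR: index of the most significant set bit.
function bsr32(src) {
    const value = src >>> 0;
    if (value === 0) return undefined;  // destination undefined for a zero source
    return 31 - Math.clz32(value);
}
```

          For a source of 0b101000, bsf32 returns 3 and bsr32 returns 5.
         */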

         case "BSWAP":
             return {
                 "url": "http://www.felixcloutier.com/x86/BSWAP.html",
                 "html": "<p>Reverses the byte order of a 32-bit or 64-bit (destination) register. This instruction is provided for converting little-endian values to big-endian format and vice versa. To swap bytes in a word value (16-bit register), use the XCHG instruction. When the BSWAP instruction references a 16-bit register, the result is undefined.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Reverses the byte order of a 32-bit or 64-bit (destination) register. This instruction is provided for converting little-endian values to big-endian format and vice versa. To swap bytes in a word value (16-bit register), use the XCHG instruction. When the BSWAP instruction references a 16-bit register, the result is undefined."
             };

         case "BT":
             return {
                 "url": "http://www.felixcloutier.com/x86/BT.html",
                 "html": "<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset (specified by the second operand) and stores the value of the bit in the CF flag. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p><p>See also: <strong>Bit(BitBase, BitOffset) </strong>on page 3-10.</p><p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. In this case, the low-order 3 or 5 bits (3 for 16-bit operands, 5 for 32-bit operands) of the immediate bit offset are stored in the immediate bit offset field, and the high-order bits are shifted and combined with the byte displacement in the addressing mode by the assembler. The processor will ignore the high-order bits if they are not zero.</p><p>When accessing a bit in memory, the processor may access 4 bytes starting from the memory address for a 32-bit operand size, using the following relationship:</p><p>Effective Address + (4 \u2217 (BitOffset DIV 32))</p>",
                 "tooltip": "Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset (specified by the second operand) and stores the value of the bit in the CF flag. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value"
             };

         case "BTC":
             return {
                 "url": "http://www.felixcloutier.com/x86/BTC.html",
                 "html": "<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and complements the selected bit in the bit string. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p><p>See also: <strong>Bit(BitBase, BitOffset) </strong>on page 3-10.</p><p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See \u201cBT\u2014Bit Test\u201d in this chapter for more information on this addressing mechanism.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and complements the selected bit in the bit string. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value"
             };

         case "BTR":
             return {
                 "url": "http://www.felixcloutier.com/x86/BTR.html",
                 "html": "<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and clears the selected bit in the bit string to 0. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p><p>See also: <strong>Bit(BitBase, BitOffset) </strong>on page 3-10.</p><p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See \u201cBT\u2014Bit Test\u201d in this chapter for more information on this addressing mechanism.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and clears the selected bit in the bit string to 0. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value"
             };

         case "BTS":
             return {
                 "url": "http://www.felixcloutier.com/x86/BTS.html",
                 "html": "<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and sets the selected bit in the bit string to 1. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p><p>See also: <strong>Bit(BitBase, BitOffset) </strong>on page 3-10.</p><p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See \u201cBT\u2014Bit Test\u201d in this chapter for more information on this addressing mechanism.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and sets the selected bit in the bit string to 1. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value"
             };

         case "BZHI":
             return {
                 "url": "http://www.felixcloutier.com/x86/BZHI.html",
                 "html": "<p>BZHI copies the bits of the first source operand (the second operand) into the destination operand (the first operand) and clears the higher bits in the destination according to the INDEX value specified by the second source operand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is saturated at the value of OperandSize -1. CF is set, if the number contained in the 8 low bits of the third operand is greater than OperandSize -1.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "BZHI copies the bits of the first source operand (the second operand) into the destination operand (the first operand) and clears the higher bits in the destination according to the INDEX value specified by the second source operand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is saturated at the value of OperandSize -1. CF is set, if the number contained in the 8 low bits of the third operand is greater than OperandSize -1."
             };

         case "CALL":
             return {
                 "url": "http://www.felixcloutier.com/x86/CALL.html",
                 "html": "<p>Saves procedure linking information on the stack and branches to the called procedure specified using the target operand. The target operand specifies the address of the first instruction in the called procedure. The operand can be an immediate value, a general-purpose register, or a memory location.</p><p>This instruction can be used to execute four types of calls:</p><ul><li>Near call: a call to a procedure in the current code segment.</li><li>Far call: a call to a procedure located in a different segment than the current code segment.</li><li>Inter-privilege-level far call: a far call to a procedure in a segment at a different privilege level than that of the currently executing program or procedure.</li><li>Task switch: a call to a procedure located in a different task.</li></ul><p>The latter two call types (inter-privilege-level call and task switch) can only be executed in protected mode. See \u201cCalling Procedures Using CALL and RET\u201d in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for additional information on near, far, and inter-privilege-level calls. See Chapter 7, \u201cTask Management,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>, for information on performing task switches with the CALL instruction.</p><p><strong>Near Call. </strong>When executing a near call, the processor pushes the value of the EIP register (which contains the offset of the instruction following the CALL instruction) on the stack (for use later as a return-instruction pointer). The processor then branches to the address in the current code segment specified by the target operand. The target operand specifies either an absolute offset in the code segment (an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register; this value points to the instruction following the CALL instruction). The CS register is not changed on near calls.</p><p>For a near call absolute, an absolute offset is specified indirectly in a general-purpose register or a memory location (<em>r/m16</em>, <em>r/m32, or r/m64</em>). The operand-size attribute determines the size of the target operand (16, 32 or 64 bits). When in 64-bit mode, the operand size for near call (and all near branches) is forced to 64 bits. Absolute offsets are loaded directly into the EIP(RIP) register. If the operand size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits. When accessing an absolute offset indirectly using the stack pointer [ESP] as the base register, the base value used is the value of the ESP before the instruction executes.</p>",
                 "tooltip": "Saves procedure linking information on the stack and branches to the called procedure specified using the target operand. The target operand specifies the address of the first instruction in the called procedure. The operand can be an immediate value, a general-purpose register, or a memory location."
             };

         case "CDQE":
         case "CWDE":
         case "CBW":
             return {
                 "url": "http://www.felixcloutier.com/x86/CDQE.html",
                 "html": "<p>Double the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction copies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-word) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register.</p><p>CBW and CWDE reference the same opcode. The CBW instruction is intended for use when the operand-size attribute is 16; CWDE is intended for use when the operand-size attribute is 32. Some assemblers may force the operand size. Others may treat these two mnemonics as synonyms (CBW/CWDE) and use the setting of the operand-size attribute to determine the size of values to be converted.</p><p>In 64-bit mode, the default operation size is the size of the destination register. Use of the REX.W prefix promotes this instruction (CDQE when promoted) to operate on 64-bit operands. In that case, CDQE copies the sign (bit 31) of the doubleword in the EAX register into the high 32 bits of RAX.</p>",
                 "tooltip": "Double the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction copies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-word) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register."
             };

         case "CLAC":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLAC.html",
                 "html": "<p>Clears the AC flag bit in EFLAGS register. This disables any alignment checking of user-mode data accesses. If the SMAP bit is set in the CR4 register, this disallows explicit supervisor-mode data accesses to user-mode pages.</p><p>This instruction's operation is the same in non-64-bit modes and 64-bit mode. Attempts to execute CLAC when CPL &gt; 0 cause #UD.</p>",
                 "tooltip": "Clears the AC flag bit in EFLAGS register. This disables any alignment checking of user-mode data accesses. If the SMAP bit is set in the CR4 register, this disallows explicit supervisor-mode data accesses to user-mode pages."
             };

         case "CLC":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLC.html",
                 "html": "<p>Clears the CF flag in the EFLAGS register. Operation is the same in all modes.</p>",
                 "tooltip": "Clears the CF flag in the EFLAGS register. Operation is the same in all modes."
             };

         case "CLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLD.html",
                 "html": "<p>Clears the DF flag in the EFLAGS register. When the DF flag is set to 0, string operations increment the index registers (ESI and/or EDI). Operation is the same in all modes.</p>",
                 "tooltip": "Clears the DF flag in the EFLAGS register. When the DF flag is set to 0, string operations increment the index registers (ESI and/or EDI). Operation is the same in all modes."
             };

         case "CLFLUSH":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLFLUSH.html",
                 "html": "<p>Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy (data and instruction). The invalidation is broadcast throughout the cache coherence domain. If, at any level of the cache hierarchy, the line is inconsistent with memory (dirty) it is written to memory before invalidation. The source operand is a byte memory location.</p><p>The availability of CLFLUSH is indicated by the presence of the CPUID feature flag CLFSH (bit 19 of the EDX register, see \u201cCPUID\u2014CPU Identification\u201d in this chapter). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).</p><p>The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It should be noted that processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads (such as the WB, WC, and WT memory types). PREFETCH<em>h </em>instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to PREFETCH<em>h</em> instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references the cache line).</p><p>CLFLUSH is only ordered by the MFENCE instruction. It is not guaranteed to be ordered by any other fencing or serializing instructions or by another CLFLUSH instruction. For example, software can use an MFENCE instruction to ensure that previous stores are included in the write-back.</p><p>The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables.</p>",
                 "tooltip": "Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy (data and instruction). The invalidation is broadcast throughout the cache coherence domain. If, at any level of the cache hierarchy, the line is inconsistent with memory (dirty) it is written to memory before invalidation. The source operand is a byte memory location."
             };

         case "CLI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLI.html",
                 "html": "<p>If protected-mode virtual interrupts are not enabled, CLI clears the IF flag in the EFLAGS register. No other flags are affected. Clearing the IF flag causes the processor to ignore maskable external interrupts. The IF flag and the CLI and STI instructions have no effect on the generation of exceptions and NMI interrupts.</p><p>When protected-mode virtual interrupts are enabled, CPL is 3, and IOPL is less than 3, CLI clears the VIF flag in the EFLAGS register, leaving IF unaffected. Table 3-6 indicates the action of the CLI instruction depending on the processor operating mode and the CPL/IOPL of the running program or procedure.</p><p>Operation is the same in all modes.</p><h3>Table 3-6.  Decision Table for CLI Results</h3><table>\n<tr>\n<th>PE</th>\n<th>VM</th>\n<th>IOPL</th>\n<th>CPL</th>\n<th>PVI</th>\n<th>VIP</th>\n<th>VME</th>\n<th>CLI Result</th></tr>\n<tr>\n<td>0</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<th>IF = 0</th></tr>\n<tr>\n<td>1</td>\n<td>0</td>\n<td>\u2265 CPL</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<th>IF = 0</th></tr>\n<tr>\n<td>1</td>\n<td>0</td>\n<td>&lt; CPL</td>\n<td>3</td>\n<td>1</td>\n<td>X</td>\n<td>X</td>\n<th>VIF = 0</th></tr>\n<tr>\n<td>1</td>\n<td>0</td>\n<td>&lt; CPL</td>\n<td>&lt; 3</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<th>GP Fault</th></tr>\n<tr>\n<td>1</td>\n<td>0</td>\n<td>&lt; CPL</td>\n<td>X</td>\n<td>0</td>\n<td>X</td>\n<td>X</td>\n<th>GP Fault</th></tr>\n<tr>\n<td>1</td>\n<td>1</td>\n<td>3</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<th>IF = 0</th></tr>\n<tr>\n<td>1</td>\n<td>1</td>\n<td>&lt; 3</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>1</td>\n<th>VIF = 0</th></tr>\n<tr>\n<td>1</td>\n<td>1</td>\n<td>&lt; 3</td>\n<td>X</td>\n<td>X</td>\n<td>X</td>\n<td>0</td>\n<th>GP Fault</th></tr></table>",
                 "tooltip": "If protected-mode virtual interrupts are not enabled, CLI clears the IF flag in the EFLAGS register. No other flags are affected. Clearing the IF flag causes the processor to ignore maskable external interrupts. The IF flag and the CLI and STI instructions have no effect on the generation of exceptions and NMI interrupts."
             };

         case "CLTS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CLTS.html",
                 "html": "<p>Clears the task-switched (TS) flag in the CR0 register. This instruction is intended for use in operating-system procedures. It is a privileged instruction that can only be executed at a CPL of 0. It is allowed to be executed in real-address mode to allow initialization for protected mode.</p><p>The processor sets the TS flag every time a task switch occurs. The flag is used to synchronize the saving of FPU context in multitasking applications. See the description of the TS flag in the section titled \u201cControl Registers\u201d in Chapter 2 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>, for more information about this flag.</p><p>CLTS operation is the same in non-64-bit modes and 64-bit mode.</p><p>See Chapter 25, \u201cVMX Non-Root Operation,\u201d of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3C</em>, for more information about the behavior of this instruction in VMX non-root operation.</p>",
                 "tooltip": "Clears the task-switched (TS) flag in the CR0 register. This instruction is intended for use in operating-system procedures. It is a privileged instruction that can only be executed at a CPL of 0. It is allowed to be executed in real-address mode to allow initialization for protected mode."
             };

         case "CMC":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMC.html",
                 "html": "<p>Complements the CF flag in the EFLAGS register. CMC operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Complements the CF flag in the EFLAGS register. CMC operation is the same in non-64-bit modes and 64-bit mode."
             };

         case "CMOVNAE":
         case "CMOVGE":
         case "CMOVNB":
         case "CMOVLE":
         case "CMOVNBE":
         case "CMOVL":
         case "CMOVNA":
         case "CMOVAE":
         case "CMOVA":
         case "CMOVB":
         case "CMOVC":
         case "CMOVE":
         case "CMOVBE":
         case "CMOVG":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMOVNBE.html",
                 "html": "<p>The CMOV<em>cc</em> instructions check the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and perform a move operation if the flags are in a specified state (or condition). A condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOV<em>cc</em> instruction.</p><p>These instructions can move 16-bit, 32-bit or 64-bit values from memory to a general-purpose register or from one general-purpose register to another. Conditional moves of 8-bit register operands are not supported.</p><p>The condition for each CMOV<em>cc</em> mnemonic is given in the description column of the above table. The terms \u201cless\u201d and \u201cgreater\u201d are used for comparisons of signed integers and the terms \u201cabove\u201d and \u201cbelow\u201d are used for unsigned integers.</p><p>Because a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are defined for some opcodes. For example, the CMOVA (conditional move if above) instruction and the CMOVNBE (conditional move if not below or equal) instruction are alternate mnemonics for the opcode 0F 47H.</p><p>The CMOV<em>cc</em> instructions were introduced in P6 family processors; however, these instructions may not be supported by all IA-32 processors. Software can determine if the CMOV<em>cc</em> instructions are supported by checking the processor\u2019s feature information with the CPUID instruction (see \u201cCPUID\u2014CPU Identification\u201d in this chapter).</p>",
                 "tooltip": "The CMOVcc instructions check the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and perform a move operation if the flags are in a specified state (or condition). A condition code (cc) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOVcc instruction."
             };

         case "CMP":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMP.html",
                 "html": "<p>Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results. The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand.</p><p>The condition codes used by the J<em>cc</em>, CMOV<em>cc</em>, and SET<em>cc</em> instructions are based on the results of a CMP instruction. Appendix B, \u201cEFLAGS Condition Codes,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, shows the relationship of the status flags and the condition codes.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results. The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand."
             };

         case "VCMPPD":
         case "CMPPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPPD.html",
                 "html": "<p>Performs a SIMD compare of the packed double-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed on each of the pairs of packed values. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 128-bit memory location. The comparison predicate operand is an 8-bit immediate; bits 2:0 of the immediate define the type of comparison to be performed (see Table 3-7). Bits 7:3 of the immediate are reserved. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. Two comparisons are performed with results written to bits 127:0 of the destination operand.</p><h3>Table 3-7.  Comparison Predicate for CMPPD and CMPPS Instructions</h3><table>\n<tr>\n<th>Predicate</th>\n<th>imm8 Encoding</th>\n<th>Description</th>\n<th>Relation where: A Is 1st Operand B Is 2nd Operand</th>\n<th>Emulation</th>\n<th>Result if NaN Operand</th>\n<th>QNaN Operand Signals Invalid</th></tr>\n<tr>\n<td>EQ</td>\n<td>000B</td>\n<td>Equal</td>\n<td>A = B</td>\n<td></td>\n<td>False</td>\n<td>No</td></tr>\n<tr>\n<td>LT</td>\n<td>001B</td>\n<td>Less-than</td>\n<td>A &lt; B</td>\n<td></td>\n<td>False</td>\n<td>Yes</td></tr>\n<tr>\n<td>LE</td>\n<td>010B</td>\n<td>Less-than-or-equal</td>\n<td>A \u2264 B</td>\n<td></td>\n<td>False</td>\n<td>Yes</td></tr>\n<tr>\n<td></td>\n<td></td>\n<td>Greater than</td>\n<td>A &gt; B</td>\n<td>Swap Operands, Use LT</td>\n<td>False</td>\n<td>Yes</td></tr>\n<tr>\n<td></td>\n<td></td>\n<td>Greater-than-or-equal</td>\n<td>A \u2265 B</td>\n<td>Swap Operands, Use LE</td>\n<td>False</td>\n<td>Yes</td></tr>\n<tr>\n<td>UNORD</td>\n<td>011B</td>\n<td>Unordered</td>\n<td>A, B = Unordered</td>\n<td></td>\n<td>True</td>\n<td>No</td></tr>\n<tr>\n<td>NEQ</td>\n<td>100B</td>\n<td>Not-equal</td>\n<td>A \u2260 B</td>\n<td></td>\n<td>True</td>\n<td>No</td></tr>\n<tr>\n<td>NLT</td>\n<td>101B</td>\n<td>Not-less-than</td>\n<td>NOT(A &lt; B)</td>\n<td></td>\n<td>True</td>\n<td>Yes</td></tr></table>",
                 "tooltip": "Performs a SIMD compare of the packed double-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed on each of the pairs of packed values. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "CMPPS":
         case "VCMPPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPPS.html",
                 "html": "<p>Performs a SIMD compare of the packed single-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed on each of the pairs of packed values. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 128-bit memory location. The comparison predicate operand is an 8-bit immediate; bits 2:0 of the immediate define the type of comparison to be performed (see Table 3-7). Bits 7:3 of the immediate are reserved. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. Four comparisons are performed with results written to bits 127:0 of the destination operand.</p><p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.</p><p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate a fault, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p><p>Note that processors with \u201cCPUID.1H:ECX.AVX =0\u201d do not implement the \u201cgreater-than\u201d, \u201cgreater-than-or-equal\u201d, \u201cnot-greater-than\u201d, and \u201cnot-greater-than-or-equal\u201d predicates. These comparisons can be made either by using the inverse relationship (that is, use the \u201cnot-less-than-or-equal\u201d to make a \u201cgreater-than\u201d comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.</p>",
                 "tooltip": "Performs a SIMD compare of the packed single-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed on each of the pairs of packed values. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "CMPSQ":
         case "CMPSB":
         case "CMPS":
         case "CMPSD":
         case "CMPSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPSQ.html",
                 "html": "<p>Compares the byte, word, doubleword, or quadword specified with the first source operand with the byte, word, doubleword, or quadword specified with the second source operand and sets the status flags in the EFLAGS register according to the results.</p><p>Both source operands are located in memory. The address of the first source operand is read from DS:SI, DS:ESI or RSI (depending on the address-size attribute of the instruction is 16, 32, or 64, respectively). The address of the second source operand is read from ES:DI, ES:EDI or RDI (again depending on the address-size attribute of the</p><p>instruction is 16, 32, or 64). The DS segment may be overridden with a segment override prefix, but the ES segment cannot be overridden.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the CMPS mnemonic) allows the two source operands to be specified explicitly. Here, the source operands should be symbols that indicate the size and location of the source values. This explicit-operand form is provided to allow documentation. However, note that the documenta-tion provided by this form can be misleading. That is, the source operand symbols must specify the correct type (size) of the operands (bytes, words, or doublewords, quadwords), but they do not have to specify the correct loca-tion. Locations of the source operands are always specified by the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) regis-ters, which must be loaded correctly before the compare string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the CMPS instructions. Here also the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers are assumed by the processor to specify the loca-tion of the source operands. 
The size of the source operands is selected with the mnemonic: CMPSB (byte compar-ison), CMPSW (word comparison), CMPSD (doubleword comparison), or CMPSQ (quadword comparison using REX.W).</p>",
                 "tooltip": "Compares the byte, word, doubleword, or quadword specified with the first source operand with the byte, word, doubleword, or quadword specified with the second source operand and sets the status flags in the EFLAGS register according to the results."
             };

         case "VCMPSS":
         case "CMPSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPSS.html",
                 "html": "<p>Compares the low single-precision floating-point values in the source operand (second operand) and the destina-tion operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed. The comparison result is a double-word mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 64-bit memory location. The comparison pred-icate operand is an 8-bit immediate, bits 2:0 of the immediate define the type of comparison to be performed (see Table 3-7). Bits 7:3 of the immediate is reserved. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN</p><p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate a fault, since a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p><p>Note that processors with \u201cCPUID.1H:ECX.AVX =0\u201d do not implement the \u201cgreater-than\u201d, \u201cgreater-than-or-equal\u201d, \u201cnot-greater than\u201d, and \u201cnot-greater-than-or-equal relations\u201d predicates. These comparisons can be made either by using the inverse relationship (that is, use the \u201cnot-less-than-or-equal\u201d to make a \u201cgreater-than\u201d comparison) or by using software emulation. 
When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination operand), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.</p>",
                 "tooltip": "Compares the low single-precision floating-point values in the source operand (second operand) and the destina-tion operand (first operand) and returns the results of the comparison to the destination operand. The comparison predicate operand (third operand) specifies the type of comparison performed. The comparison result is a double-word mask of all 1s (comparison true) or all 0s (comparison false). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "CMPXCHG":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPXCHG.html",
                 "html": "<p>Compares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two values are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the destination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processor\u2019s bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Compares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two values are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the destination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode."
             };

         case "CMPXCHG8B":
         case "CMPXCHG16B":
             return {
                 "url": "http://www.felixcloutier.com/x86/CMPXCHG8B:CMPXCHG16B.html",
                 "html": "<p>Compares the 64-bit value in EDX:EAX (or 128-bit value in RDX:RAX if operand size is 128 bits) with the operand (destination operand). If the values are equal, the 64-bit value in ECX:EBX (or 128-bit value in RCX:RBX) is stored in the destination operand. Otherwise, the value in the destination operand is loaded into EDX:EAX (or RDX:RAX). The destination operand is an 8-byte memory location (or 16-byte memory location if operand size is 128 bits). For the EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order 32 bits and EAX and EBX contain the low-order 32 bits of a 64-bit value. For the RDX:RAX and RCX:RBX register pairs, RDX and RCX contain the high-order 64 bits and RAX and RBX contain the low-order 64bits of a 128-bit value.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processor\u2019s bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)</p><p>In 64-bit mode, default operation size is 64 bits. Use of the REX.W prefix promotes operation to 128 bits. Note that CMPXCHG16B requires that the destination (memory) operand be 16-byte aligned. See the summary chart at the beginning of this section for encoding data and limits. For information on the CPUID flag that indicates CMPXCHG16B, see page 3-175.</p>",
                 "tooltip": "Compares the 64-bit value in EDX:EAX (or 128-bit value in RDX:RAX if operand size is 128 bits) with the operand (destination operand). If the values are equal, the 64-bit value in ECX:EBX (or 128-bit value in RCX:RBX) is stored in the destination operand. Otherwise, the value in the destination operand is loaded into EDX:EAX (or RDX:RAX). The destination operand is an 8-byte memory location (or 16-byte memory location if operand size is 128 bits). For the EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order 32 bits and EAX and EBX contain the low-order 32 bits of a 64-bit value. For the RDX:RAX and RCX:RBX register pairs, RDX and RCX contain the high-order 64 bits and RAX and RBX contain the low-order 64bits of a 128-bit value."
             };

         case "VCOMISD":
         case "COMISD":
             return {
                 "url": "http://www.felixcloutier.com/x86/COMISD.html",
                 "html": "<p>Compares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unor-dered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unor-dered result is returned if either source operand is a NaN (QNaN or SNaN).The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>Operand 1 is an XMM register; operand 2 can be an XMM register or a 64 bit memory location.</p><p>The COMISD instruction differs from the UCOMISD instruction in that it signals a SIMD floating-point invalid oper-ation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISD instruction signals an invalid numeric exception only if a source operand is an SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unor-dered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unor-dered result is returned if either source operand is a NaN (QNaN or SNaN).The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "COMISS":
         case "VCOMISS":
             return {
                 "url": "http://www.felixcloutier.com/x86/COMISS.html",
                 "html": "<p>Compares the single-precision floating-point values in the low doublewords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unor-dered, greater than, less than, or equal). The OF, SF, and AF flags in the EFLAGS register are set to 0. The unor-dered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>Operand 1 is an XMM register; Operand 2 can be an XMM register or a 32 bit memory location.</p><p>The COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid opera-tion exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid numeric exception only if a source operand is an SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the single-precision floating-point values in the low doublewords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unor-dered, greater than, less than, or equal). The OF, SF, and AF flags in the EFLAGS register are set to 0. The unor-dered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "CPUID":
             return {
                 "url": "http://www.felixcloutier.com/x86/CPUID.html",
                 "html": "<p>The ID flag (bit 21) in the EFLAGS register indicates support for the CPUID instruction. If a software procedure can set and clear this flag, the processor executing the procedure supports the CPUID instruction. This instruction oper-ates the same in non-64-bit modes and 64-bit mode.</p><p>CPUID returns processor identification and feature information in the EAX, EBX, ECX, and EDX registers.<sup>1</sup> The instruction\u2019s output is dependent on the contents of the EAX register upon execution (in some cases, ECX as well). For example, the following pseudocode loads EAX with 00H and causes CPUID to return a Maximum Return Value and the Vendor Identification String in the appropriate registers:</p><p>MOV EAX, 00H</p><p>CPUID</p><p>Table 3-17 shows information returned, depending on the initial value loaded into the EAX register. Table 3-18 shows the maximum CPUID input value recognized for each family of IA-32 processors on which CPUID is imple-mented.</p>",
                 "tooltip": "The ID flag (bit 21) in the EFLAGS register indicates support for the CPUID instruction. If a software procedure can set and clear this flag, the processor executing the procedure supports the CPUID instruction. This instruction oper-ates the same in non-64-bit modes and 64-bit mode."
             };

         case "CQO":
         case "CWD":
         case "CDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/CQO.html",
                 "html": "<p>Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register. The CDQ instruction copies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register. The CQO instruc-tion (available in 64-bit mode only) copies the sign (bit 63) of the value in the RAX register into every bit position in the RDX register.</p><p>The CWD instruction can be used to produce a doubleword dividend from a word before word division. The CDQ instruction can be used to produce a quadword dividend from a doubleword before doubleword division. The CQO instruction can be used to produce a double quadword dividend from a quadword before a quadword division.</p><p>The CWD and CDQ mnemonics reference the same opcode. The CWD instruction is intended for use when the operand-size attribute is 16 and the CDQ instruction for when the operand-size attribute is 32. Some assemblers may force the operand size to 16 when CWD is used and to 32 when CDQ is used. Others may treat these mnemonics as synonyms (CWD/CDQ) and use the current setting of the operand-size attribute to determine the size of values to be converted, regardless of the mnemonic used.</p><p>In 64-bit mode, use of the REX.W prefix promotes operation to 64 bits. The CQO mnemonics reference the same opcode as CWD/CDQ. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register. The CDQ instruction copies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register. The CQO instruc-tion (available in 64-bit mode only) copies the sign (bit 63) of the value in the RAX register into every bit position in the RDX register."
             };

         case "CRC32":
             return {
                 "url": "http://www.felixcloutier.com/x86/CRC32.html",
                 "html": "<p>Starting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial 11EDC6F41H) value for the second operand (source operand) and stores the result in the destination operand. The source operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If the destination is an r64 register, then the 32-bit result is stored in the least significant double word and 00000000H is stored in the most significant double word of the r64 register.</p><p>The initial value supplied in the destination operand is a double word integer stored in the r32 register or the least significant double word of the r64 register. To incrementally accumulate a CRC32 value, software retains the result of the previous CRC32 operation in the destination operand, then executes the CRC32 instruction again with new input data in the source operand. Data contained in the source operand is processed in reflected bit order. This means that the most significant bit of the source operand is treated as the least significant bit of the quotient, and so on, for all the bits of the source operand. Likewise, the result of the CRC operation is stored in the destination operand in reflected bit order. This means that the most significant bit of the resulting CRC (bit 31) is stored in the least significant bit of the destination operand (bit 0), and so on, for all the bits of the CRC.</p>",
                 "tooltip": "Starting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial 11EDC6F41H) value for the second operand (source operand) and stores the result in the destination operand. The source operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If the destination is an r64 register, then the 32-bit result is stored in the least significant double word and 00000000H is stored in the most significant double word of the r64 register."
             };

         case "CVTDQ2PD":
         case "VCVTDQ2PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTDQ2PD.html",
                 "html": "<p>Converts two packed signed doubleword integers in the source operand (second operand) to two packed double-precision floating-point values in the destination operand (first operand).</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 64- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding XMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 64- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or 128- bit memory location. The destination operation is a YMM register.</p>",
                 "tooltip": "Converts two packed signed doubleword integers in the source operand (second operand) to two packed double-precision floating-point values in the destination operand (first operand)."
             };

         case "VCVTDQ2PS":
         case "CVTDQ2PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTDQ2PS.html",
                 "html": "<p>Converts four packed signed doubleword integers in the source operand (second operand) to four packed single-precision floating-point values in the destination operand (first operand).</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding XMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operation is a YMM register.</p>",
                 "tooltip": "Converts four packed signed doubleword integers in the source operand (second operand) to four packed single-precision floating-point values in the destination operand (first operand)."
             };

         case "CVTPD2DQ":
         case "VCVTPD2DQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPD2DQ.html",
                 "html": "<p>Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand).</p><p>The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The result is stored in the low quadword of the destination operand and the high quadword is cleared to all 0s.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operation is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand)."
             };

         case "CVTPD2PI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPD2PI.html",
                 "html": "<p>Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand).</p><p>The source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX tech-nology register.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPD2PI instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand)."
             };

         case "VCVTPD2PS":
         case "CVTPD2PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPD2PS.html",
                 "html": "<p>Converts two packed double-precision floating-point values in the source operand (second operand) to two packed single-precision floating-point values in the destination operand (first operand).</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operation is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operation is a YMM register. The upper bits (VLMAX-1:64) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Converts two packed double-precision floating-point values in the source operand (second operand) to two packed single-precision floating-point values in the destination operand (first operand)."
             };

         case "CVTPI2PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPI2PD.html",
                 "html": "<p>Converts two packed signed doubleword integers in the source operand (second operand) to two packed double-precision floating-point values in the destination operand (first operand).</p><p>The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an XMM register. In addition, depending on the operand configuration:</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed signed doubleword integers in the source operand (second operand) to two packed double-precision floating-point values in the destination operand (first operand)."
             };

         case "CVTPI2PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPI2PS.html",
                 "html": "<p>Converts two packed signed doubleword integers in the source operand (second operand) to two packed single-precision floating-point values in the destination operand (first operand).</p><p>The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an XMM register. The results are stored in the low quadword of the destination operand, and the high quadword remains unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPI2PS instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed signed doubleword integers in the source operand (second operand) to two packed single-precision floating-point values in the destination operand (first operand)."
             };

         case "CVTPS2DQ":
         case "VCVTPS2DQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPS2DQ.html",
                 "html": "<p>Converts four or eight packed single-precision floating-point values in the source operand to four or eight signed doubleword integers in the destination operand.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operation is a YMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Converts four or eight packed single-precision floating-point values in the source operand to four or eight signed doubleword integers in the destination operand."
             };

         case "CVTPS2PD":
         case "VCVTPS2PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPS2PD.html",
                 "html": "<p>Converts two or four packed single-precision floating-point values in the source operand (second operand) to two or four packed double-precision floating-point values in the destination operand (first operand).</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 64- bit memory location. The destination operation is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 64- bit memory location. The destination operation is a YMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operation is a YMM register.</p>",
                 "tooltip": "Converts two or four packed single-precision floating-point values in the source operand (second operand) to two or four packed double-precision floating-point values in the destination operand (first operand)."
             };

         case "CVTPS2PI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTPS2PI.html",
                 "html": "<p>Converts two packed single-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand).</p><p>The source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX technology register. When the source operand is an XMM register, the two single-precision floating-point values are contained in the low quadword of the register. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>CVTPS2PI causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPS2PI instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed single-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand)."
             };

         case "CVTSD2SI":
         case "VCVTSD2SI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSD2SI.html",
                 "html": "<p>Converts a double-precision floating-point value in the source operand (second operand) to a signed doubleword integer in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. Use of the REX.W prefix promotes the instruction to 64-bit operation. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Legacy SSE instructions: Use of the REX.W prefix promotes the instruction to 64-bit operation. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a double-precision floating-point value in the source operand (second operand) to a signed doubleword integer in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register."
             };

         case "CVTSD2SS":
         case "VCVTSD2SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSD2SS.html",
                 "html": "<p>Converts a double-precision floating-point value in the source operand (second operand) to a single-precision floating-point value in the destination operand (first operand).</p><p>The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register. The result is stored in the low doubleword of the destination operand, and the upper 3 doublewords are left unchanged. When the conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Converts a double-precision floating-point value in the source operand (second operand) to a single-precision floating-point value in the destination operand (first operand)."
             };

         case "CVTSI2SD":
         case "VCVTSI2SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSI2SD.html",
                 "html": "<p>Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the second source operand to a double-precision floating-point value in the destination operand. The result is stored in the low quadword of the destination operand, and the high quadword is left unchanged. When conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>Legacy SSE instructions: Use of the REX.W prefix promotes the instruction to 64-bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p><p>The second source operand can be a general-purpose register or a 32/64-bit memory location. The first source and destination operands are XMM registers.</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the second source operand to a double-precision floating-point value in the destination operand. The result is stored in the low quadword of the destination operand, and the high quadword is left unchanged. When conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register."
             };

         case "VCVTSI2SS":
         case "CVTSI2SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSI2SS.html",
                 "html": "<p>Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the source operand (second operand) to a single-precision floating-point value in the destination operand (first operand). The source operand can be a general-purpose register or a memory location. The destination operand is an XMM register. The result is stored in the low doubleword of the destination operand, and the upper three doublewords are left unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>Legacy SSE instructions: In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. Use of the REX.W prefix promotes the instruction to 64-bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:32) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the source operand (second operand) to a single-precision floating-point value in the destination operand (first operand). The source operand can be a general-purpose register or a memory location. The destination operand is an XMM register. The result is stored in the low doubleword of the destination operand, and the upper three doublewords are left unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register."
             };

         case "VCVTSS2SD":
         case "CVTSS2SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSS2SD.html",
                 "html": "<p>Converts a single-precision floating-point value in the source operand (second operand) to a double-precision floating-point value in the destination operand (first operand). The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register. The result is stored in the low quadword of the destination operand, and the high quadword is left unchanged.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Converts a single-precision floating-point value in the source operand (second operand) to a double-precision floating-point value in the destination operand (first operand). The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register. The result is stored in the low quadword of the destination operand, and the high quadword is left unchanged."
             };

         case "VCVTSS2SI":
         case "CVTSS2SI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTSS2SI.html",
                 "html": "<p>Converts a single-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. Use of the REX.W prefix promotes the instruction to 64-bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Legacy SSE instructions: In 64-bit mode, use of the REX.W prefix promotes the instruction to 64-bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a single-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register."
             };

         case "VCVTTPD2DQ":
         case "CVTTPD2DQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTPD2DQ.html",
                 "html": "<p>Converts two or four packed double-precision floating-point values in the source operand (second operand) to two or four packed signed doubleword integers in the destination operand (first operand).</p><p>When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination operand is a YMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Converts two or four packed double-precision floating-point values in the source operand (second operand) to two or four packed signed doubleword integers in the destination operand (first operand)."
             };

         case "CVTTPD2PI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTPD2PI.html",
                 "html": "<p>Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand). The source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX technology register.</p><p>When a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTTPD2PI instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed double-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand). The source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX technology register."
             };

         case "CVTTPS2DQ":
         case "VCVTTPS2DQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTPS2DQ.html",
                 "html": "<p>Converts four or eight packed single-precision floating-point values in the source operand to four or eight signed doubleword integers in the destination operand.</p><p>When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination operand is a YMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Converts four or eight packed single-precision floating-point values in the source operand to four or eight signed doubleword integers in the destination operand."
             };

         case "CVTTPS2PI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTPS2PI.html",
                 "html": "<p>Converts two packed single-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is an MMX technology register. When the source operand is an XMM register, the two single-precision floating-point values are contained in the low quadword of the register.</p><p>When a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTTPS2PI instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Converts two packed single-precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is an MMX technology register. When the source operand is an XMM register, the two single-precision floating-point values are contained in the low quadword of the register."
             };

         case "CVTTSD2SI":
         case "VCVTTSD2SI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTSD2SI.html",
                 "html": "<p>Converts a double-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register.</p><p>When a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger than the maximum signed doubleword integer, the floating point invalid exception is raised. If this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>Legacy SSE instructions: In 64-bit mode, use of the REX.W prefix promotes the instruction to 64-bit operation. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a double-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register."
             };

         case "CVTTSS2SI":
         case "VCVTTSS2SI":
             return {
                 "url": "http://www.felixcloutier.com/x86/CVTTSS2SI.html",
                 "html": "<p>Converts a single-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a 32-bit memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register.</p><p>When a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised. If this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>Legacy SSE instructions: In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. Use of the REX.W prefix promotes the instruction to 64-bit operation. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a single-precision floating-point value in the source operand (second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (first operand). The source operand can be an XMM register or a 32-bit memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register."
             };

         case "DAA":
             return {
                 "url": "http://www.felixcloutier.com/x86/DAA.html",
                 "html": "<p>Adjusts the sum of two packed BCD values to create a packed BCD result. The AL register is the implied source and destination operand. The DAA instruction is only useful when it follows an ADD instruction that adds (binary addition) two 2-digit, packed BCD values and stores a byte result in the AL register. The DAA instruction then adjusts the contents of the AL register to contain the correct 2-digit, packed BCD result. If a decimal carry is detected, the CF and AF flags are set accordingly.</p><p>This instruction executes as described above in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts the sum of two packed BCD values to create a packed BCD result. The AL register is the implied source and destination operand. The DAA instruction is only useful when it follows an ADD instruction that adds (binary addition) two 2-digit, packed BCD values and stores a byte result in the AL register. The DAA instruction then adjusts the contents of the AL register to contain the correct 2-digit, packed BCD result. If a decimal carry is detected, the CF and AF flags are set accordingly."
             };

         case "DAS":
             return {
                 "url": "http://www.felixcloutier.com/x86/DAS.html",
                 "html": "<p>Adjusts the result of the subtraction of two packed BCD values to create a packed BCD result. The AL register is the implied source and destination operand. The DAS instruction is only useful when it follows a SUB instruction that subtracts (binary subtraction) one 2-digit, packed BCD value from another and stores a byte result in the AL register. The DAS instruction then adjusts the contents of the AL register to contain the correct 2-digit, packed BCD result. If a decimal borrow is detected, the CF and AF flags are set accordingly.</p><p>This instruction executes as described above in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Adjusts the result of the subtraction of two packed BCD values to create a packed BCD result. The AL register is the implied source and destination operand. The DAS instruction is only useful when it follows a SUB instruction that subtracts (binary subtraction) one 2-digit, packed BCD value from another and stores a byte result in the AL register. The DAS instruction then adjusts the contents of the AL register to contain the correct 2-digit, packed BCD result. If a decimal borrow is detected, the CF and AF flags are set accordingly."
             };

         case "DEC":
             return {
                 "url": "http://www.felixcloutier.com/x86/DEC.html",
                 "html": "<p>Subtracts 1 from the destination operand, while preserving the state of the CF flag. The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (To perform a decrement operation that updates the CF flag, use a SUB instruction with an immediate operand of 1.)</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, DEC r16 and DEC r32 are not encodable (because opcodes 48H through 4FH are REX prefixes). Otherwise, the instruction\u2019s 64-bit mode default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits.</p><p>See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Subtracts 1 from the destination operand, while preserving the state of the CF flag. The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (To perform a decrement operation that updates the CF flag, use a SUB instruction with an immediate operand of 1.)"
             };

         case "DIV":
             return {
                 "url": "http://www.felixcloutier.com/x86/DIV.html",
                 "html": "<p>Divides unsigned the value in the AX, DX:AX, EDX:EAX, or RDX:RAX registers (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, EDX:EAX, or RDX:RAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor). Division using 64-bit operand is available only in 64-bit mode.</p><p>Non-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magnitude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. In 64-bit mode when REX.W is applied, the instruction divides the unsigned value in RDX:RAX by the source operand and stores the quotient in RAX, the remainder in RDX.</p><p>See the summary chart at the beginning of this section for encoding data and limits. See Table 3-25.</p><h3>Table 3-25.  DIV Action</h3>",
                 "tooltip": "Divides unsigned the value in the AX, DX:AX, EDX:EAX, or RDX:RAX registers (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, EDX:EAX, or RDX:RAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor). Division using 64-bit operand is available only in 64-bit mode."
             };

         case "DIVPD":
         case "VDIVPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/DIVPD.html",
                 "html": "<p>Performs a SIMD divide of the two or four packed double-precision floating-point values in the first source operand by the two or four packed double-precision floating-point values in the second source operand. See Chapter 11 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a SIMD double-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD divide of the two or four packed double-precision floating-point values in the first source operand by the two or four packed double-precision floating-point values in the second source operand. See Chapter 11 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an overview of a SIMD double-precision floating-point operation."
             };

         case "VDIVPS":
         case "DIVPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/DIVPS.html",
                 "html": "<p>Performs a SIMD divide of the four or eight packed single-precision floating-point values in the first source operand by the four or eight packed single-precision floating-point values in the second source operand. See Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a SIMD single-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD divide of the four or eight packed single-precision floating-point values in the first source operand by the four or eight packed single-precision floating-point values in the second source operand. See Chapter 10 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an overview of a SIMD single-precision floating-point operation."
             };

         case "VDIVSD":
         case "DIVSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/DIVSD.html",
                 "html": "<p>Divides the low double-precision floating-point value in the first source operand by the low double-precision floating-point value in the second source operand, and stores the double-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. The high quadword of the destination operand is copied from the high quadword of the first source operand. See Chapter 11 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a scalar double-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Divides the low double-precision floating-point value in the first source operand by the low double-precision floating-point value in the second source operand, and stores the double-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. The high quadword of the destination operand is copied from the high quadword of the first source operand. See Chapter 11 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an overview of a scalar double-precision floating-point operation."
             };

         case "DIVSS":
         case "VDIVSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/DIVSS.html",
                 "html": "<p>Divides the low single-precision floating-point value in the first source operand by the low single-precision floating-point value in the second source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands are XMM registers. The three high-order doublewords of the destination are copied from the same dwords of the first source operand. See Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an overview of a scalar single-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Divides the low single-precision floating-point value in the first source operand by the low single-precision floating-point value in the second source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands are XMM registers. The three high-order doublewords of the destination are copied from the same dwords of the first source operand. See Chapter 10 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an overview of a scalar single-precision floating-point operation."
             };

         case "VDPPD":
         case "DPPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/DPPD.html",
                 "html": "<p>Conditionally multiplies the packed double-precision floating-point values in the destination operand (first operand) with the packed double-precision floating-point values in the source (second operand) depending on a mask extracted from bits [5:4] of the immediate operand (third operand). If a condition mask bit is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>The two resulting double-precision values are summed into an intermediate result. The intermediate result is conditionally broadcast to the destination using a broadcast mask specified by bits [1:0] of the immediate byte.</p><p>If a broadcast mask bit is \"1\", the intermediate result is copied to the corresponding qword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.</p><p>DPPD follows the NaN forwarding rules stated in the Software Developer\u2019s Manual, vol. 1, table 4.7. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Conditionally multiplies the packed double-precision floating-point values in the destination operand (first operand) with the packed double-precision floating-point values in the source (second operand) depending on a mask extracted from bits [5:4] of the immediate operand (third operand). If a condition mask bit is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1."
             };

         case "DPPS":
         case "VDPPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/DPPS.html",
                 "html": "<p>Conditionally multiplies the packed single-precision floating-point values in the destination operand (first operand) with the packed single-precision floating-point values in the source (second operand) depending on a mask extracted from the high 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>The four resulting single-precision values are summed into an intermediate result. The intermediate result is conditionally broadcast to the destination using a broadcast mask specified by bits [3:0] of the immediate byte.</p><p>If a broadcast mask bit is \"1\", the intermediate result is copied to the corresponding dword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.</p><p>DPPS follows the NaN forwarding rules stated in the Software Developer\u2019s Manual, vol. 1, table 4.7. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Conditionally multiplies the packed single-precision floating-point values in the destination operand (first operand) with the packed single-precision floating-point values in the source (second operand) depending on a mask extracted from the high 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1."
             };

         case "EMMS":
             return {
                 "url": "http://www.felixcloutier.com/x86/EMMS.html",
                 "html": "<p>Sets the values of all the tags in the x87 FPU tag word to empty (all 1s). This operation marks the x87 FPU data registers (which are aliased to the MMX technology registers) as available for use by x87 FPU floating-point instructions. (See Figure 8-7 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for the format of the x87 FPU tag word.) All other MMX instructions (other than the EMMS instruction) set all the tags in x87 FPU tag word to valid (all 0s).</p><p>The EMMS instruction must be used to clear the MMX technology state at the end of all MMX technology procedures or subroutines and before calling other procedures or subroutines that may execute x87 floating-point instructions. If a floating-point instruction loads one of the registers in the x87 FPU data register stack before the x87 FPU tag word has been reset by the EMMS instruction, an x87 floating-point register stack overflow can occur that will result in an x87 floating-point exception or incorrect result.</p><p>EMMS operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Sets the values of all the tags in the x87 FPU tag word to empty (all 1s). This operation marks the x87 FPU data registers (which are aliased to the MMX technology registers) as available for use by x87 FPU floating-point instructions. (See Figure 8-7 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for the format of the x87 FPU tag word.) All other MMX instructions (other than the EMMS instruction) set all the tags in x87 FPU tag word to valid (all 0s)."
             };

         case "ENTER":
             return {
                 "url": "http://www.felixcloutier.com/x86/ENTER.html",
                 "html": "<p>Creates a stack frame for a procedure. The first operand (size operand) specifies the size of the stack frame (that is, the number of bytes of dynamic storage allocated on the stack for the procedure). The second operand (nesting level operand) gives the lexical nesting level (0 to 31) of the procedure. The nesting level determines the number of stack frame pointers that are copied into the \u201cdisplay area\u201d of the new stack frame from the preceding frame. Both of these operands are immediate values.</p><p>The stack-size attribute determines whether the BP (16 bits), EBP (32 bits), or RBP (64 bits) register specifies the current frame pointer and whether SP (16 bits), ESP (32 bits), or RSP (64 bits) specifies the stack pointer. In 64-bit mode, the stack-size attribute is always 64 bits.</p><p>The ENTER and companion LEAVE instructions are provided to support block structured languages. The ENTER instruction (when used) is typically the first instruction in a procedure and is used to set up a new stack frame for a procedure. The LEAVE instruction is then used at the end of the procedure (just before the RET instruction) to release the stack frame.</p><p>If the nesting level is 0, the processor pushes the frame pointer from the BP/EBP/RBP register onto the stack, copies the current stack pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and loads the SP/ESP/RSP register with the current stack-pointer value minus the value in the size operand. For nesting levels of 1 or greater, the processor pushes additional frame pointers on the stack before adjusting the stack pointer. These additional frame pointers provide the called procedure with access points to other nested frames on the stack. See \u201cProcedure Calls for Block-Structured Languages\u201d in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information about the actions of the ENTER instruction.</p><p>The ENTER instruction causes a page fault whenever a write using the final value of the stack pointer (within the current stack segment) would do so.</p>",
                 "tooltip": "Creates a stack frame for a procedure. The first operand (size operand) specifies the size of the stack frame (that is, the number of bytes of dynamic storage allocated on the stack for the procedure). The second operand (nesting level operand) gives the lexical nesting level (0 to 31) of the procedure. The nesting level determines the number of stack frame pointers that are copied into the \u201cdisplay area\u201d of the new stack frame from the preceding frame. Both of these operands are immediate values."
             };

         case "EXTRACTPS":
         case "VEXTRACTPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/EXTRACTPS.html",
                 "html": "<p>Extracts a single-precision floating-point value from the source operand (second operand) at the 32-bit offset specified by imm8. Immediate bits higher than the most significant offset for the vector length are ignored.</p><p>The extracted single-precision floating-point value is stored in the low 32 bits of the destination operand.</p><p>In 64-bit mode, the destination register operand has a default operand size of 64 bits. The upper 32 bits of the register are filled with zero. REX.W is ignored.</p><p>128-bit Legacy SSE version: When a REX.W prefix is used in 64-bit mode with a general purpose register (GPR) as a destination operand, the packed single quantity is zero extended to 64 bits.</p><p>VEX.128 encoded version: When VEX.128.66.0F3A.W1 17 form is used in 64-bit mode with a general purpose register (GPR) as a destination operand, the packed single quantity is zero extended to 64 bits. VEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "Extracts a single-precision floating-point value from the source operand (second operand) at the 32-bit offset specified by imm8. Immediate bits higher than the most significant offset for the vector length are ignored."
             };

         case "F2XM1":
             return {
                 "url": "http://www.felixcloutier.com/x86/F2XM1.html",
                 "html": "<p>Computes the exponential value of 2 to the power of the source operand minus 1. The source operand is located in register ST(0) and the result is also stored in ST(0). The value of the source operand must lie in the range \u20131.0 to +1.0. If the source value is outside this range, the result is undefined.</p><p>The following table shows the results obtained when computing the exponential value of various classes of numbers, assuming that neither overflow nor underflow occurs.</p><h3>Table 3-26.  Results Obtained from F2XM1</h3><table>\n<tr>\n<th>ST(0) SRC</th>\n<th>ST(0) DEST</th></tr>\n<tr>\n<td>\u2212 1.0 to \u22120</td>\n<td>\u2212 0.5 to \u2212 0</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>\u2212 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ 0 to +1.0</td>\n<td>+ 0 to 1.0</td></tr></table><p>Values other than 2 can be exponentiated using the following formula:</p><p>x<sup>y</sup> = 2<sup>(y \u00d7 log<sub>2</sub> x)</sup></p>",
                 "tooltip": "Computes the exponential value of 2 to the power of the source operand minus 1. The source operand is located in register ST(0) and the result is also stored in ST(0). The value of the source operand must lie in the range \u20131.0 to +1.0. If the source value is outside this range, the result is undefined."
             };

         case "FABS":
             return {
                 "url": "http://www.felixcloutier.com/x86/FABS.html",
                 "html": "<p>Clears the sign bit of ST(0) to create the absolute value of the operand. The following table shows the results obtained when creating the absolute value of various classes of numbers.</p><h3>Table 3-27.  Results Obtained from FABS</h3><table>\n<tr>\n<th>ST(0) SRC</th>\n<th>ST(0) DEST</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>+ \u221e</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>+ F</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ F</td>\n<td>+ F</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>+ \u221e</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Clears the sign bit of ST(0) to create the absolute value of the operand. The following table shows the results obtained when creating the absolute value of various classes of numbers."
             };

         case "FBLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/FBLD.html",
                 "html": "<p>Converts the BCD source operand into double extended-precision floating-point format and pushes the value onto the FPU stack. The source operand is loaded without rounding errors. The sign of the source operand is preserved, including that of \u22120.</p><p>The packed BCD digits are assumed to be in the range 0 through 9; the instruction does not check for invalid digits (AH through FH). Attempting to load an invalid encoding produces an undefined result.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Converts the BCD source operand into double extended-precision floating-point format and pushes the value onto the FPU stack. The source operand is loaded without rounding errors. The sign of the source operand is preserved, including that of \u22120."
             };

         case "FBSTP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FBSTP.html",
                 "html": "<p>Converts the value in the ST(0) register to an 18-digit packed BCD integer, stores the result in the destination operand, and pops the register stack. If the source value is a non-integral value, it is rounded to an integer value, according to the rounding mode specified by the RC field of the FPU control word. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.</p><p>The destination operand specifies the address where the first byte of the destination value is to be stored. The BCD value (including its sign bit) requires 10 bytes of space in memory.</p><p>The following table shows the results obtained when storing various classes of numbers in packed BCD format.</p><h3>Table 3-29.  FBSTP Results</h3><table>\n<tr>\n<th>ST(0)</th>\n<th>DEST</th></tr>\n<tr>\n<td>\u2212 \u221e or Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>F \u2264 \u2212 1</td>\n<td>\u2212 D</td></tr>\n<tr>\n<td>\u22121 &lt; F &lt; -0</td>\n<td>**</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>\u2212 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ 0 &lt; F &lt; +1</td>\n<td>**</td></tr>\n<tr>\n<td>F \u2265 +1</td>\n<td>+ D</td></tr>\n<tr>\n<td>+ \u221e or Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>*</td></tr></table>",
                 "tooltip": "Converts the value in the ST(0) register to an 18-digit packed BCD integer, stores the result in the destination operand, and pops the register stack. If the source value is a non-integral value, it is rounded to an integer value, according to the rounding mode specified by the RC field of the FPU control word. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1."
             };

         case "FCHS":
             return {
                 "url": "http://www.felixcloutier.com/x86/FCHS.html",
                 "html": "<p>Complements the sign bit of ST(0). This operation changes a positive value into a negative value of equal magnitude or vice versa. The following table shows the results obtained when changing the sign of various classes of numbers.</p><h3>Table 3-30.  FCHS Results</h3><table>\n<tr>\n<th>ST(0) SRC</th>\n<th>ST(0) DEST</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>+ \u221e</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>+ F</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>\u2212 0</td></tr>\n<tr>\n<td>+ F</td>\n<td>\u2212 F</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>\u2212 \u221e</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Complements the sign bit of ST(0). This operation changes a positive value into a negative value of equal magnitude or vice versa. The following table shows the results obtained when changing the sign of various classes of numbers."
             };

         case "FCMOVU":
         case "FCMOVBE":
         case "FCMOVNU":
         case "FCMOVNBE":
         case "FCMOVE":
         case "FCMOVNB":
         case "FCMOVNE":
         case "FCMOVB":
             return {
                 "url": "http://www.felixcloutier.com/x86/FCMOVNU.html",
                 "html": "<p>Tests the status flags in the EFLAGS register and moves the source operand (second operand) to the destination operand (first operand) if the given test condition is true. The condition for each mnemonic is given in the Description column above and in Chapter 8 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>. The source operand is always in the ST(i) register and the destination operand is always ST(0).</p><p>The FCMOV<em>cc</em> instructions are useful for optimizing small IF constructions. They also help eliminate branching overhead for IF operations and the possibility of branch mispredictions by the processor.</p><p>A processor may not support the FCMOV<em>cc</em> instructions. Software can check if the FCMOV<em>cc</em> instructions are supported by checking the processor\u2019s feature information with the CPUID instruction (see \u201cCPUID\u2014CPU Identification\u201d in this chapter). If both the CMOV and FPU feature bits are set, the FCMOV<em>cc</em> instructions are supported.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Tests the status flags in the EFLAGS register and moves the source operand (second operand) to the destination operand (first operand) if the given test condition is true. The condition for each mnemonic is given in the Description column above and in Chapter 8 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1. The source operand is always in the ST(i) register and the destination operand is always ST(0)."
             };

         case "FCOM":
         case "FCOMP":
         case "FCOMPP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FCOMPP.html",
                 "html": "<p>Compares the contents of register ST(0) and source value and sets condition code flags C0, C2, and C3 in the FPU status word according to the results (see the table below). The source operand can be a data register or a memory location. If no source operand is given, the value in ST(0) is compared with the value in ST(1). The sign of zero is ignored, so that \u20130.0 is equal to +0.0.</p><h3>Table 3-31.  FCOM/FCOMP/FCOMPP Results</h3><table>\n<tr>\n<th>Condition</th>\n<th>C3</th>\n<th>C2</th>\n<th>C0</th></tr>\n<tr>\n<td>ST(0) &gt; SRC</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>ST(0) &lt; SRC</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>ST(0) = SRC</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Unordered*</td>\n<td>1</td>\n<td>1</td>\n<td>1</td></tr></table><p><strong>NOTES:</strong></p><p>*</p>",
                 "tooltip": "Compares the contents of register ST(0) and source value and sets condition code flags C0, C2, and C3 in the FPU status word according to the results (see the table below). The source operand can be a data register or a memory location. If no source operand is given, the value in ST(0) is compared with the value in ST(1). The sign of zero is ignored, so that \u20130.0 is equal to +0.0."
             };

         case "FCOS":
             return {
                 "url": "http://www.felixcloutier.com/x86/FCOS.html",
                 "html": "<p>Computes the cosine of the source operand in register ST(0) and stores the result in ST(0). The source operand must be given in radians and must be within the range \u22122<sup>63</sup> to +2<sup>63</sup>. The following table shows the results obtained when taking the cosine of various classes of numbers.</p><h3>Table 3-33.  FCOS Results</h3><table>\n<tr>\n<th>ST(0) SRC</th>\n<th>ST(0) DEST</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>\u22121 to +1</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>+ 1</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 1</td></tr>\n<tr>\n<td>+ F</td>\n<td>\u2212 1 to + 1</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Computes the cosine of the source operand in register ST(0) and stores the result in ST(0). The source operand must be given in radians and must be within the range \u22122^63 to +2^63. The following table shows the results obtained when taking the cosine of various classes of numbers."
             };

         case "FDECSTP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FDECSTP.html",
                 "html": "<p>Subtracts one from the TOP field of the FPU status word (decrements the top-of-stack pointer). If the TOP field contains a 0, it is set to 7. The effect of this instruction is to rotate the stack by one position. The contents of the FPU data registers and tag register are not affected.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Subtracts one from the TOP field of the FPU status word (decrements the top-of-stack pointer). If the TOP field contains a 0, it is set to 7. The effect of this instruction is to rotate the stack by one position. The contents of the FPU data registers and tag register are not affected."
             };

         case "FFREE":
             return {
                 "url": "http://www.felixcloutier.com/x86/FFREE.html",
                 "html": "<p>Sets the tag in the FPU tag register associated with register ST(i) to empty (11B). The contents of ST(i) and the FPU stack-top pointer (TOP) are not affected.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Sets the tag in the FPU tag register associated with register ST(i) to empty (11B). The contents of ST(i) and the FPU stack-top pointer (TOP) are not affected."
             };

         case "FADDP":
         case "FIADD":
         case "FADD":
             return {
                 "url": "http://www.felixcloutier.com/x86/FIADD.html",
                 "html": "<p>Adds the destination and source operands and stores the sum in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction adds the contents of the ST(0) register to the ST(1) register. The one-operand version adds the contents of a memory location (either a floating-point or an integer value) to the contents of the ST(0) register. The two-operand version adds the contents of the ST(0) register to the ST(i) register or vice versa. The value in ST(0) can be doubled by coding:</p><p>FADD ST(0), ST(0);</p><p>The FADDP instructions perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. (The no-operand version of the floating-point add instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FADD rather than FADDP.)</p><p>The FIADD instructions convert an integer source operand to double extended-precision floating-point format before performing the addition.</p>",
                 "tooltip": "Adds the destination and source operands and stores the sum in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FICOM":
         case "FICOMP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FICOMP.html",
                 "html": "<p>Compares the value in ST(0) with an integer source operand and sets the condition code flags C0, C2, and C3 in the FPU status word according to the results (see table below). The integer value is converted to double extended-precision floating-point format before the comparison is made.</p><h3>Table 3-36.  FICOM/FICOMP Results</h3><table>\n<tr>\n<th>Condition</th>\n<th>C3</th>\n<th>C2</th>\n<th>C0</th></tr>\n<tr>\n<td>ST(0) &gt; SRC</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>ST(0) &lt; SRC</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>ST(0) = SRC</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Unordered</td>\n<td>1</td>\n<td>1</td>\n<td>1</td></tr></table><p>These instructions perform an \u201cunordered comparison.\u201d An unordered comparison also checks the class of the numbers being compared (see \u201cFXAM\u2014Examine ModR/M\u201d in this chapter). If either operand is a NaN or is in an undefined format, the condition flags are set to \u201cunordered.\u201d</p><p>The sign of zero is ignored, so that \u20130.0 is equal to +0.0.</p>",
                 "tooltip": "Compares the value in ST(0) with an integer source operand and sets the condition code flags C0, C2, and C3 in the FPU status word according to the results (see table below). The integer value is converted to double extended-precision floating-point format before the comparison is made."
             };

         case "FDIVP":
         case "FIDIV":
         case "FDIV":
             return {
                 "url": "http://www.felixcloutier.com/x86/FIDIV.html",
                 "html": "<p>Divides the destination operand by the source operand and stores the result in the destination location. The destination operand (dividend) is always in an FPU register; the source operand (divisor) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction divides the contents of the ST(1) register by the contents of the ST(0) register. The one-operand version divides the contents of the ST(0) register by the contents of a memory location (either a floating-point or an integer value). The two-operand version divides the contents of the ST(0) register by the contents of the ST(i) register or vice versa.</p><p>The FDIVP instructions perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point divide instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FDIV rather than FDIVP.</p><p>The FIDIV instructions convert an integer source operand to double extended-precision floating-point format before performing the division. When the source operand is an integer 0, it is treated as a +0.</p><p>If an unmasked divide-by-zero exception (#Z) is generated, no result is stored; if the exception is masked, an \u221e of the appropriate sign is stored in the destination operand.</p>",
                 "tooltip": "Divides the destination operand by the source operand and stores the result in the destination location. The destination operand (dividend) is always in an FPU register; the source operand (divisor) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FIDIVR":
         case "FDIVR":
         case "FDIVRP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FIDIVR.html",
                 "html": "<p>Divides the source operand by the destination operand and stores the result in the destination location. The destination operand (divisor) is always in an FPU register; the source operand (dividend) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>These instructions perform the reverse operations of the FDIV, FDIVP, and FIDIV instructions. They are provided to support more efficient coding.</p><p>The no-operand version of the instruction divides the contents of the ST(0) register by the contents of the ST(1) register. The one-operand version divides the contents of a memory location (either a floating-point or an integer value) by the contents of the ST(0) register. The two-operand version divides the contents of the ST(i) register by the contents of the ST(0) register or vice versa.</p><p>The FDIVRP instructions perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point divide instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FDIVR rather than FDIVRP.</p><p>The FIDIVR instructions convert an integer source operand to double extended-precision floating-point format before performing the division.</p>",
                 "tooltip": "Divides the source operand by the destination operand and stores the result in the destination location. The destination operand (divisor) is always in an FPU register; the source operand (dividend) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FILD":
             return {
                 "url": "http://www.felixcloutier.com/x86/FILD.html",
                 "html": "<p>Converts the signed-integer source operand into double extended-precision floating-point format and pushes the value onto the FPU register stack. The source operand can be a word, doubleword, or quadword integer. It is loaded without rounding errors. The sign of the source operand is preserved.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Converts the signed-integer source operand into double extended-precision floating-point format and pushes the value onto the FPU register stack. The source operand can be a word, doubleword, or quadword integer. It is loaded without rounding errors. The sign of the source operand is preserved."
             };

         case "FMUL":
         case "FMULP":
         case "FIMUL":
             return {
                 "url": "http://www.felixcloutier.com/x86/FIMUL.html",
                 "html": "<p>Multiplies the destination and source operands and stores the product in the destination location. The destination operand is always an FPU data register; the source operand can be an FPU data register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction multiplies the contents of the ST(1) register by the contents of the ST(0) register and stores the product in the ST(1) register. The one-operand version multiplies the contents of the ST(0) register by the contents of a memory location (either a floating-point or an integer value) and stores the product in the ST(0) register. The two-operand version multiplies the contents of the ST(0) register by the contents of the ST(i) register, or vice versa, with the result being stored in the register specified with the first operand (the destination operand).</p><p>The FMULP instructions perform the additional operation of popping the FPU register stack after storing the product. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point multiply instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FMUL rather than FMULP.</p><p>The FIMUL instructions convert an integer source operand to double extended-precision floating-point format before performing the multiplication.</p><p>The sign of the result is always the exclusive-OR of the source signs, even if one or more of the values being multiplied is 0 or \u221e. When the source operand is an integer 0, it is treated as a +0.</p>",
                 "tooltip": "Multiplies the destination and source operands and stores the product in the destination location. The destination operand is always an FPU data register; the source operand can be an FPU data register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FINCSTP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FINCSTP.html",
                 "html": "<p>Adds one to the TOP field of the FPU status word (increments the top-of-stack pointer). If the TOP field contains a 7, it is set to 0. The effect of this instruction is to rotate the stack by one position. The contents of the FPU data registers and tag register are not affected. This operation is not equivalent to popping the stack, because the tag for the previous top-of-stack register is not marked empty.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Adds one to the TOP field of the FPU status word (increments the top-of-stack pointer). If the TOP field contains a 7, it is set to 0. The effect of this instruction is to rotate the stack by one position. The contents of the FPU data registers and tag register are not affected. This operation is not equivalent to popping the stack, because the tag for the previous top-of-stack register is not marked empty."
             };

         case "FISTP":
         case "FIST":
             return {
                 "url": "http://www.felixcloutier.com/x86/FISTP.html",
                 "html": "<p>The FIST instruction converts the value in the ST(0) register to a signed integer and stores the result in the destination operand. Values can be stored in word or doubleword integer format. The destination operand specifies the address where the first byte of the destination value is to be stored.</p><p>The FISTP instruction performs the same operation as the FIST instruction and then pops the register stack. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The FISTP instruction also stores values in quadword integer format.</p><p>The following table shows the results obtained when storing various classes of numbers in integer format.</p><h3>Table 3-37.  FIST/FISTP Results</h3><table>\n<tr>\n<th>ST(0)</th>\n<th>DEST</th></tr>\n<tr>\n<td>\u2212 \u221e or Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>F \u2264 \u22121</td>\n<td>\u2212 I</td></tr>\n<tr>\n<td>\u22121 &lt; F &lt; \u22120</td>\n<td>**</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>0</td></tr>\n<tr>\n<td>+ 0 &lt; F &lt; + 1</td>\n<td>**</td></tr>\n<tr>\n<td>F \u2265 + 1</td>\n<td>+ I</td></tr>\n<tr>\n<td>+ \u221e or Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>*</td></tr>\n<tr>\n<td></td>\n<td></td></tr></table>",
                 "tooltip": "The FIST instruction converts the value in the ST(0) register to a signed integer and stores the result in the destination operand. Values can be stored in word or doubleword integer format. The destination operand specifies the address where the first byte of the destination value is to be stored."
             };

         case "FISTTP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FISTTP.html",
                 "html": "<p>FISTTP converts the value in ST into a signed integer using truncation (chop) as rounding mode, transfers the result to the destination, and pops ST. FISTTP accepts word, short integer, and long integer destinations.</p><p>The following table shows the results obtained when storing various classes of numbers in integer format.</p><h3>Table 3-38.  FISTTP Results</h3><table>\n<tr>\n<th>ST(0)</th>\n<th>DEST</th></tr>\n<tr>\n<td>\u2212 \u221e  or  Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>F \u2264  \u2212 1</td>\n<td>\u2212 I</td></tr>\n<tr>\n<td>\u2212 1 &lt; F &lt; + 1</td>\n<td>0</td></tr>\n<tr>\n<td>F \u2265 + 1</td>\n<td>+ I</td></tr>\n<tr>\n<td>+ \u221e  or Value Too Large for DEST Format</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>*</td></tr></table><p><strong>NOTES:</strong></p>",
                 "tooltip": "FISTTP converts the value in ST into a signed integer using truncation (chop) as rounding mode, transfers the result to the destination, and pops ST. FISTTP accepts word, short integer, and long integer destinations."
             };

         case "FSUB":
         case "FSUBP":
         case "FISUB":
             return {
                 "url": "http://www.felixcloutier.com/x86/FISUB.html",
                 "html": "<p>Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction subtracts the contents of the ST(0) register from the ST(1) register and stores the result in ST(1). The one-operand version subtracts the contents of a memory location (either a floating-point or an integer value) from the contents of the ST(0) register and stores the result in ST(0). The two-operand version subtracts the contents of the ST(0) register from the ST(i) register or vice versa.</p><p>The FSUBP instructions perform the additional operation of popping the FPU register stack following the subtraction. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point subtract instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FSUB rather than FSUBP.</p><p>The FISUB instructions convert an integer source operand to double extended-precision floating-point format before performing the subtraction.</p><p>Table 3-48 shows the results obtained when subtracting various classes of numbers from one another, assuming that neither overflow nor underflow occurs. Here, the SRC value is subtracted from the DEST value (DEST \u2212 SRC = result).</p>",
                 "tooltip": "Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FSUBRP":
         case "FISUBR":
         case "FSUBR":
             return {
                 "url": "http://www.felixcloutier.com/x86/FISUBR.html",
                 "html": "<p>Subtracts the destination operand from the source operand and stores the difference in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>These instructions perform the reverse operations of the FSUB, FSUBP, and FISUB instructions. They are provided to support more efficient coding.</p><p>The no-operand version of the instruction subtracts the contents of the ST(1) register from the ST(0) register and stores the result in ST(1). The one-operand version subtracts the contents of the ST(0) register from the contents of a memory location (either a floating-point or an integer value) and stores the result in ST(0). The two-operand version subtracts the contents of the ST(i) register from the ST(0) register or vice versa.</p><p>The FSUBRP instructions perform the additional operation of popping the FPU register stack following the subtraction. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point reverse subtract instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FSUBR rather than FSUBRP.</p><p>The FISUBR instructions convert an integer source operand to double extended-precision floating-point format before performing the subtraction.</p>",
                 "tooltip": "Subtracts the destination operand from the source operand and stores the difference in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format."
             };

         case "FLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/FLD.html",
                 "html": "<p>Pushes the source operand onto the FPU register stack. The source operand can be in single-precision, double-precision, or double extended-precision floating-point format. If the source operand is in single-precision or double-precision floating-point format, it is automatically converted to the double extended-precision floating-point format before being pushed on the stack.</p><p>The FLD instruction can also push the value in a selected FPU register [ST(i)] onto the stack. Here, pushing register ST(0) duplicates the stack top.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Pushes the source operand onto the FPU register stack. The source operand can be in single-precision, double-precision, or double extended-precision floating-point format. If the source operand is in single-precision or double-precision floating-point format, it is automatically converted to the double extended-precision floating-point format before being pushed on the stack."
             };

         case "FLDCW":
             return {
                 "url": "http://www.felixcloutier.com/x86/FLDCW.html",
                 "html": "<p>Loads the 16-bit source operand into the FPU control word. The source operand is a memory location. This instruction is typically used to establish or change the FPU\u2019s mode of operation.</p><p>If one or more exception flags are set in the FPU status word prior to loading a new FPU control word and the new control word unmasks one or more of those exceptions, a floating-point exception will be generated upon execution of the next floating-point instruction (except for the no-wait floating-point instructions, see the section titled \u201cSoftware Exception Handling\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). To avoid raising exceptions when changing FPU operating modes, clear any pending exceptions (using the FCLEX or FNCLEX instruction) before loading the new control word.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Loads the 16-bit source operand into the FPU control word. The source operand is a memory location. This instruction is typically used to establish or change the FPU\u2019s mode of operation."
             };

         case "FLDENV":
             return {
                 "url": "http://www.felixcloutier.com/x86/FLDENV.html",
                 "html": "<p>Loads the complete x87 FPU operating environment from memory into the FPU registers. The source operand specifies the first byte of the operating-environment data in memory. This data is typically written to the specified memory location by a FSTENV or FNSTENV instruction.</p><p>The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, show the layout in memory of the loaded environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used.</p><p>The FLDENV instruction should be executed in the same operating mode as the corresponding FSTENV/FNSTENV instruction.</p><p>If one or more unmasked exception flags are set in the new FPU status word, a floating-point exception will be generated upon execution of the next floating-point instruction (except for the no-wait floating-point instructions, see the section titled \u201cSoftware Exception Handling\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). To avoid generating exceptions when loading a new environment, clear all the exception flags in the FPU status word that is being loaded.</p><p>If a page or limit fault occurs during the execution of this instruction, the state of the x87 FPU registers as seen by the fault handler may be different than the state being loaded from memory. In such situations, the fault handler should ignore the status of the x87 FPU registers, handle the fault, and return. The FLDENV instruction will then complete the loading of the x87 FPU registers with no resulting context inconsistency.</p>",
                 "tooltip": "Loads the complete x87 FPU operating environment from memory into the FPU registers. The source operand specifies the first byte of the operating-environment data in memory. This data is typically written to the specified memory location by a FSTENV or FNSTENV instruction."
             };

         case "FLDL2T":
         case "FLDZ":
         case "FLD1":
         case "FLDLN2":
         case "FLDPI":
         case "FLDL2E":
         case "FLDLG2":
             return {
                 "url": "http://www.felixcloutier.com/x86/FLDZ.html",
                 "html": "<p>Push one of seven commonly used constants (in double extended-precision floating-point format) onto the FPU register stack. The constants that can be loaded with these instructions include +1.0, +0.0, log<sub>2</sub>10, log<sub>2</sub>e, \u03c0, log<sub>10</sub>2, and log<sub>e</sub>2. For each constant, an internal 66-bit constant is rounded (as specified by the RC field in the FPU control word) to double extended-precision floating-point format. The inexact-result exception (#P) is not generated as a result of the rounding, nor is the C1 flag set in the x87 FPU status word if the value is rounded up.</p><p>See the section titled \u201cPi\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of the \u03c0 constant.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Push one of seven commonly used constants (in double extended-precision floating-point format) onto the FPU register stack. The constants that can be loaded with these instructions include +1.0, +0.0, log2(10), log2(e), \u03c0, log10(2), and loge(2). For each constant, an internal 66-bit constant is rounded (as specified by the RC field in the FPU control word) to double extended-precision floating-point format. The inexact-result exception (#P) is not generated as a result of the rounding, nor is the C1 flag set in the x87 FPU status word if the value is rounded up."
             };

         case "FNCLEX":
         case "FCLEX":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNCLEX.html",
                 "html": "<p>Clears the floating-point exception flags (PE, UE, OE, ZE, DE, and IE), the exception summary status flag (ES), the stack fault flag (SF), and the busy flag (B) in the FPU status word. The FCLEX instruction checks for and handles any pending unmasked floating-point exceptions before clearing the exception flags; the FNCLEX instruction does not.</p><p>The assembler issues two instructions for the FCLEX instruction (an FWAIT instruction followed by an FNCLEX instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the saved EIP points to the instruction that caused the exception.</p>",
                 "tooltip": "Clears the floating-point exception flags (PE, UE, OE, ZE, DE, and IE), the exception summary status flag (ES), the stack fault flag (SF), and the busy flag (B) in the FPU status word. The FCLEX instruction checks for and handles any pending unmasked floating-point exceptions before clearing the exception flags; the FNCLEX instruction does not."
             };

         case "FNINIT":
         case "FINIT":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNINIT.html",
                 "html": "<p>Sets the FPU control, status, tag, instruction pointer, and data pointer registers to their default states. The FPU control word is set to 037FH (round to nearest, all exceptions masked, 64-bit precision). The status word is cleared (no exception flags set, TOP is set to 0). The data registers in the register stack are left unchanged, but they are all tagged as empty (11B). Both the instruction and data pointers are cleared.</p><p>The FINIT instruction checks for and handles any pending unmasked floating-point exceptions before performing the initialization; the FNINIT instruction does not.</p><p>The assembler issues two instructions for the FINIT instruction (an FWAIT instruction followed by an FNINIT instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the saved EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Sets the FPU control, status, tag, instruction pointer, and data pointer registers to their default states. The FPU control word is set to 037FH (round to nearest, all exceptions masked, 64-bit precision). The status word is cleared (no exception flags set, TOP is set to 0). The data registers in the register stack are left unchanged, but they are all tagged as empty (11B). Both the instruction and data pointers are cleared."
             };

         case "FNOP":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNOP.html",
                 "html": "<p>Performs no FPU operation. This instruction takes up space in the instruction stream but does not affect the FPU or machine context, except the EIP register.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Performs no FPU operation. This instruction takes up space in the instruction stream but does not affect the FPU or machine context, except the EIP register."
             };

         case "FSAVE":
         case "FNSAVE":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNSAVE.html",
                 "html": "<p>Stores the current FPU state (operating environment and register stack) at the specified destination in memory, and then re-initializes the FPU. The FSAVE instruction checks for and handles pending unmasked floating-point exceptions before storing the FPU state; the FNSAVE instruction does not.</p><p>The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used. The contents of the FPU register stack are stored in the 80 bytes that immediately follow the operating environment image.</p><p>The saved image reflects the state of the FPU after all floating-point instructions preceding the FSAVE/FNSAVE instruction in the instruction stream have been executed.</p><p>After the FPU state has been saved, the FPU is reset to the same default values it is set to with the FINIT/FNINIT instructions (see \u201cFINIT/FNINIT\u2014Initialize Floating-Point Unit\u201d in this chapter).</p><p>The FSAVE/FNSAVE instructions are typically used when the operating system needs to perform a context switch, an exception handler needs to use the FPU, or an application program needs to pass a \u201cclean\u201d FPU to a procedure.</p>",
                 "tooltip": "Stores the current FPU state (operating environment and register stack) at the specified destination in memory, and then re-initializes the FPU. The FSAVE instruction checks for and handles pending unmasked floating-point exceptions before storing the FPU state; the FNSAVE instruction does not."
             };

         case "FSTCW":
         case "FNSTCW":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNSTCW.html",
                 "html": "<p>Stores the current value of the FPU control word at the specified destination in memory. The FSTCW instruction checks for and handles pending unmasked floating-point exceptions before storing the control word; the FNSTCW instruction does not.</p><p>The assembler issues two instructions for the FSTCW instruction (an FWAIT instruction followed by an FNSTCW instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the saved EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Stores the current value of the FPU control word at the specified destination in memory. The FSTCW instruction checks for and handles pending unmasked floating-point exceptions before storing the control word; the FNSTCW instruction does not."
             };

         case "FSTENV":
         case "FNSTENV":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNSTENV.html",
                 "html": "<p>Saves the current FPU operating environment at the memory location specified with the destination operand, and then masks all floating-point exceptions. The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used.</p><p>The FSTENV instruction checks for and handles any pending unmasked floating-point exceptions before storing the FPU environment; the FNSTENV instruction does not. The saved image reflects the state of the FPU after all floating-point instructions preceding the FSTENV/FNSTENV instruction in the instruction stream have been executed.</p><p>These instructions are often used by exception handlers because they provide access to the FPU instruction and data pointers. The environment is typically saved in the stack. Masking all exceptions after saving the environment prevents floating-point exceptions from interrupting the exception handler.</p><p>The assembler issues two instructions for the FSTENV instruction (an FWAIT instruction followed by an FNSTENV instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the saved EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Saves the current FPU operating environment at the memory location specified with the destination operand, and then masks all floating-point exceptions. The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used."
             };

         case "FSTSW":
         case "FNSTSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/FNSTSW.html",
                 "html": "<p>Stores the current value of the x87 FPU status word in the destination location. The destination operand can be either a two-byte memory location or the AX register. The FSTSW instruction checks for and handles pending unmasked floating-point exceptions before storing the status word; the FNSTSW instruction does not.</p><p>The FNSTSW AX form of the instruction is used primarily in conditional branching (for instance, after an FPU comparison instruction or an FPREM, FPREM1, or FXAM instruction), where the direction of the branch depends on the state of the FPU condition code flags. (See the section titled \u201cBranching and Conditional Moves on FPU Condition Codes\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.) This instruction can also be used to invoke exception handlers (by examining the exception flags) in environments that do not use interrupts. When the FNSTSW AX instruction is executed, the AX register is updated before the processor executes any further instructions. The status stored in the AX register is thus guaranteed to be from the completion of the prior FPU instruction.</p><p>The assembler issues two instructions for the FSTSW instruction (an FWAIT instruction followed by an FNSTSW instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the saved EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Stores the current value of the x87 FPU status word in the destination location. The destination operand can be either a two-byte memory location or the AX register. The FSTSW instruction checks for and handles pending unmasked floating-point exceptions before storing the status word; the FNSTSW instruction does not."
             };

         case "FPATAN":
             return {
                 "url": "http://www.felixcloutier.com/x86/FPATAN.html",
                 "html": "<p>Computes the arctangent of the source operand in register ST(1) divided by the source operand in register ST(0), stores the result in ST(1), and pops the FPU register stack. The result in register ST(0) has the same sign as the source operand ST(1) and a magnitude less than +\u03c0.</p><p>The FPATAN instruction returns the angle between the X axis and the line from the origin to the point (X,Y), where Y (the ordinate) is ST(1) and X (the abscissa) is ST(0). The angle depends on the sign of X and Y independently, not just on the sign of the ratio Y/X. This is because a point (\u2212X,Y) is in the second quadrant, resulting in an angle between \u03c0/2 and \u03c0, while a point (X,\u2212Y) is in the fourth quadrant, resulting in an angle between 0 and \u2212\u03c0/2. A point (\u2212X,\u2212Y) is in the third quadrant, giving an angle between \u2212\u03c0/2 and \u2212\u03c0.</p><p>The following table shows the results obtained when computing the arctangent of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-40.  
FPATAN Results</h3><table>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<th colspan=\"2\">ST(0)</th>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 3\u03c0/4*</td>\n<td colspan=\"2\">\u2212 \u03c0/2</td>\n<td colspan=\"2\">\u2212 \u03c0/2</td>\n<td colspan=\"2\">\u2212 \u03c0/2</td>\n<td colspan=\"2\">\u2212 \u03c0/2</td>\n<td colspan=\"2\">\u2212 \u03c0/4*</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<th colspan=\"2\">ST(1)</th>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212\u03c0</td>\n<td colspan=\"2\">\u2212\u03c0 to \u2212\u03c0/2</td>\n<td colspan=\"2\">\u2212\u03c0/2</td>\n<td colspan=\"2\">\u2212\u03c0/2</td>\n<td colspan=\"2\">\u2212\u03c0/2 to \u22120</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212\u03c0</td>\n<td colspan=\"2\">\u2212\u03c0</td>\n<td colspan=\"2\">\u2212\u03c0*</td>\n<td colspan=\"2\">\u2212 0*</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+\u03c0</td>\n<td colspan=\"2\">+\u03c0</td>\n<td colspan=\"2\">+\u03c0*</td>\n<td colspan=\"2\">+ 0*</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+\u03c0</td>\n<td colspan=\"2\">+\u03c0 to +\u03c0/2</td>\n<td colspan=\"2\">+ \u03c0/2</td>\n<td colspan=\"2\">+\u03c0/2</td>\n<td 
colspan=\"2\">+\u03c0/2 to +0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+3\u03c0/4*</td>\n<td colspan=\"2\">+\u03c0/2</td>\n<td colspan=\"2\">+\u03c0/2</td>\n<td colspan=\"2\">+\u03c0/2</td>\n<td colspan=\"2\">+ \u03c0/2</td>\n<td colspan=\"2\">+ \u03c0/4*</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td></tr></table>",
                 "tooltip": "Computes the arctangent of the source operand in register ST(1) divided by the source operand in register ST(0), stores the result in ST(1), and pops the FPU register stack. The result in register ST(0) has the same sign as the source operand ST(1) and a magnitude less than +\u03c0."
             };

         case "FPREM":
             return {
                 "url": "http://www.felixcloutier.com/x86/FPREM.html",
                 "html": "<p>Computes the remainder obtained from dividing the value in the ST(0) register (the dividend) by the value in the ST(1) register (the divisor or <strong>modulus</strong>), and stores the result in ST(0). The remainder represents the following value:</p><p>Remainder \u2190 ST(0) \u2212 (Q \u2217 ST(1))</p><p>Here, Q is an integer value that is obtained by truncating the floating-point number quotient of [ST(0) / ST(1)] toward zero. The sign of the remainder is the same as the sign of the dividend. The magnitude of the remainder is less than that of the modulus, unless a partial remainder was computed (as described below).</p><p>This instruction produces an exact result; the inexact-result exception does not occur and the rounding control has no effect. The following table shows the results obtained when computing the remainder of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-41.  FPREM Results</h3>",
                 "tooltip": "Computes the remainder obtained from dividing the value in the ST(0) register (the dividend) by the value in the ST(1) register (the divisor or modulus), and stores the result in ST(0). The remainder represents the following value"
             };

         case "FPREM1":
             return {
                 "url": "http://www.felixcloutier.com/x86/FPREM1.html",
                 "html": "<p>Computes the IEEE remainder obtained from dividing the value in the ST(0) register (the dividend) by the value in the ST(1) register (the divisor or <strong>modulus</strong>), and stores the result in ST(0). The remainder represents the following value:</p><p>Remainder \u2190 ST(0) \u2212 (Q \u2217 ST(1))</p><p>Here, Q is an integer value that is obtained by rounding the floating-point number quotient of [ST(0) / ST(1)] toward the nearest integer value. The magnitude of the remainder is less than or equal to half the magnitude of the modulus, unless a partial remainder was computed (as described below).</p><p>This instruction produces an exact result; the precision (inexact) exception does not occur and the rounding control has no effect. The following table shows the results obtained when computing the remainder of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-42.  FPREM1 Results</h3>",
                 "tooltip": "Computes the IEEE remainder obtained from dividing the value in the ST(0) register (the dividend) by the value in the ST(1) register (the divisor or modulus), and stores the result in ST(0). The remainder represents the following value"
             };

         case "FPTAN":
             return {
                 "url": "http://www.felixcloutier.com/x86/FPTAN.html",
                 "html": "<p>Computes the tangent of the source operand in register ST(0), stores the result in ST(0), and pushes a 1.0 onto the FPU register stack. The source operand must be given in radians and must be less than \u00b12<sup>63</sup>. The following table shows the unmasked results obtained when computing the partial tangent of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-43.  FPTAN Results</h3><table>\n<tr>\n<th>ST(0) SRC</th>\n<th>ST(0) DEST</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>\u2212 F to + F</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>- 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ F</td>\n<td>\u2212 F to + F</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Computes the tangent of the source operand in register ST(0), stores the result in ST(0), and pushes a 1.0 onto the FPU register stack. The source operand must be given in radians and must be less than \u00b1263. The following table shows the unmasked results obtained when computing the partial tangent of various classes of numbers, assuming that underflow does not occur."
             };

         case "FRNDINT":
             return {
                 "url": "http://www.felixcloutier.com/x86/FRNDINT.html",
                 "html": "<p>Rounds the source value in the ST(0) register to the nearest integral value, depending on the current rounding mode (setting of the RC field of the FPU control word), and stores the result in ST(0).</p><p>If the source value is \u221e, the value is not changed. If the source value is not an integral value, the floating-point inexact-result exception (#P) is generated.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Rounds the source value in the ST(0) register to the nearest integral value, depending on the current rounding mode (setting of the RC field of the FPU control word), and stores the result in ST(0)."
             };

         case "FRSTOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/FRSTOR.html",
                 "html": "<p>Loads the FPU state (operating environment and register stack) from the memory area specified with the source operand. This state data is typically written to the specified memory location by a previous FSAVE/FNSAVE instruc-tion.</p><p>The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used. The contents of the FPU register stack are stored in the 80 bytes immediately following the operating environment image.</p><p>The FRSTOR instruction should be executed in the same operating mode as the corresponding FSAVE/FNSAVE instruction.</p><p>If one or more unmasked exception bits are set in the new FPU status word, a floating-point exception will be generated. To avoid raising exceptions when loading a new operating environment, clear all the exception flags in the FPU status word that is being loaded.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Loads the FPU state (operating environment and register stack) from the memory area specified with the source operand. This state data is typically written to the specified memory location by a previous FSAVE/FNSAVE instruc-tion."
             };

         case "FSCALE":
             return {
                 "url": "http://www.felixcloutier.com/x86/FSCALE.html",
                 "html": "<p>Truncates the value in the source operand (toward 0) to an integral value and adds that value to the exponent of the destination operand. The destination and source operands are floating-point values located in registers ST(0) and ST(1), respectively. This instruction provides rapid multiplication or division by integral powers of 2. The following table shows the results obtained when scaling various classes of numbers, assuming that neither overflow nor underflow occurs.</p><h3>Table 3-44.  FSCALE Results</h3><table>\n<tr>\n<th colspan=\"2\">ST(1)</th>\n<th colspan=\"2\">ST(1)</th>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<th colspan=\"2\">ST(0)</th>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">NaN</td>\n<td 
colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Truncates the value in the source operand (toward 0) to an integral value and adds that value to the exponent of the destination operand. The destination and source operands are floating-point values located in registers ST(0) and ST(1), respectively. This instruction provides rapid multiplication or division by integral powers of 2. The following table shows the results obtained when scaling various classes of numbers, assuming that neither overflow nor underflow occurs."
             };

         case "FSIN":
             return {
                 "url": "http://www.felixcloutier.com/x86/FSIN.html",
                 "html": "<p>Computes the sine of the source operand in register ST(0) and stores the result in ST(0). The source operand must be given in radians and must be within the range \u22122<sup>63</sup> to +2<sup>63</sup>. The following table shows the results obtained when taking the sine of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-45.  FSIN Results</h3><table>\n<tr>\n<th>SRC (ST(0))</th>\n<th>DEST (ST(0))</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>\u2212 1 to + 1</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>\u22120</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ F</td>\n<td>\u2212 1 to +1</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p><p>F Means finite floating-point value.</p>",
                 "tooltip": "Computes the sine of the source operand in register ST(0) and stores the result in ST(0). The source operand must be given in radians and must be within the range \u2212263 to +263. The following table shows the results obtained when taking the sine of various classes of numbers, assuming that underflow does not occur."
             };

         case "FSINCOS":
             return {
                 "url": "http://www.felixcloutier.com/x86/FSINCOS.html",
                 "html": "<p>Computes both the sine and the cosine of the source operand in register ST(0), stores the sine in ST(0), and pushes the cosine onto the top of the FPU register stack. (This instruction is faster than executing the FSIN and FCOS instructions in succession.)</p><p>The source operand must be given in radians and must be within the range \u22122<sup>63</sup> to +2<sup>63</sup>. The following table shows the results obtained when taking the sine and cosine of various classes of numbers, assuming that underflow does not occur.</p><h3>Table 3-46.  FSINCOS Results</h3><p><strong>SRC</strong></p><p><strong>DEST</strong></p>",
                 "tooltip": "Computes both the sine and the cosine of the source operand in register ST(0), stores the sine in ST(0), and pushes the cosine onto the top of the FPU register stack. (This instruction is faster than executing the FSIN and FCOS instructions in succession.)"
             };

         case "FSQRT":
             return {
                 "url": "http://www.felixcloutier.com/x86/FSQRT.html",
                 "html": "<p>Computes the square root of the source value in the ST(0) register and stores the result in ST(0).</p><p>The following table shows the results obtained when taking the square root of various classes of numbers, assuming that neither overflow nor underflow occurs.</p><h3>Table 3-47.  FSQRT Results</h3><table>\n<tr>\n<th>SRC (ST(0))</th>\n<th>DEST (ST(0))</th></tr>\n<tr>\n<td>\u2212 \u221e</td>\n<td>*</td></tr>\n<tr>\n<td>\u2212 F</td>\n<td>*</td></tr>\n<tr>\n<td>\u2212 0</td>\n<td>\u2212 0</td></tr>\n<tr>\n<td>+ 0</td>\n<td>+ 0</td></tr>\n<tr>\n<td>+ F</td>\n<td>+ F</td></tr>\n<tr>\n<td>+ \u221e</td>\n<td>+ \u221e</td></tr>\n<tr>\n<td>NaN</td>\n<td>NaN</td></tr></table><p><strong>NOTES:</strong></p>",
                 "tooltip": "Computes the square root of the source value in the ST(0) register and stores the result in ST(0)."
             };

         case "FSTP":
         case "FST":
             return {
                 "url": "http://www.felixcloutier.com/x86/FSTP.html",
                 "html": "<p>The FST instruction copies the value in the ST(0) register to the destination operand, which can be a memory loca-tion or another register in the FPU register stack. When storing the value in memory, the value is converted to single-precision or double-precision floating-point format.</p><p>The FSTP instruction performs the same operation as the FST instruction and then pops the register stack. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The FSTP instruction can also store values in memory in double extended-precision floating-point format.</p><p>If the destination operand is a memory location, the operand specifies the address where the first byte of the desti-nation value is to be stored. If the destination operand is a register, the operand specifies a register in the register stack relative to the top of the stack.</p><p>If the destination size is single-precision or double-precision, the significand of the value being stored is rounded to the width of the destination (according to the rounding mode specified by the RC field of the FPU control word), and the exponent is converted to the width and bias of the destination format. If the value being stored is too large for the destination format, a numeric overflow exception (#O) is generated and, if the exception is unmasked, no value is stored in the destination operand. If the value being stored is a denormal value, the denormal exception (#D) is not generated. This condition is simply signaled as a numeric underflow exception (#U) condition.</p><p>If the value being stored is \u00b10, \u00b1\u221e, or a NaN, the least-significant bits of the significand and the exponent are trun-cated to fit the destination format. This operation preserves the value\u2019s identity as a 0, \u221e, or NaN.</p>",
                 "tooltip": "The FST instruction copies the value in the ST(0) register to the destination operand, which can be a memory loca-tion or another register in the FPU register stack. When storing the value in memory, the value is converted to single-precision or double-precision floating-point format."
             };

         case "FTST":
             return {
                 "url": "http://www.felixcloutier.com/x86/FTST.html",
                 "html": "<p>Compares the value in the ST(0) register with 0.0 and sets the condition code flags C0, C2, and C3 in the FPU status word according to the results (see table below).</p><h3>Table 3-50.  FTST Results</h3><table>\n<tr>\n<th>Condition</th>\n<th>C3</th>\n<th>C2</th>\n<th>C0</th></tr>\n<tr>\n<td>ST(0) &gt; 0.0</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>ST(0) &lt; 0.0</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>ST(0) = 0.0</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Unordered</td>\n<td>1</td>\n<td>1</td>\n<td>1</td></tr></table><p>This instruction performs an \u201cunordered comparison.\u201d An unordered comparison also checks the class of the numbers being compared (see \u201cFXAM\u2014Examine ModR/M\u201d in this chapter). If the value in register ST(0) is a NaN or is in an undefined format, the condition flags are set to \u201cunordered\u201d and the invalid operation exception is gener-ated.</p><p>The sign of zero is ignored, so that (\u2013 0.0 \u2190 +0.0).</p>",
                 "tooltip": "Compares the value in the ST(0) register with 0.0 and sets the condition code flags C0, C2, and C3 in the FPU status word according to the results (see table below)."
             };

         case "FCOMIP":
         case "FUCOMI":
         case "FUCOMIP":
         case "FCOMI":
             return {
                 "url": "http://www.felixcloutier.com/x86/FUCOMIP.html",
                 "html": "<p>Performs an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and CF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for compari-sons, so that \u20130.0 is equal to +0.0.</p><h3>Table 3-32.  FCOMI/FCOMIP/ FUCOMI/FUCOMIP Results</h3><table>\n<tr>\n<th>Comparison Results*</th>\n<th>ZF</th>\n<th>PF</th>\n<th>CF</th></tr>\n<tr>\n<td>ST0 &gt; ST(i)</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>ST0 &lt; ST(i)</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>ST0 = ST(i)</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Unordered**</td>\n<td>1</td>\n<td>1</td>\n<td>1</td></tr></table><p><strong>NOTES:</strong></p><p>*</p>",
                 "tooltip": "Performs an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and CF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for compari-sons, so that \u20130.0 is equal to +0.0."
             };

         case "FUCOMP":
         case "FUCOMPP":
         case "FUCOM":
             return {
                 "url": "http://www.felixcloutier.com/x86/FUCOMPP.html",
                 "html": "<p>Performs an unordered comparison of the contents of register ST(0) and ST(i) and sets condition code flags C0, C2, and C3 in the FPU status word according to the results (see the table below). If no operand is specified, the contents of registers ST(0) and ST(1) are compared. The sign of zero is ignored, so that \u20130.0 is equal to +0.0.</p><h3>Table 3-51.  FUCOM/FUCOMP/FUCOMPP Results</h3><table>\n<tr>\n<th>Comparison Results*</th>\n<th>C3</th>\n<th>C2</th>\n<th>C0</th></tr>\n<tr>\n<td>ST0 &gt; ST(i)</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>ST0 &lt; ST(i)</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>ST0 = ST(i)</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Unordered</td>\n<td>1</td>\n<td>1</td>\n<td>1</td></tr></table><p><strong>NOTES:</strong></p><p>*</p>",
                 "tooltip": "Performs an unordered comparison of the contents of register ST(0) and ST(i) and sets condition code flags C0, C2, and C3 in the FPU status word according to the results (see the table below). If no operand is specified, the contents of registers ST(0) and ST(1) are compared. The sign of zero is ignored, so that \u20130.0 is equal to +0.0."
             };

         case "FWAIT":
         case "WAIT":
             return {
                 "url": "http://www.felixcloutier.com/x86/FWAIT.html",
                 "html": "<p>Causes the processor to check for and handle pending, unmasked, floating-point exceptions before proceeding. (FWAIT is an alternate mnemonic for WAIT.)</p><p>This instruction is useful for synchronizing exceptions in critical sections of code. Coding a WAIT instruction after a floating-point instruction ensures that any unmasked floating-point exceptions the instruction may raise are handled before the processor can modify the instruction\u2019s results. See the section titled \u201cFloating-Point Exception Synchronization\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on using the WAIT/FWAIT instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Causes the processor to check for and handle pending, unmasked, floating-point exceptions before proceeding. (FWAIT is an alternate mnemonic for WAIT.)"
             };

         case "FXAM":
             return {
                 "url": "http://www.felixcloutier.com/x86/FXAM.html",
                 "html": "<p>Examines the contents of the ST(0) register and sets the condition code flags C0, C2, and C3 in the FPU status word to indicate the class of value or number in the register (see the table below).</p><h3>Table 3-52.  FXAM Results</h3><p>.</p><table>\n<tr>\n<th>Class</th>\n<th>C3</th>\n<th>C2</th>\n<th>C0</th></tr>\n<tr>\n<td>Unsupported</td>\n<td>0</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>NaN</td>\n<td>0</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>Normal finite number</td>\n<td>0</td>\n<td>1</td>\n<td>0</td></tr>\n<tr>\n<td>Infinity</td>\n<td>0</td>\n<td>1</td>\n<td>1</td></tr>\n<tr>\n<td>Zero</td>\n<td>1</td>\n<td>0</td>\n<td>0</td></tr>\n<tr>\n<td>Empty</td>\n<td>1</td>\n<td>0</td>\n<td>1</td></tr>\n<tr>\n<td>Denormal number</td>\n<td>1</td>\n<td>1</td>\n<td>0</td></tr></table><p>The C1 flag is set to the sign of the value in ST(0), regardless of whether the register is empty or full.</p>",
                 "tooltip": "Examines the contents of the ST(0) register and sets the condition code flags C0, C2, and C3 in the FPU status word to indicate the class of value or number in the register (see the table below)."
             };

         case "FXCH":
             return {
                 "url": "http://www.felixcloutier.com/x86/FXCH.html",
                 "html": "<p>Exchanges the contents of registers ST(0) and ST(i). If no source operand is specified, the contents of ST(0) and ST(1) are exchanged.</p><p>This instruction provides a simple means of moving values in the FPU register stack to the top of the stack [ST(0)], so that they can be operated on by those floating-point instructions that can only operate on values in ST(0). For example, the following instruction sequence takes the square root of the third register from the top of the register stack:</p><p>FXCH ST(3);</p><p>FSQRT;</p><p>FXCH ST(3);</p>",
                 "tooltip": "Exchanges the contents of registers ST(0) and ST(i). If no source operand is specified, the contents of ST(0) and ST(1) are exchanged."
             };

         case "FXRSTOR":
         case "FXRSTOR64":
             return {
                 "url": "http://www.felixcloutier.com/x86/FXRSTOR.html",
                 "html": "<p>Reloads the x87 FPU, MMX technology, XMM, and MXCSR registers from the 512-byte memory image specified in the source operand. This data should have been written to memory previously using the FXSAVE instruction, and in the same format as required by the operating modes. The first byte of the data should be located on a 16-byte boundary. There are three distinct layouts of the FXSAVE state map: one for legacy and compatibility mode, a second format for 64-bit mode FXSAVE/FXRSTOR with REX.W=0, and the third format is for 64-bit mode with FXSAVE64/FXRSTOR64. Table 3-53 shows the layout of the legacy/compatibility mode state information in memory and describes the fields in the memory image for the FXRSTOR and FXSAVE instructions. Table 3-56 shows the layout of the 64-bit mode state information when REX.W is set (FXSAVE64/FXRSTOR64). Table 3-57 shows the layout of the 64-bit mode state information when REX.W is clear (FXSAVE/FXRSTOR).</p><p>The state image referenced with an FXRSTOR instruction must have been saved using an FXSAVE instruction or be in the same format as required by Table 3-53, Table 3-56, or Table 3-57. Referencing a state image saved with an FSAVE, FNSAVE instruction or incompatible field layout will result in an incorrect state restoration.</p><p>The FXRSTOR instruction does not flush pending x87 FPU exceptions. To check and raise exceptions when loading x87 FPU state information with the FXRSTOR instruction, use an FWAIT instruction after the FXRSTOR instruction.</p><p>If the OSFXSR bit in control register CR4 is not set, the FXRSTOR instruction may not restore the states of the XMM and MXCSR registers. This behavior is implementation dependent.</p><p>If the MXCSR state contains an unmasked exception with a corresponding status flag also set, loading the register with the FXRSTOR instruction will not result in a SIMD floating-point error condition being generated. 
Only the next occurrence of this unmasked exception will result in the exception being generated.</p>",
                 "tooltip": "Reloads the x87 FPU, MMX technology, XMM, and MXCSR registers from the 512-byte memory image specified in the source operand. This data should have been written to memory previously using the FXSAVE instruction, and in the same format as required by the operating modes. The first byte of the data should be located on a 16-byte boundary. There are three distinct layouts of the FXSAVE state map: one for legacy and compatibility mode, a second format for 64-bit mode FXSAVE/FXRSTOR with REX.W=0, and the third format is for 64-bit mode with FXSAVE64/FXRSTOR64. Table 3-53 shows the layout of the legacy/compatibility mode state information in memory and describes the fields in the memory image for the FXRSTOR and FXSAVE instructions. Table 3-56 shows the layout of the 64-bit mode state information when REX.W is set (FXSAVE64/FXRSTOR64). Table 3-57 shows the layout of the 64-bit mode state information when REX.W is clear (FXSAVE/FXRSTOR)."
             };

         case "FXSAVE64":
         case "FXSAVE":
             return {
                 "url": "http://www.felixcloutier.com/x86/FXSAVE.html",
                 "html": "<p>Saves the current state of the x87 FPU, MMX technology, XMM, and MXCSR registers to a 512-byte memory location specified in the destination operand. The content layout of the 512-byte region depends on whether the processor is operating in non-64-bit operating modes or 64-bit sub-mode of IA-32e mode.</p><p>Bytes 464:511 are available for software use. The processor does not write to bytes 464:511 of an FXSAVE area.</p><p>The operation of FXSAVE in non-64-bit modes is described first.</p>",
                 "tooltip": "Saves the current state of the x87 FPU, MMX technology, XMM, and MXCSR registers to a 512-byte memory location specified in the destination operand. The content layout of the 512-byte region depends on whether the processor is operating in non-64-bit operating modes or 64-bit sub-mode of IA-32e mode."
             };

         case "FXTRACT":
             return {
                 "url": "http://www.felixcloutier.com/x86/FXTRACT.html",
                 "html": "<p>Separates the source value in the ST(0) register into its exponent and significand, stores the exponent in ST(0), and pushes the significand onto the register stack. Following this operation, the new top-of-stack register ST(0) contains the value of the original significand expressed as a floating-point value. The sign and significand of this value are the same as those found in the source operand, and the exponent is 3FFFH (biased value for a true exponent of zero). The ST(1) register contains the value of the original operand\u2019s true (unbiased) exponent expressed as a floating-point value. (The operation performed by this instruction is a superset of the IEEE-recommended logb(<em>x</em>) function.)</p><p>This instruction and the F2XM1 instruction are useful for performing power and range scaling operations. The FXTRACT instruction is also useful for converting numbers in double extended-precision floating-point format to decimal representations (e.g., for printing or displaying).</p><p>If the floating-point zero-divide exception (#Z) is masked and the source operand is zero, an exponent value of \u2013 \u221e is stored in register ST(1) and 0 with the sign of the source operand is stored in register ST(0).</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Separates the source value in the ST(0) register into its exponent and significand, stores the exponent in ST(0), and pushes the significand onto the register stack. Following this operation, the new top-of-stack register ST(0) contains the value of the original significand expressed as a floating-point value. The sign and significand of this value are the same as those found in the source operand, and the exponent is 3FFFH (biased value for a true exponent of zero). The ST(1) register contains the value of the original operand\u2019s true (unbiased) exponent expressed as a floating-point value. (The operation performed by this instruction is a superset of the IEEE-recommended logb(x) function.)"
             };

         case "FYL2X":
             return {
                 "url": "http://www.felixcloutier.com/x86/FYL2X.html",
                 "html": "<p>Computes (ST(1) \u2217 log<sub>2</sub> (ST(0))), stores the result in register ST(1), and pops the FPU register stack. The source operand in ST(0) must be a non-zero positive number.</p><p>The following table shows the results obtained when taking the log of various classes of numbers, assuming that neither overflow nor underflow occurs.</p><h3>Table 3-58.  FYL2X Results</h3><table>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<th colspan=\"2\">ST(0)</th>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u00b10</td>\n<td colspan=\"2\">+0&lt;+F&lt;+1</td>\n<td colspan=\"2\">+ 1</td>\n<td colspan=\"2\">+ F &gt; + 1</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<th colspan=\"2\">ST(1)</th>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">**</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">\u2212 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">**</td>\n<td colspan=\"2\">\u2212 F</td>\n<td colspan=\"2\">+ 0</td>\n<td colspan=\"2\">+ F</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">\u2212 \u221e</td>\n<td colspan=\"2\">*</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">+ \u221e</td>\n<td colspan=\"2\">NaN</td></tr>\n<tr>\n<td colspan=\"2\"></td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td>\n<td colspan=\"2\">NaN</td></tr></table><p><strong>NOTES:</strong></p>",
                 "tooltip": "Computes (ST(1) \u2217 log2 (ST(0))), stores the result in register ST(1), and pops the FPU register stack. The source operand in ST(0) must be a non-zero positive number."
             };

         case "FYL2XP1":
             return {
                 "url": "http://www.felixcloutier.com/x86/FYL2XP1.html",
                 "html": "<p>Computes (ST(1) \u2217 log<sub>2</sub>(ST(0) + 1.0)), stores the result in register ST(1), and pops the FPU register stack. The source operand in ST(0) must be in the range:</p><math>\n<mo>-</mo>\n<mo stretchy=\"true\">(</mo>\n<mn>1</mn>\n<mo>-</mo>\n<mfrac>\n<mrow><msqrt><mn>2</mn></msqrt></mrow>\n<mrow><mn>2</mn></mrow>\n</mfrac>\n<mo stretchy=\"true\">)</mo>\n<mtext>\u00a0to\u00a0</mtext>\n<mo stretchy=\"true\">(</mo>\n<mn>1</mn>\n<mo>-</mo>\n<mfrac>\n<mrow><msqrt><mn>2</mn></msqrt></mrow></mfrac></math>",
                 "tooltip": "Computes (ST(1) \u2217 log2(ST(0) + 1.0)), stores the result in register ST(1), and pops the FPU register stack. The source operand in ST(0) must be in the range"
             };

         case "HADDPD":
         case "VHADDPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/HADDPD.html",
                 "html": "<p>Adds the double-precision floating-point values in the high and low quadwords of the destination operand and stores the result in the low quadword of the destination operand.</p><p>Adds the double-precision floating-point values in the high and low quadwords of the source operand and stores the result in the high quadword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>See Figure 3-15 for HADDPD; see Figure 3-16 for VHADDPD.</p><svg height=\"313.83\" viewbox=\"132.900000 286366.740000 338.280000 209.220000\" width=\"507.42\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"104.3636447\" x=\"240.5523\" y=\"286384.013472\">HADDPD xmm1, xmm2/m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"21.3930146\" x=\"430.6153\" y=\"286404.164672\">xmm2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"28.4035102\" x=\"212.2145\" y=\"286408.880372\">[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"19.6639686\" x=\"349.2125\" y=\"286408.880372\">[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"21.8409947\" x=\"430.6153\" y=\"286413.595872\">/m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"28.4035102\" x=\"212.2145\" y=\"286456.404672\">[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"19.6639686\" x=\"349.2125\" y=\"286456.404672\">[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"21.3930146\" x=\"430.6153\" y=\"286456.404672\">xmm1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"69.6726945\" x=\"191.5837\" 
y=\"286509.160172\">xmm2/m128[63:0] +</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"24.4581416\" x=\"430.6158\" y=\"286509.160272\">Result:</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"99.81311\" x=\"309.1397\" y=\"286513.875772\">xmm1[63:0] + xmm1[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"71.6375195\" x=\"190.6013\" y=\"286518.591372\">xmm2/m128[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"21.3930146\" x=\"430.6158\" y=\"286518.591372\">xmm1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"28.4035102\" x=\"212.2247\" y=\"286537.098572\">[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.859300pt\" textlength=\"19.6639686\" x=\"353.64337234\" y=\"286537.098572\">[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:5.894400pt\" textlength=\"25.8823104\" x=\"440.6362\" y=\"286567.4736\">OM15993</text>\n<rect height=\"181.256\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"331.566\" x=\"133.582\" y=\"286367.475\"></rect>\n<rect height=\"181.256\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"331.566\" x=\"133.582\" y=\"286367.475\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286392.895\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286392.895\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286440.419\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286440.42\"></rect>\n<rect 
height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286440.419\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286497.891\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286440.42\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286497.891\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286392.895\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286392.895\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"160.107\" y=\"286497.891\"></rect>\n<rect height=\"26.525\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"132.626\" x=\"292.733\" y=\"286497.891\"></rect></svg>",
                 "tooltip": "Adds the double-precision floating-point values in the high and low quadwords of the destination operand and stores the result in the low quadword of the destination operand."
             };

         case "VHADDPS":
         case "HADDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/HADDPS.html",
                 "html": "<p>Adds the single-precision floating-point values in the first and second dwords of the destination operand and stores the result in the first dword of the destination operand.</p><p>Adds single-precision floating-point values in the third and fourth dword of the destination operand and stores the result in the second dword of the destination operand.</p><p>Adds single-precision floating-point values in the first and second dword of the source operand and stores the result in the third dword of the destination operand.</p><p>Adds single-precision floating-point values in the third and fourth dword of the source operand and stores the result in the fourth dword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Adds the single-precision floating-point values in the first and second dwords of the destination operand and stores the result in the first dword of the destination operand."
             };

         case "HLT":
             return {
                 "url": "http://www.felixcloutier.com/x86/HLT.html",
                 "html": "<p>Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer (CS:EIP) points to the instruction following the HLT instruction.</p><p>When a HLT instruction is executed on an Intel 64 or IA-32 processor supporting Intel Hyper-Threading Technology, only the logical processor that executes the instruction is halted. The other logical processors in the physical processor remain active, unless they are each individually halted by executing a HLT instruction.</p><p>The HLT instruction is a privileged instruction. When the processor is running in protected or virtual-8086 mode, the privilege level of a program or procedure must be 0 to execute the HLT instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Stops instruction execution and places the processor in a HALT state. An enabled interrupt (including NMI and SMI), a debug exception, the BINIT# signal, the INIT# signal, or the RESET# signal will resume execution. If an interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer (CS:EIP) points to the instruction following the HLT instruction."
             };

         case "HSUBPD":
         case "VHSUBPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/HSUBPD.html",
                 "html": "<p>The HSUBPD instruction subtracts horizontally the packed DP FP numbers of both operands.</p><p>Subtracts the double-precision floating-point value in the high quadword of the destination operand from the low quadword of the destination operand and stores the result in the low quadword of the destination operand.</p><p>Subtracts the double-precision floating-point value in the high quadword of the source operand from the low quadword of the source operand and stores the result in the high quadword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>See Figure 3-19 for HSUBPD; see Figure 3-20 for VHSUBPD.</p>",
                 "tooltip": "The HSUBPD instruction subtracts horizontally the packed DP FP numbers of both operands."
             };

         case "VHSUBPS":
         case "HSUBPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/HSUBPS.html",
                 "html": "<p>Subtracts the single-precision floating-point value in the second dword of the destination operand from the first dword of the destination operand and stores the result in the first dword of the destination operand.</p><p>Subtracts the single-precision floating-point value in the fourth dword of the destination operand from the third dword of the destination operand and stores the result in the second dword of the destination operand.</p><p>Subtracts the single-precision floating-point value in the second dword of the source operand from the first dword of the source operand and stores the result in the third dword of the destination operand.</p><p>Subtracts the single-precision floating-point value in the fourth dword of the source operand from the third dword of the source operand and stores the result in the fourth dword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Subtracts the single-precision floating-point value in the second dword of the destination operand from the first dword of the destination operand and stores the result in the first dword of the destination operand."
             };

         case "IDIV":
             return {
                 "url": "http://www.felixcloutier.com/x86/IDIV.html",
                 "html": "<p>Divides the (signed) value in the AX, DX:AX, or EDX:EAX (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, or EDX:EAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor).</p><p>Non-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magnitude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. In 64-bit mode when REX.W is applied, the instruction divides the signed value in RDX:RAX by the source operand. RAX contains a 64-bit quotient; RDX contains a 64-bit remainder.</p><p>See the summary chart at the beginning of this section for encoding data and limits. See Table 3-60.</p><h3>Table 3-60.  IDIV Results</h3>",
                 "tooltip": "Divides the (signed) value in the AX, DX:AX, or EDX:EAX (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, or EDX:EAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor)."
             };

         case "IMUL":
             return {
                 "url": "http://www.felixcloutier.com/x86/IMUL.html",
                 "html": "<p>Performs a signed multiplication of two operands. This instruction has three forms, depending on the number of operands.</p><p>When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p><p>The CF and OF flags are set when the signed integer value of the intermediate product differs from the sign-extended operand-size-truncated product; otherwise the CF and OF flags are cleared.</p><p>The three forms of the IMUL instruction are similar in that the length of the product is calculated to twice the length of the operands. With the one-operand form, the product is stored exactly in the destination. With the two- and three-operand forms, however, the result is truncated to the length of the destination before it is stored in the destination register. Because of this truncation, the CF or OF flag should be tested to ensure that no significant bits are lost.</p><p>The two- and three-operand forms may also be used with unsigned operands because the lower half of the product is the same regardless of whether the operands are signed or unsigned. The CF and OF flags, however, cannot be used to determine if the upper half of the result is non-zero.</p>",
                 "tooltip": "Performs a signed multiplication of two operands. This instruction has three forms, depending on the number of operands."
             };

         case "IN":
             return {
                 "url": "http://www.felixcloutier.com/x86/IN.html",
                 "html": "<p>Copies the value from the I/O port specified with the second operand (source operand) to the destination operand (first operand). The source operand can be a byte-immediate or the DX register; the destination operand can be register AL, AX, or EAX, depending on the size of the port being accessed (8, 16, or 32 bits, respectively). Using the DX register as a source operand allows I/O port addresses from 0 to 65,535 to be accessed; using a byte immediate allows I/O port addresses 0 to 255 to be accessed.</p><p>When accessing an 8-bit I/O port, the opcode determines the port size; when accessing a 16- and 32-bit I/O port, the operand-size attribute determines the port size. At the machine code level, I/O instructions are shorter when accessing 8-bit I/O ports. Here, the upper eight bits of the port address will be 0.</p><p>This instruction is only useful for accessing I/O ports located in the processor\u2019s I/O address space. See Chapter 16, \u201cInput/Output,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on accessing I/O ports in the I/O address space.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Copies the value from the I/O port specified with the second operand (source operand) to the destination operand (first operand). The source operand can be a byte-immediate or the DX register; the destination operand can be register AL, AX, or EAX, depending on the size of the port being accessed (8, 16, or 32 bits, respectively). Using the DX register as a source operand allows I/O port addresses from 0 to 65,535 to be accessed; using a byte immediate allows I/O port addresses 0 to 255 to be accessed."
             };

         case "INC":
             return {
                 "url": "http://www.felixcloutier.com/x86/INC.html",
                 "html": "<p>Adds 1 to the destination operand, while preserving the state of the CF flag. The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (Use an ADD instruction with an immediate operand of 1 to perform an increment operation that does update the CF flag.)</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, INC r16 and INC r32 are not encodable (because opcodes 40H through 47H are REX prefixes). Otherwise, the instruction\u2019s 64-bit mode default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits.</p>",
                 "tooltip": "Adds 1 to the destination operand, while preserving the state of the CF flag. The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (Use an ADD instruction with an immediate operand of 1 to perform an increment operation that does update the CF flag.)"
             };

         case "INSB":
         case "INSW":
         case "INSD":
         case "INS":
             return {
                 "url": "http://www.felixcloutier.com/x86/INSD.html",
                 "html": "<p>Copies the data from the I/O port specified with the source operand (second operand) to the destination operand (first operand). The source operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The destination operand is a memory location, the address of which is read from either the ES:DI, ES:EDI or the RDI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The ES segment cannot be overridden with a segment override prefix.) The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the INS mnemonic) allows the source and destination operands to be specified explicitly. Here, the source operand must be \u201cDX,\u201d and the destination operand should be a symbol that indicates the size of the I/O port and the destination address. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the destination operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the ES:(E)DI registers, which must be loaded correctly before the INS instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the INS instructions. Here also DX is assumed by the processor to be the source operand and ES:(E)DI is assumed to be the destination operand. The size of the I/O port is specified with the choice of mnemonic: INSB (byte), INSW (word), or INSD (doubleword).</p><p>After the byte, word, or doubleword is transferred from the I/O port to the memory location, the DI/EDI/RDI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented.) The (E)DI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.</p><p>The INS, INSB, INSW, and INSD instructions can be preceded by the REP prefix for block input of ECX bytes, words, or doublewords. See \u201cREP/REPE/REPZ/REPNE/REPNZ\u2014Repeat String Operation Prefix\u201d in Chapter 4 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2B</em>, for a description of the REP prefix.</p>",
                 "tooltip": "Copies the data from the I/O port specified with the source operand (second operand) to the destination operand (first operand). The source operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The destination operand is a memory location, the address of which is read from either the ES:DI, ES:EDI or the RDI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The ES segment cannot be overridden with a segment override prefix.) The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port."
             };

         case "VINSERTPS":
         case "INSERTPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/INSERTPS.html",
                 "html": "<p>(register source form)</p><p>Select a single precision floating-point element from second source as indicated by Count_S bits of the immediate operand and insert it into the first source at the location indicated by the Count_D bits of the immediate operand. Store in the destination and zero out destination elements based on the ZMask bits of the immediate operand.</p><p>(memory source form)</p><p>Load a floating-point element from a 32-bit memory location and insert it into the first source at the location indicated by the Count_D bits of the immediate operand. Store in the destination and zero out destination elements based on the ZMask bits of the immediate operand.</p><p>128-bit Legacy SSE version: The first source register is an XMM register. The second source operand is either an XMM register or a 32-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "(register source form)"
             };

         case "INT":
         case "INTO":
             return {
                 "url": "http://www.felixcloutier.com/x86/INTO.html",
                 "html": "<p>The INT <em>n</em> instruction generates a call to the interrupt or exception handler specified with the destination operand (see the section titled \u201cInterrupts and Exceptions\u201d in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). The destination operand specifies a vector from 0 to 255, encoded as an 8-bit unsigned immediate value. Each vector provides an index to a gate descriptor in the IDT. The first 32 vectors are reserved by Intel for system use. Some of these vectors are used for internally generated exceptions.</p><p>The INT <em>n</em> instruction is the general mnemonic for executing a software-generated call to an interrupt handler. The INTO instruction is a special mnemonic for calling overflow exception (#OF), exception 4. The overflow interrupt checks the OF flag in the EFLAGS register and calls the overflow interrupt handler if the OF flag is set to 1. (The INTO instruction cannot be used in 64-bit mode.)</p><p>The INT 3 instruction generates a special one byte opcode (CC) that is intended for calling the debug exception handler. (This one byte form is valuable because it can be used to replace the first byte of any instruction with a breakpoint, including other one byte instructions, without overwriting other code). To further support its function as a debug breakpoint, the interrupt generated with the CC opcode also differs from the regular software interrupts as follows:</p><p>Note that the \u201cnormal\u201d 2-byte opcode for INT 3 (CD03) does not have these special features. Intel and Microsoft assemblers will not generate the CD03 opcode from any mnemonic, but this opcode can be created by direct numeric code definition or by self-modifying code.</p><p>The action of the INT <em>n</em> instruction (including the INTO and INT 3 instructions) is similar to that of a far call made with the CALL instruction. The primary difference is that with the INT <em>n</em> instruction, the EFLAGS register is pushed onto the stack before the return address. (The return address is a far address consisting of the current values of the CS and EIP registers.) Returns from interrupt procedures are handled with the IRET instruction, which pops the EFLAGS information and return address from the stack.</p>",
                 "tooltip": "The INT n instruction generates a call to the interrupt or exception handler specified with the destination operand (see the section titled \u201cInterrupts and Exceptions\u201d in Chapter 6 of the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1). The destination operand specifies a vector from 0 to 255, encoded as an 8-bit unsigned immediate value. Each vector provides an index to a gate descriptor in the IDT. The first 32 vectors are reserved by Intel for system use. Some of these vectors are used for internally generated exceptions."
             };

         case "INVD":
             return {
                 "url": "http://www.felixcloutier.com/x86/INVD.html",
                 "html": "<p>Invalidates (flushes) the processor\u2019s internal caches and issues a special-function bus cycle that directs external caches to also flush themselves. Data held in internal caches is not written back to main memory.</p><p>After executing this instruction, the processor does not wait for the external caches to complete their flushing operation before proceeding with instruction execution. It is the responsibility of hardware to respond to the cache flush signal.</p><p>The INVD instruction is a privileged instruction. When the processor is running in protected mode, the CPL of a program or procedure must be 0 to execute this instruction.</p><p>The INVD instruction may be used when the cache is used as temporary memory and the cache contents need to be invalidated rather than written back to memory. When the cache is used as temporary memory, no external device should be actively writing data to main memory.</p><p>Use this instruction with care. Data cached internally and not written back to main memory will be lost. Note that any data from an external device to main memory (for example, via a PCIWrite) can be temporarily stored in the caches; these data can be lost when an INVD instruction is executed. Unless there is a specific requirement or benefit to flushing caches without writing back modified cache lines (for example, temporary memory, testing, or fault recovery where cache coherency with main memory is not a concern), software should instead use the WBINVD instruction.</p>",
                 "tooltip": "Invalidates (flushes) the processor\u2019s internal caches and issues a special-function bus cycle that directs external caches to also flush themselves. Data held in internal caches is not written back to main memory."
             };

         case "INVLPG":
             return {
                 "url": "http://www.felixcloutier.com/x86/INVLPG.html",
                 "html": "<p>Invalidates any translation lookaside buffer (TLB) entries specified with the source operand. The source operand is a memory address. The processor determines the page that contains that address and flushes all TLB entries for that page.</p><p>The INVLPG instruction is a privileged instruction. When the processor is running in protected mode, the CPL must be 0 to execute this instruction.</p><p>The INVLPG instruction normally flushes TLB entries only for the specified page; however, in some cases, it may flush more entries, even the entire TLB. The instruction is guaranteed to invalidate only TLB entries associated with the current PCID. (If PCIDs are disabled \u2014 CR4.PCIDE = 0 \u2014 the current PCID is 000H.) The instruction also invalidates any global TLB entries for the specified page, regardless of PCID.</p><p>For more details on operations that flush the TLB, see \u201cMOV\u2014Move to/from Control Registers\u201d and Section 4.10.4.1, \u201cOperations that Invalidate TLBs and Paging-Structure Caches,\u201d of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>This instruction\u2019s operation is the same in all non-64-bit modes. It also operates the same in 64-bit mode, except if the memory address is in non-canonical form. In this case, INVLPG is the same as a NOP.</p>",
                 "tooltip": "Invalidates any translation lookaside buffer (TLB) entries specified with the source operand. The source operand is a memory address. The processor determines the page that contains that address and flushes all TLB entries for that page."
             };

         case "INVPCID":
             return {
                 "url": "http://www.felixcloutier.com/x86/INVPCID.html",
                 "html": "<p>Invalidates mappings in the translation lookaside buffers (TLBs) and paging-structure caches based on process-context identifier (PCID). (See Section 4.10, \u201cCaching Translation Information,\u201d in <em>Intel 64 and IA-32 Architecture Software Developer\u2019s Manual, Volume 3A</em>.) Invalidation is based on the INVPCID type specified in the register operand and the INVPCID descriptor specified in the memory operand.</p><p>Outside 64-bit mode, the register operand is always 32 bits, regardless of the value of CS.D. In 64-bit mode the register operand has 64 bits.</p><p>There are four INVPCID types currently defined:</p><p>The INVPCID descriptor comprises 128 bits and consists of a PCID and a linear address as shown in Figure 3-23. For INVPCID type 0, the processor uses the full 64 bits of the linear address even outside 64-bit mode; the linear address is not used for other INVPCID types.</p>",
                 "tooltip": "Invalidates mappings in the translation lookaside buffers (TLBs) and paging-structure caches based on process-context identifier (PCID). (See Section 4.10, \u201cCaching Translation Information,\u201d in Intel 64 and IA-32 Architecture Software Developer\u2019s Manual, Volume 3A.) Invalidation is based on the INVPCID type specified in the register operand and the INVPCID descriptor specified in the memory operand."
             };

         case "IRET":
         case "IRETD":
         case "IRETQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/IRETQ.html",
                 "html": "<p>Returns program control from an exception or interrupt handler to a program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to perform a return from a nested task. (A nested task is created when a CALL instruction is used to initiate a task switch or when an interrupt or exception causes a task switch to an interrupt or exception handler.) See the section titled \u201cTask Linking\u201d in Chapter 7 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>IRET and IRETD are mnemonics for the same opcode. The IRETD mnemonic (interrupt return double) is intended for use when returning from an interrupt when using the 32-bit operand size; however, most assemblers use the IRET mnemonic interchangeably for both operand sizes.</p><p>In Real-Address Mode, the IRET instruction performs a far return to the interrupted program or procedure. During this operation, the processor pops the return instruction pointer, return code segment selector, and EFLAGS image from the stack to the EIP, CS, and EFLAGS registers, respectively, and then resumes execution of the interrupted program or procedure.</p><p>In Protected Mode, the action of the IRET instruction depends on the settings of the NT (nested task) and VM flags in the EFLAGS register and the VM flag in the EFLAGS image stored on the current stack. Depending on the setting of these flags, the processor performs the following types of interrupt returns:</p><p>If the NT flag (EFLAGS register) is cleared, the IRET instruction performs a far return from the interrupt procedure, without a task switch. The code segment being returned to must be equally or less privileged than the interrupt handler routine (as indicated by the RPL field of the code segment selector popped from the stack).</p>",
                 "tooltip": "Returns program control from an exception or interrupt handler to a program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to perform a return from a nested task. (A nested task is created when a CALL instruction is used to initiate a task switch or when an interrupt or exception causes a task switch to an interrupt or exception handler.) See the section titled \u201cTask Linking\u201d in Chapter 7 of the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A."
             };

         case "JZ":
         case "JP":
         case "JS":
         case "JL":
         case "JBE":
         case "JO":
         case "JNBE":
         case "JRCXZ":
         case "JE":
         case "JG":
         case "JA":
         case "JB":
         case "JC":
         case "JNC":
         case "JNLE":
         case "JGE":
         case "JNAE":
         case "JECXZ":
         case "JAE":
         case "JNS":
         case "JNP":
         case "JNGE":
         case "JPO":
         case "JNZ":
         case "JPE":
         case "JLE":
         case "JNB":
         case "JNA":
         case "JCXZ":
         case "JNG":
         case "JNE":
         case "JNO":
         case "JNL":
             return {
                 "url": "http://www.felixcloutier.com/x86/JAE.html",
                 "html": "<p>Checks the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and, if the flags are in the specified state (condition), performs a jump to the target instruction specified by the destination operand. A condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, the jump is not performed and execution continues with the instruction following the J<em>cc </em>instruction.</p><p>The target instruction is specified with a relative offset (a signed offset relative to the current value of the instruction pointer in the EIP register). A relative offset (<em>rel8</em>, <em>rel16,</em> or <em>rel32</em>) is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed, 8-bit or 32-bit immediate value, which is added to the instruction pointer. Instruction coding is most efficient for offsets of \u2013128 to +127. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits.</p><p>The conditions for each J<em>cc</em> mnemonic are given in the \u201cDescription\u201d column of the table on the preceding page. The terms \u201cless\u201d and \u201cgreater\u201d are used for comparisons of signed integers and the terms \u201cabove\u201d and \u201cbelow\u201d are used for unsigned integers.</p><p>Because a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are defined for some opcodes. For example, the JA (jump if above) instruction and the JNBE (jump if not below or equal) instruction are alternate mnemonics for the opcode 77H.</p><p>The J<em>cc</em> instruction does not support far jumps (jumps to other code segments). When the target for the conditional jump is in a different segment, use the opposite condition from the condition being tested for the J<em>cc</em> instruction, and then access the target with an unconditional far jump (JMP instruction) to the other segment. For example, the following conditional far jump is illegal:</p>",
                 "tooltip": "Checks the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and, if the flags are in the specified state (condition), performs a jump to the target instruction specified by the destination operand. A condition code (cc) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, the jump is not performed and execution continues with the instruction following the Jcc instruction."
             };

         case "JMP":
             return {
                 "url": "http://www.felixcloutier.com/x86/JMP.html",
                 "html": "<p>Transfers program control to a different point in the instruction stream without recording return information. The destination (target) operand specifies the address of the instruction being jumped to. This operand can be an immediate value, a general-purpose register, or a memory location.</p><p>This instruction can be used to execute four different types of jumps:</p><p>A task switch can only be executed in protected mode (see Chapter 7, in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>, for information on performing task switches with the JMP instruction).</p><p><strong>Near and Short Jumps. </strong>When executing a near jump, the processor jumps to the address (within the current code segment) that is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register). A near jump to a relative offset of 8-bits (<em>rel8</em>) is referred to as a short jump. The CS register is not changed on near and short jumps.</p>",
                 "tooltip": "Transfers program control to a different point in the instruction stream without recording return information. The destination (target) operand specifies the address of the instruction being jumped to. This operand can be an immediate value, a general-purpose register, or a memory location."
             };

         case "LAHF":
             return {
                 "url": "http://www.felixcloutier.com/x86/LAHF.html",
                 "html": "<p>This instruction executes as described above in compatibility mode and legacy mode. It is valid in 64-bit mode only if CPUID.80000001H:ECX.LAHF-SAHF[bit 0] = 1.</p>",
                 "tooltip": "This instruction executes as described above in compatibility mode and legacy mode. It is valid in 64-bit mode only if CPUID.80000001H:ECX.LAHF-SAHF[bit 0] = 1."
             };

         case "LAR":
             return {
                 "url": "http://www.felixcloutier.com/x86/LAR.html",
                 "html": "<p>Loads the access rights from the segment descriptor specified by the second operand (source operand) into the first operand (destination operand) and sets the ZF flag in the flag register. The source operand (which can be a register or a memory location) contains the segment selector for the segment descriptor being accessed. If the source operand is a memory address, only 16 bits of data are accessed. The destination operand is a general-purpose register.</p><p>The processor performs access checks as part of the loading process. Once loaded in the destination register, software can perform additional checks on the access rights information.</p><p>The access rights for a segment descriptor include fields located in the second doubleword (bytes 4\u20137) of the segment descriptor. The following fields are loaded by the LAR instruction:</p><p>\u2014 Bits 19:16 are undefined.</p><p>\u2014 Bit 20 returns the software-available bit in the descriptor.</p>",
                 "tooltip": "Loads the access rights from the segment descriptor specified by the second operand (source operand) into the first operand (destination operand) and sets the ZF flag in the flag register. The source operand (which can be a register or a memory location) contains the segment selector for the segment descriptor being accessed. If the source operand is a memory address, only 16 bits of data are accessed. The destination operand is a general-purpose register."
             };

         case "VLDDQU":
         case "LDDQU":
             return {
                 "url": "http://www.felixcloutier.com/x86/LDDQU.html",
                 "html": "<p>The instruction is <em>functionally similar </em>to (V)MOVDQU ymm/xmm, m256/m128 for loading from memory. That is: 32/16 bytes of data starting at an address specified by the source memory operand (second operand) are fetched from memory and placed in a destination register (first operand). The source operand need not be aligned on a 32/16-byte boundary. Up to 64/32 bytes may be loaded from memory; this is implementation dependent.</p><p>This instruction may improve performance relative to (V)MOVDQU if the source operand crosses a cache line boundary. In situations that require the data loaded by (V)LDDQU be modified and stored to the same location, use (V)MOVDQU or (V)MOVDQA instead of (V)LDDQU. To move a double quadword to or from memory locations that are known to be aligned on 16-byte boundaries, use the (V)MOVDQA instruction.</p>",
                 "tooltip": "The instruction is functionally similar to (V)MOVDQU ymm/xmm, m256/m128 for loading from memory. That is: 32/16 bytes of data starting at an address specified by the source memory operand (second operand) are fetched from memory and placed in a destination register (first operand). The source operand need not be aligned on a 32/16-byte boundary. Up to 64/32 bytes may be loaded from memory; this is implementation dependent."
             };

         case "LDMXCSR":
         case "VLDMXCSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/LDMXCSR.html",
                 "html": "<p>Loads the source operand into the MXCSR control/status register. The source operand is a 32-bit memory location. See \u201cMXCSR Control and Status Register\u201d in Chapter 10, of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of the MXCSR register and its contents.</p><p>The LDMXCSR instruction is typically used in conjunction with the (V)STMXCSR instruction, which stores the contents of the MXCSR register in memory.</p><p>The default MXCSR value at reset is 1F80H.</p><p>If a (V)LDMXCSR instruction clears a SIMD floating-point exception mask bit and sets the corresponding exception flag bit, a SIMD floating-point exception will not be immediately generated. The exception will be generated only upon the execution of the next instruction that meets both conditions below:</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Loads the source operand into the MXCSR control/status register. The source operand is a 32-bit memory location. See \u201cMXCSR Control and Status Register\u201d in Chapter 10, of the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for a description of the MXCSR register and its contents."
             };

         case "LEA":
             return {
                 "url": "http://www.felixcloutier.com/x86/LEA.html",
                 "html": "<p>Computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand). The source operand is a memory address (offset part) specified with one of the processor\u2019s addressing modes; the destination operand is a general-purpose register. The address-size and operand-size attributes affect the action performed by this instruction, as shown in the following table. The operand-size attribute of the instruction is determined by the chosen register; the address-size attribute is determined by the attribute of the code segment.</p><h3>Table 3-63.  Non-64-bit Mode LEA Operation with Address and Operand Size Attributes</h3><table>\n<tr>\n<th>Operand Size</th>\n<th>Address Size</th>\n<th>Action Performed</th></tr>\n<tr>\n<td>16</td>\n<td>16</td>\n<td>16-bit effective address is calculated and stored in requested 16-bit register destination.</td></tr>\n<tr>\n<td>16</td>\n<td>32</td>\n<td>32-bit effective address is calculated. The lower 16 bits of the address are stored in the requested 16-bit register destination.</td></tr>\n<tr>\n<td>32</td>\n<td>16</td>\n<td>16-bit effective address is calculated. The 16-bit address is zero-extended and stored in the requested 32-bit register destination.</td></tr>\n<tr>\n<td>32</td>\n<td>32</td>\n<td>32-bit effective address is calculated and stored in the requested 32-bit register destination.</td></tr></table><p>Different assemblers may use different algorithms based on the size attribute and symbolic reference of the source operand.</p><p>In 64-bit mode, the instruction\u2019s destination operand is governed by the operand-size attribute; the default operand size is 32 bits. Address calculation is governed by the address-size attribute; the default address size is 64 bits. In 64-bit mode, an address size of 16 bits is not encodable. See Table 3-64.</p>",
                 "tooltip": "Computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand). The source operand is a memory address (offset part) specified with one of the processor\u2019s addressing modes; the destination operand is a general-purpose register. The address-size and operand-size attributes affect the action performed by this instruction, as shown in the following table. The operand-size attribute of the instruction is determined by the chosen register; the address-size attribute is determined by the attribute of the code segment."
             };

         case "LEAVE":
             return {
                 "url": "http://www.felixcloutier.com/x86/LEAVE.html",
                 "html": "<p>Releases the stack frame set up by an earlier ENTER instruction. The LEAVE instruction copies the frame pointer (in the EBP register) into the stack pointer register (ESP), which releases the stack space allocated to the stack frame. The old frame pointer (the frame pointer for the calling procedure that was saved by the ENTER instruction) is then popped from the stack into the EBP register, restoring the calling procedure\u2019s stack frame.</p><p>A RET instruction is commonly executed following a LEAVE instruction to return program control to the calling procedure.</p><p>See \u201cProcedure Calls for Block-Structured Languages\u201d in Chapter 7 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for detailed information on the use of the ENTER and LEAVE instructions.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 64 bits; 32-bit operation cannot be encoded. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Releases the stack frame set up by an earlier ENTER instruction. The LEAVE instruction copies the frame pointer (in the EBP register) into the stack pointer register (ESP), which releases the stack space allocated to the stack frame. The old frame pointer (the frame pointer for the calling procedure that was saved by the ENTER instruction) is then popped from the stack into the EBP register, restoring the calling procedure\u2019s stack frame."
             };

         case "LFENCE":
             return {
                 "url": "http://www.felixcloutier.com/x86/LFENCE.html",
                 "html": "<p>Performs a serializing operation on all load-from-memory instructions that were issued prior to the LFENCE instruction. Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes. In particular, an instruction that loads from memory and that precedes an LFENCE receives data from memory prior to completion of the LFENCE. (An LFENCE that follows an instruction that stores to memory might complete <strong>before</strong> the data being stored have become globally visible.) Instructions following an LFENCE may be fetched from memory before the LFENCE, but they will not execute until the LFENCE completes.</p><p>Weakly ordered memory types can be used to achieve higher processor performance through such techniques as out-of-order issue and speculative reads. The degree to which a consumer of data recognizes or knows that the data is weakly ordered varies among applications and may be unknown to the producer of this data. The LFENCE instruction provides a performance-efficient way of ensuring load ordering between routines that produce weakly-ordered results and routines that consume that data.</p><p>Processors are free to fetch and cache data speculatively from regions of system memory that use the WB, WC, and WT memory types. This speculative fetching can occur at any time and is not tied to instruction execution. Thus, it is not ordered with respect to executions of the LFENCE instruction; data can be brought into the caches speculatively just before, during, or after the execution of an LFENCE instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>Specification of the instruction's opcode above indicates a ModR/M byte of E8. For this instruction, the processor ignores the r/m field of the ModR/M byte. Thus, LFENCE is encoded by any opcode of the form 0F AE Ex, where x is in the range 8-F.</p>",
                 "tooltip": "Performs a serializing operation on all load-from-memory instructions that were issued prior to the LFENCE instruction. Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes. In particular, an instruction that loads from memory and that precedes an LFENCE receives data from memory prior to completion of the LFENCE. (An LFENCE that follows an instruction that stores to memory might complete before the data being stored have become globally visible.) Instructions following an LFENCE may be fetched from memory before the LFENCE, but they will not execute until the LFENCE completes."
             };

         case "LDS":
         case "LGS":
         case "LES":
         case "LSS":
         case "LFS":
             return {
                 "url": "http://www.felixcloutier.com/x86/LGS.html",
                 "html": "<p>Loads a far pointer (segment selector and offset) from the second operand (source operand) into a segment register and the first operand (destination operand). The source operand specifies a 48-bit or a 32-bit pointer in memory depending on the current setting of the operand-size attribute (32 bits or 16 bits, respectively). The instruction opcode and the destination operand specify a segment register/general-purpose register pair. The 16-bit segment selector from the source operand is loaded into the segment register specified with the opcode (DS, SS, ES, FS, or GS). The 32-bit or 16-bit offset is loaded into the register specified with the destination operand.</p><p>If one of these instructions is executed in protected mode, additional information from the segment descriptor pointed to by the segment selector in the source operand is loaded in the hidden part of the selected segment register.</p><p>Also in protected mode, a NULL selector (values 0000 through 0003) can be loaded into DS, ES, FS, or GS registers without causing a protection exception. (Any subsequent reference to a segment whose corresponding segment register is loaded with a NULL selector, causes a general-protection exception (#GP) and no memory reference to the segment occurs.)</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.W promotes operation to specify a source operand referencing an 80-bit pointer (16-bit selector, 64-bit offset) in memory. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Loads a far pointer (segment selector and offset) from the second operand (source operand) into a segment register and the first operand (destination operand). The source operand specifies a 48-bit or a 32-bit pointer in memory depending on the current setting of the operand-size attribute (32 bits or 16 bits, respectively). The instruction opcode and the destination operand specify a segment register/general-purpose register pair. The 16-bit segment selector from the source operand is loaded into the segment register specified with the opcode (DS, SS, ES, FS, or GS). The 32-bit or 16-bit offset is loaded into the register specified with the destination operand."
             };

         case "LIDT":
         case "LGDT":
             return {
                 "url": "http://www.felixcloutier.com/x86/LIDT.html",
                 "html": "<p>Loads the values in the source operand into the global descriptor table register (GDTR) or the interrupt descriptor table register (IDTR). The source operand specifies a 6-byte memory location that contains the base address (a linear address) and the limit (size of table in bytes) of the global descriptor table (GDT) or the interrupt descriptor table (IDT). If operand-size attribute is 32 bits, a 16-bit limit (lower 2 bytes of the 6-byte data operand) and a 32-bit base address (upper 4 bytes of the data operand) are loaded into the register. If the operand-size attribute is 16 bits, a 16-bit limit (lower 2 bytes) and a 24-bit base address (third, fourth, and fifth byte) are loaded. Here, the high-order byte of the operand is not used and the high-order byte of the base address in the GDTR or IDTR is filled with zeros.</p><p>The LGDT and LIDT instructions are used only in operating-system software; they are not used in application programs. They are the only instructions that directly load a linear address (that is, not a segment-relative address) and a limit in protected mode. They are commonly executed in real-address mode to allow processor initialization prior to switching to protected mode.</p><p>In 64-bit mode, the instruction\u2019s operand size is fixed at 8+2 bytes (an 8-byte base and a 2-byte limit). See the summary chart at the beginning of this section for encoding data and limits.</p><p>See \u201cSGDT\u2014Store Global Descriptor Table Register\u201d in Chapter 4, <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2B</em>, for information on storing the contents of the GDTR and IDTR.</p>",
                 "tooltip": "Loads the values in the source operand into the global descriptor table register (GDTR) or the interrupt descriptor table register (IDTR). The source operand specifies a 6-byte memory location that contains the base address (a linear address) and the limit (size of table in bytes) of the global descriptor table (GDT) or the interrupt descriptor table (IDT). If operand-size attribute is 32 bits, a 16-bit limit (lower 2 bytes of the 6-byte data operand) and a 32-bit base address (upper 4 bytes of the data operand) are loaded into the register. If the operand-size attribute is 16 bits, a 16-bit limit (lower 2 bytes) and a 24-bit base address (third, fourth, and fifth byte) are loaded. Here, the high-order byte of the operand is not used and the high-order byte of the base address in the GDTR or IDTR is filled with zeros."
             };

         case "LLDT":
             return {
                 "url": "http://www.felixcloutier.com/x86/LLDT.html",
                 "html": "<p>Loads the source operand into the segment selector field of the local descriptor table register (LDTR). The source operand (a general-purpose register or a memory location) contains a segment selector that points to a local descriptor table (LDT). After the segment selector is loaded in the LDTR, the processor uses the segment selector to locate the segment descriptor for the LDT in the global descriptor table (GDT). It then loads the segment limit and base address for the LDT from the segment descriptor into the LDTR. The segment registers DS, ES, SS, FS, GS, and CS are not affected by this instruction, nor is the LDTR field in the task state segment (TSS) for the current task.</p><p>If bits 2-15 of the source operand are 0, LDTR is marked invalid and the LLDT instruction completes silently. However, all subsequent references to descriptors in the LDT (except by the LAR, VERR, VERW or LSL instructions) cause a general protection exception (#GP).</p><p>The operand-size attribute has no effect on this instruction.</p><p>The LLDT instruction is provided for use in operating-system software; it should not be used in application programs. This instruction can only be executed in protected mode or 64-bit mode.</p><p>In 64-bit mode, the operand size is fixed at 16 bits.</p>",
                 "tooltip": "Loads the source operand into the segment selector field of the local descriptor table register (LDTR). The source operand (a general-purpose register or a memory location) contains a segment selector that points to a local descriptor table (LDT). After the segment selector is loaded in the LDTR, the processor uses the segment selector to locate the segment descriptor for the LDT in the global descriptor table (GDT). It then loads the segment limit and base address for the LDT from the segment descriptor into the LDTR. The segment registers DS, ES, SS, FS, GS, and CS are not affected by this instruction, nor is the LDTR field in the task state segment (TSS) for the current task."
             };

         case "LMSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/LMSW.html",
                 "html": "<p>Loads the source operand into the machine status word, bits 0 through 15 of register CR0. The source operand can be a 16-bit general-purpose register or a memory location. Only the low-order 4 bits of the source operand (which contains the PE, MP, EM, and TS flags) are loaded into CR0. The PG, CD, NW, AM, WP, NE, and ET flags of CR0 are not affected. The operand-size attribute has no effect on this instruction.</p><p>If the PE flag of the source operand (bit 0) is set to 1, the instruction causes the processor to switch to protected mode. While in protected mode, the LMSW instruction cannot be used to clear the PE flag and force a switch back to real-address mode.</p><p>The LMSW instruction is provided for use in operating-system software; it should not be used in application programs. In protected or virtual-8086 mode, it can only be executed at CPL 0.</p><p>This instruction is provided for compatibility with the Intel 286 processor; programs and procedures intended to run on the Pentium 4, Intel Xeon, P6 family, Pentium, Intel486, and Intel386 processors should use the MOV (control registers) instruction to load the whole CR0 register. The MOV CR0 instruction can be used to set and clear the PE flag in CR0, allowing a procedure or program to switch between protected and real-address modes.</p><p>This instruction is a serializing instruction.</p>",
                 "tooltip": "Loads the source operand into the machine status word, bits 0 through 15 of register CR0. The source operand can be a 16-bit general-purpose register or a memory location. Only the low-order 4 bits of the source operand (which contains the PE, MP, EM, and TS flags) are loaded into CR0. The PG, CD, NW, AM, WP, NE, and ET flags of CR0 are not affected. The operand-size attribute has no effect on this instruction."
             };

         case "LOCK":
             return {
                 "url": "http://www.felixcloutier.com/x86/LOCK.html",
                 "html": "<p>Causes the processor\u2019s LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.</p><p>Note that, in later Intel 64 and IA-32 processors (including the Pentium 4, Intel Xeon, and P6 family processors), locking may occur without the LOCK# signal being asserted. See the \u201cIA-32 Architecture Compatibility\u201d section below.</p><p>The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions where the destination operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCH8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated. An undefined opcode exception will also be generated if the LOCK prefix is used with any instruction not in the above list. The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix.</p><p>The LOCK prefix is typically used with the BTS instruction to perform a read-modify-write operation on a memory location in shared memory environment.</p><p>The integrity of the LOCK prefix is not affected by the alignment of the memory field. Memory locking is observed for arbitrarily misaligned fields.</p>",
                 "tooltip": "Causes the processor\u2019s LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted."
             };

         case "LODS":
         case "LODSQ":
         case "LODSB":
         case "LODSD":
         case "LODSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/LODSQ.html",
                 "html": "<p>Loads a byte, word, or doubleword from the source operand into the AL, AX, or EAX register, respectively. The source operand is a memory location, the address of which is read from the DS:ESI or the DS:SI registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). The DS segment may be over-ridden with a segment override prefix.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the LODS mnemonic) allows the source operand to be specified explicitly. Here, the source operand should be a symbol that indicates the size and location of the source value. The destination operand is then automatically selected to match the size of the source operand (the AL register for byte operands, AX for word operands, and EAX for doubleword operands). This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the DS:(E)SI registers, which must be loaded correctly before the load string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the LODS instructions. Here also DS:(E)SI is assumed to be the source operand and the AL, AX, or EAX register is assumed to be the desti-nation operand. 
The size of the source and destination operands is selected with the mnemonic: LODSB (byte loaded into register AL), LODSW (word loaded into AX), or LODSD (doubleword loaded into EAX).</p><p>After the byte, word, or doubleword is transferred from the memory location into the AL, AX, or EAX register, the (E)SI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1, the ESI register is decremented.) The (E)SI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.</p><p>In 64-bit mode, use of the REX.W prefix promotes operation to 64 bits. LODS/LODSQ load the quadword at address (R)SI into RAX. The (R)SI register is then incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register.</p>",
                 "tooltip": "Loads a byte, word, or doubleword from the source operand into the AL, AX, or EAX register, respectively. The source operand is a memory location, the address of which is read from the DS:ESI or the DS:SI registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). The DS segment may be over-ridden with a segment override prefix."
             };

         case "LOOPNE":
         case "LOOPE":
         case "LOOP":
             return {
                 "url": "http://www.felixcloutier.com/x86/LOOPNE.html",
                 "html": "<p>Performs a loop operation using the RCX, ECX or CX register as a counter (depending on whether address size is 64 bits, 32 bits, or 16 bits). Note that the LOOP instruction ignores REX.W; but 64-bit address size can be over-ridden using a 67H prefix.</p><p>Each time the LOOP instruction is executed, the count register is decremented, then checked for 0. If the count is 0, the loop is terminated and program execution continues with the instruction following the LOOP instruction. If the count is not zero, a near jump is performed to the destination (target) operand, which is presumably the instruction at the beginning of the loop.</p><p>The target instruction is specified with a relative offset (a signed offset relative to the current value of the instruc-tion pointer in the IP/EIP/RIP register). This offset is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed, 8-bit immediate value, which is added to the instruction pointer. Offsets of \u2013128 to +127 are allowed with this instruction.</p><p>Some forms of the loop instruction (LOOP<em>cc</em>) also accept the ZF flag as a condition for terminating the loop before the count reaches zero. With these forms of the instruction, a condition code (<em>cc</em>) is associated with each instruc-tion to indicate the condition being tested for. Here, the LOOP<em>cc</em> instruction itself does not affect the state of the ZF flag; the ZF flag is changed by other instructions in the loop.</p>",
                 "tooltip": "Performs a loop operation using the RCX, ECX or CX register as a counter (depending on whether address size is 64 bits, 32 bits, or 16 bits). Note that the LOOP instruction ignores REX.W; but 64-bit address size can be over-ridden using a 67H prefix."
             };

         case "LSL":
             return {
                 "url": "http://www.felixcloutier.com/x86/LSL.html",
                 "html": "<p>Loads the unscrambled segment limit from the segment descriptor specified with the second operand (source operand) into the first operand (destination operand) and sets the ZF flag in the EFLAGS register. The source operand (which can be a register or a memory location) contains the segment selector for the segment descriptor being accessed. The destination operand is a general-purpose register.</p><p>The processor performs access checks as part of the loading process. Once loaded in the destination register, soft-ware can compare the segment limit with the offset of a pointer.</p><p>The segment limit is a 20-bit value contained in bytes 0 and 1 and in the first 4 bits of byte 6 of the segment descriptor. If the descriptor has a byte granular segment limit (the granularity flag is set to 0), the destination operand is loaded with a byte granular value (byte limit). If the descriptor has a page granular segment limit (the granularity flag is set to 1), the LSL instruction will translate the page granular limit (page limit) into a byte limit before loading it into the destination operand. The translation is performed by shifting the 20-bit \u201craw\u201d limit left 12 bits and filling the low-order 12 bits with 1s.</p><p>When the operand size is 32 bits, the 32-bit byte limit is stored in the destination operand. When the operand size is 16 bits, a valid 32-bit limit is computed; however, the upper 16 bits are truncated and only the low-order 16 bits are loaded into the destination operand.</p><p>This instruction performs the following checks before it loads the segment limit into the destination register:</p>",
                 "tooltip": "Loads the unscrambled segment limit from the segment descriptor specified with the second operand (source operand) into the first operand (destination operand) and sets the ZF flag in the EFLAGS register. The source operand (which can be a register or a memory location) contains the segment selector for the segment descriptor being accessed. The destination operand is a general-purpose register."
             };

         case "LTR":
             return {
                 "url": "http://www.felixcloutier.com/x86/LTR.html",
                 "html": "<p>Loads the source operand into the segment selector field of the task register. The source operand (a general-purpose register or a memory location) contains a segment selector that points to a task state segment (TSS). After the segment selector is loaded in the task register, the processor uses the segment selector to locate the segment descriptor for the TSS in the global descriptor table (GDT). It then loads the segment limit and base address for the TSS from the segment descriptor into the task register. The task pointed to by the task register is marked busy, but a switch to the task does not occur.</p><p>The LTR instruction is provided for use in operating-system software; it should not be used in application programs. It can only be executed in protected mode when the CPL is 0. It is commonly used in initialization code to establish the first task to be executed.</p><p>The operand-size attribute has no effect on this instruction.</p><p>In 64-bit mode, the operand size is still fixed at 16 bits. The instruction references a 16-byte descriptor to load the 64-bit base.</p>",
                 "tooltip": "Loads the source operand into the segment selector field of the task register. The source operand (a general-purpose register or a memory location) contains a segment selector that points to a task state segment (TSS). After the segment selector is loaded in the task register, the processor uses the segment selector to locate the segment descriptor for the TSS in the global descriptor table (GDT). It then loads the segment limit and base address for the TSS from the segment descriptor into the task register. The task pointed to by the task register is marked busy, but a switch to the task does not occur."
             };

         case "LZCNT":
             return {
                 "url": "http://www.felixcloutier.com/x86/LZCNT.html",
                 "html": "<p>Counts the number of leading most significant zero bits in a source operand (second operand) returning the result into a destination (first operand).</p><p>LZCNT differs from BSR. For example, LZCNT will produce the operand size when the input operand is zero. It should be noted that on processors that do not support LZCNT, the instruction byte encoding is executed as BSR.</p><p>In 64-bit mode 64-bit operand size requires REX.W=1.</p>",
                 "tooltip": "Counts the number of leading most significant zero bits in a source operand (second operand) returning the result into a destination (first operand)."
             };

         case "VMASKMOVDQU":
         case "MASKMOVDQU":
             return {
                 "url": "http://www.felixcloutier.com/x86/MASKMOVDQU.html",
                 "html": "<p>Stores selected bytes from the source operand (first operand) into an 128-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask oper-ands are XMM registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)</p><p>The most significant bit in each byte of the mask operand determines whether the corresponding byte in the source operand is written to the corresponding byte location in memory: 0 indicates no write and 1 indicates write.</p><p>The MASKMOVDQU instruction generates a non-temporal hint to the processor to minimize cache pollution. The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10, of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing opera-tion implemented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVDQU instructions if multiple processors might use different memory types to read/write the destination memory loca-tions.</p><p>Behavior with a mask of all 0s is as follows:</p><p>The MASKMOVDQU instruction can be used to improve performance of algorithms that need to merge data on a byte-by-byte basis. MASKMOVDQU should not cause a read for ownership; doing so generates unnecessary band-width since data is to be written directly using the byte-mask without allocating old data prior to the store.</p>",
                 "tooltip": "Stores selected bytes from the source operand (first operand) into an 128-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask oper-ands are XMM registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)"
             };

         case "MASKMOVQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MASKMOVQ.html",
                 "html": "<p>Stores selected bytes from the source operand (first operand) into a 64-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask oper-ands are MMX technology registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)</p><p>The most significant bit in each byte of the mask operand determines whether the corresponding byte in the source operand is written to the corresponding byte location in memory: 0 indicates no write and 1 indicates write.</p><p>The MASKMOVQ instruction generates a non-temporal hint to the processor to minimize cache pollution. The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10, of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation imple-mented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVQ instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>This instruction causes a transition from x87 FPU to MMX technology state (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]).</p><p>The behavior of the MASKMOVQ instruction with a mask of all 0s is as follows:</p>",
                 "tooltip": "Stores selected bytes from the source operand (first operand) into a 64-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask oper-ands are MMX technology registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)"
             };

         case "MAXPD":
         case "VMAXPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MAXPD.html",
                 "html": "<p>Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. If a value in the second operand is an SNaN, that SNaN is forwarded unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second operand) be returned, the action of MAXPD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand."
             };

         case "VMAXPS":
         case "MAXPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MAXPS.html",
                 "html": "<p>Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. If a value in the second operand is an SNaN, that SNaN is forwarded unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second operand) be returned, the action of MAXPS can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand."
             };

         case "MAXSD":
         case "VMAXSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MAXSD.html",
                 "html": "<p>Compares the low double-precision floating-point values in the first source operand and second the source operand, and returns the maximum value to the low quadword of the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. When the second source operand is a memory operand, only 64 bits are accessed. The high quadword of the destination operand is copied from the same bits of first source operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second source operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN of either source operand be returned, the action of MAXSD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the low double-precision floating-point values in the first source operand and second the source operand, and returns the maximum value to the low quadword of the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. When the second source operand is a memory operand, only 64 bits are accessed. The high quadword of the destination operand is copied from the same bits of first source operand."
             };

         case "VMAXSS":
         case "MAXSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MAXSS.html",
                 "html": "<p>Compares the low single-precision floating-point values in the first source operand and the second source operand, and returns the maximum value to the low doubleword of the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second source operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN from either source operand be returned, the action of MAXSS can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands are XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the low single-precision floating-point values in the first source operand and the second source operand, and returns the maximum value to the low doubleword of the destination operand."
             };

         case "MFENCE":
             return {
                 "url": "http://www.felixcloutier.com/x86/MFENCE.html",
                 "html": "<p>Performs a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior the MFENCE instruction. This serializing operation guarantees that every load and store instruction that precedes the MFENCE instruction in program order becomes globally visible before any load or store instruction that follows the MFENCE instruction.<sup>1</sup> The MFENCE instruction is ordered with respect to all load and store instructions, other MFENCE instructions, any LFENCE and SFENCE instructions, and any serializing instructions (such as the CPUID instruction). MFENCE does not serialize the instruction stream.</p><p>Weakly ordered memory types can be used to achieve higher processor performance through such techniques as out-of-order issue, speculative reads, write-combining, and write-collapsing. The degree to which a consumer of data recognizes or knows that the data is weakly ordered varies among applications and may be unknown to the producer of this data. The MFENCE instruction provides a performance-efficient way of ensuring load and store ordering between routines that produce weakly-ordered results and routines that consume that data.</p><p>Processors are free to fetch and cache data speculatively from regions of system memory that use the WB, WC, and WT memory types. This speculative fetching can occur at any time and is not tied to instruction execution. Thus, it is not ordered with respect to executions of the MFENCE instruction; data can be brought into the caches specula-tively just before, during, or after the execution of an MFENCE instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>Specification of the instruction's opcode above indicates a ModR/M byte of F0. For this instruction, the processor ignores the r/m field of the ModR/M byte. Thus, MFENCE is encoded by any opcode of the form 0F AE Fx, where x is in the range 0-7.</p>",
                 "tooltip": "Performs a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior the MFENCE instruction. This serializing operation guarantees that every load and store instruction that precedes the MFENCE instruction in program order becomes globally visible before any load or store instruction that follows the MFENCE instruction.1 The MFENCE instruction is ordered with respect to all load and store instructions, other MFENCE instructions, any LFENCE and SFENCE instructions, and any serializing instructions (such as the CPUID instruction). MFENCE does not serialize the instruction stream."
             };

         case "VMINPD":
         case "MINPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MINPD.html",
                 "html": "<p>Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. If a value in the second operand is an SNaN, that SNaN is forwarded unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second operand) be returned, the action of MINPD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination operand."
             };

         case "MINPS":
         case "VMINPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MINPS.html",
                 "html": "<p>Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. If a value in the second operand is an SNaN, that SNaN is forwarded unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second operand) be returned, the action of MINPS can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination operand."
             };

         case "MINSD":
         case "VMINSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MINSD.html",
                 "html": "<p>Compares the low double-precision floating-point values in the first source operand and the second source operand, and returns the minimum value to the low quadword of the destination operand. When the source operand is a memory operand, only the 64 bits are accessed. The high quadword of the destination operand is copied from the same bits in the first source operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second source operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second source) be returned, the action of MINSD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the low double-precision floating-point values in the first source operand and the second source operand, and returns the minimum value to the low quadword of the destination operand. When the source operand is a memory operand, only the 64 bits are accessed. The high quadword of the destination operand is copied from the same bits in the first source operand."
             };

         case "VMINSS":
         case "MINSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MINSS.html",
                 "html": "<p>Compares the low single-precision floating-point values in the first source operand and the second source operand and returns the minimum value to the low doubleword of the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN in either source operand be returned, the action of MINSD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands are XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Compares the low single-precision floating-point values in the first source operand and the second source operand and returns the minimum value to the low doubleword of the destination operand."
             };

         case "MONITOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/MONITOR.html",
                 "html": "<p>The MONITOR instruction arms address monitoring hardware using an address specified in EAX (the address range that the monitoring hardware checks for store operations can be determined by using CPUID). A store to an address within the specified address range triggers the monitoring hardware. The state of monitor hardware is used by MWAIT.</p><p>The content of EAX is an effective address (in 64-bit mode, RAX is used). By default, the DS segment is used to create a linear address that is monitored. Segment overrides can be used.</p><p>ECX and EDX are also used. They communicate other information to MONITOR. ECX specifies optional extensions. EDX specifies optional hints; it does not change the architectural behavior of the instruction. For the Pentium 4 processor (family 15, model 3), no extensions or hints are defined. Undefined hints in EDX are ignored by the processor; undefined extensions in ECX raises a general protection fault.</p><p>The address range must use memory of the write-back type. Only write-back memory will correctly trigger the monitoring hardware. Additional information on determining what address range to use in order to prevent false wake-ups is described in Chapter 8, \u201cMultiple-Processor Management\u201d of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>The MONITOR instruction is ordered as a load operation with respect to other memory transactions. The instruction is subject to the permission checking and faults associated with a byte load. Like a load, MONITOR sets the A-bit but not the D-bit in page tables.</p>",
                 "tooltip": "The MONITOR instruction arms address monitoring hardware using an address specified in EAX (the address range that the monitoring hardware checks for store operations can be determined by using CPUID). A store to an address within the specified address range triggers the monitoring hardware. The state of monitor hardware is used by MWAIT."
             };

         case "MOV":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOV.html",
                 "html": "<p>Copies the second operand (source operand) to the first operand (destination operand). The source operand can be an immediate value, general-purpose register, segment register, or memory location; the destination register can be a general-purpose register, segment register, or memory location. Both operands must be the same size, which can be a byte, a word, a doubleword, or a quadword.</p><p>The MOV instruction cannot be used to load the CS register. Attempting to do so results in an invalid opcode excep-tion (#UD). To load the CS register, use the far JMP, CALL, or RET instruction.</p><p>If the destination operand is a segment register (DS, ES, FS, GS, or SS), the source operand must be a valid segment selector. In protected mode, moving a segment selector into a segment register automatically causes the segment descriptor information associated with that segment selector to be loaded into the hidden (shadow) part of the segment register. While loading this information, the segment selector and segment descriptor information is validated (see the \u201cOperation\u201d algorithm below). The segment descriptor data is obtained from the GDT or LDT entry for the specified segment selector.</p><p>A NULL segment selector (values 0000-0003) can be loaded into the DS, ES, FS, and GS registers without causing a protection exception. However, any subsequent attempt to reference a segment whose corresponding segment register is loaded with a NULL value causes a general protection exception (#GP) and no memory reference occurs.</p><p>Loading the SS register with a MOV instruction inhibits all interrupts until after the execution of the next instruc-tion. This operation allows a stack pointer to be loaded into the ESP register with the next instruction (MOV ESP, <strong>stack-pointer value</strong>) before an interrupt occurs<sup>1</sup>. Be aware that the LSS instruction offers a more efficient method of loading the SS and ESP registers.</p>",
                 "tooltip": "Copies the second operand (source operand) to the first operand (destination operand). The source operand can be an immediate value, general-purpose register, segment register, or memory location; the destination register can be a general-purpose register, segment register, or memory location. Both operands must be the same size, which can be a byte, a word, a doubleword, or a quadword."
             };

         case "MOVAPD":
         case "VMOVAPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVAPD.html",
                 "html": "<p>Moves 2 or 4 double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM or YMM register from an 128-bit or 256-bit memory location, to store the contents of an XMM or YMM register into a 128-bit or 256-bit memory location, or to move data between two XMM or two YMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary or a general-protection exception (#GP) will be generated.</p><p>To move double-precision floating-point values to and from unaligned memory locations, use the (V)MOVUPD instruction.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit versions: Moves 128 bits of packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated. To move single-precision floating-point values to and from unaligned memory locations, use the VMOVUPD instruction.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Moves 2 or 4 double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM or YMM register from an 128-bit or 256-bit memory location, to store the contents of an XMM or YMM register into a 128-bit or 256-bit memory location, or to move data between two XMM or two YMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary or a general-protection exception (#GP) will be generated."
             };

         case "VMOVAPS":
         case "MOVAPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVAPS.html",
                 "html": "<p>Moves 4 or8 single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM or YMM register from an 128-bit or 256-bit memory location, to store the contents of an XMM or YMM register into a 128-bit or 256-bit memory location, or to move data between two XMM or two YMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary or a general-protection exception (#GP) will be generated.</p><p>To move single-precision floating-point values to and from unaligned memory locations, use the (V)MOVUPS instruction.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p><p>128-bit versions:</p>",
                 "tooltip": "Moves 4 or8 single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM or YMM register from an 128-bit or 256-bit memory location, to store the contents of an XMM or YMM register into a 128-bit or 256-bit memory location, or to move data between two XMM or two YMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary or a general-protection exception (#GP) will be generated."
             };

         case "MOVBE":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVBE.html",
                 "html": "<p>Performs a byte swap operation on the data copied from the second operand (source operand) and store the result in the first operand (destination operand). The source operand can be a general-purpose register, or memory loca-tion; the destination register can be a general-purpose register, or a memory location; however, both operands can not be registers, and only one operand can be a memory location. Both operands must be the same size, which can be a word, a doubleword or quadword.</p><p>The MOVBE instruction is provided for swapping the bytes on a read from memory or on a write to memory; thus providing support for converting little-endian values to big-endian format and vice versa.</p><p>In 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Performs a byte swap operation on the data copied from the second operand (source operand) and store the result in the first operand (destination operand). The source operand can be a general-purpose register, or memory loca-tion; the destination register can be a general-purpose register, or a memory location; however, both operands can not be registers, and only one operand can be a memory location. Both operands must be the same size, which can be a word, a doubleword or quadword."
             };

         case "VMOVD":
         case "MOVD":
         case "VMOVQ":
         case "MOVQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVD:MOVQ.html",
                 "html": "<p>Copies a doubleword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be general-purpose registers, MMX technology registers, XMM registers, or 32-bit memory locations. This instruction can be used to move a doubleword to and from the low doubleword of an MMX technology register and a general-purpose register or a 32-bit memory location, or to and from the low doubleword of an XMM register and a general-purpose register or a 32-bit memory location. The instruction cannot be used to transfer data between MMX technology registers, between XMM registers, between general-purpose registers, or between memory locations.</p><p>When the destination operand is an MMX technology register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 64 bits. When the destination operand is an XMM register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 128 bits.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Copies a doubleword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be general-purpose registers, MMX technology registers, XMM registers, or 32-bit memory locations. This instruction can be used to move a doubleword to and from the low doubleword of an MMX technology register and a general-purpose register or a 32-bit memory location, or to and from the low doubleword of an XMM register and a general-purpose register or a 32-bit memory location. The instruction cannot be used to transfer data between MMX technology registers, between XMM registers, between general-purpose registers, or between memory locations."
             };

         case "MOVDDUP":
         case "VMOVDDUP":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVDDUP.html",
                 "html": "<p>The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 8 bytes of data at memory location m64 are loaded. When the register-register form of this operation is used, the lower half of the 128-bit source register is duplicated and copied into the 128-bit destination register. See Figure 3-24.</p><svg height=\"228.6\" viewbox=\"117.060000 405902.640000 366.780000 152.400000\" width=\"550.17\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"108.7560782\" x=\"220.8613\" y=\"405920.170114\">MOVDDUP xmm1, xmm2/m64</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"39.5614835\" x=\"423.3145\" y=\"405945.483114\">xmm2/m64</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"20.0167506\" x=\"337.3015\" y=\"405945.483614\">[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"32.3052114\" x=\"423.3145\" y=\"405989.059914\">RESULT:</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"128.0608021\" x=\"139.2831\" y=\"405993.860614\">xmm1[127:64]        xmm2/m64[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"119.1644685\" x=\"287.7295\" y=\"405993.860614\">xmm1[63:0]        xmm2/m64[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"21.7768166\" x=\"423.3145\" y=\"405998.660314\">xmm1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"28.9130842\" x=\"188.8553\" y=\"406017.511414\">[127:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"20.0167506\" x=\"337.31126689\" y=\"406017.511414\">[63:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.000200pt\" 
textlength=\"26.3468782\" x=\"452.7704\" y=\"406046.419809\">OM15997</text>\n<rect height=\"126.005\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"360.015\" x=\"117.802\" y=\"405903.336\"></rect>\n<rect height=\"126.005\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"360.015\" x=\"117.802\" y=\"405903.336\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"275.309\" y=\"405929.212\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"131.303\" y=\"405977.589\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"275.309\" y=\"405977.589\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"275.309\" y=\"405929.212\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"131.303\" y=\"405977.589\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.006\" x=\"275.309\" y=\"405977.589\"></rect></svg><h3>Figure 3-24.  MOVDDUP\u2014Move One Double-FP and Duplicate</h3><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 8 bytes of data at memory location m64 are loaded. When the register-register form of this operation is used, the lower half of the 128-bit source register is duplicated and copied into the 128-bit destination register. See Figure 3-24."
             };

         case "MOVDQ2Q":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVDQ2Q.html",
                 "html": "<p>Moves the low quadword from the source operand (second operand) to the destination operand (first operand). The source operand is an XMM register and the destination operand is an MMX technology register.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the MOVDQ2Q instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Moves the low quadword from the source operand (second operand) to the destination operand (first operand). The source operand is an XMM register and the destination operand is an MMX technology register."
             };

         case "VMOVDQA":
         case "MOVDQA":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVDQA.html",
                 "html": "<p>128-bit versions:</p><p>Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers.</p><p>When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated. To move integer data to and from unaligned memory locations, use the VMOVDQU instruction.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers."
             };

         case "MOVDQU":
         case "VMOVDQU":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVDQU.html",
                 "html": "<p>128-bit versions:</p><p>Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers. When the source or destination operand is a memory operand, the operand may be unaligned on a 16-byte boundary without causing a general-protection exception (#GP) to be generated.<sup>1</sup></p><p>To move a double quadword to or from memory locations that are known to be aligned on 16-byte boundaries, use the MOVDQA instruction.</p><p>While executing in 16-bit addressing mode, a linear address for a 128-bit data access that overlaps the end of a 16-bit segment is not allowed and is defined as reserved behavior. A specific processor implementation may or may not generate a general-protection exception (#GP) in this situation, and the address that spans the end of the segment may or may not wrap around to the beginning of the segment.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers. When the source or destination operand is a memory operand, the operand may be unaligned on a 16-byte boundary without causing a general-protection exception (#GP) to be generated.1"
             };

         case "MOVHLPS":
         case "VMOVHLPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVHLPS.html",
                 "html": "<p>This instruction cannot be used for memory to register moves.</p><p><strong>128-bit two-argument form:</strong></p><p>Moves two packed single-precision floating-point values from the high quadword of the second XMM argument (second operand) to the low quadword of the first XMM register (first argument). The high quadword of the desti-nation operand is left unchanged. Bits (VLMAX-1:64) of the corresponding YMM destination register are unmodi-fied.</p><p><strong>128-bit three-argument form</strong></p><p>Moves two packed single-precision floating-point values from the high quadword of the third XMM argument (third operand) to the low quadword of the destination (first operand). Copies the high quadword from the second XMM argument (second operand) to the high quadword of the destination (first operand). Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction cannot be used for memory to register moves."
             };

         case "MOVHPD":
         case "VMOVHPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVHPD.html",
                 "html": "<p>This instruction cannot be used for register to register or memory to memory moves.</p><p><strong>128-bit Legacy SSE load:</strong></p><p>Moves a double-precision floating-point value from the source 64-bit memory operand and stores it in the high 64-bits of the destination XMM register. The lower 64bits of the XMM register are preserved. The upper 128-bits of the corresponding YMM destination register are preserved.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p><strong>VEX.128 encoded load:</strong></p>",
                 "tooltip": "This instruction cannot be used for register to register or memory to memory moves."
             };

         case "VMOVHPS":
         case "MOVHPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVHPS.html",
                 "html": "<p>This instruction cannot be used for register to register or memory to memory moves.</p><p><strong>128-bit Legacy SSE load:</strong></p><p>Moves two packed single-precision floating-point values from the source 64-bit memory operand and stores them in the high 64-bits of the destination XMM register. The lower 64bits of the XMM register are preserved. The upper 128-bits of the corresponding YMM destination register are preserved.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p><strong>VEX.128 encoded load:</strong></p>",
                 "tooltip": "This instruction cannot be used for register to register or memory to memory moves."
             };

         case "MOVLHPS":
         case "VMOVLHPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVLHPS.html",
                 "html": "<p>This instruction cannot be used for memory to register moves.</p><p><strong>128-bit two-argument form:</strong></p><p>Moves two packed single-precision floating-point values from the low quadword of the second XMM argument (second operand) to the high quadword of the first XMM register (first argument). The low quadword of the desti-nation operand is left unchanged. The upper 128 bits of the corresponding YMM destination register are unmodi-fied.</p><p><strong>128-bit three-argument form</strong></p><p>Moves two packed single-precision floating-point values from the low quadword of the third XMM argument (third operand) to the high quadword of the destination (first operand). Copies the low quadword from the second XMM argument (second operand) to the low quadword of the destination (first operand). The upper 128-bits of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction cannot be used for memory to register moves."
             };

         case "MOVLPD":
         case "VMOVLPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVLPD.html",
                 "html": "<p>This instruction cannot be used for register to register or memory to memory moves.</p><p><strong>128-bit Legacy SSE load:</strong></p><p>Moves a double-precision floating-point value from the source 64-bit memory operand and stores it in the low 64-bits of the destination XMM register. The upper 64bits of the XMM register are preserved. The upper 128-bits of the corresponding YMM destination register are preserved.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p><strong>VEX.128 encoded load:</strong></p>",
                 "tooltip": "This instruction cannot be used for register to register or memory to memory moves."
             };

         case "VMOVLPS":
         case "MOVLPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVLPS.html",
                 "html": "<p>This instruction cannot be used for register to register or memory to memory moves.</p><p><strong>128-bit Legacy SSE load:</strong></p><p>Moves two packed single-precision floating-point values from the source 64-bit memory operand and stores them in the low 64-bits of the destination XMM register. The upper 64bits of the XMM register are preserved. The upper 128-bits of the corresponding YMM destination register are preserved.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p><strong>VEX.128 encoded load:</strong></p>",
                 "tooltip": "This instruction cannot be used for register to register or memory to memory moves."
             };

         case "VMOVMSKPD":
         case "MOVMSKPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVMSKPD.html",
                 "html": "<p>Extracts the sign bits from the packed double-precision floating-point values in the source operand (second operand), formats them into a 2-bit mask, and stores the mask in the destination operand (first operand). The source operand is an XMM register, and the destination operand is a general-purpose register. The mask is stored in the 2 low-order bits of the destination operand. The upper bits of the destination operand beyond the mask are filled with zeros.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. The default operand size is 64-bit in 64-bit mode.</p><p>128-bit versions: The source operand is an XMM register. The destination operand is a general purpose register.</p><p>VEX.256 encoded version: The source operand is a YMM register. The destination operand is a general purpose register.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Extracts the sign bits from the packed double-precision floating-point values in the source operand (second operand), formats them into a 2-bit mask, and stores the mask in the destination operand (first operand). The source operand is an XMM register, and the destination operand is a general-purpose register. The mask is stored in the 2 low-order bits of the destination operand. The upper bits of the destination operand beyond the mask are filled with zeros."
             };

         case "MOVMSKPS":
         case "VMOVMSKPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVMSKPS.html",
                 "html": "<p>Extracts the sign bits from the packed single-precision floating-point values in the source operand (second operand), formats them into a 4- or 8-bit mask, and stores the mask in the destination operand (first operand). The source operand is an XMM or YMM register, and the destination operand is a general-purpose register. The mask is stored in the 4 or 8 low-order bits of the destination operand. The upper bits of the destination operand beyond the mask are filled with zeros.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. The default operand size is 64-bit in 64-bit mode.</p><p>128-bit versions: The source operand is an XMM register. The destination operand is a general purpose register.</p><p>VEX.256 encoded version: The source operand is a YMM register. The destination operand is a general purpose register.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Extracts the sign bits from the packed single-precision floating-point values in the source operand (second operand), formats them into a 4- or 8-bit mask, and stores the mask in the destination operand (first operand). The source operand is an XMM or YMM register, and the destination operand is a general-purpose register. The mask is stored in the 4 or 8 low-order bits of the destination operand. The upper bits of the destination operand beyond the mask are filled with zeros."
             };

         case "MOVNTDQ":
         case "VMOVNTDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTDQ.html",
                 "html": "<p>Moves the packed integers in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain integer data (packed bytes, words, doublewords, or quadwords). The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>Note: In VEX-128 encoded versions, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0; otherwise instructions will #UD.</p>",
                 "tooltip": "Moves the packed integers in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain integer data (packed bytes, words, doublewords, or quadwords). The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated."
             };

         case "VMOVNTDQA":
         case "MOVNTDQA":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTDQA.html",
                 "html": "<p>(V)MOVNTDQA loads a double quadword from the source operand (second operand) to the destination operand (first operand) using a non-temporal hint. A processor implementation may make use of the non-temporal hint associated with this instruction if the memory source is WC (write combining) memory type. An implementation may also make use of the non-temporal hint associated with this instruction if the memory source is WB (write back) memory type.</p><p>A processor\u2019s implementation of the non-temporal hint does not override the effective memory type semantics, but the implementation of the hint is processor dependent. For example, a processor implementation may choose to ignore the hint and process the instruction as a normal MOVDQA for any memory type. Another implementation of the hint for WC memory type may optimize data transfer throughput of WC reads. A third implementation may optimize cache reads generated by (V)MOVNTDQA on WB memory type to reduce cache evictions.</p><p><strong>WC Streaming Load Hint</strong></p><p>For WC memory type in particular, the processor never appears to read the data into the cache hierarchy. Instead, the non-temporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. Any memory-type aliased lines in the cache will be snooped and flushed. Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. The temporary internal buffer may be flushed by the processor at any time for any reason, for example:</p><p>The memory type of the region being read can override the non-temporal hint, if the memory address specified for the non-temporal read is not a WC memory region. Information on non-temporal reads and writes can be found in Chapter 11, \u201cMemory Cache Control\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p>",
                 "tooltip": "(V)MOVNTDQA loads a double quadword from the source operand (second operand) to the destination operand (first operand) using a non-temporal hint. A processor implementation may make use of the non-temporal hint associated with this instruction if the memory source is WC (write combining) memory type. An implementation may also make use of the non-temporal hint associated with this instruction if the memory source is WB (write back) memory type."
             };

         case "MOVNTI":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTI.html",
                 "html": "<p>Moves the doubleword integer in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to minimize cache pollution during the write to memory. The source operand is a general-purpose register. The destination operand is a 32-bit memory location.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTI instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Moves the doubleword integer in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to minimize cache pollution during the write to memory. The source operand is a general-purpose register. The destination operand is a 32-bit memory location."
             };

         case "VMOVNTPD":
         case "MOVNTPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTPD.html",
                 "html": "<p>Moves the packed double-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain packed double-precision floating-point data. The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTPD instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>Note: In VEX-128 encoded versions, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0; otherwise instructions will #UD.</p>",
                 "tooltip": "Moves the packed double-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain packed double-precision floating-point data. The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated."
             };

         case "MOVNTPS":
         case "VMOVNTPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTPS.html",
                 "html": "<p>Moves the packed single-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain packed single-precision floating-point data. The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTPS instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b; otherwise instructions will #UD.</p>",
                 "tooltip": "Moves the packed single-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register or YMM register, which is assumed to contain packed single-precision floating-point data. The destination operand is a 128-bit or 256-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version) or 32-byte (VEX.256 encoded version) boundary; otherwise a general-protection exception (#GP) will be generated."
             };

         case "MOVNTQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVNTQ.html",
                 "html": "<p>Moves the quadword in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to minimize cache pollution during the write to memory. The source operand is an MMX technology register, which is assumed to contain packed integer data (packed bytes, words, or doublewords). The destination operand is a 64-bit memory location.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTQ instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Moves the quadword in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to minimize cache pollution during the write to memory. The source operand is an MMX technology register, which is assumed to contain packed integer data (packed bytes, words, or doublewords). The destination operand is a 64-bit memory location."
             };

         case "MOVQ2DQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVQ2DQ.html",
                 "html": "<p>Moves the quadword from the source operand (second operand) to the low quadword of the destination operand (first operand). The source operand is an MMX technology register and the destination operand is an XMM register.</p><p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the MOVQ2DQ instruction is executed.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Moves the quadword from the source operand (second operand) to the low quadword of the destination operand (first operand). The source operand is an MMX technology register and the destination operand is an XMM register."
             };

         case "VMOVSHDUP":
         case "MOVSHDUP":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVSHDUP.html",
                 "html": "<p>The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 16 bytes of data at memory location m128 are loaded and the single-precision elements in positions 1 and 3 are duplicated. When the register-register form of this operation is used, the same operation is performed but with data coming from the 128-bit source register. See Figure 3-25.</p><svg height=\"250.11\" viewbox=\"117.660000 439987.500000 373.920000 166.740000\" width=\"560.88\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"120.9096834\" x=\"223.0967\" y=\"440005.422533\">MOVSHDUP xmm1, xmm2/m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"24.4806\" x=\"444.2234\" y=\"440026.346133\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"29.4909628\" x=\"151.8219\" y=\"440031.241533\">[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"24.9538916\" x=\"232.129\" y=\"440031.241533\">[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"24.9538916\" x=\"310.1614\" y=\"440031.241533\">[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"20.4168204\" x=\"390.4582\" y=\"440031.241533\">[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"20.4086602\" x=\"444.2234\" y=\"440036.138333\">m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"47.0029824\" x=\"143.825\" y=\"440076.535933\">xmm1[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"42.878352\" x=\"223.8464\" y=\"440076.535933\">xmm1[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" 
textlength=\"42.878352\" x=\"301.8787\" y=\"440076.535933\">xmm1[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"38.7537216\" x=\"381.9103\" y=\"440076.535933\">xmm1[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"32.9508876\" x=\"444.2234\" y=\"440081.427433\">RESULT:</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"44.9406672\" x=\"144.825\" y=\"440086.328133\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"42.878352\" x=\"223.8568\" y=\"440086.328133\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"40.8160368\" x=\"302.8883\" y=\"440086.328133\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"36.6914064\" x=\"382.9103\" y=\"440086.328133\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"22.2120644\" x=\"444.2234\" y=\"440091.219733\">xmm1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"45.363516\" x=\"144.6106\" y=\"440096.120433\">m128[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"45.363516\" x=\"222.6431\" y=\"440096.120433\">m128[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"41.2388856\" x=\"302.674\" y=\"440096.120433\">m128[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.418400pt\" textlength=\"41.2388856\" x=\"380.707\" y=\"440096.120433\">m128[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"29.4909628\" x=\"151.8325\" y=\"440114.960333\">[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"24.9538916\" x=\"232.13947626\" 
y=\"440114.960333\">[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"24.9538916\" x=\"310.17138876\" y=\"440114.960333\">[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.160200pt\" textlength=\"20.4168204\" x=\"390.46775676\" y=\"440114.960333\">[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.120100pt\" textlength=\"26.8733591\" x=\"459.5796\" y=\"440145.415673\">OM15998</text>\n<rect height=\"142.294\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"367.211\" x=\"118.375\" y=\"439988.251\"></rect>\n<rect height=\"142.294\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"367.211\" x=\"118.375\" y=\"439988.251\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"361.653\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"283.62\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"127.555\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"205.588\" y=\"440014.644\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"361.653\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"283.62\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"127.555\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"205.588\" y=\"440063.988\"></rect>\n<rect height=\"27.541\" 
style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"361.653\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"283.62\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"127.555\" y=\"440014.644\"></rect>\n<rect height=\"27.541\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"205.588\" y=\"440014.644\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"361.653\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"283.62\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"127.555\" y=\"440063.988\"></rect>\n<rect height=\"39.016\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"78.032\" x=\"205.588\" y=\"440063.988\"></rect></svg><h3>Figure 3-25.  MOVSHDUP\u2014Move Packed Single-FP High and Duplicate</h3><p>In 64-bit mode, use of the REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 16 bytes of data at memory location m128 are loaded and the single-precision elements in positions 1 and 3 are duplicated. When the register-register form of this operation is used, the same operation is performed but with data coming from the 128-bit source register. See Figure 3-25."
             };

         case "MOVSLDUP":
         case "VMOVSLDUP":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVSLDUP.html",
                 "html": "<p>The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 16 bytes of data at memory location m128 are loaded and the single-precision elements in positions 0 and 2 are duplicated. When the register-register form of this operation is used, the same operation is performed but with data coming from the 128-bit source register.</p><p>See Figure 3-26.</p><svg height=\"245.25\" viewbox=\"118.380000 441586.740000 370.200000 163.500000\" width=\"555.3\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"117.2123953\" x=\"222.4403\" y=\"441604.322114\">MOVSLDUP xmm1, xmm2/m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"24.0009\" x=\"438.574\" y=\"441624.835714\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"28.9130842\" x=\"151.9024\" y=\"441629.635114\">[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"24.4649174\" x=\"230.6356\" y=\"441629.635114\">[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"24.4649174\" x=\"307.1387\" y=\"441629.635114\">[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"20.0167506\" x=\"385.8619\" y=\"441629.635114\">[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"20.0087503\" x=\"438.574\" y=\"441634.436014\">m128</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"46.081728\" x=\"144.0621\" y=\"441674.042014\">xmm1[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"42.03794\" x=\"222.5153\" y=\"441674.042014\">xmm1[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" 
textlength=\"42.03794\" x=\"299.0184\" y=\"441674.042014\">xmm1[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"37.994152\" x=\"377.4815\" y=\"441674.042014\">xmm1[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"32.3052114\" x=\"438.574\" y=\"441678.837714\">RESULT:</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"44.059834\" x=\"145.0425\" y=\"441683.642414\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"42.03794\" x=\"222.5255\" y=\"441683.642414\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"40.016046\" x=\"300.0083\" y=\"441683.642414\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"35.972258\" x=\"378.4619\" y=\"441683.642414\">xmm2/</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"21.7768166\" x=\"438.574\" y=\"441688.438114\">xmm1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"40.430607\" x=\"146.7924\" y=\"441693.242714\">m128[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"40.430607\" x=\"223.2957\" y=\"441693.242714\">m128[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"36.386819\" x=\"301.7581\" y=\"441693.242714\">m128[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.273000pt\" textlength=\"36.386819\" x=\"378.2619\" y=\"441693.242714\">m128[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"28.9130842\" x=\"151.9128\" y=\"441711.713414\">[127:96]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"24.4649174\" x=\"230.64615239\" 
y=\"441711.713414\">[95:64]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"24.4649174\" x=\"307.14902114\" y=\"441711.713414\">[63:32]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.000300pt\" textlength=\"20.0167506\" x=\"385.87197314\" y=\"441711.713414\">[31:0]</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.000200pt\" textlength=\"26.3468782\" x=\"453.6286\" y=\"441741.571909\">OM15999</text>\n<rect height=\"139.506\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"360.015\" x=\"119.111\" y=\"441587.487\"></rect>\n<rect height=\"139.506\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"360.015\" x=\"119.111\" y=\"441587.487\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"281.118\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"204.615\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"128.111\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"357.621\" y=\"441613.363\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"281.118\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"204.615\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"128.111\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"357.621\" y=\"441661.74\"></rect>\n<rect height=\"27.001\" 
style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"281.118\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"204.615\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"128.111\" y=\"441613.363\"></rect>\n<rect height=\"27.001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"357.621\" y=\"441613.363\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"281.118\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"204.615\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"128.111\" y=\"441661.74\"></rect>\n<rect height=\"38.252\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"76.503\" x=\"357.621\" y=\"441661.74\"></rect></svg><h3>Figure 3-26.  MOVSLDUP\u2014Move Packed Single-FP Low and Duplicate</h3><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "The linear address corresponds to the address of the least-significant byte of the referenced memory data. When a memory address is indicated, the 16 bytes of data at memory location m128 are loaded and the single-precision elements in positions 0 and 2 are duplicated. When the register-register form of this operation is used, the same operation is performed but with data coming from the 128-bit source register."
             };

         case "MOVSW":
         case "MOVSD":
         case "MOVSB":
         case "MOVS":
         case "MOVSQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVSQ.html",
                 "html": "<p>Moves the byte, word, or doubleword specified with the second operand (source operand) to the location specified with the first operand (destination operand). Both the source and destination operands are located in memory. The address of the source operand is read from the DS:ESI or the DS:SI registers (depending on the address-size attri-bute of the instruction, 32 or 16, respectively). The address of the destination operand is read from the ES:EDI or the ES:DI registers (again depending on the address-size attribute of the instruction). The DS segment may be overridden with a segment override prefix, but the ES segment cannot be overridden.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the MOVS mnemonic) allows the source and destination operands to be specified explicitly. Here, the source and destination operands should be symbols that indicate the size and location of the source value and the destination, respectively. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source and destination operand symbols must specify the correct <strong>type</strong> (size) of the operands (bytes, words, or doublewords), but they do not have to specify the correct <strong>location</strong>. The locations of the source and destination operands are always specified by the DS:(E)SI and ES:(E)DI registers, which must be loaded correctly before the move string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the MOVS instruc-tions. Here also DS:(E)SI and ES:(E)DI are assumed to be the source and destination operands, respectively. 
The size of the source and destination operands is selected with the mnemonic: MOVSB (byte move), MOVSW (word move), or MOVSD (doubleword move).</p><p>After the move operation, the (E)SI and (E)DI registers are incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)SI and (E)DI register are incre-</p><p>mented; if the DF flag is 1, the (E)SI and (E)DI registers are decremented.) The registers are incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.</p>",
                 "tooltip": "Moves the byte, word, or doubleword specified with the second operand (source operand) to the location specified with the first operand (destination operand). Both the source and destination operands are located in memory. The address of the source operand is read from the DS:ESI or the DS:SI registers (depending on the address-size attri-bute of the instruction, 32 or 16, respectively). The address of the destination operand is read from the ES:EDI or the ES:DI registers (again depending on the address-size attribute of the instruction). The DS segment may be overridden with a segment override prefix, but the ES segment cannot be overridden."
             };
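
/*
Illustrative sketch only, not part of the generated documentation data. It models
the index-register update described in the MOVS entry above. The names `movs`,
`movsMem` and `movsState` are hypothetical; segment registers are ignored and
memory is assumed to be a flat Uint8Array.

```javascript
// size selects the form: 1 = MOVSB, 2 = MOVSW, 4 = MOVSD.
function movs(mem, state, size) {
    // Copy `size` bytes from the address in ESI to the address in EDI.
    mem.copyWithin(state.edi, state.esi, state.esi + size);
    // DF = 0 increments both index registers, DF = 1 decrements them.
    const step = state.df ? -size : size;
    state.esi += step;
    state.edi += step;
    return state;
}

const movsMem = new Uint8Array([1, 2, 3, 4, 0, 0, 0, 0]);
const movsState = movs(movsMem, {esi: 0, edi: 4, df: 0}, 2); // one MOVSW step
```

After the call, one word has been copied and both index registers have advanced by 2.
*/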

         case "VMOVSS":
         case "MOVSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVSS.html",
                 "html": "<p>Moves a scalar single-precision floating-point value from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be XMM registers or 32-bit memory locations. This instruction can be used to move a single-precision floating-point value to and from the low doubleword of an XMM register and a 32-bit memory location, or to move a single-precision floating-point value between the low doublewords of two XMM registers. The instruction cannot be used to transfer data between memory locations.</p><p>For non-VEX encoded syntax and when the source and destination operands are XMM registers, the high double-words of the destination operand remains unchanged. When the source operand is a memory location and destina-tion operand is an XMM registers, the high doublewords of the destination operand is cleared to all 0s.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>VEX encoded instruction syntax supports two source operands and a destination operand if ModR/M.mod field is 11B. VEX.vvvv is used to encode the first source operand (the second operand). The low 128 bits of the destination operand stores the result of merging the low dword of the second source operand with three dwords in bits 127:32 of the first source operand. The upper bits of the destination operand are cleared.</p><p>Note: For the \u201cVMOVSS m32, xmm1\u201d (memory store form) instruction version, VEX.vvvv is reserved and must be 1111b otherwise instruction will #UD.</p>",
                 "tooltip": "Moves a scalar single-precision floating-point value from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be XMM registers or 32-bit memory locations. This instruction can be used to move a single-precision floating-point value to and from the low doubleword of an XMM register and a 32-bit memory location, or to move a single-precision floating-point value between the low doublewords of two XMM registers. The instruction cannot be used to transfer data between memory locations."
             };

         case "MOVSXD":
         case "MOVSX":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVSXD.html",
                 "html": "<p>Copies the contents of the source operand (register or memory location) to the destination operand (register) and sign extends the value to 16 or 32 bits (see Figure 7-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1</em>). The size of the converted value depends on the operand-size attribute.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Copies the contents of the source operand (register or memory location) to the destination operand (register) and sign extends the value to 16 or 32 bits (see Figure 7-6 in the Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1). The size of the converted value depends on the operand-size attribute."
             };

         case "MOVUPD":
         case "VMOVUPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVUPD.html",
                 "html": "<p><strong>128-bit versions:</strong></p><p>Moves a double quadword containing two packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, store the contents of an XMM register into a 128-bit memory location, or move data between two XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>When the source or destination operand is a memory operand, the operand may be unaligned on a 16-byte boundary without causing a general-protection exception (#GP) to be generated.<sup>1</sup></p>",
                 "tooltip": "Moves a double quadword containing two packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, store the contents of an XMM register into a 128-bit memory location, or move data between two XMM registers."
             };

         case "VMOVUPS":
         case "MOVUPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVUPS.html",
                 "html": "<p>128-bit versions: Moves a double quadword containing four packed single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, store the contents of an XMM register into a 128-bit memory location, or move data between two XMM registers.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>When the source or destination operand is a memory operand, the operand may be unaligned on a 16-byte boundary without causing a general-protection exception (#GP) to be generated.<sup>1</sup></p><p>To move packed single-precision floating-point values to and from memory locations that are known to be aligned on 16-byte boundaries, use the MOVAPS instruction.</p>",
                 "tooltip": "128-bit versions: Moves a double quadword containing four packed single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, store the contents of an XMM register into a 128-bit memory location, or move data between two XMM registers."
             };

         case "MOVZX":
             return {
                 "url": "http://www.felixcloutier.com/x86/MOVZX.html",
                 "html": "<p>Copies the contents of the source operand (register or memory location) to the destination operand (register) and zero extends the value. The size of the converted value depends on the operand-size attribute.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Copies the contents of the source operand (register or memory location) to the destination operand (register) and zero extends the value. The size of the converted value depends on the operand-size attribute."
             };
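
/*
Illustrative sketch only, not part of the generated documentation data. It contrasts
the zero extension described for MOVZX with the sign extension described for MOVSX,
for a 32-bit destination. The helper names `zeroExtend` and `signExtend` are
hypothetical, and `bits` is assumed to be the source width (8 or 16).

```javascript
function zeroExtend(value, bits) {
    // Keep the low `bits` bits; every higher bit of the result is 0.
    return value & ((1 << bits) - 1);
}

function signExtend(value, bits) {
    // Shift the source sign bit up to bit 31, then shift back arithmetically
    // so that the sign bit is copied into every higher bit.
    const shift = 32 - bits;
    return (value << shift) >> shift;
}

const zx = zeroExtend(0x80, 8);        // 0x00000080
const sx = signExtend(0x80, 8) >>> 0;  // 0xFFFFFF80 viewed as an unsigned register value
```

The same source byte yields different register contents because only MOVSX
replicates the top bit of the source.
*/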

         case "VMPSADBW":
         case "MPSADBW":
             return {
                 "url": "http://www.felixcloutier.com/x86/MPSADBW.html",
                 "html": "<p>(V)MPSADBW calculates packed word results of sum-absolute-difference (SAD) of unsigned bytes from two blocks of 32-bit dword elements, using two select fields in the immediate byte to select the offsets of the two blocks within the first source operand and the second operand. Packed SAD word results are calculated within each 128-bit lane. Each SAD word result is calculated between a stationary block_2 (whose offset within the second source operand is selected by a two bit select control, multiplied by 32 bits) and a sliding block_1 at consecutive byte-granular position within the first source operand. The offset of the first 32-bit block of block_1 is selectable using a one bit select control, multiplied by 32 bits.</p><p>128-bit Legacy SSE version: Imm8[1:0]*32 specifies the bit offset of block_2 within the second source operand. Imm[2]*32 specifies the initial bit offset of the block_1 within the first source operand. The first source operand and destination operand are the same. The first source and destination operands are XMM registers. The second source operand is either an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. Bits 7:3 of the immediate byte are ignored.</p><p>VEX.128 encoded version: Imm8[1:0]*32 specifies the bit offset of block_2 within the second source operand. Imm[2]*32 specifies the initial bit offset of the block_1 within the first source operand. The first source and desti-nation operands are XMM registers. The second source operand is either an XMM register or a 128-bit memory location. Bits (127:128) of the corresponding YMM register are zeroed. 
Bits 7:3 of the immediate byte are ignored.</p><p>VEX.256 encoded version: The sum-absolute-difference (SAD) operation is repeated 8 times for MPSADW between the same block_2 (fixed offset within the second source operand) and a variable block_1 (offset is shifted by 8 bits for each SAD operation) in the first source operand. Each 16-bit result of eight SAD operations between block_2 and block_1 is written to the respective word in the lower 128 bits of the destination operand.</p><p>Additionally, VMPSADBW performs another eight SAD operations on block_4 of the second source operand and block_3 of the first source operand. (Imm8[4:3]*32 + 128) specifies the bit offset of block_4 within the second source operand. (Imm[5]*32+128) specifies the initial bit offset of the block_3 within the first source operand. Each 16-bit result of eight SAD operations between block_4 and block_3 is written to the respective word in the upper 128 bits of the destination operand.</p>",
                 "tooltip": "(V)MPSADBW calculates packed word results of sum-absolute-difference (SAD) of unsigned bytes from two blocks of 32-bit dword elements, using two select fields in the immediate byte to select the offsets of the two blocks within the first source operand and the second operand. Packed SAD word results are calculated within each 128-bit lane. Each SAD word result is calculated between a stationary block_2 (whose offset within the second source operand is selected by a two bit select control, multiplied by 32 bits) and a sliding block_1 at consecutive byte-granular position within the first source operand. The offset of the first 32-bit block of block_1 is selectable using a one bit select control, multiplied by 32 bits."
             };
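
/*
Illustrative sketch only, not part of the generated documentation data. It models
the 128-bit form described in the MPSADBW entry above: a fixed 4-byte block_2 from
the second source is compared against eight sliding 4-byte windows of the first
source. The name `mpsadbw128` and the plain-array representation of the operands
(16 unsigned bytes each, index 0 being the least-significant byte) are assumptions.

```javascript
function mpsadbw128(src1, src2, imm8) {
    const block2 = (imm8 & 0x3) * 4;        // Imm8[1:0]*32 bits, as a byte offset into src2
    const block1 = ((imm8 >> 2) & 0x1) * 4; // Imm8[2]*32 bits, as a byte offset into src1
    const result = [];
    for (let i = 0; i < 8; i++) {           // eight word results; block_1 slides one byte each time
        let sad = 0;
        for (let j = 0; j < 4; j++) {
            sad += Math.abs(src1[block1 + i + j] - src2[block2 + j]);
        }
        result.push(sad);
    }
    return result;
}

const sadBytes = Array.from({length: 16}, (_, k) => k); // bytes 0, 1, ..., 15
const sadZeros = new Array(16).fill(0);
const sads = mpsadbw128(sadBytes, sadZeros, 0);
```

With an all-zero second source, each result is simply the sum of the four bytes in
the current window of the first source.
*/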

         case "MUL":
             return {
                 "url": "http://www.felixcloutier.com/x86/MUL.html",
                 "html": "<p>Performs an unsigned multiplication of the first operand (destination operand) and the second operand (source operand) and stores the result in the destination operand. The destination operand is an implied operand located in register AL, AX or EAX (depending on the size of the operand); the source operand is located in a general-purpose register or a memory location. The action of this instruction and the location of the result depends on the opcode and the operand size as shown in Table 3-66.</p><p>The result is stored in register AX, register pair DX:AX, or register pair EDX:EAX (depending on the operand size), with the high-order bits of the product contained in register AH, DX, or EDX, respectively. If the high-order bits of the product are 0, the CF and OF flags are cleared; otherwise, the flags are set.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to addi-tional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits.</p><p>See the summary chart at the beginning of this section for encoding data and limits.</p><h3>Table 3-66.  MUL Results</h3>",
                 "tooltip": "Performs an unsigned multiplication of the first operand (destination operand) and the second operand (source operand) and stores the result in the destination operand. The destination operand is an implied operand located in register AL, AX or EAX (depending on the size of the operand); the source operand is located in a general-purpose register or a memory location. The action of this instruction and the location of the result depends on the opcode and the operand size as shown in Table 3-66."
             };
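
/*
Illustrative sketch only, not part of the generated documentation data. It models
the byte form described in the MUL entry above (AX = AL * source, with CF and OF
set when the high half AH is non-zero). The helper name `mul8` and the returned
object shape are hypothetical.

```javascript
function mul8(al, src) {
    const ax = (al & 0xff) * (src & 0xff); // unsigned 8-bit by 8-bit product, fits in 16 bits
    const ah = ax >> 8;                    // high-order byte of the product
    const flag = ah !== 0 ? 1 : 0;         // CF and OF are set together
    return {ax: ax, ah: ah, al: ax & 0xff, cf: flag, of: flag};
}

const small = mul8(3, 4);        // product fits in AL, flags cleared
const large = mul8(0x10, 0x10);  // product 0x0100 spills into AH, flags set
```
*/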

         case "MULPD":
         case "VMULPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MULPD.html",
                 "html": "<p>Performs a SIMD multiply of the two or four packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD double-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the destination YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD multiply of the two or four packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD double-precision floating-point operation."
             };

         case "VMULPS":
         case "MULPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MULPS.html",
                 "html": "<p>Performs a SIMD multiply of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand. See Figure 10-5 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1</em>, for an illustration of a SIMD single-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the destination YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD multiply of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand. See Figure 10-5 in the Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1, for an illustration of a SIMD single-precision floating-point operation."
             };

         case "MULSD":
         case "VMULSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/MULSD.html",
                 "html": "<p>Multiplies the low double-precision floating-point value in the source operand (second operand) by the low double-precision floating-point value in the destination operand (first operand), and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the destination operand remains unchanged. See Figure 11-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustra-tion of a scalar double-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Multiplies the low double-precision floating-point value in the source operand (second operand) by the low double-precision floating-point value in the destination operand (first operand), and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the destination operand remains unchanged. See Figure 11-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustra-tion of a scalar double-precision floating-point operation."
             };

         case "VMULSS":
         case "MULSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/MULSS.html",
                 "html": "<p>Multiplies the low single-precision floating-point value from the source operand (second operand) by the low single-precision floating-point value in the destination operand (first operand), and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar single-precision floating-point operation.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Multiplies the low single-precision floating-point value from the source operand (second operand) by the low single-precision floating-point value in the destination operand (first operand), and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar single-precision floating-point operation."
             };

         case "MULX":
             return {
                 "url": "http://www.felixcloutier.com/x86/MULX.html",
                 "html": "<p>Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand (the third operand) and stores the low half of the result in the second destination (second operand), the high half of the result in the first destination operand (first operand), without reading or writing the arithmetic flags. This enables efficient programming where the software can interleave add with carry operations and multiplications.</p><p>If the first and second operand are identical, it will contain the high half of the multiplication result.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand (the third operand) and stores the low half of the result in the second destination (second operand), the high half of the result in the first destination operand (first operand), without reading or writing the arithmetic flags. This enables efficient programming where the software can interleave add with carry operations and multiplications."
             };
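
/*
Illustrative sketch only, not part of the generated documentation data. It models
the 32-bit form described in the MULX entry above: the implicit EDX operand is
multiplied by the explicit source, and the 64-bit product is split into a high
half (first destination) and a low half (second destination). No flags are
modelled because the instruction neither reads nor writes them. The helper name
`mulx32` is hypothetical, and BigInt is used only to hold the full 64-bit product.

```javascript
function mulx32(edx, src) {
    const product = BigInt(edx >>> 0) * BigInt(src >>> 0); // unsigned 32 x 32 -> 64 bits
    return {
        hi: Number(product >> 32n),          // goes to the first destination operand
        lo: Number(product & 0xffffffffn)    // goes to the second destination operand
    };
}

const mulxResult = mulx32(0xffffffff, 2); // product is 0x1_FFFFFFFE
```
*/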

         case "MWAIT":
             return {
                 "url": "http://www.felixcloutier.com/x86/MWAIT.html",
                 "html": "<p>MWAIT instruction provides hints to allow the processor to enter an implementation-dependent optimized state. There are two principal targeted usages: address-range monitor and advanced power management. Both usages of MWAIT require the use of the MONITOR instruction.</p><p>CPUID.01H:ECX.MONITOR[bit 3] indicates the availability of MONITOR and MWAIT in the processor. When set, MWAIT may be executed only at privilege level 0 (use at any other privilege level results in an invalid-opcode exception). The operating system or system BIOS may disable this instruction by using the IA32_MISC_ENABLE MSR; disabling MWAIT clears the CPUID feature flag and causes execution to generate an invalid-opcode excep-tion.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>ECX specifies optional extensions for the MWAIT instruction. EAX may contain hints such as the preferred opti-mized state the processor should enter. The first processors to implement MWAIT supported only the zero value for EAX and ECX. Later processors allowed setting ECX[0] to enable masked interrupts as break events for MWAIT (see below). Software can use the CPUID instruction to determine the extensions and hints supported by the processor.</p>",
                 "tooltip": "MWAIT instruction provides hints to allow the processor to enter an implementation-dependent optimized state. There are two principal targeted usages: address-range monitor and advanced power management. Both usages of MWAIT require the use of the MONITOR instruction."
             };

         case "NEG":
             return {
                 "url": "http://www.felixcloutier.com/x86/NEG.html",
                 "html": "<p>Replaces the value of operand (the destination operand) with its two's complement. (This operation is equivalent to subtracting the operand from 0.) The destination operand is located in a general-purpose register or a memory location.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Replaces the value of operand (the destination operand) with its two's complement. (This operation is equivalent to subtracting the operand from 0.) The destination operand is located in a general-purpose register or a memory location."
             };
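
/*
Illustrative sketch only, not part of the generated documentation data. It checks
the equivalence stated in the NEG entry above (two's complement is the same as
subtracting the operand from 0) for a 32-bit operand. The helper name `neg32` is
hypothetical; values are held as unsigned JavaScript numbers.

```javascript
function neg32(value) {
    return (~value + 1) >>> 0; // invert every bit, add 1, wrap to 32 bits
}

const negOne = neg32(1);               // 0xFFFFFFFF
const zeroMinusOne = (0 - 1) >>> 0;    // subtracting the operand from 0 gives the same bits
```
*/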

         case "NOP":
             return {
                 "url": "http://www.felixcloutier.com/x86/NOP.html",
                 "html": "<p>This instruction performs no operation. It is a one-byte or multi-byte NOP that takes up space in the instruction stream but does not impact machine context, except for the EIP register.</p><p>The multi-byte form of NOP is available on processors with model encoding:</p><p>The multi-byte NOP instruction does not alter the content of a register and will not issue a memory operation. The instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "This instruction performs no operation. It is a one-byte or multi-byte NOP that takes up space in the instruction stream but does not impact machine context, except for the EIP register."
             };

         case "NOT":
             return {
                 "url": "http://www.felixcloutier.com/x86/NOT.html",
                 "html": "<p>Performs a bitwise NOT operation (each 1 is set to 0, and each 0 is set to 1) on the destination operand and stores the result in the destination operand location. The destination operand can be a register or a memory location.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Performs a bitwise NOT operation (each 1 is set to 0, and each 0 is set to 1) on the destination operand and stores the result in the destination operand location. The destination operand can be a register or a memory location."
             };

         case "OR":
             return {
                 "url": "http://www.felixcloutier.com/x86/OR.html",
                 "html": "<p>Performs a bitwise inclusive OR operation between the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result of the OR instruction is set to 0 if both corresponding bits of the first and second operands are 0; otherwise, each bit is set to 1.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Performs a bitwise inclusive OR operation between the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result of the OR instruction is set to 0 if both corresponding bits of the first and second operands are 0; otherwise, each bit is set to 1."
             };

         case "VORPD":
         case "ORPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ORPD.html",
                 "html": "<p>Performs a bitwise logical OR of the two or four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical OR of the two or four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand."
             };

         case "ORPS":
         case "VORPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ORPS.html",
                 "html": "<p>Performs a bitwise logical OR of the four or eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical OR of the four or eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand."
             };

         case "OUT":
             return {
                 "url": "http://www.felixcloutier.com/x86/OUT.html",
                 "html": "<p>Copies the value from the second operand (source operand) to the I/O port specified with the destination operand (first operand). The source operand can be register AL, AX, or EAX, depending on the size of the port being accessed (8, 16, or 32 bits, respectively); the destination operand can be a byte immediate or the DX register. Using a byte immediate allows I/O port addresses 0 to 255 to be accessed; using the DX register allows I/O ports from 0 to 65,535 to be accessed.</p><p>The size of the I/O port being accessed is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.</p><p>At the machine code level, I/O instructions are shorter when accessing 8-bit I/O ports. Here, the upper eight bits of the port address will be 0.</p><p>This instruction is only useful for accessing I/O ports located in the processor\u2019s I/O address space. See Chapter 16, \u201cInput/Output,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on accessing I/O ports in the I/O address space.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Copies the value from the second operand (source operand) to the I/O port specified with the destination operand (first operand). The source operand can be register AL, AX, or EAX, depending on the size of the port being accessed (8, 16, or 32 bits, respectively); the destination operand can be a byte immediate or the DX register. Using a byte immediate allows I/O port addresses 0 to 255 to be accessed; using the DX register allows I/O ports from 0 to 65,535 to be accessed."
             };

         case "OUTSB":
         case "OUTS":
         case "OUTSD":
         case "OUTSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/OUTSD.html",
                 "html": "<p>Copies data from the source operand (second operand) to the I/O port specified with the destination operand (first operand). The source operand is a memory location, the address of which is read from either the DS:SI, DS:ESI or the RSI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The DS segment may be overridden with a segment override prefix.) The destination operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the OUTS mnemonic) allows the source and destination operands to be specified explicitly. Here, the source operand should be a symbol that indicates the size of the I/O port and the source address, and the destination operand must be DX. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the DS:(E)SI or RSI registers, which must be loaded correctly before the OUTS instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the OUTS instructions. Here also DS:(E)SI is assumed to be the source operand and DX is assumed to be the destination operand. The size of the I/O port is specified with the choice of mnemonic: OUTSB (byte), OUTSW (word), or OUTSD (doubleword).</p><p>After the byte, word, or doubleword is transferred from the memory location to the I/O port, the SI/ESI/RSI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the SI/ESI/RSI register is incremented; if the DF flag is 1, the SI/ESI/RSI register is decremented.) The SI/ESI/RSI register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.</p><p>The OUTS, OUTSB, OUTSW, and OUTSD instructions can be preceded by the REP prefix for block output of ECX bytes, words, or doublewords. See \u201cREP/REPE/REPZ/REPNE/REPNZ\u2014Repeat String Operation Prefix\u201d in this chapter for a description of the REP prefix. This instruction is only useful for accessing I/O ports located in the processor\u2019s I/O address space. See Chapter 16, \u201cInput/Output,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on accessing I/O ports in the I/O address space.</p>",
                 "tooltip": "Copies data from the source operand (second operand) to the I/O port specified with the destination operand (first operand). The source operand is a memory location, the address of which is read from either the DS:SI, DS:ESI or the RSI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The DS segment may be overridden with a segment override prefix.) The destination operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port."
             };

         case "PABSD":
         case "VPABSW":
         case "PABSB":
         case "VPABSD":
         case "PABSW":
         case "VPABSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PABSB:PABSW:PABSD.html",
                 "html": "<p>(V)PABSB/W/D computes the absolute value of each data element of the source operand (the second operand) and stores the UNSIGNED results in the destination operand (the first operand). (V)PABSB operates on signed bytes, (V)PABSW operates on 16-bit words, and (V)PABSD operates on signed 32-bit integers. The source operand can be an MMX register or a 64-bit memory location, or it can be an XMM register, a YMM register, a 128-bit memory location, or a 256-bit memory location. The destination operand can be an MMX, an XMM or a YMM register. Both operands can be MMX registers or XMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>128-bit Legacy SSE version: The source operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "(V)PABSB/W/D computes the absolute value of each data element of the source operand (the second operand) and stores the UNSIGNED results in the destination operand (the first operand). (V)PABSB operates on signed bytes, (V)PABSW operates on 16-bit words, and (V)PABSD operates on signed 32-bit integers. The source operand can be an MMX register or a 64-bit memory location, or it can be an XMM register, a YMM register, a 128-bit memory location, or a 256-bit memory location. The destination operand can be an MMX, an XMM or a YMM register. Both operands can be MMX registers or XMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "VPACKSSWB":
         case "PACKSSWB":
         case "PACKSSDW":
         case "VPACKSSDW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PACKSSWB:PACKSSDW.html",
                 "html": "<p>Converts packed signed word integers into packed signed byte integers (PACKSSWB) or converts packed signed doubleword integers into packed signed word integers (PACKSSDW), using saturation to handle overflow condi-tions. See Figure 4-2 for an example of the packing operation.</p><svg height=\"141.03\" viewbox=\"111.840000 484777.980010 379.199990 94.020000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"36.415134\" x=\"225.06\" y=\"484794.002844\">64-Bit SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"41.26458\" x=\"335.0996\" y=\"484795.502244\">64-Bit DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.019600pt\" textlength=\"44.4045252\" x=\"278.76\" y=\"484865.187972\">64-Bit DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"210.9\" y=\"484798.56\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"262.44\" y=\"484838.04\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"319.08\" y=\"484799.52\"></rect>\n<path d=\"M230.100000,484815.780000 L229.920000,484815.720000 L229.740000,484816.140000 L229.920000,484816.200000 \" style=\"stroke:black\"></path>\n<path d=\"M230.100000,484815.780000 L229.920000,484816.200000 L265.500000,484832.640000 L265.680000,484832.220000 \" style=\"stroke:black\"></path>\n<path d=\"M265.200000,484816.800000 L265.020000,484816.680000 L264.720000,484817.040000 L264.900000,484817.160000 \" style=\"stroke:black\"></path>\n<path d=\"M265.200000,484816.800000 L264.900000,484817.160000 L284.280000,484831.140000 L284.580000,484830.780000 \" style=\"stroke:black\"></path>\n<path d=\"M336.660000,484817.700000 L336.840000,484817.580000 L336.540000,484817.160000 L336.360000,484817.280000 \" 
style=\"stroke:black\"></path>\n<path d=\"M336.660000,484817.700000 L336.360000,484817.280000 L316.440000,484830.960000 L316.740000,484831.380000 \" style=\"stroke:black\"></path>\n<path d=\"M372.120000,484818.240000 L372.300000,484818.180000 L372.120000,484817.760000 L371.940000,484817.820000 \" style=\"stroke:black\"></path>\n<path d=\"M372.120000,484818.240000 L371.940000,484817.820000 L335.400000,484832.580000 L335.580000,484833.000000 \" style=\"stroke:black\"></path>\n<path d=\"M284.880000,484831.560000 L285.360000,484829.700000 L285.540000,484828.680000 L286.260000,484829.520000 L290.340000,484834.680000 L291.420000,484836.060000 L289.800000,484835.460000 L283.620000,484833.240000 L282.600000,484832.880000 L283.440000,484832.340000 L283.920000,484832.280000 L290.100000,484834.500000 L289.800000,484835.460000 L289.560000,484835.280000 L285.480000,484830.120000 L286.260000,484829.520000 L286.320000,484829.940000 L285.840000,484831.800000 \" style=\"stroke:black\"></path>\n<path d=\"M315.900000,484831.380000 L317.460000,484832.520000 L318.360000,484833.060000 L317.280000,484833.420000 L311.040000,484835.520000 L309.360000,484836.120000 L310.500000,484834.740000 L314.700000,484829.640000 L315.360000,484828.800000 L315.600000,484829.820000 L315.480000,484830.240000 L311.280000,484835.340000 L310.500000,484834.740000 L310.740000,484834.560000 L316.980000,484832.460000 L317.280000,484833.420000 L316.860000,484833.360000 L315.300000,484832.220000 \" style=\"stroke:black\"></path>\n<path d=\"M285.360000,484831.680000 L285.840000,484829.820000 L289.920000,484834.980000 L283.740000,484832.760000 \" style=\"stroke:black\"></path>\n<path d=\"M315.600000,484829.820000 L316.080000,484831.680000 L315.300000,484832.220000 L315.180000,484832.100000 L315.120000,484831.920000 L314.640000,484830.060000 \" style=\"stroke:black\"></path>\n<path d=\"M266.100000,484832.820000 L266.220000,484830.900000 L266.340000,484829.880000 L267.120000,484830.600000 L272.100000,484834.920000 
L273.420000,484836.060000 L271.680000,484835.820000 L265.200000,484834.800000 L264.180000,484834.620000 L264.900000,484833.900000 L265.380000,484833.780000 L271.860000,484834.800000 L271.680000,484835.820000 L271.440000,484835.640000 L266.460000,484831.320000 L267.120000,484830.600000 L267.300000,484831.020000 L267.180000,484832.940000 \" style=\"stroke:black\"></path>\n<path d=\"M315.600000,484831.800000 L317.160000,484832.940000 L310.920000,484835.040000 L315.120000,484829.940000 \" style=\"stroke:black\"></path>\n<path d=\"M334.740000,484832.880000 L336.000000,484834.320000 L336.720000,484835.100000 L335.640000,484835.220000 L329.100000,484835.880000 L327.300000,484836.060000 L328.740000,484834.980000 L333.960000,484830.960000 L334.800000,484830.300000 L334.860000,484831.320000 L334.620000,484831.740000 L329.400000,484835.760000 L328.740000,484834.980000 L329.040000,484834.860000 L335.580000,484834.200000 L335.640000,484835.220000 L335.280000,484835.040000 L334.020000,484833.600000 \" style=\"stroke:black\"></path>\n<path d=\"M284.520000,484830.780000 L284.280000,484831.140000 L285.240000,484831.860000 L285.480000,484831.500000 \" style=\"stroke:black\"></path>\n<path d=\"M284.580000,484830.780000 L284.760000,484830.900000 L284.460000,484831.260000 L284.280000,484831.140000 \" style=\"stroke:black\"></path>\n<path d=\"M266.640000,484832.880000 L266.760000,484830.960000 L271.740000,484835.280000 L265.260000,484834.260000 \" style=\"stroke:black\"></path>\n<path d=\"M316.680000,484831.320000 L316.440000,484830.960000 L315.480000,484831.620000 L315.720000,484831.980000 \" style=\"stroke:black\"></path>\n<path d=\"M316.740000,484831.380000 L316.560000,484831.500000 L316.260000,484831.080000 L316.440000,484830.960000 \" style=\"stroke:black\"></path>\n<path d=\"M283.440000,484832.340000 L285.060000,484831.260000 L285.840000,484831.800000 L285.780000,484831.980000 L285.660000,484832.100000 L284.040000,484833.180000 \" style=\"stroke:black\"></path>\n<path 
d=\"M334.380000,484833.240000 L335.640000,484834.680000 L329.100000,484835.340000 L334.320000,484831.320000 \" style=\"stroke:black\"></path>\n<path d=\"M334.860000,484831.320000 L334.920000,484833.240000 L334.020000,484833.600000 L333.900000,484833.420000 L333.840000,484833.240000 L333.780000,484831.320000 \" style=\"stroke:black\"></path>\n<path d=\"M265.680000,484832.160000 L265.440000,484832.640000 L266.520000,484833.120000 L266.760000,484832.640000 \" style=\"stroke:black\"></path>\n<path d=\"M265.680000,484832.220000 L265.860000,484832.280000 L265.680000,484832.700000 L265.500000,484832.640000 \" style=\"stroke:black\"></path>\n<path d=\"M264.900000,484833.900000 L266.280000,484832.520000 L267.180000,484832.940000 L267.120000,484833.120000 L265.620000,484834.620000 \" style=\"stroke:black\"></path>\n<path d=\"M335.580000,484833.000000 L335.340000,484832.520000 L334.320000,484833.000000 L334.500000,484833.480000 \" style=\"stroke:black\"></path>\n<path d=\"M335.580000,484833.000000 L335.400000,484833.060000 L335.220000,484832.640000 L335.400000,484832.580000 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"4.1895\" x=\"261.6002\" y=\"484810.082844\">C</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"4.9077\" x=\"227.22\" y=\"484810.382844\">D</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"6.093528\" x=\"303.9002\" y=\"484849.322644\">B\u2019</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"5.613132\" x=\"286.4399\" y=\"484849.622644\">C\u2019</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"6.270684\" x=\"267.9\" y=\"484849.862844\">D\u2019</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"6.091932\" x=\"321.8999\" y=\"484850.102244\">A\u2019</text>\n<text 
lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"4.71618\" x=\"333.06\" y=\"484810.322844\">B</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"4.97154\" x=\"370.5596\" y=\"484811.582844\">A</text></svg><h3>Figure 4-2. Operation of the PACKSSDW Instruction Using 64-bit Operands</h3><p>The (V)PACKSSWB instruction converts 4, 8 or 16 signed word integers from the destination operand (first operand) and 4, 8 or 16 signed word integers from the source operand (second operand) into 8, 16 or 32 signed byte integers and stores the result in the destination operand. If a signed word integer value is beyond the range of a signed byte integer (that is, greater than 7FH for a positive integer or greater than 80H for a negative integer), the saturated signed byte integer value of 7FH or 80H, respectively, is stored in the destination.</p><p>The (V)PACKSSDW instruction packs 2, 4 or 8 signed doublewords from the destination operand (first operand) and 2, 4 or 8 signed doublewords from the source operand (second operand) into 4, 8 or 16 signed words in the destination operand (see Figure 4-2). If a signed doubleword integer value is beyond the range of a signed word (that is, greater than 7FFFH for a positive integer or greater than 8000H for a negative integer), the saturated signed word integer value of 7FFFH or 8000H, respectively, is stored into the destination.</p>",
                 "tooltip": "Converts packed signed word integers into packed signed byte integers (PACKSSWB) or converts packed signed doubleword integers into packed signed word integers (PACKSSDW), using saturation to handle overflow conditions. See Figure 4-2 for an example of the packing operation."
             };

         case "VPACKUSDW":
         case "PACKUSDW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PACKUSDW.html",
                 "html": "<p>Converts packed signed doubleword integers into packed unsigned word integers using unsigned saturation to handle overflow conditions. If the signed doubleword value is beyond the range of an unsigned word (that is, greater than FFFFH or less than 0000H), the saturated unsigned word integer value of FFFFH or 0000H, respectively, is stored in the destination.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Converts packed signed doubleword integers into packed unsigned word integers using unsigned saturation to handle overflow conditions. If the signed doubleword value is beyond the range of an unsigned word (that is, greater than FFFFH or less than 0000H), the saturated unsigned word integer value of FFFFH or 0000H, respectively, is stored in the destination."
             };

         case "PACKUSWB":
         case "VPACKUSWB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PACKUSWB.html",
                 "html": "<p>Converts 4, 8 or 16 signed word integers from the destination operand (first operand) and 4, 8 or 16 signed word integers from the source operand (second operand) into 8, 16 or 32 unsigned byte integers and stores the result in the destination operand. (See Figure 4-2 for an example of the packing operation.) If a signed word integer value is beyond the range of an unsigned byte integer (that is, greater than FFH or less than 00H), the saturated unsigned byte integer value of FFH or 00H, respectively, is stored in the destination.</p><p>The PACKUSWB instruction operates on either 64-bit, 128-bit or 256-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location. In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Converts 4, 8 or 16 signed word integers from the destination operand (first operand) and 4, 8 or 16 signed word integers from the source operand (second operand) into 8, 16 or 32 unsigned byte integers and stores the result in the destination operand. (See Figure 4-2 for an example of the packing operation.) If a signed word integer value is beyond the range of an unsigned byte integer (that is, greater than FFH or less than 00H), the saturated unsigned byte integer value of FFH or 00H, respectively, is stored in the destination."
             };

         case "PADDD":
         case "VPADDW":
         case "VPADDB":
         case "PADDW":
         case "VPADDD":
         case "PADDB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PADDB:PADDW:PADDD.html",
                 "html": "<p>Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.</p><p>Adds the packed byte, word, doubleword, or quadword integers in the first source operand to the second source operand and stores the result in the destination operand. When a result is too large to be represented in the 8/16/32-bit integer (overflow), the result is wrapped around and the low bits are written to the destination element (that is, the carry is ignored).</p><p>Note that these instructions can operate on either unsigned or signed (two\u2019s complement notation) integers; however, they do not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of the values operated on.</p><p>These instructions can operate on either 64-bit, 128-bit or 256-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location. In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs."
             };

         case "VPADDQ":
         case "PADDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PADDQ.html",
                 "html": "<p>Adds the first operand (destination operand) to the second operand (source operand) and stores the result in the destination operand. The source operand can be a quadword integer stored in an MMX technology register or a 64-bit memory location, or it can be two packed quadword integers stored in an XMM register or a 128-bit memory location. The destination operand can be a quadword integer stored in an MMX technology register or two packed quadword integers stored in an XMM register. When packed quadword operands are used, a SIMD add is performed. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).</p><p>Note that the (V)PADDQ instruction can operate on either unsigned or signed (two\u2019s complement notation) integers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of the values operated on.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Adds the first operand (destination operand) to the second operand (source operand) and stores the result in the destination operand. The source operand can be a quadword integer stored in an MMX technology register or a 64-bit memory location, or it can be two packed quadword integers stored in an XMM register or a 128-bit memory location. The destination operand can be a quadword integer stored in an MMX technology register or two packed quadword integers stored in an XMM register. When packed quadword operands are used, a SIMD add is performed. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored)."
             };

         case "PADDSW":
         case "VPADDSW":
         case "PADDSB":
         case "VPADDSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PADDSB:PADDSW.html",
                 "html": "<p>Performs a SIMD add of the packed signed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following paragraphs.</p><p>The PADDSB instruction adds packed signed byte integers. When an individual byte result is beyond the range of a signed byte integer (that is, greater than 7FH or less than 80H), the saturated value of 7FH or 80H, respectively, is written to the destination operand.</p><p>The PADDSW instruction adds packed signed word integers. When an individual word result is beyond the range of a signed word integer (that is, greater than 7FFFH or less than 8000H), the saturated value of 7FFFH or 8000H, respectively, is written to the destination operand.</p><p>These instructions can operate on either 64-bit, 128-bit or 256-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX tech-nology register or a 64-bit memory location. In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD add of the packed signed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following paragraphs."
             };

         case "PADDUSW":
         case "VPADDUSB":
         case "VPADDUSW":
         case "PADDUSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PADDUSB:PADDUSW.html",
                 "html": "<p>Performs a SIMD add of the packed unsigned integers from the source operand (second operand) and the destina-tion operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.</p><p>The (V)PADDUSB instruction adds packed unsigned byte integers. When an individual byte result is beyond the range of an unsigned byte integer (that is, greater than FFH), the saturated value of FFH is written to the destina-tion operand.</p><p>The (V)PADDUSW instruction adds packed unsigned word integers. When an individual word result is beyond the range of an unsigned word integer (that is, greater than FFFFH), the saturated value of FFFFH is written to the destination operand.</p><p>These instructions can operate on either 64-bit, 128-bit or 256-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX tech-</p><p>nology register or a 64-bit memory location. In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD add of the packed unsigned integers from the source operand (second operand) and the destina-tion operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs."
             };

         case "PALIGNR":
         case "VPALIGNR":
             return {
                 "url": "http://www.felixcloutier.com/x86/PALIGNR.html",
                 "html": "<p>(V)PALIGNR concatenates the destination operand (the first operand) and the source operand (the second operand) into an intermediate composite, shifts the composite at byte granularity to the right by a constant imme-diate, and extracts the right-aligned result into the destination. The first and the second operands can be an MMX, XMM or a YMM register. The immediate value is considered unsigned. Immediate shift counts larger than the 2L (i.e. 32 for 128-bit operands, or 16 for 64-bit operands) produce a zero result. Both operands can be MMX regis-ters, XMM registers or YMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register and contains two 16-byte blocks. The second source operand is a YMM register or a 256-bit memory location containing two 16-byte block. The destination operand is a YMM register and contain two 16-byte results. The imm8[7:0] is the common shift count used for the two lower 16-byte block sources and the two upper 16-byte block sources. The low 16-byte block of the two source</p>",
                 "tooltip": "(V)PALIGNR concatenates the destination operand (the first operand) and the source operand (the second operand) into an intermediate composite, shifts the composite at byte granularity to the right by a constant imme-diate, and extracts the right-aligned result into the destination. The first and the second operands can be an MMX, XMM or a YMM register. The immediate value is considered unsigned. Immediate shift counts larger than the 2L (i.e. 32 for 128-bit operands, or 16 for 64-bit operands) produce a zero result. Both operands can be MMX regis-ters, XMM registers or YMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "VPAND":
         case "PAND":
             return {
                 "url": "http://www.felixcloutier.com/x86/PAND.html",
                 "html": "<p>Performs a bitwise logical AND operation on the first source operand and second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bits of the first and second operands are 1, otherwise it is set to 0.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical AND operation on the first source operand and second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bits of the first and second operands are 1, otherwise it is set to 0."
             };

         case "PANDN":
         case "VPANDN":
             return {
                 "url": "http://www.felixcloutier.com/x86/PANDN.html",
                 "html": "<p>Performs a bitwise logical NOT operation on the first source operand, then performs bitwise AND with second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corre-sponding bit in the first operand is 0 and the corresponding bit in the second operand is 1, otherwise it is set to 0.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical NOT operation on the first source operand, then performs bitwise AND with second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corre-sponding bit in the first operand is 0 and the corresponding bit in the second operand is 1, otherwise it is set to 0."
             };

         case "PAUSE":
             return {
                 "url": "http://www.felixcloutier.com/x86/PAUSE.html",
                 "html": "<p>Improves the performance of spin-wait loops. When executing a \u201cspin-wait loop,\u201d processors will suffer a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops.</p><p>An additional function of the PAUSE instruction is to reduce the power consumed by a processor while executing a spin loop. A processor can execute a spin-wait loop extremely quickly, causing the processor to consume a lot of power while it waits for the resource it is spinning on to become available. Inserting a pause instruction in a spin-wait loop greatly reduces the processor\u2019s power consumption.</p><p>This instruction was introduced in the Pentium 4 processors, but is backward compatible with all IA-32 processors. In earlier IA-32 processors, the PAUSE instruction operates like a NOP instruction. The Pentium 4 and Intel Xeon processors implement the PAUSE instruction as a delay. The delay is finite and can be zero for some processors. This instruction does not change the architectural state of the processor (that is, it performs essentially a delaying no-op operation).</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Improves the performance of spin-wait loops. When executing a \u201cspin-wait loop,\u201d processors will suffer a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops."
             };

         case "PAVGW":
         case "VPAVGW":
         case "VPAVGB":
         case "PAVGB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PAVGB:PAVGW.html",
                 "html": "<p>Performs a SIMD average of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the results in the destination operand. For each corresponding pair of data elements in the first and second operands, the elements are added together, a 1 is added to the temporary sum, and that result is shifted right one bit position.</p><p>The (V)PAVGB instruction operates on packed unsigned bytes and the (V)PAVGW instruction operates on packed unsigned words.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD average of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the results in the destination operand. For each corresponding pair of data elements in the first and second operands, the elements are added together, a 1 is added to the temporary sum, and that result is shifted right one bit position."
             };

         case "PBLENDVB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PBLENDVB.html",
                 "html": "<p>Conditionally copies byte elements from the source operand (second operand) to the destination operand (first operand) depending on mask bits defined in the implicit third register argument, XMM0. The mask bits are the most significant bit in each byte element of the XMM0 register.</p><p>If a mask bit is \u201c1\", then the corresponding byte element in the source operand is copied to the destination, else the byte element in the destination operand is left unchanged.</p><p>The register assignment of the implicit third operand is defined to be the architectural register XMM0.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand is the same. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute PBLENDVB with a VEX prefix will cause #UD.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (VLMAX-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.L must be 0, otherwise the instruction will #UD. VEX.W must be 0, otherwise, the instruction will #UD.</p>",
                 "tooltip": "Conditionally copies byte elements from the source operand (second operand) to the destination operand (first operand) depending on mask bits defined in the implicit third register argument, XMM0. The mask bits are the most significant bit in each byte element of the XMM0 register."
             };

         case "VPBLENDW":
         case "PBLENDW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PBLENDW.html",
                 "html": "<p>Words from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding word in the destination is copied from the source. If a bit in the mask, corresponding to a word, is \u201c1\", then the word is copied, else the word element in the destination operand is unchanged.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Words from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding word in the destination is copied from the source. If a bit in the mask, corresponding to a word, is \u201c1\", then the word is copied, else the word element in the destination operand is unchanged."
             };

         case "PCLMULQDQ":
         case "VPCLMULQDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCLMULQDQ.html",
                 "html": "<p>Performs a carry-less multiplication of two quadwords, selected from the first source and second source operand according to the value of the immediate byte. Bits 4 and 0 are used to select which 64-bit half of each operand to use according to Table 4-10, other bits of the immediate byte are ignored.</p><h3>Table 4-10.  PCLMULQDQ Quadword Selection of Immediate Byte</h3><table>\n<tr>\n<th>Imm[4]</th>\n<th>Imm[0]</th>\n<th>PCLMULQDQ Operation</th></tr>\n<tr>\n<td>0</td>\n<td>0</td>\n<td>CL_MUL( SRC2<sup>1</sup>[63:0], SRC1[63:0] )</td></tr>\n<tr>\n<td>0</td>\n<td>1</td>\n<td>CL_MUL( SRC2[63:0], SRC1[127:64] )</td></tr>\n<tr>\n<td>1</td>\n<td>0</td>\n<td>CL_MUL( SRC2[127:64], SRC1[63:0] )</td></tr>\n<tr>\n<td>1</td>\n<td>1</td>\n<td>CL_MUL( SRC2[127:64], SRC1[127:64] )</td></tr></table><p><strong>NOTES:</strong></p><p>1. SRC2 denotes the second source operand, which can be a register or memory; SRC1 denotes the first source and destination oper-</p>",
                 "tooltip": "Performs a carry-less multiplication of two quadwords, selected from the first source and second source operand according to the value of the immediate byte. Bits 4 and 0 are used to select which 64-bit half of each operand to use according to Table 4-10, other bits of the immediate byte are ignored."
             };

         case "VPCMPEQW":
         case "PCMPEQW":
         case "VPCMPEQD":
         case "PCMPEQB":
         case "PCMPEQD":
         case "VPCMPEQB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPEQB:PCMPEQW:PCMPEQD.html",
                 "html": "<p>Performs a SIMD compare for equality of the packed bytes, words, or doublewords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.</p><p>The (V)PCMPEQB instruction compares the corresponding bytes in the destination and source operands; the (V)PCMPEQW instruction compares the corresponding words in the destination and source operands; and the (V)PCMPEQD instruction compares the corresponding doublewords in the destination and source operands.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Performs a SIMD compare for equality of the packed bytes, words, or doublewords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s."
             };

         case "PCMPEQQ":
         case "VPCMPEQQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPEQQ.html",
                 "html": "<p>Performs an SIMD compare for equality of the packed quadwords in the destination operand (first operand) and the source operand (second operand).  If a pair of data elements is equal, the corresponding data element in the desti-nation is set to all 1s; otherwise, it is set to 0s.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs an SIMD compare for equality of the packed quadwords in the destination operand (first operand) and the source operand (second operand).  If a pair of data elements is equal, the corresponding data element in the desti-nation is set to all 1s; otherwise, it is set to 0s."
             };

         case "VPCMPESTRI":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPESTRI.html",
                 "html": "<p>The instruction compares and processes data from two string fragments based on the encoded value in the Imm8 Control Byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMP-ISTRM\u201d), and generates an index stored to the count register (ECX/RCX).</p><p>Each string fragment is represented by two values. The first value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data).  The second value is stored in an input length register. The input length register is EAX/RAX (for xmm1) or EDX/RDX (for xmm2/m128). The length repre-sents the number of bytes/words which are valid for the respective xmm/m128 data.</p><p>The length of each input is interpreted as being the absolute-value of the value in the length register. The absolute-value computation saturates to 16 (for bytes) and 8 (for words), based on the value of imm8[bit3] when the value in the length register is greater than 16 (8) or less than -16 (-8).</p><p>The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). The index of the first (or last, according to imm8[6]) set bit of IntRes2 (see Section 4.1.4) is returned in ECX. If no bits are set in IntRes2, ECX is set to 16 (8).</p><p>Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant informa-tion:</p>",
                 "tooltip": "The instruction compares and processes data from two string fragments based on the encoded value in the Imm8 Control Byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMP-ISTRM\u201d), and generates an index stored to the count register (ECX/RCX)."
             };

         case "VPCMPESTRM":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPESTRM.html",
                 "html": "<p>The instruction compares data from two string fragments based on the encoded value in the imm8 contol byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d), and gener-ates a mask stored to XMM0.</p><p>Each string fragment is represented by two values. The first value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data). The second value is stored in an input length register. The input length register is EAX/RAX (for xmm1) or EDX/RDX (for xmm2/m128). The length repre-sents the number of bytes/words which are valid for the respective xmm/m128 data.</p><p>The length of each input is interpreted as being the absolute-value of the value in the length register. The absolute-value computation saturates to 16 (for bytes) and 8 (for words), based on the value of imm8[bit3] when the value in the length register is greater than 16 (8) or less than -16 (-8).</p><p>The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). As defined by imm8[6], IntRes2 is then either stored to the least significant bits of XMM0 (zero extended to 128 bits) or expanded into a byte/word-mask and then stored to XMM0.</p><p>Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant informa-tion:</p>",
                 "tooltip": "The instruction compares data from two string fragments based on the encoded value in the imm8 contol byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d), and gener-ates a mask stored to XMM0."
             };

         case "PCMPGTW":
         case "VPCMPGTB":
         case "VPCMPGTD":
         case "PCMPGTB":
         case "PCMPGTD":
         case "VPCMPGTW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPGTB:PCMPGTW:PCMPGTD.html",
                 "html": "<p>Performs an SIMD signed compare for the greater value of the packed byte, word, or doubleword integers in the destination operand (first operand) and the source operand (second operand). If a data element in the destination operand is greater than the corresponding date element in the source operand, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.</p><p>The PCMPGTB instruction compares the corresponding signed byte integers in the destination and source oper-ands; the PCMPGTW instruction compares the corresponding signed word integers in the destination and source</p><p>operands; and the PCMPGTD instruction compares the corresponding signed doubleword integers in the destina-tion and source operands.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p>",
                 "tooltip": "Performs an SIMD signed compare for the greater value of the packed byte, word, or doubleword integers in the destination operand (first operand) and the source operand (second operand). If a data element in the destination operand is greater than the corresponding date element in the source operand, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s."
             };

         case "VPCMPGTQ":
         case "PCMPGTQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPGTQ.html",
                 "html": "<p>Performs an SIMD signed compare for the packed quadwords in the destination operand (first operand) and the source operand (second operand). If the data element in the first (destination) operand is greater than the corresponding element in the second (source) operand, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source operand and destination operand are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source operand and destination operand are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs an SIMD signed compare for the packed quadwords in the destination operand (first operand) and the source operand (second operand). If the data element in the first (destination) operand is greater than the corresponding element in the second (source) operand, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s."
             };

         case "VPCMPISTRI":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPISTRI.html",
                 "html": "<p>The instruction compares data from two strings based on the encoded value in the Imm8 Control Byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d), and generates an index stored to ECX.</p><p>Each string is represented by a single value.  The value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data).  Each input byte/word is augmented with a valid/invalid tag.  A byte/word is considered valid only if it has a lower index than the least significant null byte/word.  (The least significant null byte/word is also considered invalid.)</p><p>The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). The index of the first (or last, according to imm8[6]) set bit of IntRes2 is returned in ECX. If no bits are set in IntRes2, ECX is set to 16 (8).</p><p>Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant informa-tion:</p><p>CFlag \u2013 Reset if IntRes2 is equal to zero, set otherwise</p>",
                 "tooltip": "The instruction compares data from two strings based on the encoded value in the Imm8 Control Byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d), and generates an index stored to ECX."
             };

         case "VPCMPISTRM":
             return {
                 "url": "http://www.felixcloutier.com/x86/PCMPISTRM.html",
                 "html": "<p>The instruction compares data from two strings based on the encoded value in the imm8 byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d) generating a mask stored to XMM0.</p><p>Each string is represented by a single value. The value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data).  Each input byte/word is augmented with a valid/invalid tag.  A byte/word is considered valid only if it has a lower index than the least significant null byte/word.  (The least significant null byte/word is also considered invalid.)</p><p>The comparison and aggregation operation are performed according to the encoded value of Imm8 bit fields (see Section 4.1). As defined by imm8[6], IntRes2 is then either stored to the least significant bits of XMM0 (zero extended to 128 bits) or expanded into a byte/word-mask and then stored to XMM0.</p><p>Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant informa-tion:</p><p>CFlag \u2013 Reset if IntRes2 is equal to zero, set otherwise</p>",
                 "tooltip": "The instruction compares data from two strings based on the encoded value in the imm8 byte (see Section 4.1, \u201cImm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM\u201d) generating a mask stored to XMM0."
             };

         case "PDEP":
             return {
                 "url": "http://www.felixcloutier.com/x86/PDEP.html",
                 "html": "<p>PDEP uses a mask in the second source operand (the third operand) to transfer/scatter contiguous low order bits in the first source operand (the second operand) into the destination (the first operand). PDEP takes the low bits from the first source operand and deposit them in the destination operand at the corresponding bit locations that are set in the second source operand (mask). All other bits (bits not set in mask) in destination are set to zero.</p><p>SRC1</p><p>S<sub>31 </sub>S<sub>30 </sub>S<sub>29 </sub>S<sub>28 </sub>S<sub>27</sub></p><p>S<sub>7</sub></p><p>S<sub>6</sub></p>",
                 "tooltip": "PDEP uses a mask in the second source operand (the third operand) to transfer/scatter contiguous low order bits in the first source operand (the second operand) into the destination (the first operand). PDEP takes the low bits from the first source operand and deposit them in the destination operand at the corresponding bit locations that are set in the second source operand (mask). All other bits (bits not set in mask) in destination are set to zero."
             };

         case "PEXT":
             return {
                 "url": "http://www.felixcloutier.com/x86/PEXT.html",
                 "html": "<p>PEXT uses a mask in the second source operand (the third operand) to transfer either contiguous or non-contig-uous bits in the first source operand (the second operand) to contiguous low order bit positions in the destination (the first operand). For each bit set in the MASK, PEXT extracts the corresponding bits from the first source operand and writes them into contiguous lower bits of destination operand. The remaining upper bits of destination are zeroed.</p><p>SRC1</p><p>S<sub>31 </sub>S<sub>30 </sub>S<sub>29 </sub>S<sub>28 </sub>S<sub>27</sub></p><p>S<sub>7</sub></p><p>S<sub>6</sub></p>",
                 "tooltip": "PEXT uses a mask in the second source operand (the third operand) to transfer either contiguous or non-contig-uous bits in the first source operand (the second operand) to contiguous low order bit positions in the destination (the first operand). For each bit set in the MASK, PEXT extracts the corresponding bits from the first source operand and writes them into contiguous lower bits of destination operand. The remaining upper bits of destination are zeroed."
             };

         case "VPEXTRB":
         case "PEXTRQ":
         case "VPEXTRD":
         case "PEXTRD":
         case "VPEXTRQ":
         case "PEXTRB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PEXTRB:PEXTRD:PEXTRQ.html",
                 "html": "<p>Extract a byte/dword/qword integer value from the source XMM register at a byte/dword/qword offset determined from imm8[3:0]. The destination can be a register or byte/dword/qword memory location. If the destination is a register, the upper bits of the register are zero extended.</p><p>In legacy non-VEX encoded version and if the destination operand is a register, the default operand size in 64-bit mode for PEXTRB/PEXTRD is 64 bits, the bits above the least significant byte/dword data are filled with zeros. PEXTRQ is not encodable in non-64-bit modes and requires REX.W in 64-bit mode.</p><p>Note: In VEX.128 encoded versions, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD. If the destination operand is a register, the default operand size in 64-bit mode for VPEXTRB/VPEXTRD is 64 bits, the bits above the least significant byte/word/dword data are filled with zeros. Attempt to execute VPEXTRQ in non-64-bit mode will cause #UD.</p>",
                 "tooltip": "Extract a byte/dword/qword integer value from the source XMM register at a byte/dword/qword offset determined from imm8[3:0]. The destination can be a register or byte/dword/qword memory location. If the destination is a register, the upper bits of the register are zero extended."
             };

         case "PEXTRW":
         case "VPEXTRW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PEXTRW.html",
                 "html": "<p>Copies the word in the source operand (second operand) specified by the count operand (third operand) to the destination operand (first operand). The source operand can be an MMX technology register or an XMM register. The destination operand can be the low word of a general-purpose register or a 16-bit memory address. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-signifi-cant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the loca-tion. The content of the destination register above bit 16 is cleared (set to all 0s).</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). If the destination operand is a general-purpose register, the default operand size is 64-bits in 64-bit mode.</p><p>Note: In VEX.128 encoded versions, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD. If the destination operand is a register, the default operand size in 64-bit mode for VPEXTRW is 64 bits, the bits above the least significant byte/word/dword data are filled with zeros.</p>",
                 "tooltip": "Copies the word in the source operand (second operand) specified by the count operand (third operand) to the destination operand (first operand). The source operand can be an MMX technology register or an XMM register. The destination operand can be the low word of a general-purpose register or a 16-bit memory address. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-signifi-cant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the loca-tion. The content of the destination register above bit 16 is cleared (set to all 0s)."
             };

         case "PHADDSW":
         case "VPHADDSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PHADDSW.html",
                 "html": "<p>(V)PHADDSW adds two adjacent signed 16-bit integers horizontally from the source and destination operands and saturates the signed results; packs the signed, saturated 16-bit results to the destination operand (first operand) When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "(V)PHADDSW adds two adjacent signed 16-bit integers horizontally from the source and destination operands and saturates the signed results; packs the signed, saturated 16-bit results to the destination operand (first operand) When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "PHADDD":
         case "PHADDW":
         case "VPHADDD":
         case "VPHADDW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PHADDW:PHADDD.html",
                 "html": "<p>(V)PHADDW adds two adjacent 16-bit signed integers horizontally from the source and destination operands and packs the 16-bit signed results to the destination operand (first operand). (V)PHADDD adds two adjacent 32-bit signed integers horizontally from the source and destination operands and packs the 32-bit signed results to the destination operand (first operand). When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Note that these instructions can operate on either unsigned or signed (two\u2019s complement notation) integers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of the values operated on.</p><p>Legacy SSE instructions: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p>",
                 "tooltip": "(V)PHADDW adds two adjacent 16-bit signed integers horizontally from the source and destination operands and packs the 16-bit signed results to the destination operand (first operand). (V)PHADDD adds two adjacent 32-bit signed integers horizontally from the source and destination operands and packs the 32-bit signed results to the destination operand (first operand). When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "PHMINPOSUW":
         case "VPHMINPOSUW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PHMINPOSUW.html",
                 "html": "<p>Determine the minimum unsigned word value in the source operand (second operand) and place the unsigned word in the low word (bits 0-15) of the destination operand (first operand).  The word index of the minimum value is stored in bits 16-18 of the destination operand.  The remaining upper bits of the destination are set to zero.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Determine the minimum unsigned word value in the source operand (second operand) and place the unsigned word in the low word (bits 0-15) of the destination operand (first operand).  The word index of the minimum value is stored in bits 16-18 of the destination operand.  The remaining upper bits of the destination are set to zero."
             };

         case "PHSUBSW":
         case "VPHSUBSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PHSUBSW.html",
                 "html": "<p>(V)PHSUBSW performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands. The signed, saturated 16-bit results are packed to the destination operand (first operand). When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "(V)PHSUBSW performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands. The signed, saturated 16-bit results are packed to the destination operand (first operand). When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "PHSUBD":
         case "VPHSUBW":
         case "PHSUBW":
         case "VPHSUBD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PHSUBW:PHSUBD.html",
                 "html": "<p>(V)PHSUBW performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands, and packs the signed 16-bit results to the destination operand (first operand). (V)PHSUBD performs horizontal subtraction on each adjacent pair of 32-bit signed integers by subtracting the most significant doubleword from the least signifi-cant doubleword of each pair, and packs the signed 32-bit result to the destination operand. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "(V)PHSUBW performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands, and packs the signed 16-bit results to the destination operand (first operand). (V)PHSUBD performs horizontal subtraction on each adjacent pair of 32-bit signed integers by subtracting the most significant doubleword from the least signifi-cant doubleword of each pair, and packs the signed 32-bit result to the destination operand. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "VPINSRB":
         case "PINSRQ":
         case "VPINSRD":
         case "PINSRD":
         case "VPINSRQ":
         case "PINSRB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PINSRB:PINSRD:PINSRQ.html",
                 "html": "<p>Copies a byte/dword/qword from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other elements in the desti-nation register are left untouched.) The source operand can be a general-purpose register or a memory location. (When the source operand is a general-purpose register, PINSRB copies the low byte of the register.) The destina-tion operand is an XMM register. The count operand is an 8-bit immediate. When specifying a qword[dword, byte] location in an an XMM register, the [2, 4] least-significant bit(s) of the count operand specify the location.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). Use of REX.W permits the use of 64 bit general purpose registers.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, other-wise the instruction will #UD. Attempt to execute VPINSRQ in non-64-bit mode will cause #UD.</p>",
                 "tooltip": "Copies a byte/dword/qword from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other elements in the desti-nation register are left untouched.) The source operand can be a general-purpose register or a memory location. (When the source operand is a general-purpose register, PINSRB copies the low byte of the register.) The destina-tion operand is an XMM register. The count operand is an 8-bit immediate. When specifying a qword[dword, byte] location in an an XMM register, the [2, 4] least-significant bit(s) of the count operand specify the location."
             };

         case "PINSRW":
         case "VPINSRW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PINSRW.html",
                 "html": "<p>Copies a word from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other words in the destination register are left untouched.) The source operand can be a general-purpose register or a 16-bit memory location. (When the source operand is a general-purpose register, the low word of the register is copied.) The destination operand can be an MMX technology register or an XMM register. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-significant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the location.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, other-wise the instruction will #UD.</p>",
                 "tooltip": "Copies a word from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other words in the destination register are left untouched.) The source operand can be a general-purpose register or a 16-bit memory location. (When the source operand is a general-purpose register, the low word of the register is copied.) The destination operand can be an MMX technology register or an XMM register. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-significant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the location."
             };

         case "VPMADDUBSW":
         case "PMADDUBSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMADDUBSW.html",
                 "html": "<p>(V)PMADDUBSW multiplies vertically each unsigned byte of the destination operand (first operand) with the corre-sponding signed byte of the source operand (second operand), producing intermediate signed 16-bit integers. Each adjacent pair of signed words is added and the saturated result is packed to the destination operand. For example, the lowest-order bytes (bits 7-0) in the source and destination operands are multiplied and the intermediate signed word result is added with the corresponding intermediate result from the 2nd lowest-order bytes (bits 15-8) of the operands; the sign-saturated result is stored in the lowest word of the destination register (15-0). The same oper-ation is performed on the other pairs of adjacent bytes. Both operands can be MMX register or XMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The first source and destination operands are YMM registers. The second source operand can be an YMM register or a 256-bit memory location.</p>",
                 "tooltip": "(V)PMADDUBSW multiplies vertically each unsigned byte of the destination operand (first operand) with the corre-sponding signed byte of the source operand (second operand), producing intermediate signed 16-bit integers. Each adjacent pair of signed words is added and the saturated result is packed to the destination operand. For example, the lowest-order bytes (bits 7-0) in the source and destination operands are multiplied and the intermediate signed word result is added with the corresponding intermediate result from the 2nd lowest-order bytes (bits 15-8) of the operands; the sign-saturated result is stored in the lowest word of the destination register (15-0). The same oper-ation is performed on the other pairs of adjacent bytes. Both operands can be MMX register or XMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };

         case "VPMADDWD":
         case "PMADDWD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMADDWD.html",
                 "html": "<p>Multiplies the individual signed words of the destination operand (first operand) by the corresponding signed words of the source operand (second operand), producing temporary signed, doubleword results. The adjacent double-word results are then summed and stored in the destination operand. For example, the corresponding low-order words (15-0) and (31-16) in the source and destination operands are multiplied by one another and the double-word results are added together and stored in the low doubleword of the destination register (31-0). The same operation is performed on the other pairs of adjacent words. (Figure 4-7 shows this operation when using 64-bit operands).</p><p>The (V)PMADDWD instruction wraps around only in one situation: when the 2 pairs of words being operated on in a group are all 8000H. In this case, the result wraps around to 80000000H.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The first source and destination operands are MMX registers. The second source operand is an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p>",
                 "tooltip": "Multiplies the individual signed words of the destination operand (first operand) by the corresponding signed words of the source operand (second operand), producing temporary signed, doubleword results. The adjacent double-word results are then summed and stored in the destination operand. For example, the corresponding low-order words (15-0) and (31-16) in the source and destination operands are multiplied by one another and the double-word results are added together and stored in the low doubleword of the destination register (31-0). The same operation is performed on the other pairs of adjacent words. (Figure 4-7 shows this operation when using 64-bit operands)."
             };

         case "PMAXSB":
         case "VPMAXSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXSB.html",
                 "html": "<p>Compares packed signed byte integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed signed byte integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand."
             };

         case "VPMAXSD":
         case "PMAXSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXSD.html",
                 "html": "<p>Compares packed signed dword integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed signed dword integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand."
             };

         case "VPMAXSW":
         case "PMAXSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXSW.html",
                 "html": "<p>Performs a SIMD compare of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of word integers to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of word integers to the destination operand."
             };

         case "VPMAXUB":
         case "PMAXUB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXUB.html",
                 "html": "<p>Performs a SIMD compare of the packed unsigned byte integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of byte integers to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned byte integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of byte integers to the destination operand."
             };

         case "PMAXUD":
         case "VPMAXUD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXUD.html",
                 "html": "<p>Compares packed unsigned dword integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed unsigned dword integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand."
             };

         case "PMAXUW":
         case "VPMAXUW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMAXUW.html",
                 "html": "<p>Compares packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum for each packed value in the destination operand."
             };

         case "PMINSB":
         case "VPMINSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINSB.html",
                 "html": "<p>Compares packed signed byte integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed signed byte integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand."
             };

         case "VPMINSD":
         case "PMINSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINSD.html",
                 "html": "<p>Compares packed signed dword integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed signed dword integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand."
             };

         case "VPMINSW":
         case "PMINSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINSW.html",
                 "html": "<p>Performs a SIMD compare of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum value for each pair of word integers to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum value for each pair of word integers to the destination operand."
             };

         case "VPMINUB":
         case "PMINUB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINUB.html",
                 "html": "<p>Performs a SIMD compare of the packed unsigned byte integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum value for each pair of byte integers to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned byte integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum value for each pair of byte integers to the destination operand."
             };

         case "PMINUD":
         case "VPMINUD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINUD.html",
                 "html": "<p>Compares packed unsigned dword integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed unsigned dword integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand."
             };

         case "PMINUW":
         case "VPMINUW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMINUW.html",
                 "html": "<p>Compares packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Compares packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and returns the minimum for each packed value in the destination operand."
             };

         case "PMOVMSKB":
         case "VPMOVMSKB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMOVMSKB.html",
                 "html": "<p>Creates a mask made up of the most significant bit of each byte of the source operand (second operand) and stores the result in the low byte or word of the destination operand (first operand).</p><p>The byte mask is 8 bits for 64-bit source operand, 16 bits for 128-bit source operand and 32 bits for 256-bit source operand. The destination operand is a general-purpose register.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. The default operand size is 64-bit in 64-bit mode.</p><p>Legacy SSE version: The source operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The source operand is an XMM register.</p>",
                 "tooltip": "Creates a mask made up of the most significant bit of each byte of the source operand (second operand) and stores the result in the low byte or word of the destination operand (first operand)."
             };

         case "VPMOVSXBW":
         case "PMOVSXWQ":
         case "VPMOVSXBQ":
         case "VPMOVSXDQ":
         case "PMOVSXBD":
         case "VPMOVSXWD":
         case "VPMOVSXWQ":
         case "VPMOVSXBD":
         case "PMOVSXBW":
         case "PMOVSXWD":
         case "PMOVSXBQ":
         case "PMOVSXDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMOVSX.html",
                 "html": "<p>Sign-extend the low byte/word/dword values in each word/dword/qword element of the source operand (second operand) to word/dword/qword integers and stored as packed data in the destination operand (first operand).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The destination register is YMM Register.</p><p>Note: VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Sign-extend the low byte/word/dword values in each word/dword/qword element of the source operand (second operand) to word/dword/qword integers and stored as packed data in the destination operand (first operand)."
             };

         case "PMOVZXBW":
         case "PMOVZXWQ":
         case "PMOVZXBQ":
         case "VPMOVZXWQ":
         case "VPMOVZXBW":
         case "PMOVZXWD":
         case "PMOVZXBD":
         case "VPMOVZXBQ":
         case "VPMOVZXDQ":
         case "VPMOVZXBD":
         case "PMOVZXDQ":
         case "VPMOVZXWD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMOVZX.html",
                 "html": "<p>Zero-extend the low byte/word/dword values in each word/dword/qword element of the source operand (second operand) to word/dword/qword integers and stored as packed data in the destination operand (first operand).</p><p>128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The destination register is YMM Register.</p><p>Note: VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Zero-extend the low byte/word/dword values in each word/dword/qword element of the source operand (second operand) to word/dword/qword integers and stored as packed data in the destination operand (first operand)."
             };

         case "PMULDQ":
         case "VPMULDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULDQ.html",
                 "html": "<p>Multiplies the first source operand by the second source operand and stores the result in the destination operand.</p><p>For PMULDQ and VPMULDQ (VEX.128 encoded version), the second source operand is two packed signed double-word integers stored in the first (low) and third doublewords of an XMM register or a 128-bit memory location. The first source operand is two packed signed doubleword integers stored in the first and third doublewords of an XMM register. The destination contains two packed signed quadword integers stored in an XMM register. For 128-bit memory operands, 128 bits are fetched from memory, but only the first and third doublewords are used in the computation.</p><p>For VPMULDQ (VEX.256 encoded version), the second source operand is four packed signed doubleword integers stored in the first (low), third, fifth and seventh doublewords of an YMM register or a 256-bit memory location. The first source operand is four packed signed doubleword integers stored in the first, third, fifth and seventh double-words of an XMM register. The destination contains four packed signed quadword integers stored in an YMM register. For 256-bit memory operands, 256 bits are fetched from memory, but only the first, third, fifth and seventh doublewords are used in the computation.</p><p>When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p>",
                 "tooltip": "Multiplies the first source operand by the second source operand and stores the result in the destination operand."
             };

         case "VPMULHRSW":
         case "PMULHRSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULHRSW.html",
                 "html": "<p>PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand.</p><p>When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand is an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p>",
                 "tooltip": "PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand."
             };

         case "VPMULHUW":
         case "PMULHUW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULHUW.html",
                 "html": "<p>Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate results in the destination operand. (Figure 4-8 shows this operation when using 64-bit operands.)</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate results in the destination operand. (Figure 4-8 shows this operation when using 64-bit operands.)"
             };

         case "VPMULHW":
         case "PMULHW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULHW.html",
                 "html": "<p>Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each intermediate 32-bit result in the destina-tion operand. (Figure 4-8 shows this operation when using 64-bit operands.)</p><p>n 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each intermediate 32-bit result in the destina-tion operand. (Figure 4-8 shows this operation when using 64-bit operands.)"
             };

         case "PMULLD":
         case "VPMULLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULLD.html",
                 "html": "<p>Performs four signed multiplications from four pairs of signed dword integers and stores the lower 32 bits of the four 64-bit products in the destination operand (first operand). Each dword element in the destination operand is multiplied with the corresponding dword element of the source operand (second operand) to obtain a 64-bit inter-mediate product.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p><p>Note: VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs four signed multiplications from four pairs of signed dword integers and stores the lower 32 bits of the four 64-bit products in the destination operand (first operand). Each dword element in the destination operand is multiplied with the corresponding dword element of the source operand (second operand) to obtain a 64-bit inter-mediate product."
             };

         case "VPMULLW":
         case "PMULLW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULLW.html",
                 "html": "<p>Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the low 16 bits of each intermediate 32-bit result in the destina-tion operand. (Figure 4-8 shows this operation when using 64-bit operands.)</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the low 16 bits of each intermediate 32-bit result in the destina-tion operand. (Figure 4-8 shows this operation when using 64-bit operands.)"
             };

         case "VPMULUDQ":
         case "PMULUDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PMULUDQ.html",
                 "html": "<p>Multiplies the first operand (destination operand) by the second operand (source operand) and stores the result in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an unsigned doubleword integer stored in the low doubleword of an MMX technology register or a 64-bit memory location. The destination operand can be an unsigned doubleword integer stored in the low doubleword an MMX technology register. The result is an unsigned quadword integer stored in the destination an MMX technology register. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).</p><p>For 64-bit memory operands, 64 bits are fetched from memory, but only the low doubleword is used in the compu-tation.</p><p>128-bit Legacy SSE version: The second source operand is two packed unsigned doubleword integers stored in the first (low) and third doublewords of an XMM register or a 128-bit memory location. For 128-bit memory operands, 128 bits are fetched from memory, but only the first and third doublewords are used in the computation.The first source operand is two packed unsigned doubleword integers stored in the first and third doublewords of an XMM register. The destination contains two packed unsigned quadword integers stored in an XMM register. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Multiplies the first operand (destination operand) by the second operand (source operand) and stores the result in the destination operand."
             };

         case "POP":
             return {
                 "url": "http://www.felixcloutier.com/x86/POP.html",
                 "html": "<p>Loads the value from the top of the stack to the location specified with the destination operand (or explicit opcode) and then increments the stack pointer. The destination operand can be a general-purpose register, memory loca-tion, or segment register.</p><p>Address and operand sizes are determined and used as follows:</p><p>The address size is used only when writing to a destination operand in memory.</p><p>The operand size (16, 32, or 64 bits) determines the amount by which the stack pointer is incremented (2, 4 or 8).</p><p>The stack-address size determines the width of the stack pointer when reading from the stack in memory and when incrementing the stack pointer. (As stated above, the amount by which the stack pointer is incremented is determined by the operand size.)</p>",
                 "tooltip": "Loads the value from the top of the stack to the location specified with the destination operand (or explicit opcode) and then increments the stack pointer. The destination operand can be a general-purpose register, memory loca-tion, or segment register."
             };

         case "POPA":
         case "POPAD":
             return {
                 "url": "http://www.felixcloutier.com/x86/POPAD.html",
                 "html": "<p>Pops doublewords (POPAD) or words (POPA) from the stack into the general-purpose registers. The registers are loaded in the following order: EDI, ESI, EBP, EBX, EDX, ECX, and EAX (if the operand-size attribute is 32) and DI, SI, BP, BX, DX, CX, and AX (if the operand-size attribute is 16). (These instructions reverse the operation of the PUSHA/PUSHAD instructions.) The value on the stack for the ESP or SP register is ignored. Instead, the ESP or SP register is incremented after each register is loaded.</p><p>The POPA (pop all) and POPAD (pop all double) mnemonics reference the same opcode. The POPA instruction is intended for use when the operand-size attribute is 16 and the POPAD instruction for when the operand-size attri-bute is 32. Some assemblers may force the operand size to 16 when POPA is used and to 32 when POPAD is used (using the operand-size override prefix [66H] if necessary). Others may treat these mnemonics as synonyms (POPA/POPAD) and use the current setting of the operand-size attribute to determine the size of values to be popped from the stack, regardless of the mnemonic used. (The D flag in the current code segment\u2019s segment descriptor determines the operand-size attribute.)</p><p>This instruction executes as described in non-64-bit modes. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Pops doublewords (POPAD) or words (POPA) from the stack into the general-purpose registers. The registers are loaded in the following order: EDI, ESI, EBP, EBX, EDX, ECX, and EAX (if the operand-size attribute is 32) and DI, SI, BP, BX, DX, CX, and AX (if the operand-size attribute is 16). (These instructions reverse the operation of the PUSHA/PUSHAD instructions.) The value on the stack for the ESP or SP register is ignored. Instead, the ESP or SP register is incremented after each register is loaded."
             };

         case "POPCNT":
             return {
                 "url": "http://www.felixcloutier.com/x86/POPCNT.html",
                 "html": "<p>This instruction calculates of number of bits set to 1 in the second operand (source) and returns the count in the first operand (a destination register).</p>",
                 "tooltip": "This instruction calculates of number of bits set to 1 in the second operand (source) and returns the count in the first operand (a destination register)."
             };

         case "POPFD":
         case "POPF":
         case "POPFQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/POPFQ.html",
                 "html": "<p>Pops a doubleword (POPFD) from the top of the stack (if the current operand-size attribute is 32) and stores the value in the EFLAGS register, or pops a word from the top of the stack (if the operand-size attribute is 16) and stores it in the lower 16 bits of the EFLAGS register (that is, the FLAGS register). These instructions reverse the operation of the PUSHF/PUSHFD instructions.</p><p>The POPF (pop flags) and POPFD (pop flags double) mnemonics reference the same opcode. The POPF instruction is intended for use when the operand-size attribute is 16; the POPFD instruction is intended for use when the operand-size attribute is 32. Some assemblers may force the operand size to 16 for POPF and to 32 for POPFD. Others may treat the mnemonics as synonyms (POPF/POPFD) and use the setting of the operand-size attribute to determine the size of values to pop from the stack.</p><p>The effect of POPF/POPFD on the EFLAGS register changes, depending on the mode of operation. See the Table 4-12 and key below for details.</p><p>When operating in protected, compatibility, or 64-bit mode at privilege level 0 (or in real-address mode, the equiv-alent to privilege level 0), all non-reserved flags in the EFLAGS register except RF<sup>1</sup>, VIP, VIF, and VM may be modi-fied. VIP, VIF and VM remain unaffected.</p><p>When operating in protected, compatibility, or 64-bit mode with a privilege level greater than 0, but less than or equal to IOPL, all flags can be modified except the IOPL field and RF<sup>1</sup>, IF, VIP, VIF, and VM; these remain unaf-fected. The AC and ID flags can only be modified if the operand-size attribute is 32. The interrupt flag (IF) is altered only when executing at a level at least as privileged as the IOPL. If a POPF/POPFD instruction is executed with insufficient privilege, an exception does not occur but privileged bits do not change.</p>",
                 "tooltip": "Pops a doubleword (POPFD) from the top of the stack (if the current operand-size attribute is 32) and stores the value in the EFLAGS register, or pops a word from the top of the stack (if the operand-size attribute is 16) and stores it in the lower 16 bits of the EFLAGS register (that is, the FLAGS register). These instructions reverse the operation of the PUSHF/PUSHFD instructions."
             };

         case "POR":
         case "VPOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/POR.html",
                 "html": "<p>Performs a bitwise logical OR operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is set to 1 if either or both of the corresponding bits of the first and second operands are 1; otherwise, it is set to 0.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand is an XMM register or a 128-bit memory location. The first source and destination operands can be XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand is an XMM register or a 128-bit memory location. The first source and destination operands can be XMM registers. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical OR operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is set to 1 if either or both of the corresponding bits of the first and second operands are 1; otherwise, it is set to 0."
             };

         case "PREFETCHT2":
         case "PREFETCHT1":
         case "PREFETCHT0":
         case "PREFETCHNTA":
             return {
                 "url": "http://www.felixcloutier.com/x86/PREFETCHNTA.html",
                 "html": "<p>Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by a locality hint:</p><p>\u2014 Pentium III processor\u20141st- or 2nd-level cache.</p><p>\u2014 Pentium 4 and Intel Xeon processors\u20142nd-level cache.</p><p>\u2014 Pentium III processor\u20142nd-level cache.</p><p>\u2014 Pentium 4 and Intel Xeon processors\u20142nd-level cache.</p>",
                 "tooltip": "Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by a locality hint"
             };

         case "PREFETCHW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PREFETCHW.html",
                 "html": "<p>Fetches the cache line of data from memory that contains the byte specified with the source operand to a location in the 1st or 2nd level cache and invalidates all other cached instances of the line.</p><p>The source operand is a byte memory location. If the line selected is already present in the lowest level cache and is already in an exclusively owned state, no data movement occurs. Prefetches from non-writeback memory are ignored.</p><p>The PREFETCHW instruction is merely a hint and does not affect program behavior. If executed, this instruction moves data closer to the processor and invalidates any other cached copy in anticipation of the line being written to in the future.</p><p>The characteristic of prefetch locality hints is implementation-dependent, and can be overloaded or ignored by a processor implementation. The amount of data prefetched is also processor implementation-dependent. It will, however, be a minimum of 32 bytes.</p><p>It should be noted that processors are free to speculatively fetch and cache data with exclusive ownership from system memory regions that permit such accesses (that is, the WB memory type). A PREFETCHW instruction is considered a hint to this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, a PREFETCHW instruction is not ordered with respect to the fence instructions (MFENCE, SFENCE, and LFENCE) or locked memory references. A PREFETCHW instruction is also unordered with respect to CLFLUSH instructions, other PREFETCHW instructions, or any other general instruction</p>",
                 "tooltip": "Fetches the cache line of data from memory that contains the byte specified with the source operand to a location in the 1st or 2nd level cache and invalidates all other cached instances of the line."
             };

         case "PREFETCHWT1":
             return {
                 "url": "http://www.felixcloutier.com/x86/PREFETCHWT1.html",
                 "html": "<p>Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by an intent to write hint (so that data is brought into \u2018Exclusive\u2019 state via a request for ownership) and a locality hint:</p><p>The source operand is a byte memory location. (The locality hints are encoded into the machine level instruction using bits 3 through 5 of the ModR/M byte. Use of any ModR/M value other than the specified ones will lead to unpredictable behavior.)</p><p>If the line selected is already present in the cache hierarchy at a level closer to the processor, no data movement occurs. Prefetches from uncacheable or WC memory are ignored.</p><p>The PREFETCHh instruction is merely a hint and does not affect program behavior. If executed, this instruction moves data closer to the processor in anticipation of future use.</p><p>The implementation of prefetch locality hints is implementation-dependent, and can be overloaded or ignored by a processor implementation. The amount of data prefetched is also processor implementation-dependent. It will, however, be a minimum of 32 bytes.</p>",
                 "tooltip": "Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by an intent to write hint (so that data is brought into \u2018Exclusive\u2019 state via a request for ownership) and a locality hint"
             };

         case "VPSADBW":
         case "PSADBW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSADBW.html",
                 "html": "<p>Computes the absolute value of the difference of 8 unsigned byte integers from the source operand (second operand) and from the destination operand (first operand). These 8 differences are then summed to produce an unsigned word integer result that is stored in the destination operand. Figure 4-10 shows the operation of the PSADBW instruction when using 64-bit operands.</p><p>When operating on 64-bit operands, the word integer result is stored in the low word of the destination operand, and the remaining bytes in the destination operand are cleared to all 0s.</p><p>When operating on 128-bit operands, two packed results are computed. Here, the 8 low-order bytes of the source and destination operands are operated on to produce a word result that is stored in the low word of the destination operand, and the 8 high-order bytes are operated on to produce a word result that is stored in bits 64 through 79 of the destination operand. The remaining bytes of the destination operand are cleared.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p>",
                 "tooltip": "Computes the absolute value of the difference of 8 unsigned byte integers from the source operand (second operand) and from the destination operand (first operand). These 8 differences are then summed to produce an unsigned word integer result that is stored in the destination operand. Figure 4-10 shows the operation of the PSADBW instruction when using 64-bit operands."
             };
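
         // Illustrative sketch only, not part of the generated table: a plain-JavaScript
         // model of the 64-bit PSADBW behavior described in the entry above. The function
         // name and the byte-array representation are hypothetical, chosen for this note.
         //
         //     function psadbw64(dest, src) {          // two arrays of 8 unsigned bytes
         //         var sum = 0;
         //         for (var i = 0; i < 8; i++) {
         //             sum += Math.abs(dest[i] - src[i]);   // per-byte absolute difference
         //         }
         //         return sum;                         // at most 8 * 255 = 2040, fits a word
         //     }
         //
         // For 128-bit operands the same sum is formed separately for the low and the
         // high 8 bytes.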

         case "PSHUFB":
         case "VPSHUFB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSHUFB.html",
                 "html": "<p>PSHUFB performs in-place shuffles of bytes in the destination operand (the first operand) according to the shuffle control mask in the source operand (the second operand). The instruction permutes the data in the destination operand, leaving the shuffle mask unaffected. If the most significant bit (bit[7]) of each byte of the shuffle control mask is set, then constant zero is written in the result byte. Each byte in the shuffle control mask forms an index to permute the corresponding byte in the destination operand. The value of each index is the least significant 4 bits (128-bit operation) or 3 bits (64-bit operation) of the shuffle control byte. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>Legacy SSE version: Both operands can be MMX registers.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The destination operand is the first operand, the first source operand is the second operand, the second source operand is the third operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "PSHUFB performs in-place shuffles of bytes in the destination operand (the first operand) according to the shuffle control mask in the source operand (the second operand). The instruction permutes the data in the destination operand, leaving the shuffle mask unaffected. If the most significant bit (bit[7]) of each byte of the shuffle control mask is set, then constant zero is written in the result byte. Each byte in the shuffle control mask forms an index to permute the corresponding byte in the destination operand. The value of each index is the least significant 4 bits (128-bit operation) or 3 bits (64-bit operation) of the shuffle control byte. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated."
             };
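
         // Illustrative sketch only, not part of the generated table: a plain-JavaScript
         // model of the 128-bit PSHUFB behavior described in the entry above. The function
         // name and the byte-array representation are hypothetical, chosen for this note.
         //
         //     function pshufb128(dest, mask) {        // two arrays of 16 byte values
         //         return mask.map(function (m) {
         //             // Bit 7 set: write zero. Otherwise the low 4 bits index into dest.
         //             return (m & 0x80) ? 0 : dest[m & 0x0F];
         //         });
         //     }
         //
         // Under this model a mask holding 15, 14, and so on down to 0 reverses the byte
         // order, and a mask byte of 0x80 clears the corresponding result byte.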

         case "VPSHUFD":
         case "PSHUFD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSHUFD.html",
                 "html": "<p>Copies doublewords from source operand (second operand) and inserts them in the destination operand (first operand) at the locations selected with the order operand (third operand). Figure 4-12 shows the operation of the 256-bit VPSHUFD instruction and the encoding of the order operand. Each 2-bit field in the order operand selects the contents of one doubleword location within a 128-bit lane and copy to the target element in the destination operand. For example, bits 0 and 1 of the order operand targets the first doubleword element in the low and high 128-bit lane of the destination operand for 256-bit VPSHUFD. The encoded value of bits 1:0 of the order operand (see the field encoding in Figure 4-12) determines which doubleword element (from the respective 128-bit lane) of the source operand will be copied to doubleword 0 of the destination operand.</p><p>For 128-bit operation, only the low 128-bit lane are operative. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The order operand is an 8-bit immediate. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand.</p><p>SRC</p><p>X7</p><p>X6</p>",
                 "tooltip": "Copies doublewords from source operand (second operand) and inserts them in the destination operand (first operand) at the locations selected with the order operand (third operand). Figure 4-12 shows the operation of the 256-bit VPSHUFD instruction and the encoding of the order operand. Each 2-bit field in the order operand selects the contents of one doubleword location within a 128-bit lane and copy to the target element in the destination operand. For example, bits 0 and 1 of the order operand targets the first doubleword element in the low and high 128-bit lane of the destination operand for 256-bit VPSHUFD. The encoded value of bits 1:0 of the order operand (see the field encoding in Figure 4-12) determines which doubleword element (from the respective 128-bit lane) of the source operand will be copied to doubleword 0 of the destination operand."
             };
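
         // Illustrative sketch only, not part of the generated table: a plain-JavaScript
         // model of how the 8-bit order operand of a 128-bit PSHUFD selects doublewords.
         // The function name and the array representation are hypothetical.
         //
         //     function pshufd128(src, order) {        // src: 4 doublewords, order: imm8
         //         var out = [];
         //         for (var i = 0; i < 4; i++) {
         //             // Bits (2*i+1):(2*i) of the order operand pick the source element.
         //             out.push(src[(order >> (2 * i)) & 3]);
         //         }
         //         return out;
         //     }
         //
         // Under this model an order of 0x1B (fields 3, 2, 1, 0 from low to high) reverses
         // the four doublewords, and an order of 0x00 broadcasts doubleword 0.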

         case "VPSHUFHW":
         case "PSHUFHW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSHUFHW.html",
                 "html": "<p>Copies words from the high quadword of a 128-bit lane of the source operand and inserts them in the high quad-word of the destination operand at word locations (of the respective lane) selected with the immediate operand. This 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illus-trated in Figure 4-12. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the high quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3, 4) from the high quadword of the source operand to be copied to the destination operand. The low quadword of the source operand is copied to the low quadword of the destination operand, for each 128-bit lane.</p><p>Note that this instruction permits a word in the high quadword of the source operand to be copied to more than one word location in the high quadword of the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Copies words from the high quadword of a 128-bit lane of the source operand and inserts them in the high quad-word of the destination operand at word locations (of the respective lane) selected with the immediate operand. This 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illus-trated in Figure 4-12. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the high quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3, 4) from the high quadword of the source operand to be copied to the destination operand. The low quadword of the source operand is copied to the low quadword of the destination operand, for each 128-bit lane."
             };

         case "PSHUFLW":
         case "VPSHUFLW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSHUFLW.html",
                 "html": "<p>Copies words from the low quadword of a 128-bit lane of the source operand and inserts them in the low quadword of the destination operand at word locations (of the respective lane) selected with the immediate operand. The 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illustrated in Figure 4-12. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the low quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3) from the low quadword of the source operand to be copied to the destination operand. The high quadword of the source operand is copied to the high quadword of the destination operand, for each 128-bit lane.</p><p>Note that this instruction permits a word in the low quadword of the source operand to be copied to more than one word location in the low quadword of the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Copies words from the low quadword of a 128-bit lane of the source operand and inserts them in the low quadword of the destination operand at word locations (of the respective lane) selected with the immediate operand. The 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illustrated in Figure 4-12. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the low quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3) from the low quadword of the source operand to be copied to the destination operand. The high quadword of the source operand is copied to the high quadword of the destination operand, for each 128-bit lane."
             };

         case "PSHUFW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSHUFW.html",
                 "html": "<p>Copies words from the source operand (second operand) and inserts them in the destination operand (first operand) at word locations selected with the order operand (third operand). This operation is similar to the opera-tion used by the PSHUFD instruction, which is illustrated in Figure 4-12. For the PSHUFW instruction, each 2-bit field in the order operand selects the contents of one word location in the destination operand. The encodings of the order operand fields select words from the source operand to be copied to the destination operand.</p><p>The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register. The order operand is an 8-bit immediate. Note that this instruction permits a word in the source operand to be copied to more than one word location in the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Copies words from the source operand (second operand) and inserts them in the destination operand (first operand) at word locations selected with the order operand (third operand). This operation is similar to the opera-tion used by the PSHUFD instruction, which is illustrated in Figure 4-12. For the PSHUFW instruction, each 2-bit field in the order operand selects the contents of one word location in the destination operand. The encodings of the order operand fields select words from the source operand to be copied to the destination operand."
             };

         case "PSIGNW":
         case "VPSIGNW":
         case "VPSIGND":
         case "PSIGND":
         case "PSIGNB":
         case "VPSIGNB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSIGNB:PSIGNW:PSIGND.html",
                 "html": "<p>(V)PSIGNB/(V)PSIGNW/(V)PSIGND negates each data element of the destination operand (the first operand) if the signed integer value of the corresponding data element in the source operand (the second operand) is less than zero. If the signed integer value of a data element in the source operand is positive, the corresponding data element in the destination operand is unchanged. If a data element in the source operand is zero, the corre-sponding data element in the destination operand is set to zero.</p><p>(V)PSIGNB operates on signed bytes. (V)PSIGNW operates on 16-bit signed words. (V)PSIGND operates on signed 32-bit integers. When the source operand is a 128bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE instructions: Both operands can be MMX registers. In 64-bit mode, use the REX prefix to access addi-tional registers.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM destina-tion register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise instructions will #UD.</p>",
                 "tooltip": "(V)PSIGNB/(V)PSIGNW/(V)PSIGND negates each data element of the destination operand (the first operand) if the signed integer value of the corresponding data element in the source operand (the second operand) is less than zero. If the signed integer value of a data element in the source operand is positive, the corresponding data element in the destination operand is unchanged. If a data element in the source operand is zero, the corre-sponding data element in the destination operand is set to zero."
             };

         case "PSLLDQ":
         case "VPSLLDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSLLDQ.html",
                 "html": "<p>Shifts the destination operand (first operand) to the left by the number of bytes specified in the count operand (second operand). The empty low-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.</p><p>128-bit Legacy SSE version: The source and destination operands are the same. Bits (VLMAX-1:128) of the corre-sponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The source and destination operands are XMM registers. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register. The destination operand is a YMM register. The count operand applies to both the low and high 128-bit lanes.</p><p>Note: VEX.vvvv encodes the destination register, and VEX.B + ModRM.r/m encodes the source register. VEX.L must be 0, otherwise instructions will #UD.</p>",
                 "tooltip": "Shifts the destination operand (first operand) to the left by the number of bytes specified in the count operand (second operand). The empty low-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate."
             };
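
         // Illustrative sketch only, not part of the generated table: a plain-JavaScript
         // model of the 128-bit PSLLDQ byte shift described in the entry above. The
         // function name and the array layout (index 0 = least significant byte) are
         // hypothetical.
         //
         //     function pslldq128(dest, count) {       // dest: 16 bytes, count: imm8 (0-255)
         //         var out = [];
         //         for (var i = 0; i < 16; i++) {
         //             // Bytes move toward higher indices; vacated low bytes become 0.
         //             out.push(i >= count ? dest[i - count] : 0);
         //         }
         //         return out;
         //     }
         //
         // Any count greater than 15 yields all zeros under this model, matching the text.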

         case "PSLLD":
         case "PSLLW":
         case "PSLLQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSLLW:PSLLD:PSLLQ.html",
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the left by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-13 gives an example of shifting words in a 64-bit operand.</p><svg height=\"131.399985\" viewbox=\"111.840000 640385.040010 379.199990 87.599990\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"30.975168\" x=\"158.2199\" y=\"640403.267584\">Pre-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"168.72\" y=\"640411.607484\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"27.928392\" x=\"161.58\" y=\"640423.045488\">Shift Left</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"28.5012\" x=\"160.92\" y=\"640430.845388\">with Zero</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"29.379552\" x=\"160.3799\" y=\"640438.645488\">Extension</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"34.558986\" x=\"153.3601\" y=\"640456.127884\">Post-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"167.7\" y=\"640464.107384\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"192.84\" y=\"640395.6\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"381.54\" y=\"640395.6\"></rect>\n<rect 
height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"255.72\" y=\"640395.6\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"318.6\" y=\"640395.6\"></rect>\n<rect height=\"18.0600000001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"192.24\" y=\"640448.58\"></rect>\n<path d=\"M192.600000,640395.360000 L192.600000,640413.600000 L193.080010,640413.600000 L193.080010,640395.360000 \" style=\"stroke:black\"></path>\n<path d=\"M255.480000,640395.360000 L255.480000,640413.600000 L255.960000,640413.600000 L255.960000,640395.360000 \" style=\"stroke:black\"></path>\n<path d=\"M318.360000,640395.360000 L318.360000,640413.600000 L318.839980,640413.600000 L318.839980,640395.360000 \" style=\"stroke:black\"></path>\n<path d=\"M381.300000,640395.360000 L381.300000,640413.600000 L381.779980,640413.600000 L381.779980,640395.360000 \" style=\"stroke:black\"></path>\n<path d=\"M192.840000,640395.360020 L192.840000,640395.840000 L255.960000,640395.840000 L255.960000,640395.360020 \" style=\"stroke:black\"></path>\n<path d=\"M255.720000,640395.360020 L255.720000,640395.840000 L318.840000,640395.840000 L318.840000,640395.360020 \" style=\"stroke:black\"></path>\n<path d=\"M318.600000,640395.360020 L318.600000,640395.840000 L381.780000,640395.840000 L381.780000,640395.360020 \" style=\"stroke:black\"></path>\n<path d=\"M381.540000,640395.360020 L381.540000,640395.840000 L444.660000,640395.840000 L444.660000,640395.360020 \" style=\"stroke:black\"></path>\n<path d=\"M255.480000,640395.600000 L255.480000,640413.840000 L255.960000,640413.840000 L255.960000,640395.600000 \" style=\"stroke:black\"></path>\n<path d=\"M318.360000,640395.600000 L318.360000,640413.840000 L318.839980,640413.840000 L318.839980,640395.600000 \" style=\"stroke:black\"></path>\n<path d=\"M381.300000,640395.600000 L381.300000,640413.840000 
L381.779980,640413.840000 L381.779980,640395.600000 \" style=\"stroke:black\"></path>\n<path d=\"M444.180000,640395.600000 L444.180000,640413.840000 L444.659980,640413.840000 L444.659980,640395.600000 \" style=\"stroke:black\"></path>\n<path d=\"M192.600000,640413.360020 L192.600000,640413.840000 L255.720000,640413.840000 L255.720000,640413.360020 \" style=\"stroke:black\"></path>\n<path d=\"M255.480000,640413.360020 L255.480000,640413.840000 L318.600000,640413.840000 L318.600000,640413.360020 \" style=\"stroke:black\"></path>\n<path d=\"M318.360000,640413.360020 L318.360000,640413.840000 L381.540000,640413.840000 L381.540000,640413.360020 \" style=\"stroke:black\"></path>\n<path d=\"M381.300000,640413.360020 L381.300000,640413.840000 L444.420000,640413.840000 L444.420000,640413.360020 \" style=\"stroke:black\"></path>\n<path d=\"M236.520000,640413.780000 L236.520000,640428.720000 L237.000000,640428.720000 L237.000000,640413.780000 \" style=\"stroke:black\"></path>\n<path d=\"M300.240000,640413.780000 L300.240000,640428.720000 L300.720010,640428.720000 L300.720010,640413.780000 \" style=\"stroke:black\"></path>\n<path d=\"M418.680000,640413.840000 L418.680000,640428.780000 L419.159980,640428.780000 L419.159980,640413.840000 \" style=\"stroke:black\"></path>\n<path d=\"M356.940000,640414.260000 L356.940000,640429.200000 L357.420010,640429.200000 L357.420010,640414.260000 \" style=\"stroke:black\"></path>\n<path d=\"M210.780000,640428.239990 L210.780000,640428.720000 L236.760000,640428.720000 L236.760000,640428.239990 \" style=\"stroke:black\"></path>\n<path d=\"M274.500000,640428.239990 L274.500000,640428.720000 L300.480000,640428.720000 L300.480000,640428.239990 \" style=\"stroke:black\"></path>\n<path d=\"M392.940000,640428.299990 L392.940000,640428.780000 L418.920000,640428.780000 L418.920000,640428.299990 \" style=\"stroke:black\"></path>\n<path d=\"M210.780000,640428.480000 L210.780000,640441.440000 L211.260000,640441.440000 L211.260000,640428.480000 \" 
style=\"stroke:black\"></path>\n<path d=\"M274.500000,640428.480000 L274.500000,640441.380000 L274.980010,640441.380000 L274.980010,640428.480000 \" style=\"stroke:black\"></path>\n<path d=\"M392.940000,640428.540000 L392.940000,640441.440000 L393.420010,640441.440000 L393.420010,640428.540000 \" style=\"stroke:black\"></path>\n<path d=\"M331.260000,640428.719990 L331.260000,640429.200000 L357.180000,640429.200000 L357.180000,640428.719990 \" style=\"stroke:black\"></path>\n<path d=\"M331.260000,640428.960000 L331.260000,640441.860000 L331.739980,640441.860000 L331.739980,640428.960000 \" style=\"stroke:black\"></path>\n<path d=\"M274.560000,640441.740000 L276.120000,640441.140000 L277.080000,640440.780000 L276.780000,640441.800000 L275.220000,640447.200000 L274.740000,640448.880000 L274.260000,640447.200000 L272.700000,640441.800000 L272.400000,640440.780000 L273.360000,640441.140000 L273.660000,640441.500000 L275.220000,640446.900000 L274.260000,640447.200000 L274.260000,640446.900000 L275.820000,640441.500000 L276.780000,640441.800000 L276.480000,640442.100000 L274.920000,640442.700000 \" style=\"stroke:black\"></path>\n<path d=\"M210.840000,640441.740000 L212.400000,640441.200000 L213.360000,640440.840000 L213.060000,640441.860000 L211.500000,640447.200000 L211.020000,640448.820000 L210.540000,640447.200000 L208.980000,640441.860000 L208.680000,640440.840000 L209.640000,640441.200000 L209.940000,640441.560000 L211.500000,640446.900000 L210.540000,640447.200000 L210.540000,640446.900000 L212.100000,640441.560000 L213.060000,640441.860000 L212.760000,640442.160000 L211.200000,640442.700000 \" style=\"stroke:black\"></path>\n<path d=\"M393.000000,640441.800000 L394.560000,640441.200000 L395.520000,640440.840000 L395.220000,640441.860000 L393.660000,640447.260000 L393.180000,640449.000000 L392.700000,640447.260000 L391.200000,640441.860000 L390.900000,640440.840000 L391.860000,640441.200000 L392.160000,640441.560000 L393.660000,640446.960000 
L392.700000,640447.260000 L392.700000,640446.960000 L394.260000,640441.560000 L395.220000,640441.860000 L394.920000,640442.160000 L393.360000,640442.760000 \" style=\"stroke:black\"></path>\n<path d=\"M273.360000,640441.140000 L274.920000,640441.740000 L274.920000,640442.700000 L274.740000,640442.760000 L274.560000,640442.700000 L273.000000,640442.100000 \" style=\"stroke:black\"></path>\n<path d=\"M274.500000,640441.140000 L274.500000,640442.220000 L274.980010,640442.220000 L274.980010,640441.140000 \" style=\"stroke:black\"></path>\n<path d=\"M209.640000,640441.200000 L211.200000,640441.740000 L211.200000,640442.700000 L211.020000,640442.760000 L210.840000,640442.700000 L209.280000,640442.160000 \" style=\"stroke:black\"></path>\n<path d=\"M210.780000,640441.200000 L210.780000,640442.220000 L211.260000,640442.220000 L211.260000,640441.200000 \" style=\"stroke:black\"></path>\n<path d=\"M391.860000,640441.200000 L393.360000,640441.800000 L393.360000,640442.760000 L393.180000,640442.820000 L393.000000,640442.760000 L391.500000,640442.160000 \" style=\"stroke:black\"></path>\n<path d=\"M392.940000,640441.200000 L392.940000,640442.280000 L393.420010,640442.280000 L393.420010,640441.200000 \" style=\"stroke:black\"></path>\n<path d=\"M331.320000,640442.220000 L332.880000,640441.680000 L333.840000,640441.320000 L333.540000,640442.340000 L331.980000,640447.680000 L331.500000,640449.300000 L331.020000,640447.680000 L329.460000,640442.340000 L329.160000,640441.320000 L330.120000,640441.680000 L330.420000,640442.040000 L331.980000,640447.380000 L331.020000,640447.680000 L331.020000,640447.380000 L332.580000,640442.040000 L333.540000,640442.340000 L333.240000,640442.640000 L331.680000,640443.180000 \" style=\"stroke:black\"></path>\n<path d=\"M274.740000,640442.220000 L276.300000,640441.620000 L274.740000,640447.020000 L273.180000,640441.620000 \" style=\"stroke:black\"></path>\n<path d=\"M331.260000,640441.620000 L331.260000,640442.700000 L331.739980,640442.700000 
L331.739980,640441.620000 \" style=\"stroke:black\"></path>\n<path d=\"M211.020000,640442.220000 L212.580000,640441.680000 L211.020000,640447.020000 L209.460000,640441.680000 \" style=\"stroke:black\"></path>\n<path d=\"M330.120000,640441.680000 L331.680000,640442.220000 L331.680000,640443.180000 L331.500000,640443.240000 L331.320000,640443.180000 L329.760000,640442.640000 \" style=\"stroke:black\"></path>\n<path d=\"M393.180000,640442.280000 L394.740000,640441.680000 L393.180000,640447.080000 L391.680000,640441.680000 \" style=\"stroke:black\"></path>\n<path d=\"M331.500000,640442.700000 L333.060000,640442.160000 L331.500000,640447.500000 L329.940000,640442.160000 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"219.0598\" y=\"640407.347684\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"406.08\" y=\"640407.347484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"282.775312\" y=\"640407.347684\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"343.793584\" y=\"640407.347684\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.5835\" x=\"197.0398\" y=\"640461.107384\">X3 &lt;&lt; COUNT</text></svg><svg height=\"27.0900224999\" viewbox=\"192.240005 640448.579995 251.580015 18.060015\" width=\"377.3700225\">\n<rect height=\"18.0600000001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"255.18\" y=\"640448.58\"></rect>\n<rect height=\"18.0600000001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"318.06\" y=\"640448.58\"></rect>\n<rect height=\"18.0600000001\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"380.94\" 
y=\"640448.58\"></rect>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.5832\" x=\"260.8201\" y=\"640461.107384\">X2 &lt;&lt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.5834\" x=\"324.0599\" y=\"640461.107384\">X1 &lt;&lt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"11.997132\" x=\"384.6601\" y=\"640461.107184\">X0 &lt;&lt; COUNT</text></svg><h3>Figure 4-13.  PSLLW, PSLLD, and PSLLQ Instruction Operation Using 64-bit Operand</h3><p>The (V)PSLLW instruction shifts each of the words in the destination operand to the left by the number of bits specified in the count operand; the (V)PSLLD instruction shifts each of the doublewords in the destination operand; and the (V)PSLLQ instruction shifts the quadword (or quadwords) in the destination operand.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the left by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-13 gives an example of shifting words in a 64-bit operand."
             };

         case "VPSRAD":
         case "PSRAD":
         case "PSRAW":
         case "VPSRAW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSRAW:PSRAD.html",
                 "html": "<p>Shifts the bits in the individual data elements (words or doublewords) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are filled with the initial value of the sign bit of the data element. If the value specified by the count operand is greater than 15 (for words) or 31 (for doublewords), each destination data element is filled with the initial value of the sign bit of the element. (Figure 4-14 gives an example of shifting words in a 64-bit operand.)</p><svg height=\"157.1400225\" viewbox=\"111.840000 644963.999980 379.199990 104.760015\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"31.008684\" x=\"159.5399\" y=\"644982.827484\">Pre-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"170.04\" y=\"644991.107484\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"31.41744\" x=\"158.46\" y=\"645004.705488\">Shift Right</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"27.528192\" x=\"162.54\" y=\"645011.965488\">with Sign</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"29.386512\" x=\"160.6197\" y=\"645019.765588\">Extension</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"34.539834\" x=\"154.7399\" y=\"645035.687784\">Post-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.315378\" x=\"169.02\" y=\"645043.667484\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"319.98\" y=\"644975.16\"></rect>\n<rect height=\"18.0\" 
style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"382.86\" y=\"644975.16\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"193.62\" y=\"645028.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"256.5\" y=\"645028.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"319.38\" y=\"645028.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"194.16\" y=\"644975.16\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"257.04\" y=\"644975.16\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"382.26\" y=\"645028.14\"></rect>\n<path d=\"M193.920000,644974.920000 L193.920000,644993.160000 L194.400000,644993.160000 L194.400000,644974.920000 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,644974.920000 L256.800000,644993.160000 L257.280010,644993.160000 L257.280010,644974.920000 \" style=\"stroke:black\"></path>\n<path d=\"M319.740000,644974.920000 L319.740000,644993.160000 L320.220010,644993.160000 L320.220010,644974.920000 \" style=\"stroke:black\"></path>\n<path d=\"M382.620000,644974.920000 L382.620000,644993.160000 L383.100010,644993.160000 L383.100010,644974.920000 \" style=\"stroke:black\"></path>\n<path d=\"M194.160000,644974.920020 L194.160000,644975.400000 L257.280000,644975.400000 L257.280000,644974.920020 \" style=\"stroke:black\"></path>\n<path d=\"M257.040000,644974.920020 L257.040000,644975.400000 L320.220000,644975.400000 L320.220000,644974.920020 \" style=\"stroke:black\"></path>\n<path d=\"M319.980000,644974.920020 L319.980000,644975.400000 L383.100000,644975.400000 L383.100000,644974.920020 \" 
style=\"stroke:black\"></path>\n<path d=\"M382.860000,644974.920020 L382.860000,644975.400000 L445.980000,644975.400000 L445.980000,644974.920020 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,644975.160000 L256.800000,644993.400000 L257.280010,644993.400000 L257.280010,644975.160000 \" style=\"stroke:black\"></path>\n<path d=\"M319.740000,644975.160000 L319.740000,644993.400000 L320.220010,644993.400000 L320.220010,644975.160000 \" style=\"stroke:black\"></path>\n<path d=\"M382.620000,644975.160000 L382.620000,644993.400000 L383.100010,644993.400000 L383.100010,644975.160000 \" style=\"stroke:black\"></path>\n<path d=\"M445.500000,644975.160000 L445.500000,644993.400000 L445.980010,644993.400000 L445.980010,644975.160000 \" style=\"stroke:black\"></path>\n<path d=\"M193.920000,644992.920020 L193.920000,644993.400000 L257.040000,644993.400000 L257.040000,644992.920020 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,644992.920020 L256.800000,644993.400000 L319.980000,644993.400000 L319.980000,644992.920020 \" style=\"stroke:black\"></path>\n<path d=\"M319.740000,644992.920020 L319.740000,644993.400000 L382.860000,644993.400000 L382.860000,644992.920020 \" style=\"stroke:black\"></path>\n<path d=\"M382.620000,644992.920020 L382.620000,644993.400000 L445.740000,644993.400000 L445.740000,644992.920020 \" style=\"stroke:black\"></path>\n<path d=\"M275.820000,644993.280000 L275.820000,645008.220000 L276.299980,645008.220000 L276.299980,644993.280000 \" style=\"stroke:black\"></path>\n<path d=\"M212.100000,644993.340000 L212.100000,645008.280000 L212.580010,645008.280000 L212.580010,644993.340000 \" style=\"stroke:black\"></path>\n<path d=\"M394.320000,644993.340000 L394.320000,645008.280000 L394.800010,645008.280000 L394.800010,644993.340000 \" style=\"stroke:black\"></path>\n<path d=\"M332.580000,644993.820000 L332.580000,645008.760000 L333.060010,645008.760000 L333.060010,644993.820000 \" style=\"stroke:black\"></path>\n<path 
d=\"M276.060000,645007.739990 L276.060000,645008.220000 L302.040000,645008.220000 L302.040000,645007.739990 \" style=\"stroke:black\"></path>\n<path d=\"M212.340000,645007.799990 L212.340000,645008.280000 L238.320000,645008.280000 L238.320000,645007.799990 \" style=\"stroke:black\"></path>\n<path d=\"M394.560000,645007.799990 L394.560000,645008.280000 L420.480000,645008.280000 L420.480000,645007.799990 \" style=\"stroke:black\"></path>\n<path d=\"M301.560000,645007.980000 L301.560000,645020.880000 L302.040010,645020.880000 L302.040010,645007.980000 \" style=\"stroke:black\"></path>\n<path d=\"M237.840000,645008.040000 L237.840000,645020.940000 L238.320000,645020.940000 L238.320000,645008.040000 \" style=\"stroke:black\"></path>\n<path d=\"M420.000000,645008.040000 L420.000000,645021.000000 L420.480010,645021.000000 L420.480010,645008.040000 \" style=\"stroke:black\"></path>\n<path d=\"M332.820000,645008.280020 L332.820000,645008.760000 L358.800000,645008.760000 L358.800000,645008.280020 \" style=\"stroke:black\"></path>\n<path d=\"M358.320000,645008.520000 L358.320000,645021.420000 L358.800010,645021.420000 L358.800010,645008.520000 \" style=\"stroke:black\"></path>\n<path d=\"M237.900000,645021.300000 L239.460000,645020.700000 L240.420000,645020.340000 L240.120000,645021.360000 L238.560000,645026.760000 L238.080000,645028.440000 L237.600000,645026.760000 L236.040000,645021.360000 L235.740000,645020.340000 L236.700000,645020.700000 L237.000000,645021.060000 L238.560000,645026.460000 L237.600000,645026.760000 L237.600000,645026.460000 L239.160000,645021.060000 L240.120000,645021.360000 L239.820000,645021.660000 L238.260000,645022.260000 \" style=\"stroke:black\"></path>\n<path d=\"M301.620000,645021.240000 L303.180000,645020.700000 L304.140000,645020.340000 L303.840000,645021.360000 L302.280000,645026.700000 L301.800000,645028.320000 L301.320000,645026.700000 L299.760000,645021.360000 L299.460000,645020.340000 L300.420000,645020.700000 L300.720000,645021.060000 
L302.280000,645026.400000 L301.320000,645026.700000 L301.320000,645026.400000 L302.880000,645021.060000 L303.840000,645021.360000 L303.540000,645021.660000 L301.980000,645022.200000 \" style=\"stroke:black\"></path>\n<path d=\"M420.060000,645021.300000 L421.620000,645020.760000 L422.580000,645020.400000 L422.280000,645021.420000 L420.720000,645026.820000 L420.240000,645028.560000 L419.760000,645026.820000 L418.260000,645021.420000 L417.960000,645020.400000 L418.920000,645020.760000 L419.220000,645021.120000 L420.720000,645026.520000 L419.760000,645026.820000 L419.760000,645026.520000 L421.320000,645021.120000 L422.280000,645021.420000 L421.980000,645021.720000 L420.420000,645022.260000 \" style=\"stroke:black\"></path>\n<path d=\"M301.560000,645020.640000 L301.560000,645021.720000 L302.040010,645021.720000 L302.040010,645020.640000 \" style=\"stroke:black\"></path>\n<path d=\"M236.700000,645020.700000 L238.260000,645021.300000 L238.260000,645022.260000 L238.080000,645022.320000 L237.900000,645022.260000 L236.340000,645021.660000 \" style=\"stroke:black\"></path>\n<path d=\"M237.840000,645020.700000 L237.840000,645021.780000 L238.320000,645021.780000 L238.320000,645020.700000 \" style=\"stroke:black\"></path>\n<path d=\"M300.420000,645020.700000 L301.980000,645021.240000 L301.980000,645022.200000 L301.800000,645022.260000 L301.620000,645022.200000 L300.060000,645021.660000 \" style=\"stroke:black\"></path>\n<path d=\"M358.380000,645021.720000 L359.880000,645021.180000 L360.840000,645020.760000 L359.040000,645027.240000 L358.560000,645028.920000 L358.080000,645027.240000 L356.520000,645021.840000 L356.220000,645020.820000 L357.180000,645021.180000 L357.480000,645021.540000 L359.040000,645026.940000 L358.080000,645027.240000 L358.080000,645026.940000 L359.580000,645021.540000 L360.540000,645021.840000 L360.240000,645022.140000 L358.740000,645022.680000 \" style=\"stroke:black\"></path>\n<path d=\"M418.920000,645020.760000 L420.420000,645021.300000 
L420.420000,645022.260000 L420.240000,645022.320000 L420.060000,645022.260000 L418.560000,645021.720000 \" style=\"stroke:black\"></path>\n<path d=\"M420.000000,645020.760000 L420.000000,645021.780000 L420.480010,645021.780000 L420.480010,645020.760000 \" style=\"stroke:black\"></path>\n<path d=\"M238.080000,645021.780000 L239.640000,645021.180000 L238.080000,645026.580000 L236.520000,645021.180000 \" style=\"stroke:black\"></path>\n<path d=\"M301.800000,645021.720000 L303.360000,645021.180000 L301.800000,645026.520000 L300.240000,645021.180000 \" style=\"stroke:black\"></path>\n<path d=\"M357.180000,645021.180000 L358.740000,645021.720000 L358.740000,645022.680000 L358.560000,645022.740000 L358.380000,645022.680000 L356.820000,645022.140000 \" style=\"stroke:black\"></path>\n<path d=\"M358.320000,645021.180000 L358.320000,645022.200000 L358.800010,645022.200000 L358.800010,645021.180000 \" style=\"stroke:black\"></path>\n<path d=\"M420.240000,645021.780000 L421.800000,645021.240000 L420.240000,645026.640000 L418.740000,645021.240000 \" style=\"stroke:black\"></path>\n<path d=\"M358.560000,645022.200000 L360.060000,645021.660000 L358.560000,645027.060000 L357.000000,645021.660000 \" style=\"stroke:black\"></path>\n<path d=\"M193.620000,645027.899990 L193.620000,645028.380000 L256.740000,645028.380000 L256.740000,645027.899990 \" style=\"stroke:black\"></path>\n<path d=\"M256.500000,645027.899990 L256.500000,645028.380000 L319.620000,645028.380000 L319.620000,645027.899990 \" style=\"stroke:black\"></path>\n<path d=\"M319.380000,645027.899990 L319.380000,645028.380000 L382.500000,645028.380000 L382.500000,645027.899990 \" style=\"stroke:black\"></path>\n<path d=\"M382.260000,645027.899990 L382.260000,645028.380000 L445.440000,645028.380000 L445.440000,645027.899990 \" style=\"stroke:black\"></path>\n<path d=\"M193.380000,645027.900000 L193.380000,645046.140000 L193.860010,645046.140000 L193.860010,645027.900000 \" style=\"stroke:black\"></path>\n<path 
d=\"M256.260000,645027.900000 L256.260000,645046.140000 L256.739980,645046.140000 L256.739980,645027.900000 \" style=\"stroke:black\"></path>\n<path d=\"M319.140000,645027.900000 L319.140000,645046.140000 L319.619980,645046.140000 L319.619980,645027.900000 \" style=\"stroke:black\"></path>\n<path d=\"M382.020000,645027.900000 L382.020000,645046.140000 L382.500010,645046.140000 L382.500010,645027.900000 \" style=\"stroke:black\"></path>\n<path d=\"M256.260000,645028.140000 L256.260000,645046.380000 L256.739980,645046.380000 L256.739980,645028.140000 \" style=\"stroke:black\"></path>\n<path d=\"M319.140000,645028.140000 L319.140000,645046.380000 L319.619980,645046.380000 L319.619980,645028.140000 \" style=\"stroke:black\"></path>\n<path d=\"M382.020000,645028.140000 L382.020000,645046.380000 L382.500010,645046.380000 L382.500010,645028.140000 \" style=\"stroke:black\"></path>\n<path d=\"M444.960000,645028.140000 L444.960000,645046.380000 L445.440010,645046.380000 L445.440010,645028.140000 \" style=\"stroke:black\"></path>\n<path d=\"M193.380000,645045.899990 L193.380000,645046.380000 L256.500000,645046.380000 L256.500000,645045.899990 \" style=\"stroke:black\"></path>\n<path d=\"M256.260000,645045.899990 L256.260000,645046.380000 L319.380000,645046.380000 L319.380000,645045.899990 \" style=\"stroke:black\"></path>\n<path d=\"M319.140000,645045.899990 L319.140000,645046.380000 L382.260000,645046.380000 L382.260000,645045.899990 \" style=\"stroke:black\"></path>\n<path d=\"M382.020000,645045.899990 L382.020000,645046.380000 L445.200000,645046.380000 L445.200000,645045.899990 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"345.177424\" y=\"644986.907784\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"407.4\" y=\"644986.907484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" 
textlength=\"53.65532\" x=\"198.3598\" y=\"645040.667484\">X3 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.59906\" x=\"262.1402\" y=\"645040.667484\">X2 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.5836\" x=\"325.4397\" y=\"645040.667484\">X1 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"220.3798\" y=\"644986.907784\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"284.095312\" y=\"644986.907784\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"11.997132\" x=\"385.9801\" y=\"645040.667184\">X0 &gt;&gt; COUNT</text></svg><h3>Figure 4-14.  PSRAW and PSRAD Instruction Operation Using a 64-bit Operand</h3><p>Note that only the first 64-bits of a 128-bit count operand are checked to compute the count. If the second source operand is a memory address, 128 bits are loaded.</p><p>The (V)PSRAW instruction shifts each of the words in the destination operand to the right by the number of bits specified in the count operand, and the (V)PSRAD instruction shifts each of the doublewords in the destination operand.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words or doublewords) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are filled with the initial value of the sign bit of the data element. If the value specified by the count operand is greater than 15 (for words) or 31 (for doublewords), each destination data element is filled with the initial value of the sign bit of the element. (Figure 4-14 gives an example of shifting words in a 64-bit operand.)"
             };

         case "VPSRLDQ":
         case "PSRLDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSRLDQ.html",
                 "html": "<p>Shifts the destination operand (first operand) to the right by the number of bytes specified in the count operand (second operand). The empty high-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source and destination operands are the same. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The source and destination operands are XMM registers. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register. The destination operand is a YMM register. The count operand applies to both the low and high 128-bit lanes.</p>",
                 "tooltip": "Shifts the destination operand (first operand) to the right by the number of bytes specified in the count operand (second operand). The empty high-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate."
             };

         case "PSRLQ":
         case "PSRLW":
         case "PSRLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSRLW:PSRLD:PSRLQ.html",
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-15 gives an example of shifting words in a 64-bit operand.</p><p>Note that only the first 64-bits of a 128-bit count operand are checked to compute the count.</p><svg height=\"132.2100075\" viewbox=\"112.380000 650696.999995 379.199990 88.140005\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"31.008684\" x=\"160.0799\" y=\"650715.827384\">Pre-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"170.58\" y=\"650724.107384\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"31.41744\" x=\"159.0\" y=\"650737.705488\">Shift Right</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"28.485888\" x=\"161.7\" y=\"650744.965388\">with Zero</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:6.960000pt\" textlength=\"29.386512\" x=\"161.1599\" y=\"650752.765488\">Extension</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"34.539834\" x=\"155.2201\" y=\"650768.687684\">Post-Shift</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.315378\" x=\"169.56\" y=\"650776.667384\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"194.7\" y=\"650708.16\"></rect>\n<rect height=\"18.0\" 
style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"257.58\" y=\"650708.16\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"383.4\" y=\"650708.16\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"257.04\" y=\"650761.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"319.92\" y=\"650761.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.88\" x=\"382.8\" y=\"650761.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"194.1\" y=\"650761.14\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"62.94\" x=\"320.46\" y=\"650708.16\"></rect>\n<path d=\"M194.460000,650707.920000 L194.460000,650726.160000 L194.939980,650726.160000 L194.939980,650707.920000 \" style=\"stroke:black\"></path>\n<path d=\"M257.340000,650707.920000 L257.340000,650726.160000 L257.820010,650726.160000 L257.820010,650707.920000 \" style=\"stroke:black\"></path>\n<path d=\"M320.220000,650707.920000 L320.220000,650726.160000 L320.700010,650726.160000 L320.700010,650707.920000 \" style=\"stroke:black\"></path>\n<path d=\"M383.160000,650707.920000 L383.160000,650726.160000 L383.640010,650726.160000 L383.640010,650707.920000 \" style=\"stroke:black\"></path>\n<path d=\"M194.700000,650707.920020 L194.700000,650708.400000 L257.820000,650708.400000 L257.820000,650707.920020 \" style=\"stroke:black\"></path>\n<path d=\"M257.580000,650707.920020 L257.580000,650708.400000 L320.700000,650708.400000 L320.700000,650707.920020 \" style=\"stroke:black\"></path>\n<path d=\"M320.460000,650707.920020 L320.460000,650708.400000 L383.640000,650708.400000 L383.640000,650707.920020 \" 
style=\"stroke:black\"></path>\n<path d=\"M383.400000,650707.920020 L383.400000,650708.400000 L446.520000,650708.400000 L446.520000,650707.920020 \" style=\"stroke:black\"></path>\n<path d=\"M257.340000,650708.160000 L257.340000,650726.400000 L257.820010,650726.400000 L257.820010,650708.160000 \" style=\"stroke:black\"></path>\n<path d=\"M320.220000,650708.160000 L320.220000,650726.400000 L320.700010,650726.400000 L320.700010,650708.160000 \" style=\"stroke:black\"></path>\n<path d=\"M383.160000,650708.160000 L383.160000,650726.400000 L383.640010,650726.400000 L383.640010,650708.160000 \" style=\"stroke:black\"></path>\n<path d=\"M446.040000,650708.160000 L446.040000,650726.400000 L446.519980,650726.400000 L446.519980,650708.160000 \" style=\"stroke:black\"></path>\n<path d=\"M194.460000,650725.919990 L194.460000,650726.400000 L257.580000,650726.400000 L257.580000,650725.919990 \" style=\"stroke:black\"></path>\n<path d=\"M257.340000,650725.919990 L257.340000,650726.400000 L320.460000,650726.400000 L320.460000,650725.919990 \" style=\"stroke:black\"></path>\n<path d=\"M320.220000,650725.919990 L320.220000,650726.400000 L383.400000,650726.400000 L383.400000,650725.919990 \" style=\"stroke:black\"></path>\n<path d=\"M383.160000,650725.919990 L383.160000,650726.400000 L446.280000,650726.400000 L446.280000,650725.919990 \" style=\"stroke:black\"></path>\n<path d=\"M276.360000,650726.280000 L276.360000,650741.220000 L276.839980,650741.220000 L276.839980,650726.280000 \" style=\"stroke:black\"></path>\n<path d=\"M212.640000,650726.340000 L212.640000,650741.280000 L213.120000,650741.280000 L213.120000,650726.340000 \" style=\"stroke:black\"></path>\n<path d=\"M394.800000,650726.400000 L394.800000,650741.340000 L395.279980,650741.340000 L395.279980,650726.400000 \" style=\"stroke:black\"></path>\n<path d=\"M333.120000,650726.820000 L333.120000,650741.760000 L333.600010,650741.760000 L333.600010,650726.820000 \" style=\"stroke:black\"></path>\n<path 
d=\"M276.600000,650740.739990 L276.600000,650741.220000 L302.580000,650741.220000 L302.580000,650740.739990 \" style=\"stroke:black\"></path>\n<path d=\"M212.880000,650740.800020 L212.880000,650741.280000 L238.860000,650741.280000 L238.860000,650740.800020 \" style=\"stroke:black\"></path>\n<path d=\"M395.040000,650740.860020 L395.040000,650741.340000 L421.020000,650741.340000 L421.020000,650740.860020 \" style=\"stroke:black\"></path>\n<path d=\"M302.100000,650740.980000 L302.100000,650753.940000 L302.579980,650753.940000 L302.579980,650740.980000 \" style=\"stroke:black\"></path>\n<path d=\"M238.380000,650741.040000 L238.380000,650753.940000 L238.860010,650753.940000 L238.860010,650741.040000 \" style=\"stroke:black\"></path>\n<path d=\"M420.540000,650741.100000 L420.540000,650754.000000 L421.019980,650754.000000 L421.019980,650741.100000 \" style=\"stroke:black\"></path>\n<path d=\"M333.360000,650741.279990 L333.360000,650741.760000 L359.280000,650741.760000 L359.280000,650741.279990 \" style=\"stroke:black\"></path>\n<path d=\"M358.800000,650741.520000 L358.800000,650754.420000 L359.279980,650754.420000 L359.279980,650741.520000 \" style=\"stroke:black\"></path>\n<path d=\"M238.440000,650754.300000 L239.940000,650753.700000 L240.900000,650753.280000 L239.100000,650759.760000 L238.620000,650761.440000 L238.140000,650759.760000 L236.580000,650754.360000 L236.280000,650753.340000 L237.240000,650753.700000 L237.540000,650754.060000 L239.100000,650759.460000 L238.140000,650759.760000 L238.140000,650759.460000 L239.640000,650754.060000 L240.600000,650754.360000 L240.300000,650754.660000 L238.800000,650755.260000 \" style=\"stroke:black\"></path>\n<path d=\"M302.160000,650754.240000 L303.660000,650753.700000 L304.620000,650753.280000 L304.320000,650754.360000 L302.820000,650759.700000 L302.340000,650761.320000 L301.860000,650759.700000 L300.300000,650754.360000 L300.000000,650753.340000 L300.960000,650753.700000 L301.260000,650754.060000 L302.820000,650759.400000 
L301.860000,650759.700000 L301.860000,650759.400000 L303.360000,650754.060000 L304.320000,650754.360000 L304.020000,650754.660000 L302.520000,650755.200000 \" style=\"stroke:black\"></path>\n<path d=\"M420.600000,650754.300000 L422.160000,650753.760000 L423.120000,650753.400000 L422.820000,650754.420000 L421.260000,650759.820000 L420.780000,650761.500000 L420.300000,650759.820000 L418.740000,650754.420000 L418.440000,650753.400000 L419.400000,650753.760000 L419.700000,650754.120000 L421.260000,650759.520000 L420.300000,650759.820000 L420.300000,650759.520000 L421.860000,650754.120000 L422.820000,650754.420000 L422.520000,650754.720000 L420.960000,650755.260000 \" style=\"stroke:black\"></path>\n<path d=\"M237.240000,650753.700000 L238.800000,650754.300000 L238.800000,650755.260000 L238.620000,650755.320000 L238.440000,650755.260000 L236.880000,650754.660000 \" style=\"stroke:black\"></path>\n<path d=\"M238.380000,650753.700000 L238.380000,650754.780000 L238.860010,650754.780000 L238.860010,650753.700000 \" style=\"stroke:black\"></path>\n<path d=\"M300.960000,650753.700000 L302.520000,650754.240000 L302.520000,650755.200000 L302.340000,650755.260000 L302.160000,650755.200000 L300.600000,650754.660000 \" style=\"stroke:black\"></path>\n<path d=\"M302.100000,650753.700000 L302.100000,650754.720000 L302.579980,650754.720000 L302.579980,650753.700000 \" style=\"stroke:black\"></path>\n<path d=\"M419.400000,650753.760000 L420.960000,650754.300000 L420.960000,650755.260000 L420.780000,650755.320000 L420.600000,650755.260000 L419.040000,650754.720000 \" style=\"stroke:black\"></path>\n<path d=\"M420.540000,650753.760000 L420.540000,650754.780000 L421.019980,650754.780000 L421.019980,650753.760000 \" style=\"stroke:black\"></path>\n<path d=\"M358.860000,650754.780000 L360.420000,650754.180000 L361.380000,650753.820000 L361.080000,650754.840000 L359.520000,650760.240000 L359.040000,650761.980000 L358.560000,650760.240000 L357.060000,650754.840000 L356.760000,650753.820000 
L357.720000,650754.180000 L358.020000,650754.540000 L359.520000,650759.940000 L358.560000,650760.240000 L358.560000,650759.940000 L360.120000,650754.540000 L361.080000,650754.840000 L360.780000,650755.140000 L359.220000,650755.740000 \" style=\"stroke:black\"></path>\n<path d=\"M238.620000,650754.780000 L240.120000,650754.180000 L238.620000,650759.580000 L237.060000,650754.180000 \" style=\"stroke:black\"></path>\n<path d=\"M302.340000,650754.720000 L303.840000,650754.180000 L302.340000,650759.520000 L300.780000,650754.180000 \" style=\"stroke:black\"></path>\n<path d=\"M357.720000,650754.180000 L359.220000,650754.780000 L359.220000,650755.740000 L359.040000,650755.800000 L358.860000,650755.740000 L357.360000,650755.140000 \" style=\"stroke:black\"></path>\n<path d=\"M358.800000,650754.180000 L358.800000,650755.260000 L359.279980,650755.260000 L359.279980,650754.180000 \" style=\"stroke:black\"></path>\n<path d=\"M420.780000,650754.780000 L422.340000,650754.240000 L420.780000,650759.640000 L419.220000,650754.240000 \" style=\"stroke:black\"></path>\n<path d=\"M359.040000,650755.260000 L360.600000,650754.660000 L359.040000,650760.060000 L357.540000,650754.660000 \" style=\"stroke:black\"></path>\n<path d=\"M193.860000,650760.900000 L193.860000,650779.140000 L194.340000,650779.140000 L194.340000,650760.900000 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,650760.900000 L256.800000,650779.140000 L257.280010,650779.140000 L257.280010,650760.900000 \" style=\"stroke:black\"></path>\n<path d=\"M319.680000,650760.900000 L319.680000,650779.140000 L320.160010,650779.140000 L320.160010,650760.900000 \" style=\"stroke:black\"></path>\n<path d=\"M382.560000,650760.900000 L382.560000,650779.140000 L383.040010,650779.140000 L383.040010,650760.900000 \" style=\"stroke:black\"></path>\n<path d=\"M194.100000,650760.900020 L194.100000,650761.380000 L257.280000,650761.380000 L257.280000,650760.900020 \" style=\"stroke:black\"></path>\n<path d=\"M257.040000,650760.900020 
L257.040000,650761.380000 L320.160000,650761.380000 L320.160000,650760.900020 \" style=\"stroke:black\"></path>\n<path d=\"M319.920000,650760.900020 L319.920000,650761.380000 L383.040000,650761.380000 L383.040000,650760.900020 \" style=\"stroke:black\"></path>\n<path d=\"M382.800000,650760.900020 L382.800000,650761.380000 L445.920000,650761.380000 L445.920000,650760.900020 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,650761.140000 L256.800000,650779.380000 L257.280010,650779.380000 L257.280010,650761.140000 \" style=\"stroke:black\"></path>\n<path d=\"M319.680000,650761.140000 L319.680000,650779.380000 L320.160010,650779.380000 L320.160010,650761.140000 \" style=\"stroke:black\"></path>\n<path d=\"M382.560000,650761.140000 L382.560000,650779.380000 L383.040010,650779.380000 L383.040010,650761.140000 \" style=\"stroke:black\"></path>\n<path d=\"M445.440000,650761.140000 L445.440000,650779.380000 L445.920010,650779.380000 L445.920010,650761.140000 \" style=\"stroke:black\"></path>\n<path d=\"M193.860000,650778.900000 L193.860000,650779.380000 L257.040000,650779.380000 L257.040000,650778.900000 \" style=\"stroke:black\"></path>\n<path d=\"M256.800000,650778.900000 L256.800000,650779.380000 L319.920000,650779.380000 L319.920000,650778.900000 \" style=\"stroke:black\"></path>\n<path d=\"M319.680000,650778.900000 L319.680000,650779.380000 L382.800000,650779.380000 L382.800000,650778.900000 \" style=\"stroke:black\"></path>\n<path d=\"M382.560000,650778.900000 L382.560000,650779.380000 L445.680000,650779.380000 L445.680000,650778.900000 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"220.9198\" y=\"650719.907684\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"284.635312\" y=\"650719.907684\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"407.94\" 
y=\"650719.907484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.59916\" x=\"262.6801\" y=\"650773.667384\">X2 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.5834\" x=\"325.9199\" y=\"650773.667384\">X1 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"11.997132\" x=\"386.5201\" y=\"650773.667184\">X0 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"53.65532\" x=\"198.8998\" y=\"650773.667384\">X3 &gt;&gt; COUNT</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"345.653584\" y=\"650719.907684\">X1</text></svg><h3>Figure 4-15.  PSRLW, PSRLD, and PSRLQ Instruction Operation Using 64-bit Operand</h3><p>The (V)PSRLW instruction shifts each of the words in the destination operand to the right by the number of bits specified in the count operand; the (V)PSRLD instruction shifts each of the doublewords in the destination operand; and the PSRLQ instruction shifts the quadword (or quadwords) in the destination operand.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-15 gives an example of shifting words in a 64-bit operand."
             };

         case "VPSUBB":
         case "PSUBD":
         case "PSUBB":
         case "VPSUBD":
         case "PSUBW":
         case "VPSUBW":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSUBB:PSUBW:PSUBD.html",
                 "html": "<p>Performs a SIMD subtract of the packed integers of the source operand (second operand) from the packed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.</p><p>The (V)PSUBB instruction subtracts packed byte integers. When an individual result is too large or too small to be represented in a byte, the result is wrapped around and the low 8 bits are written to the destination element.</p><p>The (V)PSUBW instruction subtracts packed word integers. When an individual result is too large or too small to be represented in a word, the result is wrapped around and the low 16 bits are written to the destination element.</p><p>The (V)PSUBD instruction subtracts packed doubleword integers. When an individual result is too large or too small to be represented in a doubleword, the result is wrapped around and the low 32 bits are written to the destination element.</p><p>Note that the (V)PSUBB, (V)PSUBW, and (V)PSUBD instructions can operate on either unsigned or signed (two's complement notation) packed integers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of values upon which it operates.</p>",
                 "tooltip": "Performs a SIMD subtract of the packed integers of the source operand (second operand) from the packed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs."
             };

         case "VPSUBQ":
         case "PSUBQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSUBQ.html",
                 "html": "<p>Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand. When packed quadword operands are used, a SIMD subtract is performed. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).</p><p>Note that the (V)PSUBQ instruction can operate on either unsigned or signed (two\u2019s complement notation) inte-gers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of the values upon which it operates.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be a quadword integer stored in an MMX technology register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The second source operand is an XMM register or a 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM desti-nation register remain unchanged.</p>",
                 "tooltip": "Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand. When packed quadword operands are used, a SIMD subtract is performed. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored)."
             };

         case "VPSUBSB":
         case "PSUBSW":
         case "VPSUBSW":
         case "PSUBSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSUBSB:PSUBSW.html",
                 "html": "<p>Performs a SIMD subtract of the packed signed integers of the source operand (second operand) from the packed signed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following para-graphs.</p><p>The (V)PSUBSB instruction subtracts packed signed byte integers. When an individual byte result is beyond the range of a signed byte integer (that is, greater than 7FH or less than 80H), the saturated value of 7FH or 80H, respectively, is written to the destination operand.</p><p>The (V)PSUBSW instruction subtracts packed signed word integers. When an individual word result is beyond the range of a signed word integer (that is, greater than 7FFFH or less than 8000H), the saturated value of 7FFFH or 8000H, respectively, is written to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location.</p>",
                 "tooltip": "Performs a SIMD subtract of the packed signed integers of the source operand (second operand) from the packed signed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following para-graphs."
             };

         case "PSUBUSW":
         case "VPSUBUSW":
         case "VPSUBUSB":
         case "PSUBUSB":
             return {
                 "url": "http://www.felixcloutier.com/x86/PSUBUSB:PSUBUSW.html",
                 "html": "<p>Performs a SIMD subtract of the packed unsigned integers of the source operand (second operand) from the packed unsigned integers of the destination operand (first operand), and stores the packed unsigned integer results in the destination operand. See Figure 9-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.</p><p>These instructions can operate on either 64-bit or 128-bit operands.</p><p>The (V)PSUBUSB instruction subtracts packed unsigned byte integers. When an individual byte result is less than zero, the saturated value of 00H is written to the destination operand.</p><p>The (V)PSUBUSW instruction subtracts packed unsigned word integers. When an individual word result is less than zero, the saturated value of 0000H is written to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD subtract of the packed unsigned integers of the source operand (second operand) from the packed unsigned integers of the destination operand (first operand), and stores the packed unsigned integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs."
             };

         case "PTEST":
         case "VPTEST":
             return {
                 "url": "http://www.felixcloutier.com/x86/PTEST.html",
                 "html": "<p>PTEST and VPTEST set the ZF flag if all bits in the result are 0 of the bitwise AND of the first source operand (first operand) and the second source operand (second operand). VPTEST sets the CF flag if all bits in the result are 0 of the bitwise AND of the second source operand (second operand) and the logical NOT of the destination operand.</p><p>The first source register is specified by the ModR/M <em>reg</em> field.</p><p>128-bit versions: The first source register is an XMM register. The second source register can be an XMM register or a 128-bit memory location. The destination register is not modified.</p><p>VEX.256 encoded version: The first source register is a YMM register. The second source register can be a YMM register or a 256-bit memory location. The destination register is not modified.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "PTEST and VPTEST set the ZF flag if all bits in the result are 0 of the bitwise AND of the first source operand (first operand) and the second source operand (second operand). VPTEST sets the CF flag if all bits in the result are 0 of the bitwise AND of the second source operand (second operand) and the logical NOT of the destination operand."
             };

         case "VPUNPCKHQDQ":
         case "PUNPCKHQDQ":
         case "VPUNPCKHWD":
         case "PUNPCKHWD":
         case "PUNPCKHDQ":
         case "VPUNPCKHBW":
         case "PUNPCKHBW":
         case "VPUNPCKHDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PUNPCKHBW:PUNPCKHWD:PUNPCKHDQ:PUNPCKHQDQ.html",
                 "html": "<p>Unpacks and interleaves the high-order data elements (bytes, words, doublewords, or quadwords) of the destina-tion operand (first operand) and source operand (second operand) into the destination operand. Figure 4-16 shows the unpack operation for bytes in 64-bit operands. The low-order data elements are ignored.</p><svg height=\"145.89\" viewbox=\"111.840000 667000.980010 379.199990 97.260000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.864932\" x=\"126.3602\" y=\"667021.907684\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"454.26\" y=\"667022.267584\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"140.3404\" y=\"667084.787684\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"307.98\" y=\"667011.0\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"145.68\" y=\"667011.06\"></rect>\n<path d=\"M316.980000,667029.000000 L317.220000,667028.940000 L317.040000,667028.460000 L316.800000,667028.520000 \" style=\"stroke:black\"></path>\n<path d=\"M353.400000,667029.000000 L353.580000,667028.880000 L353.340000,667028.460000 L353.160000,667028.580000 \" style=\"stroke:black\"></path>\n<path d=\"M370.560000,667029.000000 L370.740000,667028.880000 L370.500000,667028.460000 L370.320000,667028.580000 \" style=\"stroke:black\"></path>\n<path d=\"M316.980000,667029.000000 L316.800000,667028.520000 L198.120000,667070.940000 L198.300000,667071.420000 \" style=\"stroke:black\"></path>\n<path d=\"M334.620000,667029.000000 L334.800000,667028.940000 L334.620000,667028.520000 L334.440000,667028.580000 \" style=\"stroke:black\"></path>\n<path d=\"M334.620000,667029.000000 L334.440000,667028.580000 L234.960000,667070.580000 
L235.140000,667071.000000 \" style=\"stroke:black\"></path>\n<path d=\"M353.400000,667029.000000 L353.160000,667028.580000 L269.520000,667070.160000 L269.760000,667070.580000 \" style=\"stroke:black\"></path>\n<path d=\"M370.560000,667029.000000 L370.320000,667028.580000 L304.500000,667069.500000 L304.740000,667069.920000 \" style=\"stroke:black\"></path>\n<path d=\"M171.840000,667029.180000 L171.720000,667029.000000 L171.360000,667029.300000 L171.480000,667029.480000 \" style=\"stroke:black\"></path>\n<path d=\"M190.020000,667029.120000 L189.840000,667029.000000 L189.540000,667029.360000 L189.720000,667029.480000 \" style=\"stroke:black\"></path>\n<path d=\"M155.820000,667029.240000 L155.760000,667029.060000 L155.340000,667029.240000 L155.400000,667029.420000 \" style=\"stroke:black\"></path>\n<path d=\"M190.020000,667029.120000 L189.720000,667029.480000 L238.920000,667069.140000 L239.220000,667068.780000 \" style=\"stroke:black\"></path>\n<path d=\"M171.840000,667029.180000 L171.480000,667029.480000 L203.520000,667068.060000 L203.880000,667067.760000 \" style=\"stroke:black\"></path>\n<path d=\"M155.820000,667029.240000 L155.400000,667029.420000 L170.220000,667066.740000 L170.640000,667066.560000 \" style=\"stroke:black\"></path>\n<path d=\"M209.280000,667029.660000 L209.100000,667029.540000 L208.860000,667029.960000 L209.040000,667030.080000 \" style=\"stroke:black\"></path>\n<path d=\"M209.280000,667029.660000 L209.040000,667030.080000 L273.240000,667069.980000 L273.480000,667069.560000 \" style=\"stroke:black\"></path>\n<path d=\"M170.460000,667067.220000 L171.660000,667066.140000 L172.440000,667065.360000 L172.560000,667066.500000 L173.160000,667072.080000 L173.280000,667073.820000 L172.260000,667072.440000 L168.840000,667068.000000 L168.180000,667067.160000 L169.200000,667067.100000 L169.620000,667067.340000 L173.040000,667071.780000 L172.260000,667072.440000 L172.140000,667072.140000 L171.540000,667066.560000 L172.560000,667066.500000 
L172.380000,667066.860000 L171.180000,667067.940000 \" style=\"stroke:black\"></path>\n<path d=\"M203.940000,667068.420000 L204.780000,667067.040000 L205.200000,667066.080000 L205.680000,667067.100000 L207.960000,667072.200000 L208.680000,667073.820000 L207.240000,667072.800000 L202.620000,667069.620000 L201.720000,667069.080000 L202.680000,667068.720000 L203.160000,667068.780000 L207.780000,667071.960000 L207.240000,667072.800000 L207.060000,667072.620000 L204.780000,667067.520000 L205.680000,667067.100000 L205.620000,667067.520000 L204.780000,667068.900000 \" style=\"stroke:black\"></path>\n<path d=\"M170.820000,667067.580000 L172.020000,667066.500000 L172.620000,667072.080000 L169.200000,667067.640000 \" style=\"stroke:black\"></path>\n<path d=\"M170.640000,667066.560000 L170.160000,667066.680000 L170.580000,667067.640000 L171.060000,667067.520000 \" style=\"stroke:black\"></path>\n<path d=\"M170.640000,667066.560000 L170.700000,667066.740000 L170.280000,667066.920000 L170.220000,667066.740000 \" style=\"stroke:black\"></path>\n<path d=\"M239.400000,667069.440000 L239.940000,667067.880000 L240.180000,667066.860000 L240.840000,667067.820000 L244.080000,667072.380000 L245.100000,667073.760000 L243.480000,667073.100000 L238.320000,667070.940000 L237.360000,667070.520000 L238.260000,667070.040000 L238.680000,667070.040000 L243.840000,667072.200000 L243.480000,667073.100000 L243.240000,667072.920000 L240.000000,667068.360000 L240.840000,667067.820000 L240.900000,667068.240000 L240.360000,667069.800000 \" style=\"stroke:black\"></path>\n<path d=\"M169.200000,667067.100000 L170.820000,667067.040000 L171.180000,667067.940000 L171.000000,667068.060000 L170.820000,667068.120000 L169.200000,667068.180000 \" style=\"stroke:black\"></path>\n<path d=\"M204.360000,667068.660000 L205.200000,667067.280000 L207.480000,667072.380000 L202.860000,667069.200000 \" style=\"stroke:black\"></path>\n<path d=\"M273.780000,667070.160000 L274.080000,667068.540000 L274.200000,667067.520000 
L274.980000,667068.360000 L278.760000,667072.500000 L279.840000,667073.760000 L278.220000,667073.280000 L272.820000,667071.780000 L271.800000,667071.540000 L272.640000,667070.880000 L273.120000,667070.820000 L278.520000,667072.320000 L278.220000,667073.280000 L277.980000,667073.160000 L274.200000,667069.020000 L274.980000,667068.360000 L275.040000,667068.780000 L274.740000,667070.400000 \" style=\"stroke:black\"></path>\n<path d=\"M304.020000,667069.860000 L305.280000,667070.880000 L306.120000,667071.540000 L305.100000,667071.780000 L299.760000,667073.280000 L298.080000,667073.760000 L299.220000,667072.500000 L302.940000,667068.360000 L303.660000,667067.580000 L303.840000,667068.540000 L303.720000,667069.020000 L300.000000,667073.160000 L299.220000,667072.500000 L299.460000,667072.320000 L304.800000,667070.820000 L305.100000,667071.780000 L304.680000,667071.720000 L303.420000,667070.700000 \" style=\"stroke:black\"></path>\n<path d=\"M203.880000,667067.700000 L203.520000,667068.060000 L204.180000,667068.840000 L204.540000,667068.480000 \" style=\"stroke:black\"></path>\n<path d=\"M203.880000,667067.760000 L204.000000,667067.940000 L203.640000,667068.240000 L203.520000,667068.060000 \" style=\"stroke:black\"></path>\n<path d=\"M239.880000,667069.620000 L240.420000,667068.060000 L243.660000,667072.620000 L238.500000,667070.460000 \" style=\"stroke:black\"></path>\n<path d=\"M269.040000,667070.460000 L270.240000,667071.600000 L270.960000,667072.260000 L269.940000,667072.440000 L264.420000,667073.460000 L262.680000,667073.820000 L264.000000,667072.620000 L268.140000,667068.840000 L268.920000,667068.120000 L268.980000,667069.140000 L268.800000,667069.560000 L264.660000,667073.340000 L264.000000,667072.620000 L264.240000,667072.500000 L269.760000,667071.480000 L269.940000,667072.440000 L269.520000,667072.320000 L268.320000,667071.180000 \" style=\"stroke:black\"></path>\n<path d=\"M202.680000,667068.720000 L204.180000,667068.180000 L204.780000,667068.900000 
L204.720000,667069.080000 L204.540000,667069.140000 L203.040000,667069.680000 \" style=\"stroke:black\"></path>\n<path d=\"M234.360000,667070.820000 L235.500000,667072.020000 L236.220000,667072.800000 L235.140000,667072.920000 L229.560000,667073.580000 L227.820000,667073.760000 L229.200000,667072.680000 L233.580000,667069.200000 L234.420000,667068.540000 L234.480000,667069.560000 L234.240000,667069.980000 L229.860000,667073.460000 L229.200000,667072.680000 L229.500000,667072.560000 L235.080000,667071.900000 L235.140000,667072.920000 L234.780000,667072.740000 L233.640000,667071.540000 \" style=\"stroke:black\"></path>\n<path d=\"M303.840000,667068.540000 L304.200000,667070.160000 L303.420000,667070.700000 L303.240000,667070.580000 L303.240000,667070.400000 L302.880000,667068.780000 \" style=\"stroke:black\"></path>\n<path d=\"M274.260000,667070.280000 L274.560000,667068.660000 L278.340000,667072.800000 L272.940000,667071.300000 \" style=\"stroke:black\"></path>\n<path d=\"M303.720000,667070.280000 L304.980000,667071.300000 L299.640000,667072.800000 L303.360000,667068.660000 \" style=\"stroke:black\"></path>\n<path d=\"M239.220000,667068.780000 L238.860000,667069.140000 L239.700000,667069.800000 L240.060000,667069.440000 \" style=\"stroke:black\"></path>\n<path d=\"M239.220000,667068.780000 L239.400000,667068.900000 L239.100000,667069.260000 L238.920000,667069.140000 \" style=\"stroke:black\"></path>\n<path d=\"M197.580000,667071.180000 L198.660000,667072.440000 L199.320000,667073.280000 L198.240000,667073.340000 L192.660000,667073.700000 L190.920000,667073.760000 L192.360000,667072.740000 L196.860000,667069.500000 L197.760000,667068.900000 L197.760000,667069.920000 L197.520000,667070.340000 L192.960000,667073.580000 L192.360000,667072.740000 L192.600000,667072.680000 L198.180000,667072.320000 L198.240000,667073.340000 L197.820000,667073.160000 L196.740000,667071.900000 \" style=\"stroke:black\"></path>\n<path d=\"M268.980000,667069.140000 L269.220000,667070.760000 
L268.320000,667071.180000 L268.200000,667071.060000 L268.200000,667070.880000 L268.020000,667069.260000 \" style=\"stroke:black\"></path>\n<path d=\"M238.260000,667070.040000 L239.640000,667069.200000 L240.360000,667069.800000 L240.300000,667069.980000 L240.120000,667070.040000 L238.740000,667070.880000 \" style=\"stroke:black\"></path>\n<path d=\"M268.680000,667070.820000 L269.880000,667071.960000 L264.360000,667072.980000 L268.500000,667069.200000 \" style=\"stroke:black\"></path>\n<path d=\"M304.740000,667069.860000 L304.500000,667069.500000 L303.600000,667070.100000 L303.840000,667070.460000 \" style=\"stroke:black\"></path>\n<path d=\"M304.740000,667069.920000 L304.560000,667070.040000 L304.320000,667069.620000 L304.500000,667069.500000 \" style=\"stroke:black\"></path>\n<path d=\"M234.000000,667071.180000 L235.140000,667072.380000 L229.560000,667073.040000 L233.940000,667069.560000 \" style=\"stroke:black\"></path>\n<path d=\"M234.480000,667069.560000 L234.540000,667071.180000 L233.640000,667071.540000 L233.520000,667071.360000 L233.460000,667071.180000 L233.400000,667069.560000 \" style=\"stroke:black\"></path>\n<path d=\"M273.480000,667069.560000 L273.240000,667069.920000 L274.140000,667070.460000 L274.380000,667070.100000 \" style=\"stroke:black\"></path>\n<path d=\"M273.480000,667069.560000 L273.660000,667069.680000 L273.420000,667070.100000 L273.240000,667069.980000 \" style=\"stroke:black\"></path>\n<path d=\"M272.640000,667070.880000 L273.960000,667069.860000 L274.740000,667070.400000 L274.740000,667070.580000 L274.560000,667070.700000 L273.240000,667071.720000 \" style=\"stroke:black\"></path>\n<path d=\"M197.160000,667071.540000 L198.240000,667072.800000 L192.660000,667073.160000 L197.220000,667069.920000 \" style=\"stroke:black\"></path>\n<path d=\"M197.760000,667069.920000 L197.700000,667071.540000 L196.740000,667071.900000 L196.680000,667071.720000 L196.620000,667071.540000 L196.680000,667069.920000 \" style=\"stroke:black\"></path>\n<path 
d=\"M269.760000,667070.580000 L269.520000,667070.100000 L268.560000,667070.580000 L268.800000,667071.060000 \" style=\"stroke:black\"></path>\n<path d=\"M269.760000,667070.580000 L269.580000,667070.700000 L269.340000,667070.280000 L269.520000,667070.160000 \" style=\"stroke:black\"></path>\n<path d=\"M235.140000,667071.000000 L234.900000,667070.520000 L233.880000,667070.940000 L234.120000,667071.420000 \" style=\"stroke:black\"></path>\n<path d=\"M235.140000,667071.000000 L234.960000,667071.060000 L234.780000,667070.640000 L234.960000,667070.580000 \" style=\"stroke:black\"></path>\n<path d=\"M198.240000,667071.420000 L198.120000,667070.940000 L197.100000,667071.300000 L197.220000,667071.780000 \" style=\"stroke:black\"></path>\n<path d=\"M198.300000,667071.420000 L198.060000,667071.480000 L197.880000,667071.000000 L198.120000,667070.940000 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"312.7802\" y=\"667022.207684\">X7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"329.876552\" y=\"667022.207684\">X6</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"349.135484\" y=\"667022.207684\">X5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"366.1201\" y=\"667022.207684\">X4</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"384.0608\" y=\"667022.267584\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"401.157152\" y=\"667022.267584\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"420.416084\" y=\"667022.267584\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"437.4007\" 
y=\"667022.267584\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"149.8805\" y=\"667022.867684\">Y7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"166.976852\" y=\"667022.867684\">Y6</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"186.235784\" y=\"667022.867684\">Y5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"203.2204\" y=\"667022.867684\">Y4</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"221.7005\" y=\"667023.467784\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"238.796852\" y=\"667023.467784\">Y2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"258.055784\" y=\"667023.467784\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"275.0404\" y=\"667023.467784\">Y0</text></svg><svg height=\"27.1800075001\" viewbox=\"163.860000 667073.580005 144.000005 18.120005\" width=\"216.0000075\">\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"163.86\" y=\"667073.7\"></rect>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"168.1204\" y=\"667084.547484\">Y7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"185.8803\" y=\"667084.547484\">X7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"203.459032\" y=\"667084.547484\">Y6</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"222.240372\" y=\"667084.547484\">X6</text>\n<text 
lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"239.340304\" y=\"667084.547484\">Y5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"257.579004\" y=\"667084.547484\">X5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"274.56\" y=\"667084.547484\">Y4</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"293.88\" y=\"667084.547484\">X4</text></svg><h3>Figure 4-16.  PUNPCKHBW Instruction Operation Using 64-bit Operands</h3>",
                 "tooltip": "Unpacks and interleaves the high-order data elements (bytes, words, doublewords, or quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. Figure 4-16 shows the unpack operation for bytes in 64-bit operands. The low-order data elements are ignored."
             };

         case "PUNPCKLBW":
         case "VPUNPCKLQDQ":
         case "VPUNPCKLDQ":
         case "VPUNPCKLWD":
         case "PUNPCKLWD":
         case "PUNPCKLQDQ":
         case "VPUNPCKLBW":
         case "PUNPCKLDQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/PUNPCKLBW:PUNPCKLWD:PUNPCKLDQ:PUNPCKLQDQ.html",
                 "html": "<p>Unpacks and interleaves the low-order data elements (bytes, words, doublewords, and quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. (Figure 4-18 shows the unpack operation for bytes in 64-bit operands.) The high-order data elements are ignored.</p><svg height=\"149.04\" viewbox=\"112.380000 672538.020010 379.199990 99.360000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.920792\" x=\"128.5213\" y=\"672560.507684\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"456.3613\" y=\"672560.867584\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"255.3012\" y=\"672623.387684\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"278.88\" y=\"672612.3\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"310.08\" y=\"672549.6\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.0\" x=\"147.84\" y=\"672549.66\"></rect>\n<path d=\"M263.880000,672567.600000 L263.700000,672567.540000 L263.520000,672567.960000 L263.700000,672568.020000 \" style=\"stroke:black\"></path>\n<path d=\"M263.880000,672567.600000 L263.700000,672568.020000 L352.140000,672609.240000 L352.320000,672608.820000 \" style=\"stroke:black\"></path>\n<path d=\"M426.840000,672568.080000 L427.020000,672567.960000 L426.720000,672567.600000 L426.540000,672567.720000 \" style=\"stroke:black\"></path>\n<path d=\"M445.020000,672568.080000 L445.140000,672567.900000 L444.780000,672567.600000 L444.660000,672567.780000 \" style=\"stroke:black\"></path>\n<path d=\"M281.580000,672567.720000 L281.400000,672567.660000 L281.220000,672568.080000 
L281.400000,672568.140000 \" style=\"stroke:black\"></path>\n<path d=\"M281.580000,672567.720000 L281.400000,672568.140000 L388.860000,672609.720000 L389.040000,672609.300000 \" style=\"stroke:black\"></path>\n<path d=\"M426.840000,672568.080000 L426.540000,672567.720000 L383.460000,672606.900000 L383.760000,672607.260000 \" style=\"stroke:black\"></path>\n<path d=\"M445.020000,672568.080000 L444.660000,672567.780000 L417.060000,672605.880000 L417.420000,672606.180000 \" style=\"stroke:black\"></path>\n<path d=\"M246.780000,672568.140000 L246.600000,672568.020000 L246.360000,672568.440000 L246.540000,672568.560000 \" style=\"stroke:black\"></path>\n<path d=\"M229.680000,672568.260000 L229.500000,672568.140000 L229.200000,672568.500000 L229.380000,672568.620000 \" style=\"stroke:black\"></path>\n<path d=\"M246.780000,672568.140000 L246.540000,672568.560000 L316.500000,672609.180000 L316.740000,672608.760000 \" style=\"stroke:black\"></path>\n<path d=\"M389.280000,672568.680000 L389.460000,672568.560000 L389.220000,672568.140000 L389.040000,672568.260000 \" style=\"stroke:black\"></path>\n<path d=\"M408.000000,672568.680000 L408.180000,672568.560000 L407.940000,672568.140000 L407.760000,672568.260000 \" style=\"stroke:black\"></path>\n<path d=\"M229.680000,672568.260000 L229.380000,672568.620000 L281.100000,672607.800000 L281.400000,672607.440000 \" style=\"stroke:black\"></path>\n<path d=\"M389.280000,672568.680000 L389.040000,672568.260000 L312.240000,672608.520000 L312.480000,672608.940000 \" style=\"stroke:black\"></path>\n<path d=\"M408.000000,672568.680000 L407.760000,672568.260000 L347.760000,672607.860000 L348.000000,672608.280000 \" style=\"stroke:black\"></path>\n<path d=\"M416.760000,672606.420000 L418.320000,672606.840000 L419.340000,672607.080000 L418.500000,672607.740000 L414.060000,672611.220000 L412.680000,672612.300000 L413.280000,672610.680000 L415.200000,672605.400000 L415.560000,672604.380000 L416.100000,672605.220000 L416.160000,672605.700000 
L414.240000,672610.980000 L413.280000,672610.680000 L413.460000,672610.440000 L417.900000,672606.960000 L418.500000,672607.740000 L418.080000,672607.800000 L416.520000,672607.380000 \" style=\"stroke:black\"></path>\n<path d=\"M383.040000,672607.320000 L385.440000,672608.520000 L384.420000,672609.000000 L379.440000,672611.520000 L377.940000,672612.240000 L378.780000,672610.800000 L381.720000,672606.000000 L382.320000,672605.100000 L382.680000,672606.060000 L382.620000,672606.540000 L379.680000,672611.340000 L378.780000,672610.800000 L379.020000,672610.620000 L384.000000,672608.100000 L384.420000,672609.000000 L384.000000,672609.000000 L382.560000,672608.280000 \" style=\"stroke:black\"></path>\n<path d=\"M416.100000,672605.220000 L417.060000,672606.600000 L416.520000,672607.380000 L416.340000,672607.320000 L416.220000,672607.200000 L415.260000,672605.820000 \" style=\"stroke:black\"></path>\n<path d=\"M281.580000,672608.100000 L282.060000,672606.540000 L282.360000,672605.520000 L282.960000,672606.360000 L286.320000,672610.860000 L287.400000,672612.300000 L285.780000,672611.640000 L280.560000,672609.600000 L279.540000,672609.240000 L280.380000,672608.700000 L280.860000,672608.640000 L286.080000,672610.680000 L285.780000,672611.640000 L285.540000,672611.460000 L282.180000,672606.960000 L282.960000,672606.360000 L283.020000,672606.780000 L282.540000,672608.340000 \" style=\"stroke:black\"></path>\n<path d=\"M416.640000,672606.900000 L418.200000,672607.320000 L413.760000,672610.800000 L415.680000,672605.520000 \" style=\"stroke:black\"></path>\n<path d=\"M347.280000,672608.160000 L348.600000,672609.180000 L349.440000,672609.780000 L348.420000,672610.080000 L343.080000,672611.700000 L341.340000,672612.300000 L342.540000,672610.920000 L346.200000,672606.720000 L346.920000,672605.880000 L347.100000,672606.900000 L346.980000,672607.380000 L343.320000,672611.580000 L342.540000,672610.920000 L342.780000,672610.740000 L348.120000,672609.120000 L348.420000,672610.080000 
L348.000000,672610.020000 L346.680000,672609.000000 \" style=\"stroke:black\"></path>\n<path d=\"M417.420000,672606.120000 L417.060000,672605.880000 L416.460000,672606.780000 L416.820000,672607.020000 \" style=\"stroke:black\"></path>\n<path d=\"M417.420000,672606.180000 L417.300000,672606.360000 L416.940000,672606.060000 L417.060000,672605.880000 \" style=\"stroke:black\"></path>\n<path d=\"M382.680000,672606.060000 L383.280000,672607.620000 L382.560000,672608.280000 L382.380000,672608.160000 L382.320000,672607.980000 L381.720000,672606.420000 \" style=\"stroke:black\"></path>\n<path d=\"M382.800000,672607.800000 L384.240000,672608.520000 L379.260000,672611.040000 L382.200000,672606.240000 \" style=\"stroke:black\"></path>\n<path d=\"M311.820000,672608.820000 L313.020000,672609.900000 L313.800000,672610.560000 L312.720000,672610.740000 L307.260000,672611.880000 L305.580000,672612.240000 L306.840000,672611.040000 L310.860000,672607.200000 L311.580000,672606.480000 L311.700000,672607.500000 L311.520000,672607.920000 L307.500000,672611.760000 L306.840000,672611.040000 L307.080000,672610.920000 L312.540000,672609.780000 L312.720000,672610.740000 L312.300000,672610.620000 L311.100000,672609.540000 \" style=\"stroke:black\"></path>\n<path d=\"M282.060000,672608.220000 L282.540000,672606.660000 L285.900000,672611.160000 L280.680000,672609.120000 \" style=\"stroke:black\"></path>\n<path d=\"M352.620000,672609.360000 L352.740000,672607.740000 L352.860000,672606.720000 L353.640000,672607.440000 L357.840000,672611.100000 L359.100000,672612.240000 L357.420000,672612.000000 L351.900000,672611.160000 L350.880000,672610.980000 L351.600000,672610.260000 L352.080000,672610.140000 L357.600000,672610.980000 L357.420000,672612.000000 L357.180000,672611.820000 L352.980000,672608.160000 L353.640000,672607.440000 L353.820000,672607.860000 L353.700000,672609.480000 \" style=\"stroke:black\"></path>\n<path d=\"M317.040000,672609.420000 L317.340000,672607.800000 L317.460000,672606.780000 
L318.240000,672607.560000 L322.080000,672611.580000 L323.280000,672612.840000 L321.600000,672612.360000 L316.200000,672610.980000 L315.120000,672610.740000 L315.960000,672610.080000 L316.380000,672610.020000 L321.780000,672611.400000 L321.600000,672612.360000 L321.300000,672612.240000 L317.460000,672608.220000 L318.240000,672607.560000 L318.300000,672607.920000 L318.000000,672609.540000 \" style=\"stroke:black\"></path>\n<path d=\"M347.100000,672606.900000 L347.460000,672608.460000 L346.680000,672609.000000 L346.500000,672608.880000 L346.500000,672608.700000 L346.140000,672607.140000 \" style=\"stroke:black\"></path>\n<path d=\"M383.760000,672607.260000 L383.400000,672606.900000 L382.620000,672607.620000 L382.980000,672607.980000 \" style=\"stroke:black\"></path>\n<path d=\"M383.760000,672607.260000 L383.580000,672607.380000 L383.280000,672607.020000 L383.460000,672606.900000 \" style=\"stroke:black\"></path>\n<path d=\"M346.980000,672608.580000 L348.300000,672609.600000 L342.960000,672611.220000 L346.620000,672607.020000 \" style=\"stroke:black\"></path>\n<path d=\"M389.400000,672609.840000 L389.400000,672607.140000 L390.300000,672607.860000 L394.800000,672611.220000 L396.120000,672612.240000 L394.440000,672612.120000 L388.860000,672611.640000 L387.780000,672611.520000 L388.500000,672610.740000 L388.920000,672610.620000 L394.500000,672611.100000 L394.440000,672612.120000 L394.140000,672612.000000 L389.640000,672608.640000 L390.300000,672607.860000 L390.480000,672608.220000 L390.480000,672609.840000 \" style=\"stroke:black\"></path>\n<path d=\"M281.340000,672607.440000 L281.100000,672607.800000 L281.940000,672608.400000 L282.180000,672608.040000 \" style=\"stroke:black\"></path>\n<path d=\"M281.400000,672607.440000 L281.580000,672607.560000 L281.280000,672607.920000 L281.100000,672607.800000 \" style=\"stroke:black\"></path>\n<path d=\"M311.700000,672607.500000 L311.940000,672609.120000 L311.100000,672609.540000 L310.980000,672609.420000 L310.980000,672609.240000 
L310.740000,672607.620000 \" style=\"stroke:black\"></path>\n<path d=\"M311.460000,672609.180000 L312.660000,672610.260000 L307.200000,672611.400000 L311.220000,672607.560000 \" style=\"stroke:black\"></path>\n<path d=\"M280.380000,672608.700000 L281.760000,672607.800000 L282.540000,672608.340000 L282.480000,672608.520000 L282.360000,672608.640000 L280.980000,672609.540000 \" style=\"stroke:black\"></path>\n<path d=\"M353.160000,672609.420000 L353.280000,672607.800000 L357.480000,672611.460000 L351.960000,672610.620000 \" style=\"stroke:black\"></path>\n<path d=\"M317.520000,672609.480000 L317.820000,672607.860000 L321.660000,672611.880000 L316.260000,672610.500000 \" style=\"stroke:black\"></path>\n<path d=\"M348.000000,672608.220000 L347.760000,672607.860000 L346.860000,672608.400000 L347.100000,672608.760000 \" style=\"stroke:black\"></path>\n<path d=\"M348.000000,672608.280000 L347.820000,672608.400000 L347.580000,672607.980000 L347.760000,672607.860000 \" style=\"stroke:black\"></path>\n<path d=\"M389.940000,672609.840000 L389.940000,672608.220000 L394.440000,672611.580000 L388.860000,672611.100000 \" style=\"stroke:black\"></path>\n<path d=\"M312.480000,672608.940000 L312.240000,672608.460000 L311.340000,672608.940000 L311.580000,672609.420000 \" style=\"stroke:black\"></path>\n<path d=\"M312.480000,672608.940000 L312.300000,672609.060000 L312.060000,672608.640000 L312.240000,672608.520000 \" style=\"stroke:black\"></path>\n<path d=\"M316.740000,672608.760000 L316.500000,672609.120000 L317.400000,672609.660000 L317.640000,672609.300000 \" style=\"stroke:black\"></path>\n<path d=\"M316.740000,672608.760000 L316.920000,672608.880000 L316.680000,672609.300000 L316.500000,672609.180000 \" style=\"stroke:black\"></path>\n<path d=\"M352.320000,672608.760000 L352.080000,672609.240000 L353.040000,672609.660000 L353.280000,672609.180000 \" style=\"stroke:black\"></path>\n<path d=\"M352.320000,672608.820000 L352.500000,672608.880000 L352.320000,672609.300000 
L352.140000,672609.240000 \" style=\"stroke:black\"></path>\n<path d=\"M315.960000,672610.080000 L317.220000,672609.060000 L318.000000,672609.540000 L318.000000,672609.780000 L317.820000,672609.900000 L316.560000,672610.920000 \" style=\"stroke:black\"></path>\n<path d=\"M351.600000,672610.260000 L352.800000,672609.060000 L353.700000,672609.480000 L353.640000,672609.660000 L352.320000,672610.980000 \" style=\"stroke:black\"></path>\n<path d=\"M388.980000,672609.240000 L388.860000,672609.720000 L389.880000,672610.080000 L390.000000,672609.600000 \" style=\"stroke:black\"></path>\n<path d=\"M389.040000,672609.300000 L389.220000,672609.360000 L389.040000,672609.780000 L388.860000,672609.720000 \" style=\"stroke:black\"></path>\n<path d=\"M388.500000,672610.740000 L389.580000,672609.480000 L390.480000,672609.840000 L390.420000,672610.020000 L390.300000,672610.200000 L389.220000,672611.460000 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"283.1402\" y=\"672623.147484\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"300.8403\" y=\"672623.147484\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"318.478832\" y=\"672623.147484\">Y2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"337.256232\" y=\"672623.147484\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"354.296264\" y=\"672623.147484\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"372.539004\" y=\"672623.147484\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"389.52\" y=\"672623.147484\">Y0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" 
textlength=\"9.777096\" x=\"408.84\" y=\"672623.147484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"314.8815\" y=\"672560.807684\">X7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"331.977852\" y=\"672560.807684\">X6</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"351.244764\" y=\"672560.807684\">X5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"368.2812\" y=\"672560.807684\">X4</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"386.1604\" y=\"672560.867584\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"403.256752\" y=\"672560.867584\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"422.515684\" y=\"672560.867584\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"439.5602\" y=\"672560.867584\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"151.9817\" y=\"672561.467684\">Y7</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"169.078052\" y=\"672561.467684\">Y6</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"188.344964\" y=\"672561.467684\">Y5</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"205.3815\" y=\"672561.467684\">Y4</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"223.8009\" y=\"672562.067784\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" 
textlength=\"9.777096\" x=\"240.897252\" y=\"672562.067784\">Y2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.713256\" x=\"260.156184\" y=\"672562.067784\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"277.2007\" y=\"672562.067784\">Y0</text></svg><h3>Figure 4-18.  PUNPCKLBW Instruction Operation Using 64-bit Operands</h3>",
                 "tooltip": "Unpacks and interleaves the low-order data elements (bytes, words, doublewords, and quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. (Figure 4-18 shows the unpack operation for bytes in 64-bit operands.) The high-order data elements are ignored."
             };

         case "PUSH":
             return {
                 "url": "http://www.felixcloutier.com/x86/PUSH.html",
                 "html": "<p>Decrements the stack pointer and then stores the source operand on the top of the stack. Address and operand sizes are determined and used as follows:</p><p>The address size is used only when referencing a source operand in memory.</p><p>The operand size (16, 32, or 64 bits) determines the amount by which the stack pointer is decremented (2, 4, or 8).</p><p>If the source operand is an immediate of size less than the operand size, a sign-extended value is pushed on the stack. If the source operand is a segment register (16 bits) and the operand size is 64 bits, a zero-extended value is pushed on the stack; if the operand size is 32 bits, either a zero-extended value is pushed on the stack or the segment selector is written on the stack using a 16-bit move. For the last case, all recent Core and Atom processors perform a 16-bit move, leaving the upper portion of the stack location unmodified.</p><p>The stack-address size determines the width of the stack pointer when writing to the stack in memory and when decrementing</p>",
                 "tooltip": "Decrements the stack pointer and then stores the source operand on the top of the stack. Address and operand sizes are determined and used as follows"
             };

         case "PUSHA":
         case "PUSHAD":
             return {
                 "url": "http://www.felixcloutier.com/x86/PUSHAD.html",
                 "html": "<p>Pushes the contents of the general-purpose registers onto the stack. The registers are stored on the stack in the following order: EAX, ECX, EDX, EBX, ESP (original value), EBP, ESI, and EDI (if the current operand-size attribute is 32) and AX, CX, DX, BX, SP (original value), BP, SI, and DI (if the operand-size attribute is 16). These instructions perform the reverse operation of the POPA/POPAD instructions. The value pushed for the ESP or SP register is its value prior to pushing the first register (see the \u201cOperation\u201d section below).</p><p>The PUSHA (push all) and PUSHAD (push all double) mnemonics reference the same opcode. The PUSHA instruction is intended for use when the operand-size attribute is 16 and the PUSHAD instruction for when the operand-size attribute is 32. Some assemblers may force the operand size to 16 when PUSHA is used and to 32 when PUSHAD is used. Others may treat these mnemonics as synonyms (PUSHA/PUSHAD) and use the current setting of the operand-size attribute to determine the size of values to be pushed onto the stack, regardless of the mnemonic used.</p><p>In the real-address mode, if the ESP or SP register is 1, 3, or 5 when PUSHA/PUSHAD executes: an #SS exception is generated but not delivered (the stack error reported prevents #SS delivery). Next, the processor generates a #DF exception and enters a shutdown state as described in the #DF discussion in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>",
                 "tooltip": "Pushes the contents of the general-purpose registers onto the stack. The registers are stored on the stack in the following order: EAX, ECX, EDX, EBX, ESP (original value), EBP, ESI, and EDI (if the current operand-size attribute is 32) and AX, CX, DX, BX, SP (original value), BP, SI, and DI (if the operand-size attribute is 16). These instructions perform the reverse operation of the POPA/POPAD instructions. The value pushed for the ESP or SP register is its value prior to pushing the first register (see the \u201cOperation\u201d section below)."
             };

         case "PUSHFQ":
         case "PUSHFD":
         case "PUSHF":
             return {
                 "url": "http://www.felixcloutier.com/x86/PUSHFQ.html",
                 "html": "<p>Decrements the stack pointer by 4 (if the current operand-size attribute is 32) and pushes the entire contents of the EFLAGS register onto the stack, or decrements the stack pointer by 2 (if the operand-size attribute is 16) and pushes the lower 16 bits of the EFLAGS register (that is, the FLAGS register) onto the stack. These instructions reverse the operation of the POPF/POPFD instructions.</p><p>When copying the entire EFLAGS register to the stack, the VM and RF flags (bits 16 and 17) are not copied; instead, the values for these flags are cleared in the EFLAGS image stored on the stack. See Chapter 3 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information about the EFLAGS register.</p><p>The PUSHF (push flags) and PUSHFD (push flags double) mnemonics reference the same opcode. The PUSHF instruction is intended for use when the operand-size attribute is 16 and the PUSHFD instruction for when the operand-size attribute is 32. Some assemblers may force the operand size to 16 when PUSHF is used and to 32 when PUSHFD is used. Others may treat these mnemonics as synonyms (PUSHF/PUSHFD) and use the current setting of the operand-size attribute to determine the size of values to be pushed onto the stack, regardless of the mnemonic used.</p><p>In 64-bit mode, the instruction\u2019s default operation is to decrement the stack pointer (RSP) by 8 and push RFLAGS on the stack. 16-bit operation is supported using the operand size override prefix 66H. 32-bit operand size cannot be encoded in this mode. When copying RFLAGS to the stack, the VM and RF flags (bits 16 and 17) are not copied; instead, values for these flags are cleared in the RFLAGS image stored on the stack.</p><p>When in virtual-8086 mode and the I/O privilege level (IOPL) is less than 3, the PUSHF/PUSHFD instruction causes a general protection exception (#GP).</p>",
                 "tooltip": "Decrements the stack pointer by 4 (if the current operand-size attribute is 32) and pushes the entire contents of the EFLAGS register onto the stack, or decrements the stack pointer by 2 (if the operand-size attribute is 16) and pushes the lower 16 bits of the EFLAGS register (that is, the FLAGS register) onto the stack. These instructions reverse the operation of the POPF/POPFD instructions."
             };

         case "VPXOR":
         case "PXOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/PXOR.html",
                 "html": "<p>Performs a bitwise logical exclusive-OR (XOR) operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is 1 if the corresponding bits of the two operands are different; each bit is 0 if the corresponding bits of the operands are the same.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand is an XMM register or a 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand is an XMM register or a 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical exclusive-OR (XOR) operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is 1 if the corresponding bits of the two operands are different; each bit is 0 if the corresponding bits of the operands are the same."
             };

         case "RCPPS":
         case "VRCPPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/RCPPS.html",
                 "html": "<p>Performs a SIMD computation of the approximate reciprocals of the four packed single-precision floating-point values in the source operand (second operand) and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD single-precision floating-point operation.</p><p>The relative error for this approximation is:</p><p>|Relative Error| \u2264 1.5 \u2217 2<sup>\u221212</sup></p><p>The RCPPS instruction is not affected by the rounding control bits in the MXCSR register. When a source value is a 0.0, an \u221e of the sign of the source value is returned. A denormal source value is treated as a 0.0 (of the same sign). Tiny results are always flushed to 0.0, with the sign of the operand. (Input values greater than or equal to |1.11111111110100000000000B\u22172<sup>125</sup>| are guaranteed to not produce tiny results; input values less than or equal to |1.00000000000110000000001B\u22172<sup>126</sup>| are guaranteed to produce tiny results, which are in turn flushed to 0.0; and input values in between this range may or may not produce tiny results, depending on the implementation.) When a source value is an SNaN or QNaN, the SNaN is converted to a QNaN or the source QNaN is returned.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD computation of the approximate reciprocals of the four packed single-precision floating-point values in the source operand (second operand) and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD single-precision floating-point operation."
             };

         case "RCPSS":
         case "VRCPSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/RCPSS.html",
                 "html": "<p>Computes an approximate reciprocal of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar single-precision floating-point operation.</p><p>The relative error for this approximation is:</p><p>|Relative Error| \u2264 1.5 \u2217 2<sup>\u221212</sup></p><p>The RCPSS instruction is not affected by the rounding control bits in the MXCSR register. When a source value is a 0.0, an \u221e of the sign of the source value is returned. A denormal source value is treated as a 0.0 (of the same sign). Tiny results are always flushed to 0.0, with the sign of the operand. (Input values greater than or equal to |1.11111111110100000000000B\u22172<sup>125</sup>| are guaranteed to not produce tiny results; input values less than or equal to |1.00000000000110000000001B\u22172<sup>126</sup>| are guaranteed to produce tiny results, which are in turn flushed to 0.0; and input values in between this range may or may not produce tiny results, depending on the implementation.) When a source value is an SNaN or QNaN, the SNaN is converted to a QNaN or the source QNaN is returned.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Computes an approximate reciprocal of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar single-precision floating-point operation."
             };

         case "RDFSBASE":
         case "RDGSBASE":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDFSBASE:RDGSBASE.html",
                 "html": "<p>Loads the general-purpose register indicated by the modR/M:r/m field with the FS or GS segment base address.</p><p>The destination operand may be either a 32-bit or a 64-bit general-purpose register. The REX.W prefix indicates the operand size is 64 bits. If no REX.W prefix is used, the operand size is 32 bits; the upper 32 bits of the source base address (for FS or GS) are ignored and the upper 32 bits of the destination register are cleared.</p><p>This instruction is supported only in 64-bit mode.</p>",
                 "tooltip": "Loads the general-purpose register indicated by the modR/M:r/m field with the FS or GS segment base address."
             };

         case "RDMSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDMSR.html",
                 "html": "<p>Reads the contents of a 64-bit model specific register (MSR) specified in the ECX register into registers EDX:EAX. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The EDX register is loaded with the high-order 32 bits of the MSR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.) If fewer than 64 bits are implemented in the MSR being read, the values returned to EDX:EAX in unimplemented bit locations are undefined.</p><p>This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) will be generated. Specifying a reserved or unimplemented MSR address in ECX will also cause a general protection exception.</p><p>The MSRs control functions for testability, execution tracing, performance-monitoring, and machine check errors. Chapter 35, \u201cModel-Specific Registers (MSRs),\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3C</em>, lists all the MSRs that can be read with this instruction and their addresses. Note that each processor family has its own set of MSRs.</p><p>The CPUID instruction should be used to determine whether MSRs are supported (CPUID.01H:EDX[5] = 1) before using this instruction.</p>",
                 "tooltip": "Reads the contents of a 64-bit model specific register (MSR) specified in the ECX register into registers EDX:EAX. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The EDX register is loaded with the high-order 32 bits of the MSR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.) If fewer than 64 bits are implemented in the MSR being read, the values returned to EDX:EAX in unimplemented bit locations are undefined."
             };

         case "RDPMC":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDPMC.html",
                 "html": "<p>The EAX register is loaded with the low-order 32 bits. The EDX register is loaded with the supported high-order bits of the counter. The number of high-order bits loaded into EDX is implementation specific on processors that do no support architectural performance monitoring. The width of fixed-function and general-purpose performance coun-ters on processors supporting architectural performance monitoring are reported by CPUID 0AH leaf. See below for the treatment of the EDX register for \u201cfast\u201d reads.</p><p>The ECX register selects one of two type of performance counters, specifies the index relative to the base of each counter type, and selects \u201cfast\u201d read mode if supported. The two counter types are :</p><p>ECX[29:0] specifies the index. The width of general-purpose performance counters are 40-bits for processors that do not support architectural performance monitoring counters.The width of special-purpose performance counters are implementation specific. The width of fixed-function performance counters and general-purpose performance counters on processor supporting architectural performance monitoring are reported by CPUID 0AH leaf.</p><p>Table 4-13 lists valid indices of the general-purpose and special-purpose performance counters according to the derived DisplayFamily_DisplayModel values of CPUID encoding for each processor family (see CPUID instruction in Chapter 3, \u201cInstruction Set Reference, A-M\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2A</em>).</p><h3>Table 4-13.  Valid General and Special Purpose Performance Counter Index Range for RDPMC</h3>",
                 "tooltip": "The EAX register is loaded with the low-order 32 bits. The EDX register is loaded with the supported high-order bits of the counter. The number of high-order bits loaded into EDX is implementation specific on processors that do no support architectural performance monitoring. The width of fixed-function and general-purpose performance coun-ters on processors supporting architectural performance monitoring are reported by CPUID 0AH leaf. See below for the treatment of the EDX register for \u201cfast\u201d reads."
             };

         case "RDRAND":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDRAND.html",
                 "html": "<p>Loads a hardware generated random value and store it in the destination register. The size of the random value is determined by the destination register size and operating mode. The Carry Flag indicates whether a random value is available at the time the instruction is executed. CF=1 indicates that the data in the destination is valid. Other-wise CF=0 and the data in the destination operand will be returned as zeros for the specified width. All other flags are forced to 0 in either situation. Software must check the state of CF=1 for determining if a valid random value has been returned, otherwise it is expected to loop and retry execution of RDRAND (see <em>Intel\u00ae 64 and IA-32 Archi-tectures Software Developer\u2019s Manual, Volume 1</em>, Section 7.3.17, \u201cRandom Number Generator Instructions\u201d).</p><p>This instruction is available at all privilege levels.</p><p>In 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.B permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bit oper-ands. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Loads a hardware generated random value and store it in the destination register. The size of the random value is determined by the destination register size and operating mode. The Carry Flag indicates whether a random value is available at the time the instruction is executed. CF=1 indicates that the data in the destination is valid. Other-wise CF=0 and the data in the destination operand will be returned as zeros for the specified width. All other flags are forced to 0 in either situation. Software must check the state of CF=1 for determining if a valid random value has been returned, otherwise it is expected to loop and retry execution of RDRAND (see Intel\u00ae 64 and IA-32 Archi-tectures Software Developer\u2019s Manual, Volume 1, Section 7.3.17, \u201cRandom Number Generator Instructions\u201d)."
             };

         case "RDSEED":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDSEED.html",
                 "html": "<p>Loads a hardware generated random value and store it in the destination register. The random value is generated from an Enhanced NRBG (Non Deterministic Random Bit Generator) that is compliant to NIST SP800-90B and NIST SP800-90C in the XOR construction mode. The size of the random value is determined by the destination register size and operating mode. The Carry Flag indicates whether a random value is available at the time the instruction is executed. CF=1 indicates that the data in the destination is valid. Otherwise CF=0 and the data in the destination operand will be returned as zeros for the specified width. All other flags are forced to 0 in either situation. Software must check the state of CF=1 for determining if a valid random seed value has been returned, otherwise it is expected to loop and retry execution of RDSEED (see Section 1.2).</p><p>The RDSEED instruction is available at all privilege levels. The RDSEED instruction executes normally either inside or outside a transaction region.</p><p>In 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.B permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bit oper-ands. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Loads a hardware generated random value and store it in the destination register. The random value is generated from an Enhanced NRBG (Non Deterministic Random Bit Generator) that is compliant to NIST SP800-90B and NIST SP800-90C in the XOR construction mode. The size of the random value is determined by the destination register size and operating mode. The Carry Flag indicates whether a random value is available at the time the instruction is executed. CF=1 indicates that the data in the destination is valid. Otherwise CF=0 and the data in the destination operand will be returned as zeros for the specified width. All other flags are forced to 0 in either situation. Software must check the state of CF=1 for determining if a valid random seed value has been returned, otherwise it is expected to loop and retry execution of RDSEED (see Section 1.2)."
             };

         case "RDTSC":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDTSC.html",
                 "html": "<p>Loads the current value of the processor\u2019s time-stamp counter (a 64-bit MSR) into the EDX:EAX registers. The EDX register is loaded with the high-order 32 bits of the MSR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.)</p><p>The processor monotonically increments the time-stamp counter MSR every clock cycle and resets it to 0 whenever the processor is reset. See \u201cTime Stamp Counter\u201d in Chapter 17 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3B</em>, for specific details of the time stamp counter behavior.</p><p>When in protected or virtual 8086 mode, the time stamp disable (TSD) flag in register CR4 restricts the use of the RDTSC instruction as follows. When the TSD flag is clear, the RDTSC instruction can be executed at any privilege level; when the flag is set, the instruction can only be executed at privilege level 0. (When in real-address mode, the RDTSC instruction is always enabled.)</p><p>The time-stamp counter can also be read with the RDMSR instruction, when executing at privilege level 0.</p><p>The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed. If software requires RDTSC to be executed only after all previous instructions have completed locally, it can either use RDTSCP (if the processor supports that instruction) or execute the sequence LFENCE;RDTSC.</p>",
                 "tooltip": "Loads the current value of the processor\u2019s time-stamp counter (a 64-bit MSR) into the EDX:EAX registers. The EDX register is loaded with the high-order 32 bits of the MSR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.)"
             };

         case "RDTSCP":
             return {
                 "url": "http://www.felixcloutier.com/x86/RDTSCP.html",
                 "html": "<p>Loads the current value of the processor\u2019s time-stamp counter (a 64-bit MSR) into the EDX:EAX registers and also loads the IA32_TSC_AUX MSR (address C000_0103H) into the ECX register. The EDX register is loaded with the high-order 32 bits of the IA32_TSC MSR; the EAX register is loaded with the low-order 32 bits of the IA32_TSC MSR; and the ECX register is loaded with the low-order 32-bits of IA32_TSC_AUX MSR. On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX, RDX, and RCX are cleared.</p><p>The processor monotonically increments the time-stamp counter MSR every clock cycle and resets it to 0 whenever the processor is reset. See \u201cTime Stamp Counter\u201d in Chapter 17 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3B</em>, for specific details of the time stamp counter behavior.</p><p>When in protected or virtual 8086 mode, the time stamp disable (TSD) flag in register CR4 restricts the use of the RDTSCP instruction as follows. When the TSD flag is clear, the RDTSCP instruction can be executed at any privilege level; when the flag is set, the instruction can only be executed at privilege level 0. (When in real-address mode, the RDTSCP instruction is always enabled.)</p><p>The RDTSCP instruction waits until all previous instructions have been executed before reading the counter. However,  subsequent instructions may begin execution before the read operation is performed.</p><p>The presence of the RDTSCP instruction is indicated by CPUID leaf 80000001H, EDX bit 27. If the bit is set to 1 then RDTSCP is present on the processor.</p>",
                 "tooltip": "Loads the current value of the processor\u2019s time-stamp counter (a 64-bit MSR) into the EDX:EAX registers and also loads the IA32_TSC_AUX MSR (address C000_0103H) into the ECX register. The EDX register is loaded with the high-order 32 bits of the IA32_TSC MSR; the EAX register is loaded with the low-order 32 bits of the IA32_TSC MSR; and the ECX register is loaded with the low-order 32-bits of IA32_TSC_AUX MSR. On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX, RDX, and RCX are cleared."
             };

         case "REP":
         case "REPE":
             return {
                 "url": "http://www.felixcloutier.com/x86/REPE.html",
                 "html": "<p>Repeats a string instruction the number of times specified in the count register or until the indicated condition of the ZF flag is no longer met. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of the string instructions. The REP prefix can be added to the INS, OUTS, MOVS, LODS, and STOS instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be added to the CMPS and SCAS instructions. (The REPZ and REPNZ prefixes are synonymous forms of the REPE and REPNE prefixes, respectively.) The F3H prefix is defined for the following instructions and undefined for the rest:</p><p>The REP prefixes apply only to one string instruction at a time. To repeat a block of instructions, use the LOOP instruction or another looping construct. All of these repeat prefixes cause the associated instruction to be repeated until the count in register is decremented to 0. See Table 4-14.</p><h3>Table 4-14.  Repeat Prefixes</h3><table>\n<tr>\n<th>Repeat Prefix</th>\n<th>Termination Condition 1*</th>\n<th>Termination Condition 2</th></tr>\n<tr>\n<td>\n<p>REP</p>\n<p>REPE/REPZ</p>\n<p>REPNE/REPNZ</p></td>\n<td>\n<p>RCX or (E)CX = 0</p>\n<p>RCX or (E)CX = 0</p>\n<p>RCX or (E)CX = 0</p></td>\n<td>\n<p>None</p>\n<p>ZF = 0</p>\n<p>ZF = 1</p></td></tr></table><p><strong>NOTES:</strong></p>",
                 "tooltip": "Repeats a string instruction the number of times specified in the count register or until the indicated condition of the ZF flag is no longer met. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of the string instructions. The REP prefix can be added to the INS, OUTS, MOVS, LODS, and STOS instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be added to the CMPS and SCAS instructions. (The REPZ and REPNZ prefixes are synonymous forms of the REPE and REPNE prefixes, respectively.) The F3H prefix is defined for the following instructions and undefined for the rest"
             };

         case "RET":
             return {
                 "url": "http://www.felixcloutier.com/x86/RET.html",
                 "html": "<p>Transfers program control to a return address located on the top of the stack. The address is usually placed on the stack by a CALL instruction, and the return is made to the instruction that follows the CALL instruction.</p><p>The optional source operand specifies the number of stack bytes to be released after the return address is popped; the default is none. This operand can be used to release parameters from the stack that were passed to the called procedure and are no longer needed. It must be used when the CALL instruction used to switch to a new procedure uses a call gate with a non-zero word count to access the new procedure. Here, the source operand for the RET instruction must specify the same number of bytes as is specified in the word count field of the call gate.</p><p>The RET instruction can be used to execute three different types of returns:</p><p>The inter-privilege-level return type can only be executed in protected mode. See the section titled \u201cCalling Proce-dures Using Call and RET\u201d in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for detailed information on near, far, and inter-privilege-level returns.</p><p>When executing a near return, the processor pops the return instruction pointer (offset) from the top of the stack into the EIP register and begins program execution at the new instruction pointer. The CS register is unchanged.</p>",
                 "tooltip": "Transfers program control to a return address located on the top of the stack. The address is usually placed on the stack by a CALL instruction, and the return is made to the instruction that follows the CALL instruction."
             };

         case "RCL":
         case "ROL":
         case "RCR":
             return {
                 "url": "http://www.felixcloutier.com/x86/ROL.html",
                 "html": "<p>Shifts (rotates) the bits of the first operand (destination operand) the number of bit positions specified in the second operand (count operand) and stores the result in the destination operand. The destination operand can be a register or a memory location; the count operand is an unsigned integer that can be an immediate or a value in the CL register. In legacy and compatibility mode, the processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits.</p><p>The rotate left (ROL) and rotate through carry left (RCL) instructions shift all the bits toward more-significant bit positions, except for the most-significant bit, which is rotated to the least-significant bit location. The rotate right (ROR) and rotate through carry right (RCR) instructions shift all the bits toward less significant bit positions, except for the least-significant bit, which is rotated to the most-significant bit location.</p><p>The RCL and RCR instructions include the CF flag in the rotation. The RCL instruction shifts the CF flag into the least-significant bit and shifts the most-significant bit into the CF flag. The RCR instruction shifts the CF flag into the most-significant bit and shifts the least-significant bit into the CF flag. For the ROL and ROR instructions, the orig-inal value of the CF flag is not a part of the result, but the CF flag receives a copy of the bit that was shifted from one end to the other.</p><p>The OF flag is defined only for the 1-bit rotates; it is undefined in all other cases (except RCL and RCR instructions only: a zero-bit rotate does nothing, that is affects no flags). For left rotates, the OF flag is set to the exclusive OR of the CF bit (after the rotate) and the most-significant bit of the result. 
For right rotates, the OF flag is set to the exclusive OR of the two most-significant bits of the result.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Use of REX.W promotes the first operand to 64 bits and causes the count operand to become a 6-bit counter.</p>",
                 "tooltip": "Shifts (rotates) the bits of the first operand (destination operand) the number of bit positions specified in the second operand (count operand) and stores the result in the destination operand. The destination operand can be a register or a memory location; the count operand is an unsigned integer that can be an immediate or a value in the CL register. In legacy and compatibility mode, the processor restricts the count to a number between 0 and 31 by masking all the bits in the count operand except the 5 least-significant bits."
             };

         case "RORX":
             return {
                 "url": "http://www.felixcloutier.com/x86/RORX.html",
                 "html": "<p>Rotates the bits of second operand right by the count value specified in imm8 without affecting arithmetic flags. The RORX instruction does not read or write the arithmetic flags.</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>",
                 "tooltip": "Rotates the bits of second operand right by the count value specified in imm8 without affecting arithmetic flags. The RORX instruction does not read or write the arithmetic flags."
             };

         case "VROUNDPD":
         case "ROUNDPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ROUNDPD.html",
                 "html": "<p>Round the 2 double-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a single-precision floating-point value.</p><p>The immediate operand specifies control fields for the rounding operation, three bit fields are defined and shown in Figure 4-20. Bit 3 of the immediate byte controls processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-15 lists the encoded values for rounding-mode field).</p><p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to \u20181 then denormals will be converted to zero before rounding.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destina-tion is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the source operand second source operand or a 128-bit memory location. The destina-tion operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Round the 2 double-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a single-precision floating-point value."
             };

         case "ROUNDPS":
         case "VROUNDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ROUNDPS.html",
                 "html": "<p>Round the 4 single-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a single-precision floating-point value.</p><p>The immediate operand specifies control fields for the rounding operation, three bit fields are defined and shown in Figure 4-20. Bit 3 of the immediate byte controls processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-15 lists the encoded values for rounding-mode field).</p><p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to \u20181 then denormals will be converted to zero before rounding.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destina-tion is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the source operand second source operand or a 128-bit memory location. The destina-tion operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Round the 4 single-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a single-precision floating-point value."
             };

         case "ROUNDSD":
         case "VROUNDSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/ROUNDSD.html",
                 "html": "<p>Round the DP FP value in the lower qword of the source operand (second operand) using the rounding mode spec-ified in the immediate operand (third operand) and place the result in the destination operand (first operand). The rounding process rounds a double-precision floating-point input to an integer value and returns the integer result as a double precision floating-point value in the lowest position. The upper double precision floating-point value in the destination is retained.</p><p>The immediate operand specifies control fields for the rounding operation, three bit fields are defined and shown in Figure 4-20. Bit 3 of the immediate byte controls processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-15 lists the encoded values for rounding-mode field).</p><p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to \u20181 then denormals will be converted to zero before rounding.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Round the DP FP value in the lower qword of the source operand (second operand) using the rounding mode spec-ified in the immediate operand (third operand) and place the result in the destination operand (first operand). The rounding process rounds a double-precision floating-point input to an integer value and returns the integer result as a double precision floating-point value in the lowest position. The upper double precision floating-point value in the destination is retained."
             };

         case "VROUNDSS":
         case "ROUNDSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/ROUNDSS.html",
                 "html": "<p>Round the single-precision floating-point value in the lowest dword of the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the result in the destination operand (first operand). The rounding process rounds a single-precision floating-point input to an integer value and returns the result as a single-precision floating-point value in the lowest position. The upper three single-precision floating-point values in the destination are retained.</p><p>The immediate operand specifies control fields for the rounding operation, three bit fields are defined and shown in Figure 4-20. Bit 3 of the immediate byte controls processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-15 lists the encoded values for rounding-mode field).</p><p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to \u20181 then denormals will be converted to zero before rounding.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Round the single-precision floating-point value in the lowest dword of the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the result in the destination operand (first operand). The rounding process rounds a single-precision floating-point input to an integer value and returns the result as a single-precision floating-point value in the lowest position. The upper three single-precision floating-point values in the destination are retained."
             };

         case "RSM":
             return {
                 "url": "http://www.felixcloutier.com/x86/RSM.html",
                 "html": "<p>Returns program control from system management mode (SMM) to the application program or operating-system procedure that was interrupted when the processor received an SMM interrupt. The processor\u2019s state is restored from the dump created upon entering SMM. If the processor detects invalid state information during state restora-tion, it enters the shutdown state. The following invalid information can cause a shutdown:</p><p>The contents of the model-specific registers are not affected by a return from SMM.</p><p>The SMM state map used by RSM supports resuming processor context for non-64-bit modes and 64-bit mode.</p><p>See Chapter 34, \u201cSystem Management Mode,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3C</em>, for more information about SMM and the behavior of the RSM instruction.</p>",
                 "tooltip": "Returns program control from system management mode (SMM) to the application program or operating-system procedure that was interrupted when the processor received an SMM interrupt. The processor\u2019s state is restored from the dump created upon entering SMM. If the processor detects invalid state information during state restora-tion, it enters the shutdown state. The following invalid information can cause a shutdown"
             };

         case "RSQRTPS":
         case "VRSQRTPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/RSQRTPS.html",
                 "html": "<p>Performs a SIMD computation of the approximate reciprocals of the square roots of the four packed single-preci-sion floating-point values in the source operand (second operand) and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD single-precision floating-point operation.</p><p>The relative error for this approximation is:</p><p>|Relative Error| \u2264 1.5 \u2217 2<sup>\u221212</sup></p><p>The RSQRTPS instruction is not affected by the rounding control bits in the MXCSR register. When a source value is a 0.0, an \u221e of the sign of the source value is returned. A denormal source value is treated as a 0.0 (of the same sign). When a source value is a negative value (other than \u22120.0), a floating-point indefinite is returned. When a source value is an SNaN or QNaN, the SNaN is converted to a QNaN or the source QNaN is returned.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD computation of the approximate reciprocals of the square roots of the four packed single-preci-sion floating-point values in the source operand (second operand) and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD single-precision floating-point operation."
             };

         case "RSQRTSS":
         case "VRSQRTSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/RSQRTSS.html",
                 "html": "<p>Computes an approximate reciprocal of the square root of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar single-precision floating-point operation.</p><p>The relative error for this approximation is:</p><p>|Relative Error| \u2264 1.5 \u2217 2<sup>\u221212</sup></p><p>The RSQRTSS instruction is not affected by the rounding control bits in the MXCSR register. When a source value is a 0.0, an \u221e of the sign of the source value is returned. A denormal source value is treated as a 0.0 (of the same sign). When a source value is a negative value (other than \u22120.0), a floating-point indefinite is returned. When a source value is an SNaN or QNaN, the SNaN is converted to a QNaN or the source QNaN is returned.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Computes an approximate reciprocal of the square root of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar single-precision floating-point operation."
             };

         case "SAHF":
             return {
                 "url": "http://www.felixcloutier.com/x86/SAHF.html",
                 "html": "<p>Loads the SF, ZF, AF, PF, and CF flags of the EFLAGS register with values from the corresponding bits in the AH register (bits 7, 6, 4, 2, and 0, respectively). Bits 1, 3, and 5 of register AH are ignored; the corresponding reserved bits (1, 3, and 5) in the EFLAGS register remain as shown in the \u201cOperation\u201d section below.</p><p>This instruction executes as described above in compatibility mode and legacy mode. It is valid in 64-bit mode only if CPUID.80000001H:ECX.LAHF-SAHF[bit 0] = 1.</p>",
                 "tooltip": "Loads the SF, ZF, AF, PF, and CF flags of the EFLAGS register with values from the corresponding bits in the AH register (bits 7, 6, 4, 2, and 0, respectively). Bits 1, 3, and 5 of register AH are ignored; the corresponding reserved bits (1, 3, and 5) in the EFLAGS register remain as shown in the \u201cOperation\u201d section below."
             };

         case "SHLX":
         case "SARX":
         case "SHRX":
             return {
                 "url": "http://www.felixcloutier.com/x86/SARX:SHLX:SHRX.html",
                 "html": "<p>Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand).</p><p>The shift arithmetic right (SARX) and shift logical right (SHRX) instructions shift the bits of the destination operand to the right (toward less significant bit locations); SARX keeps and propagates the most significant bit (sign bit) while shifting.</p><p>The logical shift left (SHLX) shifts the bits of the destination operand to the left (toward more significant bit locations).</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p><p>If the value specified in the second source operand exceeds OperandSize -1, the COUNT value is masked.</p>",
                 "tooltip": "Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand)."
             };

         case "SBB":
             return {
                 "url": "http://www.felixcloutier.com/x86/SBB.html",
                 "html": "<p>Adds the source operand (second operand) and the carry (CF) flag, and subtracts the result from the destination operand (first operand). The result of the subtraction is stored in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a borrow from a previous subtraction.</p><p>When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p><p>The SBB instruction does not distinguish between signed or unsigned operands. Instead, the processor evaluates the result for both data types and sets the OF and CF flags to indicate a borrow in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p><p>The SBB instruction is usually executed as part of a multibyte or multiword subtraction in which a SUB instruction is followed by a SBB instruction.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>",
                 "tooltip": "Adds the source operand (second operand) and the carry (CF) flag, and subtracts the result from the destination operand (first operand). The result of the subtraction is stored in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a borrow from a previous subtraction."
             };

         case "SCASW":
         case "SCAS":
         case "SCASD":
         case "SCASB":
         case "SCASQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/SCASQ.html",
                 "html": "<p>In non-64-bit modes and in default 64-bit mode: this instruction compares a byte, word, doubleword or quadword specified using a memory operand with the value in AL, AX, or EAX. It then sets status flags in EFLAGS recording the results. The memory operand address is read from ES:(E)DI register (depending on the address-size attribute of the instruction and the current operational mode). Note that ES cannot be overridden with a segment override prefix.</p><p>At the assembly-code level, two forms of this instruction are allowed: the explicit-operand form and the no-operands form. The explicit-operand form (specified using the SCAS mnemonic) allows a memory operand to be specified explicitly. The memory operand must be a symbol that indicates the size and location of the operand value. The register operand is then automatically selected to match the size of the memory operand (AL register for byte comparisons, AX for word comparisons, EAX for doubleword comparisons). The explicit-operand form is provided to allow documentation. Note that the documentation provided by this form can be misleading. That is, the memory operand symbol must specify the correct type (size) of the operand (byte, word, or doubleword) but it does not have to specify the correct location. The location is always specified by ES:(E)DI.</p><p>The no-operands form of the instruction uses a short form of SCAS. Again, ES:(E)DI is assumed to be the memory operand and AL, AX, or EAX is assumed to be the register operand. The size of operands is selected by the mnemonic: SCASB (byte comparison), SCASW (word comparison), or SCASD (doubleword comparison).</p><p>After the comparison, the (E)DI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented. The register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.</p><p>SCAS, SCASB, SCASW, SCASD, and SCASQ can be preceded by the REP prefix for block comparisons of ECX bytes, words, doublewords, or quadwords. Often, however, these instructions will be used in a LOOP construct that takes</p>",
                 "tooltip": "In non-64-bit modes and in default 64-bit mode: this instruction compares a byte, word, doubleword or quadword specified using a memory operand with the value in AL, AX, or EAX. It then sets status flags in EFLAGS recording the results. The memory operand address is read from ES:(E)DI register (depending on the address-size attribute of the instruction and the current operational mode). Note that ES cannot be overridden with a segment override prefix."
             };

         case "SETE":
         case "SETNAE":
         case "SETAE":
         case "SETLE":
         case "SETA":
         case "SETB":
         case "SETC":
         case "SETL":
         case "SETNE":
         case "SETNC":
         case "SETNB":
         case "SETNA":
         case "SETNG":
         case "SETG":
         case "SETGE":
         case "SETBE":
         case "SETNGE":
         case "SETNL":
         case "SETNBE":
         case "SETNLE":
             return {
                 "url": "http://www.felixcloutier.com/x86/SETNLE.html",
                 "html": "<p>Sets the destination operand to 0 or 1 depending on the settings of the status flags (CF, SF, OF, ZF, and PF) in the EFLAGS register. The destination operand points to a byte register or a byte in memory. The condition code suffix (<em>cc</em>) indicates the condition being tested for.</p><p>The terms \u201cabove\u201d and \u201cbelow\u201d are associated with the CF flag and refer to the relationship between two unsigned integer values. The terms \u201cgreater\u201d and \u201cless\u201d are associated with the SF and OF flags and refer to the relationship between two signed integer values.</p><p>Many of the SET<em>cc</em> instruction opcodes have alternate mnemonics. For example, SETG (set byte if greater) and SETNLE (set if not less or equal) have the same opcode and test for the same condition: ZF equals 0 and SF equals OF. These alternate mnemonics are provided to make code more intelligible. Appendix B, \u201cEFLAGS Condition Codes,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, shows the alternate mnemonics for various test conditions.</p><p>Some languages represent a logical one as an integer with all bits set. This representation can be obtained by choosing the logically opposite condition for the SET<em>cc</em> instruction, then decrementing the result. For example, to test for overflow, use the SETNO instruction, then decrement the result.</p><p>In 64-bit mode, the operand size is fixed at 8 bits. Use of a REX prefix enables uniform addressing to additional byte registers. Otherwise, this instruction\u2019s operation is the same as in legacy mode and compatibility mode.</p>",
                 "tooltip": "Sets the destination operand to 0 or 1 depending on the settings of the status flags (CF, SF, OF, ZF, and PF) in the EFLAGS register. The destination operand points to a byte register or a byte in memory. The condition code suffix (cc) indicates the condition being tested for."
             };

         case "SFENCE":
             return {
                 "url": "http://www.felixcloutier.com/x86/SFENCE.html",
                 "html": "<p>Performs a serializing operation on all store-to-memory instructions that were issued prior to the SFENCE instruction. This serializing operation guarantees that every store instruction that precedes the SFENCE instruction in program order becomes globally visible before any store instruction that follows the SFENCE instruction. The SFENCE instruction is ordered with respect to store instructions, other SFENCE instructions, any LFENCE and MFENCE instructions, and any serializing instructions (such as the CPUID instruction). It is not ordered with respect to load instructions.</p><p>Weakly ordered memory types can be used to achieve higher processor performance through such techniques as out-of-order issue, write-combining, and write-collapsing. The degree to which a consumer of data recognizes or knows that the data is weakly ordered varies among applications and may be unknown to the producer of this data. The SFENCE instruction provides a performance-efficient way of ensuring store ordering between routines that produce weakly-ordered results and routines that consume this data.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>Specification of the instruction's opcode above indicates a ModR/M byte of F8. For this instruction, the processor ignores the r/m field of the ModR/M byte. Thus, SFENCE is encoded by any opcode of the form 0F AE Fx, where x is in the range 8-F.</p>",
                 "tooltip": "Performs a serializing operation on all store-to-memory instructions that were issued prior to the SFENCE instruction. This serializing operation guarantees that every store instruction that precedes the SFENCE instruction in program order becomes globally visible before any store instruction that follows the SFENCE instruction. The SFENCE instruction is ordered with respect to store instructions, other SFENCE instructions, any LFENCE and MFENCE instructions, and any serializing instructions (such as the CPUID instruction). It is not ordered with respect to load instructions."
             };

         case "SGDT":
             return {
                 "url": "http://www.felixcloutier.com/x86/SGDT.html",
                 "html": "<p>Stores the content of the global descriptor table register (GDTR) in the destination operand. The destination operand specifies a memory location.</p><p>In legacy or compatibility mode, the destination operand is a 6-byte memory location. If the operand-size attribute is 16 bits, the limit is stored in the low 2 bytes and the 24-bit base address is stored in bytes 3-5, and byte 6 is zero-filled. If the operand-size attribute is 32 bits, the 16-bit limit field of the register is stored in the low 2 bytes of the memory location and the 32-bit base address is stored in the high 4 bytes.</p><p>In IA-32e mode, the operand size is fixed at 8+2 bytes. The instruction stores an 8-byte base and a 2-byte limit.</p><p>SGDT is useful only to operating-system software. However, it can be used in application programs without causing an exception to be generated. See \u201cLGDT/LIDT\u2014Load Global/Interrupt Descriptor Table Register\u201d in Chapter 3, <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2A</em>, for information on loading the GDTR and IDTR.</p>",
                 "tooltip": "Stores the content of the global descriptor table register (GDTR) in the destination operand. The destination operand specifies a memory location."
             };

         case "SAR":
         case "SHL":
         case "SAL":
             return {
                 "url": "http://www.felixcloutier.com/x86/SHL.html",
                 "html": "<p>Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand.</p><p>The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.</p><p>The shift arithmetic left (SAL) and shift logical left (SHL) instructions perform the same operation; they shift the bits in the destination operand to the left (toward more significant bit locations). For each shift count, the most significant bit of the destination operand is shifted into the CF flag, and the least significant bit is cleared (see Figure 7-7 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>).</p><p>The shift arithmetic right (SAR) and shift logical right (SHR) instructions shift the bits of the destination operand to the right (toward less significant bit locations). For each shift count, the least significant bit of the destination operand is shifted into the CF flag, and the most significant bit is either set or cleared depending on the instruction type. The SHR instruction clears the most significant bit (see Figure 7-8 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>); the SAR instruction sets or clears the most significant bit to correspond to the sign (most significant bit) of the original value in the destination operand. In effect, the SAR instruction fills the empty bit position\u2019s shifted value with the sign of the unshifted value (see Figure 7-9 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>).</p><p>The SAR and SHR instructions can be used to perform signed or unsigned division, respectively, of the destination operand by powers of 2. For example, using the SAR instruction to shift a signed integer 1 bit to the right divides the value by 2.</p>",
                 "tooltip": "Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand."
             };

         case "SHLD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SHLD.html",
                 "html": "<p>The SHLD instruction is used for multi-precision shifts of 64 bits or more.</p><p>The instruction shifts the first operand (destination operand) to the left the number of bits specified by the third operand (count operand). The second operand (source operand) provides bits to shift in from the right (starting with bit 0 of the destination operand).</p><p>The destination operand can be a register or a memory location; the source operand is a register. The count operand is an unsigned integer that can be stored in an immediate byte or in the CL register. If the count operand is CL, the shift count is the logical AND of CL and a count mask. In non-64-bit modes and default 64-bit mode, only bits 0 through 4 of the count are used. This masks the count to a value between 0 and 31. If a count is greater than the operand size, the result is undefined.</p><p>If the count is 1 or greater, the CF flag is filled with the last bit shifted out of the destination operand. For a 1-bit shift, the OF flag is set if a sign change occurred; otherwise, it is cleared. If the count operand is 0, flags are not affected.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits (upgrading the count mask to 6 bits). See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "The SHLD instruction is used for multi-precision shifts of 64 bits or more."
             };

         case "SHRD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SHRD.html",
                 "html": "<p>The SHRD instruction is useful for multi-precision shifts of 64 bits or more.</p><p>The instruction shifts the first operand (destination operand) to the right the number of bits specified by the third operand (count operand). The second operand (source operand) provides bits to shift in from the left (starting with the most significant bit of the destination operand).</p><p>The destination operand can be a register or a memory location; the source operand is a register. The count operand is an unsigned integer that can be stored in an immediate byte or the CL register. If the count operand is CL, the shift count is the logical AND of CL and a count mask. In non-64-bit modes and default 64-bit mode, the width of the count mask is 5 bits. Only bits 0 through 4 of the count register are used (masking the count to a value between 0 and 31). If the count is greater than the operand size, the result is undefined.</p><p>If the count is 1 or greater, the CF flag is filled with the last bit shifted out of the destination operand. For a 1-bit shift, the OF flag is set if a sign change occurred; otherwise, it is cleared. If the count operand is 0, flags are not affected.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits (upgrading the count mask to 6 bits). See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "The SHRD instruction is useful for multi-precision shifts of 64 bits or more."
             };

         case "VSHUFPD":
         case "SHUFPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SHUFPD.html",
                 "html": "<p>Moves either of the two packed double-precision floating-point values from the destination operand (first operand) into the low quadword of the destination operand; moves either of the two packed double-precision floating-point values from the source operand into the high quadword of the destination operand (see Figure 4-21). The select operand (third operand) determines which values are moved to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Moves either of the two packed double-precision floating-point values from the destination operand (first operand) into the low quadword of the destination operand; moves either of the two packed double-precision floating-point values from the source operand into the high quadword of the destination operand (see Figure 4-21). The select operand (third operand) determines which values are moved to the destination operand."
             };

         case "SHUFPS":
         case "VSHUFPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SHUFPS.html",
                 "html": "<p>Moves two of the four packed single-precision floating-point values from the destination operand (first operand) into the low quadword of the destination operand; moves two of the four packed single-precision floating-point values from the source operand (second operand) into the high quadword of the destination operand (see Figure 4-22). The select operand (third operand) determines which values are moved to the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Moves two of the four packed single-precision floating-point values from the destination operand (first operand) into the low quadword of the destination operand; moves two of the four packed single-precision floating-point values from the source operand (second operand) into the high quadword of the destination operand (see Figure 4-22). The select operand (third operand) determines which values are moved to the destination operand."
             };

         case "SIDT":
             return {
                 "url": "http://www.felixcloutier.com/x86/SIDT.html",
                 "html": "<p>Stores the content of the interrupt descriptor table register (IDTR) in the destination operand. The destination operand specifies a 6-byte memory location.</p><p>In non-64-bit modes, if the operand-size attribute is 32 bits, the 16-bit limit field of the register is stored in the low 2 bytes of the memory location and the 32-bit base address is stored in the high 4 bytes. If the operand-size attribute is 16 bits, the limit is stored in the low 2 bytes and the 24-bit base address is stored in the third, fourth, and fifth byte, with the sixth byte filled with 0s.</p><p>In 64-bit mode, the operand size is fixed at 8+2 bytes. The instruction stores 8-byte base and 2-byte limit values.</p><p>SIDT is only useful in operating-system software; however, it can be used in application programs without causing an exception to be generated. See \u201cLGDT/LIDT\u2014Load Global/Interrupt Descriptor Table Register\u201d in Chapter 3, <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2A</em>, for information on loading the GDTR and IDTR.</p>",
                 "tooltip": "Stores the content of the interrupt descriptor table register (IDTR) in the destination operand. The destination operand specifies a 6-byte memory location."
             };

         case "SLDT":
             return {
                 "url": "http://www.felixcloutier.com/x86/SLDT.html",
                 "html": "<p>Stores the segment selector from the local descriptor table register (LDTR) in the destination operand. The destination operand can be a general-purpose register or a memory location. The segment selector stored with this instruction points to the segment descriptor (located in the GDT) for the current LDT. This instruction can only be executed in protected mode.</p><p>Outside IA-32e mode, when the destination operand is a 32-bit register, the 16-bit segment selector is copied into the low-order 16 bits of the register. The high-order 16 bits of the register are cleared for the Pentium 4, Intel Xeon, and P6 family processors. They are undefined for Pentium, Intel486, and Intel386 processors. When the destination operand is a memory location, the segment selector is written to memory as a 16-bit quantity, regardless of the operand size.</p><p>In compatibility mode, when the destination operand is a 32-bit register, the 16-bit segment selector is copied into the low-order 16 bits of the register. The high-order 16 bits of the register are cleared. When the destination operand is a memory location, the segment selector is written to memory as a 16-bit quantity, regardless of the operand size.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). The behavior of SLDT with a 64-bit register is to zero-extend the 16-bit selector and store it in the register. If the destination is memory and operand size is 64, SLDT will write the 16-bit selector to memory as a 16-bit quantity, regardless of the operand size.</p>",
                 "tooltip": "Stores the segment selector from the local descriptor table register (LDTR) in the destination operand. The destination operand can be a general-purpose register or a memory location. The segment selector stored with this instruction points to the segment descriptor (located in the GDT) for the current LDT. This instruction can only be executed in protected mode."
             };

         case "SMSW":
             return {
                 "url": "http://www.felixcloutier.com/x86/SMSW.html",
                 "html": "<p>Stores the machine status word (bits 0 through 15 of control register CR0) into the destination operand. The destination operand can be a general-purpose register or a memory location.</p><p>In non-64-bit modes, when the destination operand is a 32-bit register, the low-order 16 bits of register CR0 are copied into the low-order 16 bits of the register and the high-order 16 bits are undefined. When the destination operand is a memory location, the low-order 16 bits of register CR0 are written to memory as a 16-bit quantity, regardless of the operand size.</p><p>In 64-bit mode, the behavior of the SMSW instruction is defined by the following examples:</p><p>SMSW is only useful in operating-system software. However, it is not a privileged instruction and can be used in application programs. It is provided for compatibility with the Intel 286 processor. Programs and procedures intended to run on the Pentium 4, Intel Xeon, P6 family, Pentium, Intel486, and Intel386 processors should use the MOV (control registers) instruction to load the machine status word.</p><p>See \u201cChanges to Instruction Behavior in VMX Non-Root Operation\u201d in Chapter 25 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3C</em>, for more information about the behavior of this instruction in VMX non-root operation.</p>",
                 "tooltip": "Stores the machine status word (bits 0 through 15 of control register CR0) into the destination operand. The destination operand can be a general-purpose register or a memory location."
             };

         case "SQRTPD":
         case "VSQRTPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SQRTPD.html",
                 "html": "<p>Performs a SIMD computation of the square roots of the two packed double-precision floating-point values in the source operand (second operand) and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the source operand is an XMM register or a 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD computation of the square roots of the two packed double-precision floating-point values in the source operand (second operand) and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD double-precision floating-point operation."
             };

         case "VSQRTPS":
         case "SQRTPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SQRTPS.html",
                 "html": "<p>Performs a SIMD computation of the square roots of the four packed single-precision floating-point values in the source operand (second operand) stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD single-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destina-tion is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the source operand second source operand or a 128-bit memory location. The destina-tion operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD computation of the square roots of the four packed single-precision floating-point values in the source operand (second operand) stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD single-precision floating-point operation."
             };

         case "VSQRTSD":
         case "SQRTSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SQRTSD.html",
                 "html": "<p>Computes the square root of the low double-precision floating-point value in the source operand (second operand) and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the desti-nation operand remains unchanged. See Figure 11-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1</em>, for an illustration of a scalar double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Computes the square root of the low double-precision floating-point value in the source operand (second operand) and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the desti-nation operand remains unchanged. See Figure 11-4 in the Intel\u00ae 64 and IA-32 Architectures Software Devel-oper\u2019s Manual, Volume 1, for an illustration of a scalar double-precision floating-point operation."
             };

         case "SQRTSS":
         case "VSQRTSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SQRTSS.html",
                 "html": "<p>Computes the square root of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order double-words of the destination operand remain unchanged. See Figure 10-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar single-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Computes the square root of the low single-precision floating-point value in the source operand (second operand) and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order double-words of the destination operand remain unchanged. See Figure 10-6 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar single-precision floating-point operation."
             };

         case "STAC":
             return {
                 "url": "http://www.felixcloutier.com/x86/STAC.html",
                 "html": "<p>Sets the AC flag bit in EFLAGS register. This may enable alignment checking of user-mode data accesses. This allows explicit supervisor-mode data accesses to user-mode pages even if the SMAP bit is set in the CR4 register.</p><p>This instruction's operation is the same in non-64-bit modes and 64-bit mode. Attempts to execute STAC when CPL &gt; 0 cause #UD.</p>",
                 "tooltip": "Sets the AC flag bit in EFLAGS register. This may enable alignment checking of user-mode data accesses. This allows explicit supervisor-mode data accesses to user-mode pages even if the SMAP bit is set in the CR4 register."
             };

         case "STC":
             return {
                 "url": "http://www.felixcloutier.com/x86/STC.html",
                 "html": "<p>Sets the CF flag in the EFLAGS register. Operation is the same in all modes.</p>",
                 "tooltip": "Sets the CF flag in the EFLAGS register. Operation is the same in all modes."
             };

         case "STD":
             return {
                 "url": "http://www.felixcloutier.com/x86/STD.html",
                 "html": "<p>Sets the DF flag in the EFLAGS register. When the DF flag is set to 1, string operations decrement the index regis-ters (ESI and/or EDI). Operation is the same in all modes.</p>",
                 "tooltip": "Sets the DF flag in the EFLAGS register. When the DF flag is set to 1, string operations decrement the index regis-ters (ESI and/or EDI). Operation is the same in all modes."
             };

         case "STI":
             return {
                 "url": "http://www.felixcloutier.com/x86/STI.html",
                 "html": "<p>If protected-mode virtual interrupts are not enabled, STI sets the interrupt flag (IF) in the EFLAGS register. After the IF flag is set, the processor begins responding to external, maskable interrupts after the next instruction is executed. The delayed effect of this instruction is provided to allow interrupts to be enabled just before returning from a procedure (or subroutine). For instance, if an STI instruction is followed by an RET instruction, the RET instruction is allowed to execute before external interrupts are recognized<sup>1</sup>. If the STI instruction is followed by a CLI instruction (which clears the IF flag), the effect of the STI instruction is negated.</p><p>The IF flag and the STI and CLI instructions do not prohibit the generation of exceptions and NMI interrupts. NMI interrupts (and SMIs) may be blocked for one macroinstruction following an STI.</p><p>When protected-mode virtual interrupts are enabled, CPL is 3, and IOPL is less than 3; STI sets the VIF flag in the EFLAGS register, leaving IF unaffected.</p><p>Table 4-16 indicates the action of the STI instruction depending on the processor\u2019s mode of operation and the CPL/IOPL settings of the running program or procedure.</p><p>Operation is the same in all modes.</p>",
                 "tooltip": "If protected-mode virtual interrupts are not enabled, STI sets the interrupt flag (IF) in the EFLAGS register. After the IF flag is set, the processor begins responding to external, maskable interrupts after the next instruction is executed. The delayed effect of this instruction is provided to allow interrupts to be enabled just before returning from a procedure (or subroutine). For instance, if an STI instruction is followed by an RET instruction, the RET instruction is allowed to execute before external interrupts are recognized1. If the STI instruction is followed by a CLI instruction (which clears the IF flag), the effect of the STI instruction is negated."
             };

         case "VSTMXCSR":
         case "STMXCSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/STMXCSR.html",
                 "html": "<p>Stores the contents of the MXCSR control and status register to the destination operand. The destination operand is a 32-bit memory location. The reserved bits in the MXCSR register are stored as 0s.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>VEX.L must be 0, otherwise instructions will #UD.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Stores the contents of the MXCSR control and status register to the destination operand. The destination operand is a 32-bit memory location. The reserved bits in the MXCSR register are stored as 0s."
             };

         case "STOSD":
         case "STOSW":
         case "STOSQ":
         case "STOSB":
         case "STOS":
             return {
                 "url": "http://www.felixcloutier.com/x86/STOSQ.html",
                 "html": "<p>In non-64-bit and default 64-bit mode; stores a byte, word, or doubleword from the AL, AX, or EAX register (respectively) into the destination operand. The destination operand is a memory location, the address of which is read from either the ES:EDI or ES:DI register (depending on the address-size attribute of the instruction and the mode of operation). The ES segment cannot be overridden with a segment override prefix.</p><p>At the assembly-code level, two forms of the instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the STOS mnemonic) allows the destination operand to be specified explicitly. Here, the destination operand should be a symbol that indicates the size and location of the destination value. The source operand is then automatically selected to match the size of the destination operand (the AL register for byte operands, AX for word operands, EAX for doubleword operands). The explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the destination operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the ES:(E)DI register. These must be loaded correctly before the store string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, doubleword, and quadword versions of the STOS instructions. Here also ES:(E)DI is assumed to be the destination operand and AL, AX, or EAX is assumed to be the source operand. 
The size of the destination and source operands is selected by the mnemonic: STOSB (byte read from register AL), STOSW (word from AX), STOSD (doubleword from EAX).</p><p>After the byte, word, or doubleword is transferred from the register to the memory location, the (E)DI register is incremented or decremented according to the setting of the DF flag in the EFLAGS register. If the DF flag is 0, the register is incremented; if the DF flag is 1, the register is decremented (the register is incremented or decremented by 1 for byte operations, by 2 for word operations, by 4 for doubleword operations).</p><h3>NOTE</h3>",
                 "tooltip": "In non-64-bit and default 64-bit mode; stores a byte, word, or doubleword from the AL, AX, or EAX register (respectively) into the destination operand. The destination operand is a memory location, the address of which is read from either the ES:EDI or ES:DI register (depending on the address-size attribute of the instruction and the mode of operation). The ES segment cannot be overridden with a segment override prefix."
             };

         case "STR":
             return {
                 "url": "http://www.felixcloutier.com/x86/STR.html",
                 "html": "<p>Stores the segment selector from the task register (TR) in the destination operand. The destination operand can be a general-purpose register or a memory location. The segment selector stored with this instruction points to the task state segment (TSS) for the currently running task.</p><p>When the destination operand is a 32-bit register, the 16-bit segment selector is copied into the lower 16 bits of the register and the upper 16 bits of the register are cleared. When the destination operand is a memory location, the segment selector is written to memory as a 16-bit quantity, regardless of operand size.</p><p>In 64-bit mode, operation is the same. The size of the memory operand is fixed at 16 bits. In register stores, the 2-byte TR is zero extended if stored to a 64-bit register.</p><p>The STR instruction is useful only in operating-system software. It can only be executed in protected mode.</p>",
                 "tooltip": "Stores the segment selector from the task register (TR) in the destination operand. The destination operand can be a general-purpose register or a memory location. The segment selector stored with this instruction points to the task state segment (TSS) for the currently running task."
             };

         case "SUB":
             return {
                 "url": "http://www.felixcloutier.com/x86/SUB.html",
                 "html": "<p>Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, register, or memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p><p>The SUB instruction performs integer subtraction. It evaluates the result for both signed and unsigned integer operands and sets the OF and CF flags to indicate an overflow in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>",
                 "tooltip": "Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, register, or memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format."
             };

         case "VSUBPD":
         case "SUBPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SUBPD.html",
                 "html": "<p>Performs a SIMD subtract of the two packed double-precision floating-point values in the source operand (second operand) from the two packed double-precision floating-point values in the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: T second source can be an XMM register or an 128-bit memory location. The destina-tion is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD subtract of the two packed double-precision floating-point values in the source operand (second operand) from the two packed double-precision floating-point values in the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD double-precision floating-point operation."
             };

         case "SUBPS":
         case "VSUBPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SUBPS.html",
                 "html": "<p>Performs a SIMD subtract of the four packed single-precision floating-point values in the source operand (second operand) from the four packed single-precision floating-point values in the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-nation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD subtract of the four packed single-precision floating-point values in the source operand (second operand) from the four packed single-precision floating-point values in the destination operand (first operand), and stores the packed single-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 10-5 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD double-precision floating-point operation."
             };

         case "SUBSD":
         case "VSUBSD":
             return {
                 "url": "http://www.felixcloutier.com/x86/SUBSD.html",
                 "html": "<p>Subtracts the low double-precision floating-point value in the source operand (second operand) from the low double-precision floating-point value in the destination operand (first operand), and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the destination operand remains unchanged. See Figure 11-4 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar double-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:64) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Subtracts the low double-precision floating-point value in the source operand (second operand) from the low double-precision floating-point value in the destination operand (first operand), and stores the double-precision floating-point result in the destination operand. The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The high quadword of the destination operand remains unchanged. See Figure 11-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar double-precision floating-point operation."
             };

         case "VSUBSS":
         case "SUBSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SUBSS.html",
                 "html": "<p>Subtracts the low single-precision floating-point value in the source operand (second operand) from the low single-precision floating-point value in the destination operand (first operand), and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a scalar single-precision floating-point operation.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (VLMAX-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (127:32) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (VLMAX-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Subtracts the low single-precision floating-point value in the source operand (second operand) from the low single-precision floating-point value in the destination operand (first operand), and stores the single-precision floating-point result in the destination operand. The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The three high-order doublewords of the destination operand remain unchanged. See Figure 10-6 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a scalar single-precision floating-point operation."
             };

         case "SWAPGS":
             return {
                 "url": "http://www.felixcloutier.com/x86/SWAPGS.html",
                 "html": "<p>SWAPGS exchanges the current GS base register value with the value contained in MSR address C0000102H (IA32_KERNEL_GS_BASE). The SWAPGS instruction is a privileged instruction intended for use by system soft-ware.</p><p>When using SYSCALL to implement system calls, there is no kernel stack at the OS entry point. Neither is there a straightforward method to obtain a pointer to kernel structures from which the kernel stack pointer could be read. Thus, the kernel cannot save general purpose registers or reference memory.</p><p>By design, SWAPGS does not require any general purpose registers or memory operands. No registers need to be saved before using the instruction. SWAPGS exchanges the CPL 0 data pointer from the IA32_KERNEL_GS_BASE MSR with the GS base register. The kernel can then use the GS prefix on normal memory references to access kernel data structures. Similarly, when the OS kernel is entered using an interrupt or exception (where the kernel stack is already set up), SWAPGS can be used to quickly get a pointer to the kernel data structures.</p><p>The IA32_KERNEL_GS_BASE MSR itself is only accessible using RDMSR/WRMSR instructions. Those instructions are only accessible at privilege level 0. The WRMSR instruction ensures that the IA32_KERNEL_GS_BASE MSR contains a canonical address.</p>",
                 "tooltip": "SWAPGS exchanges the current GS base register value with the value contained in MSR address C0000102H (IA32_KERNEL_GS_BASE). The SWAPGS instruction is a privileged instruction intended for use by system soft-ware."
             };

         case "SYSCALL":
             return {
                 "url": "http://www.felixcloutier.com/x86/SYSCALL.html",
                 "html": "<p>SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)</p><p>SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR.</p><p>SYSCALL loads the CS and SS selectors with values derived from bits 47:32 of the IA32_STAR MSR. However, the CS and SS descriptor caches are <strong>not</strong> loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the respon-sibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values corre-spond to the fixed values loaded into the descriptor caches; the SYSCALL instruction does not ensure this correspondence.</p><p>The SYSCALL instruction does not save the stack pointer (RSP). If the OS system-call handler will change the stack pointer, it is the responsibility of software to save the previous value of the stack pointer. This might be done prior to executing SYSCALL, with software restoring the stack pointer with the instruction following SYSCALL (which will be executed after SYSRET). Alternatively, the OS system-call handler may save the stack pointer and restore it before executing SYSRET.</p>",
                 "tooltip": "SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)"
             };

         case "SYSENTER":
             return {
                 "url": "http://www.felixcloutier.com/x86/SYSENTER.html",
                 "html": "<p>Executes a fast call to a level 0 system procedure or routine. SYSENTER is a companion instruction to SYSEXIT. The instruction is optimized to provide the maximum performance for system calls from user code running at privilege level 3 to operating system or executive procedures running at privilege level 0.</p><p>When executed in IA-32e mode, the SYSENTER instruction transitions the logical processor to 64-bit mode; other-wise, the logical processor remains in protected mode.</p><p>Prior to executing the SYSENTER instruction, software must specify the privilege level 0 code segment and code entry point, and the privilege level 0 stack segment and stack pointer by writing values to the following MSRs:</p><p>These MSRs can be read from and written to using RDMSR/WRMSR. The WRMSR instruction ensures that the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs always contain canonical addresses.</p><p>While SYSENTER loads the CS and SS selectors with values derived from the IA32_SYSENTER_CS MSR, the CS and SS descriptor caches are <strong>not</strong> loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSENTER instruction does not ensure this correspondence.</p>",
                 "tooltip": "Executes a fast call to a level 0 system procedure or routine. SYSENTER is a companion instruction to SYSEXIT. The instruction is optimized to provide the maximum performance for system calls from user code running at privilege level 3 to operating system or executive procedures running at privilege level 0."
             };

         case "SYSEXIT":
             return {
                 "url": "http://www.felixcloutier.com/x86/SYSEXIT.html",
                 "html": "<p>Executes a fast return to privilege level 3 user code. SYSEXIT is a companion instruction to the SYSENTER instruc-tion. The instruction is optimized to provide the maximum performance for returns from system procedures executing at protections levels 0 to user procedures executing at protection level 3. It must be executed from code executing at privilege level 0.</p><p>With a 64-bit operand size, SYSEXIT remains in 64-bit mode; otherwise, it either enters compatibility mode (if the logical processor is in IA-32e mode) or remains in protected mode (if it is not).</p><p>Prior to executing SYSEXIT, software must specify the privilege level 3 code segment and code entry point, and the privilege level 3 stack segment and stack pointer by writing values into the following MSR and general-purpose registers:</p><p>The IA32_SYSENTER_CS MSR can be read from and written to using RDMSR and WRMSR.</p><p>While SYSEXIT loads the CS and SS selectors with values derived from the IA32_SYSENTER_CS MSR, the CS and SS descriptor caches are <strong>not</strong> loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSEXIT instruction does not ensure this correspondence.</p>",
                 "tooltip": "Executes a fast return to privilege level 3 user code. SYSEXIT is a companion instruction to the SYSENTER instruc-tion. The instruction is optimized to provide the maximum performance for returns from system procedures executing at protections levels 0 to user procedures executing at protection level 3. It must be executed from code executing at privilege level 0."
             };

         case "SYSRET":
             return {
                 "url": "http://www.felixcloutier.com/x86/SYSRET.html",
                 "html": "<p>SYSRET is a companion instruction to the SYSCALL instruction. It returns from an OS system-call handler to user code at privilege level 3. It does so by loading RIP from RCX and loading RFLAGS from R11.<sup>1</sup> With a 64-bit operand size, SYSRET remains in 64-bit mode; otherwise, it enters compatibility mode and only the low 32 bits of the regis-ters are loaded.</p><p>SYSRET loads the CS and SS selectors with values derived from bits 63:48 of the IA32_STAR MSR. However, the CS and SS descriptor caches are <strong>not</strong> loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the respon-sibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values corre-spond to the fixed values loaded into the descriptor caches; the SYSRET instruction does not ensure this correspondence.</p><p>The SYSRET instruction does not modify the stack pointer (ESP or RSP). For that reason, it is necessary for software to switch to the user stack. The OS may load the user stack pointer (if it was saved after SYSCALL) before executing SYSRET; alternatively, user code may load the stack pointer (if it was saved before SYSCALL) after receiving control from SYSRET.</p><p>If the OS loads the stack pointer before executing SYSRET, it must ensure that the handler of any interrupt or exception delivered between restoring the stack pointer and successful execution of SYSRET is not invoked with the user stack. It can do so using approaches such as the following:</p><p>\u2014 Confirming that the value of RCX is canonical before executing SYSRET.</p>",
                 "tooltip": "SYSRET is a companion instruction to the SYSCALL instruction. It returns from an OS system-call handler to user code at privilege level 3. It does so by loading RIP from RCX and loading RFLAGS from R11.1 With a 64-bit operand size, SYSRET remains in 64-bit mode; otherwise, it enters compatibility mode and only the low 32 bits of the regis-ters are loaded."
             };

         case "TEST":
             return {
                 "url": "http://www.felixcloutier.com/x86/TEST.html",
                 "html": "<p>Computes the bit-wise logical AND of first operand (source 1 operand) and the second operand (source 2 operand) and sets the SF, ZF, and PF status flags according to the result. The result is then discarded.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Computes the bit-wise logical AND of first operand (source 1 operand) and the second operand (source 2 operand) and sets the SF, ZF, and PF status flags according to the result. The result is then discarded."
             };

         case "TZCNT":
             return {
                 "url": "http://www.felixcloutier.com/x86/TZCNT.html",
                 "html": "<p>TZCNT counts the number of trailing least significant zero bits in source operand (second operand) and returns the result in destination operand (first operand). TZCNT is an extension of the BSF instruction. The key difference between TZCNT and BSF instruction is that TZCNT provides operand size as output when source operand is zero while in the case of BSF instruction, if source operand is zero, the content of destination operand are undefined. On processors that do not support TZCNT, the instruction byte encoding is executed as BSF.</p>",
                 "tooltip": "TZCNT counts the number of trailing least significant zero bits in source operand (second operand) and returns the result in destination operand (first operand). TZCNT is an extension of the BSF instruction. The key difference between TZCNT and BSF instruction is that TZCNT provides operand size as output when source operand is zero while in the case of BSF instruction, if source operand is zero, the content of destination operand are undefined. On processors that do not support TZCNT, the instruction byte encoding is executed as BSF."
             };

         case "VUCOMISD":
         case "UCOMISD":
             return {
                 "url": "http://www.felixcloutier.com/x86/UCOMISD.html",
                 "html": "<p>Performs an unordered compare of the double-precision floating-point values in the low quadwords of source operand 1 (first operand) and source operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>Source operand 1 is an XMM register; source operand 2 can be an XMM register or a 64 bit memory location.</p><p>The UCOMISD instruction differs from the COMISD instruction in that it signals a SIMD floating-point invalid oper-ation exception (#I) only when a source operand is an SNaN. The COMISD instruction signals an invalid operation exception if a source operand is either a QNaN or an SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an unordered compare of the double-precision floating-point values in the low quadwords of source operand 1 (first operand) and source operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "UCOMISS":
         case "VUCOMISS":
             return {
                 "url": "http://www.felixcloutier.com/x86/UCOMISS.html",
                 "html": "<p>Performs an unordered compare of the single-precision floating-point values in the low doublewords of the source operand 1 (first operand) and the source operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>Source operand 1 is an XMM register; source operand 2 can be an XMM register or a 32 bit memory location.</p><p>The UCOMISS instruction differs from the COMISS instruction in that it signals a SIMD floating-point invalid opera-tion exception (#I) only when a source operand is an SNaN. The COMISS instruction signals an invalid operation exception if a source operand is either a QNaN or an SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an unordered compare of the single-precision floating-point values in the low doublewords of the source operand 1 (first operand) and the source operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0."
             };

         case "UD2":
             return {
                 "url": "http://www.felixcloutier.com/x86/UD2.html",
                 "html": "<p>Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode exception. The opcode for this instruction is reserved for this purpose.</p><p>Other than raising the invalid opcode exception, this instruction has no effect on processor state or memory.</p><p>Even though it is the execution of the UD2 instruction that causes the invalid opcode exception, the instruction pointer saved by delivery of the exception references the UD2 instruction (and not the following instruction).</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode exception. The opcode for this instruction is reserved for this purpose."
             };

         case "UNPCKHPD":
         case "VUNPCKHPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/UNPCKHPD.html",
                 "html": "<p>Performs an interleaved unpack of the high double-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-23.</p><svg height=\"225.45\" viewbox=\"111.840000 798713.039995 379.199990 150.300000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"141.1794\" y=\"798741.527284\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.860144\" x=\"145.6195\" y=\"798785.867384\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"141.1794\" y=\"798847.847184\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"143.28\" x=\"167.88\" y=\"798837.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.12\" x=\"311.58\" y=\"798729.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.18\" x=\"167.7\" y=\"798729.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.18\" x=\"167.7\" y=\"798774.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.18\" x=\"310.92\" y=\"798837.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"145.02\" x=\"310.68\" y=\"798774.96\"></rect>\n<path d=\"M167.700000,798729.719990 L167.700000,798730.200000 L312.120000,798730.200000 L312.120000,798729.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.580000,798729.719990 L311.580000,798730.200000 L455.940000,798730.200000 L455.940000,798729.719990 \" style=\"stroke:black\"></path>\n<path d=\"M167.460000,798729.720000 L167.460000,798747.960000 L167.940010,798747.960000 
L167.940010,798729.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.340000,798729.720000 L311.340000,798747.960000 L311.820010,798747.960000 L311.820010,798729.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.640000,798729.960000 L311.640000,798748.200000 L312.119980,798748.200000 L312.119980,798729.960000 \" style=\"stroke:black\"></path>\n<path d=\"M455.460000,798729.960000 L455.460000,798748.200000 L455.940010,798748.200000 L455.940010,798729.960000 \" style=\"stroke:black\"></path>\n<path d=\"M167.460000,798747.719990 L167.460000,798748.200000 L311.880000,798748.200000 L311.880000,798747.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.340000,798747.719990 L311.340000,798748.200000 L455.700000,798748.200000 L455.700000,798747.719990 \" style=\"stroke:black\"></path>\n<path d=\"M167.700000,798774.719990 L167.700000,798775.200000 L312.120000,798775.200000 L312.120000,798774.719990 \" style=\"stroke:black\"></path>\n<path d=\"M310.680000,798774.719990 L310.680000,798775.200000 L455.940000,798775.200000 L455.940000,798774.719990 \" style=\"stroke:black\"></path>\n<path d=\"M167.460000,798774.720000 L167.460000,798792.960000 L167.940010,798792.960000 L167.940010,798774.720000 \" style=\"stroke:black\"></path>\n<path d=\"M310.440000,798774.720000 L310.440000,798792.960000 L310.920010,798792.960000 L310.920010,798774.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.640000,798774.960000 L311.640000,798793.200000 L312.119980,798793.200000 L312.119980,798774.960000 \" style=\"stroke:black\"></path>\n<path d=\"M455.460000,798774.960000 L455.460000,798793.200000 L455.940010,798793.200000 L455.940010,798774.960000 \" style=\"stroke:black\"></path>\n<path d=\"M239.220000,798792.480010 L239.220000,798792.720000 L239.700000,798792.720000 L239.700000,798792.480010 \" style=\"stroke:black\"></path>\n<path d=\"M167.460000,798792.719990 L167.460000,798793.200000 L311.880000,798793.200000 L311.880000,798792.719990 \" 
style=\"stroke:black\"></path>\n<path d=\"M310.440000,798792.719990 L310.440000,798793.200000 L455.700000,798793.200000 L455.700000,798792.719990 \" style=\"stroke:black\"></path>\n<path d=\"M239.220000,798792.720000 L239.220000,798830.640000 L239.700000,798830.640000 L239.700000,798792.720000 \" style=\"stroke:black\"></path>\n<path d=\"M239.220000,798830.400000 L239.220000,798831.540000 L239.700000,798831.540000 L239.700000,798830.400000 \" style=\"stroke:black\"></path>\n<path d=\"M167.880000,798837.719990 L167.880000,798838.200000 L311.400000,798838.200000 L311.400000,798837.719990 \" style=\"stroke:black\"></path>\n<path d=\"M310.920000,798837.719990 L310.920000,798838.200000 L455.340000,798838.200000 L455.340000,798837.719990 \" style=\"stroke:black\"></path>\n<path d=\"M167.640000,798837.720000 L167.640000,798855.960000 L168.120000,798855.960000 L168.120000,798837.720000 \" style=\"stroke:black\"></path>\n<path d=\"M310.680000,798837.720000 L310.680000,798855.960000 L311.160010,798855.960000 L311.160010,798837.720000 \" style=\"stroke:black\"></path>\n<path d=\"M310.920000,798837.960000 L310.920000,798856.200000 L311.399980,798856.200000 L311.399980,798837.960000 \" style=\"stroke:black\"></path>\n<path d=\"M454.860000,798837.960000 L454.860000,798856.200000 L455.339980,798856.200000 L455.339980,798837.960000 \" style=\"stroke:black\"></path>\n<path d=\"M167.640000,798855.719990 L167.640000,798856.200000 L311.160000,798856.200000 L311.160000,798855.719990 \" style=\"stroke:black\"></path>\n<path d=\"M310.680000,798855.719990 L310.680000,798856.200000 L455.100000,798856.200000 L455.100000,798855.719990 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"234.48\" y=\"798850.007384\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"380.4603\" y=\"798743.207584\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" 
style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"233.76\" y=\"798742.307484\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"233.76\" y=\"798788.207484\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"377.2199\" y=\"798850.247584\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"379.557792\" y=\"798788.207484\">Y0</text></svg><h3>Figure 4-23.  UNPCKHPD Instruction High Unpack and Interleave Operation</h3><p>When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to 16-byte boundary and normal segment checking will still be enforced.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an interleaved unpack of the high double-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-23."
             };

         case "VUNPCKHPS":
         case "UNPCKHPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/UNPCKHPS.html",
                 "html": "<p>Performs an interleaved unpack of the high-order single-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-24. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register.</p><svg height=\"220.32\" viewbox=\"111.840000 800308.019995 379.199990 146.880000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"143.58\" y=\"800331.407484\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.860144\" x=\"148.0201\" y=\"800377.427384\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"143.58\" y=\"800439.227684\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"169.74\" y=\"800319.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"313.8\" y=\"800319.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"385.8\" y=\"800319.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"169.74\" y=\"800364.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"313.8\" y=\"800364.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"385.8\" y=\"800364.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"169.14\" y=\"800427.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"241.14\" y=\"800427.9\"></rect>\n<rect 
height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"313.14\" y=\"800427.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"385.14\" y=\"800427.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.06\" x=\"241.74\" y=\"800319.9\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.06\" x=\"241.74\" y=\"800364.9\"></rect>\n<path d=\"M169.740000,800319.659990 L169.740000,800320.140000 L241.980000,800320.140000 L241.980000,800319.659990 \" style=\"stroke:black\"></path>\n<path d=\"M241.740000,800319.659990 L241.740000,800320.140000 L314.040000,800320.140000 L314.040000,800319.659990 \" style=\"stroke:black\"></path>\n<path d=\"M313.800000,800319.659990 L313.800000,800320.140000 L386.040000,800320.140000 L386.040000,800319.659990 \" style=\"stroke:black\"></path>\n<path d=\"M385.800000,800319.659990 L385.800000,800320.140000 L458.040000,800320.140000 L458.040000,800319.659990 \" style=\"stroke:black\"></path>\n<path d=\"M169.500000,800319.660000 L169.500000,800337.900000 L169.980000,800337.900000 L169.980000,800319.660000 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800319.660000 L241.500000,800337.900000 L241.980000,800337.900000 L241.980000,800319.660000 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800319.660000 L313.560000,800337.900000 L314.040010,800337.900000 L314.040010,800319.660000 \" style=\"stroke:black\"></path>\n<path d=\"M385.560000,800319.660000 L385.560000,800337.900000 L386.040010,800337.900000 L386.040010,800319.660000 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800319.900000 L241.500000,800338.140000 L241.980000,800338.140000 L241.980000,800319.900000 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800319.900000 L313.560000,800338.140000 L314.040010,800338.140000 
L314.040010,800319.900000 \" style=\"stroke:black\"></path>\n<path d=\"M385.560000,800319.900000 L385.560000,800338.140000 L386.040010,800338.140000 L386.040010,800319.900000 \" style=\"stroke:black\"></path>\n<path d=\"M457.560000,800319.900000 L457.560000,800338.140000 L458.039980,800338.140000 L458.039980,800319.900000 \" style=\"stroke:black\"></path>\n<path d=\"M169.500000,800337.659990 L169.500000,800338.140000 L241.740000,800338.140000 L241.740000,800337.659990 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800337.659990 L241.500000,800338.140000 L313.800000,800338.140000 L313.800000,800337.659990 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800337.659990 L313.560000,800338.140000 L385.800000,800338.140000 L385.800000,800337.659990 \" style=\"stroke:black\"></path>\n<path d=\"M385.560000,800337.659990 L385.560000,800338.140000 L457.800000,800338.140000 L457.800000,800337.659990 \" style=\"stroke:black\"></path>\n<path d=\"M169.740000,800364.659990 L169.740000,800365.140000 L241.980000,800365.140000 L241.980000,800364.659990 \" style=\"stroke:black\"></path>\n<path d=\"M241.740000,800364.659990 L241.740000,800365.140000 L314.040000,800365.140000 L314.040000,800364.659990 \" style=\"stroke:black\"></path>\n<path d=\"M313.800000,800364.659990 L313.800000,800365.140000 L386.040000,800365.140000 L386.040000,800364.659990 \" style=\"stroke:black\"></path>\n<path d=\"M385.800000,800364.659990 L385.800000,800365.140000 L458.040000,800365.140000 L458.040000,800364.659990 \" style=\"stroke:black\"></path>\n<path d=\"M169.500000,800364.660000 L169.500000,800382.900000 L169.980000,800382.900000 L169.980000,800364.660000 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800364.660000 L241.500000,800382.900000 L241.980000,800382.900000 L241.980000,800364.660000 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800364.660000 L313.560000,800382.900000 L314.040010,800382.900000 L314.040010,800364.660000 \" 
style=\"stroke:black\"></path>\n<path d=\"M385.560000,800364.660000 L385.560000,800382.900000 L386.040010,800382.900000 L386.040010,800364.660000 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800364.900000 L241.500000,800383.140000 L241.980000,800383.140000 L241.980000,800364.900000 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800364.900000 L313.560000,800383.140000 L314.040010,800383.140000 L314.040010,800364.900000 \" style=\"stroke:black\"></path>\n<path d=\"M385.560000,800364.900000 L385.560000,800383.140000 L386.040010,800383.140000 L386.040010,800364.900000 \" style=\"stroke:black\"></path>\n<path d=\"M457.560000,800364.900000 L457.560000,800383.140000 L458.039980,800383.140000 L458.039980,800364.900000 \" style=\"stroke:black\"></path>\n<path d=\"M169.500000,800382.659990 L169.500000,800383.140000 L241.740000,800383.140000 L241.740000,800382.659990 \" style=\"stroke:black\"></path>\n<path d=\"M241.500000,800382.659990 L241.500000,800383.140000 L313.800000,800383.140000 L313.800000,800382.659990 \" style=\"stroke:black\"></path>\n<path d=\"M313.560000,800382.659990 L313.560000,800383.140000 L385.800000,800383.140000 L385.800000,800382.659990 \" style=\"stroke:black\"></path>\n<path d=\"M385.560000,800382.659990 L385.560000,800383.140000 L457.800000,800383.140000 L457.800000,800382.659990 \" style=\"stroke:black\"></path>\n<path d=\"M207.480000,800382.780010 L207.480000,800383.020000 L207.960010,800383.020000 L207.960010,800382.780010 \" style=\"stroke:black\"></path>\n<path d=\"M207.480000,800383.020000 L207.480000,800419.080000 L207.960010,800419.080000 L207.960010,800383.020000 \" style=\"stroke:black\"></path>\n<path d=\"M207.480000,800418.840000 L207.480000,800420.040000 L207.960010,800420.040000 L207.960010,800418.840000 \" style=\"stroke:black\"></path>\n<path d=\"M169.140000,800427.659990 L169.140000,800428.140000 L241.380000,800428.140000 L241.380000,800427.659990 \" style=\"stroke:black\"></path>\n<path 
d=\"M241.140000,800427.659990 L241.140000,800428.140000 L313.380000,800428.140000 L313.380000,800427.659990 \" style=\"stroke:black\"></path>\n<path d=\"M313.140000,800427.659990 L313.140000,800428.140000 L385.380000,800428.140000 L385.380000,800427.659990 \" style=\"stroke:black\"></path>\n<path d=\"M385.140000,800427.659990 L385.140000,800428.140000 L457.380000,800428.140000 L457.380000,800427.659990 \" style=\"stroke:black\"></path>\n<path d=\"M168.900000,800427.660000 L168.900000,800445.900000 L169.380000,800445.900000 L169.380000,800427.660000 \" style=\"stroke:black\"></path>\n<path d=\"M240.900000,800427.660000 L240.900000,800445.900000 L241.379980,800445.900000 L241.379980,800427.660000 \" style=\"stroke:black\"></path>\n<path d=\"M312.900000,800427.660000 L312.900000,800445.900000 L313.380010,800445.900000 L313.380010,800427.660000 \" style=\"stroke:black\"></path>\n<path d=\"M384.900000,800427.660000 L384.900000,800445.900000 L385.380010,800445.900000 L385.380010,800427.660000 \" style=\"stroke:black\"></path>\n<path d=\"M240.900000,800427.900000 L240.900000,800446.140000 L241.379980,800446.140000 L241.379980,800427.900000 \" style=\"stroke:black\"></path>\n<path d=\"M312.900000,800427.900000 L312.900000,800446.140000 L313.380010,800446.140000 L313.380010,800427.900000 \" style=\"stroke:black\"></path>\n<path d=\"M384.900000,800427.900000 L384.900000,800446.140000 L385.380010,800446.140000 L385.380010,800427.900000 \" style=\"stroke:black\"></path>\n<path d=\"M456.900000,800427.900000 L456.900000,800446.140000 L457.380010,800446.140000 L457.380010,800427.900000 \" style=\"stroke:black\"></path>\n<path d=\"M168.900000,800445.659990 L168.900000,800446.140000 L241.140000,800446.140000 L241.140000,800445.659990 \" style=\"stroke:black\"></path>\n<path d=\"M240.900000,800445.659990 L240.900000,800446.140000 L313.140000,800446.140000 L313.140000,800445.659990 \" style=\"stroke:black\"></path>\n<path d=\"M312.900000,800445.659990 L312.900000,800446.140000 
L385.140000,800446.140000 L385.140000,800445.659990 \" style=\"stroke:black\"></path>\n<path d=\"M384.900000,800445.659990 L384.900000,800446.140000 L457.140000,800446.140000 L457.140000,800445.659990 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"204.36\" y=\"800331.407484\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"345.069744\" y=\"800331.407484\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"417.076476\" y=\"800331.407484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.717246\" x=\"204.36\" y=\"800376.407484\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"339.364044\" y=\"800376.407484\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"411.370776\" y=\"800376.407484\">Y0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"203.52\" y=\"800439.647484\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"275.758152\" y=\"800439.647484\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"347.756904\" y=\"800439.647484\">Y2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"418.438956\" y=\"800439.647484\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"273.063012\" y=\"800331.407484\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.781086\" x=\"274.020612\" y=\"800376.407484\">Y2</text></svg><h3>Figure 4-24.  
UNPCKHPS Instruction High Unpack and Interleave Operation</h3><p>When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to 16-byte boundary and normal segment checking will still be enforced.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an interleaved unpack of the high-order single-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-24. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register."
             };

         case "VUNPCKLPD":
         case "UNPCKLPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/UNPCKLPD.html",
                 "html": "<p>Performs an interleaved unpack of the low double-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-25. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register.</p><svg height=\"222.4800075\" viewbox=\"111.840000 801892.019995 379.199990 148.320005\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"141.6594\" y=\"801918.527284\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.860144\" x=\"146.1594\" y=\"801962.867384\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"141.6594\" y=\"802024.847184\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"143.22\" x=\"168.42\" y=\"802014.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.12\" x=\"168.24\" y=\"801906.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.12\" x=\"168.24\" y=\"801951.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.18\" x=\"312.06\" y=\"801906.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"144.18\" x=\"311.4\" y=\"802014.96\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"145.02\" x=\"311.22\" y=\"801951.96\"></rect>\n<path d=\"M168.240000,801906.719990 L168.240000,801907.200000 L312.600000,801907.200000 L312.600000,801906.719990 \" style=\"stroke:black\"></path>\n<path d=\"M312.060000,801906.719990 L312.060000,801907.200000 L456.480000,801907.200000 L456.480000,801906.719990 \" 
style=\"stroke:black\"></path>\n<path d=\"M168.000000,801906.720000 L168.000000,801924.960000 L168.480000,801924.960000 L168.480000,801906.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.820000,801906.720000 L311.820000,801924.960000 L312.299980,801924.960000 L312.299980,801906.720000 \" style=\"stroke:black\"></path>\n<path d=\"M312.120000,801906.960000 L312.120000,801925.200000 L312.600010,801925.200000 L312.600010,801906.960000 \" style=\"stroke:black\"></path>\n<path d=\"M456.000000,801906.960000 L456.000000,801925.200000 L456.480010,801925.200000 L456.480010,801906.960000 \" style=\"stroke:black\"></path>\n<path d=\"M168.000000,801924.719990 L168.000000,801925.200000 L312.360000,801925.200000 L312.360000,801924.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.820000,801924.719990 L311.820000,801925.200000 L456.240000,801925.200000 L456.240000,801924.719990 \" style=\"stroke:black\"></path>\n<path d=\"M168.240000,801951.719990 L168.240000,801952.200000 L312.600000,801952.200000 L312.600000,801951.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.220000,801951.719990 L311.220000,801952.200000 L456.480000,801952.200000 L456.480000,801951.719990 \" style=\"stroke:black\"></path>\n<path d=\"M168.000000,801951.720000 L168.000000,801969.960000 L168.480000,801969.960000 L168.480000,801951.720000 \" style=\"stroke:black\"></path>\n<path d=\"M310.980000,801951.720000 L310.980000,801969.960000 L311.459980,801969.960000 L311.459980,801951.720000 \" style=\"stroke:black\"></path>\n<path d=\"M312.120000,801951.960000 L312.120000,801970.200000 L312.600010,801970.200000 L312.600010,801951.960000 \" style=\"stroke:black\"></path>\n<path d=\"M456.000000,801951.960000 L456.000000,801970.200000 L456.480010,801970.200000 L456.480010,801951.960000 \" style=\"stroke:black\"></path>\n<path d=\"M168.000000,801969.719990 L168.000000,801970.200000 L312.360000,801970.200000 L312.360000,801969.719990 \" style=\"stroke:black\"></path>\n<path 
d=\"M310.980000,801969.719990 L310.980000,801970.200000 L456.240000,801970.200000 L456.240000,801969.719990 \" style=\"stroke:black\"></path>\n<path d=\"M168.420000,802014.719990 L168.420000,802015.200000 L311.880000,802015.200000 L311.880000,802014.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.400000,802014.719990 L311.400000,802015.200000 L455.820000,802015.200000 L455.820000,802014.719990 \" style=\"stroke:black\"></path>\n<path d=\"M168.180000,802014.720000 L168.180000,802032.960000 L168.660000,802032.960000 L168.660000,802014.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.160000,802014.720000 L311.160000,802032.960000 L311.640010,802032.960000 L311.640010,802014.720000 \" style=\"stroke:black\"></path>\n<path d=\"M311.400000,802014.960000 L311.400000,802033.200000 L311.880010,802033.200000 L311.880010,802014.960000 \" style=\"stroke:black\"></path>\n<path d=\"M455.340000,802014.960000 L455.340000,802033.200000 L455.820010,802033.200000 L455.820010,802014.960000 \" style=\"stroke:black\"></path>\n<path d=\"M168.180000,802032.719990 L168.180000,802033.200000 L311.640000,802033.200000 L311.640000,802032.719990 \" style=\"stroke:black\"></path>\n<path d=\"M311.160000,802032.719990 L311.160000,802033.200000 L455.580000,802033.200000 L455.580000,802032.719990 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"234.96\" y=\"802027.007384\">Y0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"234.24\" y=\"801919.307484\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"234.24\" y=\"801965.207484\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"380.9403\" y=\"801920.207584\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" 
x=\"377.6999\" y=\"802027.247584\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.713256\" x=\"380.037792\" y=\"801965.207484\">Y0</text></svg><h3>Figure 4-25. UNPCKLPD Instruction Low Unpack and Interleave Operation</h3><p>When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to a 16-byte boundary and normal segment checking will still be enforced.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an interleaved unpack of the low double-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-25. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register."
             };

         case "UNPCKLPS":
         case "VUNPCKLPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/UNPCKLPS.html",
                 "html": "<p>Performs an interleaved unpack of the low-order single-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-26. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register.</p><svg height=\"234.5400075\" viewbox=\"111.840000 803476.019995 379.199990 156.360005\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"144.06\" y=\"803508.887484\">DEST</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"16.860144\" x=\"148.5001\" y=\"803554.907384\">SRC</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"21.320964\" x=\"144.06\" y=\"803616.707684\">DEST</text>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"242.28\" y=\"803497.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"242.28\" y=\"803542.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"170.28\" y=\"803497.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"314.28\" y=\"803497.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"386.28\" y=\"803497.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"170.28\" y=\"803542.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"314.28\" y=\"803542.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"386.28\" 
y=\"803542.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"169.62\" y=\"803605.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"241.62\" y=\"803605.44\"></rect>\n<rect height=\"18.0\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"72.0\" x=\"313.62\" y=\"803605.44\"></rect>\n<path d=\"M170.040000,803542.200000 L170.040000,803560.440000 L170.520000,803560.440000 L170.520000,803542.200000 \" style=\"stroke:black\"></path>\n<path d=\"M242.040000,803542.200000 L242.040000,803560.440000 L242.520010,803560.440000 L242.520010,803542.200000 \" style=\"stroke:black\"></path>\n<path d=\"M314.040000,803542.200000 L314.040000,803560.440000 L314.519980,803560.440000 L314.519980,803542.200000 \" style=\"stroke:black\"></path>\n<path d=\"M386.040000,803542.200000 L386.040000,803560.440000 L386.519980,803560.440000 L386.519980,803542.200000 \" style=\"stroke:black\"></path>\n<path d=\"M170.280000,803542.200020 L170.280000,803542.680000 L242.520000,803542.680000 L242.520000,803542.200020 \" style=\"stroke:black\"></path>\n<path d=\"M242.280000,803542.200020 L242.280000,803542.680000 L314.520000,803542.680000 L314.520000,803542.200020 \" style=\"stroke:black\"></path>\n<path d=\"M314.280000,803542.200020 L314.280000,803542.680000 L386.520000,803542.680000 L386.520000,803542.200020 \" style=\"stroke:black\"></path>\n<path d=\"M386.280000,803542.200020 L386.280000,803542.680000 L458.520000,803542.680000 L458.520000,803542.200020 \" style=\"stroke:black\"></path>\n<path d=\"M242.040000,803542.440000 L242.040000,803560.680000 L242.520010,803560.680000 L242.520010,803542.440000 \" style=\"stroke:black\"></path>\n<path d=\"M314.040000,803542.440000 L314.040000,803560.680000 L314.519980,803560.680000 L314.519980,803542.440000 \" style=\"stroke:black\"></path>\n<path d=\"M386.040000,803542.440000 L386.040000,803560.680000 
L386.519980,803560.680000 L386.519980,803542.440000 \" style=\"stroke:black\"></path>\n<path d=\"M458.040000,803542.440000 L458.040000,803560.680000 L458.519980,803560.680000 L458.519980,803542.440000 \" style=\"stroke:black\"></path>\n<path d=\"M170.040000,803560.199990 L170.040000,803560.680000 L242.280000,803560.680000 L242.280000,803560.199990 \" style=\"stroke:black\"></path>\n<path d=\"M242.040000,803560.199990 L242.040000,803560.680000 L314.280000,803560.680000 L314.280000,803560.199990 \" style=\"stroke:black\"></path>\n<path d=\"M314.040000,803560.199990 L314.040000,803560.680000 L386.280000,803560.680000 L386.280000,803560.199990 \" style=\"stroke:black\"></path>\n<path d=\"M386.040000,803560.199990 L386.040000,803560.680000 L458.280000,803560.680000 L458.280000,803560.199990 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.713256\" x=\"273.543012\" y=\"803508.887484\">X2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"274.500612\" y=\"803553.887484\">Y2</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"204.84\" y=\"803508.887484\">X3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.713256\" x=\"345.541764\" y=\"803508.887484\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"417.540516\" y=\"803508.887484\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"204.84\" y=\"803553.887484\">Y3</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"339.836064\" y=\"803553.887484\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"411.834816\" y=\"803553.887484\">Y0</text>\n<text 
lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"204.0\" y=\"803616.77786\">Y1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"276.238152\" y=\"803616.77786\">X1</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.980000pt\" textlength=\"9.777096\" x=\"348.236904\" y=\"803616.77786\">Y0</text></svg><h3>Figure 4-26. UNPCKLPS Instruction Low Unpack and Interleave Operation</h3><p>When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to a 16-byte boundary and normal segment checking will still be enforced.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs an interleaved unpack of the low-order single-precision floating-point values from the source operand (second operand) and the destination operand (first operand). See Figure 4-26. The source operand can be an XMM register or a 128-bit memory location; the destination operand is an XMM register."
             };

         case "VBROADCASTF128":
         case "VBROADCASTSD":
         case "VBROADCASTSS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VBROADCAST.html",
                 "html": "<p>Load floating point values from the source operand (second operand) and broadcast to all elements of the destina-tion operand (first operand).</p><p>VBROADCASTSD and VBROADCASTF128 are only supported as 256-bit wide versions. VBROADCASTSS is supported in both 128-bit and 256-bit wide versions.</p><p>Memory and register source operand syntax support of 256-bit instructions depend on the processor\u2019s enumeration of the following conditions with respect to CPUID.1:ECX.AVX[bit 28] and CPUID.(EAX=07H, ECX=0H):EBX.AVX2[bit 5]:</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b otherwise instructions will #UD. An attempt to execute VBROADCASTSD or VBROADCASTF128 encoded with VEX.L= 0 will cause an #UD exception. Attempts to execute any VBROADCAST* instruction with VEX.W = 1 will cause #UD.</p><svg height=\"164.07\" viewbox=\"112.380000 805536.000010 379.199990 109.380000\" width=\"568.799985\">\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.500000pt\" textlength=\"14.6085\" x=\"383.28\" y=\"805562.7735\">m32</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.500000pt\" textlength=\"19.9395\" x=\"141.96\" y=\"805632.0135\">DEST</text>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.08\" x=\"347.28\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.08\" x=\"198.78\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"421.5\" y=\"805552.98\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"421.5\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"384.36\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" 
style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"273.0\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"310.14\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"235.86\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"161.64\" y=\"805623.9\"></rect>\n<rect height=\"13.5\" style=\"fill:rgba(0,0,0,0);stroke:rgb(0,0,0);stroke-width:1pt;\" width=\"37.14\" x=\"421.500005\" y=\"805552.98001\"></rect>\n<path d=\"M438.600000,805567.440000 L438.180000,805565.520000 L195.240000,805619.220000 L195.660000,805621.140000 \" style=\"stroke:black\"></path>\n<path d=\"M438.600000,805567.440000 L438.120000,805565.520000 L235.500000,805618.560000 L235.980000,805620.480000 \" style=\"stroke:black\"></path>\n<path d=\"M438.660000,805567.440000 L438.060000,805565.520000 L268.980000,805617.840000 L269.580000,805619.760000 \" style=\"stroke:black\"></path>\n<path d=\"M438.720000,805567.440000 L438.000000,805565.580000 L309.000000,805616.640000 L309.720000,805618.500000 \" style=\"stroke:black\"></path>\n<path d=\"M438.840000,805567.380000 L437.940000,805565.640000 L342.000000,805615.140000 L342.900000,805616.880000 \" style=\"stroke:black\"></path>\n<path d=\"M438.960000,805567.260000 L437.760000,805565.700000 L377.220000,805612.500000 L378.420000,805614.060000 \" style=\"stroke:black\"></path>\n<path d=\"M439.200000,805567.020000 L437.520000,805565.940000 L409.800000,805608.840000 L411.480000,805609.920000 \" style=\"stroke:black\"></path>\n<path d=\"M439.980000,805566.480000 L438.000000,805566.480000 L437.580000,805606.560000 L439.560000,805606.560000 \" style=\"stroke:black\"></path>\n<path d=\"M438.540000,805607.580000 L438.780000,805607.580000 L438.960000,805607.520000 L439.080000,805607.400000 
L439.260000,805607.280000 L439.380000,805607.160000 L439.500000,805606.980000 L439.560000,805606.800000 L439.560000,805606.380000 L439.500000,805606.200000 L439.380000,805606.020000 L439.140000,805605.780000 L438.960000,805605.660000 L438.780000,805605.600000 L438.360000,805605.600000 L438.000000,805605.720000 L437.880000,805605.840000 L437.640000,805606.200000 L437.580000,805606.380000 L437.580000,805606.800000 L437.640000,805606.980000 L437.700000,805607.160000 L437.820000,805607.280000 L438.180000,805607.520000 L438.360000,805607.580000 L438.540000,805607.580000 \" style=\"stroke:black\"></path>\n<path d=\"M438.540000,805606.620000 L442.860000,805606.620000 L443.520000,805606.680000 L443.340000,805607.280000 L438.900000,805622.220000 L438.420000,805623.900000 L437.940000,805622.220000 L433.800000,805607.220000 L433.620000,805606.560000 L434.280000,805606.560000 L434.760000,805606.920000 L438.900000,805621.920000 L437.940000,805622.220000 L437.940000,805621.920000 L442.380000,805606.980000 L443.340000,805607.280000 L442.860000,805607.640000 L438.540000,805607.640000 \" style=\"stroke:black\"></path>\n<path d=\"M434.280000,805606.560000 L438.540000,805606.620000 L438.540000,805607.640000 L434.280000,805607.580000 \" style=\"stroke:black\"></path>\n<path d=\"M410.640000,805609.380000 L414.240000,805611.660000 L414.780000,805612.080000 L414.300000,805612.500000 L402.540000,805622.760000 L401.220000,805623.900000 L401.760000,805622.220000 L406.320000,805607.340000 L406.500000,805606.680000 L407.040000,805607.040000 L407.280000,805607.640000 L402.720000,805622.520000 L401.760000,805622.220000 L401.880000,805621.980000 L413.640000,805611.720000 L414.300000,805612.500000 L413.700000,805612.560000 L410.100000,805610.280000 \" style=\"stroke:black\"></path>\n<path d=\"M407.040000,805607.040000 L410.640000,805609.380000 L410.100000,805610.280000 L406.500000,805607.940000 \" style=\"stroke:black\"></path>\n<path d=\"M438.540000,805607.100000 L442.860000,805607.100000 
L438.420000,805622.040000 L434.280000,805607.040000 \" style=\"stroke:black\"></path>\n<path d=\"M410.400000,805609.800000 L414.000000,805612.080000 L402.240000,805622.340000 L406.800000,805607.460000 \" style=\"stroke:black\"></path>\n<path d=\"M410.100000,805610.220000 L410.460000,805610.340000 L410.880000,805610.340000 L411.060000,805610.280000 L411.240000,805610.160000 L411.480000,805609.920000 L411.600000,805609.740000 L411.660000,805609.500000 L411.660000,805609.140000 L411.600000,805608.960000 L411.480000,805608.780000 L411.360000,805608.660000 L411.000000,805608.420000 L410.118000,805608.045000 L409.276000,805609.190000 L409.860000,805609.920000 L409.980000,805610.040000 L410.100000,805610.220000 \" style=\"stroke:black\"></path>\n<path d=\"M377.820000,805613.280000 L380.400000,805616.700000 L380.760000,805617.240000 L380.220000,805617.480000 L365.760000,805623.240000 L364.080000,805623.900000 L365.160000,805622.460000 L374.400000,805609.980000 L374.820000,805609.380000 L375.180000,805609.920000 L375.180000,805610.520000 L365.940000,805623.060000 L365.160000,805622.460000 L365.400000,805622.280000 L379.860000,805616.520000 L380.220000,805617.480000 L379.620000,805617.300000 L377.040000,805613.940000 \" style=\"stroke:black\"></path>\n<path d=\"M375.180000,805609.920000 L377.820000,805613.280000 L377.040000,805613.940000 L374.400000,805610.580000 \" style=\"stroke:black\"></path>\n<path d=\"M377.460000,805613.580000 L380.040000,805617.000000 L365.580000,805622.760000 L374.820000,805610.220000 \" style=\"stroke:black\"></path>\n<path d=\"M342.360000,805615.920000 L344.340000,805619.760000 L344.640000,805620.360000 L343.980000,805620.480000 L328.680000,805623.540000 L326.940000,805623.900000 L328.260000,805622.700000 L339.600000,805612.020000 L340.080000,805611.600000 L340.380000,805612.140000 L340.260000,805612.740000 L328.920000,805623.420000 L328.260000,805622.700000 L328.500000,805622.580000 L343.800000,805619.520000 L343.980000,805620.480000 
L343.440000,805620.240000 L341.460000,805616.400000 \" style=\"stroke:black\"></path>\n<path d=\"M340.380000,805612.140000 L342.360000,805615.920000 L341.460000,805616.400000 L339.480000,805612.620000 \" style=\"stroke:black\"></path>\n<path d=\"M377.040000,805613.940000 L377.160000,805614.060000 L377.340000,805614.180000 L378.122000,805614.574000 L379.211000,805613.842000 L378.720000,805612.860000 L378.600000,805612.680000 L378.480000,805612.560000 L378.300000,805612.440000 L377.940000,805612.320000 L377.580000,805612.320000 L377.400000,805612.380000 L377.220000,805612.500000 L376.926000,805612.840000 L376.815000,805612.896000 L376.860000,805613.340000 L376.860000,805613.580000 L376.920000,805613.760000 L377.040000,805613.940000 \" style=\"stroke:black\"></path>\n<path d=\"M341.940000,805616.160000 L343.920000,805620.000000 L328.620000,805623.060000 L339.960000,805612.380000 \" style=\"stroke:black\"></path>\n<path d=\"M309.300000,805617.540000 L310.860000,805621.500000 L311.100000,805622.100000 L310.440000,805622.220000 L294.960000,805623.780000 L293.220000,805623.900000 L294.600000,805622.880000 L306.960000,805613.340000 L307.500000,805612.920000 L307.740000,805613.520000 L307.620000,805614.120000 L295.260000,805623.660000 L294.600000,805622.880000 L294.900000,805622.760000 L310.380000,805621.200000 L310.440000,805622.220000 L309.960000,805621.860000 L308.400000,805617.900000 \" style=\"stroke:black\"></path>\n<path d=\"M307.740000,805613.520000 L309.300000,805617.540000 L308.400000,805617.900000 L306.840000,805613.880000 \" style=\"stroke:black\"></path>\n<path d=\"M308.880000,805617.720000 L310.440000,805621.680000 L294.960000,805623.240000 L307.320000,805613.700000 \" style=\"stroke:black\"></path>\n<path d=\"M269.280000,805618.800000 L270.540000,805622.880000 L270.720000,805623.480000 L270.000000,805623.540000 L254.520000,805623.900000 L252.720000,805623.900000 L254.220000,805622.940000 L267.240000,805614.420000 L267.780000,805614.060000 
L268.020000,805614.720000 L267.780000,805615.260000 L254.760000,805623.780000 L254.220000,805622.940000 L254.520000,805622.880000 L270.060000,805622.520000 L270.000000,805623.540000 L269.520000,805623.180000 L268.260000,805619.100000 \" style=\"stroke:black\"></path>\n<path d=\"M235.740000,805619.520000 L236.820000,805623.660000 L237.000000,805624.320000 L236.280000,805624.260000 L220.680000,805623.960000 L218.940000,805623.900000 L220.440000,805623.000000 L233.880000,805615.080000 L234.480000,805614.720000 L234.660000,805615.380000 L234.420000,805615.920000 L220.980000,805623.840000 L220.440000,805623.000000 L220.740000,805622.940000 L236.340000,805623.300000 L236.280000,805624.260000 L235.800000,805623.900000 L234.720000,805619.760000 \" style=\"stroke:black\"></path>\n<path d=\"M268.020000,805614.720000 L269.280000,805618.800000 L268.260000,805619.100000 L267.000000,805615.020000 \" style=\"stroke:black\"></path>\n<path d=\"M268.800000,805618.920000 L270.060000,805623.000000 L254.520000,805623.360000 L267.540000,805614.840000 \" style=\"stroke:black\"></path>\n<path d=\"M341.520000,805616.400000 L341.640000,805616.580000 L341.760000,805616.700000 L342.574000,805617.346000 L343.528000,805616.647000 L343.380000,805615.680000 L343.260000,805615.500000 L343.200000,805615.320000 L343.020000,805615.200000 L342.900000,805615.080000 L342.540000,805614.960000 L342.300000,805614.960000 L341.940000,805615.080000 L341.555000,805615.311000 L341.525000,805615.465000 L341.400000,805615.860000 L341.400000,805616.040000 L341.520000,805616.400000 \" style=\"stroke:black\"></path>\n<path d=\"M195.420000,805620.120000 L196.320000,805624.320000 L196.380000,805624.980000 L195.780000,805624.920000 L180.240000,805623.960000 L178.500000,805623.900000 L180.000000,805623.060000 L193.680000,805615.620000 L194.280000,805615.320000 L194.460000,805615.920000 L194.220000,805616.460000 L180.540000,805623.900000 L180.000000,805623.060000 L180.300000,805623.000000 L195.840000,805623.960000 
L195.780000,805624.920000 L195.300000,805624.560000 L194.400000,805620.360000 \" style=\"stroke:black\"></path>\n<path d=\"M234.660000,805615.380000 L235.740000,805619.520000 L234.720000,805619.760000 L233.640000,805615.620000 \" style=\"stroke:black\"></path>\n<path d=\"M235.260000,805619.640000 L236.340000,805623.780000 L220.740000,805623.420000 L234.180000,805615.500000 \" style=\"stroke:black\"></path>\n<path d=\"M194.460000,805615.920000 L195.420000,805620.120000 L194.400000,805620.360000 L193.440000,805616.160000 \" style=\"stroke:black\"></path>\n<path d=\"M194.940000,805620.240000 L195.840000,805624.440000 L180.300000,805623.480000 L193.980000,805616.040000 \" style=\"stroke:black\"></path>\n<path d=\"M308.400000,805617.900000 L308.520000,805618.080000 L308.640000,805618.200000 L309.280000,805618.928000 L310.559000,805618.317000 L310.320000,805617.360000 L310.200000,805617.000000 L310.080000,805616.820000 L309.720000,805616.580000 L309.540000,805616.520000 L309.180000,805616.520000 L309.000000,805616.580000 L308.493000,805616.944000 L308.576000,805616.741000 L308.400000,805617.300000 L308.340000,805617.480000 L308.340000,805617.720000 L308.400000,805617.900000 \" style=\"stroke:black\"></path>\n<path d=\"M268.320000,805619.040000 L268.500000,805619.400000 L269.147000,805620.128000 L270.328000,805619.676000 L270.300000,805618.680000 L270.240000,805618.500000 L270.180000,805618.260000 L269.562000,805617.416000 L268.290000,805617.795000 L268.320000,805618.680000 L268.320000,805619.040000 \" style=\"stroke:black\"></path>\n<path d=\"M234.780000,805619.760000 L234.840000,805619.940000 L234.960000,805620.120000 L235.592000,805620.823000 L236.796000,805620.466000 L236.760000,805619.460000 L236.700000,805619.280000 L236.640000,805619.040000 L236.520000,805618.920000 L236.400000,805618.740000 L236.280000,805618.620000 L235.920000,805618.500000 L235.680000,805618.500000 L235.320000,805618.620000 L235.140000,805618.740000 L234.900000,805618.980000 
L234.780000,805619.160000 L234.780000,805619.340000 L234.720000,805619.580000 L234.780000,805619.760000 \" style=\"stroke:black\"></path>\n<path d=\"M194.400000,805620.360000 L194.460000,805620.540000 L194.580000,805620.720000 L195.087000,805621.532000 L196.477000,805621.110000 L196.380000,805620.120000 L196.380000,805619.940000 L196.320000,805619.760000 L195.902000,805618.841000 L194.744000,805618.973000 L194.400000,805619.940000 L194.400000,805620.360000 \" style=\"stroke:black\"></path>\n<path d=\"M161.400000,805623.660000 L161.400000,805637.400000 L161.880000,805637.400000 L161.880000,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M198.540000,805623.660000 L198.540000,805637.400000 L199.020010,805637.400000 L199.020010,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M235.620000,805623.660000 L235.620000,805637.400000 L236.100000,805637.400000 L236.100000,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M272.760000,805623.660000 L272.760000,805637.400000 L273.239980,805637.400000 L273.239980,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M309.900000,805623.660000 L309.900000,805637.400000 L310.380010,805637.400000 L310.380010,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M347.040000,805623.660000 L347.040000,805637.400000 L347.519980,805637.400000 L347.519980,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M384.120000,805623.660000 L384.120000,805637.400000 L384.600010,805637.400000 L384.600010,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M421.260000,805623.660000 L421.260000,805637.400000 L421.740010,805637.400000 L421.740010,805623.660000 \" style=\"stroke:black\"></path>\n<path d=\"M161.640000,805623.660020 L161.640000,805624.140000 L199.020000,805624.140000 L199.020000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M198.780000,805623.660020 L198.780000,805624.140000 L236.100000,805624.140000 L236.100000,805623.660020 \" style=\"stroke:black\"></path>\n<path 
d=\"M235.860000,805623.660020 L235.860000,805624.140000 L273.240000,805624.140000 L273.240000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M273.000000,805623.660020 L273.000000,805624.140000 L310.380000,805624.140000 L310.380000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M310.140000,805623.660020 L310.140000,805624.140000 L347.520000,805624.140000 L347.520000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M347.280000,805623.660020 L347.280000,805624.140000 L384.600000,805624.140000 L384.600000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M384.360000,805623.660020 L384.360000,805624.140000 L421.740000,805624.140000 L421.740000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M421.500000,805623.660020 L421.500000,805624.140000 L458.880000,805624.140000 L458.880000,805623.660020 \" style=\"stroke:black\"></path>\n<path d=\"M198.540000,805623.900000 L198.540000,805637.640000 L199.020010,805637.640000 L199.020010,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M235.620000,805623.900000 L235.620000,805637.640000 L236.100000,805637.640000 L236.100000,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M272.760000,805623.900000 L272.760000,805637.640000 L273.239980,805637.640000 L273.239980,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M309.900000,805623.900000 L309.900000,805637.640000 L310.380010,805637.640000 L310.380010,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M347.040000,805623.900000 L347.040000,805637.640000 L347.519980,805637.640000 L347.519980,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M384.120000,805623.900000 L384.120000,805637.640000 L384.600010,805637.640000 L384.600010,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M421.260000,805623.900000 L421.260000,805637.640000 L421.740010,805637.640000 L421.740010,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M458.400000,805623.900000 L458.400000,805637.640000 
L458.880010,805637.640000 L458.880010,805623.900000 \" style=\"stroke:black\"></path>\n<path d=\"M161.400000,805637.159960 L161.400000,805637.640000 L198.780000,805637.640000 L198.780000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M198.540000,805637.159960 L198.540000,805637.640000 L235.860000,805637.640000 L235.860000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M235.620000,805637.159960 L235.620000,805637.640000 L273.000000,805637.640000 L273.000000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M272.760000,805637.159960 L272.760000,805637.640000 L310.140000,805637.640000 L310.140000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M309.900000,805637.159960 L309.900000,805637.640000 L347.280000,805637.640000 L347.280000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M347.040000,805637.159960 L347.040000,805637.640000 L384.360000,805637.640000 L384.360000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M384.120000,805637.159960 L384.120000,805637.640000 L421.500000,805637.640000 L421.500000,805637.159960 \" style=\"stroke:black\"></path>\n<path d=\"M421.260000,805637.159960 L421.260000,805637.640000 L458.640000,805637.640000 L458.640000,805637.159960 \" style=\"stroke:black\"></path>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.07002675\" x=\"360.78\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.07002675\" x=\"212.28\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:7.500000pt\" textlength=\"9.15\" x=\"432.6\" y=\"805561.1535\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.12972555\" x=\"435.0\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.1305547\" x=\"397.92\" y=\"805633.6335\">X0</text>\n<text 
lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.1305547\" x=\"286.5\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.07002675\" x=\"323.7\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.1305547\" x=\"249.42\" y=\"805633.6335\">X0</text>\n<text lengthadjust=\"spacingAndGlyphs\" style=\"font-size:8.291500pt\" textlength=\"10.07002675\" x=\"175.2\" y=\"805633.6335\">X0</text></svg>",
                 "tooltip": "Load floating point values from the source operand (second operand) and broadcast to all elements of the destina-tion operand (first operand)."
             };

         case "VCVTPH2PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VCVTPH2PS.html",
                 "html": "<p>Converts four/eight packed half precision (16-bits) floating-point values in the low-order 64/128 bits of an XMM/YMM register or 64/128-bit memory location to four/eight packed single-precision floating-point values and writes the converted values into the destination XMM/YMM register.</p><p>If case of a denormal operand, the correct normal result is returned. MXCSR.DAZ is ignored and is treated as if it 0. No denormal exception is reported on MXCSR.</p><p>128-bit version: The source operand is a XMM register or 64-bit memory location. The destination operand is a XMM register. The upper bits (VLMAX-1:128) of the corresponding destination YMM register are zeroed.</p><p>256-bit version: The source operand is a XMM register or 128-bit memory location. The destination operand is a YMM register.</p><p> The diagram below illustrates how data is converted from four packed half precision (in 64 bits) to four single precision (in 128 bits) FP values. Note: VEX.vvvv is reserved (must be 1111b).</p>",
                 "tooltip": "Converts four/eight packed half-precision (16-bit) floating-point values in the low-order 64/128 bits of an XMM/YMM register or 64/128-bit memory location to four/eight packed single-precision floating-point values and writes the converted values into the destination XMM/YMM register."
             };

         case "VCVTPS2PH":
             return {
                 "url": "http://www.felixcloutier.com/x86/VCVTPS2PH.html",
                 "html": "<p>Converts four or eight packed single-precision floating-point values in the first source operand to four or eight packed half-precision (16-bit) floating-point values. The rounding mode is specified using the immediate field (imm8).</p><p>Underflow results (i.e., tiny results) are converted to denormals. MXCSR.FTZ is ignored. If a source element is denormal relative to the input format with MXCSR.DAZ not set, DM masked, and at least one of PM or UM unmasked, a SIMD exception will be raised with DE, UE and PE set.</p><p>128-bit version: The source operand is an XMM register. The destination operand is an XMM register or 64-bit memory location. The upper-bits vector register zeroing behavior of VEX prefix encoding still applies if the destination operand is an XMM register, so the upper bits (255:64) of the corresponding YMM register are zeroed.</p><p>256-bit version: The source operand is a YMM register. The destination operand is an XMM register or 128-bit memory location. The upper-bits vector register zeroing behavior of VEX prefix encoding still applies if the destination operand is an XMM register, so the upper bits (255:128) of the corresponding YMM register are zeroed.</p><p>Note: VEX.vvvv is reserved (must be 1111b).</p>",
                 "tooltip": "Converts four or eight packed single-precision floating-point values in the first source operand to four or eight packed half-precision (16-bit) floating-point values. The rounding mode is specified using the immediate field (imm8)."
             };

         case "VERR":
         case "VERW":
             return {
                 "url": "http://www.felixcloutier.com/x86/VERW.html",
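                 "html": "<p>Verifies whether the code or data segment specified with the source operand is readable (VERR) or writable (VERW) from the current privilege level (CPL). The source operand is a 16-bit register or a memory location that contains the segment selector for the segment to be verified. If the segment is accessible and readable (VERR) or writable (VERW), the ZF flag is set; otherwise, the ZF flag is cleared. Code segments are never verified as writable. This check cannot be performed on system segments.</p><p>To set the ZF flag, the following conditions must be met:</p><ul><li>The segment selector is not NULL.</li><li>The selector must denote a descriptor within the bounds of the descriptor table (GDT or LDT).</li><li>The selector must denote the descriptor of a code or data segment (not that of a system segment or gate).</li><li>For the VERR instruction, the segment must be readable.</li><li>For the VERW instruction, the segment must be a writable data segment.</li><li>If the segment is not a conforming code segment, the segment's DPL must be greater than or equal to (have less or the same privilege as) both the CPL and the segment selector's RPL.</li></ul><p>The validation performed is the same as is performed when a segment selector is loaded into the DS, ES, FS, or GS register, and the indicated access (read or write) is performed. The segment selector's value cannot result in a protection exception, enabling the software to anticipate possible segment access problems.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode. The operand size is fixed at 16 bits.</p>",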
                 "tooltip": "Verifies whether the code or data segment specified with the source operand is readable (VERR) or writable (VERW) from the current privilege level (CPL). The source operand is a 16-bit register or a memory location that contains the segment selector for the segment to be verified. If the segment is accessible and readable (VERR) or writable (VERW), the ZF flag is set; otherwise, the ZF flag is cleared. Code segments are never verified as writable. This check cannot be performed on system segments."
             };

         case "VEXTRACTF128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VEXTRACTF128.html",
                 "html": "<p>Extracts 128 bits of packed floating-point values from the source operand (second operand) at a 128-bit offset from imm8[0] into the destination operand (first operand). The destination may be either an XMM register or a 128-bit memory location.</p><p>VEX.vvvv is reserved and must be 1111b; otherwise the instruction will #UD.</p><p>The high 7 bits of the immediate are ignored.</p><p>An attempt to execute VEXTRACTF128 encoded with VEX.L= 0 will cause an #UD exception.</p>",
                 "tooltip": "Extracts 128 bits of packed floating-point values from the source operand (second operand) at a 128-bit offset from imm8[0] into the destination operand (first operand). The destination may be either an XMM register or a 128-bit memory location."
             };

         case "VEXTRACTI128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VEXTRACTI128.html",
                 "html": "<p>Extracts 128 bits of packed integer values from the source operand (second operand) at a 128-bit offset from imm8[0] into the destination operand (first operand). The destination may be either an XMM register or a 128-bit memory location.</p><p>VEX.vvvv is reserved and must be 1111b; otherwise the instruction will #UD.</p><p>The high 7 bits of the immediate are ignored.</p><p>An attempt to execute VEXTRACTI128 encoded with VEX.L= 0 will cause an #UD exception.</p>",
                 "tooltip": "Extracts 128 bits of packed integer values from the source operand (second operand) at a 128-bit offset from imm8[0] into the destination operand (first operand). The destination may be either an XMM register or a 128-bit memory location."
             };

         case "VFMADD213PD":
         case "VFMADD231PD":
         case "VFMADD132PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132PD:VFMADD213PD:VFMADD231PD.html",
                 "html": "<p>Performs a set of SIMD multiply-add computations on packed double-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand, adds the infinite precision intermediate result to the two or four packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the two or four packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a set of SIMD multiply-add computations on packed double-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMADD132PS":
         case "VFMADD231PS":
         case "VFMADD213PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132PS:VFMADD213PS:VFMADD231PS.html",
                 "html": "<p>Performs a set of SIMD multiply-add computations on packed single-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand, adds the infinite precision intermediate result to the four or eight packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the four or eight packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a set of SIMD multiply-add computations on packed single-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMADD213SD":
         case "VFMADD231SD":
         case "VFMADD132SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132SD:VFMADD213SD:VFMADD231SD.html",
                 "html": "<p>Performs a SIMD multiply-add computation on the low packed double-precision floating-point values using three source operands and writes the multiply-add result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low packed double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand, adds the infinite precision intermediate result to the low packed double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD231SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low packed double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 64-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-add computation on the low packed double-precision floating-point values using three source operands and writes the multiply-add result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMADD132SS":
         case "VFMADD213SS":
         case "VFMADD231SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132SS:VFMADD213SS:VFMADD231SS.html",
                 "html": "<p>Performs a SIMD multiply-add computation on the low packed single-precision floating-point values using three source operands and writes the multiply-add result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand, adds the infinite precision intermediate result to the low packed single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD231SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low packed single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 32-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-add computation on the low packed single-precision floating-point values using three source operands and writes the multiply-add result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMADDSUB213PD":
         case "VFMADDSUB231PD":
         case "VFMADDSUB132PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADDSUB132PD:VFMADDSUB213PD:VFMADDSUB231PD.html",
                 "html": "<p>VFMADDSUB132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMADDSUB132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFMADDSUB231PS":
         case "VFMADDSUB132PS":
         case "VFMADDSUB213PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMADDSUB132PS:VFMADDSUB213PS:VFMADDSUB231PS.html",
                 "html": "<p>VFMADDSUB132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMADDSUB132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFMSUB231PD":
         case "VFMSUB213PD":
         case "VFMSUB132PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132PD:VFMSUB213PD:VFMSUB231PD.html",
                 "html": "<p>Performs a set of SIMD multiply-subtract computations on packed double-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMSUB132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the two or four packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the two or four packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a set of SIMD multiply-subtract computations on packed double-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMSUB132PS":
         case "VFMSUB231PS":
         case "VFMSUB213PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132PS:VFMSUB213PS:VFMSUB231PS.html",
                 "html": "<p>Performs a set of SIMD multiply-subtract computations on packed single-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMSUB132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the four or eight packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the four or eight packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a set of SIMD multiply-subtract computations on packed single-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMSUB213SD":
         case "VFMSUB231SD":
         case "VFMSUB132SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132SD:VFMSUB213SD:VFMSUB231SD.html",
                 "html": "<p>Performs a SIMD multiply-subtract computation on the low packed double-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMSUB132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB231SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is an XMM register and encoded in reg_field. The second source operand is an XMM register and encoded in VEX.vvvv. The third source operand is an XMM register or a 64-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-subtract computation on the low packed double-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMSUB132SS":
         case "VFMSUB213SS":
         case "VFMSUB231SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132SS:VFMSUB213SS:VFMSUB231SS.html",
                 "html": "<p>Performs a SIMD multiply-subtract computation on the low packed single-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMSUB132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB231SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 32-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-subtract computation on the low packed single-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location."
             };

         case "VFMSUBADD213PD":
         case "VFMSUBADD231PD":
         case "VFMSUBADD132PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUBADD132PD:VFMSUBADD213PD:VFMSUBADD231PD.html",
                 "html": "<p>VFMSUBADD132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMSUBADD132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFMSUBADD132PS":
         case "VFMSUBADD213PS":
         case "VFMSUBADD231PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFMSUBADD132PS:VFMSUBADD213PS:VFMSUBADD231PS.html",
                 "html": "<p>VFMSUBADD132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMSUBADD132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFNMADD132PD":
         case "VFNMADD231PD":
         case "VFNMADD213PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132PD:VFNMADD213PD:VFNMADD231PD.html",
                 "html": "<p>VFNMADD132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand, adds the negated infinite precision intermediate result to the two or four packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the two or four packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "VFNMADD132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFNMADD213PS":
         case "VFNMADD132PS":
         case "VFNMADD231PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132PS:VFNMADD213PS:VFNMADD231PS.html",
                 "html": "<p>VFNMADD132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand, adds the negated infinite precision intermediate result to the four or eight packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four or eight packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "VFNMADD132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFNMADD231SD":
         case "VFNMADD213SD":
         case "VFNMADD132SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132SD:VFNMADD213SD:VFNMADD231SD.html",
                 "html": "<p>VFNMADD132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD231SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 64-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p><p>Compiler tools may optionally support a complementary mnemonic for each instruction mnemonic listed in the opcode/instruction column of the summary table. The behavior of the complementary mnemonic in situations involving NaNs is governed by the definition of the instruction mnemonic defined in the opcode/instruction column. See also Section 14.5.1, \u201cFMA Instruction Operand Order and Arithmetic Behavior\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p>",
                 "tooltip": "VFNMADD132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand)."
             };

         case "VFNMADD132SS":
         case "VFNMADD231SS":
         case "VFNMADD213SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132SS:VFNMADD213SS:VFNMADD231SS.html",
                 "html": "<p>VFNMADD132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD231SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 32-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p><p>Compiler tools may optionally support a complementary mnemonic for each instruction mnemonic listed in the opcode/instruction column of the summary table. The behavior of the complementary mnemonic in situations involving NaNs is governed by the definition of the instruction mnemonic defined in the opcode/instruction column. See also Section 14.5.1, \u201cFMA Instruction Operand Order and Arithmetic Behavior\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p>",
                 "tooltip": "VFNMADD132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand)."
             };

         case "VFNMSUB213PD":
         case "VFNMSUB132PD":
         case "VFNMSUB231PD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMSUB132PD:VFNMSUB213PD:VFNMSUB231PD.html",
                 "html": "<p>VFNMSUB132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB213PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From negated infinite precision intermediate results, subtracts the two or four packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB231PD: Multiplies the two or four packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two or four packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "VFNMSUB132PD: Multiplies the two or four packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two or four packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFNMSUB231PS":
         case "VFNMSUB213PS":
         case "VFNMSUB132PS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMSUB132PS:VFNMSUB213PS:VFNMSUB231PS.html",
                 "html": "<p>VFNMSUB132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB213PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the first source operand. From negated infinite precision intermediate results, subtracts the four or eight packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB231PS: Multiplies the four or eight packed single-precision floating-point values from the second source operand to the four or eight packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four or eight packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 128-bit memory location and encoded in rm_field. The upper 128 bits of the YMM destination register are zeroed.</p>",
                 "tooltip": "VFNMSUB132PS: Multiplies the four or eight packed single-precision floating-point values from the first source operand to the four or eight packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four or eight packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four or eight packed single-precision floating-point values to the destination operand (first source operand)."
             };

         case "VFNMSUB231SD":
         case "VFNMSUB213SD":
         case "VFNMSUB132SD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMSUB132SD:VFNMSUB213SD:VFNMSUB231SD.html",
                 "html": "<p>VFNMSUB132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand. From negated infinite precision intermediate result, subtracts the low double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMSUB213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand. From negated infinite precision intermediate result, subtracts the low double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMSUB231SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the third source operand. From negated infinite precision intermediate result, subtracts the low double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 64-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p><p>Compiler tools may optionally support a complementary mnemonic for each instruction mnemonic listed in the opcode/instruction column of the summary table. The behavior of the complementary mnemonic in situations involving NaNs is governed by the definition of the instruction mnemonic defined in the opcode/instruction column. See also Section 14.5.1, \u201cFMA Instruction Operand Order and Arithmetic Behavior\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p>",
                 "tooltip": "VFNMSUB132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand. From negated infinite precision intermediate result, subtracts the low double-precision floating-point value in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand)."
             };

         case "VFNMSUB132SS":
         case "VFNMSUB231SS":
         case "VFNMSUB213SS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VFNMSUB132SS:VFNMSUB213SS:VFNMSUB231SS.html",
                 "html": "<p>VFNMSUB132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand. From negated infinite precision intermediate result, the low single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMSUB213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand. From negated infinite precision intermediate result, the low single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMSUB231SS: Multiplies the low packed single-precision floating-point value from the second source to the low packed single-precision floating-point value in the third source operand. From negated infinite precision interme-diate result, the low single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 encoded version: The destination operand (also first source operand) is a XMM register and encoded in reg_field. The second source operand is a XMM register and encoded in VEX.vvvv. The third source operand is a XMM register or a 32-bit memory location and encoded in rm_field. The upper bits ([VLMAX-1:128]) of the YMM destination register are zeroed.</p><p>Compiler tools may optionally support a complementary mnemonic for each instruction mnemonic listed in the opcode/instruction column of the summary table. 
The behavior of the complementary mnemonic in situations involving NaNs is governed by the definition of the instruction mnemonic defined in the opcode/instruction column. See also Section 14.5.1, \u201cFMA Instruction Operand Order and Arithmetic Behavior\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p>",
                 "tooltip": "VFNMSUB132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand. From negated infinite precision intermediate result, subtracts the low single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand)."
             };

         case "VGATHERDPD":
         case "VGATHERQPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VGATHERDPD:VGATHERQPD.html",
                 "html": "<p>The instruction conditionally loads up to 2 or 4 double-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using qword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.</p><p>The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element\u2019s mask bit is not set, the corresponding element of the destination register is left unchanged. The widths of the data elements in the destination register and mask register are identical. The entire mask register will be set to zero by this instruction unless the instruction causes an exception.</p><p>Using dword indices in the lower half of the mask register, the instruction conditionally loads up to 2 or 4 double-precision floating-point values from the VSIB addressing memory operand, and updates the destination register.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "The instruction conditionally loads up to 2 or 4 double-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using qword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor."
             };

         case "VGATHERQPS":
         case "VGATHERDPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VGATHERDPS:VGATHERQPS.html",
                 "html": "<p>The instruction conditionally loads up to 4 or 8 single-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.</p><p>The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element\u2019s mask bit is not set, the corresponding element of the destination register is left unchanged. The widths of the data elements in the destination register and mask register are identical. The entire mask register will be set to zero by this instruction unless the instruction causes an exception.</p><p>Using qword indices, the instruction conditionally loads up to 2 or 4 single-precision floating-point values from the VSIB addressing memory operand, and updates the lower half of the destination register. The upper 128 or 256 bits of the destination register are zeroed with qword indices.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "The instruction conditionally loads up to 4 or 8 single-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor."
             };

         case "VINSERTF128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VINSERTF128.html",
                 "html": "<p>Performs an insertion of 128 bits of packed floating-point values from the second source operand (third operand) into the destination operand (first operand) at a 128-bit offset from imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand (second operand). The second source operand can be either an XMM register or a 128-bit memory location.</p><p>The high 7 bits of the immediate are ignored.</p>",
                 "tooltip": "Performs an insertion of 128 bits of packed floating-point values from the second source operand (third operand) into the destination operand (first operand) at a 128-bit offset from imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand (second operand). The second source operand can be either an XMM register or a 128-bit memory location."
             };

         case "VINSERTI128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VINSERTI128.html",
                 "html": "<p>Performs an insertion of 128 bits of packed integer data from the second source operand (third operand) into the destination operand (first operand) at a 128-bit offset from imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand (second operand). The second source operand can be either an XMM register or a 128-bit memory location.</p><p>The high 7 bits of the immediate are ignored.</p><p>VEX.L must be 1; an attempt to execute this instruction with VEX.L=0 will cause #UD.</p>",
                 "tooltip": "Performs an insertion of 128 bits of packed integer data from the second source operand (third operand) into the destination operand (first operand) at a 128-bit offset from imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand (second operand). The second source operand can be either an XMM register or a 128-bit memory location."
             };

         case "VMASKMOVPS":
         case "VMASKMOVPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VMASKMOV.html",
                 "html": "<p>Conditionally moves packed data elements from the second source operand into the corresponding data element of the destination operand, depending on the mask bits associated with each data element. The mask bits are specified in the first source operand.</p><p>The mask bit for each data element is the most significant bit of that element in the first source operand. If a mask bit is 1, the corresponding data element is copied from the second source operand to the destination operand. If the mask bit is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form.</p><p>The second source operand is a memory address for the load form of these instructions. The destination operand is a memory address for the store form of these instructions. The other operands are both XMM registers (for VEX.128 version) or YMM registers (for VEX.256 version).</p><p>Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no faults will be detected if the mask bits are all zero.</p><p>Unlike previous MASKMOV instructions (MASKMOVQ and MASKMOVDQU), a nontemporal hint is not applied to these instructions.</p>",
                 "tooltip": "Conditionally moves packed data elements from the second source operand into the corresponding data element of the destination operand, depending on the mask bits associated with each data element. The mask bits are specified in the first source operand."
             };

         case "VPBLENDD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPBLENDD.html",
                 "html": "<p>Dword elements from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding dword in the destination is copied from the source. If a bit in the mask, corresponding to a dword, is \u201c1\u201d, then the dword is copied, else the dword is unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Dword elements from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding dword in the destination is copied from the source. If a bit in the mask, corresponding to a dword, is \u201c1\u201d, then the dword is copied, else the dword is unchanged."
             };

         case "VBROADCASTI128":
         case "VPBROADCASTD":
         case "VPBROADCASTW":
         case "VPBROADCASTQ":
         case "VPBROADCASTB":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPBROADCAST.html",
                 "html": "<p>Load integer data from the source operand (second operand) and broadcast to all elements of the destination operand (first operand).</p><p>The destination operand is a YMM register. The source operand is an 8-bit, 16-bit, 32-bit, or 64-bit memory location or the low 8-bit, 16-bit, 32-bit, or 64-bit data in an XMM register. VPBROADCASTB/D/W/Q also support XMM register as the source operand.</p><p>VBROADCASTI128: The destination operand is a YMM register. The source operand is a 128-bit memory location. Register source encodings for VBROADCASTI128 are reserved and will #UD.</p><p>VPBROADCASTB/W/D/Q is supported in both 128-bit and 256-bit wide versions.</p><p>VBROADCASTI128 is only supported as a 256-bit wide version.</p>",
                 "tooltip": "Load integer data from the source operand (second operand) and broadcast to all elements of the destination operand (first operand)."
             };

         case "VPERM2F128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERM2F128.html",
                 "html": "<p>Permute 128-bit floating-point-containing fields from the first source operand (second operand) and second source operand (third operand) using bits in the 8-bit immediate and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register.</p>",
                 "tooltip": "Permute 128-bit floating-point-containing fields from the first source operand (second operand) and second source operand (third operand) using bits in the 8-bit immediate and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register."
             };

         case "VPERM2I128":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERM2I128.html",
                 "html": "<p>Permute 128-bit integer data from the first source operand (second operand) and second source operand (third operand) using bits in the 8-bit immediate and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register.</p>",
                 "tooltip": "Permute 128-bit integer data from the first source operand (second operand) and second source operand (third operand) using bits in the 8-bit immediate and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register."
             };

         case "VPERMD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMD.html",
                 "html": "<p>Use the index values in each dword element of the first source operand (the second operand) to select a dword element in the second source operand (the third operand); the resultant dword value from the second source operand is copied to the destination operand (the first operand) in the corresponding position of the index element. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand.</p><p>An attempt to execute VPERMD encoded with VEX.L=0 will cause a #UD exception.</p>",
                 "tooltip": "Use the index values in each dword element of the first source operand (the second operand) to select a dword element in the second source operand (the third operand); the resultant dword value from the second source operand is copied to the destination operand (the first operand) in the corresponding position of the index element. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand."
             };

         case "VPERMILPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMILPD.html",
                 "html": "<p>Permute double-precision floating-point values in the first source operand (second operand) using 8-bit control fields in the low bytes of the second source operand (third operand) and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register.</p>",
                 "tooltip": "Permute double-precision floating-point values in the first source operand (second operand) using 8-bit control fields in the low bytes of the second source operand (third operand) and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register."
             };

         case "VPERMILPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMILPS.html",
                 "html": "<p>(variable control version)</p><p>Permute single-precision floating-point values in the first source operand (second operand) using 8-bit control fields in the low bytes of corresponding elements of the shuffle control (third operand) and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register.</p>",
                 "tooltip": "(variable control version) Permute single-precision floating-point values in the first source operand (second operand) using 8-bit control fields in the low bytes of corresponding elements of the shuffle control (third operand) and store results in the destination operand (first operand). The first source operand is a YMM register, the second source operand is a YMM register or a 256-bit memory location, and the destination operand is a YMM register."
             };

         case "VPERMPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMPD.html",
                 "html": "<p>Use two-bit index values in the immediate byte to select a double-precision floating-point element in the source operand; the resultant data from the source operand is copied to the corresponding element of the destination operand in the order of the index field. Note that this instruction permits a qword in the source operand to be copied to multiple locations in the destination operand.</p><p>An attempt to execute VPERMPD encoded with VEX.L=0 will cause a #UD exception.</p>",
                 "tooltip": "Use two-bit index values in the immediate byte to select a double-precision floating-point element in the source operand; the resultant data from the source operand is copied to the corresponding element of the destination operand in the order of the index field. Note that this instruction permits a qword in the source operand to be copied to multiple locations in the destination operand."
             };

         case "VPERMPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMPS.html",
                 "html": "<p>Use the index values in each dword element of the first source operand (the second operand) to select a single-precision floating-point element in the second source operand (the third operand); the resultant data from the second source operand is copied to the destination operand (the first operand) in the corresponding position of the index element. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand.</p><p>An attempt to execute VPERMPS encoded with VEX.L=0 will cause a #UD exception.</p>",
                 "tooltip": "Use the index values in each dword element of the first source operand (the second operand) to select a single-precision floating-point element in the second source operand (the third operand); the resultant data from the second source operand is copied to the destination operand (the first operand) in the corresponding position of the index element. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand."
             };

         case "VPERMQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPERMQ.html",
                 "html": "<p>Use two-bit index values in the immediate byte to select a qword element in the source operand; the resultant qword value from the source operand is copied to the corresponding element of the destination operand in the order of the index field. Note that this instruction permits a qword in the source operand to be copied to multiple locations in the destination operand.</p><p>An attempt to execute VPERMQ encoded with VEX.L=0 will cause a #UD exception.</p>",
                 "tooltip": "Use two-bit index values in the immediate byte to select a qword element in the source operand; the resultant qword value from the source operand is copied to the corresponding element of the destination operand in the order of the index field. Note that this instruction permits a qword in the source operand to be copied to multiple locations in the destination operand."
             };

         case "VPGATHERQD":
         case "VPGATHERDD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPGATHERDD:VPGATHERQD.html",
                 "html": "<p>The instruction conditionally loads up to 4 or 8 dword values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.</p><p>The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element\u2019s mask bit is not set, the corresponding element of the destination register is left unchanged. The widths of the data elements in the destination register and mask register are identical. The entire mask register will be set to zero by this instruction unless the instruction causes an exception.</p><p>Using qword indices, the instruction conditionally loads up to 2 or 4 dword values from the VSIB addressing memory operand, and updates the lower half of the destination register. The upper 128 or 256 bits of the destination register are zeroed with qword indices.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "The instruction conditionally loads up to 4 or 8 dword values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor."
             };

         case "VPGATHERDQ":
         case "VPGATHERQQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPGATHERDQ:VPGATHERQQ.html",
                 "html": "<p>The instruction conditionally loads up to 2 or 4 qword values from memory addresses specified by the memory operand (the second operand) and using qword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.</p><p>The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element\u2019s mask bit is not set, the corresponding element of the destination register is left unchanged. The widths of the data elements in the destination register and mask register are identical. The entire mask register will be set to zero by this instruction unless the instruction causes an exception.</p><p>Using dword indices in the lower half of the mask register, the instruction conditionally loads up to 2 or 4 qword values from the VSIB addressing memory operand, and updates the destination register.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "The instruction conditionally loads up to 2 or 4 qword values from memory addresses specified by the memory operand (the second operand) and using qword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor."
             };

         case "VPMASKMOVD":
         case "VPMASKMOVQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPMASKMOV.html",
                 "html": "<p>Conditionally moves packed data elements from the second source operand into the corresponding data element of the destination operand, depending on the mask bits associated with each data element. The mask bits are specified in the first source operand.</p><p>The mask bit for each data element is the most significant bit of that element in the first source operand. If a mask bit is 1, the corresponding data element is copied from the second source operand to the destination operand. If the mask bit is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form.</p><p>The second source operand is a memory address for the load form of these instructions. The destination operand is a memory address for the store form of these instructions. The other operands are either XMM registers (for VEX.128 version) or YMM registers (for VEX.256 version).</p><p>Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no faults will be detected if the mask bits are all zero.</p><p>Unlike previous MASKMOV instructions (MASKMOVQ and MASKMOVDQU), a nontemporal hint is not applied to these instructions.</p>",
                 "tooltip": "Conditionally moves packed data elements from the second source operand into the corresponding data element of the destination operand, depending on the mask bits associated with each data element. The mask bits are specified in the first source operand."
             };

         case "VPSLLVD":
         case "VPSLLVQ":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPSLLVD:VPSLLVQ.html",
                 "html": "<p>Shifts the bits in the individual data elements (doublewords or quadwords) in the first source operand to the left by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0).</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 31 (for doublewords), or 63 (for quadwords), then the destination data element is written with 0.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either a YMM register or a 256-bit memory location.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (doublewords or quadwords) in the first source operand to the left by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0)."
             };

         case "VPSRAVD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPSRAVD.html",
                 "html": "<p>Shifts the bits in the individual doubleword data elements in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in each data element are shifted right, the empty high-order bits are filled with the sign bit of the source element.</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 31, then the destination data element is filled with the corresponding sign bit of the source element.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either a YMM register or a 256-bit memory location.</p>",
                 "tooltip": "Shifts the bits in the individual doubleword data elements in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in each data element are shifted right, the empty high-order bits are filled with the sign bit of the source element."
             };

         case "VPSRLVQ":
         case "VPSRLVD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VPSRLVD:VPSRLVQ.html",
                 "html": "<p>Shifts the bits in the individual data elements (doublewords or quadwords) in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0).</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 31 (for doublewords), or 63 (for a quadword), then the destination data element is written with 0.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (VLMAX-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either a YMM register or a 256-bit memory location.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (doublewords or quadwords) in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0)."
             };

         case "VTESTPS":
         case "VTESTPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/VTESTPD:VTESTPS.html",
                 "html": "<p>VTESTPS performs a bitwise comparison of all the sign bits of the packed single-precision elements in the first source operand and corresponding sign bits in the second source operand. If the AND of the source sign bits with the dest sign bits produces all zeros, the ZF is set, else the ZF is cleared. If the AND of the source sign bits with the inverted dest sign bits produces all zeros, the CF is set, else the CF is cleared. An attempt to execute VTESTPS with VEX.W=1 will cause #UD.</p><p>VTESTPD performs a bitwise comparison of all the sign bits of the double-precision elements in the first source operand and corresponding sign bits in the second source operand. If the AND of the source sign bits with the dest sign bits produces all zeros, the ZF is set, else the ZF is cleared. If the AND of the source sign bits with the inverted dest sign bits produces all zeros, the CF is set, else the CF is cleared. An attempt to execute VTESTPD with VEX.W=1 will cause #UD.</p><p>The first source register is specified by the ModR/M <em>reg</em> field.</p><p>128-bit version: The first source register is an XMM register. The second source register can be an XMM register or a 128-bit memory location. The destination register is not modified.</p><p>VEX.256 encoded version: The first source register is a YMM register. The second source register can be a YMM register or a 256-bit memory location. The destination register is not modified.</p>",
                 "tooltip": "VTESTPS performs a bitwise comparison of all the sign bits of the packed single-precision elements in the first source operand and corresponding sign bits in the second source operand. If the AND of the source sign bits with the dest sign bits produces all zeros, the ZF is set, else the ZF is cleared. If the AND of the source sign bits with the inverted dest sign bits produces all zeros, the CF is set, else the CF is cleared. An attempt to execute VTESTPS with VEX.W=1 will cause #UD."
             };

         case "VZEROALL":
             return {
                 "url": "http://www.felixcloutier.com/x86/VZEROALL.html",
                 "html": "<p>The instruction zeros contents of all XMM or YMM registers.</p><p>Note: VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD. In Compatibility and legacy 32-bit mode only the lower 8 registers are modified.</p>",
                 "tooltip": "The instruction zeros contents of all XMM or YMM registers."
             };

         case "VZEROUPPER":
             return {
                 "url": "http://www.felixcloutier.com/x86/VZEROUPPER.html",
                 "html": "<p>The instruction zeros the bits in position 128 and higher of all YMM registers. The lower 128 bits of the registers (the corresponding XMM registers) are unmodified.</p><p>This instruction is recommended when transitioning between AVX and legacy SSE code - it will eliminate performance penalties caused by false dependencies.</p><p>Note: VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD. In Compatibility and legacy 32-bit mode only the lower 8 registers are modified.</p>",
                 "tooltip": "The instruction zeros the bits in position 128 and higher of all YMM registers. The lower 128 bits of the registers (the corresponding XMM registers) are unmodified."
             };

         case "WBINVD":
             return {
                 "url": "http://www.felixcloutier.com/x86/WBINVD.html",
                 "html": "<p>Writes back all modified cache lines in the processor\u2019s internal cache to main memory and invalidates (flushes) the internal caches. The instruction then issues a special-function bus cycle that directs external caches to also write back modified data and another bus cycle to indicate that the external caches should be invalidated.</p><p>After executing this instruction, the processor does not wait for the external caches to complete their write-back and flushing operations before proceeding with instruction execution. It is the responsibility of hardware to respond to the cache write-back and flush signals. The amount of time or cycles for WBINVD to complete will vary due to size and other factors of different cache hierarchies. As a consequence, the use of the WBINVD instruction can have an impact on logical processor interrupt/event response time. Additional information of WBINVD behavior in a cache hierarchy with hierarchical sharing topology can be found in Chapter 2 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>The WBINVD instruction is a privileged instruction. When the processor is running in protected mode, the CPL of a program or procedure must be 0 to execute this instruction. This instruction is also a serializing instruction (see \u201cSerializing Instructions\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>).</p><p>In situations where cache coherency with main memory is not a concern, software can use the INVD instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Writes back all modified cache lines in the processor\u2019s internal cache to main memory and invalidates (flushes) the internal caches. The instruction then issues a special-function bus cycle that directs external caches to also write back modified data and another bus cycle to indicate that the external caches should be invalidated."
             };

         case "WRFSBASE":
         case "WRGSBASE":
             return {
                 "url": "http://www.felixcloutier.com/x86/WRFSBASE:WRGSBASE.html",
                 "html": "<p>Loads the FS or GS segment base address with the general-purpose register indicated by the modR/M:r/m field.</p><p>The source operand may be either a 32-bit or a 64-bit general-purpose register. The REX.W prefix indicates the operand size is 64 bits. If no REX.W prefix is used, the operand size is 32 bits; the upper 32 bits of the source register are ignored and upper 32 bits of the base address (for FS or GS) are cleared.</p><p>This instruction is supported only in 64-bit mode.</p>",
                 "tooltip": "Loads the FS or GS segment base address with the general-purpose register indicated by the modR/M:r/m field."
             };

         case "WRMSR":
             return {
                 "url": "http://www.felixcloutier.com/x86/WRMSR.html",
                 "html": "<p>Writes the contents of registers EDX:EAX into the 64-bit model specific register (MSR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The contents of the EDX register are copied to high-order 32 bits of the selected MSR and the contents of the EAX register are copied to low-order 32 bits of the MSR. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are ignored.) Undefined or reserved bits in an MSR should be set to values previously read.</p><p>This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) is generated. Specifying a reserved or unimplemented MSR address in ECX will also cause a general protection exception. The processor will also generate a general protection exception if software attempts to write to bits in a reserved MSR.</p><p>When the WRMSR instruction is used to write to an MTRR, the TLBs are invalidated. This includes global entries (see \u201cTranslation Lookaside Buffers (TLBs)\u201d in Chapter 3 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>).</p><p>MSRs control functions for testability, execution tracing, performance-monitoring and machine check errors. Chapter 35, \u201cModel-Specific Registers (MSRs)\u201d, in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3C</em>, lists all MSRs that can be written with this instruction and their addresses. Note that each processor family has its own set of MSRs.</p><p>The WRMSR instruction is a serializing instruction (see \u201cSerializing Instructions\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>). Note that WRMSR to the IA32_TSC_DEADLINE MSR (MSR index 6E0H) and the X2APIC MSRs (MSR indices 802H to 83FH) are not serializing.</p>",
                 "tooltip": "Writes the contents of registers EDX:EAX into the 64-bit model specific register (MSR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The contents of the EDX register are copied to high-order 32 bits of the selected MSR and the contents of the EAX register are copied to low-order 32 bits of the MSR. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are ignored.) Undefined or reserved bits in an MSR should be set to values previously read."
             };

         case "XABORT":
             return {
                 "url": "http://www.felixcloutier.com/x86/XABORT.html",
                 "html": "<p>XABORT forces an RTM abort. Following an RTM abort, the logical processor resumes execution at the fallback address computed through the outermost XBEGIN instruction. The EAX register is updated to reflect an XABORT instruction caused the abort, and the imm8 argument will be provided in bits 31:24 of EAX.</p>",
                 "tooltip": "XABORT forces an RTM abort. Following an RTM abort, the logical processor resumes execution at the fallback address computed through the outermost XBEGIN instruction. The EAX register is updated to reflect an XABORT instruction caused the abort, and the imm8 argument will be provided in bits 31:24 of EAX."
             };

         case "XACQUIRE":
         case "XRELEASE":
             return {
                 "url": "http://www.felixcloutier.com/x86/XACQUIRE:XRELEASE.html",
                 "html": "<p>The XACQUIRE prefix is a hint to start lock elision on the memory address specified by the instruction and the XRELEASE prefix is a hint to end lock elision on the memory address specified by the instruction.</p><p>The XACQUIRE prefix hint can only be used with the following instructions (these instructions are also referred to as XACQUIRE-enabled when used with the XACQUIRE prefix):</p><p>The XRELEASE prefix hint can only be used with the following instructions (also referred to as XRELEASE-enabled when used with the XRELEASE prefix):</p><p>The lock variables must satisfy the guidelines described in <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, Section 15.3.3, for elision to be successful, otherwise an HLE abort may be signaled.</p><p>If an encoded byte sequence that meets XACQUIRE/XRELEASE requirements includes both prefixes, then the HLE semantic is determined by the prefix byte that is placed closest to the instruction opcode. For example, an F3F2C6 will not be treated as a XRELEASE-enabled instruction since the F2H (XACQUIRE) is closest to the instruction opcode C6. Similarly, an F2F3F0 prefixed instruction will be treated as a XRELEASE-enabled instruction since F3H (XRELEASE) is closest to the instruction opcode.</p>",
                 "tooltip": "The XACQUIRE prefix is a hint to start lock elision on the memory address specified by the instruction and the XRELEASE prefix is a hint to end lock elision on the memory address specified by the instruction."
             };

         case "XADD":
             return {
                 "url": "http://www.felixcloutier.com/x86/XADD.html",
                 "html": "<p>Exchanges the first operand (destination operand) with the second operand (source operand), then loads the sum of the two values into the destination operand. The destination operand can be a register or a memory location; the source operand is a register.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>",
                 "tooltip": "Exchanges the first operand (destination operand) with the second operand (source operand), then loads the sum of the two values into the destination operand. The destination operand can be a register or a memory location; the source operand is a register."
             };

         case "XBEGIN":
             return {
                 "url": "http://www.felixcloutier.com/x86/XBEGIN.html",
                 "html": "<p>The XBEGIN instruction specifies the start of an RTM code region. If the logical processor was not already in transactional execution, then the XBEGIN instruction causes the logical processor to transition into transactional execution. The XBEGIN instruction that transitions the logical processor into transactional execution is referred to as the outermost XBEGIN instruction. The instruction also specifies a relative offset to compute the address of the fallback code path following a transactional abort.</p><p>On an RTM abort, the logical processor discards all architectural register and memory updates performed during the RTM execution and restores architectural state to that corresponding to the outermost XBEGIN instruction. The fallback address following an abort is computed from the outermost XBEGIN instruction.</p>",
                 "tooltip": "The XBEGIN instruction specifies the start of an RTM code region. If the logical processor was not already in transactional execution, then the XBEGIN instruction causes the logical processor to transition into transactional execution. The XBEGIN instruction that transitions the logical processor into transactional execution is referred to as the outermost XBEGIN instruction. The instruction also specifies a relative offset to compute the address of the fallback code path following a transactional abort."
             };

         case "XCHG":
             return {
                 "url": "http://www.felixcloutier.com/x86/XCHG.html",
                 "html": "<p>Exchanges the contents of the destination (first) and source (second) operands. The operands can be two general-purpose registers or a register and a memory location. If a memory operand is referenced, the processor\u2019s locking protocol is automatically implemented for the duration of the exchange operation, regardless of the presence or absence of the LOCK prefix or of the value of the IOPL. (See the LOCK prefix description in this chapter for more information on the locking protocol.)</p><p>This instruction is useful for implementing semaphores or similar data structures for process synchronization. (See \u201cBus Locking\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>, for more information on bus locking.)</p><p>The XCHG instruction can also be used instead of the BSWAP instruction for 16-bit operands.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Exchanges the contents of the destination (first) and source (second) operands. The operands can be two general-purpose registers or a register and a memory location. If a memory operand is referenced, the processor\u2019s locking protocol is automatically implemented for the duration of the exchange operation, regardless of the presence or absence of the LOCK prefix or of the value of the IOPL. (See the LOCK prefix description in this chapter for more information on the locking protocol.)"
             };

         case "XEND":
             return {
                 "url": "http://www.felixcloutier.com/x86/XEND.html",
                 "html": "<p>The instruction marks the end of an RTM code region. If this corresponds to the outermost scope (that is, including this XEND instruction, the number of XBEGIN instructions is the same as the number of XEND instructions), the logical processor will attempt to commit the logical processor state atomically. If the commit fails, the logical processor will roll back all architectural register and memory updates performed during the RTM execution. The logical processor will resume execution at the fallback address computed from the outermost XBEGIN instruction. The EAX register is updated to reflect RTM abort information.</p><p>XEND executed outside a transactional region will cause a #GP (General Protection Fault).</p>",
                 "tooltip": "The instruction marks the end of an RTM code region. If this corresponds to the outermost scope (that is, including this XEND instruction, the number of XBEGIN instructions is the same as the number of XEND instructions), the logical processor will attempt to commit the logical processor state atomically. If the commit fails, the logical processor will roll back all architectural register and memory updates performed during the RTM execution. The logical processor will resume execution at the fallback address computed from the outermost XBEGIN instruction. The EAX register is updated to reflect RTM abort information."
             };

         case "XGETBV":
             return {
                 "url": "http://www.felixcloutier.com/x86/XGETBV.html",
                 "html": "<p>Reads the contents of the extended control register (XCR) specified in the ECX register into registers EDX:EAX. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The EDX register is loaded with the high-order 32 bits of the XCR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.) If fewer than 64 bits are implemented in the XCR being read, the values returned to EDX:EAX in unimplemented bit locations are undefined.</p><p>XCR0 is supported on any processor that supports the XGETBV instruction. If CPUID.(EAX=0DH,ECX=1):EAX.XG1[bit 2] = 1, executing XGETBV with ECX = 1 returns in EDX:EAX the logical-AND of XCR0 and the current value of the XINUSE state-component bitmap. This allows software to discover the state of the init optimization used by XSAVEOPT and XSAVES. See Chapter 13, \u201cManaging State Using the XSAVE Feature Set,\u201d in <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Use of any other value for ECX results in a general-protection (#GP) exception.</p>",
                 "tooltip": "Reads the contents of the extended control register (XCR) specified in the ECX register into registers EDX:EAX. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The EDX register is loaded with the high-order 32 bits of the XCR and the EAX register is loaded with the low-order 32 bits. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are cleared.) If fewer than 64 bits are implemented in the XCR being read, the values returned to EDX:EAX in unimplemented bit locations are undefined."
             };

         case "XLATB":
         case "XLAT":
             return {
                 "url": "http://www.felixcloutier.com/x86/XLATB.html",
                 "html": "<p>Locates a byte entry in a table in memory, using the contents of the AL register as a table index, then copies the contents of the table entry back into the AL register. The index in the AL register is treated as an unsigned integer. The XLAT and XLATB instructions get the base address of the table in memory from either the DS:EBX or the DS:BX registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). (The DS segment may be overridden with a segment override prefix.)</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operand\u201d form and the \u201cno-operand\u201d form. The explicit-operand form (specified with the XLAT mnemonic) allows the base address of the table to be specified explicitly with a symbol. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the symbol does not have to specify the correct base address. The base address is always specified by the DS:(E)BX registers, which must be loaded correctly before the XLAT instruction is executed.</p><p>The no-operands form (XLATB) provides a \u201cshort form\u201d of the XLAT instructions. Here also the processor assumes that the DS:(E)BX registers contain the base address of the table.</p><p>In 64-bit mode, operation is similar to that in legacy or compatibility mode. AL is used to specify the table index (the operand size is fixed at 8 bits). RBX, however, is used to specify the table\u2019s base address. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Locates a byte entry in a table in memory, using the contents of the AL register as a table index, then copies the contents of the table entry back into the AL register. The index in the AL register is treated as an unsigned integer. The XLAT and XLATB instructions get the base address of the table in memory from either the DS:EBX or the DS:BX registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). (The DS segment may be overridden with a segment override prefix.)"
             };

         case "XOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/XOR.html",
                 "html": "<p>Performs a bitwise exclusive OR (XOR) operation on the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result is 1 if the corresponding bits of the operands are different; each bit is 0 if the corresponding bits are the same.</p><p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Performs a bitwise exclusive OR (XOR) operation on the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result is 1 if the corresponding bits of the operands are different; each bit is 0 if the corresponding bits are the same."
             };

         case "VXORPD":
         case "XORPD":
             return {
                 "url": "http://www.felixcloutier.com/x86/XORPD.html",
                 "html": "<p>Performs a bitwise logical exclusive-OR of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical exclusive-OR of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register."
             };

         case "XORPS":
         case "VXORPS":
             return {
                 "url": "http://www.felixcloutier.com/x86/XORPS.html",
                 "html": "<p>Performs a bitwise logical exclusive-OR of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a bitwise logical exclusive-OR of the four packed single-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the result in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register."
             };

         case "XRSTOR64":
         case "XRSTOR":
             return {
                 "url": "http://www.felixcloutier.com/x86/XRSTOR64.html",
                 "html": "<p>Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.7, \u201cOperation of XRSTOR,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XRSTOR instruction.</p>",
                 "tooltip": "Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0."
             };

         case "XRSTORS64":
         case "XRSTORS":
             return {
                 "url": "http://www.felixcloutier.com/x86/XRSTORS64.html",
                 "html": "<p>Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XRSTORS may be executed only if CPL = 0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.11, \u201cOperation of XRSTORS,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XRSTORS instruction.</p>",
                 "tooltip": "Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XRSTORS may be executed only if CPL = 0."
             };

         case "XSAVE":
         case "XSAVE64":
             return {
                 "url": "http://www.felixcloutier.com/x86/XSAVE64.html",
                 "html": "<p>Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.6, \u201cOperation of XSAVE,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XSAVE instruction.</p><p>Use of a destination operand not aligned to a 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored.</p>",
                 "tooltip": "Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0."
             };

         case "XSAVEC":
         case "XSAVEC64":
             return {
                 "url": "http://www.felixcloutier.com/x86/XSAVEC64.html",
                 "html": "<p>Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.9, \u201cOperation of XSAVEC,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XSAVEC instruction.</p><p>Use of a destination operand not aligned to a 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored.</p>",
                 "tooltip": "Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0."
             };

         case "XSAVEOPT64":
         case "XSAVEOPT":
             return {
                 "url": "http://www.felixcloutier.com/x86/XSAVEOPT.html",
                 "html": "<p>Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.8, \u201cOperation of XSAVEOPT,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XSAVEOPT instruction.</p><p>Use of a destination operand not aligned to a 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored.</p>",
                 "tooltip": "Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0."
             };

         case "XSAVES":
         case "XSAVES64":
             return {
                 "url": "http://www.felixcloutier.com/x86/XSAVES64.html",
                 "html": "<p>Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XSAVES may be executed only if CPL = 0.</p><p>The format of the XSAVE area is detailed in Section 13.4, \u201cXSAVE Area,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>Section 13.10, \u201cOperation of XSAVES,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> provides a detailed description of the operation of the XSAVES instruction.</p><p>Use of a destination operand not aligned to a 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored.</p>",
                 "tooltip": "Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XSAVES may be executed only if CPL = 0."
             };

         case "XSETBV":
             return {
                 "url": "http://www.felixcloutier.com/x86/XSETBV.html",
                 "html": "<p>Writes the contents of registers EDX:EAX into the 64-bit extended control register (XCR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The contents of the EDX register are copied to the high-order 32 bits of the selected XCR and the contents of the EAX register are copied to the low-order 32 bits of the XCR. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are ignored.) Undefined or reserved bits in an XCR should be set to values previously read.</p><p>This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) is generated. Specifying a reserved or unimplemented XCR in ECX will also cause a general protection exception. The processor will also generate a general protection exception if software attempts to write to reserved bits in an XCR.</p><p>Currently, only XCR0 is supported. Thus, all other values of ECX are reserved and will cause a #GP(0). Note that bit 0 of XCR0 (corresponding to x87 state) must be set to 1; the instruction will cause a #GP(0) if an attempt is made to clear this bit. In addition, the instruction causes a #GP(0) if an attempt is made to set XCR0[2] (AVX state) while clearing XCR0[1] (SSE state); it is necessary to set both bits to use AVX instructions. See Section 13.3, \u201cEnabling the XSAVE Feature Set and XSAVE-Supported Features,\u201d of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p>",
                 "tooltip": "Writes the contents of registers EDX:EAX into the 64-bit extended control register (XCR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The contents of the EDX register are copied to the high-order 32 bits of the selected XCR and the contents of the EAX register are copied to the low-order 32 bits of the XCR. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are ignored.) Undefined or reserved bits in an XCR should be set to values previously read."
             };

         case "XTEST":
             return {
                 "url": "http://www.felixcloutier.com/x86/XTEST.html",
                 "html": "<p>The XTEST instruction queries the transactional execution status. If the instruction executes inside a transactionally executing RTM region or a transactionally executing HLE region, then the ZF flag is cleared, else it is set.</p>",
                 "tooltip": "The XTEST instruction queries the transactional execution status. If the instruction executes inside a transactionally executing RTM region or a transactionally executing HLE region, then the ZF flag is cleared, else it is set."
             };


     }
 }

 module.exports = {
     getAsmOpcode: getAsmOpcode
 };