모두의 코드
HSUBPS (Intel x86/64 assembly instruction)

Published: 2020-09-01

HSUBPS

Packed Single-FP Horizontal Subtract

Note

For how to read the table below, see the article on how to read the x86-64 instruction reference.

| Opcode/Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description |
|---|---|---|---|---|
| F2 0F 7D /r HSUBPS xmm1, xmm2/m128 | RM | V/V | SSE3 | Horizontal subtract packed single-precision floating-point values from xmm2/m128 to xmm1. |
| VEX.NDS.128.F2.0F.WIG 7D /r VHSUBPS xmm1, xmm2, xmm3/m128 | RVM | V/V | AVX | Horizontal subtract packed single-precision floating-point values from xmm2 and xmm3/mem. |
| VEX.NDS.256.F2.0F.WIG 7D /r VHSUBPS ymm1, ymm2, ymm3/m256 | RVM | V/V | AVX | Horizontal subtract packed single-precision floating-point values from ymm2 and ymm3/mem. |

Instruction Operand Encoding

| Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
|---|---|---|---|---|
| RM | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA |
| RVM | ModRM:reg (w) | VEX.vvvv (r) | ModRM:r/m (r) | NA |

Description

Subtracts the single-precision floating-point value in the second dword of the destination operand from the first dword of the destination operand and stores the result in the first dword of the destination operand.

Subtracts the single-precision floating-point value in the fourth dword of the destination operand from the third dword of the destination operand and stores the result in the second dword of the destination operand.

Subtracts the single-precision floating-point value in the second dword of the source operand from the first dword of the source operand and stores the result in the third dword of the destination operand.

Subtracts the single-precision floating-point value in the fourth dword of the source operand from the third dword of the source operand and stores the result in the fourth dword of the destination operand.
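The four steps above can be sketched as a small reference model. This is only an illustration: the names `f32` and `hsubps` are hypothetical helpers, registers are modeled as 4-element lists with index 0 holding the first dword (bits 31:0), and the model does not reproduce exception flags or overflow behavior.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def hsubps(dest, src):
    """Model of HSUBPS dest, src on 4-element lists (index 0 = bits 31:0)."""
    assert len(dest) == 4 and len(src) == 4
    return [
        f32(dest[0] - dest[1]),  # first dword of the result
        f32(dest[2] - dest[3]),  # second dword of the result
        f32(src[0] - src[1]),    # third dword of the result
        f32(src[2] - src[3]),    # fourth dword of the result
    ]

print(hsubps([10.0, 3.0, 8.0, 2.0], [1.0, 5.0, 7.0, 7.0]))
# prints [7.0, 6.0, -4.0, 0.0]
```

Note that the two differences taken from the destination operand land in the low half of the result, and the two taken from the source operand land in the high half.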

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

See Figure 3-22 for HSUBPS; see Figure 3-23 for VHSUBPS.

Figure 3-22. HSUBPS: Packed Single-FP Horizontal Subtract

Figure 3-23. VHSUBPS operation

128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.

VEX.128 encoded version: The first source operand is an XMM register. The second source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.

VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.
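The difference in how the two 128-bit forms treat the upper bits of the destination can be illustrated with a rough model. It assumes VLMAX is 256, so a YMM register is an 8-element list; the function names are hypothetical, and plain Python floats stand in for single-precision values.

```python
def hsub4(a, b):
    """Horizontal subtract on two 4-element lists (index 0 = bits 31:0)."""
    return [a[0] - a[1], a[2] - a[3], b[0] - b[1], b[2] - b[3]]

def legacy_hsubps(ymm_dest, src):
    """Legacy SSE form: low 128 bits are written, upper bits are unmodified."""
    return hsub4(ymm_dest[:4], src[:4]) + ymm_dest[4:]

def vex128_vhsubps(src1, src2):
    """VEX.128 form: low 128 bits are written, upper bits are zeroed."""
    return hsub4(src1[:4], src2[:4]) + [0.0] * 4

ymm = [10.0, 3.0, 8.0, 2.0, 9.0, 9.0, 9.0, 9.0]
src = [1.0, 5.0, 7.0, 7.0]
print(legacy_hsubps(ymm, src))
# prints [7.0, 6.0, -4.0, 0.0, 9.0, 9.0, 9.0, 9.0]
print(vex128_vhsubps(ymm[:4], src))
# prints [7.0, 6.0, -4.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```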

Operation

HSUBPS (128-bit Legacy SSE version)

DEST[31:0] <-  SRC1[31:0] - SRC1[63:32]
DEST[63:32] <-  SRC1[95:64] - SRC1[127:96]
DEST[95:64] <-  SRC2[31:0] - SRC2[63:32]
DEST[127:96] <-  SRC2[95:64] - SRC2[127:96] 
DEST[VLMAX-1:128] (Unmodified)

VHSUBPS (VEX.128 encoded version)

DEST[31:0] <-  SRC1[31:0] - SRC1[63:32]
DEST[63:32] <-  SRC1[95:64] - SRC1[127:96]
DEST[95:64] <-  SRC2[31:0] - SRC2[63:32]
DEST[127:96] <-  SRC2[95:64] - SRC2[127:96] 
DEST[VLMAX-1:128] <-  0

VHSUBPS (VEX.256 encoded version)

DEST[31:0] <-  SRC1[31:0] - SRC1[63:32]
DEST[63:32] <-  SRC1[95:64] - SRC1[127:96]
DEST[95:64] <-  SRC2[31:0] - SRC2[63:32]
DEST[127:96] <-  SRC2[95:64] - SRC2[127:96] 
DEST[159:128] <-  SRC1[159:128] - SRC1[191:160]
DEST[191:160] <-  SRC1[223:192] - SRC1[255:224]
DEST[223:192] <-  SRC2[159:128] - SRC2[191:160]
DEST[255:224] <-  SRC2[223:192] - SRC2[255:224]
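The VEX.256 pseudo-code above applies the 128-bit pattern independently to each 128-bit half, rather than collecting all SRC1 differences first. A minimal sketch under that reading, with hypothetical function names and 8-element lists standing in for YMM registers (index 0 = bits 31:0):

```python
def hsub_lane(a, b):
    """Horizontal subtract within one 128-bit lane (4 floats from each source)."""
    return [a[0] - a[1], a[2] - a[3], b[0] - b[1], b[2] - b[3]]

def vhsubps_256(src1, src2):
    """Model of VHSUBPS ymm1, ymm2, ymm3/m256 on 8-element lists."""
    assert len(src1) == 8 and len(src2) == 8
    low = hsub_lane(src1[0:4], src2[0:4])    # DEST[127:0]
    high = hsub_lane(src1[4:8], src2[4:8])   # DEST[255:128]
    return low + high

src1 = [8.0, 1.0, 6.0, 2.0, 4.0, 3.0, 2.0, 2.0]
src2 = [10.0, 20.0, 30.0, 50.0, 1.0, 0.0, 0.0, 1.0]
print(vhsubps_256(src1, src2))
# prints [7.0, 4.0, -10.0, -20.0, 1.0, 0.0, 1.0, -1.0]
```

In the output, elements 2 and 3 come from the low half of SRC2, while the differences from the high half of SRC1 only appear at elements 4 and 5.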

Intel C/C++ Compiler Intrinsic Equivalent

HSUBPS : __m128 _mm_hsub_ps(__m128 a, __m128 b);
VHSUBPS : __m256 _mm256_hsub_ps(__m256 a, __m256 b);

Exceptions

When the source operand of the legacy SSE form (HSUBPS) is a memory operand, it must be aligned on a 16-byte boundary, or a general-protection exception (#GP) is generated. The VEX-encoded forms (VHSUBPS) do not impose this alignment requirement.
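The 16-byte alignment condition can be stated as a simple check on the operand's address. The helper below is a hypothetical illustration, not part of any Intel API; it only tests the address and does not model the exception itself.

```python
def is_16_byte_aligned(address: int) -> bool:
    """True if the address is a multiple of 16 (its low four bits are zero)."""
    return address % 16 == 0

print(is_16_byte_aligned(0x1000))  # prints True
print(is_16_byte_aligned(0x1004))  # prints False
```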

Numeric Exceptions

Overflow, Underflow, Invalid, Precision, Denormal

Other Exceptions

See Exceptions Type 2.
