#include <AssemblerX86.h>
End users will generally not need to use the AssemblerX86 class directly. Instead, they will call Assembler::create() to create an assembler that's appropriate for a particular binary file header or interpretation and then use that assembler to assemble instructions.
The assembler itself is quite small compared to the disassembler (about one third the size) and doesn't actually know about any instructions; it only knows how to encode various prefixes and operand addressing modes. For each instruction to be assembled, the assembler consults a dictionary of assembly definitions. The instruction is looked up in this dictionary and the chosen definition then drives the assembly. If the instruction being assembled matches multiple definitions then each valid definition is tried and the "best" one (see Assembler::set_encoding_type()) is returned.
The dictionary is generated directly from the Intel "Instruction Set Reference" PDF documentation as augmented by a small text file in this directory. The IntelAssemblyBuilder perl script generates AssemblerX86Init.h and AssemblerX86Init.C, which contain the X86InstructionKind enumeration, a function to initialize the dictionary (AssemblerX86::initAssemblyRules()), and a function for converting an X86InstructionKind constant to a string (AssemblerX86::to_str()).
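The dictionary-driven lookup described above can be sketched as follows. This is a minimal illustration, not the ROSE implementation: `Kind`, `Defn`, `fits_signed` and `assemble_shortest` are hypothetical stand-ins, and "best" is assumed here to mean "shortest encoding", which is only one of the policies that Assembler::set_encoding_type() might select. The two PUSH encodings (6A ib and 68 id) are taken from the Intel manual.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for X86InstructionKind and InsnDefn.
enum Kind { kind_push };

struct Defn {
    uint8_t opcode;     // single opcode byte
    size_t  imm_bytes;  // width of the immediate that follows the opcode
};

using Page       = std::vector<const Defn*>;  // plays the role of DictionaryPage
using Dictionary = std::map<Kind, Page>;      // plays the role of InsnDictionary

// Two encodings of PUSH with an immediate operand: 6A ib and 68 id.
const Defn push_imm8  = {0x6a, 1};
const Defn push_imm32 = {0x68, 4};

// True if value is representable as a signed integer of nbytes bytes.
bool fits_signed(int64_t value, size_t nbytes) {
    if (nbytes == 0) return false;
    if (nbytes >= 8) return true;
    const int64_t limit = int64_t(1) << (8 * nbytes - 1);
    return value >= -limit && value < limit;
}

// Try every definition on the page for `kind`; keep the shortest encoding
// (an assumed policy for choosing the "best" definition).
std::vector<uint8_t> assemble_shortest(const Dictionary& dict, Kind kind, int64_t imm) {
    auto page = dict.find(kind);
    if (page == dict.end())
        throw std::runtime_error("no definitions for this instruction kind");
    std::vector<uint8_t> best;
    for (const Defn* d : page->second) {
        if (!fits_signed(imm, d->imm_bytes)) continue;  // definition does not match
        std::vector<uint8_t> bytes{d->opcode};
        for (size_t i = 0; i < d->imm_bytes; ++i)       // little-endian immediate
            bytes.push_back(static_cast<uint8_t>(static_cast<uint64_t>(imm) >> (8 * i)));
        if (best.empty() || bytes.size() < best.size()) best = bytes;
    }
    if (best.empty())
        throw std::runtime_error("no definition matches the operands");
    return best;
}
```

With both definitions on the page, a small immediate selects the two-byte 6A form, while a value outside the signed-byte range falls back to the five-byte 68 form.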
Public Member Functions | |
| AssemblerX86 () | |
| virtual | ~AssemblerX86 () |
| virtual SgUnsignedCharList | assembleOne (SgAsmInstruction *) |
| Assemble an instruction (SgAsmInstruction) into byte code. | |
| void | set_honor_operand_types (bool b) |
| Causes the assembler to honor (if true) or disregard (if false) the data types of operands when assembling. | |
| bool | get_honor_operand_types () const |
| Returns true if the assembler is honoring operand data types, or false if the assembler is using the smallest possible encoding. | |
| virtual SgUnsignedCharList | assembleProgram (const std::string &source) |
| Assemble an x86 program from assembly source code using the nasm assembler. | |
Private Types | |
| typedef std::vector< const InsnDefn * > | DictionaryPage |
| Instruction assembly definitions for a single kind of instruction. | |
| typedef std::map< X86InstructionKind, DictionaryPage > | InsnDictionary |
| Instruction assembly definitions for all kinds of instructions. | |
| od_none | |
| Operand is not present as part of the instruction. | |
| od_AL | |
| AL register. | |
| od_AX | |
| AX register. | |
| od_EAX | |
| EAX register. | |
| od_RAX | |
| RAX register. | |
| od_DX | |
| DX register. | |
| od_CS | |
| CS register. | |
| od_DS | |
| DS register. | |
| od_ES | |
| ES register. | |
| od_FS | |
| FS register. | |
| od_GS | |
| GS register. | |
| od_SS | |
| SS register. | |
| od_rel8 | |
| A relative address in the range from 128 bytes before the end of the instruction to 127 bytes after the end of the instruction. | |
| od_rel16 | |
| A relative address in the same code segment as the instruction assembled, with an operand size attribute of 16 bits. | |
| od_rel32 | |
| A relative address in the same code segment as the instruction assembled, with an operand size attribute of 32 bits. | |
| od_rel64 | |
| A relative address in the same code segment as the instruction assembled, with an operand size attribute of 64 bits. | |
| od_ptr16_16 | |
| A far pointer, typically to a code segment different from that of the instruction. | |
| od_ptr16_32 | |
| A far pointer, typically to a code segment different from that of the instruction. | |
| od_ptr16_64 | |
| A far pointer, typically to a code segment different from that of the instruction. | |
| od_r8 | |
| One of the byte general-purpose registers: AL, CL, DL, BL, AH, CH, DH, BH, BPL, SPL, DIL and SIL; or one of the byte registers (R8L-R15L) available when using REX.R and 64-bit mode. | |
| od_r16 | |
| One of the word general-purpose registers: AX, CX, DX, BX, SP, BP, SI, DI; or one of the word registers (R8-R15) available when using REX.R and 64-bit mode. | |
| od_r32 | |
| One of the doubleword general-purpose registers: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI; or one of the doubleword registers (R8D-R15D) available when using REX.R and 64-bit mode. | |
| od_r64 | |
| One of the quadword general-purpose registers: RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15. | |
| od_imm8 | |
| An immediate byte value, a signed number between -128 and +127, inclusive. | |
| od_imm16 | |
| An immediate word value used for instructions whose operand-size attribute is 16 bits. | |
| od_imm32 | |
| An immediate doubleword value used for instructions whose operand-size attribute is 32 bits. | |
| od_imm64 | |
| An immediate quadword value used for instructions whose operand-size attribute is 64 bits. | |
| od_r_m8 | |
| A byte operand that is either the contents of a byte general-purpose register (AL, CL, DL, BL, AH, CH, DH, BH, BPL, SPL, DIL and SIL) or a byte from memory. | |
| od_r_m16 | |
| A word general-purpose register or memory operand used for instructions whose operand-size attribute is 16-bits. | |
| od_r_m32 | |
| A doubleword general-purpose register or memory operand used for instructions whose operand-size attribute is 32-bits. | |
| od_r_m64 | |
| A quadword general-purpose register or memory operand used for instructions whose operand-size attribute is 64 bits when using REX.W. | |
| od_m | |
| A 16-, 32-, or 64-bit operand in memory. | |
| od_m8 | |
| A byte operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. | |
| od_m16 | |
| A word operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. | |
| od_m32 | |
| A doubleword operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. | |
| od_m64 | |
| A quadword operand in memory. | |
| od_m128 | |
| A double quadword operand in memory. | |
| od_m16_16 | |
| A memory operand containing a far pointer composed of two numbers. | |
| od_m16_32 | |
| A memory operand containing a far pointer composed of two numbers. | |
| od_m16_64 | |
| A memory operand containing a far pointer composed of two numbers. | |
| od_m16a16 | |
| A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&16" in Intel manuals). | |
| od_m16a32 | |
| A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&32" in Intel manuals). | |
| od_m32a32 | |
| A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m32&32" in Intel manuals). | |
| od_m16a64 | |
| A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&64" in Intel manuals). | |
| od_moffs8 | |
| A simple memory variable (memory offset) of type byte used by some variants of the MOV instruction. | |
| od_moffs16 | |
| A simple memory variable (memory offset) of type word used by some variants of the MOV instruction. | |
| od_moffs32 | |
| A simple memory variable (memory offset) of type doubleword used by some variants of the MOV instruction. | |
| od_moffs64 | |
| A simple memory variable (memory offset) of type quadword used by some variants of the MOV instruction. | |
| od_sreg | |
| A segment register. | |
| od_m32fp | |
| A single-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions. | |
| od_m64fp | |
| A double-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions. | |
| od_m80fp | |
| A double extended-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions. | |
| od_st0 | |
| The 0th (top) element of the FPU register stack. | |
| od_st1 | |
| The 1st (second-from-top) element of the FPU register stack. | |
| od_st2 | |
| The 2nd element of the FPU register stack. | |
| od_st3 | |
| The 3rd element of the FPU register stack. | |
| od_st4 | |
| The 4th element of the FPU register stack. | |
| od_st5 | |
| The 5th element of the FPU register stack. | |
| od_st6 | |
| The 6th element of the FPU register stack. | |
| od_st7 | |
| The 7th (bottom) element of the FPU register stack. | |
| od_sti | |
| Any element of the FPU register stack. | |
| od_mm | |
| An MMX register. | |
| od_mm_m32 | |
| The low-order 32 bits of an MMX register or a 32-bit memory operand. | |
| od_mm_m64 | |
| An MMX register or a 64-bit memory operand. | |
| od_xmm | |
| An XMM register. | |
| od_xmm_m16 | |
| See PMOVSXBQ. | |
| od_xmm_m32 | |
| An XMM register or a 32-bit memory operand. | |
| od_xmm_m64 | |
| An XMM register or a 64-bit memory operand. | |
| od_xmm_m128 | |
| An XMM register or a 128-bit memory operand. | |
| od_XMM0 | |
| See BLENDVPD. | |
| od_0 | |
| See ENTER. | |
| od_1 | |
| See ENTER. | |
| od_m80 | |
| See FBLD. | |
| od_dec | |
| See FBLD. | |
| od_m80bcd | |
| See FBSTP. | |
| od_m2byte | |
| See FLDCW. | |
| od_m14_28byte | |
| See FLDENV. | |
| od_m94_108byte | |
| See FRSTOR. | |
| od_m512byte | |
| See FXRSTOR. | |
| od_r16_m16 | |
| See LAR. | |
| od_r32_m8 | |
| See PINSRB. | |
| od_r32_m16 | |
| See LAR. | |
| od_r64_m16 | |
| See SLDT. | |
| od_CR0 | |
| See MOV. | |
| od_CR7 | |
| See MOV. | |
| od_CR8 | |
| See MOV. | |
| od_CR0CR7 | |
| See MOV. | |
| od_DR0DR7 | |
| See MOV. | |
| od_reg | |
| See MOVMSKPD. | |
| od_CL | |
| See SAR. | |
| mrp_unknown | |
| mrp_disp | |
| mrp_index | |
| mrp_index_disp | |
| mrp_base | |
| mrp_base_disp | |
| mrp_base_index | |
| mrp_base_index_disp | |
| enum | OperandDefn { od_none, od_AL, od_AX, od_EAX, od_RAX, od_DX, od_CS, od_DS, od_ES, od_FS, od_GS, od_SS, od_rel8, od_rel16, od_rel32, od_rel64, od_ptr16_16, od_ptr16_32, od_ptr16_64, od_r8, od_r16, od_r32, od_r64, od_imm8, od_imm16, od_imm32, od_imm64, od_r_m8, od_r_m16, od_r_m32, od_r_m64, od_m, od_m8, od_m16, od_m32, od_m64, od_m128, od_m16_16, od_m16_32, od_m16_64, od_m16a16, od_m16a32, od_m32a32, od_m16a64, od_moffs8, od_moffs16, od_moffs32, od_moffs64, od_sreg, od_m32fp, od_m64fp, od_m80fp, od_st0, od_st1, od_st2, od_st3, od_st4, od_st5, od_st6, od_st7, od_sti, od_mm, od_mm_m32, od_mm_m64, od_xmm, od_xmm_m16, od_xmm_m32, od_xmm_m64, od_xmm_m128, od_XMM0, od_0, od_1, od_m80, od_dec, od_m80bcd, od_m2byte, od_m14_28byte, od_m94_108byte, od_m512byte, od_r16_m16, od_r32_m8, od_r32_m16, od_r64_m16, od_CR0, od_CR7, od_CR8, od_CR0CR7, od_DR0DR7, od_reg, od_CL } |
| Operand types, from Intel "Instruction Set Reference, A-M" section 3.1.1.2, Vol. 2A 3-3. | |
| enum | MemoryReferencePattern { mrp_unknown, mrp_disp, mrp_index, mrp_index_disp, mrp_base, mrp_base_disp, mrp_base_index, mrp_base_index_disp } |
Private Member Functions | |
| SgUnsignedCharList | fixup_prefix_bytes (SgAsmx86Instruction *insn, SgUnsignedCharList source) |
| Reorders the prefix bytes in source so that they have the same order (and the same repeat counts) as the prefix bytes stored in the target, that is, in the p_raw_bytes data member of the instruction. | |
| SgUnsignedCharList | assemble (SgAsmx86Instruction *insn, const InsnDefn *defn) |
| Low-level method to assemble a single instruction using the specified definition from the assembly dictionary. | |
| void | matches (const InsnDefn *defn, SgAsmx86Instruction *insn, int64_t *disp, int64_t *imm) const |
| Attempts to match an instruction with a definition. | |
| bool | matches (OperandDefn, SgAsmExpression *, SgAsmInstruction *, int64_t *disp, int64_t *imm) const |
| Attempts to match an instruction operand with a definition operand. | |
| uint8_t | build_modrm (const InsnDefn *, SgAsmx86Instruction *, size_t argno, uint8_t *sib, int64_t *displacement, uint8_t *rex) const |
| Builds the ModR/M byte and SIB byte. | |
| void | build_modreg (const InsnDefn *, SgAsmx86Instruction *, size_t argno, uint8_t *modrm, uint8_t *rex) const |
| Adjusts the "reg" field of the ModR/M byte and adjusts the REX prefix byte if necessary. | |
| uint8_t | segment_override (SgAsmx86Instruction *) |
| Calculates the segment override from the instruction operands rather than obtaining it from the p_segmentOverride data member. | |
Static Private Member Functions | |
| static size_t | od_e_val (unsigned opcode_mods) |
| Returns value of En modification. | |
| static uint8_t | od_rex_byte (unsigned opcode_mods) |
| static uint8_t | build_modrm (unsigned mod, unsigned reg, unsigned rm) |
| Returns a ModR/M byte constructed from the three standard fields: mode, register, and register/memory. | |
| static unsigned | modrm_mod (uint8_t modrm) |
| Returns the mode field of a ModR/M byte. | |
| static unsigned | modrm_reg (uint8_t modrm) |
| Returns the register field of a ModR/M byte. | |
| static unsigned | modrm_rm (uint8_t modrm) |
| Returns the register/memory field of a ModR/M byte. | |
| static uint8_t | build_sib (unsigned ss, unsigned index, unsigned base) |
| Returns a SIB byte constructed from the three standard fields: scale, index, and base. | |
| static unsigned | sib_ss (uint8_t sib) |
| Returns the scale field of a SIB byte. | |
| static unsigned | sib_index (uint8_t sib) |
| Returns the index field of a SIB byte. | |
| static unsigned | sib_base (uint8_t sib) |
| Returns the base field of a SIB byte. | |
| static void | initAssemblyRules () |
| Build the dictionary used by the x86 assemblers. | |
| static void | initAssemblyRules_part1 () |
| static void | initAssemblyRules_part2 () |
| static void | initAssemblyRules_part3 () |
| static void | initAssemblyRules_part4 () |
| static void | initAssemblyRules_part5 () |
| static void | initAssemblyRules_part6 () |
| static void | initAssemblyRules_part7 () |
| static void | initAssemblyRules_part8 () |
| static void | initAssemblyRules_part9 () |
| static void | define (const InsnDefn *d) |
| Adds a definition to the assembly dictionary. | |
| static std::string | to_str (X86InstructionKind) |
| Returns the string version of the instruction kind sans "x86_" prefix. | |
| static bool | matches_rel (SgAsmInstruction *, int64_t val, size_t nbytes) |
| Determines whether a call/jump target can be represented as an IP-relative displacement of the specified size. | |
| static MemoryReferencePattern | parse_memref (SgAsmInstruction *insn, SgAsmMemoryReferenceExpression *expr, SgAsmx86RegisterReferenceExpression **base_reg, SgAsmx86RegisterReferenceExpression **index_reg, SgAsmValueExpression **scale, SgAsmValueExpression **displacement) |
| Parses memory reference expressions and returns the address BASE_REG + (INDEX_REG * SCALE) + DISPLACEMENT, where BASE_REG and INDEX_REG are optional register reference expressions and SCALE and DISPLACEMENT are optional value expressions. | |
Private Attributes | |
| bool | honor_operand_types |
| If true, operand types rather than values determine assembled form. | |
Static Private Attributes | |
| static const unsigned | od_e_mask = 0x00000070 |
| Indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand. | |
| static const unsigned | od_e_pres = 0x00000080 |
| static const unsigned | od_e0 = 0x00000000 | od_e_pres |
| static const unsigned | od_e1 = 0x00000010 | od_e_pres |
| static const unsigned | od_e2 = 0x00000020 | od_e_pres |
| static const unsigned | od_e3 = 0x00000030 | od_e_pres |
| static const unsigned | od_e4 = 0x00000040 | od_e_pres |
| static const unsigned | od_e5 = 0x00000050 | od_e_pres |
| static const unsigned | od_e6 = 0x00000060 | od_e_pres |
| static const unsigned | od_e7 = 0x00000070 | od_e_pres |
| static const unsigned | od_rex_pres = 0x00000001 |
| Indicates the use of a REX prefix that affects operand size or instruction semantics. | |
| static const unsigned | od_rex_mask = 0x00000f00 |
| static const unsigned | od_rex = 0x00000000 | od_rex_pres |
| static const unsigned | od_rexb = 0x00000100 | od_rex_pres |
| static const unsigned | od_rexx = 0x00000200 | od_rex_pres |
| static const unsigned | od_rexxb = 0x00000300 | od_rex_pres |
| static const unsigned | od_rexr = 0x00000400 | od_rex_pres |
| static const unsigned | od_rexrb = 0x00000500 | od_rex_pres |
| static const unsigned | od_rexrx = 0x00000600 | od_rex_pres |
| static const unsigned | od_rexrxb = 0x00000700 | od_rex_pres |
| static const unsigned | od_rexw = 0x00000800 | od_rex_pres |
| static const unsigned | od_rexwb = 0x00000900 | od_rex_pres |
| static const unsigned | od_rexwx = 0x00000a00 | od_rex_pres |
| static const unsigned | od_rexwxb = 0x00000b00 | od_rex_pres |
| static const unsigned | od_rexwr = 0x00000c00 | od_rex_pres |
| static const unsigned | od_rexwrb = 0x00000d00 | od_rex_pres |
| static const unsigned | od_rexwrx = 0x00000e00 | od_rex_pres |
| static const unsigned | od_rexwrxb = 0x00000f00 | od_rex_pres |
| static const unsigned | od_modrm = 0x00000002 |
| Indicates that the ModR/M byte of the instruction contains a register operand and an r/m operand. | |
| static const unsigned | od_c_mask = 0x00007000 |
| A 1-byte (CB), 2-byte (CW), 4-byte (CD), 6-byte (CP), 8-byte (CO), or 10-byte (CT) value follows the opcode. | |
| static const unsigned | od_cb = 0x00001000 |
| static const unsigned | od_cw = 0x00002000 |
| static const unsigned | od_cd = 0x00003000 |
| static const unsigned | od_cp = 0x00004000 |
| static const unsigned | od_co = 0x00005000 |
| static const unsigned | od_ct = 0x00006000 |
| static const unsigned | od_i_mask = 0x00070000 |
| A 1-byte (IB), 2-byte (IW), 4-byte (ID), or 8-byte (IO) little-endian immediate operand to the instruction follows the opcode, ModR/M bytes or scale-indexing bytes. | |
| static const unsigned | od_ib = 0x00010000 |
| static const unsigned | od_iw = 0x00020000 |
| static const unsigned | od_id = 0x00030000 |
| static const unsigned | od_io = 0x00040000 |
| static const unsigned | od_r_mask = 0x00700000 |
| A register code, from 0 through 7, added to a byte of the opcode. | |
| static const unsigned | od_rb = 0x00100000 |
| static const unsigned | od_rw = 0x00200000 |
| static const unsigned | od_rd = 0x00300000 |
| static const unsigned | od_ro = 0x00400000 |
| static const unsigned | od_i = 0x00000004 |
| A number used in floating-point instructions when one of the operands is ST(i) from the FPU register stack. | |
| static const unsigned | COMPAT_LEGACY = 0x01 |
| These bits define the compatibility of an instruction with 32- and 64-bit modes. Definition is compatible with non-64-bit architectures. | |
| static const unsigned | COMPAT_64 = 0x02 |
| Definition is compatible with 64-bit architectures. | |
| static InsnDictionary | defns |
| Instruction assembly definitions organized by X86InstructionKind. | |
Classes | |
| class | InsnDefn |
| Defines static characteristics of an instruction used by the assembler and disassembler. | |
typedef std::vector<const InsnDefn*> AssemblerX86::DictionaryPage [private] |
Instruction assembly definitions for a single kind of instruction.
typedef std::map<X86InstructionKind, DictionaryPage> AssemblerX86::InsnDictionary [private] |
Instruction assembly definitions for all kinds of instructions.
enum AssemblerX86::OperandDefn [private] |
Operand types, from Intel "Instruction Set Reference, A-M" section 3.1.1.2, Vol. 2A 3-3.
enum AssemblerX86::MemoryReferencePattern [private] |
| AssemblerX86::AssemblerX86 | ( | ) | [inline] |
| virtual AssemblerX86::~AssemblerX86 | ( | ) | [inline, virtual] |
| virtual SgUnsignedCharList AssemblerX86::assembleOne | ( | SgAsmInstruction * | ) | [virtual] |
Assemble an instruction (SgAsmInstruction) into byte code.
The new bytes are added to the end of the vector.
Implements Assembler.
| void AssemblerX86::set_honor_operand_types | ( | bool | b | ) | [inline] |
Causes the assembler to honor (if true) or disregard (if false) the data types of operands when assembling.
For instance, when honoring operand data types, if an operand is of type SgAsmWordValueExpression then the assembler will attempt to encode it as four bytes even if its value could be encoded as a single byte. This behavior is turned on automatically when Assembler::set_encoding_type() is called with Assembler::ET_MATCHES, but it can also be turned on independently.
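The effect of this flag on immediate operands can be pictured with a small sketch. The function names and the `declared_width` parameter (the width implied by the operand's expression type) are hypothetical; the real assembler derives this information from the SgAsm expression nodes.

```cpp
#include <cstddef>
#include <cstdint>

// Smallest signed width (1, 2, 4 or 8 bytes) able to hold the value.
size_t smallest_width(int64_t value) {
    if (value >= INT8_MIN  && value <= INT8_MAX)  return 1;
    if (value >= INT16_MIN && value <= INT16_MAX) return 2;
    if (value >= INT32_MIN && value <= INT32_MAX) return 4;
    return 8;
}

// declared_width is a hypothetical parameter: the width implied by the
// operand's data type. When types are honored it wins over the value.
size_t immediate_width(int64_t value, size_t declared_width, bool honor_operand_types) {
    return honor_operand_types ? declared_width : smallest_width(value);
}
```

Under these assumptions, the value 5 declared with a four-byte type is encoded in four bytes when types are honored and in one byte otherwise.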
| bool AssemblerX86::get_honor_operand_types | ( | ) | const [inline] |
Returns true if the assembler is honoring operand data types, or false if the assembler is using the smallest possible encoding.
| virtual SgUnsignedCharList AssemblerX86::assembleProgram | ( | const std::string & | source | ) | [virtual] |
| static size_t AssemblerX86::od_e_val | ( | unsigned | opcode_mods | ) | [inline, static, private] |
Returns value of En modification.
| static uint8_t AssemblerX86::od_rex_byte | ( | unsigned | opcode_mods | ) | [inline, static, private] |
| static uint8_t AssemblerX86::build_modrm | ( | unsigned | mod, | |
| unsigned | reg, | |||
| unsigned | rm | |||
| ) | [inline, static, private] |
Returns a ModR/M byte constructed from the three standard fields: mode, register, and register/memory.
| static unsigned AssemblerX86::modrm_mod | ( | uint8_t | modrm | ) | [inline, static, private] |
Returns the mode field of a ModR/M byte.
| static unsigned AssemblerX86::modrm_reg | ( | uint8_t | modrm | ) | [inline, static, private] |
Returns the register field of a ModR/M byte.
| static unsigned AssemblerX86::modrm_rm | ( | uint8_t | modrm | ) | [inline, static, private] |
Returns the register/memory field of a ModR/M byte.
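For reference, the ModR/M layout defined by the Intel manual places mod in bits 7-6, reg in bits 5-3, and r/m in bits 2-0. The sketch below reuses the member names from this page but is written from that layout, not copied from the ROSE source; the range assertion in particular is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Pack the three fields: mod (2 bits), reg (3 bits), r/m (3 bits).
uint8_t build_modrm(unsigned mod, unsigned reg, unsigned rm) {
    assert(mod < 4 && reg < 8 && rm < 8);  // assumed precondition
    return static_cast<uint8_t>((mod << 6) | (reg << 3) | rm);
}

// Field accessors, the inverse of build_modrm().
unsigned modrm_mod(uint8_t modrm) { return modrm >> 6; }
unsigned modrm_reg(uint8_t modrm) { return (modrm >> 3) & 7; }
unsigned modrm_rm(uint8_t modrm)  { return modrm & 7; }
```

For example, a register-direct operand (mod 3) with reg 0 and r/m 1 packs to 0xC1.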
| static uint8_t AssemblerX86::build_sib | ( | unsigned | ss, | |
| unsigned | index, | |||
| unsigned | base | |||
| ) | [inline, static, private] |
Returns a SIB byte constructed from the three standard fields: scale, index, and base.
| static unsigned AssemblerX86::sib_ss | ( | uint8_t | sib | ) | [inline, static, private] |
Returns the scale field of a SIB byte.
| static unsigned AssemblerX86::sib_index | ( | uint8_t | sib | ) | [inline, static, private] |
Returns the index field of a SIB byte.
| static unsigned AssemblerX86::sib_base | ( | uint8_t | sib | ) | [inline, static, private] |
Returns the base field of a SIB byte.
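The SIB byte follows the same pattern: scale in bits 7-6, index in bits 5-3, base in bits 2-0. The sketch below is based on the Intel layout rather than on the ROSE source; `scale_to_ss` is a hypothetical helper that converts a scale factor of 1, 2, 4 or 8 into the two-bit ss field.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: map a scale factor to the 2-bit ss field.
unsigned scale_to_ss(unsigned scale) {
    switch (scale) {
        case 1: return 0;
        case 2: return 1;
        case 4: return 2;
        case 8: return 3;
        default: throw std::invalid_argument("scale must be 1, 2, 4 or 8");
    }
}

// Pack the three fields: ss (2 bits), index (3 bits), base (3 bits).
uint8_t build_sib(unsigned ss, unsigned index, unsigned base) {
    assert(ss < 4 && index < 8 && base < 8);  // assumed precondition
    return static_cast<uint8_t>((ss << 6) | (index << 3) | base);
}

unsigned sib_ss(uint8_t sib)    { return sib >> 6; }
unsigned sib_index(uint8_t sib) { return (sib >> 3) & 7; }
unsigned sib_base(uint8_t sib)  { return sib & 7; }
```

An address of the form [EAX + ECX*4] uses base 0 (EAX), index 1 (ECX) and ss 2, which packs to 0x88.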
| static void AssemblerX86::initAssemblyRules | ( | ) | [static, private] |
Build the dictionary used by the x86 assemblers.
All x86 assemblers share a common dictionary.
| static void AssemblerX86::initAssemblyRules_part1 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part2 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part3 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part4 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part5 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part6 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part7 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part8 | ( | ) | [static, private] |
| static void AssemblerX86::initAssemblyRules_part9 | ( | ) | [static, private] |
| static void AssemblerX86::define | ( | const InsnDefn * | d | ) | [inline, static, private] |
Adds a definition to the assembly dictionary.
All x86 assemblers share a common dictionary.
| static std::string AssemblerX86::to_str | ( | X86InstructionKind | ) | [static, private] |
Returns the string version of the instruction kind sans "x86_" prefix.
This is not necessarily the same as the mnemonic since occasionally multiple kinds will map to a single mnemonic (e.g., RET maps to both x86_ret and x86_retf).
| SgUnsignedCharList AssemblerX86::fixup_prefix_bytes | ( | SgAsmx86Instruction * | insn, | |
| SgUnsignedCharList | source | |||
| ) | [private] |
Reorders the prefix bytes in source so that they have the same order (and the same repeat counts) as the prefix bytes stored in the target, that is, in the p_raw_bytes data member of the instruction.
The source should contain only prefix bytes from groups 1 through 4 as listed in section 2.1.1 of the Intel Instruction Set Reference. It should not contain the REX byte. Any source prefix that does not appear in the original instruction will be placed at the end of the result; any prefix that appears in the original instruction but not the source will be dropped.
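One possible reading of these rules is sketched below. It is an interpretation, not the ROSE implementation: `reorder_prefixes` is a hypothetical free function, `original` stands for the prefix bytes found in the instruction's raw bytes, and repeat counts are assumed to be taken from the original.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// original: prefix bytes as they appear in the instruction's raw bytes.
// source:   prefix bytes the assembler wants to emit (no REX byte).
Bytes reorder_prefixes(const Bytes& original, const Bytes& source) {
    auto contains = [](const Bytes& v, uint8_t b) {
        return std::find(v.begin(), v.end(), b) != v.end();
    };
    Bytes result;
    // Follow the original order and repeat counts, dropping any
    // original prefix that the source does not ask for.
    for (uint8_t b : original)
        if (contains(source, b)) result.push_back(b);
    // Source prefixes unknown to the original go at the end.
    for (uint8_t b : source)
        if (!contains(original, b)) result.push_back(b);
    return result;
}
```

For instance, with original bytes 66 F3 66 and source bytes F3 66 2E, the result keeps the original ordering 66 F3 66 and appends the new 2E segment override.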
| SgUnsignedCharList AssemblerX86::assemble | ( | SgAsmx86Instruction * | insn, | |
| const InsnDefn * | defn | |||
| ) | [private] |
Low-level method to assemble a single instruction using the specified definition from the assembly dictionary.
An Assembler::Exception is thrown if the instruction is not compatible with the definition.
| void AssemblerX86::matches | ( | const InsnDefn * | defn, | |
| SgAsmx86Instruction * | insn, | |||
| int64_t * | disp, | |||
| int64_t * | imm | |||
| ) | const [private] |
Attempts to match an instruction with a definition.
An exception is thrown if the instruction and definition do not match. If the disp or imm arguments are non-null pointers then the operands of the instruction are also checked, and the value of any operand that is an IP-relative displacement or an immediate is returned through the corresponding argument.
| bool AssemblerX86::matches | ( | OperandDefn | , | |
| SgAsmExpression * | , | |||
| SgAsmInstruction * | , | |||
| int64_t * | disp, | |||
| int64_t * | imm | |||
| ) | const [private] |
Attempts to match an instruction operand with a definition operand.
Returns true if they match, false otherwise. The disp and imm pointers are used to return values if the operand is an IP-relative displacement or immediate value.
| static bool AssemblerX86::matches_rel | ( | SgAsmInstruction * | , | |
| int64_t | val, | |||
| size_t | nbytes | |||
| ) | [static, private] |
Determines whether a call/jump target can be represented as an IP-relative displacement of the specified size.
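The underlying check can be sketched as a signed range test on the displacement measured from the end of the instruction. The helper names and the explicit address and size parameters below are hypothetical; the real method takes the instruction node and may compute these values differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Displacement of target relative to the end of the instruction.
int64_t displacement_from(uint64_t insn_addr, size_t insn_size, uint64_t target) {
    return static_cast<int64_t>(target - (insn_addr + insn_size));
}

// True if the displacement fits in a signed field of nbytes bytes.
bool fits_relative(int64_t displacement, size_t nbytes) {
    assert(nbytes >= 1 && nbytes <= 8);
    if (nbytes == 8) return true;
    const int64_t limit = int64_t(1) << (8 * nbytes - 1);
    return displacement >= -limit && displacement < limit;
}
```

A two-byte jump that targets its own address has a displacement of -2, which fits in the one-byte rel8 form.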
| static MemoryReferencePattern AssemblerX86::parse_memref | ( | SgAsmInstruction * | insn, | |
| SgAsmMemoryReferenceExpression * | expr, | |||
| SgAsmx86RegisterReferenceExpression ** | base_reg, | |||
| SgAsmx86RegisterReferenceExpression ** | index_reg, | |||
| SgAsmValueExpression ** | scale, | |||
| SgAsmValueExpression ** | displacement | |||
| ) | [static, private] |
Parses memory reference expressions and returns the address BASE_REG + (INDEX_REG * SCALE) + DISPLACEMENT, where BASE_REG and INDEX_REG are optional register reference expressions and SCALE and DISPLACEMENT are optional value expressions.
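The relationship between the optional components and the MemoryReferencePattern values can be sketched as follows. The enumeration is copied from this page; `classify` and `effective_address` are hypothetical helpers, and mapping "no components present" to mrp_unknown is an assumption.

```cpp
#include <cstdint>

enum MemoryReferencePattern {
    mrp_unknown, mrp_disp, mrp_index, mrp_index_disp,
    mrp_base, mrp_base_disp, mrp_base_index, mrp_base_index_disp
};

// Pick the pattern from which optional parts are present.
MemoryReferencePattern classify(bool has_base, bool has_index, bool has_disp) {
    static const MemoryReferencePattern table[8] = {
        mrp_unknown,    mrp_disp,      mrp_index,      mrp_index_disp,
        mrp_base,       mrp_base_disp, mrp_base_index, mrp_base_index_disp
    };
    return table[(has_base ? 4 : 0) | (has_index ? 2 : 0) | (has_disp ? 1 : 0)];
}

// BASE + (INDEX * SCALE) + DISPLACEMENT; absent parts are passed as zero.
uint64_t effective_address(uint64_t base, uint64_t index, uint64_t scale, int64_t disp) {
    return base + index * scale + static_cast<uint64_t>(disp);
}
```

For example, an operand with a base register and a displacement but no index classifies as mrp_base_disp.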
| uint8_t AssemblerX86::build_modrm | ( | const InsnDefn * | , | |
| SgAsmx86Instruction * | , | |||
| size_t | argno, | |||
| uint8_t * | sib, | |||
| int64_t * | displacement, | |||
| uint8_t * | rex | |||
| ) | const [private] |
Builds the ModR/M byte and SIB byte.
Also adjusts the REX prefix byte and returns any displacement value.
| void AssemblerX86::build_modreg | ( | const InsnDefn * | , | |
| SgAsmx86Instruction * | , | |||
| size_t | argno, | |||
| uint8_t * | modrm, | |||
| uint8_t * | rex | |||
| ) | const [private] |
Adjusts the "reg" field of the ModR/M byte and adjusts the REX prefix byte if necessary.
| uint8_t AssemblerX86::segment_override | ( | SgAsmx86Instruction * | ) | [private] |
Calculates the segment override from the instruction operands rather than obtaining it from the p_segmentOverride data member.
Returns zero if no segment override is necessary.
const unsigned AssemblerX86::od_e_mask = 0x00000070 [static, private] |
Indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand.
The reg field contains n, providing an extension to the instruction's opcode. This form is written as "/0", "/1", etc. in the Intel documentation.
const unsigned AssemblerX86::od_e_pres = 0x00000080 [static, private] |
const unsigned AssemblerX86::od_e0 = 0x00000000 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e1 = 0x00000010 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e2 = 0x00000020 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e3 = 0x00000030 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e4 = 0x00000040 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e5 = 0x00000050 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e6 = 0x00000060 | od_e_pres [static, private] |
const unsigned AssemblerX86::od_e7 = 0x00000070 | od_e_pres [static, private] |
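Given the constant values listed above, the opcode extension digit occupies bits 4-6 of the modifier word and od_e_pres marks its presence. A plausible sketch of od_e_val() follows; the body is inferred from the constants, not taken from the ROSE source, and `has_opcode_extension` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Values as listed on this page.
const unsigned od_e_mask = 0x00000070;
const unsigned od_e_pres = 0x00000080;
const unsigned od_e0     = 0x00000000 | od_e_pres;
const unsigned od_e3     = 0x00000030 | od_e_pres;
const unsigned od_e7     = 0x00000070 | od_e_pres;

// Hypothetical helper: does the definition use a "/n" opcode extension?
bool has_opcode_extension(unsigned opcode_mods) {
    return (opcode_mods & od_e_pres) != 0;
}

// Extract n from a "/n" modifier (inferred implementation).
size_t od_e_val(unsigned opcode_mods) {
    assert(has_opcode_extension(opcode_mods));
    return (opcode_mods & od_e_mask) >> 4;
}
```

The separate presence bit is what distinguishes "/0" (od_e0) from a definition with no opcode extension at all.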
const unsigned AssemblerX86::od_rex_pres = 0x00000001 [static, private] |
Indicates the use of a REX prefix that affects operand size or instruction semantics.
The ordering of the REX prefix and other optional/mandatory instruction prefixes is discussed in Chapter 2 of the Intel "Instruction Set Reference, A-M".
const unsigned AssemblerX86::od_rex_mask = 0x00000f00 [static, private] |
const unsigned AssemblerX86::od_rex = 0x00000000 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexb = 0x00000100 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexx = 0x00000200 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexxb = 0x00000300 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexr = 0x00000400 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexrb = 0x00000500 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexrx = 0x00000600 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexrxb = 0x00000700 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexw = 0x00000800 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwb = 0x00000900 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwx = 0x00000a00 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwxb = 0x00000b00 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwr = 0x00000c00 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwrb = 0x00000d00 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwrx = 0x00000e00 | od_rex_pres [static, private] |
const unsigned AssemblerX86::od_rexwrxb = 0x00000f00 | od_rex_pres [static, private] |
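Bits 8-11 of the modifier word line up with the W, R, X and B bits of a REX prefix, whose encoding in the Intel manual is 0100WRXB. A plausible sketch of how od_rex_byte() could derive the prefix is shown below; the body is inferred from the constants, the function is named `rex_byte` to make clear it is not the ROSE code, and returning zero when no REX prefix is requested is an assumption.

```cpp
#include <cstdint>

// Values as listed on this page.
const unsigned od_rex_pres = 0x00000001;
const unsigned od_rex_mask = 0x00000f00;
const unsigned od_rex      = 0x00000000 | od_rex_pres;
const unsigned od_rexb     = 0x00000100 | od_rex_pres;
const unsigned od_rexw     = 0x00000800 | od_rex_pres;
const unsigned od_rexwrxb  = 0x00000f00 | od_rex_pres;

// Build the REX prefix 0100WRXB, or 0 if none is requested (assumed).
uint8_t rex_byte(unsigned opcode_mods) {
    if ((opcode_mods & od_rex_pres) == 0) return 0;
    return static_cast<uint8_t>(0x40 | ((opcode_mods & od_rex_mask) >> 8));
}
```

Under this reading, od_rexw yields 0x48, the familiar REX.W prefix for 64-bit operand size.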
const unsigned AssemblerX86::od_modrm = 0x00000002 [static, private] |
Indicates that the ModR/M byte of the instruction contains a register operand and an r/m operand.
This form is written as "/r" in the Intel documentation.
const unsigned AssemblerX86::od_c_mask = 0x00007000 [static, private] |
A 1-byte (CB), 2-byte (CW), 4-byte (CD), 6-byte (CP), 8-byte (CO), or 10-byte (CT) value follows the opcode.
This value is used to specify a code offset and possibly a new value for the code segment register.
const unsigned AssemblerX86::od_cb = 0x00001000 [static, private] |
const unsigned AssemblerX86::od_cw = 0x00002000 [static, private] |
const unsigned AssemblerX86::od_cd = 0x00003000 [static, private] |
const unsigned AssemblerX86::od_cp = 0x00004000 [static, private] |
const unsigned AssemblerX86::od_co = 0x00005000 [static, private] |
const unsigned AssemblerX86::od_ct = 0x00006000 [static, private] |
const unsigned AssemblerX86::od_i_mask = 0x00070000 [static, private] |
A 1-byte (IB), 2-byte (IW), 4-byte (ID), or 8-byte (IO) little-endian immediate operand to the instruction follows the opcode, ModR/M bytes or scale-indexing bytes.
The opcode determines if the operand is a signed value.
const unsigned AssemblerX86::od_ib = 0x00010000 [static, private] |
const unsigned AssemblerX86::od_iw = 0x00020000 [static, private] |
const unsigned AssemblerX86::od_id = 0x00030000 [static, private] |
const unsigned AssemblerX86::od_io = 0x00040000 [static, private] |
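The immediate-size modifiers and the little-endian layout can be sketched as follows. The constants are those listed above; `immediate_size` and `append_le` are hypothetical helpers, not members of AssemblerX86.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values as listed on this page.
const unsigned od_i_mask = 0x00070000;
const unsigned od_ib     = 0x00010000;
const unsigned od_iw     = 0x00020000;
const unsigned od_id     = 0x00030000;
const unsigned od_io     = 0x00040000;

// Number of immediate bytes requested by the modifier word (0 if none).
size_t immediate_size(unsigned opcode_mods) {
    switch (opcode_mods & od_i_mask) {
        case od_ib: return 1;
        case od_iw: return 2;
        case od_id: return 4;
        case od_io: return 8;
        default:    return 0;
    }
}

// Append the low nbytes of value, least significant byte first.
void append_le(std::vector<uint8_t>& out, uint64_t value, size_t nbytes) {
    assert(nbytes <= 8);
    for (size_t i = 0; i < nbytes; ++i)
        out.push_back(static_cast<uint8_t>(value >> (8 * i)));
}
```

A doubleword immediate of 0x12345678 is therefore emitted as the bytes 78 56 34 12.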
const unsigned AssemblerX86::od_r_mask = 0x00700000 [static, private] |
A register code, from 0 through 7, added to a byte of the opcode.
This form is written as "+rb", "+rw", "+rd", or "+ro" in the Intel documentation.
const unsigned AssemblerX86::od_rb = 0x00100000 [static, private] |
const unsigned AssemblerX86::od_rw = 0x00200000 [static, private] |
const unsigned AssemblerX86::od_rd = 0x00300000 [static, private] |
const unsigned AssemblerX86::od_ro = 0x00400000 [static, private] |
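The "+r" forms can be illustrated with a short sketch. The helpers are hypothetical; the opcodes used in the note below (50+rd for PUSH, B8+rd for MOV with a doubleword immediate) come from the Intel manual. Registers 8 through 15 need the REX.B bit to carry the high bit of the register number, which the sketch reports separately.

```cpp
#include <cassert>
#include <cstdint>

// Low three bits go into the opcode; the fourth bit goes into REX.B.
struct RegCode {
    unsigned low3;
    bool     needs_rex_b;
};

RegCode split_register(unsigned regnum) {
    assert(regnum < 16);
    return RegCode{regnum & 7, regnum >= 8};
}

// Add a register code (0 through 7) to the base opcode byte.
uint8_t opcode_plus_reg(uint8_t base_opcode, unsigned low3) {
    assert(low3 < 8);
    assert((base_opcode & 7) == 0);  // holds for base opcodes such as 50 and B8
    return static_cast<uint8_t>(base_opcode + low3);
}
```

PUSH EBX (register code 3) becomes 0x53, and MOV ECX, imm32 (register code 1) starts with 0xB9.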
const unsigned AssemblerX86::od_i = 0x00000004 [static, private] |
A number used in floating-point instructions when one of the operands is ST(i) from the FPU register stack.
The number i (which can range from 0 to 7) is added to the opcode byte to form a single opcode byte. This form is written as "+i" in the Intel documentation.
const unsigned AssemblerX86::COMPAT_LEGACY = 0x01 [static, private] |
These bits define the compatibility of an instruction with 32- and 64-bit modes. Definition is compatible with non-64-bit architectures.
const unsigned AssemblerX86::COMPAT_64 = 0x02 [static, private] |
Definition is compatible with 64-bit architectures.
InsnDictionary AssemblerX86::defns [static, private] |
Instruction assembly definitions organized by X86InstructionKind.
bool AssemblerX86::honor_operand_types [private] |
If true, operand types rather than values determine assembled form.