ROSE  0.11.145.0
Rose::BinaryAnalysis::CodeInserter Class Reference

Description

Insert new code in place of existing instructions.

Definition at line 21 of file CodeInserter.h.

#include <Rose/BinaryAnalysis/CodeInserter.h>

Classes

struct  InstructionInfo
 Information about an instruction within the basic block being modified. More...
 
struct  Relocation
 Relocation record. More...
 

Public Types

enum  AggregationDirection {
  AGGREGATE_PREDECESSORS = 0x00000001,
  AGGREGATE_SUCCESSORS = 0x00000002
}
 What other instructions can be moved to make room. More...
 
enum  NopPadding {
  PAD_NOP_BACK,
  PAD_NOP_FRONT,
  PAD_RANDOM_BACK
}
 How to pad with no-ops. More...
 
enum  RelocType {
  RELOC_INDEX_ABS_LE32,
  RELOC_INDEX_ABS_LE32HI,
  RELOC_INDEX_ABS_BE32,
  RELOC_ADDR_REL_LE32,
  RELOC_ADDR_REL_BE32,
  RELOC_INSN_ABS_LE32,
  RELOC_INSN_REL_LE32,
  RELOC_INSN_REL_BE32
}
 Type of relocation to perform. More...
 
typedef Sawyer::Container::Map< int, InstructionInfo > InstructionInfoMap
 Information about instructions within the basic block being modified. More...
 

Public Member Functions

 CodeInserter (const Partitioner2::PartitionerConstPtr &)
 
const AddressIntervalSet & allocatedChunks () const
 Returns the parts of the virtual address space that were allocated for new instructions. More...
 
virtual bool replaceBlockInsns (const Partitioner2::BasicBlockPtr &, size_t startIdx, size_t nInsns, std::vector< uint8_t > replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Replace instructions in basic block. More...
 
bool replaceInsnsAtFront (const Partitioner2::BasicBlockPtr &, size_t nInsns, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Replace instructions at front of basic block. More...
 
virtual bool replaceInsnsAtBack (const Partitioner2::BasicBlockPtr &, size_t nInsns, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Replace instructions at back of basic block. More...
 
virtual bool prependInsns (const Partitioner2::BasicBlockPtr &, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Prepend code to a basic block. More...
 
virtual bool appendInsns (const Partitioner2::BasicBlockPtr &, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Append code to a basic block. More...
 
virtual bool replaceInsns (const std::vector< SgAsmInstruction * > &toReplace, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations=std::vector< Relocation >())
 Replace exactly the specified instructions with some other encoding. More...
 
virtual void fillWithNops (const AddressIntervalSet &where)
 Fill the specified memory with no-op instructions. More...
 
virtual void fillWithRandom (const AddressIntervalSet &where)
 Fill the specified memory with random data. More...
 
virtual std::vector< uint8_t > encodeJump (rose_addr_t srcVa, rose_addr_t tgtVa)
 Encode an unconditional branch. More...
 
virtual std::vector< uint8_t > applyRelocations (rose_addr_t startVa, std::vector< uint8_t > replacement, const std::vector< Relocation > &relocations, size_t relocStart, const InstructionInfoMap &insnInfoMap)
 Apply relocations to create a new encoding. More...
 
virtual AddressInterval allocateMemory (size_t nBytes, rose_addr_t jmpTargetVa, Commit::Boolean commit=Commit::YES)
 Allocate virtual memory in the partitioner memory map. More...
 
void commitAllocation (const AddressInterval &where, Commit::Boolean commit=Commit::YES)
 Commit previous allocation. More...
 
AddressIntervalSet instructionLocations (const std::vector< SgAsmInstruction * > &)
 Given a list of instructions, return all addresses that the instructions occupy. More...
 
virtual bool replaceByOverwrite (const AddressIntervalSet &toReplaceVas, const AddressInterval &entryInterval, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations, size_t relocStart, const InstructionInfoMap &insnInfoMap)
 Insert new code by overwriting existing instructions. More...
 
virtual bool replaceByTransfer (const AddressIntervalSet &toReplaceVas, const AddressInterval &entryInterval, const std::vector< SgAsmInstruction * > &toReplace, const std::vector< uint8_t > &replacement, const std::vector< Relocation > &relocations, size_t relocStart, const InstructionInfoMap &insnInfoMap)
 Insert new code in allocated area. More...
 
InstructionInfoMap computeInstructionInfoMap (const Partitioner2::BasicBlockPtr &, size_t startIdx, size_t nDeleted)
 Obtain info about instructions for the basic block being modified. More...
 
const AddressInterval & chunkAllocationRegion () const
 Property: Where chunks are allocated. More...
 
void chunkAllocationRegion (const AddressInterval &i)
 Property: Where chunks are allocated. More...
 
size_t minChunkAllocationSize () const
 Property: Minimum size of allocated chunks. More...
 
void minChunkAllocationSize (size_t n)
 Property: Minimum size of allocated chunks. More...
 
size_t chunkAllocationAlignment () const
 Property: Alignment for large allocated chunks.
 
void chunkAllocationAlignment (size_t n)
 Property: Alignment for large allocated chunks.
 
const std::string & chunkAllocationName () const
 Property: Name for newly allocated regions of memory.
 
void chunkAllocationName (const std::string &s)
 Property: Name for newly allocated regions of memory.
 
unsigned aggregationDirection () const
 Property: Whether additional instructions can be moved. More...
 
void aggregationDirection (unsigned d)
 Property: Whether additional instructions can be moved. More...
 
NopPadding nopPadding () const
 Property: Where to add no-ops when padding. More...
 
void nopPadding (NopPadding p)
 Property: Where to add no-ops when padding. More...
 

Static Public Member Functions

static void initDiagnostics ()
 Initialize diagnostic streams. More...
 

Static Public Attributes

static Diagnostics::Facility mlog
 Facility for emitting diagnostics. More...
 

Protected Attributes

Partitioner2::PartitionerConstPtr partitioner_
 
AddressInterval chunkAllocationRegion_
 
size_t minChunkAllocationSize_
 
size_t chunkAllocationAlignment_
 
std::string chunkAllocationName_
 
AddressIntervalSet allocatedChunks_
 
AddressIntervalSet freeSpace_
 
unsigned aggregationDirection_
 
NopPadding nopPadding_
 

Member Typedef Documentation

Information about instructions within the basic block being modified.

The instructions are numbered relative to the insertion point and the deleted instructions. Negative keys refer to instructions that appear before the insertion point, and non-negative keys refer to instructions starting one past the last deleted instruction or, if no instructions are deleted, the instruction originally at the insertion point.

Definition at line 141 of file CodeInserter.h.
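
The key numbering described above can be modeled without ROSE. The following is a minimal sketch, not the library's code: relativeKeys is a hypothetical helper, and it assumes that key -1 denotes the instruction immediately before the insertion point, which the text implies but does not state outright.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper (not part of ROSE): maps each relative key to the index
// of the surviving instruction within the original basic block.
std::map<int, std::size_t> relativeKeys(std::size_t nBlockInsns, std::size_t startIdx,
                                        std::size_t nDeleted) {
    assert(startIdx + nDeleted <= nBlockInsns);
    std::map<int, std::size_t> keys;
    // Instructions before the insertion point: -1 is the nearest one (assumption).
    for (std::size_t i = 0; i < startIdx; ++i)
        keys[-static_cast<int>(startIdx - i)] = i;
    // Instructions after the deleted range: 0 is one past the last deleted instruction.
    for (std::size_t i = startIdx + nDeleted; i < nBlockInsns; ++i)
        keys[static_cast<int>(i - (startIdx + nDeleted))] = i;
    return keys;
}
```

For a six-instruction block with startIdx 2 and nDeleted 2, this model yields keys -2, -1, 0, and 1 for original instructions 0, 1, 4, and 5.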

Member Enumeration Documentation

What other instructions can be moved to make room.

These are bit flags.

Enumerator
AGGREGATE_PREDECESSORS 

Move preceding instructions in CFG.

AGGREGATE_SUCCESSORS 

Move succeeding instructions in CFG.

Definition at line 24 of file CodeInserter.h.
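
Because these are bit flags, they are combined with bitwise OR and tested with bitwise AND. The sketch below redeclares the enumerators locally so it compiles without ROSE; the two query helpers are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Local copy of the enumerator values listed above (not the ROSE header).
enum AggregationDirection {
    AGGREGATE_PREDECESSORS = 0x00000001,
    AGGREGATE_SUCCESSORS   = 0x00000002
};

// Hypothetical helpers: test individual bits of a direction bit vector.
inline bool mayMovePredecessors(unsigned d) { return (d & AGGREGATE_PREDECESSORS) != 0; }
inline bool mayMoveSuccessors(unsigned d)   { return (d & AGGREGATE_SUCCESSORS) != 0; }
```

A value such as AGGREGATE_PREDECESSORS | AGGREGATE_SUCCESSORS is the kind of bit vector the aggregationDirection property accepts as its unsigned argument.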

How to pad with no-ops.

Enumerator
PAD_NOP_BACK 

Add no-ops to the end of replacements.

PAD_NOP_FRONT 

Add no-ops to the front of replacements.

PAD_RANDOM_BACK 

Add random data to the end of replacements.

Definition at line 30 of file CodeInserter.h.
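
The three padding modes can be illustrated with a standalone model. This is a sketch under stated assumptions, not ROSE's implementation: padReplacement is a hypothetical function, and the no-op is assumed to be a single byte (0x90, the x86 NOP), whereas a real architecture may need multi-byte no-ops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Local copy of the enum so the sketch compiles without ROSE.
enum NopPadding { PAD_NOP_BACK, PAD_NOP_FRONT, PAD_RANDOM_BACK };

// Hypothetical helper: grow `code` to exactly `total` bytes according to `mode`.
std::vector<uint8_t> padReplacement(std::vector<uint8_t> code, std::size_t total,
                                    NopPadding mode, uint8_t nop = 0x90) {
    assert(code.size() <= total);
    const std::size_t nPad = total - code.size();
    switch (mode) {
        case PAD_NOP_BACK:
            code.insert(code.end(), nPad, nop);      // no-ops after the new code
            break;
        case PAD_NOP_FRONT:
            code.insert(code.begin(), nPad, nop);    // no-ops before the new code
            break;
        case PAD_RANDOM_BACK: {
            std::mt19937 gen(std::random_device{}());
            std::uniform_int_distribution<int> byte(0, 255);
            for (std::size_t i = 0; i < nPad; ++i)   // random bytes after the new code
                code.push_back(static_cast<uint8_t>(byte(gen)));
            break;
        }
    }
    return code;
}
```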

Type of relocation to perform.

Each enum constant identifies a function whose description is given below. In those descriptions, the following variables are used:

  • input is a vector of bytes that was specified by the user as the new code to be inserted.
  • reloc_value is the value data member of the relocation record. It has various interpretations depending on the relocation function.
  • addend is the value currently stored at the destination bytes of the input, interpreted as a signed value in the byte order and width specified by the function name.

The last word of the function name specifies the format used to write the computed value back to the output:

  • LE32 writes the low-order 32 bits of the computed value as a 32-bit integer in little-endian order.
  • LE32HI writes the high-order 32 bits of the computed value as a 32-bit integer in little-endian order.
Enumerator
RELOC_INDEX_ABS_LE32 

Interprets the reloc_value as an index of some byte in the input, and computes that byte's virtual address.

RELOC_INDEX_ABS_LE32HI 

Interprets the reloc_value as an index of some byte in the input, and computes that byte's virtual address.

RELOC_INDEX_ABS_BE32 

Interprets the reloc_value as an index of some byte in the input, and computes that byte's virtual address.

RELOC_ADDR_REL_LE32 

Interprets the reloc_value as a virtual address and computes the offset from the output virtual address to that specified virtual address, adjusted with the addend.

RELOC_ADDR_REL_BE32 

Interprets the reloc_value as a virtual address and computes the offset from the output virtual address to that specified virtual address, adjusted with the addend.

RELOC_INSN_ABS_LE32 

Interprets the reloc_value as an instruction relative index for some instruction of the original basic block.

Negative indexes are measured backward from the insertion point, and non-negative indexes are measured forward from one past the last deleted instruction (or insertion point if no deletions). This relocation function calculates the address of the specified instruction. This accounts for cases when the referenced instruction has been moved.

RELOC_INSN_REL_LE32 

Interprets the reloc_value as an instruction relative index for some instruction of the original basic block.

Negative indexes are measured backward from the insertion point, and non-negative indexes are measured forward from one past the last deleted instruction (or insertion point if no deletions). This relocation function calculates the offset from the output virtual address to that instruction's virtual address, adjusted with the addend. This accounts for cases when the referenced instruction has been moved.

RELOC_INSN_REL_BE32 

Interprets the reloc_value as an instruction relative index for some instruction of the original basic block.

Negative indexes are measured backward from the insertion point, and non-negative indexes are measured forward from one past the last deleted instruction (or insertion point if no deletions). This relocation function calculates the offset from the output virtual address to that instruction's virtual address, adjusted with the addend. This accounts for cases when the referenced instruction has been moved.

Definition at line 54 of file CodeInserter.h.
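
The output formats named by the last word of each relocation function can be sketched as plain byte writers. These helpers are hypothetical and independent of ROSE. The LE32 and LE32HI behavior follows the descriptions above; the BE32 writer assumes, by analogy with LE32, that the low-order 32 bits are written in big-endian order, which this page does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Write the low-order 32 bits of `value` at `at` in little-endian order.
void writeLE32(std::vector<uint8_t> &out, std::size_t at, uint64_t value) {
    assert(at + 4 <= out.size());
    for (int i = 0; i < 4; ++i)
        out[at + i] = static_cast<uint8_t>(value >> (8 * i));
}

// Write the high-order 32 bits of `value` at `at` in little-endian order.
void writeLE32HI(std::vector<uint8_t> &out, std::size_t at, uint64_t value) {
    writeLE32(out, at, value >> 32);
}

// Assumption: BE32 writes the low-order 32 bits in big-endian order.
void writeBE32(std::vector<uint8_t> &out, std::size_t at, uint64_t value) {
    assert(at + 4 <= out.size());
    for (int i = 0; i < 4; ++i)
        out[at + i] = static_cast<uint8_t>(value >> (8 * (3 - i)));
}
```

Splitting a 64-bit address across an LE32 and an LE32HI relocation is one plausible use of the pair, since together they cover all 64 bits of the computed value.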

Member Function Documentation

static void Rose::BinaryAnalysis::CodeInserter::initDiagnostics ( )
static

Initialize diagnostic streams.

This is called automatically by Rose::Diagnostics::initialize.

const AddressInterval& Rose::BinaryAnalysis::CodeInserter::chunkAllocationRegion ( ) const
inline

Property: Where chunks are allocated.

This region defines the part of the memory map where new chunks are allocated in order to hold replacement code that doesn't fit into the same space as the instructions it is replacing. The default is the part of the address space immediately after the last mapped address in the partitioner passed to the constructor.

Definition at line 174 of file CodeInserter.h.

void Rose::BinaryAnalysis::CodeInserter::chunkAllocationRegion ( const AddressInterval & i)
inline

Property: Where chunks are allocated.

This region defines the part of the memory map where new chunks are allocated in order to hold replacement code that doesn't fit into the same space as the instructions it is replacing. The default is the part of the address space immediately after the last mapped address in the partitioner passed to the constructor.

Definition at line 175 of file CodeInserter.h.

const AddressIntervalSet& Rose::BinaryAnalysis::CodeInserter::allocatedChunks ( ) const
inline

Returns the parts of the virtual address space that were allocated for new instructions.

The returned value will be a subset of the chunkAllocationRegion. The return value indicates where large chunks of memory were allocated, but not what bytes within that memory were actually used for new instructions.

Definition at line 181 of file CodeInserter.h.

size_t Rose::BinaryAnalysis::CodeInserter::minChunkAllocationSize ( ) const
inline

Property: Minimum size of allocated chunks.

When allocating space for replacement code, never allocate less than this many bytes at a time. Note that multiple replacement codes can be written to a single chunk since we maintain a free list within chunks.

Definition at line 189 of file CodeInserter.h.

void Rose::BinaryAnalysis::CodeInserter::minChunkAllocationSize ( size_t  n)
inline

Property: Minimum size of allocated chunks.

When allocating space for replacement code, never allocate less than this many bytes at a time. Note that multiple replacement codes can be written to a single chunk since we maintain a free list within chunks.

Definition at line 190 of file CodeInserter.h.

unsigned Rose::BinaryAnalysis::CodeInserter::aggregationDirection ( ) const
inline

Property: Whether additional instructions can be moved.

This property controls which additional instructions can be moved by the replaceBlockInsns method in order to make room for the replacement. It is a bit vector of AggregationDirection bits and defaults to both successors and predecessors. When both are present, successors are added first (all the way to the end of the block) and then predecessors are also added (all the way to the beginning of the block).

Definition at line 215 of file CodeInserter.h.

void Rose::BinaryAnalysis::CodeInserter::aggregationDirection ( unsigned  d)
inline

Property: Whether additional instructions can be moved.

This property controls which additional instructions can be moved by the replaceBlockInsns method in order to make room for the replacement. It is a bit vector of AggregationDirection bits and defaults to both successors and predecessors. When both are present, successors are added first (all the way to the end of the block) and then predecessors are also added (all the way to the beginning of the block).

Definition at line 216 of file CodeInserter.h.

NopPadding Rose::BinaryAnalysis::CodeInserter::nopPadding ( ) const
inline

Property: Where to add no-ops when padding.

When to-be-replaced instructions are overwritten with a replacement and the replacement is smaller than the replaced instructions, then the replacement is padded with no-op instructions according to this property.

Definition at line 225 of file CodeInserter.h.

void Rose::BinaryAnalysis::CodeInserter::nopPadding ( NopPadding  p)
inline

Property: Where to add no-ops when padding.

When to-be-replaced instructions are overwritten with a replacement and the replacement is smaller than the replaced instructions, then the replacement is padded with no-op instructions according to this property.

Definition at line 226 of file CodeInserter.h.

virtual bool Rose::BinaryAnalysis::CodeInserter::replaceBlockInsns ( const Partitioner2::BasicBlockPtr & ,
size_t  startIdx,
size_t  nInsns,
std::vector< uint8_t >  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)
virtual

Replace instructions in basic block.

This replaces nInsns instructions in the basic block starting at instruction number startIdx. The nInsns may be zero, in which case the replacement is inserted before the startIdx instruction. The new code is inserted either by overwriting the to-be-replaced instructions with the replacement padded at the end by no-ops if necessary (so called "overwrite" mode), or the replacement is written to some other part of the address space and unconditional branches are inserted to branch to the replacement and then back again (so called "branch-aside" mode).

If neither the replacement (in overwrite mode) nor the unconditional branch (in branch-aside mode) fits in the area vacated by the to-be-replaced instructions, then the set of to-be-replaced instructions is extended by moving a neighboring instruction into the replacement. The aggregationDirection property controls which instructions can be joined. This often works for branch-aside mode, but can even sometimes work for overwrite mode if the basic block instructions are not executed in address order. The overwrite situation can work when a subsequent or earlier instruction fills in a hole in the to-be-replaced address set.

This method assumes that the replacement is entered at the first byte and exits to one past the last byte. Since some instruction encodings depend on the location of the replacement in virtual memory, the relocations can be used to patch the replacement as it's written to memory.

Inserting code in this manner is not without risk. For instance, enlarging the to-be-replaced set might mean that additional instructions are moved to a different address without changing their encoding. Examples are moving instructions that reference global variables relative to the instruction's address, branches that span the branch-aside gap, etc.

Returns true if successful, false otherwise.
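
The choice between the two insertion modes can be summarized as a size comparison. The sketch below is a simplified model and not the library's algorithm: chooseMode and InsertMode are hypothetical names, and trying overwrite mode before branch-aside mode is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this model.
enum class InsertMode { OVERWRITE, BRANCH_ASIDE, NEED_MORE_ROOM };

// `vacated` is the number of bytes freed by the to-be-replaced instructions,
// `jumpSize` the size of the unconditional branch needed for branch-aside mode.
InsertMode chooseMode(std::size_t vacated, std::size_t replacementSize, std::size_t jumpSize) {
    if (replacementSize <= vacated)
        return InsertMode::OVERWRITE;       // replacement fits in place, pad the rest
    if (jumpSize <= vacated)
        return InsertMode::BRANCH_ASIDE;    // only the jump needs to fit in place
    return InsertMode::NEED_MORE_ROOM;      // a neighboring instruction must be aggregated
}
```

In this model, NEED_MORE_ROOM corresponds to the case where the to-be-replaced set is enlarged according to the aggregationDirection property and the comparison is retried.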

bool Rose::BinaryAnalysis::CodeInserter::replaceInsnsAtFront ( const Partitioner2::BasicBlockPtr & ,
size_t  nInsns,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)

Replace instructions at front of basic block.

This is just a convenience for replaceBlockInsns that replaces nInsns instructions at the beginning of the specified basic block. If nInsns is zero, then the replacement is inserted at the front of the basic block without removing any instructions.

virtual bool Rose::BinaryAnalysis::CodeInserter::replaceInsnsAtBack ( const Partitioner2::BasicBlockPtr & ,
size_t  nInsns,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)
virtual

Replace instructions at back of basic block.

This is just a convenience for replaceBlockInsns that replaces nInsns instructions at the end of the specified basic block. If nInsns is zero, then the replacement is appended to the end of the basic block without removing any instructions.

virtual bool Rose::BinaryAnalysis::CodeInserter::prependInsns ( const Partitioner2::BasicBlockPtr & ,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)
virtual

Prepend code to a basic block.

This is a convenience for replaceInsnsAtFront. It inserts the replacement at the front of the basic block by writing the replacement followed by the first instruction(s) of the block to some other area of memory, overwriting the first part of the basic block with an unconditional branch to the replacement, and following the replacement with an unconditional branch back to the rest of the basic block.

virtual bool Rose::BinaryAnalysis::CodeInserter::appendInsns ( const Partitioner2::BasicBlockPtr & ,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)
virtual

Append code to a basic block.

This is a convenience for replaceInsnsAtBack. It appends the replacement to the end of the basic block by moving the last instruction(s) of the block to some other memory followed by the replacement. The original final instructions are overwritten with an unconditional branch to that other memory, which is followed by a branch back to the rest of the basic block.

virtual bool Rose::BinaryAnalysis::CodeInserter::replaceInsns ( const std::vector< SgAsmInstruction * > &  toReplace,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations = std::vector< Relocation >() 
)
virtual

Replace exactly the specified instructions with some other encoding.

The replacement instructions either overwrite the toReplace instructions or the replacement is written to a newly allocated area and unconditional branches connect it to the main control flow. The assumption is that control flow enters at the beginning of toReplace and the replacement will exit to the first address after the last instruction in toReplace. Likewise, control enters at the beginning of replacement and exits to the first address after the end of the replacement.

If relocations are specified, then parts of the replacement are rewritten based on its final address. Relocation records that refer to instructions rather than bytes are not permitted since this function doesn't have access to the basic block in which the replacement is occurring.

Returns true if the replacement could be inserted, false otherwise. The only time this returns false is when the addresses of the original instructions starting with the first instruction do not occupy a contiguous region of memory large enough to hold either the replacement or a jump to the relocated replacement. This algorithm correctly handles the general case when the toReplace instructions are not in address order or are not contiguous.

virtual void Rose::BinaryAnalysis::CodeInserter::fillWithNops ( const AddressIntervalSet & where)
virtual

Fill the specified memory with no-op instructions.

virtual void Rose::BinaryAnalysis::CodeInserter::fillWithRandom ( const AddressIntervalSet & where)
virtual

Fill the specified memory with random data.

virtual std::vector<uint8_t> Rose::BinaryAnalysis::CodeInserter::encodeJump ( rose_addr_t  srcVa,
rose_addr_t  tgtVa 
)
virtual

Encode an unconditional branch.

This encodes an unconditional branch instruction at srcVa that causes control to flow to tgtVa. The caller should not assume that a particular size encoding will be returned. E.g., on x86, jumps to targets that are further away require more bytes to encode than jumps to nearby targets.
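
The x86 remark can be made concrete with the two common relative jump encodings: the 2-byte short form (opcode 0xEB with an 8-bit displacement) and the 5-byte near form (opcode 0xE9 with a 32-bit displacement). The function below is an illustrative standalone encoder, not ROSE's encodeJump; its name and its choice to prefer the short form are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative x86 model: encode "jmp tgtVa" located at srcVa.
// Displacements are relative to the address of the instruction that follows the jump.
std::vector<uint8_t> encodeJumpX86(uint64_t srcVa, uint64_t tgtVa) {
    const int64_t rel8 = static_cast<int64_t>(tgtVa) - static_cast<int64_t>(srcVa + 2);
    if (rel8 >= -128 && rel8 <= 127)
        return {0xEB, static_cast<uint8_t>(rel8)};          // short jump, 2 bytes

    const int64_t rel32 = static_cast<int64_t>(tgtVa) - static_cast<int64_t>(srcVa + 5);
    assert(rel32 >= INT32_MIN && rel32 <= INT32_MAX);       // out of range for this model
    const uint32_t d = static_cast<uint32_t>(rel32);
    return {0xE9, static_cast<uint8_t>(d), static_cast<uint8_t>(d >> 8),
            static_cast<uint8_t>(d >> 16), static_cast<uint8_t>(d >> 24)};  // near jump, 5 bytes
}
```

The two return paths produce different lengths for the same source address, which is why a caller should measure the returned vector rather than reserve a fixed number of bytes.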

virtual std::vector<uint8_t> Rose::BinaryAnalysis::CodeInserter::applyRelocations ( rose_addr_t  startVa,
std::vector< uint8_t >  replacement,
const std::vector< Relocation > &  relocations,
size_t  relocStart,
const InstructionInfoMap &  insnInfoMap 
)
virtual

Apply relocations to create a new encoding.

The relocations are applied to the replacement bytes which are assumed to be mapped in virtual memory starting at startVa. The relocStart is a byte offset for all the relocations; i.e., the actual offset in the replacement where the relocation is applied is the relocation's offset plus the relocStart value.
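
The role of relocStart can be seen in a standalone model of one address-relative, little-endian relocation. This is a sketch under assumptions, not the ROSE implementation: Reloc and applyAddrRelLE32 are hypothetical names, and the "output virtual address" is assumed to be the address of the four destination bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a relocation record: a byte offset and a value.
struct Reloc {
    std::size_t offset;  // offset of the destination bytes, before adding relocStart
    uint64_t value;      // here interpreted as a target virtual address
};

std::vector<uint8_t> applyAddrRelLE32(uint64_t startVa, std::vector<uint8_t> bytes,
                                      const Reloc &r, std::size_t relocStart) {
    const std::size_t at = r.offset + relocStart;   // actual offset in the replacement
    assert(at + 4 <= bytes.size());

    // The addend is the signed 32-bit little-endian value already stored there.
    uint32_t raw = 0;
    for (int i = 0; i < 4; ++i)
        raw |= static_cast<uint32_t>(bytes[at + i]) << (8 * i);
    const int64_t addend = static_cast<int32_t>(raw);

    // Offset from the destination bytes to the target, adjusted with the addend.
    const uint64_t outputVa = startVa + at;         // assumption, see lead-in
    const uint64_t computed = r.value - outputVa + static_cast<uint64_t>(addend);
    for (int i = 0; i < 4; ++i)
        bytes[at + i] = static_cast<uint8_t>(computed >> (8 * i));
    return bytes;
}
```

With an x86 call template E8 FC FF FF FF (addend -4) placed at 0x1000 and a target of 0x2000, the model rewrites the operand to FB 0F 00 00, the displacement measured from the end of the 5-byte instruction.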

virtual AddressInterval Rose::BinaryAnalysis::CodeInserter::allocateMemory ( size_t  nBytes,
rose_addr_t  jmpTargetVa,
Commit::Boolean  commit = Commit::YES 
)
virtual

Allocate virtual memory in the partitioner memory map.

The second argument is the target address of an unconditional jump that will be added to the end of the allocated memory but which is not included in the nBytes argument (it is however included in the return value).

If the third argument is yes, then memory is actually allocated and removed from the free list. If no, then all steps are completed except removing it from the free list. The commitAllocation function can be called later to remove it from the free list. If you don't remove it from the free list, a subsequent allocation request might return the same addresses.
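
The interaction between allocation and committing can be modeled with a small free list. The class below is a hypothetical sketch that uses first-fit allocation; it ignores the trailing jump, the minimum chunk size, and alignment, and it is not how ROSE implements allocateMemory.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct Interval { uint64_t least; uint64_t size; };   // size 0 means "nothing found"

class ChunkModel {
    std::map<uint64_t, uint64_t> free_;                // start address -> byte count
public:
    ChunkModel(uint64_t start, uint64_t size) { free_[start] = size; }

    // Find space for nBytes; remove it from the free list only if doCommit is true.
    Interval allocate(uint64_t nBytes, bool doCommit = true) {
        Interval found{0, 0};
        for (const auto &node : free_) {
            if (node.second >= nBytes) {
                found = Interval{node.first, nBytes};
                break;
            }
        }
        if (found.size != 0 && doCommit)
            commit(found);
        return found;
    }

    // Remove a previously returned interval from the free list.
    void commit(const Interval &where) {
        auto it = free_.upper_bound(where.least);
        assert(it != free_.begin());
        --it;
        const uint64_t start = it->first, size = it->second;
        assert(where.least >= start && where.least + where.size <= start + size);
        free_.erase(it);
        if (where.least > start)
            free_[start] = where.least - start;        // space left before the interval
        const uint64_t end = where.least + where.size;
        if (end < start + size)
            free_[end] = start + size - end;           // space left after the interval
    }
};
```

Two uncommitted requests in a row return the same addresses in this model, which mirrors the warning above about skipping the commit step.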

void Rose::BinaryAnalysis::CodeInserter::commitAllocation ( const AddressInterval &  where,
Commit::Boolean  commit = Commit::YES 
)

Commit previous allocation.

This commits the allocation returned by allocateMemory by removing it from the free list (if the commit argument is true). We do this as a separate step from the allocation so that we don't have to deallocate in all the error handling locations. Failing to commit an allocation will be easier to spot than failing to release an unused block because the former case causes nonsense disassembly whereas the latter looks like either unreachable code or static data.

AddressIntervalSet Rose::BinaryAnalysis::CodeInserter::instructionLocations ( const std::vector< SgAsmInstruction * > &  )

Given a list of instructions, return all addresses that the instructions occupy.
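
As a rough illustration of the result, the sketch below merges the byte ranges of a list of instructions into disjoint half-open ranges. Insn and occupiedRanges are hypothetical stand-ins for SgAsmInstruction and AddressIntervalSet, used so the example compiles without ROSE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction: starting address and size in bytes.
struct Insn { uint64_t va; std::size_t size; };

// Returns sorted, disjoint half-open ranges [begin, end) covering every instruction byte.
std::vector<std::pair<uint64_t, uint64_t>> occupiedRanges(std::vector<Insn> insns) {
    std::sort(insns.begin(), insns.end(),
              [](const Insn &a, const Insn &b) { return a.va < b.va; });
    std::vector<std::pair<uint64_t, uint64_t>> ranges;
    for (const Insn &insn : insns) {
        const uint64_t begin = insn.va, end = insn.va + insn.size;
        if (!ranges.empty() && begin <= ranges.back().second)
            ranges.back().second = std::max(ranges.back().second, end);  // extend last range
        else
            ranges.emplace_back(begin, end);                             // start a new range
    }
    return ranges;
}
```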

virtual bool Rose::BinaryAnalysis::CodeInserter::replaceByOverwrite ( const AddressIntervalSet &  toReplaceVas,
const AddressInterval &  entryInterval,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations,
size_t  relocStart,
const InstructionInfoMap &  insnInfoMap 
)
virtual

Insert new code by overwriting existing instructions.

The toReplaceVas are the addresses occupied by the to-be-replaced instructions. Since the to-be-replaced instructions are not necessarily in address order or contiguous, the entryInterval describes the largest contiguous subset of toReplaceVas starting at the entry address. Since the replacement is assumed to be entered at its first byte, the replacement will be written into the entryInterval (if it fits). The replacement is padded if necessary according to the nopPadding property. All other addresses in toReplaceVas are filled with no-op instructions.
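
The relationship between toReplaceVas and entryInterval can be sketched as a walk over consecutive addresses. The helper below is hypothetical: it represents the address set as a std::set of individual byte addresses rather than an AddressIntervalSet, and it returns the interval as an inclusive pair.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

// Returns the inclusive range [entryVa, last] of consecutive addresses that are
// all members of toReplaceVas, starting at the entry address.
std::pair<uint64_t, uint64_t> entryInterval(const std::set<uint64_t> &toReplaceVas,
                                            uint64_t entryVa) {
    assert(toReplaceVas.count(entryVa) == 1);   // the entry address must be replaceable
    uint64_t last = entryVa;
    while (toReplaceVas.count(last + 1) != 0)   // stop at the first hole
        ++last;
    return {entryVa, last};
}
```

If the replacement is longer than this interval, it cannot be written in place by overwriting, even when toReplaceVas as a whole contains enough bytes.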

virtual bool Rose::BinaryAnalysis::CodeInserter::replaceByTransfer ( const AddressIntervalSet &  toReplaceVas,
const AddressInterval &  entryInterval,
const std::vector< SgAsmInstruction * > &  toReplace,
const std::vector< uint8_t > &  replacement,
const std::vector< Relocation > &  relocations,
size_t  relocStart,
const InstructionInfoMap &  insnInfoMap 
)
virtual

Insert new code in allocated area.

This inserts the replacement code in a newly allocated area (by calling allocateMemory). The toReplaceVas are the addresses of all the instruction bytes that are to be replaced. Note that this is all addresses of the instructions, not just the first addresses. The entryInterval is a contiguous subset of toReplaceVas and represents the entry point of toReplaceVas and as many subsequent contiguous addresses as possible. This function writes an unconditional branch in the entryInterval (padding it with no-ops according to nopPadding) that jumps to the replacement code. It appends an unconditional branch to the end of the replacement that jumps to the first address after the end of the toReplaceVas. All other bytes of toReplaceVas are overwritten with no-ops.

InstructionInfoMap Rose::BinaryAnalysis::CodeInserter::computeInstructionInfoMap ( const Partitioner2::BasicBlockPtr & ,
size_t  startIdx,
size_t  nDeleted 
)

Obtain info about instructions for the basic block being modified.

Given a basic block, an insertion point, and the number of instructions that will be deleted starting at that insertion point, return information about the remaining instructions. See documentation for InstructionInfoMap for details about how the instructions are indexed in this map.

Member Data Documentation

Diagnostics::Facility Rose::BinaryAnalysis::CodeInserter::mlog
static

Facility for emitting diagnostics.

Definition at line 156 of file CodeInserter.h.


The documentation for this class was generated from the following file: