Bytecode disassembler

Pavel Holejsovsky pavel.holejsovsky at upek.com
Fri Aug 13 01:03:48 PDT 2004


Hi Slaters,

I wrote the attached bytecode-disassembling stream, mainly for educational 
purposes.  I'd be glad for any comments on the code, both on the 
implementation and on the style of the interface (DisassemblerStream is a 
ReadStream clone working over a CompiledMethod; each element it yields is a 
string holding a single disassembled instruction).

Sample usage:
[] disassembler upToEnd.
[|:a b| b: a - 1. (b < 0) ifTrue: a neg ifFalse: a ] disassembler do:
   [| :instr | Console ; instr ; '\n' ].

or, via the 'hackish' REPL-friendly interface:
disass: [ 1 + 1 ].
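
To make the shape of the interface concrete outside of Slate, here is a minimal sketch in Python (chosen only so it can be run standalone). It mirrors the idea of a stream that yields one string per instruction, splitting each byte into a low-nibble opcode and a high-nibble value as the attached code does. The function name disassemble, the reduced opcode table, and the sample bytes are made up for illustration, and the sketch assumes single-byte instructions only (no extended values, no jump operands).

```python
# Subset of the opcode table from the attachment: low nibble -> name.
OPCODE_NAMES = {
    0x01: "loadVariable",
    0x07: "pop",
    0x0C: "return",
    0x0D: "pushInteger",
}

def disassemble(code):
    """Yield one string per instruction, similar in spirit to the stream's next."""
    for offset, byte in enumerate(code):
        opcode = byte & 0x0F   # low nibble selects the instruction
        value = byte >> 4      # high nibble carries the inline value
        name = OPCODE_NAMES.get(opcode, "unknown")
        yield f"{offset} {name} {value}"

# Hypothetical bytecode, invented for illustration only.
listing = list(disassemble(bytes([0x2D, 0x01, 0x0C])))
```

Because the result is an ordinary iterator, callers can either collect everything at once (the analogue of upToEnd) or loop over it (the analogue of do:).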

I have some questions:
- I'm not sure I understood doc/bytecode.txt properly.  In particular, 
extended value decoding might be completely broken (see 
valueFromOpcode:).  Also, jump-type offset decoding 
(d@(DisassemblerStream traits) value: _ typed: _@#jumpType) assumes 
little-endian byte order, but does the real encoding depend on the 
endianness of the image?

- How can I get the block for an existing Slate method (to see it 
disassembled)?  I mean, if I want to see the disassembly of the method #next 
for the argument {DisassemblerStream traits}, I need some protocol which 
answers the block (closure) for a given method name and array of arguments.  
Is there something like this?
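
On the jump-offset part of the first question, this is the decoding I have in mind, sketched in Python so it can be checked standalone. decode_jump is a hypothetical helper, and both the 16-bit two's-complement width and the low-byte-first order are exactly the assumptions being asked about, not something confirmed by doc/bytecode.txt.

```python
def decode_jump(lo, hi):
    """Combine two operand bytes (low byte first) into a signed 16-bit displacement."""
    raw = lo | (hi << 8)   # assumed little-endian byte order
    if raw >= 0x8000:      # sign bit set: wrap into the negative range
        raw -= 0x10000
    return raw

forward = decode_jump(0x05, 0x00)    # small forward jump
backward = decode_jump(0xFE, 0xFF)   # 0xFFFE read as a signed value
```

If the encoding turns out to follow the image's endianness instead, only the order of the two arguments would change; the sign handling stays the same.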

thanks
pavouk

-------------- next part --------------
"DisassemblerStream stream prototype."
prototypes addPrototype: #DisassemblerStream derivedFrom: {ReadStream}.

"Method being disassembled."
DisassemblerStream addSlot: #method.

"Stream of opcodes from #method."
DisassemblerStream addSlot: #bytecodeStream.

"Mapping of opcode to opcode name and value type tuple."
DisassemblerStream traits addSlot: #opcodeMap valued: Dictionary newEmpty.

"Populate the opcodeMap slot."
{
  16r00 -> (#sendMessage , #messageType).
  16r01 -> (#loadVariable , #variableType).
  16r02 -> (#storeVariable , #variableType).
  16r03 -> (#loadFreeVariable , #freeVariableType).
  16r04 -> (#storeFreeVariable , #freeVariableType).
  16r05 -> (#loadLiteral , #literalType).
  16r06 -> (#loadSelector , #selectorType).
  16r07 -> (#pop , #numberType).
  16r08 -> (#pushArray , #numberType).
  16r09 -> (#newBlock , #numberType).
  16r0a -> (#branchKeyed , #numberType).
  16r0b -> (#sendMessageOptionals , #messageType).
  16r0c -> (#return , #numberType).
  16r0d -> (#pushInteger , #numberType).
  16r0f -> (#jumpTo , #jumpType).
  16r1f -> (#branchIfTrue , #jumpType).
  16r2f -> (#branchIfFalse , #jumpType).
  16r3f -> (#pushEnvironment , #extendedType).
  16r4f -> (#resend , #extendedType).
  16r5f -> (#pushNil , #extendedType).
  16r6f -> (#popEqual , #extendedType).
  16r7f -> (#pushTrue , #extendedType).
  16r8f -> (#pushFalse , #extendedType)
} 
  do: 
    [| :opcodeDef | DisassemblerStream opcodeMap add: opcodeDef ].

DisassemblerStream addImmutableSlot: #extendedCode valued: 16r0f.

"End-user interface, answers disassembling stream of given method."
cm@(CompiledMethod traits) disassembler
[ DisassemblerStream newOn: cm ].

"Proper hooking into ReadStream interface."
d@(DisassemblerStream traits) on: m
[ 
  d method: m. 
  d bytecodeStream: m code reader. 
  d 
].

d@(DisassemblerStream traits) isAtEnd
[ d bytecodeStream isAtEnd ].

"Value printers for specific value types.  If no specific printer is found, print 
nothing."
d@(DisassemblerStream traits) value: _ typed: _ [ '' ].

d@(DisassemblerStream traits) value: v typed: _@#numberType
[ v as: String ].

d@(DisassemblerStream traits) value: v typed: _@#messageType
[ 'args:' ; (v as: String) ].

d@(DisassemblerStream traits) value: v typed: _@#variableType
[
  (v < d method inputVariables) 
    ifTrue:
      [ ':i' ; (v as: String) ]
    ifFalse:
      [ 'l' ; (v - d method inputVariables as: String) ]
].

d@(DisassemblerStream traits) value: v typed: _@#freeVariableType
[ 'lexicalOffset:' ; (v as: String) ; ' fvIndex: ' ; (d bytecodeStream next as: String) ].

d@(DisassemblerStream traits) value: v typed: _@#literalType
[ '\'' ; ((d method literals at: v) as: String) ; '\'' ].

d@(DisassemblerStream traits) value: v typed: _@#selectorType
[ '#' ; (d method selectors at: v) ].

d@(DisassemblerStream traits) value: _ typed: _@#jumpType
[| displacement |
  displacement: (d bytecodeStream next bitOr: (d bytecodeStream next bitShift: 8)).
  "Sign-extend the 16-bit displacement."
  (displacement >= 16r8000) 
    ifTrue: 
      [ displacement: displacement - 16r10000 ].
  'offset:' ; (d bytecodeStream position + displacement as: String)
].

"Decodes value from given opcode and optionally any following codes in bytecode stream."
d@(DisassemblerStream traits) valueFromOpcode: code
[| value |
  value: (code bitShift: -4).
  (value = 16r0f)
    ifTrue:
      [
        "Loop only for extended values: read bytes while the continuation bit (16r80) is set."
        [| nextCode |
          nextCode: d bytecodeStream next.
          value: value + (nextCode bitAnd: 16r7f).
          (nextCode bitAnd: 16r80) = 16r80
        ] whileTrue
      ].
  value
].

"Answers string with disassembled instruction."
d@(DisassemblerStream traits) next
[| offset code value mapping |
  offset: d bytecodeStream position.
  code: d bytecodeStream next.
  value: (d valueFromOpcode: code).

  "Clean the 'value' field from opcode, but only for non-extended ones."
  ((code bitAnd: d extendedCode) = d extendedCode) 
    ifFalse: 
      [ code: (code bitAnd: d extendedCode) ].

  "Return concatenated string with instruction name and its value."
  mapping: (d opcodeMap at: code).
  (offset as: String) ; ' ' ; 
    (mapping at: 0) name ; ' ' ; 
    (d value: value typed: (mapping at: 1))
].

"Method for hackish but fast invocation from REPL."
_@(lobby) disass: m@(CompiledMethod traits)
[ m disassembler do: [| :instr | Console ; instr ; '\n' ]. ].

