Next-Generation Intermediate Representations for Binary Code Analysis

Bibliographic Details
Published in: Programming and Computer Software vol. 45, no. 7 (Dec 2019), p. 424
Main author: Solovev, M. A.
Other authors: Bakulin, M. G., Gorbachev, M. S., Manushin, D. V., Padaryan, V. A., Panasenko, S. S.
Published:
Springer Nature B.V.
Online access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 2918495808
003 UK-CbPIL
022 |a 0361-7688 
022 |a 1608-3261 
024 7 |a 10.1134/S0361768819070107  |2 doi 
035 |a 2918495808 
045 2 |b d20191201  |b d20191231 
100 1 |a Solovev, M. A.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124); Moscow State University, Moscow, Russia (GRID:grid.14476.30) (ISNI:0000 0001 2342 9668) 
245 1 |a Next-Generation Intermediate Representations for Binary Code Analysis 
260 |b Springer Nature B.V.  |c Dec 2019 
513 |a Journal Article 
520 3 |a Many binary code analysis tools rely on an intermediate representation (IR) derived from binary code, instead of working directly with machine instructions. In this paper, we first consider binary code analysis problems that benefit from IR and compile a list of requirements that an IR suitable for solving these problems should meet. Generally speaking, a universal binary analysis platform requires two principal components. The first component is a retargetable instruction decoder that utilizes external specifications to describe target instruction sets. External specifications facilitate maintenance and allow one to quickly implement support for new instruction sets. We analyze some of the most popular instruction set architectures (ISAs), including those used in microcontrollers, and from that compile a list of requirements for the retargetable decoder. We then overview existing multi-ISA decoders and propose our vision of a more generic approach, based on a multi-layer directed acyclic graph that describes the decoding process in universal terms. The second component of the analysis platform is the actual architecture-neutral IR. In this paper, we describe such IRs and propose Pivot 2, an IR that is low-level enough to be easily constructed from decoded machine instructions while remaining easy to analyze. The main features of Pivot 2 are explicit side effects, SSA variables, a simpler alternative to phi-functions, and an extensible elementary operation set at the core. This IR also supports machines that have multiple memory address spaces. Finally, we propose a way to tie the decoder and the IR together and adapt them to most binary code analysis tasks through abstract interpretation on top of the IR. 
The proposed scheme takes into account various aspects of target architectures that are overlooked in many other works, including pipeline specifics (handling of delay slots, hardware loop support, etc.), exception and interrupt management, and a generic address space model in which accesses may have arbitrary side effects due to memory-mapped devices or other non-trivial behavior of the memory system. 
653 |a Side effects 
653 |a Programming languages 
653 |a Decoders 
653 |a Decoding 
653 |a Multilayers 
653 |a Binary codes 
653 |a Debugging 
653 |a Specifications 
653 |a Software utilities 
653 |a Memory devices 
653 |a Automation 
653 |a Code reuse 
653 |a Representations 
653 |a Semantics 
700 1 |a Bakulin, M. G.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124) 
700 1 |a Gorbachev, M. S.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124) 
700 1 |a Manushin, D. V.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124); Moscow State University, Moscow, Russia (GRID:grid.14476.30) (ISNI:0000 0001 2342 9668) 
700 1 |a Padaryan, V. A.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124); Moscow State University, Moscow, Russia (GRID:grid.14476.30) (ISNI:0000 0001 2342 9668) 
700 1 |a Panasenko, S. S.  |u Ivannikov Institute for System Programming, Russian Academy of Sciences, Moscow, Russia (GRID:grid.4886.2) (ISNI:0000 0001 2192 9124) 
773 0 |t Programming and Computer Software  |g vol. 45, no. 7 (Dec 2019), p. 424 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2918495808/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/2918495808/fulltext/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/2918495808/fulltextPDF/embedded/ZKJTFFSVAI7CB62C?source=fedsrch