Transformer-Based File Fragment Type Classification for File Carving in Digital Forensics

Bibliographic Details
Published in: European Conference on Cyber Warfare and Security (Jun 2025), p. 169-177
Main Author: Guzhov, Andrey
Other Authors: Wirth, Christoph Tobias
Published:
Academic Conferences International Limited
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3244089536
003 UK-CbPIL
035 |a 3244089536 
045 2 |b d20250601  |b d20250630 
084 |a 142231  |2 nlm 
100 1 |a Guzhov, Andrey 
245 1 |a Transformer-Based File Fragment Type Classification for File Carving in Digital Forensics 
260 |b Academic Conferences International Limited  |c Jun 2025 
513 |a Conference Proceedings 
520 3 |a The recovery and reconstruction of fragmented data is a critical challenge in digital forensics, particularly when dealing with incomplete, corrupted, or partially deleted files in large-scale cybercrime investigations. Accurate classification of file fragment types is essential for reconstructing critical evidence, especially in environments characterized by high levels of data fragmentation, such as cyberattacks, data breaches, and the operation of illicit ("darknet") data centers. Traditional file carving methods often struggle to efficiently handle these fragmented files, limiting their reliability in complex investigations involving large volumes of data. This paper introduces a novel approach to classifying file fragment types using a Transformer-based model, designed to significantly enhance the speed and accuracy of forensic investigations. Unlike traditional methods, which rely on handcrafted rules or shallow machine learning techniques, our model leverages the powerful Swin Transformer V2 architecture, a state-of-the-art deep learning model tailored for sequence-to-sequence tasks. The model was trained to recognize complex, hierarchical patterns within raw byte sequences, enabling it to classify file fragments with high precision and reliability. We demonstrate that our model outperforms traditional methods on 512-byte file blocks, achieving superior classification accuracy on the File Fragment Type dataset (FFT-75), and also shows strong competitive performance with larger 4 KiB file blocks. Our approach represents a significant advancement in digital forensics, automating the classification of fragmented data and improving the reliability and efficiency of evidence recovery. Future work will focus on optimizing the model for different file block sizes and evaluating its application to real-world fragmented data scenarios. 
By automating the identification of file fragment formats, our approach not only improves classification accuracy but also reduces the time required for investigators to recover critical evidence from fragmented data sources. This work provides a promising tool for digital forensics practitioners, advancing recovery capabilities in the face of evolving cyber threats. 
653 |a Accuracy 
653 |a Machine learning 
653 |a Metadata 
653 |a Datasets 
653 |a Deep learning 
653 |a Classification 
653 |a Forensic sciences 
653 |a Artificial intelligence 
653 |a Reliability 
653 |a Neural networks 
653 |a Task complexity 
653 |a Recovery 
653 |a Automation 
653 |a Cybercrime 
653 |a Computer forensics 
653 |a Forensic computing 
653 |a Efficiency 
653 |a Fragments 
653 |a Evidence 
653 |a Segmentation 
653 |a Forensic science 
653 |a Models 
653 |a Sequences 
653 |a Data 
700 1 |a Wirth, Christoph Tobias 
773 0 |t European Conference on Cyber Warfare and Security  |g (Jun 2025), p. 169-177 
786 0 |d ProQuest  |t Political Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3244089536/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3244089536/fulltext/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3244089536/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch