
Real-time Intrusion Detection

Abnormal DeFi Transactions

Can we detect DeFi attacks without hardcoded heuristics? Our learned language model suggests we can; see our Technical Report.

We show that intrusion detection without heuristics can be done in real-time.

  • BlockGPT. We are the first to apply unsupervised/self-supervised learning to anomaly detection in smart contract transaction execution traces. We develop a large language model for Ethereum transaction anomaly detection that uses custom data encoding, domain-specific tokenization, and a tree encoding method tailored to the EVM trace tree, capturing calls, function names, parameters, and storage modifications.
  • BlockGPT Evaluation. We evaluate BlockGPT on a dataset containing 124 attacks among a total of 68M transactions, spanning 1523 days from block 5470817 (19 April 2018) to block 15000000 (21 June 2022).
  • Real-time. Evaluation results indicate that BlockGPT effectively identifies abnormal transactions and can detect different types of malicious activities, as shown through a flash loan attack case study. With a throughput of 2284±289 transactions per second, our tool is a viable real-time IDS for blockchains.
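
To make the idea of a tree encoding for EVM traces concrete, here is a minimal sketch of how a call trace tree might be flattened into a token sequence. It is not BlockGPT's actual encoder: the `Call` class, the `[START]`/`[END]` markers, and the `arg:`/`sstore:` token prefixes are all hypothetical names chosen for illustration. The only thing taken from the description above is the kind of information captured (calls, function names, parameters, and storage modifications).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Call:
    """One node of a hypothetical EVM call trace tree (illustrative only)."""
    call_type: str                  # e.g. "CALL", "DELEGATECALL", "STATICCALL"
    function: str                   # function name or 4-byte selector
    params: List[str] = field(default_factory=list)
    storage_writes: Dict[str, str] = field(default_factory=dict)
    children: List["Call"] = field(default_factory=list)


def encode_trace(call: Call) -> List[str]:
    """Flatten a call tree into tokens with a depth-first walk.

    Assumed scheme: [START]/[END] markers bracket each call, so the
    nesting of the tree can be recovered from the flat sequence.
    """
    tokens = ["[START]", call.call_type, call.function]
    tokens += [f"arg:{p}" for p in call.params]
    for slot, value in call.storage_writes.items():
        tokens.append(f"sstore:{slot}={value}")
    for child in call.children:
        tokens += encode_trace(child)
    tokens.append("[END]")
    return tokens


# A made-up trace: an outer call that makes two nested calls.
trace = Call(
    call_type="CALL",
    function="flashLoan",
    params=["uint256"],
    children=[
        Call("CALL", "swap", params=["address", "uint256"],
             storage_writes={"0x0": "0x1"}),
        Call("CALL", "repay"),
    ],
)
tokens = encode_trace(trace)
print(tokens)
```

Bracketing each call with explicit markers is one simple way to keep the tree structure visible to a sequence model; other encodings (for example, depth tokens) would serve the same purpose.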

Conclusion?

Learning a model of normal transactions, and thereby detecting abnormal ones, is a vast field of study. We have shown that abnormal transactions can be detected without relying on manually crafted heuristics. BlockGPT ranked 49 of the 124 attacks among the top-3 most abnormal transactions, and it sustains an average batch throughput of 2,284 transactions per second. This throughput makes BlockGPT a viable real-time intrusion detection system for blockchain networks such as Ethereum, one that could trigger smart contract pause mechanisms to thwart attacks. The work contributes to blockchain transaction analysis by pioneering unsupervised/self-supervised learning for transaction anomaly detection, supported by a large language model built specifically for this purpose.
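
The top-3 ranking result can be illustrated with a small sketch. The scoring rule here is an assumption: a common way to score a sequence with a language model is its mean negative log-likelihood per token, but this page does not state how BlockGPT computes its abnormality score. The transaction IDs and per-token probabilities are invented for the example; only the figure 49 of 124 comes from the text.

```python
import math
from typing import Dict, List


def anomaly_score(token_probs: List[float]) -> float:
    """Mean negative log-likelihood of a transaction's tokens.

    Assumed scoring rule: the less likely the model finds the tokens,
    the higher (more abnormal) the score.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def top_k_abnormal(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k transaction IDs with the highest anomaly scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up per-token probabilities that a model might assign to four transactions.
probs = {
    "tx_a": [0.9, 0.8, 0.95],
    "tx_b": [0.9, 0.05, 0.1],    # contains very unlikely tokens
    "tx_c": [0.7, 0.6, 0.8],
    "tx_d": [0.99, 0.97, 0.98],
}
scores = {tx: anomaly_score(p) for tx, p in probs.items()}
flagged = top_k_abnormal(scores, k=3)
print(flagged)

# Share of attacks ranked in the top 3, from the figures reported above.
detection_rate = 49 / 124
print(f"{detection_rate:.1%}")
```

In a deployment, the flagged transactions would be the candidates for further action, such as the pause mechanisms mentioned above; how that hand-off is wired is outside the scope of this sketch.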