When malware changed its mind: an empirical study of variable program behaviours in the real world

Erin Avllazagaj (University of Maryland, College Park), Ziyun Zhu (Facebook), Leyla Bilge (NortonLifeLock Research Group), Davide Balzarotti (EURECOM) & Tudor Dumitras (University of Maryland, College Park)

Behavioural program analysis is widely used for understanding malware behaviour, for creating rule-based detectors, and for clustering samples into malware families. However, this approach is ineffective when the behaviour of individual samples changes across different executions, owing to environment sensitivity, evasive techniques or time variability. While the inability to observe the complete behaviour of a program is a well-known limitation of dynamic analysis, the prevalence of this behaviour variability in the wild, and the behaviour components that are most affected by it, are still unknown. As the behavioural traces are typically collected by executing the samples in a controlled environment, the models created and tested using such traces do not account for the broad range of behaviours observed in the wild, and may result in a false sense of security.

In this paper we conduct the first quantitative analysis of behavioural variability in Windows malware, PUP and benign samples, using a novel dataset of 7.6M execution traces, recorded on 5.4M real hosts from 113 countries. We analyse program behaviours at multiple granularities, and we show how they change across hosts and across time. We then analyse the invariant parts of the malware behaviours, and we show how this affects the effectiveness of malware detection using a common class of behavioural rules. Our findings have actionable implications for malware clustering and detection, and they emphasise that program behaviour in the wild depends on a subtle interplay of factors that may only be observed at scale, by monitoring malware on real hosts.
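To make the idea of "invariant parts" and "variability" concrete, here is a minimal sketch, not the method used in the paper. It assumes each execution trace has been reduced to a set of (action, target) pairs, treats the invariant behaviour as the intersection of those sets, and uses mean pairwise Jaccard similarity as a stand-in variability measure. The function names, the trace representation and the example data are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def invariant_actions(traces):
    """Actions that appear in every trace of a sample (set intersection)."""
    sets = [frozenset(t) for t in traces]
    if not sets:
        return frozenset()
    return frozenset.intersection(*sets)


def mean_pairwise_similarity(traces):
    """Average Jaccard similarity over all pairs of traces.

    Values near 1.0 mean the sample behaves consistently across
    executions; values near 0.0 mean its behaviour varies widely.
    """
    sets = [frozenset(t) for t in traces]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical traces of one sample observed on three different hosts.
example_traces = [
    {("file_create", "a.tmp"), ("reg_set", "Run"), ("net_connect", "c2")},
    {("file_create", "a.tmp"), ("reg_set", "Run")},
    {("file_create", "a.tmp"), ("proc_create", "cmd.exe")},
]

if __name__ == "__main__":
    print("invariant:", sorted(invariant_actions(example_traces)))
    print("mean similarity:", mean_pairwise_similarity(example_traces))
```

A behavioural rule that matches only actions in the invariant set would fire on every one of these hypothetical executions, whereas a rule keyed on the network connection would miss two of the three.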

Got a question about this presentation? To get in touch with the speakers, contact Erin Avllazagaj on Twitter at @albocoder.

Erin Avllazagaj

University of Maryland, College Park

Erin is a third-year Ph.D. student in the Department of Electrical and Computer Engineering at the University of Maryland in College Park, advised by Prof. Tudor Dumitraș. He received his B.S. in computer science from Bilkent University (Ankara, Turkey) in 2018. Erin's broad research interests cover data-driven malware analysis and automatic exploit generation. Specifically, in his recent Ph.D. work he has analysed traces of malware executions in the real world to derive guidelines for creating effective behaviour-based detection systems. Erin is currently interested in automatic exploit generation for heap-based exploits. His participation in various CTF competitions and his internship work have been major influences in this new research direction.

Ziyun Zhu

Facebook

Ziyun Zhu is a research scientist at Facebook in the Greater New York Area.

Leyla Bilge

NortonLifeLock Research

Leyla Bilge is a technical director at NortonLifeLock Research Group, where she leads the branch of the research team based in Europe.

Davide Balzarotti

EURECOM

Davide Balzarotti is a Professor (Professeur des universités) at the EURECOM Graduate School and Research Center, located in Sophia Antipolis on the French Riviera. His research interests include most aspects of system security and in particular the areas of binary and malware analysis, reverse engineering, computer forensics, and web security. Davide is a recipient of an ERC Consolidator Grant which focuses on the analysis of compromised systems. Davide is a member of the Order of the Overflow, the team which organizes the DEF CON Capture the Flag competition. Before that, he was one of the founding members of the Shellphish hacking group, with whom he participated in ten DEF CON CTF finals in Las Vegas (winning in 2005). When he was a post-doc in the security group at UCSB, he also helped to organize several early editions of the iCTF competition. When not in front of his computer, Davide likes to climb rocks, surf waves, hike trails, and take pictures along the way.

Tudor Dumitras

University of Maryland, College Park

Tudor works on data-driven security. His research objective is to provide an evidence-based foundation for security, by building defences grounded in a rigorous understanding of real-world adversaries. Tudor conducts empirical studies of adversary behaviour, builds machine learning systems for detecting malware and attacks, and studies the security of machine learning in adversarial environments. He also has a good knowledge of the security industry, having worked for 2.5 years at Symantec Research Labs. There, he built WINE, one of the first platforms for sharing field data collected by the security industry with academic researchers. In his most cited paper he measured how long zero-day attacks go undiscovered in the wild; this measurement was made possible, for the first time, by the WINE platform. Tudor's research has been featured in the Research Highlights of the Communications of the ACM and has been widely cited in the media, for example in The Economist, the MIT Technology Review, Forbes, and The Register. He also enjoys giving TED-style talks to explain his work to broad audiences. Tudor has a Ph.D. from Carnegie Mellon University and undergraduate degrees from the Ecole Polytechnique and the “Politehnica” University, Bucharest.