DIRA: Automatic Detection, Identification, and Repair of Control-Hijacking Attacks
Alexey Smirnov and Tzi-cker Chiueh
SUNY at Stony Brook
{alexey, chiueh}@cs.sunysb.edu
DEFCON 13
Outline of the Talk
- Introduction
- Related Work
- DIRA Architecture
- Attack Detection
- Attack Identification
- Attack Repair
- Performance Evaluation
- Conclusion
Introduction
Buffer overflow attacks are the most common type of attack. A comprehensive protection strategy should consist of the following components:
- Attack detection: to prevent the attack from causing damage;
- Attack identification: to feed the IDS with the attack signature;
- Attack repair: to allow the compromised application to continue its normal execution.
We propose a compile-time solution that provides all three components.
What is a Buffer Overflow Attack
- Control-hijacking attacks work by overwriting a control pointer such as the return address or a function pointer.
- Buffer overflows are possible when the target buffer is shorter than the data that can be written into it.
- Standard libc functions such as strcpy() or sprintf() are responsible for most buffer overflows.
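A minimal sketch of the length check that strcpy() does not perform. The helper name copy_bounded is hypothetical; it is not part of libc or of DIRA, and it only illustrates the condition under which an overflow would occur.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if src would overflow dst. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;   /* bytes strcpy() would write */
    if (needed > dst_size)
        return -1;                     /* strcpy() would overflow here */

    memcpy(dst, src, needed);
    return 0;
}
```

With a 16-byte destination, a short file name is copied and an over-long one is rejected, leaving the buffer untouched.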
Attack Detection
- Stackguard: place a canary word before the return address (RA) in the function prolog and check it in the function epilog. The assumption is that the attacker has to overwrite the canary word in order to overwrite the RA.
- RAD: save the original RA in a safe place in the function prolog and compare it to the value stored on the stack in the function epilog.
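The canary idea can be sketched with a simulated stack frame. This is an illustration only: a real canary is placed by the compiler on the actual stack, whereas struct frame, CANARY, and the three functions below are hypothetical names invented for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANARY 0x000aff0dU   /* arbitrary illustrative value */

/* Simulated frame: the canary sits between the buffer and the RA. */
struct frame {
    char          buf[16];
    unsigned int  canary;
    unsigned long ret_addr;
};

/* "Prolog": arm the canary and record the return address. */
static void frame_enter(struct frame *f, unsigned long ra)
{
    memset(f->buf, 0, sizeof f->buf);
    f->canary = CANARY;
    f->ret_addr = ra;
}

/* Simulated unchecked write that starts at buf and may run past it.
 * The assert keeps the demo itself inside the simulated frame. */
static void frame_write(struct frame *f, const void *data, size_t n)
{
    assert(n <= sizeof *f);
    memcpy(f, data, n);      /* buf is the first member of the frame */
}

/* "Epilog": returns 1 if the canary is intact, 0 if it was overwritten. */
static int frame_check(const struct frame *f)
{
    return f->canary == CANARY;
}
```

A write that reaches ret_addr has to pass over canary first, so frame_check() reports the corruption before the simulated return address would be used.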
Approaches to Attack Identification
Automatic ways to identify attacks (that is, to generate their signatures) are very important for containing worm epidemics. Previous systems either provided a single attacking packet or required a large pool of malicious network data.
- Toth and Kruegel: look at network packet payloads and perform abstract code execution.
- TaintCheck: uses the value of the compromised control pointer as the attack signature.
- Autograph: extracts the most common subsequences from suspicious flows and reports them as signatures.
- Polygraph and Nemean: use machine learning algorithms to derive common patterns from a large set of malicious flows.
Approaches to Attack Repair
- Program rollback and replay is used in software debugging. Two approaches: (1) keep an execution history (Spyder) or (2) do periodic state check-pointing.
- Check-pointing is easy under Linux because of the copy-on-write fork() system call (RECAP and Flashback). It can be more difficult under other operating systems.
- Check-pointing relies on the OS rather than on the applications.
- Shadow Honeypot runs two versions of the application (protected and non-protected) and dynamically switches between the two once an attack has been detected.
DIRA Approach
- DIRA is an extension to GCC 3.4.1.
- It uses memory updates logging to solve the three problems at the same time.
- The idea is to maintain a run-time log of all changes to the memory state of the program.
- Assignments such as a=b; and libc function calls such as memcpy() change the memory state of the program.
- For each memory update DIRA stores its source address, destination address, length, and the pre-image (the value the destination held before the update).
DIRA Approach
How does DIRA detect, identify, and repair an attack using the memory updates log?
- To detect: compare the current RA with the one saved in the log;
- To identify: trace back the data that replaced the control pointer to the point where it was read from the network;
- To repair: restore the memory state using the pre-images stored in the log.
At compile time, DIRA instruments the source code to perform logging and to check the correctness of control pointers. At run time, the logging code generates the memory updates log.
Memory Updates Logging
The memory updates log is a circular buffer; each entry has four fields: read_addr, write_addr, len, data.
DIRA logs the effect of each operation of the form X=Y, where X and Y are directly referenced variables, array references (a[i]), or de-referenced variables (*(a+1)).
- read_addr is set to &Y;
- write_addr is set to &X;
- len is set to sizeof(Y);
- data is set to the pre-image of X in DIR mode and is empty in other modes.
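A minimal sketch of such a log, assuming a fixed-size array used as a ring. The field names follow the slide; LOG_CAPACITY, MAX_PREIMAGE, struct update_log, and log_update() are assumptions made for illustration and are not taken from DIRA's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_CAPACITY 4    /* tiny on purpose, to show wrap-around */
#define MAX_PREIMAGE 16   /* assumed upper bound on a saved pre-image */

struct log_entry {
    const void   *read_addr;           /* &Y: where the value came from */
    void         *write_addr;          /* &X: where it was written */
    size_t        len;                 /* sizeof(Y) */
    unsigned char data[MAX_PREIMAGE];  /* pre-image of X (DIR mode) */
};

struct update_log {
    struct log_entry entries[LOG_CAPACITY];
    size_t next;    /* slot the next record goes into */
    size_t count;   /* number of valid records, at most LOG_CAPACITY */
};

/* Record the update X=Y before it happens; the oldest record is
 * overwritten once the ring is full. */
static void log_update(struct update_log *log, void *write_addr,
                       const void *read_addr, size_t len)
{
    assert(len <= MAX_PREIMAGE);

    struct log_entry *e = &log->entries[log->next];
    e->read_addr  = read_addr;
    e->write_addr = write_addr;
    e->len        = len;
    memcpy(e->data, write_addr, len);   /* save the old contents of X */

    log->next = (log->next + 1) % LOG_CAPACITY;
    if (log->count < LOG_CAPACITY)
        log->count++;
}
```

Because the buffer is circular, only the most recent LOG_CAPACITY updates can be traced or undone; a real deployment would size the log accordingly.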
Memory Updates Logging
- If the right-hand side is a complex expression, a log record is created for each variable in it.
- To handle updates performed by libc functions, DIRA proxies several of them: string manipulation functions, format string functions, and file and network I/O functions.
- The log is also used to store tags, special records indicating a change of the program's run-time state:
  - a FUNCTION_ENTRY tag is inserted when a function is called;
  - a FUNCTION_EXIT tag is inserted before a function returns.
- Tags are used for signature generation and repair.
Memory Updates Logging Example
At compile time:
    Source code: x=y+z;
    Instrumented code: (log(&x, &y, sizeof(y), &x), (log(&x, &z, sizeof(z), &x), x=y+z));
At run time, log() adds two records to the memory updates log:
    read_addr: &y; write_addr: &x; len: sizeof(y); data: x;
    read_addr: &z; write_addr: &x; len: sizeof(z); data: x;
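The instrumented expression relies on the C comma operator, which evaluates its operands left to right, so both logging calls run before the assignment. The sketch below reproduces that shape for int operands. The name dira_log is hypothetical (the slide calls it log(), which would clash with the math library function of the same name), and the record layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>

struct record {
    const void *read_addr;
    void       *write_addr;
    size_t      len;
    int         preimage;    /* value of x before the assignment */
};

static struct record records[8];
static size_t n_records;

/* Hypothetical logging call using the slide's argument order:
 * (write_addr, read_addr, len, pointer to the pre-image).
 * Returns 0 so that it can appear inside a comma expression. */
static int dira_log(void *write_addr, const void *read_addr,
                    size_t len, const int *preimage)
{
    assert(n_records < 8);
    records[n_records].read_addr  = read_addr;
    records[n_records].write_addr = write_addr;
    records[n_records].len        = len;
    records[n_records].preimage   = *preimage;
    n_records++;
    return 0;
}

/* Instrumented form of: x = y + z; */
static int add_instrumented(int *x, const int *y, const int *z)
{
    return (dira_log(x, y, sizeof *y, x),
            (dira_log(x, z, sizeof *z, x), *x = *y + *z));
}
```

One record is produced per right-hand-side variable, and both carry the same pre-image of x because they are logged before x changes.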
Memory Updates Logging Example
At compile time:
    Source code: strcpy(a,b);
    Instrumented code: dira_strcpy(a,b);
At run time, the proxy function dira_strcpy() adds a log record:
    read_addr=&b, write_addr=&a, len=strlen(b)+1, data=a
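A sketch of what such a proxy might look like. Only the name dira_strcpy() and the four logged fields come from the slide; the record type, the single-record storage, and the MAX_PREIMAGE bound are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PREIMAGE 64   /* assumed bound for this demo */

struct str_record {
    const char *read_addr;           /* source string */
    char       *write_addr;          /* destination buffer */
    size_t      len;                 /* strlen(src) + 1 */
    char        data[MAX_PREIMAGE];  /* destination contents before the copy */
};

static struct str_record last_record;   /* stands in for the log */

/* Log the update, then delegate to the real strcpy().
 * Note: saving the pre-image reads len bytes from dst, so this demo
 * assumes dst really has that many bytes. */
static char *dira_strcpy(char *dst, const char *src)
{
    size_t len = strlen(src) + 1;
    assert(len <= MAX_PREIMAGE);

    last_record.read_addr  = src;
    last_record.write_addr = dst;
    last_record.len        = len;
    memcpy(last_record.data, dst, len);

    return strcpy(dst, src);
}
```

The proxy keeps the libc semantics and return value, so the instrumented program behaves as before apart from the extra log record.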
Attack Detection (D-mode)
- DIRA uses a RAD-like approach: code that saves the RA in a protected buffer is added to the function prolog. The actual RA stored on the stack is compared with this value in the function epilog.
- Using a special buffer to store RAs is an optimization over storing them in the common memory updates log.
- DIRA can protect other control-sensitive data structures, such as the GOT and signal handler tables, in a similar fashion (not implemented yet).
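The prolog/epilog check can be sketched as a separate stack of saved return addresses. This is a simulation: return addresses are passed in as plain integers, the array is not actually memory-protected, and ra_push() and ra_check_and_pop() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define RA_STACK_DEPTH 64

/* Stands in for the protected buffer that holds saved RAs. */
static unsigned long ra_stack[RA_STACK_DEPTH];
static size_t ra_top;

/* Prolog: save a copy of the return address. */
static void ra_push(unsigned long ra)
{
    assert(ra_top < RA_STACK_DEPTH);
    ra_stack[ra_top++] = ra;
}

/* Epilog: compare the saved copy with the RA currently on the stack.
 * Returns 1 if they match, 0 if the RA was modified (attack detected). */
static int ra_check_and_pop(unsigned long ra_on_stack)
{
    assert(ra_top > 0);
    return ra_stack[--ra_top] == ra_on_stack;
}
```

Unlike a canary, this check does not depend on the overflow passing through a particular word: any change to the RA is noticed at function exit.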
Attack Identification
The desired properties of an attack signature:
- Context-aware (to reduce false positives);
- Semantics-aware (to reduce false positives);
- Provides a degree of flexibility within each packet (to reduce false negatives).
DIRA's signatures consist of multiple packets; each packet is a regular expression. The length constraint limits the length of the attacking part of the last packet. The memory updates log is used to build attack signatures.
Attack Identification
Two types of dependencies: data and control dependencies.
- A data dependency is created when one variable is assigned to another.
- A control dependency is created between variable X and variable Y if the value of Y depends on the value of X used in a conditional expression. Example: if (x>0) y=1; else y=2; (here y is control-dependent on x).
Why do we need control dependencies? Example: an FTP server attack involving authentication.
Vulnerable FTP Server Example
A vulnerable FTP server pseudo-code:
    char buf[16];
    is_auth = is_user = 0;  // user not authenticated initially
    while (1) {
        recv_packet(p);
        if (!strncmp(p, "QUIT", 4)) break;
        if (!strncmp(p, "USER", 4)) { is_user = 1; continue; }
        if (!strncmp(p, "PASS", 4) && is_user) { is_auth = 1; continue; }
        if (!is_auth) continue;  // authentication required
        if (!strncmp(p, "GET", 3)) {
            strcpy(buf, p + 4);  // copy filename; no bounds check, so a long name overflows buf
            send_file(buf);
        }
    }
FTP Server Attack
FTP server GET attack (3 packets):
    USER alexey
    PASS my_pass
    GET very_long_file_name_that_will_overwrite_the_return_address
The return address (RA) is located after buf: RA=buf+17.
Log records (first packet):
    <DIRA_RECV, &p, 11, "USER alexey">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <NULL, &is_user, 4, is_user>
Log records (second packet):
    <DIRA_RECV, &p, 12, "PASS my_pass">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <DIRA_COND, &is_user, 0, NULL>
    <NULL, &is_auth, 4, is_auth>
Log records (third packet):
    <DIRA_RECV, &p, 62, "GET …">
    <DIRA_COND, &is_auth, 0, NULL>
    <DIRA_STRNCMP, &p, 3, NULL>
    <DIRA_COND, &p, 0, NULL>
    <&p+4, &buf, strlen(p)-4+1, *(p+4)>
Identifying Attack Using Data Dependencies
The return address (RA) is located after buf: RA=buf+17. The last log record, <&p+4, &buf, strlen(p)-4+1, *(p+4)>, is the write that ran past buf and over the RA. Its source, p+4, traces back to <DIRA_RECV, &p, 62, "GET …">, so following the data dependency identifies the third packet as part of the attack.
Identifying More Packets Using Control Dependencies
The copy into buf is reached only after the check <DIRA_COND, &is_auth, 0, NULL>. The update <NULL, &is_auth, 4, is_auth> was made under conditions on p (the "PASS my_pass" packet) and on is_user, and <NULL, &is_user, 4, is_user> was made under a condition on p (the "USER alexey" packet). Following these control dependencies back through the log adds the first two packets to the signature.
Definition of Control Dependencies
- Whenever variable X can prevent control flow from reaching variable Y, a control dependency is created between X and Y.
- stmt1 and stmt2 are always dependent.
- Control dependencies are also created for for and while loops.
- Tags START_SCOPE and END_SCOPE are used to store control dependencies in the memory updates log.
Representing Packets as Regular Expressions
- For each byte of the attacking packet, DIRA determines whether it was looked at by the program or not looked at.
- For example, strcmp() applied to the packet bytes converts them into looked-at bytes. If the bytes are blindly copied with strcpy(), they remain not looked at.
- Initially all bytes are not looked at. DIRA traverses the log forward from where the packets were received and records all packet bytes that were looked at.
- On output, a looked-at byte is emitted as is and a not-looked-at byte is emitted as '?'.
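The byte classification above can be sketched with a per-byte flag array. The function names and the flag representation are assumptions; the slide only specifies the looked-at rule and the '?' output convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mark bytes [off, off + n) of a packet as looked at, e.g. for a
 * comparison such as strncmp(p, "USER", 4) that inspects the first
 * four bytes. The range is clipped to the packet length. */
static void mark_looked_at(unsigned char *looked, size_t pkt_len,
                           size_t off, size_t n)
{
    assert(off <= pkt_len);
    if (n > pkt_len - off)
        n = pkt_len - off;
    memset(looked + off, 1, n);
}

/* Emit the pattern: looked-at bytes as is, all others as '?'.
 * out must have room for pkt_len + 1 bytes. */
static void build_pattern(const char *pkt, const unsigned char *looked,
                          size_t pkt_len, char *out)
{
    for (size_t i = 0; i < pkt_len; i++)
        out[i] = looked[i] ? pkt[i] : '?';
    out[pkt_len] = '\0';
}
```

For the 11-byte packet "USER alexey" with only the first four bytes inspected, this yields the pattern USER???????.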
Building Regular Expressions
Each DIRA_RECV record marks the start of a packet, and each DIRA_STRNCMP record on p marks the first len bytes of that packet as looked at: "USER" (4 bytes), "PASS" (4 bytes), and "GET" (3 bytes).
    <DIRA_RECV, &p, 11, "USER alexey">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <NULL, &is_user, 4, is_user>
    <DIRA_RECV, &p, 12, "PASS my_pass">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <DIRA_COND, &is_user, 0, NULL>
    <NULL, &is_auth, 4, is_auth>
    <DIRA_RECV, &p, 62, "GET …">
    <DIRA_COND, &is_auth, 0, NULL>
    <DIRA_STRNCMP, &p, 3, NULL>
    <DIRA_COND, &p, 0, NULL>
    <&p+4, &buf, strlen(p)-4+1, *(p+4)>
Length Constraint Generation
The length constraint limits the attacking part of the packet by specifying the terminating character and its maximum offset in any benign packet.
DIRA's Signature File Format
- N: number of packets
- L_i: length of the i-th packet
- Regular expression of the packet. Possible characters are shown on the right.
- The length constraint is specified for the last attacking packet.
Complete Signature for FTP Attack
    3             # number of packets
    11            # 1st packet length
    USER???????
    12            # 2nd packet length
    PASS????????
    62            # 3rd packet length
    GET???...???
    4 17 \0       # length constraint
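A sketch of how a packet might be tested against one signature entry. The reading of the length constraint "4 17 \0" used here is an assumption: the attacking field starts at offset 4, and in a benign packet the terminator '\0' appears no more than 17 bytes after that start. The slides do not spell out these semantics, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if pkt starts with pattern, where '?' matches any byte. */
static int pattern_matches(const char *pattern, const char *pkt,
                           size_t pkt_len)
{
    size_t n = strlen(pattern);
    if (pkt_len < n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (pattern[i] != '?' && pattern[i] != pkt[i])
            return 0;
    return 1;
}

/* Assumed length-constraint check: returns 1 if the field starting at
 * `start` runs more than max_off bytes without hitting `term`. */
static int exceeds_length(const char *pkt, size_t pkt_len,
                          size_t start, size_t max_off, char term)
{
    size_t i;
    for (i = start; i < pkt_len && i - start <= max_off; i++)
        if (pkt[i] == term)
            return 0;      /* terminator found within the limit */
    return i < pkt_len;    /* limit passed with data still remaining */
}
```

Under this reading, a packet is flagged only when it both matches the looked-at prefix and violates the length constraint, which keeps ordinary GET requests from triggering the signature.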
Attack Recovery (DIR-mode)
Main goal: bring the program to the state in which it was before the attack packets were received.
- How to restore the pre-attack state?
- From which point to continue execution?
Program restart points can only be at the beginning of a function because only global updates are logged in DIR mode (for performance reasons). The proper function is the least common dynamic ancestor of the function in which the attack was detected and the function in which the data was read in.
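Restoring the pre-attack state from pre-images can be sketched as an undo log that is replayed newest-first. The names logged_write() and rollback_to(), the fixed-size array, and the idea of a numeric mark are assumptions for illustration; DIRA's actual log also contains tags and uses them to locate the restart point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS  32
#define MAX_PREIMAGE 16

struct undo_record {
    void         *write_addr;
    size_t        len;
    unsigned char preimage[MAX_PREIMAGE];
};

static struct undo_record undo_log[MAX_RECORDS];
static size_t undo_len;

/* Save the pre-image of dst, then perform the write. */
static void logged_write(void *dst, const void *src, size_t len)
{
    assert(undo_len < MAX_RECORDS && len <= MAX_PREIMAGE);

    struct undo_record *r = &undo_log[undo_len++];
    r->write_addr = dst;
    r->len        = len;
    memcpy(r->preimage, dst, len);
    memcpy(dst, src, len);
}

/* Undo every write made after the log held `mark` records,
 * newest first, so overlapping writes unwind correctly. */
static void rollback_to(size_t mark)
{
    assert(mark <= undo_len);
    while (undo_len > mark) {
        struct undo_record *r = &undo_log[--undo_len];
        memcpy(r->write_addr, r->preimage, r->len);
    }
}
```

Taking the mark at the point where the attack data was read in, and rolling back to it once the attack is detected, returns the logged variables to the values they had at that point.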
Choosing the Restart Point
depth is a loop invariant: it is the relative depth of the current function with respect to the greatest dynamic ancestor seen so far.
Choosing the Restart Point
- When all updates are tracked, it is possible to resume execution from the middle of a function.
- No system support is required for restarting: longjmp() and setjmp() are used.
- A setjmp() call is inserted before the call to any function that can be a potential restart point (so that the arguments are pushed again).
- DIRA inserts the first local update tag when it encounters such an update after a function call.
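The restart mechanism can be sketched with the standard setjmp()/longjmp() pair. Only the use of these two calls comes from the slide; the function names, the attempt counter, and the simulated "attack on the first request" are assumptions. In DIRA the memory state would first be restored from the log before jumping.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf restart_point;
static int attempts;          /* global, so its value survives longjmp() */

/* Stand-in for a request handler that "detects an attack" the first
 * time it runs and jumps back to the restart point. */
static void handle_request(void)
{
    attempts++;
    if (attempts == 1)
        longjmp(restart_point, 1);
}

/* Place a restart point before the call, as the slide describes.
 * Returns the number of restarts that took place. */
static int run_with_restart(void)
{
    volatile int restarts = 0;   /* volatile: modified after setjmp() */

    attempts = 0;
    if (setjmp(restart_point) != 0)
        restarts++;              /* reached again via longjmp() */

    handle_request();            /* (re)issue the call */
    return restarts;
}
```

The jump target is valid only while run_with_restart() is still active, which is consistent with placing the restart point in a dynamic ancestor of the function where the attack is detected.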
DIRA Evaluation
Programs tested:
- ghttpd 1.4 (exploit available);
- drcatd 0.5.0 (exploit available);
- named 8.1 (exploit available);
- qpopper 4.0.4;
- proftpd 1.2.9.
Two goals: measure the run-time overhead and the quality of the automatically generated signatures.
Configuration: server machine (P-4M 1.7GHz, 512 MB RAM), two clients (Athlon 1.7GHz, 512 MB RAM).
Exploit programs were taken from securiteam.com and insecure.org.
Run-time Overhead
[Two graphs: run-time overhead for programs compiled in DIR-mode.]
Signature Generation
- Signatures were produced for all programs that we had exploits for.
- The ghttpd signature specifies the length constraint using a terminating character; the named signature specifies the maximum value of the length field.
- The drcatd signature has three packets in it: login, password, and the attacking packet.
Is Recovery Really Useful?
Recovery incurs significant overhead. Is it really better than just terminating the application? Yes, because:
- Terminating a single-threaded program disconnects all clients.
- The same tradeoff exists in the case of source-code checking tools: using them requires an investment of developer time, and we can always use Stackguard instead to protect the programs.
Conclusion
- DIRA solves the problems of attack detection, identification, and repair in a unified way.
- It produces accurate multi-packet signatures from a single attack instance.
- Dynamic slicing of the memory updates log is the underlying technique.
- The same technique can be used for automatic patch generation, which is our future work.
Questions?
http://www.ecsl.cs.sunysb.edu/dira

  • 1. DIRA: Automatic Detection, Identification, and Repair of Control-Hijacking Attacks Alexey Smirnov and Tzi-cker Chiueh SUNY at Stony Brook {alexey, chiueh}@cs.sunysb.edu DEFCON 13
  • 2. Outline of the Talk Introduction Related Work DIRA Architecture Attack Detection Attack Identification Attack Repair Performance Evaluation Conclusion
  • 3. Introduction Buffer overflow attacks are the most common type of attacks. A comprehensive protection strategy should consists of the following components: Attack detection – to prevent the attack from causing damage; Attack identification – to feed the IDS with the attack signature; Attack repair – to allow the compromised application to continue its normal execution. We propose a compile-time solution that provides all three components.
  • 4. What is a Buffer Overflow Attack Control-hijacking attacks work by overwriting a control pointer such as the return address, function pointer, etc. Buffer overflows are possible when the length of the target buffer is less than the length of the data that can be written into it. Standard libc functions such as strcpy() or sprintf() are responsible for most buffer overflows.
  • 5. Outline of the Talk Introduction Related Work DIRA Architecture Attack Detection Attack Identification Attack Repair Performance Evaluation Conclusion
  • 6. Attack Detection Stackguard – place a canary word before the return address (RA) in the function prolog and check it in the function epilog. The assumption is that the attacker will have to overwrite the canary word in order to overwrite the RA. RAD – save the original RA in a safe place in the function prolog and compare it to the value stored in the stack in the function epilog.
  • 7. Approaches to Attack Identification Automatic ways to identify attacks (that is, to generate their signatures) are very important for worm epidemics confinement. Previous systems either provided a single attacking packet or required a large pool of malicious network data. Toth and Kruegel – look at network packets payloads and perform abstract code execution. TaintCheck – uses the value of compromised control pointer as the attack signature. Autograph – extracts most common subsequences from suspicious flows and reports them as signatures. Polygraph and Nemean – use machine learning algorithms to derive common patterns from a large set of malicious flows .
  • 8. Approaches to Attack Repair Program rollback and replay is used in software debugging. Two approaches: (1) keep execution history ( Spyder ) or (2) do periodic state check-pointing. Check-pointing is easy under Linux because of copy-on-write fork() system call ( RECAP and Flashback ). Can be more difficult under other OS. Check-pointing relies on the OS rather than on the applications. Shadow Honeypot runs two versions of the application (protected and non-protected) and dynamically switches between the two once an attack has been detected.
  • 9. Outline of the Talk Introduction Related Work DIRA Architecture Attack Detection Attack Identification Attack Repair Performance Evaluation Conclusion
  • 10. DIRA Approach DIRA is an extension to GCC 3.4.1. It uses memory updates logging to solve the three problems at the same time. The idea is to maintain a run-time log of all changes to the memory state of the program. Assignments such as a=b; and libc function calls such as memcpy () change the memory state of the program. For each memory update DIRA stores its source address, destination address, length, and the pre-image.
  • 11. DIRA Approach How to detect, identify, and repair an attack using memory updates log? To detect – compare the current RA with that saved in the log; To identify – trace back the data that replaced the control pointer to the point where it was read from the network; To repair – restore the memory state using the pre-images stored in the log. At compile time , DIRA instruments the source code to perform logging and to check correctness of control pointers. At run-time , the logging code generates the memory updates log.
  • 12. Memory Updates Logging Memory updates log is a circular buffer; each entry has four fields: read_addr , write_addr , len , data . DIRA logs effect of each operation of the form X=Y where X and Y are directly referenced variables, array references (a[i]), or de-referenced variables (*(a+1)). read_addr is set to &Y, write_addr is set to &X, len is set to sizeof(Y), data is set to the pre-image of X in DIR mode and is empty in other modes.
  • 13. Memory Updates Logging If the right-hand side is a complex expression then a log record is created for each variable of it. To handle updates performed by libc functions DIRA proxies several of them: string manipulation functions, format string functions, file and network I/O functions; The log is also used to store tags , special records indicating change of program’s run-time state: FUNCTION_ENTRY tag is inserted when a function is called; FUNCTION_EXIT tag is inserted before a function returns. Tags are used for signature generation and repair.
  • 14. Memory Updates Logging Example At compile time: Source code: x=y+z; Instrumented code: (log(&x, &y, sizeof(y), &x), (log(&x, &z, sizeof(z), &x), x=y+z)); At run time: log() adds two records to the memory updates log: read_addr : &y; write_addr : &x; len : sizeof(y); data : x; read_addr : &z; write_addr : &x; len : sizeof(z); data : x;
  • 15. Memory Updates Logging Example At compile time: Source code: strcpy (a,b); Instrumented code: dira_strcpy(a,b); At run time: Proxy function dira_strcpy() adds a log record: read_addr =&b, write_addr =&a, len = strlen (b)+1, data =a
  • 16. Attack Detection (D-mode)‏ DIRA uses RAD-like approach: the code to save the RA in a protected buffer is added to the function prolog. The actual RA stored in the stack is compared with this value in function epilog. Using a special buffer to store RAs is an optimization of using a common memory update log to store RAs. DIRA can protect other control-sensitive data structures such as GOT, signal handler tables in a similar fashion (not implemented yet).
  • 17. Attack Identification The desired properties of an attack signature: Context-aware (to reduce false positives); Semantics-aware (to reduce false positives); Provides a degree of flexibility within each packet (to reduce false negatives); DIRA’s signatures consist of multiple packets , each packet is a regular expression . The length constraint limits the length of the attacking part of the last packet. Memory updates log is used to build attack signatures.
  • 18. Attack Identification Two types of dependencies: data and control dependencies. A data dependency is created when one variable is assigned to another. A control dependency is created between variable X and variable Y if value of variable Y depends on the value of variable X used in a conditional expression. Example: if (x>0)‏ y=1; else y=2; Why we need control dependencies? Example: FTP server attack involving authentication.
  • 19. Vulnerable FTP Server Example A vulnerable FTP server pseudo-code: char buf [ 16 ]; Is_auth = is_user =0; // user not authenticated initially while (1) { recv_packet( p ); if (! strncmp ( p , “QUIT”,4)) break ; if (! strncmp ( p , “USER”, 4)) { is_user =1; continue ; } if (! strncmp ( p , “PASS”, 4) && is_user ) { is_auth =1; continue ; } if (! is_auth ) continue ; // authentication required if (! strncmp ( p , “GET”, 3)) { strcpy ( buf , p +4); // copy filename send_file( buf ); } }
  • 20. FTP Server Attack FTP server GET attack (3 packets): USER alexey PASS my_pass GET very_long_file_name_that_will_overwrite_the_return_address
  • 21. FTP Server Attack FTP server GET attack (3 packets): USER alexey PASS my_pass GET very_long_file_name_that_will_overwrite_the_return_address Log records: <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user>
  • 22. FTP Server Attack FTP server GET attack (3 packets): USER alexey PASS my_pass GET very_long_file_name_that_will_overwrite_the_return_address Log records: <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user> <DIRA_RECV, &p, 12, “PASS my_pass”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <DIRA_COND, &is_user, 0, NULL> <NULL, &is_auth, 4, is_auth>
  • 23. FTP Server Attack FTP server GET attack (3 packets): USER alexey PASS my_pass GET very_long_file_name_that_will_overwrite_the_return_address Log records (third packet): <DIRA_RECV, &p, 62, “GET …”> <DIRA_COND, &is_auth, 0, NULL> <DIRA_STRNCMP, &p, 3, NULL> <DIRA_COND, &p, 0, NULL> <&p+4, &buf, strlen(p)-4+1, *(p+4)>
  • 24. FTP Server Attack The return address (RA) is located after buf: RA=buf+17. <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user> <DIRA_RECV, &p, 12, “PASS my_pass”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <DIRA_COND, &is_user, 0, NULL> <NULL, &is_auth, 4, is_auth> <DIRA_RECV, &p, 62, “GET …”> <DIRA_COND, &is_auth, 0, NULL> <DIRA_STRNCMP, &p, 3, NULL> <DIRA_COND, &p, 0, NULL> <&p+4, &buf, strlen(p)-4+1, *(p+4)>
  • 25. Identifying Attack Using Data Dependencies The return address (RA) is located after buf: RA=buf+17. <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user> <DIRA_RECV, &p, 12, “PASS my_pass”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <DIRA_COND, &is_user, 0, NULL> <NULL, &is_auth, 4, is_auth> <DIRA_RECV, &p, 62, “GET …”> <DIRA_COND, &is_auth, 0, NULL> <DIRA_STRNCMP, &p, 3, NULL> <DIRA_COND, &p, 0, NULL> <&p+4, &buf, strlen(p)-4+1, *(p+4)>
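Following data dependencies backward through the log can be sketched as below. This is an illustrative model under stated assumptions, not DIRA's code: addresses are plain integers, the record type rec and the function trace_to_packet() are hypothetical, and only copy-style data dependencies are handled (control dependencies are not).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { REC_RECV, REC_COPY } rec_kind;

/* Simplified log record: a write of `len` bytes at `write_addr`,
 * sourced from `read_addr` (copies) or from the network (receives). */
typedef struct {
    rec_kind  kind;
    uintptr_t read_addr;
    uintptr_t write_addr;
    size_t    len;
    int       packet_id;   /* meaningful for REC_RECV only */
} rec;

/* Walk the log backward from the corrupted address `target`.
 * Returns the id of the packet whose byte ended up at `target`
 * (storing the byte's offset inside that packet), or -1 if none. */
int trace_to_packet(const rec *records, size_t n, uintptr_t target, size_t *offset)
{
    assert(records != NULL && offset != NULL);
    for (size_t i = n; i-- > 0; ) {
        const rec *r = &records[i];
        if (target < r->write_addr || target - r->write_addr >= r->len)
            continue;                       /* this record did not write the byte */
        size_t delta = target - r->write_addr;
        if (r->kind == REC_RECV) {
            *offset = delta;
            return r->packet_id;
        }
        target = r->read_addr + delta;      /* follow the copy to its source */
    }
    return -1;
}
```

With p and buf placed at arbitrary addresses and RA = buf+17 as on the slide, the trace passes through the strcpy record and ends at the third DIRA_RECV record, that is, at the GET packet.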
  • 26. Identifying More Packets Using Control Dependencies The return address (RA) is located after buf: RA=buf+17. <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user> <DIRA_RECV, &p, 12, “PASS my_pass”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <DIRA_COND, &is_user, 0, NULL> <NULL, &is_auth, 4, is_auth> <DIRA_RECV, &p, 62, “GET …”> <DIRA_COND, &is_auth, 0, NULL> <DIRA_STRNCMP, &p, 3, NULL> <DIRA_COND, &p, 0, NULL> <&p+4, &buf, strlen(p)-4+1, *(p+4)>
  • 31. Definition of Control Dependencies Whenever variable X can prevent control flow from reaching variable Y, a control dependency is created between X and Y. stmt1 and stmt2 are always dependent. Control dependencies are also created for for and while loops. The tags START_SCOPE and END_SCOPE are used to store control dependencies in the memory updates log.
  • 32. Representing Packets as Regular Expressions For each byte of the attacking packet, DIRA determines whether or not the program looked at it. For example, strcmp() applied to packet bytes turns them into looked-at bytes. Bytes that are blindly copied with strcpy() remain not-looked-at. Initially all bytes are not-looked-at. DIRA traverses the log forward from where the packets were received and records all packet bytes that were looked at. On output, a looked-at byte is emitted as is, and a not-looked-at byte is emitted as ‘?’.
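The looked-at bookkeeping can be sketched with a per-byte mask. The function names mark_looked_at() and build_pattern() are hypothetical, and the sketch assumes the caller already knows which byte ranges a comparison such as strncmp() touched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mark `count` bytes starting at `start` as looked at
 * (for example after a DIRA_STRNCMP record covering those bytes). */
void mark_looked_at(unsigned char *mask, size_t len, size_t start, size_t count)
{
    assert(start <= len);
    if (count > len - start)
        count = len - start;
    memset(mask + start, 1, count);
}

/* Emit looked-at bytes verbatim and all other bytes as '?'.
 * `out` must have room for len + 1 bytes. */
void build_pattern(const char *packet, const unsigned char *mask,
                   size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        out[i] = mask[i] ? packet[i] : '?';
    out[len] = '\0';
}
```

For the packet "USER alexey" with the first four bytes marked by the strncmp(p, "USER", 4) comparison, build_pattern() yields USER???????.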
  • 35. Building Regular Expressions <DIRA_RECV, &p, 11, “USER alexey”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <NULL, &is_user, 4, is_user> <DIRA_RECV, &p, 12, “PASS my_pass”> <DIRA_STRNCMP, &p, 4, NULL> <DIRA_COND, &p, 0, NULL> <DIRA_COND, &is_user, 0, NULL> <NULL, &is_auth, 4, is_auth> <DIRA_RECV, &p, 62, “GET …”> <DIRA_COND, &is_auth, 0, NULL> <DIRA_STRNCMP, &p, 3, NULL> <DIRA_COND, &p, 0, NULL> <&p+4, &buf, strlen(p)-4+1, *(p+4)>
  • 36. Length Constraint Generation The length constraint limits the attacking part of the packet by specifying the terminating character and its maximum offset in any benign packet.
  • 37. DIRA’s Signature File Format N – number of packets L_i – length of i-th packet Regular expression of the packet. Possible characters are shown on the right: The length constraint is specified for the last attacking packet.
  • 38. Complete Signature for FTP Attack
      3                # number of packets
      11               # 1st packet length
      USER???????
      12               # 2nd packet length
      PASS????????
      62               # 3rd packet length
      GET???...???
      4 17 \0          # length constraint
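How such a signature might be checked against incoming packets is sketched below. The slides do not define the fields of the length-constraint line, so this sketch assumes that "4 17 \0" means: the attacking part starts at offset 4, and a benign packet has the terminating character '\0' within 17 bytes of that offset. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the packet has the pattern's length and agrees with it
 * on every byte that is not the wildcard '?'. */
int matches_pattern(const char *pkt, size_t len, const char *pattern)
{
    assert(pkt != NULL && pattern != NULL);
    if (strlen(pattern) != len)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (pattern[i] != '?' && pattern[i] != pkt[i])
            return 0;
    return 1;
}

/* Assumed semantics: returns 1 if no terminator `term` appears within
 * `max_off` bytes of offset `start`, i.e. the packet is longer than
 * any benign packet is allowed to be. */
int violates_length_constraint(const char *pkt, size_t len,
                               size_t start, size_t max_off, char term)
{
    assert(pkt != NULL);
    for (size_t k = 0; k <= max_off; k++) {
        if (start + k >= len)
            return 0;               /* packet ends inside the allowed window */
        if (pkt[start + k] == term)
            return 0;               /* terminator found in time */
    }
    return 1;
}
```

Under these assumptions a short GET request passes the length check, while the 62-byte GET packet from the example violates it.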
  • 39. Attack Recovery (DIR-mode) Main goal: bring the program back to the state it was in before the attack packet(s) were received. How to restore the pre-attack state? From which point to continue execution? Program restart points can only be at the beginning of a function because only global updates are logged in DIR mode (for performance reasons). The proper function is the least common dynamic ancestor of the function in which the attack was detected and the function in which the data was read in.
  • 40. Choosing the Restart Point depth is a loop invariant : it is the relative depth of the current function with respect to the greatest dynamic ancestor seen so far.
  • 41. Choosing the Restart Point When all updates are tracked, it is possible to resume execution from the middle of a function. No system support is required for restarting: setjmp() and longjmp() are used. A setjmp() call is inserted before each call to a function that is a potential restart point, so that the arguments are pushed again on restart. DIRA inserts the first local update tag when it encounters such an update after a function call.
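The setjmp()/longjmp() mechanism can be sketched as follows. This is a toy model rather than DIRA's instrumentation: the attack is simulated by a flag on each request, the state rollback from the memory updates log is omitted, and serve(), handle_request(), and the counter served are names invented for the example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

jmp_buf restart_point;   /* set just before the restartable call */
int     served = 0;      /* requests handled normally */

/* Handles one request; a nonzero `malicious` flag stands in for the
 * epilog check detecting a corrupted return address. */
void handle_request(int malicious)
{
    if (malicious)
        longjmp(restart_point, 1);   /* abandon this request, go to the restart point */
    served++;
}

/* Serves all requests and returns how many times execution was restarted. */
int serve(const int *requests, size_t n)
{
    assert(requests != NULL);
    /* volatile: these are modified between setjmp() and longjmp(). */
    volatile size_t i = 0;
    volatile int repaired = 0;

    served = 0;
    if (setjmp(restart_point) != 0) {
        repaired++;   /* arrived here via longjmp() after a detected attack */
        i++;          /* skip the offending request */
    }
    for (; i < n; i++)
        handle_request(requests[i]);
    return repaired;
}
```

The server keeps running after each simulated attack, which is the property the restart point is meant to provide; a full repair would also undo the logged memory updates before jumping.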
  • 43. DIRA Evaluation Programs tested: ghttpd 1.4 – have exploit; drcatd 0.5.0 – have exploit; named 8.1 – have exploit; qpopper 4.0.4; proftpd 1.2.9. Two goals: measure the run-time overhead and the quality of automatically generated signatures. Configuration: server machine (P-4M 1.7GHz, 512 MB RAM), two clients (Athlon 1.7GHz, 512 MB RAM). Exploit programs were taken from securiteam.com and insecure.org.
  • 44. Run-time Overhead The following two graphs show run-time overhead for programs compiled in DIR-mode:
  • 45. Signature Generation Signatures were produced for all programs that we had exploits for. The ghttpd signature specifies the length constraint using a terminating character; the named signature specifies the maximum value of the length field. The drcatd signature has three packets in it: login, password, and the attacking packet.
  • 46. Is Recovery Really Useful? Recovery incurs significant overhead. Is it really better than just terminating the application? Yes, because terminating a single-threaded program disconnects all clients. The same tradeoff exists for source-code checking tools: using them requires an investment of developer time, and we could always use Stackguard instead to protect the programs.
  • 48. Conclusion DIRA solves the problems of attack detection, identification, and repair in a unified way. It produces accurate multi-packet signatures from a single attack instance. Dynamic slicing of the memory updates log is the underlying technique. The same technique can be used for automatic patch generation, which is our future work.
  • 49. Questions? http://www.ecsl.cs.sunysb.edu/dira