AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Apr 26, 2015 #1
Here's a quick implementation I wrote using NASM for finding the length of a local null-terminated string without calling the C function strlen(). The only C function used in this example is printf(), which displays the result on the console.
Code:
;----------------------------------------------------
; File:
;   code.asm
;
; Description:
;   NASM implementation to find the length of a
;   local null-terminated string.
;
; nasm -f win32 code.asm -o code.o
; gcc -o main code.o
;----------------------------------------------------

extern _printf ; C function
global _main

section .data
    mystr:  db 'testing', 0
    format: db 'length (%s) => %u', 10, 0

section .text
_main:
    push dword mystr        ; argument: pointer to the string
    call strlen
    add esp, byte 4         ; caller cleans up the stack
    push eax                ; length returned by strlen
    push dword mystr
    push dword format
    call _printf
    add esp, byte 12
    ret

strlen:
    mov ebx, [esp + 4]      ; ebx = pointer to the string
    mov ecx, 0              ; ecx = character count
.next:
    cmp byte [ebx], 0       ; reached the null terminator?
    jz .exit_loop
    inc ebx
    inc ecx
    jmp .next
.exit_loop:
    mov eax, ecx            ; return the count in eax
    ret

Written for 32-bit Windows.
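For readers more comfortable with C, the counting loop in the strlen routine above can be sketched roughly as follows. This is only an illustration of the same idea, not part of the original post; the name count_len is made up here so it does not clash with the C library's strlen.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): mirrors the assembly loop
   one instruction at a time. */
size_t count_len(const char *s)
{
    size_t n = 0;              /* plays the role of ecx */
    while (*s != '\0') {       /* cmp byte [ebx], 0 / jz .exit_loop */
        s++;                   /* inc ebx */
        n++;                   /* inc ecx */
    }
    return n;                  /* mov eax, ecx */
}
```

Calling count_len("testing") corresponds to the length that the assembly version passes to printf for mystr.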
Patrick Sysnative Staff Joined Jun 7, 2012 Posts 4,618 Apr 26, 2015 #2
Had no idea you knew x86-32 assembly, neat.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Apr 26, 2015 #3
I know a bit about x64 as well. There are a few key differences, especially with regard to function calls and registers.
Patrick Sysnative Staff Joined Jun 7, 2012 Posts 4,618 Apr 26, 2015 #4
Cool. Yeah, there's not too much that requires much thought, apart from the full-width registers for 64-bit operations being r-prefixed as opposed to e-prefixed. Similar with the integer registers also, I guess. Also, RIP pushad/popad. Welcome our new nonvolatile overlords.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Apr 26, 2015 #5
Actually, there's a bit more to it than just that when you dig deep enough, but it's not a super hard transition once you figure a few things out.
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 23, 2015 #7
You can also write,
Code:
    mov ecx,cString
    lea edx,[ecx+01]
repeat:
    mov al,[ecx]
    inc ecx
    test al,al
    jne repeat
    sub ecx,edx
    push ecx
    push cFormat
    call printf
    pop ecx
    pop ecx
    ret

32 bytes used.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Aug 23, 2015 #8
Sure, you can subtract pointers, but your implementation is still flawed because the inc instruction happens before the comparison (the test instruction, in your case). A string such as "\0" will still return 1, and all other string lengths will include the null terminator in your code. :)
Code:
    mov ecx, mystr
    lea edx, [ecx]
repeat:
    mov al,[ecx]
    test al,0FFh
    jz next
    inc ecx
    jmp repeat
next:
    sub ecx,edx
    push ecx
    push mystr
    push format
    call _printf
    add esp, 12
    ret

I assume that you also didn't write that example on Windows? I also see that you use 'test' vs 'cmp' :)
edit: I suppose the other thing you could do is decrement ecx after the sub instruction, and keep everything else consistent in your code.
Last edited: Aug 23, 2015
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 23, 2015 #9
Ahh, I see. Also, it's an old habit of mine to use test over cmp when I can. I am writing it on Windows, though.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Aug 23, 2015 #10
Code:
    lea edx,[ecx+1]

Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S
Quoted: "Example isn't flawed, ecx is meant to be increased before the comparison"
Yeah, I see why inc works before the comparison now: edx stores the pointer to the beginning of the string PLUS 1, so even though ecx is incremented before the comparison, you don't subtract the beginning of the string but rather the beginning of the string offset by 1, which cancels the additional inc out. :) I read all of the code except for what you store in edx... Very nice :)
Code:
extern _printf
global _main

section .data
    mystr:  db '', 0
    format: db '%u', 10, 0

section .text
_main:
    mov ecx,mystr
    lea edx,[ecx+1]
repeat:
    mov al,[ecx]
    inc ecx
    test al,al
    jne repeat
    sub ecx,edx
    push ecx
    push format
    call _printf
    pop ecx
    pop ecx
    ret

Last edited: Aug 23, 2015
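The point about the start-plus-one pointer can be sketched in C. This is an illustrative sketch only, under the assumption that the loop reads a byte, increments the pointer, and then tests the byte, as in the assembly above; both function names are hypothetical. The first variant subtracts the plain start address and therefore counts the terminator as well, which is the off-by-one that was suspected at first. The second subtracts start + 1, as the lea edx,[ecx+1] line does, and the extra increment cancels out.

```c
#include <stddef.h>

/* Hypothetical: increment-before-test loop, subtracting the plain start.
   The terminator is stepped over too, so the result is length + 1. */
size_t len_from_start(const char *s)
{
    const char *p = s;
    char c;
    do {
        c = *p++;              /* mov al,[ecx] / inc ecx */
    } while (c != '\0');       /* test al,al / jne repeat */
    return (size_t)(p - s);
}

/* Hypothetical: same loop, but subtracting start + 1 (lea edx,[ecx+1]).
   The extra increment is cancelled, so the result is the true length. */
size_t len_from_start_plus_one(const char *s)
{
    const char *p = s;
    const char *base = s + 1;  /* lea edx,[ecx+1] */
    char c;
    do {
        c = *p++;
    } while (c != '\0');
    return (size_t)(p - base); /* sub ecx,edx */
}
```

For the empty string, the first function yields 1 and the second yields 0, which matches the conclusion reached in the post above.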
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 23, 2015 #11
AceInfinity said:
    Code:
        lea edx,[ecx+1]
    Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S
    Quoted: "Example isn't flawed, ecx is meant to be increased before the comparison"
    It is though, so my speculation is correct. The reason why inc works before the comparison is that edx stores the pointer to the beginning of the string PLUS 1, so even though ecx is incremented before the comparison, you don't subtract the beginning of the string but rather the beginning of the string offset by 1, which cancels the additional inc out. Read through your code to see what I mean. :)
Yeah, I edited that out of my post ASAP since I misread what you said :S. It will return 0. Also, I usually compile my code using inline asm inside VS C++.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Aug 23, 2015 #12
I read your response incorrectly and edited my post. Sorry!
Code:
extern _printf
global _main

section .data
    mystr:  db 'test', 0
    format: db '%u', 10, 0

section .text
_main:
    mov ecx,mystr
    lea edx,[ecx+1]
repeat:
    mov al,[ecx]
    inc ecx
    test al,0FFh
    jne repeat
    sub ecx,edx
    push ecx
    push format
    call _printf
    add esp, 8
    ret

Clever example, thanks for that by the way! :) Also... have you looked into how Microsoft implements it for C/C++? That is interesting as well.
tekz said:
    I usually compile my code using inline asm inside VS c++
THAT would be why, haha. I'm using NASM to assemble. On 32-bit Windows, C function names are prefixed with an underscore. With your variant, my executable size can be reduced by 65 bytes in total by removing the *function* and writing it out in main.
Last edited: Aug 23, 2015
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 24, 2015 #13
Yeah, the way Microsoft implements it is very neat. I try to write mine with similar methods, based on what I remember of it. Anyway, I only managed to write it in 53 bytes total.
Code:
    xor ebx,ebx
    cmp byte ptr [cArray],00
    jne label1
    xor eax,eax
    inc eax
    mov ebx,eax
    jmp label2
label1:
    mov eax,ebx
label2:
    lea ecx,[eax+cArray]
    lea edx,[ecx+01]
label3:
    mov al,[ecx]
    inc ecx
    test al,al
    jne label3
    sub ecx,edx
    push ecx
    push offset cFormat
    call dword ptr [printf]
    pop ecx
    pop ecx
    ret
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Aug 25, 2015 #14
Actually, Microsoft seems to use an algorithm which might look similar to this:
Code:
int strlen(char const *s)
{
    char const *p = s;
    int m = 0x7EFEFEFF, n = ~m, i;

    /* Check byte by byte until p is aligned to sizeof(int). */
    for (; (int)p & (sizeof(int) - 1); p++) {
        if (!*p) {
            return p - s;
        }
    }

    for (;;) {
        i = *(int const *)p;
        if (!(((i + m) ^ ~i) & n)) {
            /* No zero byte in this word: skip it. */
            p += sizeof(int);
        } else {
            /* Possible zero byte: check each byte of the word. */
            for (i = sizeof(int); i; p++, i--) {
                if (!*p) {
                    return p - s;
                }
            }
        }
    }
}

This is similar to what I see when I set a breakpoint at the function call and step into it.
https://hiddencodes.wordpress.com/2...-studio-and-0x7efefeff-0x81010100-0x81010101/
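The word-sized test in that loop can be tried in isolation. The following is a minimal sketch, assuming 32-bit words and using unsigned arithmetic so the addition wraps in a well-defined way; the function name may_have_zero_byte is made up for this illustration. Adding 0x7EFEFEFF produces a carry out of every nonzero byte, a zero byte fails to produce one, and the mask 0x81010100 picks out the bit positions where a missing carry shows up.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if the 32-bit word w MAY contain
   a zero byte, and 0 if it definitely does not. */
uint32_t may_have_zero_byte(uint32_t w)
{
    const uint32_t m = 0x7EFEFEFFu;
    const uint32_t n = ~m;             /* 0x81010100 */
    return ((w + m) ^ ~w) & n;
}
```

With the bytes of "test" packed little-endian (0x74736574) the result is 0, while 0x00736574 and 0x74736500 both give a nonzero result. The test is conservative: a word whose top byte is 0x80, such as 0x80736574, is also flagged even though it has no zero byte, which is why the code above falls back to checking each byte of a flagged word.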
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 25, 2015 #15
I mean the way the Microsoft VS compiler optimizes a SIMPLE strlen-type function, not the actual strlen function.
tekz Active member Joined Aug 22, 2015 Posts 29 Aug 26, 2015 #16
This is the case for simple strlen-type calculations.
Code:
char cArray[] = "hello";
char cFormat[] = "%d";
HANDLE hStrlen = GetProcAddress(LoadLibrary(L"ntdll.dll"), "strlen");

_declspec(naked) void func1()
{
    _asm
    {
        // What one thinks happens
        push offset cArray
        call hStrlen
        pop ebx
        push eax
        push offset cFormat
        call printf
        pop eax
        pop eax
    }
}

_declspec(naked) void func2()
{
    printf(cFormat, strlen(cArray));
    /* What really happens
        mov ebx,offset cArray
        lea edx,[ebx+01]    ; start + 1, kept out of eax because al is used below
    repeat:
        mov al,[ebx]
        inc ebx
        test al,al
        jne repeat
        sub ebx,edx
        push ebx
        push offset cFormat
        call printf
        pop ebx
        pop ebx
    */
}

Compiler: Visual Studio 2013
Optimization: /O1
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Aug 27, 2015 #17
Ahh, I thought you were talking about the actual implementation.
AceInfinity Emeritus, Contributor Joined Feb 21, 2012 Posts 1,728 Location Canada Oct 22, 2015 #18
I never actually looked at the variants in your post before, but the reason for the difference is that the strlen() call is getting inlined in this case.