NASM (Assembly) - Finding Null-terminated string length (x86 Win)

AceInfinity · Apr 26, 2015

Here's a quick implementation I wrote in NASM for finding the length of a local null-terminated string without calling the C function strlen(). The only C function used in this example is printf(), which displays the result on the console.

Code:

;----------------------------------------------------
; File:
;   code.asm
;
; Description:
;   NASM implementation to find the length of a
;   local null-terminated string.
;
; nasm -f win32 code.asm -o code.o
; gcc -o main code.o
;----------------------------------------------------

extern _printf ; C function

global _main

section .data
  mystr: db 'testing', 0
  format: db 'length (%s) => %u', 10, 0

section .text
_main:
  push dword mystr        ; argument: pointer to the string
  call strlen
  add esp, byte 4         ; cdecl: the caller removes the argument
  push eax                ; length returned by strlen
  push dword mystr
  push dword format
  call _printf
  add esp, byte 12        ; remove the three printf arguments
  ret

strlen:
  mov edx, [esp + 4]      ; edx = string pointer (caller-saved, unlike ebx)
  mov ecx, 0              ; ecx = running length
  .next:
  cmp byte [edx], 0       ; reached the null terminator?
  jz .exit_loop
  inc edx
  inc ecx
  jmp .next
  .exit_loop:
  mov eax, ecx            ; return the length in eax
  ret

Written for 32-bit Windows.
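
For comparison, the same counting loop can be sketched in C. This is only an illustrative equivalent of the assembly routine above and not part of the original program. The name `strlen_counter` is made up here to avoid clashing with the C library's strlen().

```c
#include <stddef.h>

/* Hypothetical C equivalent of the assembly strlen routine above:
   walk the string one byte at a time and count until the null terminator. */
size_t strlen_counter(const char *s)
{
    size_t count = 0;       /* plays the role of ecx */

    while (*s != '\0') {    /* cmp byte [ptr], 0 / jz .exit_loop */
        s++;                /* inc ptr */
        count++;            /* inc ecx */
    }
    return count;           /* mov eax, ecx */
}
```

With the string from the example, strlen_counter("testing") returns 7, and an empty string returns 0.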

Patrick · Apr 26, 2015

Had no idea you knew x86-32 assembly, neat.

AceInfinity · Apr 26, 2015

I know a bit about x64 as well. There are a few key differences, especially in regard to function calls and registers.

Patrick · Apr 26, 2015

Cool.

Yeah, there's not much that requires a lot of thought beyond the full registers and 64-bit operations being r-prefixed as opposed to e-prefixed. Similar with the integer registers also, I guess.

Also rip pushad/popad. Welcome our new nonvolatile overlords.

AceInfinity · Apr 26, 2015

Actually there's a bit more to it than just that when you dig deep enough, but it's not a super hard transition once you figure a few things out.

Patrick · Apr 26, 2015

Yep, pretty much.

tekz · Aug 23, 2015

You can also write,

Code:

mov ecx,cString
lea edx,[ecx+01]

repeat:
mov al,[ecx]
inc ecx
test al,al
jne repeat
sub ecx,edx
push ecx
push cFormat
call printf
pop ecx
pop ecx
ret

32 bytes used.

AceInfinity · Aug 23, 2015

Sure, you can subtract pointers, but your implementation is still flawed because the inc instruction happens before the comparison (the test instruction in your case). A string such as "\0" will still return 1, and all other string lengths will include the null terminator in your code. :)

Code:

  mov ecx, mystr
  lea edx, [ecx]

  repeat:
  mov al,[ecx]
  test al,0FFh
  jz next
  inc ecx
  jmp repeat
  next:
  sub ecx,edx
  push ecx
  push mystr
  push format
  call _printf
  add esp, 12
  ret

I assume that you also didn't write that example on Windows? I also see that you use 'test' vs 'cmp'. :)

Edit: I suppose the other thing you could do is decrement ecx after the sub instruction and keep everything else in your code the same.
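
On the 'test' vs 'cmp' point: for a null check, the zero flag ends up the same whichever form is used. Here is a small C model of the three variants seen in this thread, as a sketch only. The function names are invented for illustration, and only the zero flag is modelled, not the other flags.

```c
#include <stdint.h>

/* Zero flag after `test al, al`: set when (al AND al) == 0. */
int zf_test_self(uint8_t al) { return (al & al) == 0; }

/* Zero flag after `test al, 0FFh`: set when (al AND 0xFF) == 0. */
int zf_test_mask(uint8_t al) { return (al & 0xFF) == 0; }

/* Zero flag after `cmp al, 0`: set when (al - 0) == 0. */
int zf_cmp_zero(uint8_t al) { return (uint8_t)(al - 0) == 0; }
```

All three agree for every possible byte value, so the choice between them comes down to habit and encoding size rather than behaviour.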

tekz · Aug 23, 2015

Ahh, I see. Using test over cmp when I can is an old habit of mine. I am writing it on Windows, though.

AceInfinity · Aug 23, 2015

Code:

lea edx,[ecx+1]

Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S

tekz said:
Example isn't flawed, ecx is meant to be increased before the comparison

Yeah, I see why inc works before the comparison now. edx holds the pointer to the beginning of the string PLUS 1, so even though ecx is incremented before the comparison, you don't subtract the beginning of the string but rather the beginning of the string offset by 1, which cancels out the additional inc. :) I read all of the code except for what you store in edx...

Very nice :)

Code:

extern _printf

global _main

section .data
  mystr: db '', 0
  format: db '%u', 10, 0

section .text
_main:
  mov ecx,mystr
  lea edx,[ecx+1]

  repeat:
  mov al,[ecx]
  inc ecx
  test al,al
  jne repeat
  sub ecx,edx
  push ecx
  push format
  call _printf
  pop ecx
  pop ecx
  ret
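
To make the cancellation concrete, here is a rough C rendering of the increment-before-test loop. It is a sketch for illustration only: `strlen_ptrdiff`, `base` and `p` are names made up here, with `p` standing in for ecx and `base` for edx.

```c
#include <stddef.h>

/* Hypothetical C rendering of the increment-before-test loop.
   'base' plays the role of edx (start + 1), 'p' the role of ecx. */
size_t strlen_ptrdiff(const char *s)
{
    const char *p = s;
    const char *base = s + 1;   /* lea edx, [ecx+1] */
    char c;

    do {
        c = *p;                 /* mov al, [ecx] */
        p++;                    /* inc ecx (before the test) */
    } while (c != '\0');        /* test al, al / jne repeat */

    /* p now points one past the terminator, so p - s would be length + 1.
       Subtracting base (s + 1) cancels the extra increment. */
    return (size_t)(p - base);  /* sub ecx, edx */
}
```

For the empty string the loop runs once, p ends up equal to base, and the result is 0. For 'test' it returns 4.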

tekz · Aug 23, 2015

AceInfinity said:
Code:

lea edx,[ecx+1]

Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S

Example isn't flawed, ecx is meant to be increased before the comparison

It is though, so my speculation is correct. The reason inc works before the comparison is that edx holds the pointer to the beginning of the string PLUS 1, so even though ecx is incremented before the comparison, you don't subtract the beginning of the string but rather the beginning of the string offset by 1, which cancels out the additional inc. Read through your code to see what I mean. :)

Yeah, I edited that out of my post ASAP since I misread what you said :S. It will return 0. Also, I usually compile my code using inline asm inside VS C++.

AceInfinity · Aug 23, 2015

I read your response incorrectly and edited my post. Sorry!

Code:

extern _printf

global _main

section .data
  mystr: db 'test', 0
  format: db '%u', 10, 0

section .text
_main:
  mov ecx,mystr
  lea edx,[ecx+1]

  repeat:
  mov al,[ecx]
  inc ecx
  test al,0FFh
  jne repeat
  sub ecx,edx
  push ecx
  push format
  call _printf
  add esp, 8
  ret

Clever example, thanks for that by the way! :) Also... Have you looked into how Microsoft implements it for C/C++? That is interesting as well.

tekz said:
I usually compile my code using inline asm inside VS c++

THAT would be why haha. I'm using NASM to assemble. The 32-bit Windows C convention means function names are prefixed with an underscore.

With your variant, my executable size can be reduced by 65 bytes in total, removing the *function* and writing it out in main.

tekz · Aug 24, 2015

Yeah, the way Microsoft implements it is very neat. I try to write mine with similar methods, based on what I remember from it.

Anyway, I only managed to write it in 53 bytes total.

Code:

xor ebx,ebx        
cmp byte ptr [cArray],00
jne label1
xor eax,eax
inc eax
mov ebx,eax
jmp label2

label1:
mov eax,ebx


label2:
lea ecx,[eax+cArray]
lea edx,[ecx+01]


label3:
mov al,[ecx]
inc ecx
test al,al
jne label3
sub ecx,edx
push ecx
push offset cFormat
call dword ptr [printf]
pop ecx
pop ecx
ret

AceInfinity · Aug 25, 2015

Actually Microsoft seems to use an algorithm which might look similar to this:

Code:

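#include <stddef.h>
#include <stdint.h>

size_t strlen(char const *s)
{
  char const *p = s;
  /* unsigned, so that i + m wraps instead of overflowing a signed int */
  unsigned m = 0x7EFEFEFF, n = ~m, i;
  /* check byte by byte until p is aligned to sizeof(int) */
  for (; (uintptr_t)p & (sizeof(int) - 1); p++)
  {
    if (!*p)
    {
      return p - s;
    }
  }
  for (;;)
  {
    /* read one aligned word at a time */
    i = *(unsigned const *)p;
    if (!(((i + m) ^ ~i) & n))
    {
      /* no zero byte in this word */
      p += sizeof(int);
    }
    else
    {
      /* possible zero byte: check each byte of the word */
      for (i = sizeof(int); i; p++, i--)
      {
        if (!*p)
        {
          return p - s;
        }
      }
    }
  }
}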

This is similar to what I see when I set a breakpoint at the function call and step into it.

https://hiddencodes.wordpress.com/2...-studio-and-0x7efefeff-0x81010100-0x81010101/
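
To show what the 0x7EFEFEFF constant is doing, here is the word test from that function pulled out on its own. This is a sketch that assumes 32-bit words. `may_contain_zero_byte`, `magic` and `holes` are names made up for this example and do not come from Microsoft's code.

```c
#include <stdint.h>

/* Sketch of the word-at-a-time test from the function above.
   Returns nonzero when the 32-bit word MAY contain a zero byte.
   Assumes 32-bit words; the name is made up for this example. */
int may_contain_zero_byte(uint32_t w)
{
    const uint32_t magic = 0x7EFEFEFFu;  /* "holes" (0 bits) at bits 8, 16, 24, 31 */
    const uint32_t holes = ~magic;       /* 0x81010100: just the hole bits */

    /* Adding magic makes every non-zero byte carry into the hole above it.
       A hole that is left unchanged by the addition means no carry arrived,
       which suggests a zero byte below it. */
    return (((w + magic) ^ ~w) & holes) != 0;
}
```

For example, 0x41414141 ("AAAA") gives 0, while 0x41410041 gives a nonzero result. The test can also fire on a word whose top byte is 0x80 even though no byte is zero, which is why the function above rechecks byte by byte before returning.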

tekz · Aug 25, 2015

I mean the way the Microsoft VS compiler optimizes a SIMPLE strlen-type function, not the actual strlen function.

tekz · Aug 26, 2015

This is the case for simple strlen-type calculations.

Code:

char cArray[] = "hello";
char cFormat[] = "%d";


FARPROC hStrlen = GetProcAddress(LoadLibrary(L"ntdll.dll"), "strlen"); // GetProcAddress returns FARPROC, not HANDLE


_declspec(naked) void func1()
{
    _asm
    {   // What one thinks happens
        push offset cArray
        call hStrlen
        pop ebx
        push eax
        push offset cFormat
        call printf
        pop eax
        pop eax
    }
}


_declspec(naked) void func2()
{
    printf(cFormat, strlen(cArray));
    /*What really happens
    mov ebx,offset cArray
    lea eax,[ebx+01]


    repeat:
    mov al,[ebx]
    inc ebx
    test al,al
    jne repeat
    sub ebx,eax
    push ebx
    push offset cFormat
    call printf
    pop ebx
    pop ebx
    */
}

Compiler: Visual Studio 2013
Optimization: /O1

AceInfinity · Aug 27, 2015

Ahh, I thought you were talking about the actual implementation.

AceInfinity · Oct 22, 2015

I never actually looked at the variants in your post before, but the reason for the difference is that the strlen call is getting inlined in this case.
