NASM (Assembly) - Finding Null-terminated string length (x86 Win)

AceInfinity (Emeritus, Contributor; joined Feb 21, 2012; posts: 1,728; location: Canada)

Here's a quick implementation I wrote in NASM for finding the length of a local null-terminated string without calling the C function strlen(). The only C function used in this example is printf(), which displays the result on the console.

Code:
;----------------------------------------------------
; File:
;   code.asm
;
; Description:
;   NASM implementation to find the length of a
;   local null-terminated string.
;
; nasm -f win32 code.asm -o code.o
; gcc -o main code.o
;----------------------------------------------------

extern _printf ; C function

global _main

section .data
  mystr: db 'testing', 0
  format: db 'length (%s) => %u', 10, 0

section .text
_main:
  push dword mystr
  call strlen
  add esp, byte 4       ; cdecl: caller removes the argument
  push eax              ; length returned by strlen
  push dword mystr
  push dword format
  call _printf
  add esp, byte 12      ; remove the three printf arguments
  ret

; Returns the length of the string at [esp + 4] in eax.
; Uses only eax and edx, which a cdecl callee may clobber
; (ebx is callee-saved and would need to be preserved).
strlen:
  mov edx, [esp + 4]    ; edx = pointer to the current byte
  xor eax, eax          ; eax = running length
  .next:
  cmp byte [edx], 0     ; reached the terminator?
  jz .exit_loop
  inc edx
  inc eax
  jmp .next
  .exit_loop:
  ret

Written for 32-bit Windows.
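
For anyone who reads C more easily than assembly, here is a rough C model of the same counting loop. It is only a sketch: the name count_until_nul is made up for illustration and is not part of the NASM listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of the byte-by-byte loop: walk the string and
   count until the terminating 0 byte is reached. */
size_t count_until_nul(const char *s)
{
    assert(s != NULL);
    size_t count = 0;        /* the running length */
    while (*s != '\0') {     /* compare the current byte with 0 */
        s++;                 /* advance the pointer */
        count++;             /* bump the counter */
    }
    return count;            /* the terminator itself is not counted */
}
```

Calling count_until_nul("testing") should give 7, which matches the length the example prints for mystr.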
 
I know a bit about x64 as well. There are a few key differences, especially in regard to function calls and registers.
 
Cool.

Yeah, not much of it takes a lot of thought, apart from the full 64-bit registers and operations being r-prefixed as opposed to e-prefixed. The same goes for the integer registers, I guess.

Also rip pushad/popad. Welcome our new nonvolatile overlords.
 
Actually there's a bit more to it than just that when you dig deep enough, but it's not a super hard transition once you figure a few things out.
 
You can also write,
Code:
mov ecx,cString
lea edx,[ecx+01]

repeat:
mov al,[ecx]
inc ecx
test al,al
jne repeat
sub ecx,edx
push ecx
push cFormat
call printf
pop ecx
pop ecx
ret
32 bytes used.
 
Sure, you can subtract pointers, but your implementation is still flawed because the inc instruction happens before the comparison (the test instruction in your case). A string such as "\0" will still return 1, and all other string lengths will include the null terminator in your code. :)

Code:
  mov ecx, mystr
  lea edx, [ecx]

  repeat:
  mov al,[ecx]
  test al,0FFh
  jz next
  inc ecx
  jmp repeat
  next:
  sub ecx,edx
  push ecx
  push mystr
  push format
  call _printf
  add esp, 12
  ret

I assume that you also didn't write that example on Windows? I also see that you use 'test' vs 'cmp' :)

Edit: I suppose the other thing you could do is decrement ecx after the sub instruction and keep everything else in your code as it is.
 
Ahh, I see. Also, it's an old habit of mine to use test over cmp when I can. I am writing it on Windows, though.
 
Code:
lea edx,[ecx+1]

Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S

Example isn't flawed, ecx is meant to be increased before the comparison

Yeah, I see why inc works before the comparison now. ecx holds the scanning pointer, while edx holds the start of the string PLUS 1. So even though ecx is incremented before the comparison, you don't subtract the start of the string but rather the start of the string offset by 1, which cancels out the additional inc. :) I read all of the code except for what you store in edx...

Very nice :)


Code:
extern _printf

global _main

section .data
  mystr: db '', 0
  format: db '%u', 10, 0

section .text
_main:
  mov ecx,mystr
  lea edx,[ecx+1]

  repeat:
  mov al,[ecx]
  inc ecx
  test al,al
  jne repeat
  sub ecx,edx
  push ecx
  push format
  call _printf
  pop ecx
  pop ecx
  ret
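
To make the "offset by 1 cancels the extra inc" argument concrete, here is a rough C model of the inc-before-test loop. It is only a sketch of the idea discussed above; the name len_by_pointer_diff and the variable names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of the loop that increments before testing. */
size_t len_by_pointer_diff(const char *s)
{
    assert(s != NULL);
    const char *p = s;          /* scanning pointer (ecx in the listing) */
    const char *base = s + 1;   /* start of string plus 1 (edx) */
    char c;
    do {
        c = *p;                 /* mov al,[ecx] */
        p++;                    /* inc ecx, done before the test */
    } while (c != '\0');        /* test al,al / jne repeat */
    /* p now points one past the terminator, so subtracting start + 1
       removes the extra increment: sub ecx,edx */
    return (size_t)(p - base);
}
```

For an empty string the loop body runs once, p ends at s + 1, and the difference is 0, so the terminator is not counted.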
 
Code:
lea edx,[ecx+1]

Ah, I wasn't watching this. Also, how come you're calling printf and not _printf if that is Windows? :S

Example isn't flawed, ecx is meant to be increased before the comparison

It is though, so my speculation is correct. The reason inc works before the comparison is that ecx holds the scanning pointer while edx holds the start of the string PLUS 1. So even though ecx is incremented before the comparison, you don't subtract the start of the string but rather the start of the string offset by 1, which cancels out the additional inc. Read through your code to see what I mean. :)
Yeah, I edited that out of my post ASAP since I misread what you said :S. It will return 0. Also, I usually compile my code using inline asm inside VS C++.
 
I read your response incorrectly and edited my post. Sorry!
Code:
extern _printf

global _main

section .data
  mystr: db 'test', 0
  format: db '%u', 10, 0

section .text
_main:
  mov ecx,mystr
  lea edx,[ecx+1]

  repeat:
  mov al,[ecx]
  inc ecx
  test al,0FFh
  jne repeat
  sub ecx,edx
  push ecx
  push format
  call _printf
  add esp, 8
  ret

Clever example, thanks for that by the way! :) Also... Have you looked into how Microsoft implements it for C/C++? That is interesting as well.

I usually compile my code using inline asm inside VS c++

THAT would be why, haha. I'm assembling with NASM, and the 32-bit Windows C convention prefixes function names with an underscore.

With your variant, my executable size can be reduced by 65 bytes in total by removing the separate *function* and writing the loop out in main.
 
Yeah, the way Microsoft implements it is very neat. I try to write mine with similar methods, based on what I remember of it.

Anyway, I only managed to get it down to 53 bytes total.
Code:
xor ebx,ebx        
cmp byte ptr [cArray],00
jne label1
xor eax,eax
inc eax
mov ebx,eax
jmp label2

label1:
mov eax,ebx


label2:
lea ecx,[eax+cArray]
lea edx,[ecx+01]


label3:
mov al,[ecx]
inc ecx
test al,al
jne label3
sub ecx,edx
push ecx
push offset cFormat
call dword ptr [printf]
pop ecx
pop ecx
ret
 
Actually Microsoft seems to use an algorithm which might look similar to this:
Code:
int strlen(char const *s)
{
  char const *p = s;
  int m = 0x7EFEFEFF, n = ~m, i;
  for (; (int)p & (sizeof(int) - 1); p++)
  {
    if (!*p)
    {
      return p - s;
    }
  }
  for (;;)
  {
    i = *(int const *)p;
    if (!(((i + m) ^ ~i) & n))
    {
      p += sizeof(int);
    }
    else
    {
      for (i = sizeof(int); i; p++, i--)
      {
        if (!*p)
        {
          return p - s;
        }
      }
    }
  }
}

This is similar to what I see when I set a breakpoint at the function call and step into it.

https://hiddencodes.wordpress.com/2...-studio-and-0x7efefeff-0x81010100-0x81010101/
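
To show what the 0x7EFEFEFF constant is doing, here is a small C sketch of just the word test from the code above. It assumes 32-bit words and uses unsigned arithmetic (the listing above uses int, where overflow is undefined in C); the names may_contain_zero_byte and check_examples are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0 when the 32-bit word w definitely has no zero byte, and a
   nonzero value when it MAY have one. Adding m forces a carry out of
   every nonzero byte; the bits selected by n record which carries are
   missing. */
uint32_t may_contain_zero_byte(uint32_t w)
{
    const uint32_t m = 0x7EFEFEFFu;
    const uint32_t n = ~m;            /* 0x81010100 */
    return ((w + m) ^ ~w) & n;
}

/* A few spot checks of the behaviour described above. */
void check_examples(void)
{
    assert(may_contain_zero_byte(0x64636261u) == 0);  /* no zero byte */
    assert(may_contain_zero_byte(0x00636261u) != 0);  /* zero in the top byte */
    assert(may_contain_zero_byte(0x61626300u) != 0);  /* zero in the low byte */
    assert(may_contain_zero_byte(0x80636261u) != 0);  /* false positive: top byte is 0x80 */
}
```

The last check is why the listing falls back to a byte-by-byte scan of the word: a top byte of 0x80 trips the test even though no byte is zero, so a hit only means "look closer".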
 
I mean the way the Microsoft VS compiler optimizes a SIMPLE strlen-type function, not the actual strlen function.
 
This is the case for simple strlen-type calculations.

Code:
char cArray[] = "hello";
char cFormat[] = "%d";


FARPROC hStrlen = GetProcAddress(LoadLibrary(L"ntdll.dll"), "strlen"); // GetProcAddress returns FARPROC, not HANDLE


_declspec(naked) void func1()
{
    _asm
    {   // What one might think happens
        push offset cArray
        call hStrlen
        pop ebx
        push eax
        push offset cFormat
        call printf
        pop eax
        pop eax
    }
}


_declspec(naked) void func2()
{
    printf(cFormat, strlen(cArray));
    /*What really happens
    mov ebx,offset cArray
    lea eax,[ebx+01]


    repeat:
    mov al,[ebx]
    inc ebx
    test al,al
    jne repeat
    sub ebx,eax
    push ebx
    push offset cFormat
    call printf
    pop ebx
    pop ebx
    */
}

Compiler: Visual Studio 2013
Optimization: /O1
 
I never actually looked at the variants in your post before, but the reason for the difference is that the function call is being inlined in this case.
 
