Resources | developer.brewmp.com Resources | developer.brewmp.com

Developer

resources

Use of string functions

Care must be taken when using many of the C/C++ string functions, which can significantly degrade performance if not used carefully.

The strxxx() functions may feel like primitive operations, but string functions can take a fair amount of time. The following are things to watch out for when using these functions.

Use only string functions that accept size or length as one of the parameters

Buffer-overrun is a common error that occurs with string operations. These type of errors occur during runtime execution and usually cause the device to crash, making debugging tricky. Best practice is to always choose the version of the function that accepts the buffer size or length as one of its parameters. This will help reduce the chances of buffer-overrun in your code. For example, use SNPRINTF() and STRNCPY() rather than SPRINTF() and STRCPY().

Do not use strlen() to check for an empty string

In this example, strlen() is used to check whether a string is empty:

boolean naive_IsEmptyString(char *psz)
{ 
   return (0 == strlen(psz));
}

If the string is not empty, not only is the overhead of a call and return incurred, but the entire string is scanned.

The following example is faster (even in the case of an empty string):

boolean better_IsEmptyString(char *psz)
{ 
   return ('\0' == *psz);
}
The table below shows the 6290 timings of naive_IsEmptyString() and better_IsEmptyString() on a non-empty string and an empty string. Note that the better_IsEmptyString() function is faster for both.
String naive example better example ratio
"fs:/mod/perffilesystem/openfile/f_0.dat" 1.28 us 0.16 us 8.0
"" (empty string) 0.22 us 0.16 us 1.4

Getting the length of a string

Much string-related code seems to start by getting the length of all strings involved. Before doing this, think about whether you really need that length; a simple change of algorithm might allow you to operate on the strings by just walking through them until you hit a terminating '\0'. For example, a naive implementation of strstr might be written as the following:

char *naive_strstr(char *haystack, char *needle)
{
   int haystackLen = STRLEN(haystack);
   int needleLen = STRLEN(needle);
   char *haystackEnd = haystack + haystackLen - needleLen;
   for ( ; haystack <= haystackEnd ; ++haystack)
   {
      if (0 == STRNCMP(haystack, needle, needleLen))
      {
         return haystack;
      }
   }
   return NULL;
}

The following example is faster, because it walks the haystack string (which could be quite long) only once and the needle string one less time:

char *better_strstr(char *haystack, char *needle)
{
   for ( ; '\0' != *haystack ; ++haystack)
   {
      char *ph;
      char *pn;
      for (ph = haystack, pn = needle ; ; ++ph, ++pn)
      {
         if ('\0' == *pn) { return haystack; }
         if ('\0' == *ph) { return NULL; }
         if (*ph != *pn)  { break; }
      }
   }

   return NULL;
}

The following table shows 6290 timings for naive_strstr() and better_strstr() with a haystack value of "fs:/mod/perffilesystem/openfile/f_0.dat":
string naive example better example ratio
strstr(haystack, "qrstu") 6.04 us 3.67 us 1.6
strstr(haystack, "fs:/") 1.86 us 0.51 us 3.6

Compute the length of a string only once

If the length of a string is required, compute and store it once rather than calling strlen() on the same string multiple times. On a 6290, the following code takes 1.28 us to execute, so each avoided call speeds up the code up by that much.

strlen("fs:/mod/perffilesystem/openfile/f_0.dat");

Instead of the following code:

char *strdup(char *string)
{
   char *copy = (char *)malloc(strlen(string) + 1); 
   strlcpy(copy, string, strlen(string) + 1); 
   return copy;
}

Use the following:

char *strdup(char *string)
{
   int len = strlen(string); 
   char *copy = (char *)malloc(len + 1);
   strlcpy(copy, string, len + 1); return copy;
}

Do not call strlen() on a literal string

Do not call strlen() on literal strings (strings defined with #define or inline). The length of a literal does not change and is known at compile time. Instead, use (int)(sizeof(string)-1) or an appropriately defined macro.

Instead of the following code:

if (0 !=
strncmp(pszFilePath, "fs:/", strlen("fs:/"))) ...

Use the following:

#define CSTRLEN(s)
(int)((sizeof(s)-1)if (0 != strncmp(pszFilepath, "fs:/",
CSTRLEN("fs:/"))) ...

Understanding the semantics of strlcpy() and strlcat()

Use strlcpy() and strlcat() rather than strcpy() and strcat(). It is helpful to understand the semantics of strlcpy() and strlcat().

int strlcpy(char *pszDst, const char *cpszSrc, int nDestSize)

In addition to guarding against buffer overflow and ensuring that the destination string is null-terminated, strlcpy() returns the length of cpszSrc; this can be checked against nDestSize to determine whether the source string was truncated. If the string was not truncated (if the return value is less than nDestSize), the return value can be used as an index to point to the end of the string (the terminating null) in pszDst.

int strlcat(char *pszDst, const char *cpszSrc, int nDestSize)

Note that, unlike strncat() which takes the number of characters to be copied (usually the remaining space in the buffer) as the third parameter, nDestSize parameter to strlcat() is the total size of the pszDst buffer. The return value of strlcat() is strlen(pszDst) + strlen(cpszSrc), the length of the concatenated string assuming an infinitely large destination buffer. This can be checked against the destination buffer size to determine whether there was truncation and, assuming no truncation, find the null-terminator in the destination buffer.

Strlcat() has to walk the destination buffer until it finds the end of the string already in the buffer. A series of calls to strlcat() on the same buffer will unnecessarily scan the destination buffer multiple times. More efficient ways of doing this are shown in the examples below.

Example:

  • Buggy and inefficient: The following is similar to real-world code in Brew MP. It not only scans many strings multiple times (making it inefficient), it doesn't take into account the difference in semantics of the size parameter between strncat()and strlcat(), passing the remaining buffer size rather than the total buffer size. This code might create a truncated string when there is still plenty of space in the buffer.
    strlcpy(dest, m_psz1, MAX_STRING_LEN);
    strlcat(dest, ":", MAX_STRING_LEN - strlen(m_psz1));
    strlcat(dest, m_psz2, MAX_STRING_LEN - (strlen(m_psz1) + 1));
    strlcat(dest, "=", MAX_STRING_LEN - (strlen(m_psz1) + 1 + strlen(m_psz2));
    
  • Not buggy and more efficient: Passing the proper size value (the destination buffer size) to strlcat() removes the bug, and the code becomes more efficient without the multiple strlen()calls:
    strlcpy(dest, m_psz1, MAX_STRING_LEN);
    strlcat(dest, ":", MAX_STRING_LEN);
    strlcat(dest, m_psz2, MAX_STRING_LEN)
    strlcat(dest, "=", MAX_STRING_LEN);
    
  • Even more efficient: avoid scanning the destination buffer multiple times to find the end of the string before concatenating by using the return value of strlcpy(). This points to the null-terminator and determine the size of the remaining buffer and use strlcpy() instead:
    int length = strlcpy(dest, m_psz1, MAX_STRING_LEN);
    if (length >= MAX_STRING_LEN) goto bail;   // no more room
    
    length += strlcpy(dest + length, ":", MAX_STRING_LEN - length);
    if (length >= MAX_STRING_LEN) goto bail;   // no more room
    
    length += strlcpy(dest + length, m_psz2, MAX_STRING_LEN - length);
    if (length >= MAX_STRING_LEN) goto bail;   // no more room
      
    strlcpy(dest + length, "=", MAX_STRING_LEN - length);
    

The following table shows 6290 timings of the three examples shown above with m_psz1 = "123456789" and m_psz2 = "abcdefghijklmn"
buggy and inefficient not buggy and more efficient even more efficient
6.24 us 4.87 us 3.15 us

Example 2

The following code example:

strlcpy(dest + strlen(dest), source, MAX_STRING_LEN - strlen(dest));

is the same as the following example:

strlcat(dest, source, MAX_STRING_LEN); 

Do not make multiple strchr() calls, walk the string instead

When checking if one of a set of characters is in a string, rather than calling strchr() or strstr() on the string for each character of interest, consider walking through the string once, checking each character against the characters in the set.

The following is a poor example of checking whether a file is to be opened in writeable mode:

boolean naive_iswriteable(char *pszCap)
{
   return std_strstr(pszCap, "rw") ||
          std_strstr(pszCap, "c") ||
          std_strstr(pszCap, "t") ||
          std_strstr(pszCap, "a");
}

The following code is a better example. The check for "rw" really only needs to be a check for "w", since that's what designates writeable.

boolean better_iswriteable(char *pszCap)
{
   for (char *psz = pszCap ; '\0' != *psz ; ++psz)
   {
      char c = *psz;
      if ('w' == c || 'c' == c || 't' == c || 'a' == c)
      {
         return TRUE;
      }
   }
   return FALSE;
}
The following table shows the 6290 timings of naive_iswriteable() and better_iswriteable():
pszCap naive better ratio
"Dr" 1.17 us 0.28 us 4.2
"a" 0.80 us 0.20 us 4.0

Try a state-machine for string matching

When checking a string, consider a state-machine rather than a bunch of calls to strstr(), strbegins(), or strends(). Using a state machine rather than checking a file path using string functions to see if it is resolved in IFileMgr_OpenFile() results in a 10% speed-up of the open call (because of a 14x speed-up of the check).

This example:

boolean IsCleanPath(const char *cpszName)
{
   if (STRSTR(cpszName, "/../") || STRSTR(cpszName, "/./") ||
       STRENDS(cpszName,"/..")  || STRENDS(cpszName,"/.") ||
       STRBEGINS(cpszName, "../") || STRBEGINS(cpszName, "./") ||       
       !STRCMP(cpszName, "..") || !STRCMP(cpszName, ".")) {
      return FALSE;
   } else {
      return TRUE;
   }
}

can be replaced by the following:

// Define a state machine for the string 'p' with initial state 's'
#define  STATE_MACHINE(p, s)     \
   const char  *psz = p;         \
   goto  STATE_##s;              \

// Define a state within the machine.  This reads the next character
// and either transitions or terminates based on the character.
#define  STATE(s)             \
STATE_##s:                    \
   switch (*psz++)

// Transition to state 's' on character 'c'
#define  TRANSITION(c, s)     \
   case c: goto STATE_##s

// Transition to state 's' on any character not already handled
#define  OTHER_TRANSITION(s)  \
   default: goto STATE_##s

// Terminate with result 'r' on character 'c'
#define  TERMINAL(c, r)       \
   case c: return r

// Terminate with result 'r' on any character not already handled
#define  OTHER_TERMINAL(r)    \
   default: return r


boolean IsCleanPath (const char *pszPath)
{
   STATE_MACHINE(pszPath, BEGIN)
   {
      STATE(BEGIN)   // ^
      {
         TRANSITION('.', BD);
         TRANSITION('/', S);
         OTHER_TRANSITION(ANY);
      }

      STATE(BD)      // ^.
      {
         TRANSITION('.', BDD);
         TERMINAL('/', FALSE);
         TERMINAL('\0', FALSE);
         OTHER_TRANSITION(ANY);
      }

      STATE(BDD)     // ^..
      {
         TERMINAL('\0', FALSE);
         TERMINAL('/', FALSE);
         OTHER_TRANSITION(ANY);
      }

      STATE(ANY)
      {
         TERMINAL('\0', TRUE);
         TRANSITION('/', S);
         OTHER_TRANSITION(ANY);
      }

      STATE(S)       // /
      {
         TRANSITION('.', SD);
         TERMINAL('\0', TRUE);
         OTHER_TRANSITION(ANY);
      }

      STATE(SD)      // /.
      {
         TERMINAL('\0', FALSE);
         TERMINAL('/', FALSE);
         TRANSITION('.', SDD);
         OTHER_TRANSITION(ANY);
      }

      STATE(SDD)     // /..
      {
         TERMINAL('\0', FALSE);
         TERMINAL('/', FALSE);
         OTHER_TRANSITION(ANY);
      }
   }
}

The following table shows 6290 timings of the IsCleanPath() example and the state machine version of IsCleanPath().
pszPath strstr state machine ratio
"fs:/mod/perffilesystem/openfile/f_0.dat" 11.31 us 1.42 us 8.0