Wednesday, November 18, 2009

Parsing a String into Tokens Using sscanf

Dave Sinkula | Jun 4th, 2005, 4:23 pm
Many times strtok is recommended for parsing a string; I don't care for strtok. Why?

* It modifies the incoming string, so it cannot be used with string literals or other constant strings.
* The identity of the delimiting character is lost.
* It uses a static buffer while parsing, so it's not reentrant.
* It does not correctly handle "empty" fields -- that is, where two delimiters are back-to-back and meant to denote the lack of information in that field.

This snippet shows a way to use sscanf to parse a string into fields delimited by a character (a semicolon in this case, but commas or tabs or others could be used as well).

Thanks to figo2476 for pointing out an issue with a previous version!
Thanks to dwks for asking why not to use strtok.

#include <stdio.h>

int main(void)
{
   const char line[] = "2004/12/03 12:01:59;info1;info2;info3";
   const char *ptr = line;
   char field [ 32 ];
   int n;
   while ( sscanf(ptr, "%31[^;]%n", field, &n) == 1 )
   {
      printf("field = \"%s\"\n", field);
      ptr += n; /* advance the pointer by the number of characters read */
      if ( *ptr != ';' )
      {
         break; /* didn't find an expected delimiter, done? */
      }
      ++ptr; /* skip the delimiter */
   }
   return 0;
}

/* my output
field = "2004/12/03 12:01:59"
field = "info1"
field = "info2"
field = "info3"
*/


sanushks | Oct 10th, 2008

Hi,

This code does not work if there are successive delimiters with no info between them, for example:
2007/09/15 12:34:23;;info1;info2;
The only output is 2007/09/15 12:34:23
The rest of the string is ignored.

Please check

To get around successive delimiters, replace the code on line 17 of the snippet (the statement "++ptr; /* skip the delimiter */") with:

      while ( *ptr == ';' )
      {
         ++ptr; /* skip the delimiter */
      }

======================================================================
ArkM (IS/IT--Management)
5 Aug 05 0:33

Never use strtok in this context:

char* p = "string literal";
...strtok(p,...

String literals in C/C++ must be treated as constants (modifying one is undefined behavior), but strtok modifies its argument.

Reference: http://www.tek-tips.com/viewthread.cfm?qid=1102104&page=33
