WebBiscuit

The Webbiscuit Blog: Taking the Internet crumb by crumb…

A tokeniser using STL

Posted in C++, code, STL by Daniel
Aug 23 2011

Using the STL string class and its find functions (find_first_not_of and find_first_of), you can write a simple and extremely useful tokenise function.

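#include <string>
#include <vector>

// Splits stringToTokenise on any character in delimiters; empty tokens are skipped.
std::vector<std::wstring> Tokenise(const std::wstring& stringToTokenise, const std::wstring& delimiters)
{
	std::vector<std::wstring> tokens;
	size_t startPos = 0;
	size_t endPos = 0;
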
	// Get the tokens
	while(startPos != std::wstring::npos)
	{
		// Find the start of the next token, beginning from the last one found
		startPos = stringToTokenise.find_first_not_of(delimiters, endPos);
		// Find the end of the next token, beginning from the one just found
		endPos = stringToTokenise.find_first_of(delimiters, startPos);

		// If a token wasn't found, don't try to extract it
		if(startPos != std::wstring::npos)
		{
			tokens.push_back(stringToTokenise.substr(startPos, endPos - startPos));
		}
	}

	return tokens;
}

Using the function is pleasantly simple:

std::vector<std::wstring> strings = 
    Tokenise(L"Custard Creams;Jaffa Cakes;Hobnobs", L";");

This will return a vector of crumbly deliciousness:

strings[0] = L"Custard Creams"
strings[1] = L"Jaffa Cakes"
strings[2] = L"Hobnobs"

You can also specify multiple delimiters and get the same result:

std::vector<std::wstring> strings = 
    Tokenise(L"Custard Creams?Jaffa Cakes!Hobnobs:)", L"?!:)");
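
To check that claim rather than take it on trust, here is a minimal, self-contained sketch. It repeats the Tokenise function so it compiles on its own, and adds a helper called CheckBiscuitExamples, which is a hypothetical name of mine and not part of the original snippet. It assumes a C++11 compiler (for the initialiser list) and that asserts are enabled, i.e. NDEBUG is not defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Same function as in the post, repeated so this sketch compiles on its own.
std::vector<std::wstring> Tokenise(const std::wstring& stringToTokenise, const std::wstring& delimiters)
{
	std::vector<std::wstring> tokens;
	std::size_t startPos = 0;
	std::size_t endPos = 0;

	while(startPos != std::wstring::npos)
	{
		// Skip any run of delimiters, then find where the token ends
		startPos = stringToTokenise.find_first_not_of(delimiters, endPos);
		endPos = stringToTokenise.find_first_of(delimiters, startPos);

		if(startPos != std::wstring::npos)
		{
			tokens.push_back(stringToTokenise.substr(startPos, endPos - startPos));
		}
	}

	return tokens;
}

// Hypothetical helper (not from the post): asserts that the single-delimiter
// and multiple-delimiter calls both produce the same three biscuits.
void CheckBiscuitExamples()
{
	const std::vector<std::wstring> expected = { L"Custard Creams", L"Jaffa Cakes", L"Hobnobs" };

	assert(Tokenise(L"Custard Creams;Jaffa Cakes;Hobnobs", L";") == expected);
	assert(Tokenise(L"Custard Creams?Jaffa Cakes!Hobnobs:)", L"?!:)") == expected);
}
```

Note that the trailing ":)" adds nothing to the result: both characters are delimiters, and a run of delimiters is skipped as a whole.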

One thing it does not do is return an empty string for cases like this:

std::vector<std::wstring> strings = 
    Tokenise(L";;", L";");

This returns an empty vector, but in a strange parallel world it could return 3 empty strings. Should it? I will need convincing.
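
For anyone living in that parallel world, here is a rough sketch of what the alternative could look like. TokeniseKeepEmpty and CheckKeepEmpty are hypothetical names of mine, not part of the original function, and this is only one possible reading of "keep the empty strings": every single delimiter character ends a token, so adjacent delimiters produce empty tokens between them. It assumes C++11 and that asserts are enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant: every delimiter ends a token, so empty tokens are kept.
std::vector<std::wstring> TokeniseKeepEmpty(const std::wstring& stringToTokenise, const std::wstring& delimiters)
{
	std::vector<std::wstring> tokens;
	std::size_t startPos = 0;

	while(true)
	{
		// Find the next delimiter at or after the start of the current token
		const std::size_t endPos = stringToTokenise.find_first_of(delimiters, startPos);

		if(endPos == std::wstring::npos)
		{
			// No more delimiters: the rest of the string is the last token
			tokens.push_back(stringToTokenise.substr(startPos));
			break;
		}

		tokens.push_back(stringToTokenise.substr(startPos, endPos - startPos));
		startPos = endPos + 1; // Step over exactly one delimiter
	}

	return tokens;
}

// Hypothetical helper: checks the behaviour discussed in the post.
void CheckKeepEmpty()
{
	const std::vector<std::wstring> threeEmpties = { L"", L"", L"" };
	assert(TokeniseKeepEmpty(L";;", L";") == threeEmpties);

	const std::vector<std::wstring> biscuits = { L"Custard Creams", L"Jaffa Cakes", L"Hobnobs" };
	assert(TokeniseKeepEmpty(L"Custard Creams;Jaffa Cakes;Hobnobs", L";") == biscuits);
}
```

One consequence of this design is that an empty input string yields a single empty token rather than an empty vector, which may or may not be what you want.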
