Category Archives: C++

Base64 Encoder and Boost

Well, I was looking for a Base64 library recently, and I thought, “I know, I bet it is in Boost, I have Boost, and Boost has EVERYTHING.” And it turns out that it does! Kind of. But it’s a bit odd and sadly incomplete.

So let’s fix it.

To start with, I didn’t really have a clue how to plug these Boost components together, so a quick scout on the beloved Stack Overflow (a Programmer’s Guide to the Galaxy) yielded the following code, which I have slightly modified:

using namespace boost::archive::iterators;

typedef 
  insert_linebreaks<         // insert line breaks every 76 characters
    base64_from_binary<    // convert binary values to base64 characters
      transform_width<   // retrieve 6 bit integers from a sequence of 8 bit bytes
        const unsigned char *
        ,6
        ,8
        >
      > 
      ,76
    > 
  base64Iterator; // compose all the above operations into a new iterator

Disgusting, isn’t it? It also doesn’t work. It will only work when the length of your data is a multiple of three bytes, so we’ll have to pad it ourselves. Here’s the full code for that:

#include <string>
#include <vector>

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/transform_width.hpp>

namespace Base64Utilities
{
  std::string ToBase64(std::vector<unsigned char> data)
  {
    using namespace boost::archive::iterators;

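    // Nothing to encode; this also avoids taking &data[0] of an empty vector
    if(data.empty())
    {
      return std::string();
    }

    // Pad with 0 until a multiple of 3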
    unsigned int paddedCharacters = 0;
    while(data.size() % 3 != 0)
    {
      paddedCharacters++;
      data.push_back(0x00);
    }

    // Crazy typedef black magic
    typedef 
      insert_linebreaks<         // insert line breaks every 76 characters
        base64_from_binary<    // convert binary values to base64 characters
          transform_width<   // retrieve 6 bit integers from a sequence of 8 bit bytes
            const unsigned char *
            ,6
            ,8
            >
          > 
          ,76
        > 
        base64Iterator; // compose all the above operations into a new iterator

    // Encode the buffer and create a string
    std::string encodedString(
      base64Iterator(&data[0]),
      base64Iterator(&data[0] + (data.size() - paddedCharacters)));

    // Add '=' for each padded character used
    for(unsigned int i = 0; i < paddedCharacters; i++)
    {
      encodedString.push_back('=');
    }

    return encodedString;
  }
}

It’s not that elegant but it seems to work. Can you improve on this code? Can you write the decode function? Check your answers with this excellent Online Base64 Converter.
Leave your comments below!
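
As one possible answer to the decode question, here is a minimal sketch that does not use Boost at all. The namespace Base64Sketch and the functions DecodeChar and FromBase64 are names I have made up for illustration. It assumes the standard alphabet (A-Z, a-z, 0-9, + and /), skips the line breaks the encoder inserts, and treats the first '=' as the end of the data.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace Base64Sketch
{
  // Map one Base64 character to its 6 bit value, or -1 if it is not in the alphabet
  inline int DecodeChar(char c)
  {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
  }

  inline std::vector<unsigned char> FromBase64(const std::string& encoded)
  {
    std::vector<unsigned char> decoded;
    unsigned int buffer = 0; // bits collected so far that have not been emitted
    int bitsInBuffer = 0;

    for (std::size_t i = 0; i < encoded.size(); ++i)
    {
      const char c = encoded[i];
      if (c == '\r' || c == '\n') continue; // line breaks added by the encoder
      if (c == '=') break;                  // padding marks the end of the data

      const int value = DecodeChar(c);
      if (value < 0)
      {
        throw std::invalid_argument("not a Base64 character");
      }

      // Append 6 bits, then emit a byte as soon as 8 bits are available
      buffer = (buffer << 6) | static_cast<unsigned int>(value);
      bitsInBuffer += 6;
      if (bitsInBuffer >= 8)
      {
        bitsInBuffer -= 8;
        decoded.push_back(static_cast<unsigned char>((buffer >> bitsInBuffer) & 0xFF));
        buffer &= (1u << bitsInBuffer) - 1u; // keep only the leftover bits
      }
    }

    return decoded;
  }
}
```

Any leftover bits at the end are the zero padding the encoder added, so they are simply dropped. A stricter decoder would also check that the input length and the number of '=' characters are consistent.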

A macro to flip between the source and header file (and back again)

The macro journey begins here, moving a function from a header file into its source file. The first problem presents itself as this: how can I get at the source file from the header file?

I have never known why this functionality has not been present in Visual Studio; perhaps it is harder than it appears! Still, Visual Assist has managed it, and a few people seem to want it. So, let’s add it!

The easiest thing to do is to take the filename in full, check if it ends in .cpp or .h, and then replace with .h or .cpp to get the paired source file. This is basically what the following code does (more or less based on the code found here).

    Private Function GetCorrespondingFilename(ByRef currentFilename As String) As String
        Dim correspondingFilename As String = String.Empty

        ' Match the full extension, dot included, so a name that merely ends in "h" is not treated as a header
        If (currentFilename.EndsWith(".cpp", StringComparison.InvariantCultureIgnoreCase)) Then
            correspondingFilename = Left(currentFilename, Len(currentFilename) - 3) + "h"
        ElseIf (currentFilename.EndsWith(".h", StringComparison.InvariantCultureIgnoreCase)) Then
            correspondingFilename = Left(currentFilename, Len(currentFilename) - 1) + "cpp"
        End If

        Return correspondingFilename

    End Function

This does have a few limitations though: what if the header is a .hpp file? What if the source file is a .cxx or .c file? What if the header and source files live in different folders?
These are all very interesting questions that I am going to ignore for now and go for the simplest case. Your headers and source live in the same folder (I am assuming). You use rigorous .h and .cpp naming standards for your header and source files respectively. Ah, life is easy.
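
For anyone who does want to tackle the extension questions, one approach is to build a list of candidate filenames and open the first one that exists. Here is a rough sketch of the candidate-building half, written in C++ rather than macro VB. The function name CorrespondingCandidates and the extension lists are my own assumptions, and it still expects both files to live in the same folder.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the macro): list the filenames that could
// be the counterpart of a header or source file, most likely candidate first.
inline std::vector<std::string> CorrespondingCandidates(const std::string& filename)
{
  static const char* const headerExtensions[] = { ".h", ".hpp", ".hxx" };
  static const char* const sourceExtensions[] = { ".cpp", ".cxx", ".cc", ".c" };
  const std::size_t headerCount = sizeof(headerExtensions) / sizeof(headerExtensions[0]);
  const std::size_t sourceCount = sizeof(sourceExtensions) / sizeof(sourceExtensions[0]);

  std::vector<std::string> candidates;

  // The extension starts at the last dot, as long as that dot belongs to the
  // file name and not to one of the folder names
  const std::size_t dot = filename.find_last_of('.');
  const std::size_t slash = filename.find_last_of("/\\");
  if (dot == std::string::npos || (slash != std::string::npos && dot < slash))
  {
    return candidates;
  }

  const std::string stem = filename.substr(0, dot);
  std::string extension = filename.substr(dot);
  for (std::size_t i = 0; i < extension.size(); ++i)
  {
    extension[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(extension[i])));
  }

  bool isHeader = false;
  for (std::size_t i = 0; i < headerCount; ++i)
  {
    if (extension == headerExtensions[i]) isHeader = true;
  }

  bool isSource = false;
  for (std::size_t i = 0; i < sourceCount; ++i)
  {
    if (extension == sourceExtensions[i]) isSource = true;
  }

  // A header maps to every source extension and vice versa
  if (isHeader)
  {
    for (std::size_t i = 0; i < sourceCount; ++i) candidates.push_back(stem + sourceExtensions[i]);
  }
  else if (isSource)
  {
    for (std::size_t i = 0; i < headerCount; ++i) candidates.push_back(stem + headerExtensions[i]);
  }

  return candidates;
}
```

The caller would then walk the list and open the first candidate that exists on disk. Files that are neither headers nor sources produce an empty list, so nothing gets opened.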

Where I will deviate from the linked post is how we open the document. The original open method can be slow if you have many files open in Visual Studio, and if the file does not exist it will be created, which can be nice but is not usually what we want. Here is the function to get the corresponding source file and open it:

    Public Sub ToggleBetweenHeaderAndSource()
        If (ActiveDocument Is Nothing) Then
            Return
        End If

        Dim otherFilename = GetCorrespondingFilename(ActiveDocument.FullName)

        If FileIO.FileSystem.FileExists(otherFilename) Then
            Application.Documents.Open(otherFilename, "Text")
        End If
    End Sub

We check whether ActiveDocument Is Nothing because the errors that Visual Studio throws at us when we run this macro with no file open aren’t very nice. We then get the corresponding filename and open it if it exists. Great!

We’ll be building on this macro next time to automatically generate an empty function body in the source file, all from the header definition. That is when things really get interesting.

Download Macro: SourceCodeHelpers1

Can you improve upon this macro? Leave your suggestions and comments below!

An introduction to three VC++ Macros: How they came to be

I’ve recently been working on a C++ project and ran into a task where I had to move many inline header functions out of the header and into the corresponding .cpp file.

Visually, I wanted to convert:
Biscuit.h

class Biscuit
{
public:
    virtual void Taste(int chompiness) { m_ChompRating = chompiness; }
private:
    int m_ChompRating;
};

into
Biscuit.h

class Biscuit
{
public:
    virtual void Taste(int chompiness);
private:
    int m_ChompRating;
};

Biscuit.cpp

void Biscuit::Taste(int chompiness)
{
    m_ChompRating = chompiness;
}

We’ve all been there. It is usually an exercise in our copy and paste skills, and an opportunity to work on our RSI. I thought there must be a better way! So I scoured the Internet but no one seems to talk about it. It seems every programmer has just accepted that the only way to get this done is to stop mucking about on Google and push your fingers to the keys.

I was still not satisfied. I asked the collective brains of Stack Overflow and the only solution I got was to purchase Visual Assist. I’ve tried Visual Assist. It is a brilliant product. It is also a brilliantly expensive product. Nawaz, from Stack Overflow (he is the one that looks like a baby) challenged me to write a macro myself, and share it, because “it wouldn’t be too difficult to write”.

Hah! It was not that difficult to get something that worked some of the time. It was difficult to create something that worked all of the time. I knew nothing about macros, other than how to record them and see the generated code. Finally, after a bit of reverse engineering, a tiny bit of reading and some code stealing (sorry, inspired coding), I have a set of macros I am quite happy with. Over my next three blog posts I will share these macros with you, and on the way we will build up a few good utilities:

  1. How to flip between the header and cpp file
  2. Automatically adding an implementation skeleton to the cpp file from the header definition
  3. Moving a function defined inline in the header into the cpp file

And hopefully some Googler will be able to make use of these macros. See you soon.

A tokeniser using STL

Using STL and its find functions, you can write a simple and extremely useful tokenise method.

#include <string>
#include <vector>

std::vector<std::wstring> Tokenise(const std::wstring& stringToTokenise, const std::wstring& delimiters)
{
	std::vector<std::wstring> tokens;
	size_t startPos = 0; 
	size_t endPos = 0;
	std::wstring token;

	// Get the tokens
	while(startPos != std::wstring::npos)
	{
		// Find the start of the next token, beginning from the last one found
		startPos = stringToTokenise.find_first_not_of(delimiters, endPos);
		// Find the end of the next token, beginning from the one just found
		endPos = stringToTokenise.find_first_of(delimiters, startPos);

		// If a token wasn't found, don't try to extract it
		if(startPos != std::wstring::npos)
		{
			tokens.push_back(stringToTokenise.substr(startPos, endPos - startPos));
		}
	}

	return tokens;
}

Usage of this function is nice:

std::vector<std::wstring> strings = 
    Tokenise(L"Custard Creams;Jaffa Cakes;Hobnobs", L";");

This will return a vector of crumbly deliciousness:

strings[0] = L"Custard Creams"
strings[1] = L"Jaffa Cakes"
strings[2] = L"Hobnobs"

You can also specify multiple delimiters for the same result:

std::vector<std::wstring> strings = 
    Tokenise(L"Custard Creams?Jaffa Cakes!Hobnobs:)", L"?!:)");

One thing it does not do is return an empty string for cases like this:

std::vector<std::wstring> strings = 
    Tokenise(L";;", L";");

This returns an empty vector, but in a strange parallel world it could return 3 empty strings. Should it? I will need convincing.
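
For the curious, here is a rough sketch of what the parallel-world version might look like. The name TokeniseKeepEmpty is made up, and the behaviour is only an assumption about what "keeping empty tokens" should mean: every delimiter ends a token, so two delimiters in a row produce an empty string between them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant of Tokenise that keeps empty tokens
std::vector<std::wstring> TokeniseKeepEmpty(const std::wstring& stringToTokenise, const std::wstring& delimiters)
{
	std::vector<std::wstring> tokens;
	std::size_t startPos = 0;

	for(;;)
	{
		// Every delimiter closes the current token, even if the token is empty
		const std::size_t endPos = stringToTokenise.find_first_of(delimiters, startPos);
		if(endPos == std::wstring::npos)
		{
			// No more delimiters: whatever is left (possibly nothing) is the last token
			tokens.push_back(stringToTokenise.substr(startPos));
			break;
		}

		tokens.push_back(stringToTokenise.substr(startPos, endPos - startPos));
		startPos = endPos + 1;
	}

	return tokens;
}
```

With this version, L";;" gives three empty strings, and an empty input gives a single empty string rather than an empty vector. Whether that is more useful than the original depends on whether the position of each token matters to you, as it would in a CSV row.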

Your ADO is broken

I found today that Microsoft has violated the holy rules of COM and broken their msado15.dll. The violation occurs after the installation of Windows 7 Service Pack 1 (version 6.1.7601.17514).

What’s happened is Microsoft has changed a few function signatures due to a 64-bit problem, and changed the class IDs while they were at it. Thus, backwards compatibility has been broken, making COM absolutely useless. Executables compiled on Windows 7 SP1 will now only run on Windows 7 SP1. We’ll have to assume that any other alternatives were carefully thought through and resulted in the world ending, because at face value this looks like a terrible fix.

In this epic saga spanning five months, we’ve still not reached a warm fuzzy conclusion. The best solution seems to be reverting the change made in the service pack and creating a new set of 64-bit functions. This would probably break all ADO code compiled in the last five months, but at least the last 15 years would work again.

In the meantime, here are the official workarounds:

  1. Use late binding. Because, you know, type safety and efficiency aren’t all that.
  2. Uninstall Service Pack 1. I think they are laughing at us.
  3. Temporary workaround for C++ developers. Yay!

So we’ll go for the only attractive option, #3. Details are here: http://support.microsoft.com/kb/2517589, but it basically boils down to installing and registering a file called “Msado60_Backcompat”, which is a truly ironic name if you remember what the whole point of COM is.

Except I’d deviate a little from the recommended steps, because forcing your team to install and register these files is a bit inefficient, and playing around in your common files location is plain nasty. There’s also no indication of how to swap them in and out for 32-bit and 64-bit builds. So here’s my lazy “it-just-works” method for Visual Studio 2010:

  1. In the 3rd party area of your source control, create a branch for Microsoft and Ado. Then branch off for the 64-bit and 32-bit versions, i.e.


    3rd Party
    =>Microsoft
    ====>Ado
    ======>x86
    ======>x64

  2. Download the 64 and 32 bit .tlb files, place them in their respective folders and rename them both to “msado60_Backcompat.tlb”.
  3. Check these files in.
  4. Open up properties for your project, and under C/C++ add the path to your tlb files.
    [Screenshot: the Visual Studio 2010 project properties page, with the path to the new ADO .tlb file added to the Additional Directories field.]

    Don’t forget Release settings!
  5. Do the same for your 64 bit version.
  6. Change the #import "msado15.dll" part in your code to #import "msado60_Backcompat.tlb"
  7. Recompile, build, wait for the next hotfix to come along and destroy your working programs.

Meanwhile, your programs will just work using COM black magic that nobody really understands.

I can only assume everyone at Microsoft who understands COM, and who could have prevented this fairly fundamental error from occurring, has gone mad, senile or died. I know they are keen to get us all writing .NET code (I hear that’s what the cool kids are doing these days), but if no one understands the technology that .NET is built on, we’re in trouble. I imagine in the near future, mythical COM developers will be held in great esteem like a lost Mayan civilisation, revered as demi-gods with alien intelligence and magical powers. Or starving as unemployable geeks crying into their RSI-riddled hands.