0. Background

As usual, we start this topic with a simple case and some code. Suppose we need one person’s information, for whatever purpose, perhaps only to print it out. What would we do?

struct Person
{
    std::string Name;
    std::string NickName;
    std::string Address;
    std::string Org;
    std::string Gender;
    int Age;
    int Height;
    int Weight;

    // Other information
    // ...
};

Then, wherever the data comes from (a database, user input, a file, some computation, etc.), when we need to use it, for example to print it out, we write code like this:

void PrintPersonInformation(const Person& person)
{
    printf("Name: [%s]\n", person.Name.c_str());
    printf("NickName: [%s]\n", person.NickName.c_str());
    printf("Address: [%s]\n", person.Address.c_str());
    printf("Org: [%s]\n", person.Org.c_str());
    printf("Gender: [%s]\n", person.Gender.c_str());
    printf("Age: [%d]\n", person.Age);
    printf("Height: [%d]\n", person.Height);
    printf("Weight: [%d]\n", person.Weight);

    // Print other information
    // ...
}

Really simple and straightforward, but here comes the problem. We list only 8 properties of ‘Person’ here, and by design each property needs its own printf call. What happens if there are hundreds or thousands of properties? Do we really need an individual line of code for the same operation on every property? Why can’t we use something like a ‘for’ loop to cover the duplicated code?

The problem is that C++ gives us no way to loop over the properties of a struct/class. The properties of a struct are laid out in declaration order in memory, so we could walk the memory according to each property’s length, as shown below (integer values only in this example). But that code is tricky to read, understand and debug, because the programmer has to know the memory details of every property type and the compiler’s alignment and padding rules, even though the performance is attractive.

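struct Person
{
    int Age;
    int Height;
    int Weight;
};

void PrintPersonInformation(const Person& person)
{
    // Treat the struct as an array of int. This relies on every member
    // being an int with no padding in between, which the language does
    // not guarantee in general.
    const int* intAddIter = reinterpret_cast<const int*>(&person);
    for (int i = 0; i < 3; i++)
    {
        printf("Value: [%d]\n", intAddIter[i]);
    }
}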
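
To see why walking a struct by raw offsets is fragile, the following sketch asks the compiler where each member really sits once the member types differ. `Mixed`, `SumOfMemberSizes` and `PrintLayout` are hypothetical names introduced only for this illustration, and the numbers printed depend on the compiler and platform, so no fixed output is claimed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical struct with members of different sizes (not from the original example).
struct Mixed
{
    char   Gender;
    int    Age;
    double Height;
};

// Sum of the member sizes, ignoring any padding the compiler may insert.
inline std::size_t SumOfMemberSizes()
{
    return sizeof(char) + sizeof(int) + sizeof(double);
}

// Print the actual layout; the offsets come from the compiler via offsetof.
inline void PrintLayout()
{
    std::printf("sizeof(Mixed): %zu, sum of members: %zu\n",
                sizeof(Mixed), SumOfMemberSizes());
    std::printf("offset of Gender: %zu\n", offsetof(Mixed, Gender));
    std::printf("offset of Age:    %zu\n", offsetof(Mixed, Age));
    std::printf("offset of Height: %zu\n", offsetof(Mixed, Height));
}
```

On many common platforms sizeof(Mixed) comes out larger than the sum of its members because of padding, and a hand-written pointer walk has to account for exactly that.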

1. Solution

So what we really need is a container whose elements can all be looped over. STL containers such as ‘vector’ and ‘map’ meet this requirement, but each of them holds only one element type, and the containers themselves do not mix element types. So we have to implement a ‘DataItemProperty’ object that supports all, or at least most, element types, with a getter, a setter and a type flag that identifies which element type it holds.

A property’s name matters too, so a std::map keyed by ‘std::string’ can distinguish the properties. The struct ‘Person’ can then be defined and declared as follows:

    typedef std::map<std::string, DataItemProperty> DataItem;
    DataItem Person;

The usage should also be as simple as possible, on both the getter and the setter side.

    Person[std::string("Name")] = DataItemProperty("zZED");
    Person[std::string("Age")] = DataItemProperty(30);
    std::string srcName = Person[std::string("Name")];
    int srcAge = Person[std::string("Age")];

One more thing is worth mentioning here. std::map’s operator[] does not fail on a missing key; it silently inserts a default-constructed element, which holds no valid value. So the key should be checked before the element is read, like this:

if (Person.find("Name") != Person.end())
{
    std::string srcName = Person[std::string("Name")];
}
else
{
    // log and error handling
}
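
Reading a missing key through operator[] inserts a value-initialized element into a std::map, while find leaves the map untouched. The sketch below shows the difference on a plain `std::map<std::string, int>` so that it stays self-contained; `IntMap`, `FindOrDefault` and `SizeAfterBracketRead` are hypothetical helpers, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, int> IntMap;

// Look up a key without modifying the map; return fallback when it is absent.
inline int FindOrDefault(const IntMap& m, const std::string& key, int fallback)
{
    IntMap::const_iterator it = m.find(key);
    return (it != m.end()) ? it->second : fallback;
}

// Works on a copy of the map: reads key through operator[] and reports
// how many elements the copy holds afterwards.
inline std::size_t SizeAfterBracketRead(IntMap m, const std::string& key)
{
    int ignored = m[key];   // inserts key with value 0 if it was missing
    (void)ignored;
    return m.size();
}
```

With a map that holds only "Age", `FindOrDefault(m, "Neme", -1)` returns the fallback and leaves one element, whereas `SizeAfterBracketRead(m, "Neme")` reports two.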

But a typo made while coding can be really confusing and hard to debug if we use std::string directly as the map’s key, because the error cannot be caught at build time. For example:

Person["Neme"] = DataItemProperty("zZED");

// Processing

if (Person.find("Name") != Person.end())
{
    std::string srcName = Person[std::string("Name")];
}
else
{
    // log and error handling
}

Here we typed ‘Neme’ instead of ‘Name’. No compile error occurs, but it is a bug. To avoid this kind of issue, a pre-defined enum (int) is preferred as the map’s key:

enum PERSON_PROPERTY
{
	PERSON_NAME		= 0,
	PERSON_NICK		= 1,
	PERSON_ADD		= 2,
	PERSON_ORG		= 3,
	PERSON_GENDER		= 4,
	PERSON_AGE		= 5,
	PERSON_HEIGHT		= 6,
	PERSON_WEIGHT		= 7,

	// More information
	// ...
	PERSON_PROPERTY_UNKNOWN = 0xFF
};

Then we could use it as follows,

Person[PERSON_NAME] = DataItemProperty("zZED");

// Processing

if (Person.find(PERSON_NAME) != Person.end())
{
    std::string srcName = Person[PERSON_NAME];
}
else
{
    // log and error handling
}

Any spelling error is now caught at build time, where it is easy to locate and fix.

When we need the key as a string, we simply stringify it:

std::string StringifyPersonProperty(PERSON_PROPERTY property)
{
	switch (property)
	{
		case PERSON_NAME:
			return std::string("NAME");
		case PERSON_NICK:
			return std::string("NICKNAME");
		case PERSON_ADD:
			return std::string("ADDRESS");
		case PERSON_ORG:
			return std::string("ORGANIZATION");
		case PERSON_GENDER:
			return std::string("GENDER");
		case PERSON_AGE:
			return std::string("AGE");
		case PERSON_HEIGHT:
			return std::string("HEIGHT");
		case PERSON_WEIGHT:
			return std::string("WEIGHT");
		default:
			break;
	}
	// Reaching this point means the key has no name assigned.
	assert(false);
	return std::string("");
}
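
A hand-written switch like this has to be updated every time the enum grows. One way to keep the two from drifting apart is the X-macro technique, which is a swapped-in alternative and not what the code in this post does. In the sketch below, the list macro `PERSON_PROPERTY_LIST`, the `sketch` namespace, `PERSON_PROPERTY_COUNT` and the function `Stringify` are all hypothetical names.

```cpp
#include <cassert>
#include <string>

// Single list of (enumerator, text) pairs. Both the enum and the
// string lookup below are generated from it.
#define PERSON_PROPERTY_LIST(X)      \
    X(PERSON_NAME,   "NAME")         \
    X(PERSON_NICK,   "NICKNAME")     \
    X(PERSON_ADD,    "ADDRESS")      \
    X(PERSON_ORG,    "ORGANIZATION") \
    X(PERSON_GENDER, "GENDER")       \
    X(PERSON_AGE,    "AGE")          \
    X(PERSON_HEIGHT, "HEIGHT")       \
    X(PERSON_WEIGHT, "WEIGHT")

namespace sketch
{
    // Expands to: PERSON_NAME, PERSON_NICK, ..., PERSON_WEIGHT, PERSON_PROPERTY_COUNT
    enum PERSON_PROPERTY
    {
#define X(id, text) id,
        PERSON_PROPERTY_LIST(X)
#undef X
        PERSON_PROPERTY_COUNT
    };

    // Expands to one "case id: return text;" per list entry.
    inline std::string Stringify(PERSON_PROPERTY property)
    {
        switch (property)
        {
#define X(id, text) case id: return text;
            PERSON_PROPERTY_LIST(X)
#undef X
            default: return "";
        }
    }
}
```

Adding a property then means adding one line to the list, and the enum and the names stay in step.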

2. Implementation of DataItemProperty

Here comes the point: how do we implement a class that supports all data types? C++ has no universal type like ‘Object’ in Java or C#, so we have to declare a specific type for the class member that stores the data in class DataItemProperty. Here I introduce only the simplest implementation, which is to declare an individual class member for every data type that will be used, as below. It is not the best solution because it costs additional memory. A member ‘DataType’ is also needed to indicate which data type is stored.

class DataItemProperty
{
public:
	enum DATA_TYPE
	{
		DATA_TYPE_UINT8		=	0,
		DATA_TYPE_UINT32	=	1,
		DATA_TYPE_UINT64	=	2,
		DATA_TYPE_INT64		=	3,
		DATA_TYPE_INT		=	4,
		DATA_TYPE_DOUBLE	= 	5,
		DATA_TYPE_STR		=	6,
		DATA_TYPE_NOT_SUPPORTED	=	0xFF,
	};
private:
        // Members for each data type
	uint8_t		U8Value;
	uint32_t	U32Value;
	uint64_t	U64Value;
	int64_t		Int64Value;
	int		IntValue;
	double		DoubleValue;
	std::string	StrValue;

        // Data type indicator
	DATA_TYPE	DataType;
};

Only the types listed above are supported; if another type is required, a class member of that type can be added. Then we need to overload the constructor for each supported data type.

public:
	// Constructors
        DataItemProperty();
	DataItemProperty(const int& value);
	DataItemProperty(const uint8_t& value);
	DataItemProperty(const uint32_t& value);
	DataItemProperty(const uint64_t& value);
	DataItemProperty(const int64_t& value);
	DataItemProperty(const double& value);
	DataItemProperty(const std::string& value);
	DataItemProperty(const DataItemProperty& dataItemProperty);

Then the conversion (cast) operators are required:

	// Operators
	operator uint8_t();
	operator uint32_t();
	operator uint64_t();
	operator int64_t();
	operator int();
	operator double();
	operator std::string();
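
To show how such conversion operators behave at the call site, here is a minimal two-type sketch. `MiniProperty`, `AgePlusOne` and `Greeting` are hypothetical names, and the class is deliberately cut down to int and std::string; it is an illustration, not the full DataItemProperty.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-type version of the idea, small enough to read at a glance.
class MiniProperty
{
public:
    enum Type { TYPE_INT, TYPE_STR };

    MiniProperty(int value) : IntValue(value), DataType(TYPE_INT) {}
    MiniProperty(const std::string& value)
        : IntValue(0), StrValue(value), DataType(TYPE_STR) {}

    // Each conversion operator checks the tag before handing out its member.
    operator int() const         { assert(DataType == TYPE_INT); return IntValue; }
    operator std::string() const { assert(DataType == TYPE_STR); return StrValue; }

private:
    int         IntValue;
    std::string StrValue;
    Type        DataType;
};

// The target type of the initialization selects the operator.
inline int AgePlusOne(const MiniProperty& age)
{
    int value = age;            // calls operator int()
    return value + 1;
}

inline std::string Greeting(const MiniProperty& name)
{
    std::string value = name;   // calls operator std::string()
    return "Hello, " + value;
}
```

The caller never names the stored type; the tag check inside each operator catches a mismatch in a debug build.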

Sometimes we do not want to create a temporary variable explicitly just to use the DataItemProperty, for example when printing it out as in the case at the very beginning. An interface to get the value is needed, but the problem is that in C++ a function’s return type must be declared and functions cannot be overloaded on return type alone, so how could we return class members of different types? I chose the powerful type ‘void*’ and return the address of the member that matches ‘DataType’. An interface to get the DataType is also required.

	// Interface
	void* Value();
	DATA_TYPE GetDataType();
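
A void* carries no type information, so the caller has to pair it with the type flag before casting. The following sketch shows that pairing in isolation; `SketchType` and `FormatValue` are hypothetical stand-ins for DATA_TYPE and for a consumer of Value(), reduced to two types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for DataItemProperty::DATA_TYPE.
enum SketchType { SKETCH_INT, SKETCH_STR, SKETCH_NOT_SUPPORTED };

// Turn a (tag, address) pair back into text. The tag is the only thing
// that says which type the void* really points to.
inline std::string FormatValue(SketchType type, const void* value)
{
    if (value == NULL)
    {
        return "<no value>";
    }
    switch (type)
    {
        case SKETCH_INT:
        {
            char buffer[32];
            std::snprintf(buffer, sizeof(buffer), "%d",
                          *static_cast<const int*>(value));
            return buffer;
        }
        case SKETCH_STR:
            return *static_cast<const std::string*>(value);
        default:
            return "<unsupported>";
    }
}
```

Casting to a type that does not match the tag still compiles but is undefined behavior, so the switch on the tag and the cast have to agree.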

Finally, DataItem and DataItemIter should be defined.

typedef std::map<PERSON_PROPERTY, DataItemProperty> DataItem;
typedef std::map<PERSON_PROPERTY, DataItemProperty>::iterator DataItemIter;

One more thing: to simplify printing in cases like the one at the beginning, some macro definitions are needed. With those, we have the full definition of DataItemProperty in ‘DataItemProperty.h’:

// DataItemProperty.h

#ifndef DATA_ITEM_PROPERTY_H
#define DATA_ITEM_PROPERTY_H

#include <stdio.h>
#include <typeinfo>
#include <string>
#include <map>
#include <inttypes.h>
#include <iostream>
#include "assert.h"

// Trace helpers. Each macro assumes the caller has already checked GetDataType().
// ", ## __VA_ARGS__" is a GNU extension (accepted by GCC and Clang) that drops
// the comma when no extra arguments are passed.
#define STR_DATA_TRACE(dataItemProperty, fmt, ...)	\
	printf("[DataItemProperty]\t[String]\t Value: [%s]\t" fmt "\n", \
	static_cast<std::string*>((dataItemProperty).Value())->c_str(), ## __VA_ARGS__)

#define INT_DATA_TRACE(dataItemProperty, fmt, ...)	\
	printf("[DataItemProperty]\t[Int]\t\t Value: [%d]\t" fmt "\n", \
	*static_cast<int*>((dataItemProperty).Value()), ## __VA_ARGS__)

#define U32_DATA_TRACE(dataItemProperty, fmt, ...)	\
	printf("[DataItemProperty]\t[U32]\t\t Value: [%04X]\t" fmt "\n", \
	*static_cast<uint32_t*>((dataItemProperty).Value()), ## __VA_ARGS__)

enum PERSON_PROPERTY
{
	PERSON_NAME	= 0,
	PERSON_NICK	= 1,
	PERSON_ADD	= 2,
	PERSON_ORG	= 3,
	PERSON_GENDER	= 4,
	PERSON_AGE	= 5,
	PERSON_HEIGHT	= 6,
	PERSON_WEIGHT	= 7,

	// More information
	// ...
	PERSON_PROPERTY_UNKNOWN = 0xFF
};

class DataItemProperty
{
public:
	enum DATA_TYPE
	{
		DATA_TYPE_UINT8		=	0,
		DATA_TYPE_UINT32	=	1,
		DATA_TYPE_UINT64	=	2,
		DATA_TYPE_INT64		=	3,
		DATA_TYPE_INT		=	4,
		DATA_TYPE_DOUBLE	= 	5,
		DATA_TYPE_STR		=	6,
		DATA_TYPE_NOT_SUPPORTED	=	0xFF,
	};

public:
	// Constructors
        DataItemProperty();
	DataItemProperty(const int& value);
	DataItemProperty(const uint8_t& value);
	DataItemProperty(const uint32_t& value);
	DataItemProperty(const uint64_t& value);
	DataItemProperty(const int64_t& value);
	DataItemProperty(const double& value);
	DataItemProperty(const std::string& value);
	DataItemProperty(const DataItemProperty& dataItemProperty);

	// Operators
	operator uint8_t();
	operator uint32_t();
	operator uint64_t();
	operator int64_t();
	operator int();
	operator double();
	operator std::string();

	// Interface
	void* Value();
	DATA_TYPE GetDataType();

private:
	uint8_t		U8Value;
	uint32_t	U32Value;
	uint64_t	U64Value;
	int64_t		Int64Value;
	int		IntValue;
	double		DoubleValue;
	std::string	StrValue;
	DATA_TYPE	DataType;
};

typedef std::map<PERSON_PROPERTY, DataItemProperty> DataItem;
typedef std::map<PERSON_PROPERTY, DataItemProperty>::iterator DataItemIter;

#endif

Then we can implement ‘DataItemProperty’ in ‘DataItemProperty.cpp’ as follows:

#include "DataItemProperty.h"

DataItemProperty::DataItemProperty()
	:U8Value(0),
	 U32Value(0),
	 U64Value(0),
	 Int64Value(0),
	 IntValue(0),
	 DoubleValue(0),
	 StrValue(""),
	 DataType(DATA_TYPE_NOT_SUPPORTED)
{
}

DataItemProperty::DataItemProperty(const int& value)
	:IntValue(value),
	 DataType(DATA_TYPE_INT)
{
}

DataItemProperty::DataItemProperty(const uint8_t& value)
	:U8Value(value),
	 DataType(DATA_TYPE_UINT8)
{
}

DataItemProperty::DataItemProperty(const uint32_t& value)
	:U32Value(value),
	 DataType(DATA_TYPE_UINT32)
{
}

DataItemProperty::DataItemProperty(const uint64_t& value)
	:U64Value(value),
	 DataType(DATA_TYPE_UINT64)
{
}

DataItemProperty::DataItemProperty(const int64_t& value)
	:Int64Value(value),
	 DataType(DATA_TYPE_INT64)
{
}

DataItemProperty::DataItemProperty(const double& value)
	:DoubleValue(value),
	 DataType(DATA_TYPE_DOUBLE)
{
}

DataItemProperty::DataItemProperty(const std::string& value)
	:StrValue(value),
	 DataType(DATA_TYPE_STR)
{
}

DataItemProperty::DataItemProperty(const DataItemProperty& dataItemProperty)
	:DataType(dataItemProperty.DataType)
{
	switch (dataItemProperty.DataType)
	{
		case DATA_TYPE_UINT8:
			U8Value = dataItemProperty.U8Value;
		break;
		case DATA_TYPE_UINT64:
			U64Value = dataItemProperty.U64Value;
		break;
		case DATA_TYPE_UINT32:
			U32Value = dataItemProperty.U32Value;
		break;
		case DATA_TYPE_INT64:
			Int64Value = dataItemProperty.Int64Value;
		break;
		case DATA_TYPE_INT:
			IntValue = dataItemProperty.IntValue;
		break;
		case DATA_TYPE_DOUBLE:
			DoubleValue = dataItemProperty.DoubleValue;
		break;
		case DATA_TYPE_STR:
			StrValue = dataItemProperty.StrValue;
		break;
	}
}

DataItemProperty::operator uint8_t()
{
	assert(this->DataType == DATA_TYPE_UINT8);
	return this->U8Value;
}

DataItemProperty::operator uint32_t()
{
	assert(this->DataType == DATA_TYPE_UINT32);
	return this->U32Value;
}

DataItemProperty::operator uint64_t()
{
	assert(this->DataType == DATA_TYPE_UINT64);
	return this->U64Value;
}

DataItemProperty::operator int64_t()
{
	assert(this->DataType == DATA_TYPE_INT64);
	return this->Int64Value;
}

DataItemProperty::operator int()
{
	assert(this->DataType == DATA_TYPE_INT);
	return this->IntValue;
}

DataItemProperty::operator double()
{
	assert(this->DataType == DATA_TYPE_DOUBLE);
	return this->DoubleValue;
}

DataItemProperty::operator std::string()
{
	assert(this->DataType == DATA_TYPE_STR);
	return this->StrValue;
}

void* DataItemProperty::Value()
{
	switch (DataType)
	{
		case DATA_TYPE_UINT8:
			return &U8Value;
		case DATA_TYPE_UINT64:
			return &U64Value;
		case DATA_TYPE_UINT32:
			return &U32Value;
		case DATA_TYPE_INT64:
			return &Int64Value;
		case DATA_TYPE_INT:
			return &IntValue;
		case DATA_TYPE_DOUBLE:
			return &DoubleValue;
		case DATA_TYPE_STR:
			return &StrValue;
		default:
			return NULL;
	}
	return NULL;
}

DataItemProperty::DATA_TYPE DataItemProperty::GetDataType()
{
	return DataType;
}

3. Trial Run

With the implementation done, let’s give it a try.

// main.cpp
#include "DataItemProperty.h"

std::string StringifyPersonProperty(PERSON_PROPERTY property)
{
	switch (property)
	{
		case PERSON_NAME:
			return std::string("NAME");
		case PERSON_NICK:
			return std::string("NICKNAME");
		case PERSON_ADD:
			return std::string("ADDRESS");
		case PERSON_ORG:
			return std::string("ORGANIZATION");
		case PERSON_GENDER:
			return std::string("GENDER");
		case PERSON_AGE:
			return std::string("AGE");
		case PERSON_HEIGHT:
			return std::string("HEIGHT");
		case PERSON_WEIGHT:
			return std::string("WEIGHT");
		default:
			break;
	}
	// Reaching this point means the key has no name assigned.
	assert(false);
	return std::string("");
}

int main()
{
	DataItem personInfo;
	personInfo[PERSON_NAME] = DataItemProperty(std::string("ZED"));
	personInfo[PERSON_NICK] = DataItemProperty(std::string("ZZ"));
	personInfo[PERSON_ADD] = DataItemProperty(std::string("Mingyue Road 1257"));
	personInfo[PERSON_ORG] = DataItemProperty(std::string("King Bridge"));
	personInfo[PERSON_AGE] = DataItemProperty(30);
	personInfo[PERSON_HEIGHT] = DataItemProperty(175);
	personInfo[PERSON_WEIGHT] = DataItemProperty(65);

	for (DataItemIter iter = personInfo.begin();
	     iter != personInfo.end();
	     iter++)
	{
		switch ((iter->second).GetDataType()) {
			case DataItemProperty::DATA_TYPE_STR:
			STR_DATA_TRACE(iter->second, "Data Item Index: [%s]",
					StringifyPersonProperty(iter->first).c_str());
			break;

			case DataItemProperty::DATA_TYPE_INT:
			INT_DATA_TRACE(iter->second, "Data Item Index: [%s]",
					StringifyPersonProperty(iter->first).c_str());
			break;

			default:
			printf("Data type is not supported to print! Data item index: [%s], type: [%d]\n",
				StringifyPersonProperty(iter->first).c_str(),
				(iter->second).GetDataType());
		}
	}

	int height = 0;
	if (personInfo.find(PERSON_HEIGHT) != personInfo.end()) {
		height = personInfo[PERSON_HEIGHT];
	}
	printf("Height: [%d]\n", height);
	return 0;
}

Then we get the output we expected:

[DataItemProperty]	[String]	 Value: [ZED]	Data Item Index: [NAME]
[DataItemProperty]	[String]	 Value: [ZZ]	Data Item Index: [NICKNAME]
[DataItemProperty]	[String]	 Value: [Mingyue Road 1257]	Data Item Index: [ADDRESS]
[DataItemProperty]	[String]	 Value: [King Bridge]	Data Item Index: [ORGANIZATION]
[DataItemProperty]	[Int]		 Value: [30]	Data Item Index: [AGE]
[DataItemProperty]	[Int]		 Value: [175]	Data Item Index: [HEIGHT]
[DataItemProperty]	[Int]		 Value: [65]	Data Item Index: [WEIGHT]
Height: [175]

4. Extension

In terms of memory consumption this solution is not good, and it may even be unacceptable when memory is limited. At any time a DataItemProperty needs only one member to store its data, but since the data type is unknown at declaration, a member is reserved for every type, and that memory is wasted. This could be optimized. Perhaps a single ‘m_data’ of type uint8_t* would be enough to store the raw data of every data type, with its memory allocated at construction according to the raw length of the input. Inheriting from a base class and allocating the concrete instance dynamically at construction is another solution.
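
If C++17 is available, std::variant already behaves like a tagged union that stores only one alternative at a time. The sketch below is a swapped-in alternative, not the approach implemented in this post; `VariantProperty`, `VariantItem`, `DescribeVisitor` and `Describe` are hypothetical names, and the int key stands in for the PERSON_PROPERTY enum.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// One slot that holds exactly one of the listed types at a time.
typedef std::variant<int, std::uint32_t, double, std::string> VariantProperty;
typedef std::map<int, VariantProperty> VariantItem;

// Visitor that renders whichever alternative is currently stored.
struct DescribeVisitor
{
    std::string operator()(int v) const                { return "int:" + std::to_string(v); }
    std::string operator()(std::uint32_t v) const      { return "u32:" + std::to_string(v); }
    std::string operator()(double v) const             { return "double:" + std::to_string(v); }
    std::string operator()(const std::string& v) const { return "str:" + v; }
};

// std::visit dispatches to the overload matching the stored alternative,
// so no manual switch on a type flag is needed.
inline std::string Describe(const VariantProperty& property)
{
    return std::visit(DescribeVisitor(), property);
}
```

The variant keeps its own type tag, and its storage is shared between the alternatives instead of reserving one member per type.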
