cppTypeTips

Some tips about type casts in C++.

A discussion of static_cast and dynamic_cast that refers to VTK's SafeDownCast (experiments on when to use dynamic_cast are still to be added):
https://discourse.vtk.org/t/should-we-replace-safedowncast-with-dynamic-cast/669
Some tricky parts of the void pointer and how it is used:
https://www.learncpp.com/cpp-tutorial/613-void-pointers/

At some point in your C++ programming journey, you need to start thinking about data types and class types. The usual motivation is that you want to make your code more general, so that one piece of code can do several things. It is clearer to start with a concrete case. Let's say we want to write a data management class that indexes data blocks with a map whose key is the data block id. How should we design the value that represents a data block? The blocks can vary a lot: they can differ in length, in element type, and so on. Let's discuss several design considerations here.

void pointer and the meta info

Conceptually, a void pointer can hold the address of data of any type, so we could use a void* to refer to any data block. The problem is that it is too flexible: how do we know what the actual data type is? If a struct contains a void*, it must also carry metadata that stores the element type and the number of elements. With that metadata we know the actual type and the length of the data the void pointer refers to. This may still not be a good design, because it is error prone and places no restriction on the data type (in a real program the set of types is always limited, since no program can handle everything). As illustrated in this article, do not use a void pointer unless it is really necessary.

If we really need to use the void pointer, a typical case is transferring data over the network: we may need to flatten the data structure into an array that occupies contiguous memory, for example a char array, in which each element is one byte. One complication is the management of the metadata: we need to provide a function that tells the program the size of each data type. Here is an example for the common value types.

On the client side we can use the sizeof operator to get the size of the data, but on the server side we still need an identifier for the actual data type in order to reconstruct the data.
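To make the client/server split concrete, here is a rough sketch of flattening a float vector into a byte buffer with a small header in front. The Header layout, the kFloat32Id value, and the Pack/Unpack names are all assumptions made for illustration. The sketch also assumes that both ends run on the same architecture, so it ignores byte order and struct padding differences.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-size header that travels ahead of the payload.
struct Header {
  std::uint32_t typeId;   // identifier of the element type
  std::uint64_t elemNum;  // number of elements in the payload
};

constexpr std::uint32_t kFloat32Id = 9;  // hypothetical id for float

// Client side: copy header and payload into one contiguous byte buffer.
std::vector<char> Pack(const std::vector<float>& values) {
  Header h{};
  h.typeId = kFloat32Id;
  h.elemNum = static_cast<std::uint64_t>(values.size());
  const std::size_t payload = values.size() * sizeof(float);
  std::vector<char> buf(sizeof(Header) + payload);
  std::memcpy(buf.data(), &h, sizeof(Header));
  if (payload > 0) {
    std::memcpy(buf.data() + sizeof(Header), values.data(), payload);
  }
  return buf;
}

// Server side: read the header first, validate it, then rebuild the values.
std::vector<float> Unpack(const std::vector<char>& buf) {
  if (buf.size() < sizeof(Header)) {
    throw std::runtime_error("buffer is smaller than the header");
  }
  Header h{};
  std::memcpy(&h, buf.data(), sizeof(Header));
  if (h.typeId != kFloat32Id) {
    throw std::runtime_error("unexpected type id");
  }
  const std::size_t payload = buf.size() - sizeof(Header);
  if (payload != h.elemNum * sizeof(float)) {
    throw std::runtime_error("payload size does not match the header");
  }
  std::vector<float> values(static_cast<std::size_t>(h.elemNum));
  if (payload > 0) {
    std::memcpy(values.data(), buf.data() + sizeof(Header), payload);
  }
  return values;
}
```

The server can only check what the header claims (the id and the size); it cannot verify that the bytes really were floats on the client.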

The following sample code is a proof of concept:

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// use the explicit underlying type such as uint32_t
enum class Type : uint32_t {
  UCHAR8,
  INT8,
  UINT8,
  INT16,
  UINT16,
  INT32,
  UINT32,
  INT64,
  UINT64,
  FLOAT32,
  FLOAT64
};

// return the size in bytes of one element of the given type
size_t ComputeDataSize(const Type& type) {
  switch (type) {
    case Type::UCHAR8:
    case Type::INT8:
    case Type::UINT8:
      return 1;
    case Type::INT16:
    case Type::UINT16:
      return 2;
    case Type::FLOAT32:
    case Type::INT32:
    case Type::UINT32:
      return 4;
    case Type::INT64:
    case Type::UINT64:
    case Type::FLOAT64:
      return 8;
  }
  return 0;
}

void transferAndReconstruct(void* ptr, size_t memSize, const Type& type) {
  size_t elemSize = ComputeDataSize(type);
  if (elemSize == 0) {
    throw std::runtime_error("elemSize is not supposed to be zero");
  }
  if (memSize % elemSize != 0) {
    throw std::runtime_error("memSize should be divided by elemSize");
  }
  size_t elemNum = memSize / elemSize;
  std::cout << "elemNum is " << elemNum << std::endl;

  switch (type) {
    case Type::UINT16: {
      // conversion to reconstruct
      uint16_t* temp = static_cast<uint16_t*>(ptr);
      std::cout << "conversion uint16" << std::endl;
      for (size_t i = 0; i < elemNum; i++) {
        std::cout << "index " << i << " value " << *(temp + i) << std::endl;
      }
      break;
    }

    case Type::FLOAT32: {
      // conversion to reconstruct
      float* temp = static_cast<float*>(ptr);
      std::cout << "conversion float" << std::endl;
      for (size_t i = 0; i < elemNum; i++) {
        std::cout << "index " << i << " value " << *(temp + i) << std::endl;
      }

      break;
    }

    case Type::UCHAR8: {
      char* temp = static_cast<char*>(ptr);
      std::string str;
      for (size_t i = 0; i < elemNum; i++) {
        std::cout << "index " << i << " value " << *(temp + i) << std::endl;
        str.push_back(*(temp + i));
      }
      std::cout << "get str " << str << std::endl;
      break;
    }

    default: {
      // every type without a reconstruction branch ends up here
      throw std::runtime_error("unsupported type id " +
                               std::to_string(uint32_t(type)) +
                               " to reconstruct value");
    }
  }
  return;
}

int main() {
  // type-erased pointer that is re-pointed at each test buffer below
  void* ptr = nullptr;

  // the uint16 case
  std::vector<uint16_t> stcIntVector;
  for (uint16_t i = 0; i < 10; i++) {
    stcIntVector.push_back(i);
  }

  ptr = (void*)stcIntVector.data();
  size_t memSize = sizeof(uint16_t) * 10;

  transferAndReconstruct(ptr, memSize, Type::UINT16);

  // the float case
  std::vector<float> stcFloatVector;
  for (int i = 0; i < 10; i++) {
    stcFloatVector.push_back(i * 0.1);
  }

  // to make sure the ptr is updated here
  ptr = (void*)stcFloatVector.data();
  memSize = sizeof(float) * 10;
  if (sizeof(float) != 4) {
    throw std::runtime_error("float is not 4 in this platform");
  }
  transferAndReconstruct(ptr, memSize, Type::FLOAT32);

  // the string case
  std::vector<char> stcStrVector;
  for (int i = 0; i < 10; i++) {
    stcStrVector.push_back('A' + i);
  }
  // to make sure the ptr is updated here
  ptr = (void*)stcStrVector.data();
  memSize = sizeof(char) * 10;
  transferAndReconstruct(ptr, memSize, Type::UCHAR8);
}

A few points are worth mentioning:

It is good practice to name a type as typeName_bitNumber, even for customized types, such as int_32; the secondary classification is then something like signed or unsigned.
The important function is transferAndReconstruct, which accepts a void pointer plus metadata about the data and reconstructs the typed data from that metadata. Be careful that the metadata matches the actual value: because the metadata may be set manually, it is easy to make mistakes. For example, the source may be an int vector while the function call passes the float type. This is the main complexity of the approach: the server has no way to check that the metadata matches the actual data, so correctness depends on the programmer, and programmers are not always reliable. Be careful and add detailed unit tests if you need to use void pointers.
In C++, static_cast is the right tool when you want to reverse an implicit conversion (check this question). When we know exactly what the original type was, we can use static_cast. Converting a typed pointer to a void pointer is an implicit conversion, so casting the void pointer back with static_cast fits this rule. For the conversion from a parent class pointer to a child class pointer, however, it is better to use dynamic_cast, which performs a runtime check.
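One possible way to reduce the risk of mismatched metadata is to derive the type tag from the static type at the point where the typed pointer is converted to void*, instead of writing the tag by hand. The sketch below is an assumption about how that could look and is not part of the sample above: Tag, TagOf, ErasedView, and MakeView are hypothetical names, and only two types are mapped.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, reduced set of tags.
enum class Tag : std::uint32_t { UINT16, FLOAT32 };

// Map a C++ type to its tag at compile time. A type without a
// specialization fails to compile instead of being mislabeled.
template <typename T>
struct TagOf;
template <>
struct TagOf<std::uint16_t> {
  static constexpr Tag value = Tag::UINT16;
};
template <>
struct TagOf<float> {
  static constexpr Tag value = Tag::FLOAT32;
};

// Type-erased view of a buffer together with its metadata.
struct ErasedView {
  const void* ptr;      // implicit conversion from const T* happens on creation
  std::size_t memSize;  // size of the buffer in bytes
  Tag tag;              // derived from T, never typed in by hand
};

template <typename T>
ErasedView MakeView(const std::vector<T>& v) {
  return ErasedView{v.data(), v.size() * sizeof(T), TagOf<T>::value};
}
```

This only protects the sending side. Once the view crosses a process boundary, the receiver still has to trust the tag it reads.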

the template

Common data types are already supported by the C++ STL; for example, we can use std::vector to hold data blocks with all kinds of element types.

If the logic for different data types is the same except for the type itself, as in many algorithms, then a template is a good fit. But be careful: if you need to check whether a template parameter is a specific class, the design may be wrong; just refer to this.
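As a small illustration of logic that differs only in the element type, the function template below adds up a block of any arithmetic type. The name Accumulate is made up for this sketch.

```cpp
#include <vector>

// The same summation logic for any element type that supports +=.
template <typename T>
T Accumulate(const std::vector<T>& block) {
  T total{};  // value-initialized: zero for arithmetic types
  for (const T& v : block) {
    total += v;
  }
  return total;
}
```

Calling Accumulate on a std::vector<int> and on a std::vector<double> instantiates two functions from the same source, with no type tag and no cast involved.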

Sometimes that check is necessary for a special or edge case. In that case we can use a cast or a function that inspects the type to decide whether the template parameter is a particular type.
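For such an edge case, one option (assuming a C++17 compiler) is std::is_same_v together with if constexpr, so that the special branch is chosen at compile time. The function Describe is a hypothetical example, not something from the code above.

```cpp
#include <string>
#include <type_traits>

// Handle std::string specially; every other T is assumed to be a number
// that std::to_string accepts.
template <typename T>
std::string Describe(const T& value) {
  if constexpr (std::is_same_v<T, std::string>) {
    return "string:" + value;
  } else {
    return "number:" + std::to_string(value);
  }
}
```

If more than one or two types need their own branch, that is usually the signal mentioned above that a template is the wrong tool for the job.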

Simply put, when the only thing that differs in the code is the type, a template is a good choice. If we come back to the problem of designing the block management module, the block might come in different forms: a contiguous region of memory, a file, or a particular data object representation. In that case a template might not be suitable.

declare the interface

So, if the only thing different objects share is the signature of their functions, then an interface (abstract class) with polymorphism can be used. It is a more loosely coupled relationship compared with the template and the void pointer: the inner data can differ in value and form, and only the behaviours are similar. If the signatures also differ, they should probably be two different classes or modules.

For example, the put and get operations of a file-based data object need to handle file I/O, whereas a memory-based data object only needs to fetch the data from an in-memory data structure. The interface keeps this flexible: each concrete object only needs to implement the interface.
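A minimal sketch of such an interface might look like the following. The class names DataObject, MemoryObject, and FileObject and the helper StoreAndLoad are hypothetical, and error handling for the file operations is left out.

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// The interface: only the signatures are shared.
class DataObject {
 public:
  virtual ~DataObject() = default;
  virtual void put(const std::vector<char>& bytes) = 0;
  virtual std::vector<char> get() const = 0;
};

// Memory-based block: keeps the bytes in a vector.
class MemoryObject : public DataObject {
 public:
  void put(const std::vector<char>& bytes) override { bytes_ = bytes; }
  std::vector<char> get() const override { return bytes_; }

 private:
  std::vector<char> bytes_;
};

// File-based block: every put and get goes through file I/O.
class FileObject : public DataObject {
 public:
  explicit FileObject(std::string path) : path_(std::move(path)) {}
  void put(const std::vector<char>& bytes) override {
    std::ofstream out(path_, std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
  }
  std::vector<char> get() const override {
    std::ifstream in(path_, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
  }

 private:
  std::string path_;
};

// Caller code depends only on the interface, not on the storage form.
std::vector<char> StoreAndLoad(DataObject& obj, const std::vector<char>& bytes) {
  obj.put(bytes);
  return obj.get();
}
```

A block manager could then hold something like a map from block id to a pointer to DataObject, without knowing whether a given block lives in memory or in a file.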

We will not dive into the details of polymorphism here, but dynamic_cast deserves a mention. We tend to use a pointer to the parent class (the interface) to point to an instance of a child class. Before using it as the child class, we need to make sure that the parent pointer actually points to that child class; the benefit of dynamic_cast is that it performs this check at run time.

#include <iostream>
#include <stdexcept>
#include <string>

// Parent is abstract because func is pure virtual (= 0), so it cannot be instantiated
class Parent {
 public:
  std::string parent;
  virtual void func() = 0;
};

class Child : public Parent {
 public:
  void func() override { std::cout << "child call func" << std::endl; }
};

class Child2 {
 public:
  Child2();
};

int main() {
  Child c;
  Parent* p = &c;
  Child* cptr = dynamic_cast<Child*>(p);
  if (cptr == nullptr) {
    throw std::runtime_error("failed to transfer");
  }
  cptr->func();
  Child2* c2 = dynamic_cast<Child2*>(p);
  if (c2 == nullptr) {
    throw std::runtime_error("failed to transfer into Child2");
  }
}

In this simple example, we can see that dynamic_cast returns nullptr if we try to cast the parent pointer to an unrelated class. Basically, dynamic_cast is mainly useful for downcasts and sidecasts; check here for more detailed information.
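The example above covers the downcast. A sidecast (going from one base class to a sibling base class of the same object) can be sketched like this; Readable, Writable, MemBlock, ConstBlock, and TryWrite are hypothetical names chosen for the illustration.

```cpp
class Readable {
 public:
  virtual ~Readable() = default;
  virtual int read() const = 0;
};

class Writable {
 public:
  virtual ~Writable() = default;
  virtual void write(int v) = 0;
};

// Implements both interfaces, so a sidecast between them succeeds.
class MemBlock : public Readable, public Writable {
 public:
  int read() const override { return value_; }
  void write(int v) override { value_ = v; }

 private:
  int value_ = 0;
};

// Implements only Readable, so a sidecast to Writable yields nullptr.
class ConstBlock : public Readable {
 public:
  int read() const override { return 7; }
};

// Sidecast: start from a Readable* and ask whether the same object
// is also a Writable. Returns false if it is not.
bool TryWrite(Readable* r, int v) {
  Writable* w = dynamic_cast<Writable*>(r);
  if (w == nullptr) {
    return false;
  }
  w->write(v);
  return true;
}
```

A static_cast cannot do this conversion, because Readable and Writable are unrelated at compile time; only the runtime type of the object connects them.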