2022-09-10

All kinds of compiling issues

This blog collects all kinds of compiling issues for C/C++ programs.

Sometimes there is an unreasonable urge to fix a compiling issue. You just know it is not a complicated issue: maybe adding one line of code or fixing some configuration would make the whole thing work immediately, but you just don’t know how to make it work. You know it is not complicated, you don’t know how to solve it, and you must solve it anyway, which makes people uncomfortable.

Although it is not the most important issue compared with the algorithm and what you actually program, I feel more comfortable when the whole compiling process works as expected. Anyway, if you cannot compile the code, you cannot do anything else, including testing your algorithm.

Essentially, it is just some combination of make commands. The previous blog discussed some notes about the key parameters of C compiling.

One tip: do not get too attached to compiling issues. You are not a system engineer anyway; knowing how it works is enough. If something does not work, first ask the people who maintain the associated software. Do not dive into too much detail on the compiling part of other projects. We can find an alternative solution most of the time. You should put more energy into your own project.

Undefined reference

This blog summarizes common cases of the undefined reference to issue. We first need to understand that this error happens at the link stage: we call a function, but the linker does not know how it is implemented. That is a good angle from which to investigate what the real issue is. The steps to locate the issue can be summarized as follows: (1) check whether the linker can find the associated files; if it can, (2) check whether the signature is correct; if the signature is correct, (3) check whether the symbol has the appropriate visibility property.

Could not find file

Forgot to add file

At the link stage, if we forget to add a source file (when compiling with a single command) or an object file, or we forget to add the library file, the linker obviously cannot find how the function is implemented.

Wrong sequence

The explanation here about the -l library option shows that the order of the libraries matters. General-purpose libraries should be put at the end of the link command. If library a uses a function defined in library b, a should be put to the left of b. It seems that when the linker finds that main.cpp calls a function fun(), it searches the inputs that come after main.cpp in the compiling command to see whether the implementation can be found there.
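The left-to-right search described above can be sketched with a toy model. This is only an illustration under simplifying assumptions (each static library is treated as a single unit, and inputs are scanned once); it is not how any real linker is implemented, and the names LinkInput and unresolvedSymbols are made up for this example.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model of one linker input: an object file or a static library.
struct LinkInput {
    std::string name;
    bool isArchive;                 // true for lib*.a, false for *.o
    std::set<std::string> defines;  // symbols this input implements
    std::set<std::string> needs;    // symbols this input calls
};

// Scan the inputs once from left to right and return the symbols that
// are still undefined at the end of the command line.
std::set<std::string> unresolvedSymbols(const std::vector<LinkInput>& inputs) {
    std::set<std::string> defined;
    std::set<std::string> undefined;
    for (const LinkInput& in : inputs) {
        if (in.isArchive) {
            // A library is only pulled in if it satisfies a symbol that
            // an earlier input is still waiting for.
            bool useful = false;
            for (const std::string& s : in.defines) {
                if (undefined.count(s) > 0) {
                    useful = true;
                    break;
                }
            }
            if (!useful) continue;
        }
        for (const std::string& s : in.defines) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const std::string& s : in.needs) {
            if (defined.count(s) == 0) undefined.insert(s);
        }
    }
    return undefined;
}
```

With main.o calling a_fun from liba.a, and liba.a calling b_fun from libb.a, the order main.o liba.a libb.a leaves nothing unresolved in this model, while main.o libb.a liba.a leaves b_fun unresolved because libb.a was skipped before anything asked for b_fun.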

Found the file but the wrong definition

This is more related to programming issues.

The definition might have more parameters than the declaration, or different parameter types. These are more like typos, but the result is that the linker cannot find a matching definition. This case is comparatively easy to solve.

The namespace is also part of a function’s definition. The function might be defined in a different path with a different namespace; it is highly possible that we forgot to add the namespace somewhere, so the linker cannot find the appropriate definition.
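A minimal sketch of both mistakes, with a hypothetical geometry::area function that is not taken from any real project. The wrong variants are kept as comments so that the block still compiles and links.

```cpp
namespace geometry {
// Declaration, as it would appear in a header.
double area(double width, double height);
}  // namespace geometry

// Wrong (1): defined outside the namespace. This creates ::area, so
// geometry::area is still undefined at link time.
//   double area(double width, double height) { return width * height; }
//
// Wrong (2): an extra parameter. Inside the namespace this is simply a
// new overload, so geometry::area(double, double) is still undefined.
//   namespace geometry {
//   double area(double w, double h, double scale) { return w * h * scale; }
//   }

// Correct: same namespace and same parameter list as the declaration.
double geometry::area(double width, double height) {
    return width * height;
}
```

In both wrong variants the code compiles without complaint, and the problem only shows up at the link stage as an undefined reference to geometry::area(double, double).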

Name mangling matters when C and C++ are compiled together. C has no concept of namespaces or overloading, so it does not mangle function names, and extern "C" must be used when the two are compiled together. This blog shows more details about using extern "C".
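A minimal sketch of the usual extern "C" header pattern. The function legacy_add and the namespace app are hypothetical; in a real project the definition of legacy_add would live in a .c file built by a C compiler, and it is defined in the same block here only to keep the example self-contained.

```cpp
// What a header shared by C and C++ code might look like.
#ifdef __cplusplus
extern "C" {
#endif

int legacy_add(int a, int b);  // hypothetical C function

#ifdef __cplusplus
}  // extern "C"
#endif

// Stand-in for the C implementation. extern "C" keeps the symbol name
// unmangled, which is what a C compiler would emit.
extern "C" int legacy_add(int a, int b) {
    return a + b;
}

namespace app {
// C++ caller: its own name is mangled, but it refers to the plain
// C symbol legacy_add.
int addTwice(int a, int b) {
    return legacy_add(a, b) + legacy_add(a, b);
}
}  // namespace app
```

Without the extern "C" block in the header, the C++ caller would look for a mangled name that the C object file never provides, which again ends in an undefined reference.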

A less obvious issue is that the two pieces of code are compiled by two different compilers that do not work well with each other. The typical message is something like this:

undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29

Similarly, we also need to be careful that other common libraries, such as MPI, are linked with the same version.
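One way to check for such a mismatch is to let each piece of code report the toolchain it was built with. The sketch below relies only on predefined macros of GCC, Clang, libstdc++ and libc++; the function name buildToolchain is made up for this example.

```cpp
#include <string>

// Describe the compiler and C++ standard library this translation unit
// was built with, using predefined macros only.
std::string buildToolchain() {
    std::string info;
#if defined(__clang__)
    info += "clang " + std::to_string(__clang_major__) + "." +
            std::to_string(__clang_minor__);
#elif defined(__GNUC__)
    info += "gcc " + std::to_string(__GNUC__) + "." +
            std::to_string(__GNUC_MINOR__);
#else
    info += "unknown compiler";
#endif
#if defined(__GLIBCXX__)
    // libstdc++ reports its release date as an integer such as YYYYMMDD.
    info += ", libstdc++ " + std::to_string(__GLIBCXX__);
#elif defined(_LIBCPP_VERSION)
    info += ", libc++ " + std::to_string(_LIBCPP_VERSION);
#endif
    return info;
}
```

Printing this string from the application and from the library it links against makes it easy to see whether the two were built by different compilers.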

The visibility issue

This is a good resource to show how the visibility works.

Simply speaking, the compiler provides a capability to hide all the symbols. In the source code, the user needs to specify visibility properties to show which symbols are visible to programs that use the associated library. Otherwise, we still get the undefined reference issue.

For example, in the vtk-m library, we need to manually specify the visibility for each function:
https://gitlab.kitware.com/vtk/vtk-m/-/blob/master/vtkm/internal/ExportMacros.h

Otherwise, all the symbols are labeled with hidden visibility and cannot be used by other libraries.
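A minimal sketch of an export macro, assuming GCC or Clang and a library built with -fvisibility=hidden. The names MYLIB_EXPORT, MYLIB_HIDDEN and the two functions are hypothetical and are not the macros vtk-m actually defines.

```cpp
// Hypothetical export macros for a shared library.
#if defined(__GNUC__) || defined(__clang__)
#define MYLIB_EXPORT __attribute__((visibility("default")))
#define MYLIB_HIDDEN __attribute__((visibility("hidden")))
#else
#define MYLIB_EXPORT
#define MYLIB_HIDDEN
#endif

// Exported: stays visible to users of the shared library even when
// the library is built with -fvisibility=hidden.
MYLIB_EXPORT int mylib_public_answer();

// Hidden: usable inside the library only. Code outside the library
// that calls it gets an undefined reference.
MYLIB_HIDDEN int mylib_internal_helper();

int mylib_internal_helper() {
    return 21;
}

int mylib_public_answer() {
    return 2 * mylib_internal_helper();
}
```

Inside the same library both functions can call each other freely; the difference only matters for code that links against the resulting shared object.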

Other possible issues

The template is not instantiated.
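A minimal sketch of this case, using a hypothetical square template. The comments mark what would normally be split between a header and a source file; everything is kept in one block here so that it compiles on its own.

```cpp
// --- square.h (hypothetical): callers only see the declaration.
template <typename T>
T square(T value);

// --- square.cpp (hypothetical): the definition is hidden from callers.
template <typename T>
T square(T value) {
    return value * value;
}

// Explicit instantiations. Without these lines, square.cpp would emit
// no code for square<int> or square<double>, and a caller in another
// file would get an undefined reference at link time.
template int square<int>(int);
template double square<double>(double);
```

The other common fix is to move the template definition into the header, so that every caller can instantiate it for the types it needs.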

From the perspective of the compiler or linker, life would be easier if it reported more detailed reasons, helping the user find the actual problem behind a message that only says it cannot find the associated library.

Could not find linked library

Wrong compiler

From the perspective of cmake, there are multiple ways to implement CUDA compiling. The latest one I know is this one. No matter which one is adopted, the core idea is to first use nvcc to create the CUDA objects, and then link the associated objects into the executable with gcc.

For example, these are the make commands output by cmake:

// Building CUDA object CMakeFiles/example.dir/example.cxx.o
$nvcc -forward-unknown-to-host-compiler  -isystem=/global/common/cori_cle7/software/sles15_cgpu/openmpi/4.0.3/gcc/include --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler -pthread -MD -MT CMakeFiles/example.dir/example.cxx.o -MF CMakeFiles/example.dir/example.cxx.o.d -x cu -dc /global/homes/z/zw241/cworkspace/src/5MCST/cmake_example/testFindCudaMPI/example.cxx -o CMakeFiles/example.dir/example.cxx.o


// Linking CUDA device code CMakeFiles/example.dir/cmake_device_link.o
$nvcc -forward-unknown-to-host-compiler  --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink CMakeFiles/example.dir/example.cxx.o -o CMakeFiles/example.dir/cmake_device_link.o   -lcudadevrt -lcudart_static -lrt -lpthread -ldl 

// Linking CUDA executable example
$g++ -Wl,-rpath -Wl,/usr/common/software/sles15_cgpu/openmpi/4.0.3/gcc/lib -Wl,--enable-new-dtags -pthread CMakeFiles/example.dir/example.cxx.o CMakeFiles/example.dir/cmake_device_link.o -o example  -Wl,-rpath,/global/common/cori_cle7/software/sles15_cgpu/openmpi/4.0.3/gcc/lib /global/common/cori_cle7/software/sles15_cgpu/openmpi/4.0.3/gcc/lib/libmpi.so -lcudadevrt -lcudart_static -lrt -lpthread -ldl  -L"/usr/common/software/sles15_cgpu/cuda/11.1.1/targets/x86_64-linux/lib/stubs" -L"/usr/common/software/sles15_cgpu/cuda/11.1.1/targets/x86_64-linux/lib"

Issues related to CUDA compiling come down to two cases. (1) Whether the source files are identified as CUDA files. Just check that this command is set properly:

set_source_files_properties(example.cxx PROPERTIES LANGUAGE "CUDA")

(2) Whether the target is a CUDA executable. Just check that this command is set properly for the associated target:

set_target_properties(example PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

References

All kinds of undefined reference issues

https://zhuanlan.zhihu.com/p/81681440

Good tutorial about visibility

https://stackoverflow.com/questions/52719364/how-to-use-the-attribute-visibilitydefault

AverageMind
