I use VTK-m a lot for all kinds of projects. This article collects some tips and thoughts about using VTK-m from the perspective of software design.
The original VTK-m paper is this one
The pattern that the algorithms follow for parallel execution is this one:
Decoupling the execution environment from the control environment
One important thing is to figure out how to set the ExecutionSignature and the parameters of the operator function in the worklet type. It seems that all worklets in VTK-m follow this style.
There are detailed explanations in the VTK-m user guide, sections 17.1-17.4. The parameters listed in the ExecutionSignature determine which parameters are passed to the operator in the execution environment. The parameters of the operator function should match the types of those parameters (including the types of the template parameters). It seems that template parameters can be reused. I am not sure exactly how this works, but it does: if two parameters have the same type, a single template parameter is enough.
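The reuse of one template parameter is ordinary C++ template deduction. The sketch below is a plain C++ stand-in and not the VTK-m API: `AddWorklet` and `InvokeOnArrays` are hypothetical names, and the real signature machinery is left out. It only shows that an operator whose arguments share a type needs a single template parameter, and that a dispatcher-like helper calls it once per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a worklet (not a VTK-m class).
// A single template parameter T is enough because both inputs and the
// output share the same element type.
struct AddWorklet
{
  template <typename T>
  void operator()(const T& a, const T& b, T& sum) const
  {
    sum = a + b;
  }
};

// Hypothetical stand-in for a dispatcher: calls the worklet once per element.
// Assumes both input arrays have the same length.
template <typename Worklet, typename T>
std::vector<T> InvokeOnArrays(const Worklet& worklet,
                              const std::vector<T>& a,
                              const std::vector<T>& b)
{
  std::vector<T> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    worklet(a[i], b[i], out[i]);
  }
  return out;
}
```

If the two inputs had different element types, the operator would need a second template parameter, which matches the rule of thumb above.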
Filter plus worklet style
If we only run a worklet on specific data sets or arrays, the approach described above is enough: we create the dispatcher separately and then call its Invoke function with the worklet.
Another level of abstraction is the filter: we pass data in, get data back, and the details stay hidden. There are many filters in VTK-m, and the base class is something like vtkm::filter::FilterField. If a filter inherits from that base class, it can call the Invoke function directly. This is an example: we inherit from that base class and call its Invoke function.
The filter base class provides simpler ways to handle dispatching, invocation, and detection of the input and output types.
The case of multiple inputs and multiple outputs.
Type detection and conversion
The core idea of type transformation here is to let the same code work on different types. The strategy used in VTK-m is to transform at run time: it deduces the type of a particular field and then uses that type as the template parameter.
How is the type of a specific array deduced at run time? The idea is to specify a type list and compare the array's type against that list to find the matching one. The array handle has a family of functions whose names start with CastAndCallFor..., which can be used in different scenarios. The array with its deduced type is usually passed as the input argument of a lambda expression.
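The "compare against a type list, then call a lambda" idea can be sketched in plain C++. This is only an analogy for the CastAndCallFor... family, not VTK-m code: `UnknownArray`, `TypeList` and `CastAndCall` here are hypothetical, simplified names, and the matching uses `dynamic_cast` as an assumption of mine.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type-erased array (not vtkm::cont::UnknownArrayHandle).
struct UnknownArray
{
  struct Base { virtual ~Base() = default; };

  template <typename T>
  struct Typed : Base { std::vector<T> values; };

  std::shared_ptr<Base> storage;

  template <typename T>
  static UnknownArray Make(std::vector<T> v)
  {
    auto typed = std::make_shared<Typed<T>>();
    typed->values = std::move(v);
    UnknownArray result;
    result.storage = typed;
    return result;
  }
};

// List of candidate element types to try at run time.
template <typename... Ts>
struct TypeList {};

// End of the list: no candidate matched.
template <typename Functor>
bool CastAndCall(const UnknownArray&, TypeList<>, Functor&&)
{
  return false;
}

// Try the head type; on a mismatch, recurse into the rest of the list.
template <typename Head, typename... Tail, typename Functor>
bool CastAndCall(const UnknownArray& array, TypeList<Head, Tail...>, Functor&& f)
{
  if (auto* typed = dynamic_cast<UnknownArray::Typed<Head>*>(array.storage.get()))
  {
    f(typed->values); // the functor now sees the concrete element type
    return true;
  }
  return CastAndCall(array, TypeList<Tail...>{}, std::forward<Functor>(f));
}
```

The functor is typically a generic lambda such as `[&](const auto& values) { ... }`, so its body is compiled once for every type in the list, while only the matching instantiation runs.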
If there are multiple variables, VTK-m's solution is to use the type of one specific variable to infer the types of the others. There is a function called ArrayCopyShallowIfPossible, and the typical scenario is something like this:
// use decay_t to remove the const qualifier
If they have the same type, a shallow copy is made; T is the deduced type.
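As a rough illustration of "shallow if the types match, otherwise convert", here is a plain C++ sketch. It is not the VTK-m implementation: `SharedArray` and `CopyShallowIfPossible` are hypothetical names, and choosing the behaviour through overload resolution is my own assumption.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical reference-counted array, standing in for an array handle.
template <typename T>
using SharedArray = std::shared_ptr<std::vector<T>>;

// Same element type: share the buffer (shallow copy).
template <typename T>
void CopyShallowIfPossible(const SharedArray<T>& source, SharedArray<T>& destination)
{
  destination = source;
}

// Different element types: allocate a new buffer and convert each value.
template <typename S, typename D>
void CopyShallowIfPossible(const SharedArray<S>& source, SharedArray<D>& destination)
{
  destination = std::make_shared<std::vector<D>>(source->begin(), source->end());
}
```

When both arguments have the same element type, the compiler prefers the more specialized first overload, so no data is copied; otherwise the second overload makes a converted deep copy.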
Another commonly used function for converting the array type is AsArrayHandle. Searching for this keyword on the VTK-m GitLab page turns up all kinds of examples and test cases.
The file that defines ArrayHandle also defines specific behaviours for AsArrayHandle. The idea is to use decltype to get the target type and check whether the conversion is allowed; if it is, reinterpret_cast does the conversion. For cell sets there are similar functions, such as AsCellSet, which casts the unknown cell set into a specific one.
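The "check first, then cast" pattern can be sketched without VTK-m. The class below is hypothetical (`UnknownBuffer`, `CanConvert` and `AsArray` are names I made up for illustration), and it checks with `std::type_index` and casts with `static_pointer_cast`, which is an assumption and not what the library does internally.

```cpp
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical type-erased buffer that remembers the element type it was built with.
class UnknownBuffer
{
public:
  template <typename T>
  explicit UnknownBuffer(std::vector<T> values)
    : Data(std::make_shared<std::vector<T>>(std::move(values)))
    , Type(typeid(T))
  {
  }

  // True only if T is exactly the stored element type.
  template <typename T>
  bool CanConvert() const
  {
    return Type == std::type_index(typeid(T));
  }

  // Checked conversion back to the concrete array type.
  template <typename T>
  std::shared_ptr<std::vector<T>> AsArray() const
  {
    if (!this->CanConvert<T>())
    {
      throw std::bad_cast();
    }
    return std::static_pointer_cast<std::vector<T>>(Data);
  }

private:
  std::shared_ptr<void> Data;
  std::type_index Type;
};
```

The cast itself is unchecked, so the type test in front of it is what keeps the conversion safe.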
This question discusses some details about why this kind of general type is used. The interface is meant to be as general as possible. Take getting the cell set in VTK-m as an example: rather than having one get-cell-set function for each specific type of cell set, the interface needs some generality. We can think of the unknown cell set or unknown array type as a kind of interface for the concrete member.
Create data set for sample testing
Although VTK-m has flexible libraries to input and output data in the VTK format, it is sometimes convenient to build data directly for simple testing, such as building a structured data set. The associated code is at vtkm/cont/DataSetBuilderxx, with all kinds of examples such as building uniform and rectilinear data sets. We can refer to these examples to see how to add coordinates and a cell set when we need to create a new VTK-m data set for a specific goal.
Visualization with data parallel primitives
Or so called parallel skeletons:
The general types of GPU primitives:
Map (scan operation or add operation, stencil is also a kind of operator)
Reduce (output is a single value, sum)
Scan (store intermediate results)
Sort (reordering operation)
Search (find index)
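To make the list concrete, here is a serial sketch of each primitive using the C++ standard library. These are stand-ins of my own choosing and not the VTK-m device algorithms; a real backend would run the same patterns in parallel. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Map: apply the same operation to every element independently.
std::vector<int> MapSquare(const std::vector<int>& in)
{
  std::vector<int> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), [](int x) { return x * x; });
  return out;
}

// Reduce: combine all elements into a single value.
int ReduceSum(const std::vector<int>& in)
{
  return std::accumulate(in.begin(), in.end(), 0);
}

// Scan (inclusive): keep every intermediate result of the running sum.
std::vector<int> InclusiveScan(const std::vector<int>& in)
{
  std::vector<int> out(in.size());
  std::partial_sum(in.begin(), in.end(), out.begin());
  return out;
}

// Sort: reorder the elements.
std::vector<int> Sorted(std::vector<int> values)
{
  std::sort(values.begin(), values.end());
  return values;
}

// Search: index of the first element not less than value in a sorted array.
std::size_t LowerBoundIndex(const std::vector<int>& sorted, int value)
{
  return static_cast<std::size_t>(
    std::lower_bound(sorted.begin(), sorted.end(), value) - sorted.begin());
}
```

For the input {3, 1, 2}, the map gives {9, 1, 4}, the reduce gives 6, the scan gives {3, 4, 6}, and the sort gives {1, 2, 3}.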
One paper on the VTK-m design and the idea of using parallel primitives: (https://cdux.cs.uoregon.edu/pubs/MorelandPARCO.pdf)
Flexible and zero-copy array handle
Device primitives (this is still a key research direction)
I am still not sure how it works: what is the connection between these parallel primitives and the upper-level operations?
Classification of filter types into categories:
field_transform: transforms fields from/to the same entity
field_conversion: converts cell fields to point fields or point fields to cell fields
Iterating over a specific array
There is no direct way to do this. We first call GetData, then use AsArrayHandle to convert the result to a concrete handle, and then get the ReadPortal or WritePortal; through that portal we can read or write the data. This may look a little redundant, but that is how a framework works: you need to follow its conventions.
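The handle-then-portal chain can be imitated with a few mock classes. Everything below is a mock written for illustration, not the library itself: the `Mock...` classes and `SumThroughPortal` are hypothetical, and only the method names (ReadPortal, WritePortal, Get, Set, GetNumberOfValues) are chosen to resemble the calls used on a real array handle.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Mock read-only view of the array data.
template <typename T>
class MockReadPortal
{
public:
  explicit MockReadPortal(std::shared_ptr<const std::vector<T>> data)
    : Data(std::move(data))
  {
  }
  std::size_t GetNumberOfValues() const { return Data->size(); }
  T Get(std::size_t index) const { return (*Data)[index]; }

private:
  std::shared_ptr<const std::vector<T>> Data;
};

// Mock writable view of the array data.
template <typename T>
class MockWritePortal
{
public:
  explicit MockWritePortal(std::shared_ptr<std::vector<T>> data)
    : Data(std::move(data))
  {
  }
  std::size_t GetNumberOfValues() const { return Data->size(); }
  void Set(std::size_t index, const T& value) const { (*Data)[index] = value; }

private:
  std::shared_ptr<std::vector<T>> Data;
};

// Mock array handle: owns the buffer and hands out portals to it.
template <typename T>
class MockArrayHandle
{
public:
  explicit MockArrayHandle(std::size_t size)
    : Data(std::make_shared<std::vector<T>>(size))
  {
  }
  MockReadPortal<T> ReadPortal() const { return MockReadPortal<T>(Data); }
  MockWritePortal<T> WritePortal() const { return MockWritePortal<T>(Data); }

private:
  std::shared_ptr<std::vector<T>> Data;
};

// Iterate over the array through its read portal.
template <typename T>
T SumThroughPortal(const MockArrayHandle<T>& handle)
{
  auto portal = handle.ReadPortal();
  T total = T();
  for (std::size_t i = 0; i < portal.GetNumberOfValues(); ++i)
  {
    total += portal.Get(i);
  }
  return total;
}
```

All element access goes through a portal, never through the handle directly, which is the convention the paragraph above describes.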
Call the worklet
The idea is to combine a lambda expression with type deduction.
This is a typical example:
CastAndCallVecField can get the correct type.
Another case is when we do not know the type; the concrete type is then resolved inside the lambda expression. In this example, CastAndCallForTypesWithFloatFallback does the actual cast: since GetData returns the unknown type, it compares that type against the list given as the first parameter to see which inner type matches. The common use case is to use the input data type to deduce the output type.
Inside the lambda expression, the code usually declares a dispatcher and then uses the dispatcher's Invoke function to call the worklet. The code in this part always follows a similar structure.
Some tips on coding standards
Use DataSet::GetField with the field name and association instead of GetPoint or GetField when getting a specific array.
Using different device adapters
There is currently no direct way to query which device something ran on. One way to find out is to turn the logging level up to Perf, but that prints a lot of information that is difficult to sift through.
What I usually do is force a particular device. That way VTK-m will run on that device or else it will raise an error. The easiest way to force a device is to add a --vtkm-device=openmp command line argument (that gets passed to vtkm::cont::Initialize). Alternatively, you can use ScopedRuntimeDeviceTracker to force a particular device (https://docs-m.vtk.org/latest/structvtkm_1_1cont_1_1ScopedRuntimeDeviceTracker.html).
Essentially, VTK-m is just a library for other dedicated services. Its role is that of a component that can be integrated into other services such as ParaView.
Another way is to use ForceDevice to set the backend to the expected one, such as this (https://github.com/Kitware/VTK-m/blob/master/examples/multi_backend/MultiDeviceGradient.cxx#L60)
This is an example of using the initialization in VTK-m
Note that if we set the backend to OpenMP and use mpirun to run the program, by default each process gets only one core, even in a local environment. This is a good way to bind multiple physical cores to one process/task:
mpirun -np ncores --bind-to none -x OMP_NUM_THREADS=nthreads ./program
The poor performance of the debug mode
When VTK-m is built in debug mode, performance may drop because extra assertions are added for ArrayHandle; check this place (https://github.com/Kitware/VTK-m/blob/master/CMakeLists.txt#L137) for more details.
Here we can see that the performance can increase a lot if we do not use the extra VTK-m assertions
Set the log level and the backend
These configurations are best set in the VTK-m initialization call. There are other ways to set the backend and the log level, such as the ForceDevice function.