Scientific visualization with ParaView

This post records some quick notes about using ParaView for visualization. There are some videos here; this article collects quick notes that can be useful for daily use.

All kinds of details about color mapping

TODO meaning of the 2d color plot
TODO discretized values vs continuous values (check the Discretize button for that)

Necessary thing for taking screenshot

The background color usually needs to be set to white when taking a screenshot whose result goes into a paper as a visualization figure. There is a color palette at the top row of the GUI; clicking it sets the background color.

Setting the legend is a tricky part. The default legend uses scientific format. There is a legend-edit button with the letter ‘e’ on it in the editing region (information section); clicking that button lets you edit the legend as needed. We can adjust the title size, the annotation size, and the number format of the legend labels. If there are multiple render views in the window and you want to save each output one by one, you may want to make sure the legend is in the same place for all figures; in this case, set the legend position explicitly in the legend position section (such as setting it to 1), so all legends stay in the same place.

Setting multiple render views. We usually want to visualize multiple figures in one row of a paper to compare results with and without specific visualization effects. In this case we want the same view for all visualization results. It is convenient to open multiple render views and use Link Camera to link the views together. Then we can zoom in and out a little to find a good angle and click Save Screenshot to save the results. Remember to set the same image resolution (x and y) each time you save an image.

Remember to save the state if there are multiple views and complicated settings; you do not want to redo everything from scratch when you want to show results to your colleagues or continue editing a previous visualization.

Sometimes we use the isocontour or other iso-related filters. If there is only one value associated with it, just set it to a solid color; otherwise the coloring might be messed up, because ParaView still assumes the color is chosen from a specific range (which is not the expected result, since only one value is associated with the contour).

Find the data id list

Once, I needed to check whether adjacent ids of the mesh are also located at adjacent physical positions. There is a really useful function for this: use View->Find Data to label the particle ids in a specific range. We can select specific data, and the selected data will be labeled with pink dots; we can track the id associated with each data point, which is helpful for debugging parallel processing based on vtk-m. We can select only the region with wrong output and inspect it in particular.

Connect to remote server

For a long time, I ran ParaView on my local machine and basically transferred the data to my local environment. This is not the most efficient way to use ParaView. In many cases, the large-scale data is located on the cluster; we should run ParaView as a job on the cluster and then use the local ParaView as a client to connect to the remote server. Just click the Connect button and fetch all configurable servers. If you run on a typical HPC system, a terminal pops up; after entering the password and token, the pvserver is started according to the configuration, such as the number of processes.

Be careful: on Mac we need to install XQuartz to trigger that terminal. After installing XQuartz, we need to restart the OS. This is the related discussion; after reinstalling, the DISPLAY variable is something like this:

$ echo $DISPLAY
/private/tmp/com.apple.launchd.DAdu1PW5GV/org.xquartz:0

The good thing is that we can run multiple ParaView server processes to accelerate rendering. When everything works, we can use the extractor to generate png files and send them back to the local machine to make an animation. The png files are comparatively small and easy to transfer back and forth between client and server.

The spreadsheet view and the particle/cell finder

When clicking the split tab, we get a new tab showing a different view. The default view is called the render view, which shows the graphics data; another commonly used view is the spreadsheet view. The good thing about this view is that we can clearly check the point data, cell data, field data, etc., which provides a good understanding of the position of specific points or cells.

Furthermore, we can use View->Find Data to select data points or cells according to properties such as the id, which is really useful for debugging algorithms related to VTK. The selected points or cells are labeled in the figure. This provides a direct understanding of the position of points/cells in the rendered geometry. We can even click Extract to extract the associated geometry with different colors.

Data selection mechanism

ParaView provides a really flexible data selection mechanism: you can select the region of interest from the visualized results directly. This doc and this video provide a detailed explanation. After selecting the dedicated region, we can use a filter called Extract Selection to export the selected region as a separate data set. This is helpful for selecting a particular region to generate a data set used for debugging. It can save a lot of effort in preparing dedicated debug data sets: we can just select them from the rendered figure and export them for testing. This is a really flexible capability and can be used together with the Find Data view.

Processing multiple data blocks

In parallel computing, it is common that the data are output into separate vtk files. In VisIt, we can write a metadata file and load the associated vtk files at once, such as:

!NBLOCKS 4
Block0.vtk
Block1.vtk
Block2.vtk
Block3.vtk
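
This master-file format can be generated with a short script; a minimal sketch (assuming four blocks named Block0.vtk … Block3.vtk, and a made-up output name blocks.visit):

```python
# Write a VisIt master file listing the per-rank vtk blocks.
# The block naming scheme and output file name are assumptions.
nblocks = 4
with open("blocks.visit", "w") as f:
    f.write(f"!NBLOCKS {nblocks}\n")
    for i in range(nblocks):
        f.write(f"Block{i}.vtk\n")
```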

In ParaView, one possible solution I found is to output all associated data into a vtkPartitionedDataSetCollection. There is no convenient built-in way to do that; this is a python script showing how to load the data and use a filter to group the blocks together.

Some small but important operations

Click the reset session button (the circle button) to clear the currently loaded data.

Click View -> Memory Inspector to monitor the system memory utilization.

When loading new data, we can paste the directory name into the File Name field (then click the navigate button) and the dialog jumps to the specific directory where the data is located; sometimes it is inconvenient to browse to that directory manually.

When we have two windows rendering the data, we can right-click and choose Link View to make the two windows share the same camera.

If our data only has a scalar field but we want to try some vector field visualization techniques, a simple way is to use the Gradient filter (it computes the change of the scalar value along each direction); then we have a vector field to play with.
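
As a plain-python illustration of what the Gradient filter computes (not ParaView's actual implementation), a central-difference gradient along one axis looks like this:

```python
# Central-difference gradient of a scalar field sampled on a uniform grid,
# mimicking what a gradient filter does along a single axis.
def gradient_1d(values, spacing=1.0):
    n = len(values)
    grad = []
    for i in range(n):
        if i == 0:                       # forward difference at the boundary
            g = (values[1] - values[0]) / spacing
        elif i == n - 1:                 # backward difference at the boundary
            g = (values[-1] - values[-2]) / spacing
        else:                            # central difference in the interior
            g = (values[i + 1] - values[i - 1]) / (2.0 * spacing)
        grad.append(g)
    return grad

print(gradient_1d([0.0, 1.0, 4.0, 9.0]))  # slopes of x^2 sampled at 0,1,2,3
```

Doing this per axis on a 3d grid yields the vector field that the filter attaches to the data set.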

Similarly, if we have a 2d data set and want to turn it into a 3d data set, we can use the Warp By Scalar filter. This is really cool for visualizing geo-related data, or a region with a low value at the center: the visualized result shows a large hole within it. This is an online example. Using Extract Surface after that puts the associated values onto the warped 3d surface, and we can edit values based on that. One recent use case is running a critical point filter on top of it; this plugin can help find the minimal values. We can do this through the TTK plugin: enable TTK and choose the associated critical points based on the scalar, and it provides more informative results.

One online tutorial at OLCF about ParaView

The video is here, the online page is here

Some architecture notes regarding the ParaView design.

ParaView is divided into the client, the render server, and the data server. The design is based on the proxy pattern. We can find many class names containing SMProxy, which means the class is a proxy for the server manager. Details of the proxy pattern can be found here.

Common flow vis tasks

1> show flow data by glyphs directly

I do not know a good way to show this in ParaView; just use VisIt: hit the vector button directly and we get a general feeling of how the vector field looks, and a quick first impression of the dataset. If the data comes from some special source not well supported by VisIt, maybe use Clean to Grid to convert the data into a standard vtk file and then load it into VisIt.

2> streamline (steady flow)

The easiest way:
Use the streamline representation view (the seeds are placed throughout the whole data domain). Just load the associated streamline plugin, and the results can be checked in the new representation view.

https://www.kitware.com/new-animated-stream-lines-representation-for-paraview-5-3/

The associated VisIt way is listed in another blog (it needs a lot of manual work).

3> Extracting the streamline for detailed analysis:

Check this answer. The results output by Stream Tracer are really informative; it is convenient to show a specific polyline through the Find Data view of ParaView. After that, use Extract Selection to extract that data set; the coordinates of each point are included in the results. It is then convenient to output the coordinates of these points for further processing, such as analyzing them in python.
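
For the python side, a minimal sketch of reading the exported coordinates (the file name streamline.csv is made up; Points:0/Points:1/Points:2 are the column names the spreadsheet view typically exports for coordinates, but double-check them in your CSV):

```python
import csv

# Read the point coordinates of an extracted streamline exported from the
# spreadsheet view. File name and column names are assumptions.
def read_streamline_points(path):
    points = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            points.append((float(row["Points:0"]),
                           float(row["Points:1"]),
                           float(row["Points:2"])))
    return points
```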

4> particle path with unsteady flow

The Particle Tracer filter is just the particle advection operation in VTK for unsteady flow. There are two necessary inputs: the first is the input data, which can be a 2d slice of the original data; the second is the seed source, which can be a line. We need to add a line source separately before using this filter.

From ParaView’s interface, when using the Particle Tracer filter, the first input is the source, which can be the line source or something similar. For the source, the resolution parameter controls how many particles are seeded. This filter only advects particles at different steps. Only some of the options are exposed by the ParaView interface; to set more of them, call the VTK filter separately.

One important parameter is Force Reinjection Every NSteps. If we set it to 1, then after each iteration a new particle is placed at the start position of the source and advected through the flow field.

If we want to draw lines based on it, we need to use the Temporal Particles To Pathlines filter; the input of this filter is the output of the Particle Tracer.

One important parameter under Temporal Particles To Pathlines is Mask Points; it controls how many points are skipped when visualizing the pathline.

5> FTLE

Tips for building ParaView in all kinds of scenarios

The general step-by-step process can be found in the official document:

https://gitlab.kitware.com/paraview/paraview/-/blob/master/Documentation/dev/build.md

But I ran into some small issues during installation.

1> python issue. The details can be found here. Maybe just do not mix the Anaconda and the Homebrew python; they might not work together properly, and the linker might find a wrong .so library. Using otool or ldd to double-check the linked shared libraries is a typical way to find the root cause. Or just disable the python and MPI builds if we do not need these capabilities.

2> Wrong rpath of the python library when executing the final binary. Maybe ParaView did not set the rpath in the right way during compilation; use make VERBOSE=1 or ninja -v to show all compilation details and check the rpath. This is a detailed description of the issue.

3> qt issue. The details can be found here. Just double-check the installed packages; for example, on Mac, check the Homebrew environment and make sure only qt5 or qt6 is installed. The default one is qt6; for qt5, the package name is qt@5.

4> some other issues, such as linking to a wrong .so file for some dependencies; check this one.

5> One time I had a networking issue and git submodule did not run successfully. Remember to check git status to make sure it completed; otherwise, try git submodule deinit -f . to reset it to the initial status. If git-lfs is not installed, it might still fail.

Build ParaView on HPC

These are the steps to build ParaView on the Frontier supercomputer; this build does not use Qt and runs ParaView in server mode.

module load gcc
module load python
module load ninja
module load mesa
module load git-lfs

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP -DCMAKE_BUILD_TYPE=Release -DPARAVIEW_USE_QT=OFF -DVTK_OPENGL_HAS_OSMESA=ON -DVTK_USE_X=OFF ../paraview

Build ParaView with CUDA

When CUDA is enabled, it will also enable the VTK-m GPU version automatically. This is an example of building ParaView for a GeForce 2080; be careful about the CUDA architecture version for different GPUs, as this is a key parameter to make CUDA work properly.

cmake ../paraview/ -DPARAVIEW_USE_CUDA=ON -DVTKm_ENABLE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=75

Build ParaView with ADIOS enabled

ParaView can detect an ADIOS bp file if the associated bp reader plugin is enabled. But we need to make sure the associated bp reader is built when installing ADIOS. This is an example:

cmake ../paraview/ -DPARAVIEW_USE_QT=OFF -DVTK_USE_X=OFF -DVTK_OPENGL_HAS_OSMESA=ON -DPARAVIEW_ENABLE_ADIOS2=ON -DADIOS2_DIR=/ccs/home/zw241/cworkspace/install_adios/lib64/cmake/adios2/ -DPARAVIEW_USE_MPI=ON

After building this way, the bp file can be loaded automatically. However, it might be more flexible to use the ADIOS C++ reader to load the raw data and write out the data as needed. This is an online example of reading the adios file and writing it out (it might still be necessary to add the vtk.xml file under the associated bp folder as described by the online document). Essentially, the adios file is just a bunch of n-d arrays; the vtk.xml is the metadata that tells ParaView how to parse the data (the structure of the mesh).
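
For reference, a minimal vtk.xml for a uniform grid might look roughly like this (the extents and the variable name U are placeholders; check the ADIOS2 VTX reader document for the exact schema):

```xml
<?xml version="1.0"?>
<VTKFile type="ImageData" version="0.1" byte_order="LittleEndian">
  <ImageData WholeExtent="0 64 0 64 0 64" Origin="0 0 0" Spacing="1 1 1">
    <Piece Extent="0 64 0 64 0 64">
      <PointData Scalars="U">
        <DataArray Name="U" />
      </PointData>
    </Piece>
  </ImageData>
</VTKFile>
```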

For the shape metadata of an adios variable, say {d1,d2,d3}, d1 is the slowest-changing dimension and d3 the fastest-changing one (row-major order). For the x-y-z case, it should be {dz,dy,dx}, where the x dim changes fastest, then y, then z.
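
A small sketch of this row-major convention: the flat offset of element (z, y, x) in a {dz, dy, dx} array is (z * dy + y) * dx + x, so x moves one element at a time:

```python
# Flat offset of element (z, y, x) in a C-ordered (row-major) array of
# shape {dz, dy, dx}: x is the fastest-changing index, z the slowest.
def flat_index(z, y, x, dz, dy, dx):
    return (z * dy + y) * dx + x

# Stepping x by 1 moves 1 element; stepping z by 1 moves dy*dx elements.
print(flat_index(0, 0, 1, 4, 3, 2))  # 1
print(flat_index(1, 0, 0, 4, 3, 2))  # 6  (= dy * dx)
```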

There is an ADIOS AdiosReaderPixie reader, which has not been maintained for a long time; it also requires ADIOS1, which is time-consuming to compile, and there still seem to be some compile errors associated with it. The best way is still to use the C++ bp reader to get the data, write the results out as raw data, and render them with other software.

ParaView loading raw data

If the data is stored in raw format, it is easy to load it into ParaView through the .raw file. Just use the image reader, input little or big endian and the extent values for each dimension, and the results will be rendered properly.
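
One easy pitfall is a mismatched extent: the raw file size must equal the product of the extents times the bytes per scalar. A quick sanity-check sketch (file name and extents are made up):

```python
import os
import struct

# Write a tiny 4 x 3 x 2 volume of 16-bit unsigned ints as little-endian
# raw binary, the kind of file an image/raw reader can load.
nx, ny, nz = 4, 3, 2
n = nx * ny * nz
with open("volume.raw", "wb") as f:
    f.write(struct.pack("<%dH" % n, *range(n)))

# Sanity check: the file size must be nx * ny * nz * bytes_per_scalar.
assert os.path.getsize("volume.raw") == n * 2
```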

Customized filter

When we set up a pipeline that contains multiple filters, we can compose these filters into one. Use Ctrl to select multiple filters and then click Custom Filter to create a new filter. For this new filter, we can set the input, output, and exposed parameters conveniently: we specify which parameters need to be exposed, and the unexposed parameters keep their current values.

ParaView plugin

This can be a really useful technique, since wrapping your algorithm into a plugin lets it share ParaView’s other capabilities. We will use a detailed example to explain how to wrap vtk-m code into a filter.

Let’s take the VTKmFilters plugin as an example of what you should do to add your own plugin.

Step 1, set up the associated CMakeLists so the algorithm compiles under the ParaView plugin commands, such as these files. The specific commands are added through the VTK module build system instead of plain cmake.

Step 2, clone the ParaView code and build it locally. Pay attention that ParaView has dependencies; after cloning the code, remember to use git submodule update --init --recursive to fetch all dependencies such as VTK and VTK-m.

Step 3, make sure a naive version of the plugin can be compiled against ParaView; this is an example from my personal project. It needs to find vtk-m and ParaView through the ParaView build dir (maybe just use the build dir to avoid occasional symbolic link issues).

Step 4, add the xml file that describes how the plugin interacts with the ParaView frontend. This doc details many xml labels and their meanings.

Step 5, debug the algorithm to make sure it works as expected; maybe run ParaView from the command line to see the printed log messages. On Mac, just go into the paraview.app folder and run ParaView from the command line.

Step 6, program a class that inherits from a specific vtk algorithm such as vtkImageAlgorithm and implement the RequestData function. In this function, we decide how to get data from the vtk data object, how to return the vtk data, etc. It is quite flexible and case by case. Look at these blogs and this doc for more ideas.

Using client/server mode to run ParaView.

  • Build and install the same version of ParaView on the client and the server respectively. For example, this is the cmake command to build and install ParaView on the server. Remember to load the necessary modules such as mesa and gcc on HPC.
cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON \
-DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP \
-DCMAKE_BUILD_TYPE=Release \
-DPARAVIEW_USE_QT=OFF \
-DVTK_OPENGL_HAS_OSMESA=ON \
-DVTK_USE_X=OFF ../paraview
  • This is an example command to build ParaView on the client; check the official documents for different platforms.
cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON \
-DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP \
-DCMAKE_BUILD_TYPE=Release ../../src/paraview

Be careful about the several common issues listed above, such as the python version issue (mixing the system python and the conda python), the rpath issue (wrong rpath when linking the executable), and the qt issue (mixing qt5 and qt6).

  • Using the Frontier HPC as an example, we first run the ParaView server on a login node. Be careful to log in to a dedicated login node: if we use ssh <username>@frontier.olcf.ornl.gov we might ssh to a random login node. There are different name patterns for the login nodes; on the Frontier cluster it is login01.frontier, login02.frontier, etc. This can differ between clusters. Assuming we are on login01.frontier, run the pvserver by ./pvserver --force-offscreen-rendering; the output is:
Waiting for client...
Connection URL: cs://login01:11111
Accepting connection(s): login01:11111
  • Build an ssh tunnel between the local machine (laptop) and the HPC login node. This is an important step, and we can do it this way:
ssh -L 11111:login01:11111 <username>@login01.frontier.olcf.ornl.gov 

This means we map port 11111 on the local machine to login01:11111 through <username>@login01.frontier.olcf.ornl.gov. This is the same as the normal ssh login process: we input the password and log in to the HPC.

  • Start ParaView on the local machine, choose File->Connect->Add Server, then add localhost:11111 with the startup type set to manual. Since we have already mapped the local port to the remote HPC machine, once we click the connect button, the client can connect to the pvserver on the remote machine. The server end shows Client connected.

  • A good practice is to start the pvserver on a dedicated compute node instead of the login node. For example:

srun -A <project id> -t 00:30:00 -N 1 -n 8 ./pvserver --force-offscreen-rendering

The output is like this when the node is allocated:

Waiting for client...
Connection URL: cs://frontier07337:11111
Accepting connection(s): frontier07337:11111

In this case, when building the ssh tunnel, we should use

ssh -L 11111:frontier07337:11111 <username>@login01.frontier.olcf.ornl.gov

This means we actually build a tunnel through login01 to the compute node frontier07337; the address of frontier07337 can be resolved on the login node. We can also build two tunnels. The first one is

ssh -L 11111:localhost:11111 <username>@login01.frontier.olcf.ornl.gov

After logging in to the login node, build the second one, which is

ssh -L 11111:localhost:11111 frontier07337

After building this tunnel, we can connect the client to the pvserver. Check the associated document to learn how to use a reverse connection.
