Scientific visualization with ParaView

This blog collects some quick notes about using ParaView for visualization. There are some videos here; this article records quick notes that can be useful in daily use.

Find the data id list

Once I needed to check whether adjacent mesh ids are also located at adjacent physical positions, and there is a really useful function for that: use View -> Find Data to label the particle ids in a specific range.

Connect to remote server

For a long time, I ran ParaView on my local machine and basically transferred the data to my local environment. That is not the most efficient way to use ParaView. In many cases the large-scale data is located on a cluster, so we need to run the ParaView server as a job on the cluster and then use the local ParaView as a client to log in to the remote server. Just click the connect button to fetch all configured servers. If you run on a typical HPC system, a terminal pops up; after entering the password and token, the pvserver is started according to the configuration, such as the number of processes.

Be careful: on a Mac we need to install XQuartz to trigger that terminal, and after installing XQuartz we need to restart the OS. This is the related discussion; after reinstalling, the DISPLAY parameter looks like this:

$ echo $DISPLAY
/private/tmp/com.apple.launchd.DAdu1PW5GV/org.xquartz:0

The good thing is that we can run multiple ParaView server processes to accelerate rendering. When everything works, we can use an extractor to generate PNG files and send them back to the local machine to make an animation. The PNG files are comparatively small and easy to transfer back and forth between client and server.
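To turn the extracted frames into an animation, tools such as ffmpeg usually want a contiguous, zero-based numbering. The sketch below renumbers the PNG frames written by an extractor; the `RenderView1_*.png` naming pattern is an assumption, so adjust the glob to match your extractor's actual output.

```python
# Sketch: collect the PNG frames written by a ParaView extractor and
# renumber them into a contiguous sequence for an animation tool.
# The "RenderView1_*.png" pattern is an assumption; adjust as needed.
import glob
import os
import shutil

def renumber_frames(src_dir, dst_dir, pattern="RenderView1_*.png"):
    """Copy frames into dst_dir as frame_0000.png, frame_0001.png, ..."""
    os.makedirs(dst_dir, exist_ok=True)
    frames = sorted(glob.glob(os.path.join(src_dir, pattern)))
    out = []
    for i, path in enumerate(frames):
        dst = os.path.join(dst_dir, "frame_%04d.png" % i)
        shutil.copyfile(path, dst)
        out.append(dst)
    return out
```

After renumbering, something like `ffmpeg -framerate 24 -i frame_%04d.png out.mp4` stitches the frames into a movie.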

The spreadsheet view and the particle/cell finder

When we click the split tab, we get a new tab that shows a different view. The default view is called the render view, which shows the graphics data. Another commonly used view is the spreadsheet view; the good thing about this view is that we can clearly check the point data, cell data, field data, etc., which provides a good understanding of the position of specific points or cells.

Furthermore, we can use View -> Find Data to select data points or cells according to properties such as the id, which is really useful for debugging vtk-related algorithms. The selected points or cells are labeled in the figure, providing a direct understanding of where the points/cells sit in the rendered geometry. We can even click extract to extract the associated geometry with different colors.

Data selection mechanism

ParaView provides a really flexible data selection mechanism: you can select the region of interest directly from the visualized results. This doc and this video provide a detailed explanation. After selecting the dedicated region, we can use a filter called Extract Selection to export the selected region as a separate data set. This is helpful for carving out a particular region to generate a data set for debugging; instead of preparing a dedicated debug data set by hand, we can just select it in the rendered figure and export it for testing. This is a really flexible capability and can be used together with the Find Data view.
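The same select-then-extract workflow can be driven from ParaView's Python interface. The sketch below uses the paraview.simple calls QuerySelect and ExtractSelection; the array name `id` and the range are hypothetical examples, and the pipeline part only runs inside pvpython.

```python
# Sketch: select elements by a Find Data style query, then extract the
# selection as a separate data set. Array name "id" and the range are
# hypothetical examples.

def id_range_query(lo, hi):
    """Build a Find Data style query selecting element ids in [lo, hi]."""
    return "(id >= %d) & (id <= %d)" % (lo, hi)

try:
    from paraview.simple import GetActiveSource, QuerySelect, ExtractSelection, Show
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the pipeline part

if HAVE_PARAVIEW:
    QuerySelect(QueryString=id_range_query(100, 200), FieldType="POINT")
    extracted = ExtractSelection(Input=GetActiveSource())
    Show(extracted)
```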

Processing multiple data blocks

In parallel computing, it is common that the data are output into separate VTK files. In VisIt, we can set up a metadata file and load all associated VTK files at once, such as:

!NBLOCKS 4
Block0.vtk
Block1.vtk
Block2.vtk
Block3.vtk
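Writing this metadata file by hand gets tedious for many blocks; a small script can generate it. This is a sketch assuming the `BlockN.vtk` naming pattern from the example above.

```python
# Sketch: generate the VisIt metadata (.visit) file shown above for an
# arbitrary number of blocks, assuming the BlockN.vtk naming pattern.

def write_visit_metadata(path, nblocks, pattern="Block{}.vtk"):
    """Write a !NBLOCKS header followed by one file name per block."""
    lines = ["!NBLOCKS {}".format(nblocks)]
    lines += [pattern.format(i) for i in range(nblocks)]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return lines
```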

In ParaView, one possible solution I found is to load all associated data into a vtkPartitionedDataSetCollection. There is no convenient built-in way to do that; a Python script can load the data and use a filter to group the blocks together.
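One way to sketch this in ParaView's Python interface: read each per-rank file with a reader and merge them with the GroupDatasets filter, which produces a single composite dataset. The `Block*.vtk` names reuse the example above; the pipeline part requires pvpython.

```python
# Sketch: group per-rank VTK files into one composite dataset with the
# GroupDatasets filter. File names follow the BlockN.vtk example above.

def block_files(n, pattern="Block{}.vtk"):
    """File names for the n partitions written by a parallel run."""
    return [pattern.format(i) for i in range(n)]

try:
    from paraview.simple import LegacyVTKReader, GroupDatasets, Show
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the pipeline part

if HAVE_PARAVIEW:
    readers = [LegacyVTKReader(FileNames=[f]) for f in block_files(4)]
    grouped = GroupDatasets(Input=readers)  # one composite dataset
    Show(grouped)
```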

Some small but important operations

Click the reset session button (the circle button) to clear the currently loaded data.

Click View -> Memory Inspector to monitor the system memory utilization.

When we load new data, we can paste the directory name into the File Name field (then click the navigate button) and the dialog jumps to the specific directory where the data is located; sometimes it is inconvenient to browse to that directory manually.

When we have two windows rendering data, we can right-click and link the views to make the two windows share the same camera.

If our data only has a scalar field but we want to try some vector field visualization techniques, the simple way is to use the gradient filter; then we have a vector field to play with.
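From Python this looks like the sketch below; the Gradient filter name matches recent ParaView versions, and the array name `pressure` is a hypothetical example. The pure helper shows the central-difference idea behind a gradient on a 1D sample series with unit spacing.

```python
# Sketch: derive a vector field from a scalar field with the Gradient
# filter. The helper illustrates the central-difference idea in 1D.

def central_differences(values):
    """Finite-difference derivative of a 1D series with unit spacing
    (one-sided differences at the two boundaries)."""
    n = len(values)
    return [(values[min(i + 1, n - 1)] - values[max(i - 1, 0)])
            / (min(i + 1, n - 1) - max(i - 1, 0) or 1)
            for i in range(n)]

try:
    from paraview.simple import GetActiveSource, Gradient, Show
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the pipeline part

if HAVE_PARAVIEW:
    grad = Gradient(Input=GetActiveSource())
    grad.ScalarArray = ["POINTS", "pressure"]  # hypothetical array name
    Show(grad)
```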

One online tutorial at OLCF about ParaView

The video is here; the online page is here.

Some architecture notes regarding the ParaView design

ParaView is divided into the client, the render server, and the data server. The design is based on the proxy pattern: many class names contain SMProxy, which means the class is a proxy for the server manager. Details of the proxy pattern can be found here.

Common flow vis tasks

1> show flow data by glyphs directly

I do not know a good way to show it in ParaView, so I just use VisIt: hit the vector button directly and we get a general feeling for how the vector field looks, a quick first opinion about the dataset. If the data come from some special source that is not well supported by VisIt, one option is to use Clean to Grid to convert the data into a standard VTK file and then load it into VisIt.

2> streamline (steady flow)

The easiest way: use the animated streamlines representation view (the seeds are placed throughout the whole data domain). Just load the associated streamlines plugin, and the results can be checked in the new representation view.

https://www.kitware.com/new-animated-stream-lines-representation-for-paraview-5-3/

The associated VisIt way is listed in another blog (it needs a lot of manual work).

3> Extracting the streamline for detailed analysis:

Check this answer. The output of Stream Tracer is really informative; it is convenient to inspect a specific polyline through ParaView's Find Data view. After that, use extraction to pull out that data set; the coordinates of each point are included in the result. It is convenient to output the coordinates of these points for further processing, such as analysis in Python.
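A sketch of this export step from Python: save the Stream Tracer result to CSV with SaveData, then read the coordinates back for analysis. Column names like `Points:0` follow ParaView's CSV export convention, which is an assumption worth verifying against your version; the seed setup is likewise a hypothetical example.

```python
# Sketch: save a StreamTracer result to CSV, then pull the point
# coordinates out for further processing in plain Python.
import csv

def read_points(csv_path):
    """Return the (x, y, z) coordinates stored in a ParaView CSV export."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return [(float(r["Points:0"]), float(r["Points:1"]), float(r["Points:2"]))
            for r in rows]

try:
    from paraview.simple import GetActiveSource, StreamTracer, SaveData
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the pipeline part

if HAVE_PARAVIEW:
    tracer = StreamTracer(Input=GetActiveSource(), SeedType="Line")
    SaveData("streamlines.csv", proxy=tracer)  # point data plus coordinates
```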

4> particle path with unsteady flow

The Particle Tracer filter is the particle advection operation in VTK for unsteady flow. There are two necessary inputs: the first is the input data, which can be a 2D slice of the original data; the second is the seed, which can be a line, so we need to add a line source separately before using this filter.

In ParaView's interface, when using the Particle Tracer filter, the seed input is the source, which can be the line source or something similar. For the source, the resolution parameter controls how many particles are seeded. This filter only advects particles between time steps. Only some options are exposed by the ParaView interface; to set more, we can call the VTK filter separately.

One important parameter is Force Reinjection Every NSteps; if we set it to 1, then after each time step a new particle is placed at the start position of the source and advected through the flow field.

If we want to draw lines based on it, we need to use TemporalParticlesToPathlines; the input of this filter is the output of the Particle Tracer.

One important parameter is Mask Points under TemporalParticlesToPathlines; it controls how many points are skipped when visualizing the path lines.
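The whole pipeline above can be sketched in Python as below. Filter and property names (ParticleTracer, ForceReinjectionEveryNSteps, TemporalParticlesToPathlines, MaskPoints) follow recent paraview.simple; the helper spells out what the reinjection setting means: the time steps at which fresh particles are injected at the seed positions.

```python
# Sketch: advect seed particles through an unsteady flow and turn the
# moving particles into path lines.

def reinjection_steps(total_steps, every):
    """Time steps at which particles are (re)injected for a given
    Force Reinjection Every NSteps value."""
    if every <= 0:
        return [0]  # inject only once, at the start
    return list(range(0, total_steps, every))

try:
    from paraview.simple import (GetActiveSource, Line, ParticleTracer,
                                 TemporalParticlesToPathlines, Show)
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the pipeline part

if HAVE_PARAVIEW:
    seeds = Line()  # seed source; its resolution controls particle count
    tracer = ParticleTracer(Input=GetActiveSource(), SeedSource=seeds)
    tracer.ForceReinjectionEveryNSteps = 1  # new particles every step
    paths = TemporalParticlesToPathlines(Input=tracer)
    paths.MaskPoints = 0  # keep every point of the path lines
    Show(paths)
```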

5> FTLE

Tips of building paraview

The general step-by-step process can be found here in the official document:

https://gitlab.kitware.com/paraview/paraview/-/blob/master/Documentation/dev/build.md

But I hit some small issues during the build.

1> Python issue. The details can be found here. Maybe just do not mix the Anaconda and the Homebrew Python; they might not work together properly, and the linker might find a wrong .so library. Using otool or ldd to double-check the linked shared libraries is a typical way to find the root cause. Or just disable the Python and MPI builds if we do not need those capabilities.

2> Wrong rpath for the Python library when executing the final binary. Maybe ParaView did not set the rpath correctly during compilation; use make VERBOSE=1 or ninja -v to show all compilation details and check the rpath. This is a detailed description of the issue.

3> Qt issue. The details can be found here. Just double-check the installed packages; for example, on a Mac, check the Homebrew environment and make sure only qt5 or qt6 is installed. The default package is qt6; for qt5, the package name is qt@5.

4> Some other issues, such as linking to a wrong .so file for some dependencies; check this one.

5> One time I hit a networking issue and git submodule did not run successfully. Remember to check git status to make sure it ran well; otherwise, try git submodule deinit -f . to reset it to the initial status. If git-lfs is not installed, it might still fail.

These are the steps to build ParaView on the Frontier supercomputer, without Qt, running ParaView in server mode.

module load gcc
module load python
module load ninja
module load mesa
module load git-lfs

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP -DCMAKE_BUILD_TYPE=Release -DPARAVIEW_USE_QT=OFF -DVTK_OPENGL_HAS_OSMESA=ON -DVTK_USE_X=OFF ../paraview

Paraview plugin

This can be a really useful technique, since wrapping your algorithm into a plugin lets it share ParaView's other capabilities. We will use a detailed example to explain how to wrap VTK-m things into a filter.

Let's take the VTKmFilters plugin as an example to show what you should do to add your own plugin.

Step 1: set up the associated CMakeLists so the algorithm compiles under the ParaView plugin commands, such as in these files. The specific commands are added through the VTK module build system, instead of plain CMake.

Step 2: clone the ParaView code and build it locally. Pay attention that ParaView has dependencies; after cloning, remember to use git submodule update --init --recursive to fetch all dependencies such as VTK and VTK-m.

Step 3: make sure a naive version of the plugin can be compiled against ParaView; this is an example from my personal project. It needs to find VTK-m and ParaView through the ParaView build dir (maybe just use the build dir to avoid occasional symbolic link issues).

Step 4: add the XML file that defines how the plugin interacts with the ParaView frontend. This doc details many XML labels and their meanings.

Step 5: debug the algorithm to make sure it works as expected; maybe run ParaView from the command line to see the printed log messages. On a Mac, go into the ParaView.app folder and run the binary (ParaView.app/Contents/MacOS/paraview) from the command line.

Step 6: program a class that inherits from a specific VTK algorithm such as vtkImageAlgorithm and implement the RequestData function. In this function, we decide how to get data from the input vtk data object and fill in the output vtk data, etc. It is quite flexible and case by case. Look at these blogs and this doc for more ideas.
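The RequestData idea can also be sketched in ParaView's Python plugin interface (VTKPythonAlgorithmBase); the C++ route with vtkImageAlgorithm follows the same request/response structure. The pass-through filter below copies input to output, which is the usual starting point before adding the real transform; the pure helper stands in for that hypothetical transform.

```python
# Sketch of a minimal RequestData in ParaView's Python plugin interface.
# The filter copies input to output; the real work replaces the comment.

def double_values(values):
    """Hypothetical example transform, kept pure so it is easy to test."""
    return [2.0 * v for v in values]

try:
    from paraview.util.vtkAlgorithm import VTKPythonAlgorithmBase
    HAVE_PARAVIEW = True
except ImportError:
    HAVE_PARAVIEW = False  # run inside pvpython to use the filter class

if HAVE_PARAVIEW:
    from vtkmodules.vtkCommonDataModel import vtkDataSet

    class PassThroughFilter(VTKPythonAlgorithmBase):
        """Minimal filter skeleton: one input port, one output port."""

        def RequestData(self, request, inInfoVec, outInfoVec):
            inp = vtkDataSet.GetData(inInfoVec[0], 0)
            out = vtkDataSet.GetData(outInfoVec, 0)
            out.ShallowCopy(inp)  # transform the copied data here
            return 1
```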

Using client/server mode to run ParaView

  • Build and install the same version of ParaView on the client and the server, respectively. For example, this is the cmake command to build and install ParaView on the server; remember to load the necessary modules such as mesa and gcc on the HPC system.
cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON \
-DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP \
-DCMAKE_BUILD_TYPE=Release \
-DPARAVIEW_USE_QT=OFF \
-DVTK_OPENGL_HAS_OSMESA=ON \
-DVTK_USE_X=OFF ../paraview
  • This is an example command to build ParaView on the client; check the official documents for different platforms.
cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON \
-DVTK_SMP_IMPLEMENTATION_TYPE=OPENMP \
-DCMAKE_BUILD_TYPE=Release ../../src/paraview

Be careful about the several common issues listed above, such as the Python version issue (mixing the system Python and the conda Python), the rpath issue (wrong rpath when linking the executable), and the Qt issue (mixing qt5 and qt6).

  • Using the Frontier HPC as an example, we first run the ParaView server on a login node. Be careful to log in to a dedicated login node: if we use ssh <username>@frontier.olcf.ornl.gov we might land on a random login node. The login nodes have different name patterns; on the Frontier cluster, they are login01.frontier, login02.frontier, etc., and this differs between clusters. Assuming we are on login01.frontier, run the pvserver with ./pvserver --force-offscreen-rendering; the output is:
Waiting for client...
Connection URL: cs://login01:11111
Accepting connection(s): login01:11111
  • Build an ssh tunnel between the local machine (laptop) and the HPC login node. This is an important step and we can do it this way:
ssh -L 11111:login01:11111 <username>@login01.frontier.olcf.ornl.gov 

This means we map port 11111 on the local machine to login01:11111 through <username>@login01.frontier.olcf.ornl.gov. This is the same as the normal ssh login process: we input the password and log in to the HPC system.

  • Start ParaView on the local machine and choose File -> Connect -> Add Server, then add localhost:11111 with startup type manual. Since we have already mapped the local port to the remote HPC machine, once we click the connect button, the client can connect to the pvserver on the remote machine successfully. The server end shows Client connected.

  • A good practice is to start the pvserver on a dedicated compute node instead of the login node. For example:

srun -A <project id> -t 00:30:00 -N 1 -n 8 ./pvserver --force-offscreen-rendering

The output is like this when the node is allocated:

Waiting for client...
Connection URL: cs://frontier07337:11111
Accepting connection(s): frontier07337:11111

In this case, when building the ssh tunnel, we should use

ssh -L 11111:frontier07337:11111 <username>@login01.frontier.olcf.ornl.gov

This means we actually build a tunnel through login01 to the compute node frontier07337; the IP of frontier07337 can be resolved on the login node. We can also build two tunnels: the first one is

ssh -L 11111:localhost:11111 <username>@login01.frontier.olcf.ornl.gov

After logging in to the login node, build the second one, which is

ssh -L 11111:localhost:11111 frontier07337

After building this tunnel, we can connect the client to the pvserver. Check the associated documentation to learn how to use a reverse connection.
