Many faces of vis research

Some thoughts on the different aspects of visualization research.

Which domain

The vis operation is simple to understand: you visualize the data, that's it. The two key questions are how to visualize the data and what data you need to visualize.

I use a simple way to divide these domains: data generated by scientific simulation, and everything else. (Using just two options keeps things easy at the first step.)

For vis of data generated by scientific simulation, it is important to look at the different types of data (scalar, vector, or volume rendering) and the different scientific domains.

The vis domain is really active, and there are lots of new concepts and papers for general visualization, such as infovis, biovis, scientific vis, etc. Different domains require different visualization techniques.

One interesting thing is to consider how much domain knowledge you need to understand, and where the line is. At minimum, you need some sense of the format of the data you are trying to visualize.

Some classification of the datasets

Data generated by simulation:

One important type is data generated by the FEM method; another is data generated by the CFD method. This article discusses the difference between FEM and CFD; these methods are used a lot in industry, such as in simulations of aircraft parts.

Another type is data generated by N-body simulation methods. Typical examples are molecular dynamics simulations and datasets generated by cosmological simulations. These simulations are used a lot in the research domain.

Observed or profiling data:

Typical observed data are CT data, or data based on specific signals or dedicated devices that cannot be read by humans directly. We may need some reconstruction strategy to process these data.

Profiling data are usually data about a large computing platform or system. We use all kinds of timers or log files to acquire these data.

Image data is the most natural data format: we just take a photo of a particular object. Then there are all kinds of image processing techniques; this is the typical target data for the image processing domain. The data here can come from the scientific domain or other domains. These data are much easier to acquire, and their format is comparatively simple.

Understanding these things can help decide the priority of tasks. I spent several days writing a solver for a particular PDE, but found it was not very helpful from the research perspective. If the main research contribution is in situ data analysis/vis and workflow management, your main work is not data production; it is data processing. Instead of spending several days writing the solver yourself, it may be better to spend that time learning how to use existing software to generate the data and how to make your in situ tools work with it. Data production is something you need to know about (learn the existing software, use it, and get familiar with its background) rather than work on directly (put code into it). The vis/analysis and workflow management parts are where you need to make a direct code contribution.

To scale or not to scale

That is one important question for research work. For scaled vis algorithms, the main challenge is how to run at large scale and process the data generated by large-scale simulations; there might be some reduction techniques here, and even in situ processing approaches.
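To make the reduction idea concrete, here is a minimal sketch in plain Python (all names hypothetical, not from any particular library): instead of writing the full field to disk at every timestep, keep only a compact per-step summary such as min/max/mean and a small histogram.

```python
def reduce_field(field, bins=8, lo=0.0, hi=1.0):
    """Reduce a full scalar field to a small per-timestep summary."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in field:
        # Clamp into the last bin so v == hi does not fall out of range.
        idx = min(int((v - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1
    return {
        "min": min(field),
        "max": max(field),
        "mean": sum(field) / len(field),
        "hist": counts,
    }

# A field of N values collapses to a handful of numbers, which is what
# makes large-scale (or in situ) analysis affordable.
summary = reduce_field([0.1, 0.25, 0.5, 0.5, 0.9])
print(summary["min"], summary["max"], summary["hist"])
```

Real systems use fancier reductions (sampling, compression, statistical summaries), but the trade-off is the same: lose detail to gain scalability.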

Actually, a lot of research or post-processing analysis/vis is executed in an unscaled way: we use a personal device to look at the data back and forth, get results, and find new insights. For this trend, the main research challenge is to look at new datasets (new domains, such as data generated by quantum systems, or VR for rendering) and new visualization methods (such as AI-based approaches for visualization, or fancy interaction such as dialogue-based interaction).

Post processing or in situ

Post processing is the most natural form. You put the data on disk, then use interactive software to explore the data and get insights. The main challenge here is to develop software that produces good visualization results or provides dedicated filters or new algorithms for new datasets.

In situ processing is associated with large data sizes and parallel processing. That is the typical scenario that in situ processing targets.

It might be a little easier to do post processing first and then move to in situ processing. Pay attention to the different challenges in these scenarios. The in situ case cares more about the scale aspect, and the dataset is not as easy to acquire, since we do not store the datasets; we need to make sure the simulation runs well and the vis part runs properly together with the simulation, which is not easy.
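The coupling described above can be sketched in a few lines of plain Python (all names hypothetical): the analysis routine runs as a hook inside the simulation loop, touching the data while it is still in memory, instead of reading it back from disk afterwards.

```python
def simulate_step(state):
    # Stand-in for one timestep of a real solver.
    return [v * 0.5 for v in state]

def in_situ_hook(step, state, results):
    # Runs inside the loop; here we record a tiny summary per step.
    results.append((step, max(state)))

def run(state, steps, hook):
    results = []
    for step in range(steps):
        state = simulate_step(state)
        hook(step, state, results)  # vis/analysis shares the run with the simulation
    return results

results = run([1.0, 2.0, 4.0], steps=3, hook=in_situ_hook)
print(results)  # per-step maxima; the full state was never written to disk
```

This also shows why in situ is harder to develop against: if the hook crashes or is slow, it takes the whole simulation down with it, and you cannot simply re-run the analysis without re-running the simulation.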

Different funding sources

NSF: It seems that research work from universities is more interested in non-scaled visualization systems; their main funding source is the NSF. They encourage fancy ideas, even though these ideas might not be used in actual systems directly. For example, I recently noticed some work on visualizing quantum systems, using language models to interact with the vis system, and all kinds of deep learning things. Infovis is really popular in this domain. There are all kinds of smart ideas here; recent popular topics are explainable AI systems and visualization for new systems such as quantum computing. You have more freedom to choose what you want to look at if you work at a university as a professor; the topics are usually of the high-risk, high-reward type.

DOE: The funding from the DOE is more focused on the scientific domain; the applications they care about are parallel computing on big machines. So the mission here is clear: how to support these large simulation programs. Visualization research here is more focused on the parallel situation.

NIH: The funding here is more focused on image processing, image reconstruction, etc., since lots of NIH funding focuses on human health.

Industry: Industry focuses on domains that can make money; industrial manufacturing is what they target. They may focus on tight coupling with domain experts to make sure they can help a specific domain, such as computer-aided design (CAD), CFD, FEM, etc.

Actually, the funding source decides the type of research you work on.

One interesting thing is to look at the research path of a professor: as he moves from the scientific vis domain to the more general vis domain, the funding source and the type of research change a lot.

Things in the toolbox

Graphics-level things, such as OpenGL rendering; the ability to customize the vis pipeline and show its results, such as Qt; post-processing techniques and the associated software, such as the common filters based on VTK and VTK-m, plus the parallel side: how do the common filters run in parallel; and the domain knowledge for the datasets you work on and how they are generated.
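The "customize the vis pipeline" item is worth unpacking. Libraries like VTK organize everything as sources, filters, and sinks connected in a pipeline. Here is a toy sketch of that pattern in plain Python (this is NOT the real VTK API, just the idea: each stage pulls data from its upstream connection on demand).

```python
class Source:
    """Produces data; stands in for a reader or data generator."""
    def __init__(self, data):
        self.data = data
    def output(self):
        return self.data

class ThresholdFilter:
    """Keeps only values above a threshold, like a simple VTK-style filter."""
    def __init__(self, upstream, threshold):
        self.upstream = upstream
        self.threshold = threshold
    def output(self):
        # Pull from upstream on demand, then transform.
        return [v for v in self.upstream.output() if v > self.threshold]

src = Source([1, 5, 3, 8, 2])
filt = ThresholdFilter(src, threshold=2)
print(filt.output())  # [5, 3, 8]
```

Once you internalize this pattern, writing a "dedicated filter for a new dataset" mostly means writing one new stage and plugging it into an existing pipeline.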

Other thoughts

The key thing is not to work in an interesting domain; the key thing is to know the challenges and mission your research domain has, and how to make your contribution.

There are some really good answers like this about different research directions in the vis domain.
