What are you working on?

Some thoughts on classifying CS-related topics, and on why it is not enough to divide them into a system track, an algorithm track, and a middleware track.

It is still hard for me to explain what I am working on to others, especially to people outside this field. Previously, I tried to explain the whole picture from the perspective of algorithms, systems, and applications, but that was still too obscure for other people. I have now found a new framework.

Maybe I could explain it this way. Within CS, there are two aspects to focus on (or two world views): data and resources.

The data aspect includes data generation, data transfer, data processing, and data storage or indexing.

The resource aspect includes computation, networking, and storage resources.
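The two aspects can be written down as a small data structure, which makes it concrete what "finding a position" for a piece of work means. This is only a minimal sketch: the names `DataStage`, `Resource`, and `describe` are made up for illustration, and the "parallel file system" example is an assumption, not something taken from a real project.

```python
from enum import Enum


class DataStage(Enum):
    """The data aspect: stages that data goes through."""
    GENERATION = "data generation"
    TRANSFER = "data transfer"
    PROCESSING = "data processing"
    STORAGE = "data storage or indexing"


class Resource(Enum):
    """The resource aspect: kinds of resources being used or managed."""
    COMPUTATION = "computation"
    NETWORKING = "networking"
    STORAGE = "storage"


def describe(topic, stages, resources):
    """Describe a piece of work by its position on both axes (hypothetical helper)."""
    stage_names = ", ".join(sorted(s.value for s in stages))
    resource_names = ", ".join(sorted(r.value for r in resources))
    return f"{topic}: works on {stage_names}; relies on {resource_names}"


# Hypothetical example: where a parallel file system project might sit.
print(describe(
    "parallel file system",
    {DataStage.TRANSFER, DataStage.STORAGE},
    {Resource.STORAGE, Resource.NETWORKING},
))
```

A piece of work usually touches more than one stage and more than one resource, which is why both arguments are sets rather than single values.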

With this view, it becomes easy to analyse particular things. People usually work in different contexts or semantics, for example IoT, cloud computing, edge computing, HPC, robotics, etc. Let’s see how to use this angle to analyse these domains.

Take IoT and edge computing. The data comes from the edge device (data generation), then it might be transferred to a central device, maybe in a data center (data transfer). After some processing, a decision is made: either the output is processed information that helps people make the decision, or an algorithm makes the decision itself (data processing). Once the decision is made, it is transferred back to the device (data transfer) and controls the resources (resource management). A simple case is lifting the barrier in a parking lot; a complicated case might be adjusting the engine of a spaceship.
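The parking lot case can be sketched as one pass through that loop. This is a toy sketch under stated assumptions: `Reading`, `transfer`, `decide`, `Barrier`, and `run_once` are hypothetical names, the "paid or not" rule is invented for illustration, and `transfer` only stands in for a real network hop.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """Data generation: one sample from a hypothetical edge sensor."""
    plate: str
    paid: bool


def transfer(message):
    """Data transfer: stands in for a network hop; here it just passes the value on."""
    return message


def decide(reading):
    """Data processing: turn the raw reading into a decision (assumed rule)."""
    return "lift" if reading.paid else "hold"


class Barrier:
    """Resource management: the device that the decision controls."""

    def __init__(self):
        self.is_up = False

    def apply(self, decision):
        self.is_up = (decision == "lift")


def run_once(reading, barrier):
    """One pass through the loop: generate, transfer, process, transfer, control."""
    at_center = transfer(reading)       # edge device -> central device
    decision = decide(at_center)        # processing at the center
    barrier.apply(transfer(decision))   # decision -> device, then control
    return barrier.is_up


# Usage: a car that has paid gets the barrier lifted.
print(run_once(Reading(plate="ABC123", paid=True), Barrier()))
```

Each function maps to one stage in the paragraph above, so someone working only on `transfer` or only on `decide` can still see the upstream and downstream of their piece.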

You can see that a real, working project includes both the data and the resource aspects. People may focus on different parts and optimize them. That is why it is hard to explain to others what they do: it is not easy to keep the whole picture in mind once we dive too deep into one particular part.

In other contexts such as HPC/cloud-related work, models from different scientific domains generate simulation data (data generation), or log messages are generated from user information. Generating this data requires computing resources. To manage those resources, for example allocating or releasing them, we need all kinds of resource management tools (such as Slurm or k8s). To use the resources properly, we need compilers and programming languages. Then we may need to transfer the data between different devices for further processing, so we need a data transfer medium such as disk or network, plus data indexing methods. That is the position of data I/O and storage. For data processing, we use different mathematical models or visualization tools to get some insights. Once we get some results, we can adjust the simulation based on those insights.

With this framework, namely the data angle and the resource angle, it is easy to find a position for what you are working on. Previously, we used the system, algorithm, and middleware tracks to classify CS topics; basically we only considered things in a static way, without thinking about the high-level workflow, and that makes it hard for people to understand the field as a whole. Besides, it is not good from the project’s point of view, since you may neglect the upstream and downstream if you scope your work as just the system part or the algorithm part. You may lose sight of the motivation, that is, how your piece contributes to the whole project.

One typical case is a middleware-related project such as the storage part: if you neglect the upstream and downstream, it is easy to produce something that is not useful, since the goal of the project is to support the loop of data generation, transfer, processing, and applying the decision to the resources/devices or users.
