The content for this part mainly comes from here.
There are several principles, simple but powerful, that might be the most important things to learn during PhD training. They can be viewed as the electric drill for getting new knowledge.
Ideas can come from the following perspectives, which are the first step toward getting new knowledge.
- Incremental. Once you have your first paper, you can always start to improve it. That might not be fancy enough, but it is still a step forward. Maybe you just add some experiments and push a conference paper into a journal paper. Similarly, read other people’s work, such as their theses, look at the future work or limitations they list, and ask whether you can remove that limitation or close that gap.
- Interdisciplinary. Just A+B: this is simple. Or use ideas from another domain to solve a problem in your domain. For example, some technique is implemented on the CPU and you migrate it to the GPU or other accelerators.
- Implementation and improvement. Implementing something is the straightforward way to learn the details. But it is dangerous to implement something without a goal or motivation. Your goal is always to make something faster, more accurate, more space-efficient, more efficient in resource use, or more flexible or adaptive in particular scenarios, and so on. With these motivations in mind, you might find “unusual suspects” during the implementation. Sometimes the idea is impeded by implementation constraints; for example, you may want to try something that runs on distributed nodes with GPUs but have no access to GPU resources. The habit of implementing aims to identify whether your idea is feasible, what the underlying obstacles are, and whether you can overcome them under the current constraints.
- Ask yourself questions, or discuss with others and let them ask you questions. When you go through your paper, be rigorous with yourself and assume you are a reviewer, then try to answer the questions that come up. By doing this, you may come up with new and better ideas, or get the whole picture of your work.
An idea always starts from a small and unclear motif; as you work on it, more things come out and the whole picture gets clearer.
For me, and I mean for people with an average mind, starting from an easy idea and using solid work to show that it works seems to be a good first step. As far as I remember, I never came up with fancy ideas or solutions to hard problems during my study journey.
Sometimes it is hard to think of a research idea. At that point, instead of torturing yourself, it is better to use some methods to guide yourself forward.
What is the research problem or the goal you want to target, such as a metric?
What is the context, such as the entity you want to study or target?
What is the current research status? (It is highly likely that some related work exists; you do not need to worry too much if your idea overlaps with other people’s ideas.)
What are the assumptions or conditions of other people’s methods? Basically, no solution can solve every problem or suit every condition.
Dive into the related work carefully and see what its conditions or assumptions are; sometimes they are not discussed much. When you can answer when a method is suitable for a problem, that is a good point to try to provide other solutions under other conditions.
When considering the conditions, pay attention to commonly used methods, such as how a thing is classified. (Maybe just search for the keyword “assume” or “require”.)
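The keyword search mentioned above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `find_assumption_sentences`, the keyword list, and the sample text are all made up for the example, and the sentence splitting is deliberately naive.

```python
import re

# Keywords that often introduce an assumption or precondition (illustrative list).
KEYWORDS = ("assume", "require")


def find_assumption_sentences(text, keywords=KEYWORDS):
    """Return the sentences that contain any keyword (case-insensitive)."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Made-up text standing in for the body of a related paper.
sample = (
    "We assume all nodes share a parallel file system. "
    "The scheduler runs every ten seconds. "
    "This method requires the input to fit in GPU memory."
)
hits = find_assumption_sentences(sample)
for sentence in hits:
    print(sentence)
```

Because the match is a plain substring match, variants such as “assumes”, “assumed”, or “requirement” are also picked up, which is usually what you want for a first pass over a paper.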
You may find some small ideas or minor points that you can improve. Do not let them go: many a little makes a mickle, and new ideas can emerge during this process.
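To do one thing or solve one problem, there are usually several similar approaches, say a, b, c, d, and e. One is better in some situations and another is better in others. Work out which parameters decide which approach wins, run experiments to evaluate them and show the differences, then discuss where each falls short and how it could be optimized, for example how existing techniques could further reduce the overhead. This method seems to be effective when you have just entered a field and want to discover hidden problems.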
As with any discussion about plots, we may use the handy mnemonic “WALTER” to make sure readers understand the plot: a) What is the plot and Why is it important? b) Axes. c) Labels. d) Trends. e) Exceptions. f) Recap. Although it may look clichéd to follow particular rules when writing a paper, following them is necessary at least for the first version, because it makes sure that every detail is clearly explained. Then, for the second version, you may merge and reorder these key pieces to make the work more readable.
Sometimes it is hard to tell whether an idea is worth doing, or even whether it is right. The safe way is to find a similar paper: maybe it uses the same methods for different data or a different domain, or maybe it focuses on a similar problem but its methods can be improved further. At least there is some research value in this direction. If you are lucky, what you want to do is the next step of a similar work or something that a previous paper ignored, and then it is easier to find the research motivation.
A dangerous case is confusing the tool with the goal. Sometimes you just think a particular method is good and dive deep into it but lose the overall picture. Always ask yourself what the metric, the evaluation goal, and the benefit for the whole picture are. If the answers are clear, you may be going in the right direction with a good motivation. Otherwise, you may just be reinventing the wheel with a different shape!
Even in a ten-page paper, the implementation takes just around one page; the more important parts are the motivation, methods, and evaluation. You may only need to describe some key points of the implementation, even if that is where you put most of your effort and time.
Sometimes it is important to get feedback from colleagues. You may hold different ideas. Before discussing, always try to empty yourself and assume that the other person’s idea is right, then follow their thoughts and go through them with your own logic. Listen to their full explanation; you will get more out of it and the conversation becomes more effective. Do not call other people’s ideas right or wrong; if you think a particular point is wrong, just ask specific questions about it. People may then be willing to give you more feedback, and you may find more places where your work can be improved. It is important to make sure you understand people’s ideas instead of declaring them incorrect in advance.
This idea is motivated by areas such as quantum computing. Even though a state-of-the-art quantum computer can process fewer than hundreds of qubits, new algorithms that use thousands of qubits can still be created and applied to synthetic cases or in a simulated environment. The simulated environment shows whether an idea works in an ideal setting. Sometimes it does not work even in the ideal case, and that is still a good try, since you avoid the wrong direction at a small cost. After playing with the simulated environment, better ideas may appear; that is its benefit. Using a simulated environment is similar to writing test cases or integration tests for a project: these tests help you find potential errors more efficiently. Then, going from the simulated or synthetic case to a workable example or project, you may need this dependency or that capability, and these things may take a lot of time. You can use the simulated or synthetic case to show your boss that what you are doing is meaningful. This can relieve your pressure during the long-running work and also convince yourself.
Two other similar ideas are to go from the simple case to the complex case, and from the naive solution to the optimized solution. For example, in one work I tried to provide a model and use it to decide how particular things work in different situations. I started from the real use case and found it hard to express things clearly, since the actual example included multiple cases and mixed different conditions together. So I decided to evaluate things on a particular concrete case based on a synthetic example and then move to the more complicated real case. During this process, I found that I had not considered several edge cases clearly. It is just like testing a large system: you need both unit tests and integration tests. It is dangerous to start from the integration test directly, since it is not convincing enough and you may miss some key situations. Always start from the simple case and the unit test. At least this is how things work in computer engineering, since you need to keep things controllable.
Similar ideas also apply to building a mathematical model or a strategy for a particular task: start from a simple strategy that applies to the general cases, and then move to a more complicated one by considering more edge cases and loosening the constraints. We may need to add more terms or conditions to the naive model.
When listing related work, a good practice is to find the clue (the thread that connects the works) instead of just describing the content of each one. The tricky point is that we need to find the knowledge graph hidden behind the related works. By reading a lot of related work we find that clue, and with it we can show whether the current works are sufficient and where we can easily extend them in new directions. This is also a good strategy for building a good motivation.
How to get a clue is important, and the idea here is to use classification. When we look at the structure of the related works, one simple way is to ask whether we can classify them. When we start to classify, we start to get new knowledge.
Another way is to put your target under different clues or goals (typically the A+B method for generating new things). This becomes a trick for generating more knowledge. For example, say you focus on a system that processes data. A common classification for such a system is producer and consumer, which are the different parts it is constructed from. Then maybe you want it to become an adaptive system, and there are classifications for describing adaptive systems, such as adaptive goals, adaptive policies, and adaptive mechanisms. Now you have two dimensions to describe the system, and a 3 by 2 configuration space that you can use to organize related works. You may find that the content of one paper falls into multiple entries, and that some entries are not well explored. That is a good opportunity for the next step.
Basically, you use different dimensions to describe one target, and each dimension has some discrete values that are its typical types. Figuring out how the target works along each dimension is the typical process of learning about it. You may then create more discrete values, which is a kind of contribution that extends the knowledge. Another way is to combine different dimensions, which is a simpler way to extend the entries: if dimension a has the classification [a1, a2] and dimension b has the classification [b1, b2, b3], then you just look at the system from the perspectives (a1, b1), (a1, b2), (a1, b3), …
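As a rough illustration of combining dimensions, the producer/consumer and adaptive goal/policy/mechanism example can be written as a small Python sketch. The paper names and the entries they cover are entirely hypothetical; only the two dimensions come from the text above.

```python
from itertools import product

# Two dimensions, using the example values from the text.
components = ["producer", "consumer"]
adaptivity = ["goal", "policy", "mechanism"]

# Every (component, adaptivity) pair is one entry of the configuration space.
space = list(product(components, adaptivity))

# Hypothetical related works, each tagged with the entries it covers.
# One paper may fall into several entries.
related_work = {
    "paper_A": {("producer", "policy"), ("consumer", "policy")},
    "paper_B": {("producer", "mechanism")},
    "paper_C": {("consumer", "goal"), ("consumer", "mechanism")},
}

# Entries that no related work touches are candidates for the next step.
covered = set().union(*related_work.values())
unexplored = [entry for entry in space if entry not in covered]

print(len(space), "entries in total")
print("not yet explored:", unexplored)
```

Adding a third dimension only means passing one more list to `product`, so the same table can grow as you discover new ways to classify the target.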
When we talk about the background and motivation of a piece of work, both BFS and DFS are important strategies for describing the knowledge tree. Especially in writing the introduction, we basically move from a branch to a specific problem or issue step by step. Sometimes we easily neglect the BFS part. For example: in context A there is a problem P; there are several methods, such as M1, M2, and M3, to solve it; this paper improves the performance of M1. In this case we may easily neglect A2, A3, … and P2, P3, … in the description. Although we may not dive into the details of these cases, it is still worth mentioning them when we introduce A and P. This makes the description progress more systematically and makes it easy for the reader to locate what you have done in the whole knowledge tree. Sometimes letting people know the scope of the work matters more than how you actually did it, since most methods are common, easy to understand, or minor improvements. People may just need to be sure that what you did makes sense and the results make sense; that is enough for the basic level of a paper. Novelty is something we hope for but do not often have.
It might also be good practice to make sure you understand the problem clearly by pushing yourself to describe things in this way. An interesting analogy is the generative and discriminative models used in machine learning. When we consider a thing in the generative way, we are basically doing DFS: we shape its distribution step by step and refine it continuously. But at some point it is no longer worth shaping the description further, and we tend to use the discriminative method and consider the distinction between this thing and other things. Basically, when we draw distinctions we use BFS (the discriminative model), and when we describe the properties of something we use DFS (the generative model). Without a clear definition of the sample space, it is really hard to describe something clearly and make its definition unique. For example, say we want to define a person. It is hard to describe one clearly, since different aspects of people overlap in all kinds of ways; even using all biological characteristics, there is still the possibility of twins. We may use the generative model to limit the description to a particular scope and then use the discriminative model within it. This might be a more economical way. In a paper, the last point is basically the contribution or research value: you describe the difference between your work and other works that try to solve similar tasks. Your work might be better in this case and other work might be better in that case, and so on.
One important thing is the discussion of the results. For example, when you build something like a model and get results, the reviewer or reader wants to see your observations and your thoughts about them, namely why the results look like this. There is a tradeoff in how much content and effort goes into describing implementation details versus analyzing the results. Both are important, but the analysis is more important, especially in a presentation. You may take several days to make something work, yet in the presentation it is just one sentence, whereas for the analysis you can discuss a lot.
Error discussion is easy to neglect, especially in work that focuses on a particular system or framework. One kind of paper presents a framework and then analyzes its performance on key metrics such as scalability, I/O, or throughput, and claims that the work outperforms the state of the art.
But I have had few chances to write papers like that. More of the work I did presents a model, says we want to evaluate or compare something, uses that as the motivation for developing a particular framework (our implementation), and then says the framework works well according to the model. This mixes model and framework development together. It is actually a little hard: on the one hand, you need to show that the framework performs well in terms of scale and key metrics; on the other hand, the modeling approach must be presented. Although neither aspect is really strong (it is hard to produce a purely theoretical paper or a pure system paper), at least there is enough content to be published.
I have some knowledge of framework evaluation, such as varying the configuration and checking the performance, but little knowledge of model evaluation.
There should be a distinction between the model and the policy or algorithm.
For a model, metrics such as error need to be included in the evaluation; this is something that may be neglected. We may provide a model (an abstraction of something), build a policy based on it, and then use the policy directly. The unconvincing part of this procedure is the error of the model itself. We should first figure out the requirements and limitations of the model and the errors that influence it, and then move to the next step: how to make the policy and model more accurate in representing and predicting the actual problem.
Let’s take a simple example. If you want to line up several kids efficiently, the first step is to abstract each kid as a number, namely the height; there are many other aspects, but they can be neglected in this problem. Then we can use a particular policy, such as the quicksort algorithm, to do it. On the model side, however, if we use heights recorded one year ago instead of the kids’ current heights, the results might be inaccurate. The results might also be affected by inaccurate measurements when building the model.
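The kids example can be made concrete with a short Python sketch that separates the policy (sorting) from the model (the height data). The names and heights are invented for illustration, and Python’s built-in `sorted` stands in for quicksort here, since the choice of sorting algorithm does not affect the point.

```python
# Hypothetical heights in cm; the names and numbers are made up.
recorded_last_year = {"Ann": 120, "Bo": 125, "Cy": 118, "Di": 123}
measured_today = {"Ann": 131, "Bo": 127, "Cy": 126, "Di": 124}


def line_up(heights):
    """Policy: order the kids from shortest to tallest."""
    # Python's built-in sort is used in place of quicksort.
    return sorted(heights, key=heights.get)


# Same policy, two versions of the model data.
stale_order = line_up(recorded_last_year)
true_order = line_up(measured_today)

# Model error: how many kids stand in the wrong position
# when the policy is fed last year's data.
misplaced = sum(a != b for a, b in zip(stale_order, true_order))

print("order from old records:", stale_order)
print("order from today's data:", true_order)
print("kids in the wrong position:", misplaced)
```

The policy is correct in both runs; the wrong line-up comes entirely from the stale model, which is why the model error deserves its own place in the evaluation.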
Sometimes, when you struggle to find a good direction or aspect, trying to collect more data is a good way to open up a new perspective. Consider what data you have, whether you have fully explored it, whether you could extract more data or information from the current system, and whether it is possible to collect more. This applies not only to research but also to other real-life problems. When you are not sure about something, ask other people for ideas: the more you ask, the more you know about the real situation and the more directions you have.
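Recently two ways of thinking have struck me. One is “information eliminates uncertainty”; a more detailed discussion comes from Dr. Wu Jun’s 硅谷方法论 (Silicon Valley Methodology). The other is “means and ends”.
Discussions of means and ends often bring to mind the old verse: one cannot see the true face of Mount Lu because one is standing inside the mountain. Work that falls under the broad category of systems is especially prone to this, because the goal of system work is often not easy to find. Making something work on a new system or device is, in the end, still a means; the goal is to compare the characteristics of different devices, or to make something faster, and so on. One paper that I think can serve as a model is “Full-Stack, Real-System Quantum Computer Studies: Architectural Comparisons and Design Insights”. Its main content is actually work on a compiler for quantum computers, but its entry point is not the compiler; as the title says, it is architecture comparison. Merely introducing the compiler might not make a particularly good paper, but changing the framing lifts it to another level, so that the means revolve around the goal: the compiler is developed in order to evaluate. This also makes the work on the tool more convincing.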
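The core idea, as the title says: find a promising scenario, build something that makes it work, then optimize it. This seems to be a common research pattern. For a recent good example, see this IPDPS’21 paper: “Designing High-Performance MPI Libraries with On-the-fly Compression for Modern GPU Clusters”.
From a research perspective, the optimization part is what really has research taste or research value. If you set aside the optimization and only present the first part, it is hard to claim that only you did this and nobody else did, because most work has one or two similar predecessors. Then you have to say where yours is better than others’, or what others have not done. For algorithms this is fairly easy, but for framework-style work it is not easy to demonstrate. So from a research point of view, the more convincing routine is to find a new scenario as far as possible, and then do the work of accomplishing and optimizing the task in that new scenario. Otherwise the motivation is easily questioned, and the work is also harder to present.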