paper_taolu

A collection of so-called "taolu" (routines, or recurring patterns) for doing research and writing papers.

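These so-called routines are really just the scientific way of thinking. Relying on experience and intuition alone may not work well in scientific matters; at the very least, refining an idea requires a scientific way of thinking.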

how to come up with new ideas

The content of this part mainly comes from here:

https://github.com/asherliu/researchHOWTO

There are several principles, simple but powerful, and they might be the most important things learned during PhD training. They can be viewed as an electric drill for getting at new knowledge.

Ideas can come from the following perspectives, each of which is a first step toward new knowledge.

  • Incremental. Once you have your first paper, you can always start by improving it. That may not be fancy, but it is still a step forward; it might be as simple as adding some experiments and extending a conference paper into a journal paper. Similarly, read other people’s work, such as their theses, look at the future work or limitations they list, and ask whether you can remove that limitation or close that gap.
  • Interdisciplinary. This is just A+B, which is simple: use ideas from another domain to solve a problem in your own domain. For example, some technique is implemented on the CPU and you migrate it to the GPU or other accelerators.
  • Implementation and improvement. Implementing something is the straightforward way to learn its details, but it is dangerous to implement something without a goal or motivation. The goal is always to make something faster, more accurate, more space-efficient, more efficient in its use of resources, or more flexible and adaptive in particular scenarios. With these motivations in mind, you might find “unusual suspects” during the implementation. Sometimes an idea is blocked by practical constraints; for example, you may want to try something that runs on distributed nodes with GPUs but have no access to GPU resources. The habit of implementing things helps you determine whether your idea is feasible, what the underlying obstacles are, and whether they can be overcome under the current constraints.
  • Ask yourself questions, or discuss with others and let them ask you questions. When you go through your paper, be rigorous with yourself and assume you are a reviewer, then try to answer those questions. By doing this you may come up with new or better ideas and get a fuller picture of your work.

An idea always starts from a small and unclear motif; as you work on it, more things come out and the whole picture becomes clearer.

For me, meaning a person of average ability, starting from an easy idea and using solid work to show that it works seems to be a good first step. As far as I recall, I never came up with fancy ideas or solutions to hard problems during my studies.

when it is hard to move forward

Sometimes it is hard to come up with a research idea. At that point, instead of torturing yourself, it is better to use some questions to guide yourself forward.

What is the research problem or the goal you want to target, for example a metric?

What is the context, such as the entity you want to study or target?

What is the current state of the research? (It is very likely that some related work exists; you do not need to worry too much if your idea overlaps with other people’s ideas.)

What are the assumptions or conditions of other people’s methods? Basically, no solution solves every problem or suits every condition.

Dive into the related work carefully and see what its conditions or assumptions are; sometimes the authors do not discuss them much. Once you can say when a method is suitable for a problem, you are in a good position to offer other solutions under other conditions.

When considering the conditions, pay attention to commonly used approaches such as how a thing is classified. (One way is simply to search for the keywords “assume” or “require”.)

You may find some small ideas or minor points that you can improve. Do not let them go: many a little makes a mickle, and new ideas can emerge during this process.

controlled experiments

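This method seems to be used more in the natural sciences and in data science, when the underlying mechanism is still largely a black box. A classic example is the discovery that scurvy is related to vitamin C: the patients were divided into several groups, each group was treated in a different way, and in the end the association between the input and the outcome was found.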

evaluation

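Suppose there are several similar ways, a, b, c, d and e, to do something or solve a problem, and one is better in some situations while another is better in others. Identify the parameters that determine which one does better, run experiments to evaluate them and bring out the differences, then discuss where each falls short and how it could be optimized, for example how existing techniques could further reduce the overhead. This approach seems effective when you are new to a field and want to uncover hidden problems.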
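The evaluation routine above can be sketched as a small parameter sweep. This is only an illustration: the two cost models, the parameter `n` and the `sweep` helper are hypothetical stand-ins for real measurements of real methods.

```python
# Hypothetical cost models: method "a" pays a fixed setup overhead,
# method "b" has no setup but a higher per-item cost.
methods = {
    "a": lambda n: 50 + 1 * n,
    "b": lambda n: 5 * n,
}

def sweep(methods, values):
    """Evaluate every method at every parameter value and record the winner."""
    table = {}
    for v in values:
        costs = {name: cost(v) for name, cost in methods.items()}
        winner = min(costs, key=costs.get)  # lower cost is better
        table[v] = (winner, costs)
    return table

if __name__ == "__main__":
    for n, (winner, costs) in sweep(methods, [1, 10, 100]).items():
        print(n, winner, costs)
```

With these made-up models, "b" wins for small `n` and "a" wins once its fixed overhead is amortized, which is the kind of crossover the discussion of the results would then try to explain and optimize.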

a tip about the results discussion

As with any discussion of plots, the handy mnemonic “WALTER” helps make sure readers understand a plot: a) What is the plot and Why is it important? b) Axes. c) Labels. d) Trends. e) Exceptions. f) Recap. Following fixed rules to write a paper may look clichéd, but at least for the first version it is necessary, because it ensures that every detail is clearly explained. For the second version, you can merge and reorder these key pieces to make the work more readable.

faster, higher and stronger

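This is the direct kind of contribution: how to solve the original problem better, given that the way of doing the thing or the method for solving the problem is already clear. Papers of this type usually make a large contribution or report a breakthrough, solving a problem that was not solved before. The evaluation is a comparison of two groups, without the method and with the method, highlighting how much better the metric is once the method is used.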

Of course there are also some tricks, such as obtaining unique data or having access to machine resources that others cannot use.

what is a good idea or novelty

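Something others do not know; something that is not a local, incremental change; a solution that applies to a whole class of problems; not simply an A+B problem.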

motivation, tool vs goal

Sometimes it is hard to tell whether an idea is worth doing, or even whether it is right. The safe way is to find a similar paper: maybe it uses the same method on different data or in a different domain, or maybe it focuses on a similar problem but its method can be improved further. Either way there is at least some research value. If you are lucky, what you want to do is the next step of a similar work or something a previous paper ignored, and the research motivation is then easier to find.

A dangerous case is confusing the tool with the goal. Sometimes you simply think a particular method is good and dive deep into it but lose the overall picture. Always ask yourself what the metric is, what the evaluation goal is, and what the benefit is for the whole picture. If the answers are clear, you are probably heading in the right direction with a good motivation. Otherwise, you may just be reinventing the wheel in a different shape!

Even in a ten-page paper, the implementation takes only about one page; the motivation, methods and evaluation matter more. You may need to describe only a few key points of the implementation, even if that is where most of your time and effort went.

discussion and motivation

Sometimes it is important to get feedback from colleagues. You may hold different ideas, so before discussing, always try to empty your own mind and assume that the other person’s idea is right, then follow their thinking and check it with your own logic. Listen until the other person has fully expressed themselves; you will get more out of the discussion and the exchange becomes more effective. Do not declare other people’s ideas right or wrong; if you think a particular point is wrong, just ask specific questions about it. People are then more willing to give you feedback, and you may find more places in your work that can be improved. It is important to make sure you understand people’s ideas instead of declaring them incorrect in advance.

from the synthetic or simulated env to the real project

This idea is motivated by areas such as quantum computing: even though state-of-the-art quantum computers can handle fewer than hundreds of qubits, new algorithms that use thousands of qubits can still be created and applied in a synthetic case or simulated environment. A simulated environment shows whether an idea works in the ideal case. Sometimes it does not work even there, and that is still a good attempt, since you avoid a wrong direction at small cost. After playing with the simulated environment you may arrive at better ideas, and that is its benefit. Using a simulated environment is similar to writing test cases or integration tests for a project: such tests help you find potential errors more efficiently. Moving from the simulated or synthetic case to a working example or project, you may need this dependency or that capability, and these things can take a lot of time. Based on the simulated or synthetic case, you can show your boss that what you are doing is meaningful. This relieves pressure during the long-running work and helps you stay convinced yourself.

from simple to complex and from naive to optimized

Two other similar ideas are to go from the simple case to the complex case and from the naive solution to the optimized solution. For example, in one work I tried to provide a model and use it to decide how particular things behave in different situations. I started from the real use case and found it hard to express things clearly, because the actual example included multiple cases and mixed different conditions together. So I decided to evaluate particular concrete cases based on synthetic examples first and then move to the more complicated real case. During this process I found that I had not considered several edge cases clearly. It is just like testing a large system: you need both unit tests and integration tests. Starting directly from the integration test is dangerous, since it is not convincing enough and you may miss some key situations. Always start from the simple case and the unit test. At least that is how things work in computer engineering, where you need to keep things controllable.

Similar ideas apply to building a mathematical model or a strategy for a particular task: start from a simple strategy that applies to the general cases, then move to a more complicated one by considering more edge cases and loosening the constraints. We may need to add more terms or conditions to the naive model.

When listing related work, good practice is to find the clue, that is, the thread connecting the works, instead of just describing the content of each one. The tricky point is that we need to find the knowledge graph hidden behind the related works. By reading a lot of related work we find that clue, and with it we can show whether the current works are sufficient and how easily they can be extended in new directions. This is also a good strategy for building a good motivation.

How to find a clue is important, and the idea here is to use classification. When looking at the structure of the related works, one simple way is to ask whether we can classify the thing. Once we start to classify it, we start to gain new knowledge.

Another way is to place your target under different clues or goals (typically the A+B method of generating new things). This becomes a trick for generating more knowledge. For example, suppose you focus on a system that processes data. A common classification for such a system is producer and consumer, which are the different parts it is built from. Then maybe you want it to become an adaptive system, and there are classifications for describing an adaptive system, such as adaptive goals, adaptive policies and adaptive mechanisms. You now have two dimensions to describe the thing, and a 3 by 2 configuration space in which to place related works. You may find that the content of one paper falls into multiple entries, and in this way you may find that some entries are not well explored. That is a good opportunity to take the next step.

Basically, you use different dimensions to describe one target, and each dimension has some discrete values that are its typical types. Figuring out how the target works along each dimension is the typical process of learning about it. You may then create more discrete values, which is one kind of contribution that extends the knowledge. Another, simpler way to extend the entries is to combine different dimensions: if dimension a has the classification [a1,a2] and dimension b has the classification [b1,b2,b3], you just look at the system from the perspective of a1,b1; a1,b2; a1,b3…
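The configuration space described above can be sketched in a few lines of Python. The dimension values follow the producer/consumer and adaptive-system example in the text; the paper names and the cells assigned to them are hypothetical placeholders, not real related works.

```python
from itertools import product

# Classification dimensions, following the example in the text.
dimensions = {
    "component": ["producer", "consumer"],
    "adaptivity": ["goal", "policy", "mechanism"],
}

# Hypothetical related works, each tagged with the cells it covers.
# One paper may fall into more than one cell.
related_works = {
    "paper_x": [("producer", "policy")],
    "paper_y": [("consumer", "policy"), ("consumer", "mechanism")],
}

def build_grid(dimensions):
    """Enumerate every cell of the configuration space."""
    return list(product(*dimensions.values()))

def unexplored_cells(dimensions, related_works):
    """Return the cells that no related work has been placed in."""
    covered = {cell for cells in related_works.values() for cell in cells}
    return [cell for cell in build_grid(dimensions) if cell not in covered]

if __name__ == "__main__":
    for cell in unexplored_cells(dimensions, related_works):
        print(cell)
```

Adding a new value to one of the lists, or a third dimension to `dimensions`, enlarges the grid in the same way the text describes extending the knowledge.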

The BFS and DFS to describe the background knowledge (introduction writing)

When we talk about the background and motivation of a work, BFS and DFS are both important strategies for describing the knowledge tree. In introduction writing especially, we basically move step by step from a branch down to one specific problem or issue. It is easy to neglect the BFS part. For example: in context A there is a problem P; there are several methods to solve it, such as M1, M2 and M3; and this paper improves the performance of M1. In this case we may easily neglect A2, A3… and P2, P3… in the description. Although we may not go into the details of these cases, they are worth mentioning when we introduce A and P. This makes the description progress more systematically and makes it easy for the reader to locate what you have done in the whole knowledge tree. Sometimes letting other people know the scope of the work matters more than how you actually did it, since most methods are common, easy to understand, or minor improvements. People may just need to be sure that what you do makes sense and that the results make sense; that is enough for a basic-level paper. Novelty is something we hope for but do not often have.

It might also be good practice to check that you understand the problem clearly by pushing yourself to describe things in this way. An interesting analogy is the generative and discriminative models used in machine learning. When we consider a thing in the generative way, we are basically doing DFS: shaping the distribution step by step and refining it continuously. At some point, though, it is no longer worth continuing to refine the description, and we tend to switch to the discriminative way and consider what distinguishes this thing from other things. Basically, when we draw distinctions we use BFS, and when we describe the properties of something we use the generative model. Without a clear definition of the sample space, it is really hard to describe something clearly and make it unique by definition. For example, say we want to define a person. It is hard to do so clearly, since different aspects of people overlap in all kinds of ways; even using every biological characteristic, there is still the possibility of twins. We may use the generative model to narrow the description to a particular scope and then use the discriminative model within it, which might be more economical. In a paper, the final point is the contribution or research value: describe the difference between your work and other works that try to solve similar tasks. Your work might be better in one case and other work better in another.
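The context A, problem P, methods M1 to M3 example from this section can be sketched as a small tree. The tree itself and the root name `field` are assumptions made for illustration; `dfs_path` gives the depth-first line of an introduction, and `bfs_levels` lists what sits at each level so that the siblings (A2, A3, P2, P3, M2, M3) are not forgotten.

```python
# Hypothetical knowledge tree using the labels from the text.
tree = {
    "field": ["A", "A2", "A3"],
    "A": ["P", "P2", "P3"],
    "P": ["M1", "M2", "M3"],
}

def dfs_path(tree, root, target):
    """Depth-first search for the path from root to target (None if absent)."""
    if root == target:
        return [root]
    for child in tree.get(root, []):
        sub = dfs_path(tree, child, target)
        if sub is not None:
            return [root] + sub
    return None

def bfs_levels(tree, root):
    """Breadth-first traversal, with the nodes grouped by depth."""
    levels, frontier = [], [root]
    while frontier:
        levels.append(frontier)
        frontier = [c for node in frontier for c in tree.get(node, [])]
    return levels

if __name__ == "__main__":
    print(dfs_path(tree, "field", "M1"))
    for depth, nodes in enumerate(bfs_levels(tree, "field")):
        print(depth, nodes)
```

An introduction that follows only the `dfs_path` output goes straight to M1; mentioning the other entries of each `bfs_levels` row along the way is the BFS part that is easy to neglect.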

model approach and error analysis

One important thing is the discussion of the results in the related works. For example, if you build something like a model and get results, the reviewer or reader may want to see your observations and your thoughts about them, namely why the results look the way they do. There is a tradeoff in how much content and effort goes into describing implementation details versus analyzing the results. Both are important, but the analysis is more important, especially in a presentation: you may spend several days making something work, yet in the presentation it is just one sentence, whereas the analysis gives you a lot to discuss.

Error discussion is easily neglected, especially in work that focuses on a particular system or framework. One kind of paper presents a framework, then analyzes its performance on key metrics such as scalability, I/O or throughput, and claims that the work outperforms the state of the art.

But I have had few chances to write papers like that. More of the work I have done presents a model, says that we want to evaluate or compare something, uses this as the motivation for developing a particular framework (our implementation), and then says that the framework works well according to the model. This mixes the model and the framework development together, which is actually a little hard. On the one hand, you need to show that the framework performs well in terms of scale and key metrics. On the other hand, the model approach has to be presented. Neither aspect is really strong (it is hard to produce a purely theoretical paper or a pure systems paper), but at least there is enough content to be published.

I have some knowledge of framework evaluation, such as varying the configuration and checking the performance, but little knowledge of model evaluation.

There should be a distinction between the model and the policy or algorithm.
For a model, metrics such as error need to be part of the evaluation, and this is easily neglected. We may provide a model (an abstraction of something), build a policy on top of it, and then use that policy directly. The unconvincing part of this procedure is the error of the model itself. We should first figure out the requirements and limitations of the model and what errors influence it; then we can move to the next step and say how to make the policy and the model more accurate in representing and predicting the actual problem.

Let’s take a simple example. If you want to line up several kids efficiently, the first step is to abstract each kid as a number, namely the height; there are many other aspects, but they can be neglected in this problem. Then we can apply a particular policy, such as the quick sort algorithm. On the model side, however, if we use the heights recorded one year ago instead of the kids’ current heights, the results might be inaccurate. The results might also be affected by inaccurate measurement when building the model.
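A minimal sketch of the kids example, separating the model (a height per kid) from the policy (sorting). All names and heights are made up for illustration, and Python's built-in `sorted` stands in for quick sort since only the resulting order matters here.

```python
# Model: each kid is abstracted as a single number, the height in cm.
# Both tables are invented data for illustration.
true_heights = {"Ann": 132, "Bo": 128, "Cy": 135, "Di": 130}   # measured today
stale_heights = {"Ann": 125, "Bo": 126, "Cy": 124, "Di": 127}  # recorded a year ago

def line_up(heights):
    """Policy: order the kids by the height stored in the model."""
    return sorted(heights, key=heights.get)

def misplaced(order, reference):
    """Model-error metric: number of positions where two line-ups disagree."""
    return sum(a != b for a, b in zip(order, reference))

if __name__ == "__main__":
    ideal = line_up(true_heights)
    actual = line_up(stale_heights)
    print(ideal, actual, misplaced(actual, ideal))
```

The policy is identical in both calls; every difference between the two line-ups comes from the error in the model's input, which is the part of the evaluation that is easy to leave out.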

try to use more data

Sometimes, when you are struggling to find a good direction or angle, collecting more data may be a good way to open up a new perspective. Consider what data you have and whether you have fully explored it, whether you can extract more data or information from the current system, and whether it is possible to collect more. This applies not only to research but also to other real-life problems. When you are not sure about something, ask other people for their ideas; the more you ask, the more you know about the real situation and the more directions you have to go in.

information can eliminate uncertainty, means and ends

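Recently two ways of thinking have made a strong impression on me. One is “information eliminates uncertainty”; a more detailed discussion comes from Dr. Wu Jun’s 硅谷方法论 (Silicon Valley Methodology). The other is “means and ends”.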

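When you run into a problem or feel that your thinking is stuck, ask yourself what information you can use, what is uncertain, and which uncertainties the information you currently have can eliminate. This often reveals a path that at least lets you keep going. Collecting data, analyzing data and making decisions seem to be steps that any scientific research process has to consider. After thinking this way you will often discover, for example, that the data is insufficient, the information is incomplete, or the information is not being fully used. With these considerations you can eliminate more of the uncertainty until things finally become controllable, at which point you have reached a conclusion and made a contribution. “Controllable” roughly means knowing why something happens, when it happens, and what to do to make it turn out differently. Eliminating uncertainty perfectly seems very hard unless a series of additional assumptions is added; working out what results hold under which assumptions is the process of obtaining scientific insights.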

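The discussion of means and ends often brings to mind the line “不识庐山真面目,只缘身在此山中” (one cannot see the true face of Mount Lu because one is standing on the mountain). Work that falls under the broad category of systems is especially prone to suffering here, because the goal of systems work is often not easy to find. Making something work on a new system or device is, in the end, still a means; the goal is to compare the characteristics of different devices, or to make the thing faster, and so on. One paper I consider a model example is “Full-Stack, Real-System Quantum Computer Studies: Architectural Comparisons and Design Insights”. The main content of the paper is actually a compiler for quantum computers, yet the angle is not the compiler but, as the title says, architecture comparison. A paper that only introduced the compiler might not be especially good, but changing the framing lifts it a level, so that the means revolve around the goal: the compiler is developed in order to evaluate. This also makes the work done on the tool more convincing.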

about peer review

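Although in discussions your ideas or your work will often be rejected, completing a paper is never a one-person job. Unlike artistic creation such as music or painting, a good paper has fairly clear criteria, and you need to keep rejecting or updating earlier ideas while discussing with others; only then do new ideas and logically sound exposition emerge. In short, completing a paper is a process of continually absorbing other people’s suggestions and making changes. One caveat is that you need a basic ability to judge whether a point is good or bad. Sometimes the suggestions others make are not helpful, and recognizing this may require a fairly systematic understanding of the field, so that you can tell when someone is an outsider posing as an expert, or is stuck in old conventions and unwilling to consider new problems and scenarios. One tip is to be more tolerant: if a suggestion is not clearly bad, if you have no definite reason or information that argues against it, or if it merely looks optional, you might as well try what the other person says. Often it is only after trying that you find it was helpful and led to ideas you had not considered.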

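In writing a paper, feeling pleased with your own work is dangerous; as far as possible, have people more expert than you do a detailed review. Often a part that you think is nearly finished turns out, once someone else looks at it, to need many new changes. These are usually things that are hard to notice yourself but that others see quickly. Word choice in particular: you may discover that you have been using an imprecise term as if it were precise. Such issues are mostly beyond what your current knowledge and experience let you notice, and once someone points them out, everything suddenly becomes clear.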

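In the end, an accepted truth (公理) is a principle that has been publicly validated, and this is one of the purposes of peer review: a paper is truly meaningful, rather than merely controversial, when everyone agrees with the things, viewpoints and validation methods it presents.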

build sth meaningful and then optimize it

The core idea is as the title says: find a promising scenario, build something that makes it work, then optimize it. This seems to be the usual research routine. A recent good work to refer to is this IPDPS21 paper: Designing High-Performance MPI Libraries with On-the-fly Compression for Modern GPU Clusters

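From a research perspective, the optimization part is where the real research taste or research value lies. If you leave out the optimization and present only the first part, it is hard to claim that only you have done it and others have not, because for most work there are one or two similar works. You then have to say where yours is better than theirs, or what they did not do. For an algorithm this is fairly easy, but for framework-type work it is not so easy to demonstrate. So from a research point of view, the more convincing taolu is to find a new scenario where possible, and then, in that scenario, do the work of both accomplishing the task and optimizing it. Otherwise the motivation is easily questioned, and by comparison it is also harder to present.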

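Another angle is how to think about motivation. If you have no good thoughts for the moment, try to answer this: without this work, what disadvantages would the current work have? Usually the answer is which time is being wasted. This is also a common routine; if you can explain it clearly, you can find a baseline that makes sense, and the contribution then becomes easier to explain.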
