Marwane El Kharbili

Dec 22, 2008

Having a PhD Strategy (Part Two)

This is the second part of my reaction to this post by Kai, a fellow PhD student at Blekinge TH and Ericsson AB. I have already introduced his blog, and you can find the first part in the previous post on this blog. In his post he reports, among other things, on two other courses he has to take besides software productivity. I will try to briefly analyze what he wrote about scientific publications and statistical methods.

About scientific publications, what interested me is that they get to study scientometrics. Scientometrics, according to Kai, covers the methods used to assess the relevance and importance of journals and scientific conferences. If you have worked as a researcher, you know that scientific conferences, workshops, symposia, journals and all other sorts of scholarly venues are not equal. They differ in popularity, reach, perceived quality and influence on the scientific community. Scientometrics is used to establish the impact factors of journals, which helps researchers select where they want their papers to be published. As Kai simply puts it: if you get a paper accepted in one of the journals with the highest impact factors for your community and area of research, then "this increases your reputation as a scientist in the area". But, logically, the difficulty of getting a paper accepted at one of these venues grows with its impact factor.
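
Kai's post does not say how an impact factor is computed. As a rough illustration, here is a minimal sketch of the commonly used two-year journal impact factor: the citations received in a given year by items the journal published in the two preceding years, divided by the number of citable items it published in those two years. The function name, the journal names and all the numbers are made up for the example.

```python
def two_year_impact_factor(citations: int, citable_items: int) -> float:
    """Two-year impact factor for some year Y.

    citations:     citations received in year Y by items the journal
                   published in years Y-1 and Y-2
    citable_items: number of citable items the journal published in
                   years Y-1 and Y-2
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations / citable_items


# Hypothetical venues with invented counts: (citations, citable items).
venues = {
    "Journal A": (300, 120),
    "Journal B": (90, 150),
}

# Rank the candidate venues from highest to lowest impact factor.
ranked = sorted(
    venues,
    key=lambda name: two_year_impact_factor(*venues[name]),
    reverse=True,
)

for name in ranked:
    print(name, two_year_impact_factor(*venues[name]))
```

A ranking like this is only one input to a publication strategy. It says nothing about how well a venue fits your topic or your immediate research community.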

What they learn in this course is to assess the impact factors of target journals and conferences, to identify the major scientists (what my supervisor calls the immediate research community), to come up with a publication strategy and to review papers. I guess the latter concerns papers that have been accepted at well-known venues or that have been written by the major scientists in the community.

Knowing the major scientists in the community is very important. First of all, you get to know the most important directions of research and the most important results already achieved, so you get up to speed on the state of the art of your area a lot quicker. Studying the references these scientists use can also point you to the foundations of your research area, which you may want to read for a solid, fundamental understanding of current state-of-the-art results. These scientists are the ones who publish the most articles, and in the most important venues. Their publications normally also show you at which journals and conferences papers on related topics can be published, and which of these are the most relevant.

The second course is on statistical methods. Now, like every computer scientist, I have had my share of statistics. Actually more, due to the emphasis put on mathematics and the additional statistics options I took at my engineering school, the ENSIMAG in Grenoble. But this is not what Kai is talking about. He specifically says that they take a course dedicated to learning how to use statistics to evaluate and analyze data gathered during research (in software engineering especially, case studies are widely used as qualitative analysis tools, as I can see from the work of Sebastian). Such case studies and the accompanying statistical analysis can be of great use when tackling one of the last steps of a PhD, which is evaluating your results. The course is based on concrete problems and consists of 5 seminars.

The goal of these two posts was to give an idea of how a structured PhD at a university (at least partly) takes on the harsh project of a dissertation, and of the tools PhD students get to learn and use. In a totally industrial PhD, you have to try to work in as structured a way as possible with the tools that you have. That means you shouldn't expect to have the time or resources to learn, or mentors who will teach you how best to get on with your PhD. You are to quite some extent an autodidact, a multi-disciplinary researcher and necessarily one with broad curiosity. The desire to work in a highly structured way and the ability to combine several sources of information, from areas that do not necessarily have much to do with your direct area of research, are a must. Being open to learning from the techniques of others, and to getting the best out of everyone you meet or read about, can only make your PhD better.

I hope that my two small analyses have helped to better explain what an industrial PhD is about. I will write some other posts about this, since I have quite a few opinions to share on the topic.

Marwane El Kharbili

Dec 1, 2008

Having a PhD Strategy (Part One)

This is my reaction to this post by Kai, a fellow PhD student at Blekinge TH and Ericsson AB. I have already introduced his blog.
In this post he reports on what were, at the time he wrote it, the next steps of his PhD. He lists three courses he has to take at the university and explains how he wants to approach the PhD thesis.

Kai explains that he is going to do what is called a "Systematic Literature Review" (SLR). A systematic literature review differs from a normal Literature Review (LR) in the following points:
  1. It allows one to analyze the current state of knowledge about a whole scientific area, such as software productivity (Kai's example) or compliance management (my example).
  2. It is easier to see what has been done in an area and what hasn't been done yet.
  3. It is easier to argue why a PhD student took a certain direction, and why this direction of research will bring outcomes that are useful to the scientific community.
  4. It is also easier to motivate the use of certain methods, tools or approaches.
  5. It typically covers a much wider scientific scope than a normal SOTA (State Of The Art) review, since it does not narrow its focus to one particular problem for the sake of efficiency.
But the main difference resides in the following quote from Kai's blog:
  • "Systematic means that one has to document search strategy (keywords, search strings, scientific databases), paper selection criteria, paper evaluation criteria, how to synthesize the findings of the identified studies and so forth."
So the main and real difference from a normal SOTA review is the strategy. Strategy in the sense that you have to specify what you are looking for, by setting keywords and search strings, and where to look for references. Moreover, the selection of the papers returned also has to be documented in the strategy. The final part of the strategy is the specification of how the LR's results are analyzed and synthesized. I imagine this last point means that you have to define a set of dimensions or axes onto which you want to project the results of your search, in order to extract from the LR what is relevant for you. One of the main deliverables of a systematic LR is a taxonomy of the research domain, along with solid material for one (or maybe even several) papers. These papers are an important way of synthesizing the results of an SLR, because they consolidate those results along one or several axes of research and because they are highly useful to other researchers. Thus, quality SLR papers have their place in highly regarded research journals. They are also an archived and extensible knowledge base of the domain. An SLR also makes it easier for the PhD student to later write the related-work sections of research papers, so the big overhead of conducting an SLR can become a good investment.
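
To make the idea of a documented strategy more concrete, here is a minimal sketch of how such a protocol could be written down as data and applied to candidate papers. This is my own illustration, not Kai's method or any standard SLR tool: the class names, the fields, the database names and the sample papers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReviewProtocol:
    """Hypothetical record of an SLR strategy: what to search, where, and how to judge it."""
    keywords: List[str]
    databases: List[str]
    selection_criteria: List[str] = field(default_factory=list)
    evaluation_criteria: List[str] = field(default_factory=list)
    synthesis_axes: List[str] = field(default_factory=list)

    def search_string(self) -> str:
        # Build one boolean query so the same string can be reused per database.
        return " OR ".join(f'"{k}"' for k in self.keywords)


@dataclass
class Paper:
    title: str
    abstract: str


def screen(protocol: ReviewProtocol, papers: List[Paper]) -> List[Tuple[str, bool, str]]:
    """Return (title, included, reason) so that every decision is documented."""
    log = []
    for paper in papers:
        text = f"{paper.title} {paper.abstract}".lower()
        hits = [k for k in protocol.keywords if k.lower() in text]
        if hits:
            log.append((paper.title, True, "matched: " + ", ".join(hits)))
        else:
            log.append((paper.title, False, "no keyword matched"))
    return log


# Invented example values, only to show the shape of the data.
protocol = ReviewProtocol(
    keywords=["software productivity", "developer productivity"],
    databases=["Database A", "Database B"],
    selection_criteria=["peer-reviewed", "written in English"],
    synthesis_axes=["measurement method", "study type"],
)

papers = [
    Paper("Measuring Software Productivity in Practice", "An industrial case study."),
    Paper("A Survey of Compiler Optimizations", "Overview of optimization passes."),
]

decisions = screen(protocol, papers)
print(protocol.search_string())
for title, included, reason in decisions:
    print(title, included, reason)
```

The point of the sketch is that the strategy stops living only "in your head": the keywords, the sources and the reason for each inclusion or exclusion are all written down and can be reviewed or repeated by someone else. A real screening step would of course apply the selection criteria by reading the papers, not by keyword matching alone.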

I was surprised. I had never heard of clearly specifying the strategy of a Literature Review (LR). Of course you always have your own strategy when conducting one, but it is only "in your head" and you are the only one who knows what you are really looking for. In addition, no one would bother to describe an LR's strategy, because it would not be of direct use for the expected outcome of the LR. So I was very intrigued. I think that in the scope of a PhD thesis, laying strong, clear and, most importantly, far-reaching foundations for the dissertation is a requirement. That's why I am really convinced of the utility of a Systematic Literature Review (SLR).

What I find most interesting is that I noticed I had unintentionally attempted the same thing for my thesis at the beginning of my PhD. For example, my attempts to build a taxonomy of the domain took the form of complex mind maps. Unfortunately, mind maps do not scale well with complex research domains: you need a structured approach with several mind maps on several layers of the same domain, and hyperlinks between the mind maps. But I thought (mostly due to comments from colleagues and my entourage) that I shouldn't go for it, because it was just a waste of time and I should focus a lot more on certain topics. Knowing my tendency to tackle a much wider range of resources than needed to solve one problem, in an attempt to let no information escape my research (I call this the Sponge Effect (SE), and will come back to it later in this blog), I thought I had to stop it. But this is not the main reason.

The real reason is that doing a PhD in 3 years requires extremely focused work on getting results for a well-defined problem. The main fear of a PhD student is to be unable to fully understand the problems he has to deal with, and to need many more years than intended to conduct his research. This is why my working method has been to cluster the domain I am working on into sub-domains, and to study one of these sub-domains fully in order to achieve some results for it. My strategy is then to examine the results I get on this first target sub-domain in comparison to the other sub-domains, and thus to gain a more global overview of my originally intended PhD domain by criticizing, completing, correcting and extending the initial results. Whether 3 years are enough for this, I really doubt. But hope keeps us alive :D

Marwane El Kharbili