Report of internship at KDE laboratory

Vincent Jordan (KDE lab.)


Acknowledgements

First and foremost I would like to thank Professor Hiroyuki Kitagawa, my internship adviser. He accepted me into his laboratory and made this internship at the University of Tsukuba possible. Although Professor Kitagawa is extremely busy with his own research schedule, he never hesitated to help me with any question or problem I had during my internship.
I am grateful to Professor Toshiyuki Amagasa for suggesting my current research topic. He trusted my ability to solve my project's numerous implementation issues. Along with his support for my overall project, Professor Amagasa has helped me improve this report.

I cannot overlook Takahiro Komamizu and his continuous support both inside and outside of the laboratory. KDE laboratory would not have its superior environment without him. He does every task with a smile, whether it is fixing network wires, teaching Japanese, or reviewing presentations. He spends so much time helping everybody, including me with this report.
I thank Djelloul Boukhelef for his comments and clever debugging ideas… in French.
I also want to thank Mariko Kamie, Maria Alejandra Quirós and Sherry Morgenstern for their interest and friendship during the whole six months of my internship. Sherry Morgenstern gave smart advice that improved the English of this report.

You cannot fully enjoy the experience of living in Japan without the Japanese language. I would like to thank my five Japanese teachers from the International Student Center for their motivation to teach Japanese to beginners like me: Professor Tanaka (Mon.), Miyazaki (Tue.), Seki (Wed.), Onodera (Thu.) and Imai (Fri.).

I cannot forget to thank Keiko Jimbo, my Japanese teacher at my French university, and Mireille Jacquot, who manages administrative tasks for Computer Science internships. Japanese and French university schedules are not synchronized, but I never had to worry about it thanks to their efforts.

Introduction

This final project assignment is the last step of the curriculum at the University of Technology of Belfort-Montbéliard. This six-month internship validates my software engineering ability as well as my research master studies. Therefore it had to include both of these aspects of Computer Science. The purpose of this training period is to apply and further improve the skills I learned at the university. It is also a way to prepare students, such as myself, for their future jobs by giving them prior experience. Thus the choice of the institution in which to do this internship is crucial.

In order to achieve this, I decided to do my internship at the Kitagawa Data Engineering laboratory of the University of Tsukuba, Japan. The internship took place from April 5th to September 17th, 2010. The initial theme was "XML query processing using GPGPU" and involved the CUDA language for its implementation.

This report is divided into four sections. The first part introduces the University of Tsukuba and the Kitagawa Data Engineering laboratory. It also explains the work expected of me at KDE lab. and the schedule of the six months spent there. The two following sections cover the knowledge that I had to acquire before starting: XML query algorithms and NVIDIA GPU architectures. Finally, the fourth part explains the difficulties encountered and the solutions applied in order to use the GPU for XML query processing.

Since this report was designed to validate both engineering and research studies, it is supposed to feature the content of two reports. The second and third sections contain more details about the research carried out on semi-structured languages (XML) and many-core parallel processing (GPGPU), while the last section gives more information on software engineering issues in development using the CUDA toolkit.

About KDE laboratory and my internship

XML query processing algorithm: Parallel TwigStack

nVIDIA GPU architectures and the CUDA framework

Parallel TwigStack on GPU

References
Appendix A: CUDA to PTX1.4 full example
Appendix B: CUDA to PTX2.0 full example
Appendix C: basic XPath grammar for lemon
Appendix D: v_array full example
Appendix E: my CUDA debugging map

Conclusion

The purpose of this report is to finalize my last project assignment, as well as my studies at the UTBM. It is a suitable time for a personal reflection on the curriculum I have followed. Engineering studies at the UTBM include three mandatory internships: the first lasts one month and the remaining two last six months each. My second internship was spent at Euro Airport (France/Switzerland, from September 2007 to January 2008). Its main goal was to gain credible experience in software engineering and project management in an airport environment. The technical skills I learned there were already related to databases, since most of my work was linked to Oracle DBMS.
As for the third and final internship, I oriented my studies at KDE laboratory toward three goals: current research, GPU processing, and internationalization.

The result of each point is discussed in the following paragraphs.

Regarding current research, this training period has confirmed my plan and my aspiration to start doctoral studies. Although I chose this laboratory with little knowledge of it, Kitagawa Data Engineering laboratory was a sensible choice.

Regarding GPU processing, this is the most uncertain aspect of my studies, since I failed to complete the planned schedule because of unforeseen implementation issues. Given that we learn the most from our mistakes, I would act differently if I again had to cope with a technology as new as GPGPU was during this internship.

Regarding internationalization, Japan is a place without equal in the world for French or European people to sharpen their communication skills. The mix of its own unique culture with both American and other Asian cultures makes understanding and being understood a true challenge. In human terms, my internship was a great success. I discovered and applied the process of enculturation: "the process by which a person learns the requirements of the culture by which he or she is surrounded, and acquires values and behaviors that are appropriate or necessary in that culture".

The UTBM report writing guidelines suggest including in the conclusion an estimate of the financial gains enabled by the work done during the internship. For work done in a research laboratory, this estimate is generally extremely difficult to make because those gains are expected at the end of a very long-term process. The estimate is made even fuzzier by the fact that the gains are human before becoming financial.

To conclude, I think that two out of three goals have been reached successfully. I cannot deny that I would have liked to finish on time. Unlike in engineering, the probability of implementation issues is very high in research because of the use of cutting-edge technologies. This personal experience underlines the importance of designing a development methodology even if the project seems small at the start.

My proposal to go further in using GPGPU for XML query processing by undertaking a PhD at the same place is supported by both Professors Kitagawa and Amagasa.

This report was written using XML-based document formats (XHTML, SVG, MathML). Thus it could have been generated on a GPU using software that includes the result of this research.

Last update: September 2010