General requirements for final project papers (LING 6570, LING 8570)

Michael A. Covington – 2002

These are the usual requirements for a final project paper. Specific projects can be exempted from some of the requirements if there are clear academic benefits from doing so.

Typical length: 10 pages single-spaced. (Can vary considerably, but extra-long papers are not desirable; excessive length often indicates a lack of focus or organization.)

Typesetting: LaTeX, \documentclass[12pt]{article}. (Learning to use LaTeX is one of the academic skills you should develop at this time, since you need it for your thesis, and typesetting is itself a form of natural language processing.)
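
As a starting point, a paper using this document class might be laid out roughly as follows. This is only a minimal sketch: the section names, the title and author text, and the bibliography entry with its key (placeholder-key) are illustrative assumptions, not requirements of the course. The article class is single-spaced by default.

```latex
\documentclass[12pt]{article}

% Placeholder title and author: replace with your own.
\title{Title of Your Project Paper}
\author{Your Name}
\date{\today}

\begin{document}
\maketitle

\begin{abstract}
One paragraph stating the problem, the approach, and the main result.
\end{abstract}

% The section names below are assumptions, not a required outline.
\section{Introduction}
State the problem and how it fits into existing work~\cite{placeholder-key}.

\section{Method}
Describe the technique or implementation.

\section{Evaluation}
Report what was tested and what was found.

\section{Conclusion}
Summarize the contribution and its limits.

% Hypothetical entry showing the citation mechanism only.
\begin{thebibliography}{9}
\bibitem{placeholder-key}
A.~Author. \emph{Placeholder Title}. Publisher, Year.
\end{thebibliography}

\end{document}
```

Running latex (or pdflatex) twice on such a file resolves the citation numbers.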

Perfect English. (The requirements are the same whether or not English is your native language. If there are any gaps in your knowledge of the English language, punctuation, or spelling, now is the time to remedy them.)

Research: It is not enough to simply write and document a computer program. You must do library research to determine how your work fits into the existing science and technology of natural language processing. This involves tracing important ideas to their sources, understanding clearly what is already known, and acknowledging sources properly. Allow 10 to 20 hours for library and Internet research. It is not enough to find something about your subject; you must find the best sources of information, then read and understand them thoroughly.

Possible types of papers:

(1) Implementation of an important natural language processing technique, preferably in a somewhat original way. (Find an important technique in the literature; if possible, improve it or apply it in a new way; and implement it.)

(2) Critical review of a published paper or a group of related papers. (Pick up someone else’s idea and run with it; develop the ideas further; or if you see an evident blunder, criticize it.)

(3) Exposition. Find a technique that is known in the literature and write a clear, textbook-like explanation of it, based on published sources that give the information in a less accessible form.

(4) Practical research problem. Develop an important step toward solving a problem that actually arises in your research, or build a software tool that is useful in your research.

(5) Evaluation of an existing piece of software, based on an understanding of the relevant technology.

Where to find information:

Do not just go blindly to the library. Start by looking in a good reference book such as Jurafsky and Martin (Speech and Language Processing) or Allen (Natural Language Understanding) to get pointers to the literature. Then follow up the references there. Once you’ve found an important paper, use Web of Science to find later papers that cite it.

A textbook itself is rarely the original source of an idea. Textbooks lead you to the research literature; they do not replace it.

Use a search engine such as www.google.com to find web sites that discuss your topic, but remember that the best research is published in journals, not on web pages; use web pages mainly to locate good published research.

Important journals include Natural Language Engineering (Science Library) and Computational Linguistics (Main Library). Browsing in a journal is often a good way to get ideas for a project. Do not expect to understand every article; even professional scientists normally read only about 1/8 of the articles in each issue.