CS1ADS: Algorithms and Data Structures


Mini-project

Issue date: 2000.02.25

Submission date: 2000.03.31


In order to pass this coursework, you must submit and pass ONE and only one of the following mini-projects.

Submissions should include all appropriate support documentation as well as a Java system: Word (.doc extension), FrameMaker (.fm extension), LATEX2e (.ltx extension), HTML or plain ASCII (.txt extension) files are suitable media for text documentation. A set of HTML pages should probably comprise part of the system manual.

The mini-projects specified below are all given a difficulty measure on a scale 1-5:

  1. A fairly trivial project that must be done well to pass.
  2. A straightforward but non-trivial project.
  3. A fairly complex project but with no real research needed.
  4. A complex but eminently doable project.
  5. A tough project requiring serious research.
  6. Probably should be a second year group project or final year individual project.
The easier the mini-project, the more complete and well done it has to be in order to pass: it is easier to get marks for a more difficult project but more difficult projects generally require greater ability to make any headway.

Another correlation that can observed is that the more difficult the mini-project, the more design research is needed and also the more sophisticated the Java code employed will have to be in order to implement the mini-project.

Your choice of project is therefore a balance between wanting to pass, wanting to challenge yourself, wanting to learn something, your own abilities in Java programming and preparing for future courses (and, later, work).

If you believe that you have extended the project specification of your chosen project in such a way that you believe it has become more complex, then feel free to claim a difficulty level. Any such claim must be accompanied by a cogent argument as to why your project's difficulty should be reclassified.

A submission should consist of the following components:

  1. The source code. This should be the fully documentation and otherwise commented source code of the system.
  2. The technical manual. There is no need for low-level documentation since this is provided (via javadoc) by the documentation comments in the source code. It can be assumed that the low-level documentation can be extracted from the source code. The technical manual is about the architectural and high-level design issues. Text and UML diagrams (class diagrams, use cases, etc.) will comprise the document. The idea is for this documentation to be read prior to the source code being studied so as to give the reader a good conception of the system and how the bits of code fit together.
  3. The test plan. This is really the quality manual. What is the strategy for testing? What are the basic test data sets?
  4. The test log. What happens when the tests in the test plan are run?
  5. The user manual. All systems need some form of introductory manual. Certainly there will be an installation sub-section. Depending on the type of application and the amount of online help, the user manual could be a small or quite substantial document. This document should tell the user, in the users language, how to use the system. It should not be condescending.


The 'Towers of Hanoi' game. The game comprises three towers and a number of rings, each of a different size. Moves in the game are taking a ring from one tower and putting it on another. The only constraint is that a larger ring may not be placed on a smaller ring on a tower. The start point is with all the rings on one tower. The aim of the game is to end with all the rings on a different tower. Create a program that allows the user to play the game and is also able to complete the game from the current position showing the sequence of moves. Difficulty Level: 2.

A WYSIWYG editor of simple text. Such an editor might be used for dealing with email messages. In the simple case the editor will deal with plain ASCII characters. In the more complex case, any font will be allowed. A minor alternative on this might be to construct a WYSIWYG WWW page editor. Difficulty Level: Easy Case 1, Hard Case 3.

A WYSIWYG editor for music. Music needs to be represented on paper so that it can be communicated and worked on. As well as standard score notation, there are notations such as guitar or lute tablature. This project is to write an editor of music that will work with standard score notation. Conversion from score to tablature notation would be useful and direct editing of tablature also. Direct conversion to and from MIDI files would be an interesting addition. Difficulty Level: Easy Case 3, Hard Case 5.

Red-Black tree. Create an implementation of a Red-Black tree as a publically usable class; the implementations in the JGL and JFC Collections packages are currently dedicated for internal use and are not really usable as public classes. The implementation produced should be in a fit state to be entered into one of the packages: ADS, JGL, JFC Collections. The theory of Red-Back trees can be found in any modern book dedicated to algorithms and data structures, one example is: M A Weiss (1996) Algorithms, Data Structures and Problem Solving with C++, Addison-Wesley. Difficulty Level: 3.

A package for supporting simple data analysis. Physicists, Chemists, Cognitive Scientists, Market Researchers, etc. often create large data sets to which they need to find a "best fit" curve. The problem is to find an equation that best describes this "best fit" curve. These people need to be able to experiment with their data to create different relationships and curve fits. Maple and Mathematica are systems that are beginning to provide such facilities. For this project you should write a Java system as a first prototype of such an "Experimenter's Workbench". The user should be able to experiment with linear, quadratic, cubic and logarithmic curves to a dataset to see which is the best fit to the data. Difficulty Level: 3.

An electronic notebook (cf. PlanIt, Psion Organiser, etc.). The notebook should support a full graphical user interface. A number of features could be supported: full contact detail information storable and searchable; repeated diary entries with editing and partitioning; the setting of alarmsto warn the user of impending appointments; the support of workgroups with shared diaries. The List is basically open-ended. Thus, this project can be made as simple or as difficult as wanted by appropriate choice of features. Difficulty Level: Easy Case 1, Hard Case 3, Distributed Case 5.

A personal finance management system. This should support multiple bank and other sorts of account. It should allow transactions to be input and should undertake estimation of all non-explicit transactions, e.g. interest payments, bank charges. The simple version of the project will just be a graphical user interface to basic account management, with information storage and retrieval. The complex version of the project will deal with maximizing income by use of a portfolio of interest bearing accounts. Difficulty Level: Easy Case 2, Hard Case 3.

A full graphical animation of the Sieve of Eratosthenes algorithm. The basic algorithm is very straightforward and already known, this mini-project is to create, using Java, a graphical animation of the algorithm. Difficulty Level: 2.

The game variously called 'Oranges and Lemons' or 'Hits and Inners'. One player (A) guesses a sequence of digits; usually four numbers but letters could be used for more difficulty, also the number of digits in the sequence could be made larger (five or six) for added complexity. The other player (B) then makes 'guesses' as to what the sequence is. For each 'guess' that B makes, A must reply with a 'score' specified as the number of oranges (hits), i.e. correct digits in the correct place in the sequence, and lemons (ins), i.e. correct digit in the wrong position in the sequence. When B finally has the correct sequence (four, or how ever many places in the sequence, oranges), the number of guesses used is noted. Under normal circumstances, A and B now swap roles to obtain a number of guesses for A. The person with the lower number of guesses wins the game. The straightforward project is for the computer to always hold the sequence of digits and for the human to always be the guesser. The more complex mini-project is for the computer to take a turn as in the traditional version of the game. To increase the difficulty markedly, try allowing two players to play against each other (on different computers obviously) with the computer acting as intermediary. Difficulty Level: Easy Case 1, Hard Case 3, Multi-player Version 5.

Calculator. Create a calculator application. The calculator should be able to work in different number bases as well as being able to perform arithmetic calculations. The calculator could have two modes, a console-based mode (cf. bc under UNIX) and GUI-based (cf. xcalc or calctool under Solaris (UNIX) or calc under Windows95). Difficulty Level: Console only 2, GUI as well 3.

Object-based drawing editor. Create an object-based drawing editor.Such an editor allows diagrams to be drawn, where each shape in the diagram is treated as an object that can be manipulated in various ways, for example by being moved around and resized. Examples of this kind of editor can be seen in Word, FrameMaker or Visio. Difficult Level: 5.

Ant simulation. This project requires an improved version of the Ant simulation system that is in the Developing Java Software book to be constructed. It should have a full GUI (controls, menus, dialogs etc.), the ability to run and/or set up different simulations, perhaps with a scripting language interpreter to program simulation. In the very difficult case the program might want to make use of multiple PCs and screens, i.e. be a fully distributed version, which would make use of the RMI package. Difficulty Level: Simple extensions: 1, Full distributed version 5.

Filestore reconciliation. What is required here is a Java program that coordinates files between an unlimited number of filestores (something similar to the Windows95 Briefcase only usable). Users of multiple systems cannot always make use of a single shared filestore, they must have mutliple copies of thier files on the different systems. There is a need though to ensure that all copies of what is supposed to be the same file are in fact the same. For example if a person has a desk top system at home and a desktop system at work that can be connected over a network they need a system to compare the files on the two systems and make them equal. Clearly only specified parts of the users filestore, and not the system components, will be reconciled. Difficulty Level: 4.

Scientific data types. In scientific computing, rational numbers and complex numbers are used frequently; to support engineers and scientists in constructing their computations in Java, types such as Rational, Complex , Quarternians, Octonians are required. Construct complete classes to represent two or more of these types. Completeness of the representation of the algebra of these types is important. The results should be classes that can be inserted in one of the packages: ADS, JGL or JFC Collections. Difficulty Level: Rational and Complex 1, Quarternians and Octonians: 2.

Polynomial package. For some scientific computing and also increasingly for statistical systems, manipulation of polynomials is required. Construct a class to represent polynomials and their algebra. The result should be a class (or set of classes) that can be inserted in one of the packages: ADS, JGL or JFC Collections. Difficulty Level 2.

Artificial life. (à la Langton or Dawkins or Dewdney, etc). Simulation of the evolution, speciation, interaction and extinction of a multi-species ecosystem of artificial life-forms by stochastic processes involving both genetic changes and environmental influences. (A simpler statement of this is available on request:-) References for this project are: R M Stein R M (1991) "Real Artificial Life", Byte, January, 289-298; Thimbleby, H (1990) "Artificial Life", Computer Bulletin, May, 22-23 (erratum August 1990, 5); Dewdney, A K (????) "Computer Recreations", Scientific American 84, 16. Difficulty Level: 4.

Lift simulator. Construct a system to enable simulation of lifts. Two sorts of simulation are possible:

  1. Simulation of the various user interfaces and their behaviour during use of the lift (easier).
  2. Simulation of the people flow using the lift (harder).
The number of floors should be settable by the "simulation experimenter" as should the algorithms for behaviour both of the lift and the user interface. A scripting language might be one way of enabling this dynamic specification. Difficulty Level: Simple simulation 2, full GUI 4.

Word transformation. A popular word game is to transform one word into another by a sequence of single letter changes each of which results in another word. For example, bleed converts to blood by the sequence bleed, blend, blond, blood. Construct a program that takes a source word, a target word and a dictionary (/usr/dict/words on any UNIX machine for example) as input and returns the set of valid sequences, sorted according to length of the sequence, and in which a given word only occurs once in the sequence. Difficulty Level: 3.

Currency speculation. Currency speculators are always on the lookout for a sequence of transactions that makes money. For instance, if the currencies X, Y and Z have the exchange rates 1X = 2Y, 1Y = 2Z and 1X = 3Z, then 300Z will buy 100X, which in turn will buy 200Y, which in turn will buy 400Z and we have this made a 33\% profit. Construct a program that accepts as input a collection of currencies and their exchange rates and calculates whether there is a sequence of exchanges that makes money instantly. The program should output all such sequences sorted by number of transactions and profit margin. Difficulty Level: 3.

Educating students. In most institutions of tertiary education, a student needs to take a certain number of courses to graduate. Each course has prerequisites that must be followed. In many institutions, particularly those in the USA, each course is offered each semester, though in most each course is only offered once per year, i.e. in one of two or three semesters. In some institutions, students can take as many courses as they wish, however in most there is an upper limit on the number of courses. So for example at KCL for a BSc degree, students take 12 units (usually composed of 24 half-unit but there are some whole unit courses, final year projects for example) and must pass 9 in order to be awarded a degree. The academic year comprises two teaching semesters and courses are offered in either the first or the second semester but not both. In many institutions, there is a core set of courses that must be taken, some of which must be passed in order to graduate with a given named honours degree. The USA college version of the question is: Given a list of courses and their prerequisites, and assuming all courses are offered every semester and that the student can take an unlimited number of courses, compute a schedule that requires the minimum number of semesters. The KCL version of the problem requires much more information to be specified (indeed possibly a complete database) but then it is really a null question since it does not run a full modular degree/course unit scheme on the grounds that it believes degree programmes offer a better quality of education for its students. Such a program could probably best be written using a DBMS such as Oracle, Sybase, Informix or Access with a user interface using Java (and employing JDBC via the Swing library JTable type). Difficulty Level: 3.

Phone numbers as words. If you look at some telephone dialling pads you find the following number to letter mapping:

1 2

ABC

3

DEF

4

GHI

5

JKL

6

MNO

7

PRS

8

TUV

9

WXY

* 0 #
(This is the standard USA mapping (standardized by ``Ma Bell'' in around 1923), the old UK mapping was slightly different but that has disappeared. Notice that Q and Z are not in this mapping. Some phones have Q mapped to 7 and Z mapped to 9.) This mapping means that we can represent phone numbers not just with a numeric representation but also as words and phrases. For example:
448 4373 HI THERE
43 55 68 43 73  HELLO THERE
working with the assumption that spaces are not significant of course. This use of words and phrases to represent phone numbers is practiced a lot in the USA by almost all companies and marketing agencies to aid people remembering the company's phone numbers. These advertisers always want a memorable word or phrase directly associated with the company if the company name itself cannot be used. Very often not all the number is encoded: the long-distance dialling code and the area code or freephone code are often left as numbers since they are very standard and hence easily remembered. For example:
1 800 ACE FOOD 1 800 223 3663
1 800 ACE F00D 1 800 223 3003
but this is not always the case, sometimes they encode the whole number. What is being used here is that phone numbers with a distinct pattern, 0345 48 49 50, 0171 8686868 or 0113 9111111 are very memorable but random sequences of numbers are not. Words are like patterns, they are easily remembered; by encoding number sequences as words we are constructing a pattern easily remembered. Notice that: There are two problems that need to be solved:
  1. Given a telephone number, find all the words and phrases that could be used to represent that number or part of that number.
  2. Given a word or phrase, find the telephone numbers that make use of that word or phrase.
NB There are a large number of ways of solving these problems, brute force use of loops and control structures being totally the wrong ones. There are a number of very elegant solutions that are also among the most efficient that employ relatively straightforward data structures.

Footnote: An interesting sideline on all this is the old practice of naming exchanges rather than using numerics. The STD codes were originally decided in this way. For example Colchester was 206, COL. (NB The old UK letter mapping was not the same as the standard USA mapping given above and which we now use.) The URLhttp://www.scruz.net/~rcrowe/TENproject.html documents the same state of affairs in the USA.

Difficulty Level: 3

Russel Winder
Last updated: 2000-02-08