F04 CPSC 501 - Assignment number 1: Speech Synthesis

Due: Friday October 15, 2004.

Assignment goals

To develop a piece of software of reasonable complexity within a domain that you know very little about. You will also be required to demonstrate various principles that are being addressed in lecture. These principles include:

Assignment specifications

A general overview of the speech synthesis process was shown in class. The slides/documentation of that lecture can be found here: ppt sxi

Essentially, speech synthesis includes 3 main processes:

  1. Text to phonetic conversion (determining what speech sounds to make)
  2. Phonetic to synthesizer parameter conversion (generation of synthesizer control parameters)
  3. Parameter to sound output conversion (synthesis)

Of the above, your assignment will involve writing a piece of software which computes the parameters for the speech synthesizer (item number 2). The essence of the assignment is to write a program which takes in a series of phones as input and, from those phones, generates an appropriate synthesizer parameter file. You will test your output by providing it as input to the speech synthesizer and listening to the output.

There are several resources (including the main posture data file) available to help you in this assignment. They are located here.

There are also example files located here.

Interpolation

The general algorithm to follow is:

There are several ways of interpolating between the points. Here are some possibilities:

For this assignment you must implement 2 interpolation algorithms. The user must have some way of specifying which algorithm is used for the interpolation. Use polymorphism to implement this scheme.
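
As an illustration of the polymorphism requirement, here is a minimal sketch in Python. It is not the required design: the class names, the `transition` helper, the `INTERPOLATORS` registry, and the choice of linear and cosine interpolation are all assumptions made for this example, as is the idea that each posture can be reduced to a list of numeric parameter targets.

```python
from abc import ABC, abstractmethod
import math


class Interpolator(ABC):
    """Hypothetical base class: blends a toward b as t goes from 0 to 1."""

    @abstractmethod
    def interpolate(self, a: float, b: float, t: float) -> float:
        ...


class LinearInterpolator(Interpolator):
    def interpolate(self, a, b, t):
        # Straight line between the two target values.
        return a + (b - a) * t


class CosineInterpolator(Interpolator):
    def interpolate(self, a, b, t):
        # Remap t with half a cosine period so the transition eases in and out.
        w = (1.0 - math.cos(math.pi * t)) / 2.0
        return a + (b - a) * w


# Registry so the user can select an algorithm by name (for example from a
# command-line argument). The names here are illustrative only.
INTERPOLATORS = {"linear": LinearInterpolator, "cosine": CosineInterpolator}


def transition(start, end, steps, interpolator):
    """Return steps + 1 frames from start to end inclusive (steps must be >= 1)."""
    frames = []
    for i in range(steps + 1):
        t = i / steps
        frames.append(
            [interpolator.interpolate(a, b, t) for a, b in zip(start, end)]
        )
    return frames
```

The calling code only ever sees the `Interpolator` interface, so `transition(p1, p2, 10, INTERPOLATORS[name]())` works unchanged for either algorithm, and a third algorithm can be added without touching `transition`.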

Assignment demonstration

You must demonstrate the operation of your assignment to a TA. The TA will ask questions about the execution of the application. Your demonstration must contain at least 3 utterances (each at least 1000 ms long). If you are synthesizing English, you may choose two of your own utterances (use the pronunciation guide to aid in the text to posture conversion). The other utterance must be:

	n aa u i z dh uh s a m uh r uh v ah uu r d' i' s' k' uh' n' t' e' n' t'

	Now is the summer of our discontent.

If you are synthesizing another language, you may choose whatever three utterances you like, but you have to be able to pronounce the words so that the TA can verify it is what you are synthesizing.

Email a copy of each audio output file you demonstrated to your professor so that he can publish them on a class wide website. Because other people are going to hear your output, please only synthesize suitable utterances :-).

Handing in your assignment

You must print out your code and hand it in to your TA in the boxes by the due date. Don't forget to also provide any supporting documentation (such as your refactoring log or design diagrams).

Evaluation Criteria

Your assignment will be marked by both TAs, with the exception of the demonstration, which will be marked by one TA only (2 demonstrations would take up too much time). In the sections which are marked by both TAs, the average of their marks will be taken to compute the final mark for the section. The following criteria will be used to assess your performance on this assignment.

For each of the following sections, a mark of A, C or F will be assigned. The final mark for the assignment will be a summation of the weighted GPA equivalents.

Section (Weight): Description
Refactoring (25%): Refactoring can be difficult to assess from the final code, because the marker sees only the "after" picture and has no idea what the "before" picture looked like. Your mark in this category will depend on how effectively you show your marker that you have applied the principles of refactoring in your assignment. I recommend that you create a log (in tabular form) which details how you've applied the principles of refactoring to this assignment. Note: Creating logs or audit trails of the software development process is common in industry and research.
Design (20%): This assignment is very well suited to an Object Oriented design. You will be assessed on your final design. If your design is difficult to understand, not relevant to the problem, or is overly procedural, you will likely get a mark of C or less for this section.
Implementation (15%): Implementation follows naturally from Design. If your design is deficient, then your implementation will probably be deficient as well. An implementation which is not properly documented will not receive a mark higher than a C for this section.
Use of polymorphism (20%): Fundamental to managing state in OO is polymorphism. Your use of polymorphism in this assignment must convince the marker that you know how to use it effectively. I recommend that you at least use polymorphism to allow for the separate interpolation methods. No use or improper use of polymorphism will result in an F for this section.
Does the software satisfy specs? (10%): The software needs to take in a phonetic or posture string and output a file which can feed the provided synthesizer. The parameter file MUST be accepted by the speech synthesizer and produce audio output.
Demonstration (10%): You will demo this assignment to one TA (not both). The TA will assign a mark in this category based on your output and on your ability to convince him that both partners worked on the assignment and that both understand the design/implementation. Because only one TA will mark your demonstration, this mark will not be averaged between 2 markers.
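
To make the weighted summation concrete, here is a small Python sketch. The weights come from the criteria above. The GPA equivalents (A = 4.0, C = 2.0, F = 0.0) are an assumption based on a common 4-point scale and are not stated in this handout, and the `final_mark` function is hypothetical. The averaging of the two TAs' marks is not modeled.

```python
# Assumed 4-point GPA equivalents; the handout does not list the actual values.
GPA = {"A": 4.0, "C": 2.0, "F": 0.0}

# Section weights taken from the evaluation criteria (they sum to 1.0).
WEIGHTS = {
    "Refactoring": 0.25,
    "Design": 0.20,
    "Implementation": 0.15,
    "Use of polymorphism": 0.20,
    "Does the software satisfy specs?": 0.10,
    "Demonstration": 0.10,
}


def final_mark(letters):
    """Sum weight * GPA equivalent over every section."""
    return sum(WEIGHTS[section] * GPA[letters[section]] for section in WEIGHTS)
```

For example, an A in every section except a C in Implementation works out to 3.7 on this assumed scale (up to floating-point rounding).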

Hint

Make things easy on your TAs. The better organized your assignment is, the easier it is for your TA to mark it. Anticipate what your TA is going to be looking for and make it easy for him to find that information.