We will need you to submit all your code as a final deliverable. PLAGIARISM will be strictly penalized; see here for more details.
Project Pitch (5%)
Every student gives a 5-minute pitch for a project idea, and all the other students vote on the pitches. This will help the students choose project teams. The pitch should outline the problem being solved and why we should care about it. There should be a clear connection to language models as a path to addressing the problem (i.e., it must involve language of some kind). It should also give an idea of what the inputs and the outputs are, ideally with real-world examples. It is important to name the project idea for ease of voting.
Project Team Formation
Projects are to be done in teams of 2. Depending on enrollment, we will make some exceptions if anyone is left without a teammate.
Project Proposal (10%)
Student teams should submit a ~1-page proposal (using the *CL paper submission template) for their project by Week 5. The proposal should:
- state and motivate the problem by providing a problem or task definition (preferably with example inputs and expected outputs),
- situate the problem within related work (this might help you find sources of data for training a model for your task),
  - Related work: look for publications, starting with the ACL Anthology.
  - References do not count towards the page limit, but please follow the correct format.
- state a hypothesis to be verified and how you will evaluate if it is valid, and
- provide a brief description of the approach to be followed to verify the hypothesis (such as proposed models and baselines; there is no need to provide all the modeling details at this point).
We highly encourage students to work on a problem involving predictive models, so it’s worth thinking about the five key ingredients of supervised learning: data, model, loss function, optimization algorithm, and inference / evaluation.
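As a rough illustration of how the five ingredients fit together, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy sentiment sentences are made up, and the bag-of-words logistic regression trained with stochastic gradient descent is just one simple choice of model and optimizer, not a required or recommended approach for your project.

```python
import math

# 1. Data: a toy, made-up sentiment dataset of (text, label) pairs.
TRAIN = [
    ("good great fun", 1),
    ("great acting good plot", 1),
    ("bad boring plot", 0),
    ("boring bad acting", 0),
]
# A tiny held-out set; real projects need a proper train / dev / test split.
HELD_OUT = [("good fun plot", 1), ("bad acting", 0)]

def featurize(text):
    """Bag-of-words counts for a whitespace-tokenized text."""
    counts = {}
    for token in text.split():
        counts[token] = counts.get(token, 0) + 1
    return counts

# 2. Model: logistic regression over bag-of-words features.
def predict_proba(weights, bias, feats):
    score = bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in feats.items())
    return 1.0 / (1.0 + math.exp(-score))

# 3. Loss function: average binary cross-entropy (negative log-likelihood).
def loss(weights, bias, data):
    total = 0.0
    for text, label in data:
        p = predict_proba(weights, bias, featurize(text))
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(data)

# 4. Optimization algorithm: plain stochastic gradient descent.
def train(data, epochs=50, lr=0.5):
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in data:
            feats = featurize(text)
            # Gradient of the cross-entropy loss with respect to the score.
            error = predict_proba(weights, bias, feats) - label
            for tok, cnt in feats.items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * cnt
            bias -= lr * error
    return weights, bias

# 5. Inference / evaluation: threshold at 0.5 and measure accuracy.
def accuracy(weights, bias, data):
    correct = sum(
        int(predict_proba(weights, bias, featurize(text)) >= 0.5) == label
        for text, label in data
    )
    return correct / len(data)

weights, bias = train(TRAIN)
print("train loss:", round(loss(weights, bias, TRAIN), 3))
print("held-out accuracy:", accuracy(weights, bias, HELD_OUT))
```

When writing your proposal, it may help to state what plays each of these five roles in your own project, even if the details are still tentative.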
Project Progress Report (10%)
Student teams should submit a ~3-page progress report (using the *CL paper submission template) for their project by the end of Week 9. This report should:
- once again describe the project’s goals (it is okay if these have changed slightly since the proposal, based on the feedback),
- contain all details on the dataset (your dataset should mostly be collected by this time),
- contain some initial results (think of these as motivating results or an analysis of your data that supports your hypothesis), and
- outline a concrete plan of what will be done before the final report.
While the initial results might be inconclusive, you are expected to have made non-trivial progress by this point. The project proposal may be extended for this report. Please take into consideration the earlier feedback you received and address it inline (you may highlight these changes in a different text color if you wish to draw the grader’s attention).
Project Final Presentation (15%)
Please prepare a 15-20 minute presentation, followed by 5-10 minutes of Q&A. Presentations that are too long or too short will be penalized. You should show further progress since the progress report. Please attend the other presentations and ask questions; this will be part of your participation grade. After the presentation, please upload your project slides to Brightspace for grading.
Project Final Report (15%)
The final deliverable for the project is an 8-page final report (using the *CL paper submission template). This report should be comprehensive, contain all the details of the project, and be readable by someone new to it. The report should contain:
- A short abstract, highlighting the goal of the project and a single takeaway from the results.
- Introduction and motivation for the problem. What are the key hypotheses?
- Related Work (brief; additional details can be moved to the appendix)
- A clear description of the method, with as many details as possible. Diagrams encouraged.
- Data used. If new data was collected, include a detailed description of the collection / preprocessing.
- Experimental settings: which models / tools were used, as well as hyperparameter settings.
- Quantitative results in a tabular format, as well as a textual description. Qualitative results (e.g., input / output pairs from your model) are encouraged. If they are too large, add them to the appendix and point to it.
- Analysis / discussion of the key findings (was your hypothesis supported?) and future work.
- References (please follow the formatting closely).
- Appendix (unlimited length) with all additional details.
Please use the appendix for any additional details that do not fit in the 8 pages, and point to the appendix from the main report. Please include a link to the code (e.g., on GitHub). All projects will be graded taking into consideration the feedback given throughout the project.