Course Project
An integral part of the course is the class project (30% of the grade), which gives students a chance to apply the algorithms discussed in class to a research-oriented project. This semester the theme is Machine Learning for Good.
Some example project titles are as follows:
- Predicting Poverty from Satellite Imagery. Predicting poverty from multispectral satellite images.
- Disease Outbreak Detection. Monitoring outbreaks using data from social media streams.
- Histopathological Cancer Detection. Detecting cancer from pathological scans.
- Automated Emergency Management. Using machine learning for emergency response management.
- Detecting Hate Speech. Identifying hate speech in social media platforms.
Recommended Reading
- Neal Jean, Marshall Burke, Michael Xie, William Davis, David Lobell, Stefano Ermon. Combining Satellite Imagery and Machine Learning to Predict Poverty. Science, 2016
- Hayate Iso, Shoko Wakamiya, Eiji Aramaki. Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction. COLING, 2016
- Dayong Wang, Aditya Khosla, Rishab Gargeya, Humayun Irshad, Andrew H. Beck. Deep learning for identifying metastatic breast cancer. arXiv Preprint:1606.05718, 2016
- Christos Kyrkou, Theocharis Theocharides. Deep-Learning-Based Aerial Image Classification for Emergency Response Applications Using Unmanned Aerial Vehicles. CVPR Workshops, 2019
- Rohan Kshirsagar, Tyrus Cukuvac, Kathleen McKeown, Susan McGregor. Predictive Embeddings for Hate Speech Detection on Twitter. EMNLP Workshops, 2018
- David Rolnick et al. Tackling Climate Change with Machine Learning. arXiv preprint:1906.05433, 2019
Resources
In your projects, you may use a dataset available on the web (some example datasets are listed below) or collect your own data. However, if you choose the latter option, keep in mind that data collection can be fun and exciting, but it is also time-consuming.
Software and Libraries
You are encouraged to learn and use the following machine learning and deep learning frameworks in your projects. Links to some useful NLP tools are also provided.
- Caffe: A deep learning framework made with expression, speed, and modularity in mind
- TensorFlow: Open Source Software Library for Machine Intelligence
- Theano: A Python framework for fast computation of mathematical expressions.
- Keras: Deep Learning library for Theano and TensorFlow
- MatConvNet: CNNs for MATLAB
- mxnet: Flexible and Efficient Library for Deep Learning
- Torch: A scientific computing framework for LuaJIT
- LIBSVM: A Library for Support Vector Machines
- scikit-learn: Machine learning in Python
- Stanford CoreNLP: A suite of core NLP tools
- NLTK: Natural Language Toolkit
Deliverables
- Proposals: November 16, 2019.
- Project progress reports: December 22, 2019
- Final project presentations: January 8, 10, 2020
- Final reports: January 15, 2020
In preparing your progress and final project reports, you should use the provided LaTeX template and submit them electronically in PDF format. Late submissions will be penalized.
Collaboration Policy
Each project should be done in groups of 3 students. Of course, there may be some exceptions, depending on the enrollment. Note that students without a team will be randomly assigned to one project group.
Grading
- Proposal (2%)
- Blog posts (4%)
- GitHub commits and meetings with TAs (4%)
- Progress report (5%)
- Presentation (7.5%)
- Final report and code (7.5%)
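As a quick check of the breakdown above, the following minimal Python sketch confirms that the component weights add up to the 30% allotted to the project and shows how a group's contribution to the course grade could be computed. The dictionary keys, the function name, and the example scores are hypothetical and only for illustration. Late penalties are not modeled.

```python
# Component weights as percentages of the overall course grade,
# copied from the grading list above.
WEIGHTS = {
    "proposal": 2.0,
    "blog_posts": 4.0,
    "commits_and_meetings": 4.0,
    "progress_report": 5.0,
    "presentation": 7.5,
    "final_report_and_code": 7.5,
}


def project_contribution(scores):
    """Return the points (out of 30) that the project adds to the course grade.

    `scores` maps a component name to the fraction earned (0.0 to 1.0).
    Components missing from `scores` count as 0.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# The weights add up to the 30% allotted to the project.
assert sum(WEIGHTS.values()) == 30.0

# Hypothetical group: full marks everywhere except 80% on the final report.
example = {name: 1.0 for name in WEIGHTS}
example["final_report_and_code"] = 0.8
print(project_contribution(example))
```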
Project Proposal
Each project group should submit a half-page project proposal on their specific project idea by November 10, 2019. The proposal should provide:
- The research topic to be investigated,
- What data you will use,
- A list of related papers.
Blog posts/GitHub commits/Meetings with TAs
Each project group should maintain a blog sharing their steady progress, ideas, and experiments, and they must write at least one blog post per week (excluding exam weeks). Moreover, they will regularly meet with TAs to discuss their progress and get feedback. Each group should maintain a GitHub repository for their project (must be viewable by the TAs and instructor). The frequency of your commits to GitHub will also be graded.
Progress Report
Due: December 22, 2019 (11:59pm)
Each student should submit a project progress report by December 22, 2019. The report should be 3-4 pages and should describe the following points as clearly as possible:
- Problem to be addressed. Give a short description of the problem that you will explore. Explain why you find it interesting.
- Related work. Briefly review the major works related to your research topic.
- Methodology to be employed. Describe the machine learning method that is expected to form the basis of the project. State whether you will extend an existing method or you are going to devise your own approach.
- Experimental evaluation. Briefly explain how you will evaluate your results. State which dataset(s) you will employ in your evaluation. Provide your preliminary results (if any).
Project Presentations
Due: January 8-10, 2020 (in class)
Each project group will have ~8 mins to present their work in class. The suggested outline for the presentations is as follows:
- High-level overview of the paper (main contributions)
- Problem statement and motivation (clear definition of the problem, why it is interesting and important)
- Key technical ideas (overview of the approach)
- Experimental set-up (datasets, evaluation metrics, applications)
- Strengths and weaknesses (discussion of the results obtained)
In addition to classroom presentations, each group should also prepare an engaging video presentation of their work using online tools such as PowToon, moovly or GoAnimate. The deadline is January 12, 2020.
Final Report
Due: January 15, 2020 (11:59pm)
As the last deliverable of the course project, each group is expected to submit a project report prepared using the style files provided on the course web page. The report should be 6-8 pages and should be structured as a research paper. It will be graded based on clarity of presentation and technical content. A typical organization of a report is as follows:
- Title, Author(s).
- Abstract.
- Introduction. This section introduces the problem that you investigated by providing a general motivation and briefly discusses the approach(es) that you explored to solve this problem.
- Related Work. This section discusses relevant literature for your project topic.
- The Approach. This section gives the technical details about your project work. You should describe the representation(s) and the algorithm(s) that you employed or proposed in as much detail and as specifically as possible.
- Experimental Results. This section presents some experiments in which you analyze the performance of the approach(es) you proposed or explored. You should provide a qualitative and/or quantitative analysis, and comment on your findings. You may also demonstrate the limitations of the approach(es).
- Conclusions. This section summarizes all your project work, focusing on the key results you obtained. You may also suggest possible directions for future work.
- References. This section gives a list of all related work you reviewed or used.