Example Independent Studies - Fall 2017

Looking for an independent study to round out your skills and gain in-depth experience while working on an interesting research project? Read on.

Participation is limited to students who received an A in the Web Mining and Computing course (Spring 2017 or earlier) or in another data mining course.

Example Project 1: Can profiles extracted automatically from people’s bios indicate implicit mentoring?

Using text and data mining techniques, profile people based on their bios and find implicit evidence of mentor relationships between them. The data set for this project will be gathered from publicly available bios of people who have not collaborated at all, who have collaborated for a short time, and who have collaborated for a longer time. This information is available in peer-reviewed articles that include author bios. We would model relationships based on information extracted from the text and on additional structured data (e.g., gender, location, …).

Potential skills to be learned: NLP, text mining as needed.

This is a new project, but interested collaborators from Baylor University will help collect data. The goal is to publish the results.

Example Project 2: Do laypersons and clinicians see autism differently?

Evaluate the presence of diagnostic criteria in text written by laypersons who are discussing autism. The dataset will consist of online blogs, social media text, and Electronic Health Records (EHR). We will compare and contrast the vocabulary used by laypersons and by different clinical professionals. Do they emphasize different things? What is important to each group?

Potential skills to be learned: NLP, scraping of text (if needed), clustering, classification of symptoms.

This is an ongoing project. Much of the text is already available.

Example Project 3: Matching descriptions of accidents in the mining industry to structured fields.

Mining is an important industry in Arizona. We have access to a data set with thousands of descriptions of accidents. These accidents are coded according to the mine where they happened, severity, body part, etc. There are several interesting questions that need to be answered. Does the textual description match the structured information? Do the types of accidents change over time? Is there a relationship with environmental factors? With policies?
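As a rough illustration of the first question, the sketch below flags records whose narrative never mentions the coded body part. It is only a starting point under stated assumptions: the keyword lists, the function name description_matches_code, and the sample records are all hypothetical and are not taken from the actual data set.

```python
import re

# Hypothetical keyword lists for a few coded body parts.
# A real project would derive these from the data or a lexicon.
BODY_PART_KEYWORDS = {
    "hand": {"hand", "finger", "thumb", "wrist"},
    "back": {"back", "spine"},
    "eye": {"eye", "eyes"},
}


def description_matches_code(description: str, coded_body_part: str) -> bool:
    """Return True if the narrative mentions any keyword for the coded body part."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    keywords = BODY_PART_KEYWORDS.get(coded_body_part.lower(), set())
    return bool(tokens & keywords)


# Made-up sample records: (free-text description, coded body part).
records = [
    ("Employee cut his finger while changing a belt.", "hand"),
    ("Worker strained back lifting a cable.", "eye"),
]

# Records where the text and the structured field appear to disagree.
mismatches = [r for r in records if not description_matches_code(*r)]
```

Simple keyword overlap like this misses synonyms and misspellings, which is where the NLP and classification techniques listed for this project would come in.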

Potential skills to be learned: SQL Server and Java connections, NLP, information extraction from text, classification/clustering applications.

This is an ongoing project and some initial work has been completed. We will collaborate with two professors in Public Health.

Example Project 4: Can human-interpretable machine learning algorithms deliver text simplification rules?

Using human-interpretable machine learning algorithms (decision trees, association rules), we aim to find rules that translate ‘difficult’ sentences into ‘easy’ sentences. Successfully discovered and validated rules will be integrated into our text editor. Several corpora with easy and difficult texts are available. Preliminary work on the frequency of grammatical structures in each was completed in summer 2017.

Potential skills to be learned: SQL Server and Java connections, NLP, statistical approaches, rule-based approaches, new uses of existing algorithms, new algorithms.

This is an ongoing project and many data sources are available. The work can be in Spanish or English.

Example Project 5: Leverage Alexa (Amazon Echo)

Learn how to write skills for an Alexa application. This will be tested in a medical context. We will use Alexa to look up medical information on Wikipedia, and read snippets of the text in response to the questions asked.

Potential skills to be learned: developing projects on Amazon, Alexa skills, NLP.

This project was started by a volunteer this summer and the code will be available for reuse.




AMCIS 2017 Doctoral Consortium in Boston - Panel Presentation: the job talk.

In spring 2017, more than 50 students earned a certificate of Tomorrow's Leaders Equipped for Diversity. The certificates will be mailed soon and the names added to the website. Congratulations to all!

Summer 2017 - Eller Center for Mgt Innovation in Healthcare Funding. EHR Gold Standard Dataset Creation for Autism Spectrum Disorders Surveillance Project, $5K, 2017.

Summer 2016 - Eller College Small Research Grants, Age Prediction using Twitter, $2K, 2016.

August 2015 - NLM/NIH Funding: $1.4M Grant Will Design Free Online Text Simplifying Tool 

January 2015 - CDC Funding: NLP for Supporting Autism Surveillance