MIDS Capstone Project Spring 2023

PlantDx

Overview and Motivation

Our Capstone project, PlantDx, is a tool that uses artificial intelligence and computer vision to diagnose plant health. It allows you to take a picture of a plant leaf, upload it to a webpage, and receive a real-time diagnosis from a trained model. The model is trained on over 50,000 plant images from the "PlantVillage" dataset, spanning more than 30 plant classes and types and including both healthy and diseased examples. For each prediction, the model outputs a confidence level that is displayed to the user, along with potential care suggestions when the diagnosis is "unhealthy". Much like mobile check deposit improved the banking experience, this project uses a conceptually similar image-based approach to enhance plant care.
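As an illustration of the confidence output described above, here is a minimal inference sketch. It assumes a PyTorch classifier; the `diagnose` helper and the class names are hypothetical, and the real model covers 30+ PlantVillage classes.

```python
import torch
import torch.nn.functional as F

# Hypothetical label list; the deployed model covers 30+ PlantVillage classes.
CLASSES = ["Tomato___healthy", "Tomato___early_blight", "Pepper___healthy"]

def diagnose(model, image_tensor):
    """Return (label, confidence) for a single preprocessed leaf image."""
    model.eval()
    with torch.no_grad():
        logits = model(image_tensor.unsqueeze(0))    # add a batch dimension
        probs = F.softmax(logits, dim=1).squeeze(0)  # per-class confidence
    conf, idx = probs.max(dim=0)
    return CLASSES[idx], conf.item()
```

The softmax turns raw logits into a probability per class, and the highest-probability class is shown to the user together with its confidence.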

The motivation for this project is the countless hours spent on frustratingly inconclusive Google searches for remedies to plant issues. Consumer plants and gardening is a large market ($25bn forecast by 2029, excluding agriculture), and the time is ripe for a digital solution to plant care given recent improvements in computer vision. Furthermore, digital plant care is a relatively untapped opportunity, with no consumer-focused diagnosis tools on the market (plant identification apps exist, but none are health-diagnosis-driven). Most development in this space has targeted commercial agriculture rather than the typical consumer who owns house plants or keeps a garden. This segment is PlantDx's target market.

Approach and Data Science Techniques

The data science approach is a convolutional neural network (CNN) pre-trained on ImageNet. We experimented with EfficientNet, MobileNet, and ShuffleNet, each with different strengths and weaknesses; ultimately EfficientNet produced the best results while remaining computationally feasible. Transfer learning allows each layer of the CNN to be fine-tuned, and data augmentation, including background transformations and image transformations from the Albumentations library, was used to reduce overfitting. The model was evaluated with accuracy and loss metrics on both the training and validation images.

One key learning is that funded research datasets (typically homogeneous and high quality) do not provide the real-world variance needed for optimal real-world performance. This tends to cause overfitting: our original model was extremely accurate, but that accuracy was partially driven by the low variance of the source dataset. Collecting enough data to do this at an industrial scale will require significant resources, or an inflexible approach to the input image (similar to how mobile check deposit requires a very specific background color, light quality, etc.). Another key learning is to prioritize data augmentation strategies early, leaving enough time for effective experimentation, since most strategies did not turn out to be useful.

Outcomes

Overall, the model performs well on plant classes and health outcomes that are represented in the training data. Input image quality, and how closely an image resembles the training set, are the most important factors in the model's performance. This can be very useful to consumers growing tomatoes, bell peppers, and other common plants included in the training data.


Last updated:

April 20, 2023