Developing a Vegan and Vegetarian Ingredient Reader
University
Shawnee State University
Major
Computer Science
Student Type
Undergraduate Student
Presentation Types
Oral Presentation (Live)
Keywords:
computer vision, artificial intelligence, natural language processing, grocery
Abstract
Text analysis in images is an important subfield of computer vision that extends object recognition to language. Key steps in text analysis are detecting, segmenting, and recognizing text, then comparing it to keywords of interest. In this project, this process is used to find non-vegetarian and non-vegan ingredients in a given food ingredient list and report whether a product is vegetarian, vegan, or neither. First, a large dataset of varied ingredient list images was collected. With this dataset, computer vision was used to find the words within each image, and text mining was then used to compare those words to a word list of ingredients. Testing has found high accuracy in deciphering and classifying text from clean images; however, noisier pictures remain difficult. Unlike the legally standardized nutrition labels, ingredient labels lack a standard format and style and can vary in font, color, and contrast, which is a critical issue for this problem.
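The word-list comparison step described in the abstract might look roughly like the following Python sketch. It assumes the text has already been recognized from the image and only illustrates the keyword matching. The names `NON_VEGETARIAN`, `NON_VEGAN`, `tokenize`, `find_matches`, and `classify` are hypothetical, and the short keyword sets are placeholders, not the project's actual word list.

```python
import re

# Hypothetical keyword lists for illustration only; not the project's word list.
NON_VEGETARIAN = {"gelatin", "lard", "anchovy", "chicken", "beef", "pork", "tallow"}
NON_VEGAN = {"milk", "egg", "honey", "whey", "casein", "butter"}


def tokenize(text):
    """Lowercase the recognized text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def find_matches(tokens, keywords):
    """Return the keywords found among the tokens, allowing a trailing plural 's'."""
    found = set()
    for token in tokens:
        if token in keywords:
            found.add(token)
        elif token.endswith("s") and token[:-1] in keywords:
            found.add(token[:-1])
    return found


def classify(text):
    """Label an ingredient list as 'vegan', 'vegetarian', or 'neither'."""
    tokens = tokenize(text)
    if find_matches(tokens, NON_VEGETARIAN):
        return "neither"
    if find_matches(tokens, NON_VEGAN):
        return "vegetarian"
    return "vegan"


print(classify("INGREDIENTS: Sugar, Corn Syrup, Gelatin, Citric Acid"))  # neither
print(classify("Wheat flour, eggs, whey (milk), salt"))                  # vegetarian
print(classify("Oats, sugar, sunflower oil, salt"))                      # vegan
```

Matching single words keeps the sketch short, but it would wrongly flag compound names such as "peanut butter", and it does nothing about misrecognized characters from noisy images; a fuller version would need phrase-level matching and some tolerance for recognition errors.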
Human and Animal Subjects
no
IRB or IACUC Approval
no
Faculty Mentor Name
Dr. Trevor Bihl
Faculty Mentor Title
Adjunct Faculty
Faculty Mentor Department
Engineering Technologies
Recommended Citation
Simpkins, Dustin and Bihl, Trevor, "Developing a Vegan and Vegetarian Ingredient Reader" (2025). Celebration of Scholarship. 1.
https://digitalcommons.shawnee.edu/cos/2025/session1/1
Location
LIB 204