UMD Researchers Build AI Database to Improve Math Learning Outcomes

Project Funded by $4.5M Philanthropic Grant Aims to Help Teachers, Students, Researchers and Edtech Industry
Students raise their hands in class. Photo by Adobe Stock

For many K–12 students, mathematics is a scary subject: confusing, frustrating, and often disliked. The problem is more pronounced among students from underserved communities and varied socioeconomic backgrounds. According to the most recent data from the National Assessment of Educational Progress, the score gap in mathematics between the highest- and lowest-performing 8th-grade students widened by 7 points between 2019 and 2024.

Educators looking to close this gap are constantly seeking new ideas and improved teaching modules. Artificial intelligence (AI) shows promise: natural language processing software is being used to analyze transcripts of classroom instruction and identify specific speech patterns that can encourage students’ reasoning skills.

Other AI tools, such as machine learning, have been used to evaluate videos of classroom interactions, for example how or when students take notes or raise their hands, or where teachers are positioned in the classroom while speaking.

But according to a multi-institutional team of education experts led by the University of Maryland, relying solely on current AI-driven tools provides an incomplete picture of the rich multimodal nature of classroom instruction, especially if the data used to train the AI models is of low quality or does not include the kind of rigorous mathematics instruction educators would like to see.

To address this challenge, the team is planning to develop a large-scale open-source dataset for AI model-training tools focused on K–12 math education. The data, to be collected over the next three years from classroom recordings of 300 instructors who teach fourth to eighth graders, is expected to accelerate AI-driven strategies that improve educational best practices for both teachers and students.

The project is funded by a $4.5 million award from the Gates Foundation and Walton Family Foundation.

The dataset’s potential uses are vast, says Jing Liu, an assistant professor of education policy in UMD’s College of Education who is the lead principal investigator on the project. Researchers from various disciplines—from math educationists to psychologists to economists—could use it to explore topics like understanding students’ sense of belonging in classrooms and analyzing teaching quality.

Additionally, EdTech companies could use the data to train their large language models for curriculum design, while AI developers could use it to train new models to improve computer vision techniques or speech recognition systems used in education.

The project includes reaching out to classrooms across the nation, Liu says, with the aim of covering school districts that span many different localities and serve students from different socioeconomic backgrounds.

“We already know that accuracy and representativeness are critical issues in AI systems,” he explains. “For this project, we want to capture a range of students to make it as representative as possible—including students with different learning needs and language backgrounds to ensure our dataset is robust and broadly applicable.”

Assisting Liu on the project are Wei Ai, an assistant professor in the College of Information; Heather Hill, a professor of teacher learning and practice at Harvard University; and Dora Demszky, an assistant professor of education data science at Stanford University.

For the project, data science experts on the team will supplement the classroom recordings with additional content, including student and teacher post-lesson surveys, lesson plans, classroom materials, administrative data and test scores.

This will allow the recorded transcripts to include detailed annotations of key teaching moves and student mathematical practices, the researchers say.

For Liu and Ai, ensuring that the dataset is easily accessible is also crucial.

The team wants the barrier to entry of using the data to be quite low to allow researchers from a broad array of scientific and educational backgrounds to use it, says Ai, who has a joint appointment in the University of Maryland Institute for Advanced Computer Studies (UMIACS).

The data will be anonymized, linked, and stored in a research repository and made available to researchers and developers under data use agreements. Researchers will access data components that do not contain personally identifying information through open-source avenues such as EdConvokit, GitHub, and Hugging Face.

From left: Jing Liu (College of Education), Wei Ai (College of Information/UMIACS), and Ph.D. students Meiyu Li (information studies) and Paiheng Xu (computer science) discuss the use of AI to advance math education. Photo by Mike Morgan for UMIACS

Given the project’s ambitious scope, Liu is anticipating several challenges, including managing a large group of study participants and securing data privacy agreements with partnering school districts. 

He notes that Hannah Rosenstein, a research project manager at UMD’s College of Education, has been invaluable in recruiting school districts, hiring local staff members, and overseeing the initial data collection process. Liu also expressed appreciation for the UMD doctoral students contributing to the project: Ting-Yu Chung, Sarah Montana and Jiseung Yoo from the Department of Teaching and Learning, Policy and Leadership; Paiheng Xu from the Department of Computer Science; and Meiyu (Emily) Li and Phuong Anh (Kem) Nguyen-Le from the College of Information.

The computational end of the project will be supported by UMIACS, with the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM) also aiding with several data sharing tasks. Additional help will come from the College of Education’s Center for Educational Data Science and Innovation, which Liu and Ai have set up to serve as a hub for research in AI and education.

The Gates Foundation/Walton Family Foundation-funded initiative is the latest in a series of collaborations between Liu and Ai in the field of education and AI. In 2023, they received a Grand Challenges grant from UMD’s Division of Research to measure and improve equity in K–12 math classes with machine learning. Last year, they were awarded $1.5 million from the National Science Foundation to continue advancing this work with a focus on lesson planning. Around the same time, the team also received a seed grant from the Institute for Trustworthy AI in Law & Society (TRAILS) to address disparities in PK–12 education that are predictable by race and ZIP code.

This story first appeared on the University of Maryland Institute for Advanced Computer Studies website.

Bottom photo: From left: Jing Liu (lead-PI, College of Education), Wei Ai (co-PI, College of Information and UMIACS), and Ph.D. students Meiyu Li (information studies) and Paiheng Xu (computer science) discuss the use of AI to advance math education for fourth through eighth graders. Photo credit: Mike Morgan for UMIACS

***

About the Gates Foundation

Guided by the belief that every life has equal value, the Gates Foundation works to help all people lead healthy, productive lives. Addressing issues that include poverty, health and education, the foundation excels in building partnerships that bring together the best organizations around the globe to find solutions and drive change.

About the Walton Family Foundation

The Walton Family Foundation is rooted in several generations of family and is currently working in three important areas: strengthening the connections between K–12 education and lifelong opportunity; protecting rivers, oceans and the communities they support; and advancing opportunities and positive outcomes in the organization’s home region of Northwest Arkansas and the Arkansas-Mississippi Delta.