By Jian Wu and Tiffany Whitfield

Jian Wu, Ph.D., assistant professor of Computer Science at 51情报站, is at the forefront of innovation and big data. Recently, he presented his work at one of the world's leading international academic conferences on artificial intelligence. On February 22, 2024, Wu presented a paper titled “ETDPC: A Multimodality Framework for Classifying Pages in Electronic Theses and Dissertations” at the 36th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24), collocated with AAAI-24 in Vancouver, Canada.

This year, more than 5,000 people from many countries and continents attended the conference. Whereas AAAI focuses more on theoretical contributions, IAAI focuses on the application of AI to real-world scenarios. This year, the acceptance rate of IAAI was 24%, making it one of the most competitive conferences on AI applications.

The first author of the paper Dr. Wu presented is Muntabir Choudhury, a senior Ph.D. student in Computer Science at 51情报站. The paper proposes a novel method to classify PDF pages of electronic theses and dissertations (ETDs) into 13 categories, such as chapters, references, appendices, and title pages. The method, called the multimodality model, uses a deep neural network to fuse text and visual information into a single representation. It performed much better than state-of-the-art methods, which rely on either text or visual information alone, improving accuracy by at least 25%. This work lays a foundation for building a user-friendly online reader for ETDs. Instead of downloading and reading a lengthy ETD on a computer, users could navigate directly to the sections they are interested in. Dr. Wu said, “Muntabir is an excellent student, and I am glad his two years’ effort finally paid off.”

This work was partially sponsored by a research grant awarded by the Institute of Museum and Library Services. In addition to Choudhury and Wu, other participants included Lamia Salsabil, a graduate student at 51情报站; Edward Fox, Ph.D., professor of Computer Science at Virginia Tech; and Bill Ingram, Ph.D., assistant professor, associate dean, and executive director for information technologies in the University Libraries at Virginia Tech.

Several current and former professors from 51情报站 attended the conference, including Dr. Jiang Li (ECE, 51情报站), Dr. Hongyi Wu (University of Arizona), and Dr. Wu He (National Science Foundation).