BYU students and faculty members are working to create an algorithm to digitally transcribe the Wilford Woodruff papers.
These papers consist of hundreds of thousands of letters, journal entries and other documents all written by the fourth president of The Church of Jesus Christ of Latter-day Saints.
Joe Price, a BYU economics professor and director of the Record Linking Lab, has done research that involves developing “tools that would automatically index historical records.”
“The Wilford Woodruff Foundation learned about some of our work and reached out to us to see if maybe we could be involved in the project,” Price said.
According to the Wilford Woodruff Papers website, the project’s mission is to “digitally preserve and publish Wilford Woodruff’s eyewitness account of the Restoration of the Gospel of Jesus Christ from 1833 to 1898.” Digitizing these papers will make them available for anyone to read and learn from.
However, transcribing these papers is no easy feat. BYU student Paul Smith is part of the Applied and Computational Math Emphasis program and is lending his skills to work on this project.
“We’re using a deep learning model,” Smith said. “What we do is we feed it a bunch of images (of the papers) and we label each word on an image. Hopefully the algorithm will start recognizing what’s a word and what’s not.”
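As a rough illustration of what labeling each word on an image can look like in practice, here is a minimal Python sketch. It is not the BYU team's code or data format, which the article does not describe. The `WordLabel` class, the `word_mask` function and the tiny 12-by-4-pixel "page" are all hypothetical names and values chosen for this example. The sketch turns labeled word boxes into a grid marking which pixels belong to a word and which do not, one common kind of training target for a word-detection model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordLabel:
    """One hand-labeled word: a bounding box in pixels plus its transcription.

    Hypothetical structure for illustration only.
    """
    left: int
    top: int
    width: int
    height: int
    text: str


def word_mask(image_width, image_height, labels):
    """Return a 2D grid of 0/1 values: 1 where a pixel lies inside any labeled word box."""
    mask = [[0] * image_width for _ in range(image_height)]
    for label in labels:
        # Clip each box to the image bounds so stray labels cannot index outside the grid.
        y_start = max(0, label.top)
        y_end = min(image_height, label.top + label.height)
        x_start = max(0, label.left)
        x_end = min(image_width, label.left + label.width)
        for y in range(y_start, y_end):
            for x in range(x_start, x_end):
                mask[y][x] = 1
    return mask


# Made-up labels for a tiny 12x4 pixel "page" (not real Woodruff data).
labels = [
    WordLabel(left=0, top=0, width=4, height=2, text="Dear"),
    WordLabel(left=6, top=1, width=5, height=3, text="Brother"),
]
mask = word_mask(12, 4, labels)

for row in mask:
    print("".join(str(value) for value in row))
```

In a real pipeline, a model would be trained on many such image and label pairs, and the transcriptions stored with each box would supervise the step that reads the word itself.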
Smith explained this process is difficult because Woodruff often wrote with typos and shorthand, and his papers have ink smudges. Woodruff also wrote extremely specific journal entries every day for most of his life.
“He was a good writer, but it’s still a journal,” Smith said. “It’s not even easy to read for a human, let alone a machine.”
Because of the large number of papers to transcribe, the Wilford Woodruff papers project is set to be finished within 10 years with a budget of $10 million. BYU’s team is optimistic they will complete the project much sooner and at a much lower cost with their transcribing model.
“The immediate goal is to dramatically expedite the speed with which we can index the records of Wilford Woodruff,” Price said. “But the broader goal is to develop tools that anyone could use to auto-index the letters and diaries of their own ancestors.”
As family history work is a key component of The Church of Jesus Christ of Latter-day Saints as well as BYU, this project aims to make transcribing personal letters or journal entries an easy task for everyone.
“This is something that we are hoping everyone can use,” Smith said. “I’m sure we’ve all written in our journal or have received letters that we want to transcribe. Something like this would come in handy.”
Tim Palmer, a BYU economics student, is also collaborating on this project.
“The more people use (the model) and the more handwriting styles it gets trained on, the better it will be at recognizing handwriting,” Palmer said. “We’re just starting specifically with Wilford Woodruff because he wrote a ton and that’s a really good place to start.”
Working on this project has led Smith and Palmer to recognize how collaborative human efforts are essential in creating new technology.
“I think it’s incredible how amazing the human mind is,” Smith said. “We can look at a page and we instantly know what’s going on. It takes so much effort to get a machine to recognize something so simple.”
Palmer said the project combines the efforts of the computer science department, the Applied and Computational Math Emphasis program, economics students and family history resources.
“When we boil it down, it’s people from all over campus in all different areas working together to get something done,” Smith said. “It’s very much a joint effort of a lot of different disciplines coming together.”
The BYU team hopes to finish transcribing Woodruff’s journals by the end of the summer.