The Journal of Asia TEFL
Volume 17 Number 1, Spring 2020, Pages 1-318   


http://dx.doi.org/10.18823/asiatefl.2020.17.1.9.143

An Analysis of the Errors in the Auto-Generated Captions of University Commencement Speeches on YouTube

    Jeong-Hwa Lee & Kyung-Whan Cha


Auto-generated captions on YouTube help viewers follow what is being said. At times, however, the captions are inaccurate and cause confusion. This paper identifies and analyzes the errors in the auto-generated captions of 20 university commencement speeches on YouTube. The speeches were delivered over a period of 12 years by speakers from different walks of life; the researchers selected ten male and ten female icons. Only the first 10 minutes of each speech were examined. All caption errors were collected and analyzed. The number of errors per speech ranged from 10 to 46, with an average of one error about every 26 seconds. Among the error categories, nouns recorded the highest number with 144 cases (31.3%), followed by verbs with 93 cases (20.2%) and prepositions with 37 cases (8.1%). Among the four subcategories, namely omission, addition, substitution, and word order, substitution recorded the highest number of errors with 357 cases (77.6%). The errors were also classified into two major groups: errors involving function words appeared in 169 cases (36.7%), and errors involving content words appeared in 291 cases (63.3%). These results suggest that the voice recognition software that automatically generates captions needs continued development so that its output is more accurate and helps viewers and listeners better comprehend video content.

Keywords: auto-generated caption errors, YouTube, university commencement speeches, function words, content words, omission, addition, substitution, word order
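
The figures reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the study: it assumes that the total number of errors is the sum of the function-word and content-word groups (the abstract does not state the total directly) and that each of the 20 speeches contributed exactly 10 minutes. The variable and function names are made up for this example.

```python
# Figures taken from the abstract; everything else here is an illustrative assumption.
SPEECHES = 20
SECONDS_PER_SPEECH = 10 * 60  # only the first 10 minutes of each speech were analyzed

function_word_errors = 169
content_word_errors = 291

# Assumption: the two major groups together cover every error.
total_errors = function_word_errors + content_word_errors


def share(count, total=total_errors):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Average time between errors across all analyzed footage.
seconds_per_error = SPEECHES * SECONDS_PER_SPEECH / total_errors

print("total errors:", total_errors)
print("seconds per error:", round(seconds_per_error, 1))
print("nouns:", share(144), "verbs:", share(93), "substitution:", share(357))
print("function words:", share(function_word_errors),
      "content words:", share(content_word_errors))
```

Under these assumptions the total comes to 460 errors over 12,000 seconds of speech, which agrees with the reported average of roughly one error every 26 seconds.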


