Transforming Classical Chinese Texts into Searchable Databases with AI – Nov 6, 2024
Session Description
November 6, 2024 @ 3:00 pm - 5:00 pm
In this presentation, Lomas will demonstrate how AI technologies can convert large volumes of unstructured classical Chinese texts, such as genealogies and Qing dynasty government employee records, into organized, searchable databases. This approach addresses a long-standing bottleneck in classical Chinese studies: manual data entry.
Attendees will learn about a workflow designed to process millions of pages of historical texts, with a focus on the complexities of layout identification and the precision required for effective text extraction. The talk will cover key technologies, including customized Optical Character Recognition (OCR) and Named Entity Recognition (NER) models, and show how they improve data extraction accuracy and make the texts more accessible. It is intended for those interested in digital humanities, cultural heritage preservation, and the intersection of AI and historical scholarship.