William Shakesblake: Exploring Text Generation From Images Using N-grams and Large Language Models


Nalcı D., Yeşilyurt S., Yılmaz B.

Karaelmas International Science and Engineering Symposium (KISES 2024), 10-11 May 2024, no. 8997095, pp. 65-66

  • Publication Type: Conference Paper / Abstract
  • Pages: pp. 65-66
  • Affiliated with Sivas Cumhuriyet University: Yes

Abstract

Text generation is an active research topic. Advances in artificial intelligence, natural language processing, and Large Language Models (LLMs) make it possible to automate text generation. In this study, text was generated from images using a Markov Chain Model (MCM) and an LLM [1,6]. The generated texts are expected to contain words characteristic of Blake and Shakespeare. Our study consists of four parts: dataset creation, fine-tuning of models, analysis of images, and expert observation. To create the dataset, Shakespeare's sonnets [5] and Blake's poems [7] were collected and concatenated into a single dataset of 44,563 words, which was split into 90% for training and 10% for testing. We share the dataset we created [2]. YOLOv5 was used to analyze the input images. For the contextual analysis, we asked 18 philology experts to rate which author each text belonged to on a scale from 0 to 10, with 0 being Blake and 10 being Shakespeare. We first trained the MCM with the n parameter set to 5 and created n-grams [1], then fed the resulting model with images. The experts concluded that although the results contained words from both authors, the sentences lacked contextual coherence [3]. A GPT model was then fine-tuned on the dataset [6,2]. The philology experts concluded that the texts it generated carried the styles of both authors and that their word distribution and context were more meaningful than those of the MCM [6].
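The MCM step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "n = 5" means 5-word n-grams (a 4-word context predicting the fifth word), that the 90/10 split is a simple cut over the word sequence, and that an image contributes a single seed word (here the hypothetical `seed_word`, standing in for a label a detector such as YOLOv5 might return). The function names and the toy corpus, two well-known opening lines by Shakespeare and Blake, are chosen only for illustration.

```python
import random
from collections import defaultdict


def split_words(words, train_fraction=0.9):
    """Cut the word sequence into a training part and a test part."""
    cut = int(len(words) * train_fraction)
    return words[:cut], words[cut:]


def build_ngram_model(words, n=5):
    """Map every (n-1)-word context to the list of words observed after it."""
    model = defaultdict(list)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context].append(words[i + n - 1])
    return dict(model)


def generate(model, seed_word, length=20, rng=None):
    """Generate up to `length` words, starting from a context that begins
    with `seed_word` (assumption: the seed comes from an image label)."""
    rng = rng or random.Random(0)
    candidates = [c for c in model if c[0].lower() == seed_word.lower()]
    # Fall back to any known context if the seed word never starts one.
    context = rng.choice(candidates or sorted(model))
    output = list(context)
    while len(output) < length:
        followers = model.get(tuple(output[-len(context):]))
        if not followers:  # dead end: this context was never continued
            break
        output.append(rng.choice(followers))
    return " ".join(output)


# Toy corpus for illustration only; the study used 44,563 words.
corpus = (
    "Shall I compare thee to a summer's day Thou art more lovely and more "
    "temperate Tyger Tyger burning bright in the forests of the night"
).split()

train_words, test_words = split_words(corpus)
model = build_ngram_model(corpus, n=5)
print(generate(model, seed_word="tyger", length=8))
```

With a context this long and a small corpus, most contexts have exactly one continuation, so the model largely replays source passages. That is consistent with the experts' observation that the MCM output contained the authors' words but drifted from the context of the image.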