Extractive Summarization of Urdu Language using Deep Learning Techniques on a Custom Dataset
Extractive summarization shortens a long text by selecting its most important sentences and presenting them unchanged. Our research uses deep learning to generate such summaries for Urdu articles. Here’s how we did it:
We built a dataset of 1,000 Urdu articles, each covering a distinct topic. We fine-tuned a BERT model and added extra attention layers to improve its language understanding, which helped it produce accurate and meaningful summaries. Finally, we saved the results in a CSV file with three columns: Articles, Topic, and Summary.
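The pipeline described above (split an article into sentences, score them, keep the best ones, write a CSV row) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the word-frequency scorer is a simple stand-in for the fine-tuned BERT model, and the function names (split_sentences, score_sentences, summarize, write_summaries) are hypothetical. Only the three column names come from the description above.

```python
import csv
import re
from collections import Counter

# Output columns as named in the project description.
FIELDNAMES = ["Articles", "Topic", "Summary"]

# Sentence terminators: Urdu full stop (U+06D4), Arabic question mark (U+061F), "?" and "!".
TERMINATORS = "\u06D4\u061F?!"


def split_sentences(text):
    """Split text into sentences, keeping each terminator with its sentence."""
    pattern = "[^" + TERMINATORS + "]+[" + TERMINATORS + "]?"
    return [s.strip() for s in re.findall(pattern, text) if s.strip()]


def tokenize(sentence):
    """Whitespace tokenizer that drops terminators and the Urdu comma (U+060C)."""
    return re.findall("[^\\s\u060C" + TERMINATORS + "]+", sentence)


def score_sentences(sentences):
    """ASSUMPTION: placeholder scorer (mean document-level word frequency).

    In the described system, a fine-tuned BERT model would produce these scores.
    """
    freq = Counter(tok for s in sentences for tok in tokenize(s))
    scores = []
    for s in sentences:
        toks = tokenize(s)
        scores.append(sum(freq[t] for t in toks) / len(toks) if toks else 0.0)
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore document order
    return " ".join(sentences[i] for i in chosen)


def write_summaries(records, fileobj, k=2):
    """Write (article, topic) pairs as CSV rows with a generated Summary column."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for article, topic in records:
        writer.writerow(
            {"Articles": article, "Topic": topic, "Summary": summarize(article, k)}
        )
```

To write to disk, one would open the output with open(path, "w", newline="", encoding="utf-8") and pass the file object to write_summaries. Because the summary is built only from sentences already present in the article, the output stays extractive whichever scorer is plugged in.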
Our approach is a step forward in natural language processing for Urdu. By applying deep learning to Urdu text, we aim to make Urdu content easier to read and process. This research was presented at the 19th International Conference on Emerging Technologies (ICET’24).
This project shows how AI can improve language tools for non-English languages like Urdu, opening new possibilities for education, research, and communication.
The full paper will be available soon on IEEE Xplore.