Multilingual Few-shot Document Layout Analysis
- Research topic/area
- Document Analysis, Deep Learning, Computer Vision, Artificial Intelligence
- Type of thesis
- Master
- Start date
- -
- Application deadline
- 31.05.2026
- Duration of the thesis
- -
Description
Document Layout Analysis (DLA) has made great strides in high-resource languages (such as English and Chinese) but remains challenging for low-resource languages because labeled data is scarce. State-of-the-art models such as LayoutLM achieve strong results on English documents; however, they struggle to generalize to languages like Tamil, Urdu, or Amharic that were absent or underrepresented in training. In this thesis, we will investigate multilingual modeling and transfer learning to tackle DLA in low-resource languages. We will explore how multilingual document models (e.g., LayoutXLM, LayoutLMv3) pre-trained on high-resource languages can be adapted through few-shot learning to perform layout analysis in new languages.
What you do:
● Literature research on DLA and few-shot learning
● Creation of a DLA dataset for low-resource languages
● Implementation of state-of-the-art methods for multilingual DLA tasks
What we offer:
● A quick start with our open-source code
● Compute resources for model training and deployment
● Experienced guidance and open discussions with other team members
● Support for publishing your work at top conferences, including attending them in person
Further Information:
We also offer further topics in areas such as computer vision, large language models (LLMs), generative models, retrieval-augmented generation (RAG), and document analysis and understanding.
Please feel free to contact me (yufan.chen@kit.edu) with your CV and transcript of records.
Requirements
- Requirements for students
-
- Interest in computer vision and in task-oriented research
- Python programming skills and knowledge of PyTorch/TensorFlow are desirable
- Fields of study
-
- Engineering Sciences
Electrical Engineering & Information Technology
Geodesy & Geoinformatics
Computer Science
Mechatronics & Information Technology
Other fields of study
Remote Sensing and Geoinformatics
Information System Engineering and Management
Supervision
- Title, first name, last name
- M.Sc., Yufan, Chen
- Organizational unit
- Computer Vision for Human-Computer Interaction Lab, Institute for Anthropomatics and Robotics (IAR)
- Email address
- yufan.chen@kit.edu
- Link to personal homepage/profile page
- Website
Application by email
- Application documents
-
- CV
- Transcript of records
Email address for applications
Please send the application documents listed above by email to yufan.chen@kit.edu