The increasing volume of digital images containing written text has created a need for advanced algorithms that not only recognise but also localise and comment on this textual content. Traditional Optical Character Recognition (OCR) technology excels at recognising text within images but falls short in localising and interpreting it and in providing context-aware comments.
Vision-language models have made significant strides in integrating visual and textual data, enabling more comprehensive analysis of multimedia content. However, current models often lack the capability to accurately perceive spatial information in images.
Due to these limitations, how to recognise, localise and comment on text in digital images jointly and accurately is still an open question. This project aims to address these gaps by equipping vision-language models with the ability to understand the spatial relationships between textual elements and other objects in an image. By enhancing models with spatial knowledge, the proposed methodology will enable them to provide insightful comments on written text within digital images.
This PhD project aspires to pioneer a transformative approach in multimedia data analysis by developing vision-language models that seamlessly integrate recognition, localisation, and contextual commentary of written text in digital images. The ambition is to push the boundaries of current technology, creating models that can understand and interpret the spatial and semantic context of text within images. By achieving this, the project aims to set new standards for AI applications in digital media analysis, content moderation, and assistive technologies. The long-term vision and impact include advancing vision-language models, making significant contributions to both academic research and industrial applications such as robotics and AI healthcare.
The successful candidate will work closely with the Microsystems Group at Newcastle University. Experiments with vision-language models will be carried out in Python, C/C++ and MATLAB using a deep learning framework (e.g. PyTorch, TensorFlow or Keras).
Entry Requirements
To be considered for the project, candidates must be highly self-motivated and meet the Essential Criteria. They are also expected to meet the Desirable Criteria as follows:
Essential Criteria:
Desirable Criteria:
Newcastle University is committed to being a fully inclusive Global University which actively recruits, supports and retains colleagues from all sectors of society. We value diversity as well as celebrate, support and thrive on the contributions of all our employees and the communities they represent. We are proud to be an equal opportunities employer and encourage applications from everybody, regardless of race, sex, ethnicity, religion, nationality, sexual orientation, age, disability, gender identity, marital status/civil partnership, pregnancy and maternity, as well as being open to flexible working practices.
Dr. Zhuang Shao (Zhuang.Shao@newcastle.ac.uk)