Recognise, Localise, and Comment: Evaluate Written Texts in Digital Images with Vision-Language Deep Learning Models

The increasing volume of digital images containing written text has created a need for advanced algorithms that not only recognise but also localise and comment on this textual content. Traditional Optical Character Recognition (OCR) technology excels at recognising text within images but falls short in localising and interpreting it and in providing context-aware comments.

Vision-language models have made significant strides in integrating visual and textual data, enabling more comprehensive analysis of multimedia content. However, current models often lack the ability to accurately perceive spatial information in images.

Due to these limitations, how to jointly recognise, localise and comment on texts in digital images accurately is still an open question. This project aims to address these gaps by equipping vision-language models with the ability to understand the spatial relationships between textual elements and other objects in an image. By enhancing models with spatial knowledge, the proposed methodology will enable them to provide insightful comments on written texts within digital images.

This PhD project aspires to pioneer a transformative approach in multimedia data analysis by developing vision-language models that seamlessly integrate recognition, localisation, and contextual commentary of written text in digital images. The ambition is to push the boundaries of current technology, creating models that can understand and interpret the spatial and semantic context of text within images. By achieving this, the project aims to set new standards for AI applications in digital media analysis, content moderation, and assistive technologies. The long-term vision and impact include advancing vision-language models, making significant contributions to both academic research and industrial applications such as robotics and AI healthcare.

The candidate for this PhD project will work closely with the Microsystems Group at Newcastle University. Experiments with vision-language learning models will be carried out in Python, C/C++ and MATLAB using a deep learning framework (e.g. PyTorch, TensorFlow or Keras).

Entry Requirements 

To be considered for the project, candidates must be highly self-motivated and meet the Essential Criteria. They are also expected to meet the Desirable Criteria as follows:

Essential Criteria:

  • 2.1 undergraduate degree or MSc Distinction (or international equivalent) in Computer Science, Engineering, Mathematics, Physics, or a closely related subject.
  • Proficient programming skills in Python/C/C++/MATLAB.

Desirable Criteria:

  • 1st Class honours undergraduate degree or MSc Distinction (or international equivalent) in Computer Science, Engineering, Mathematics, Physics, or a closely related subject.
  • Experience of applied machine learning or computer vision algorithms and machine learning frameworks such as PyTorch, TensorFlow or Keras.
  • Excellent communication skills with the ability to explain complex areas.
  • Good track record of publications.

Newcastle University is committed to being a fully inclusive Global University which actively recruits, supports and retains colleagues from all sectors of society. We value diversity as well as celebrate, support and thrive on the contributions of all our employees and the communities they represent.  We are proud to be an equal opportunities employer and encourage applications from everybody, regardless of race, sex, ethnicity, religion, nationality, sexual orientation, age, disability, gender identity, marital status/civil partnership, pregnancy and maternity, as well as being open to flexible working practices.

Application enquiries:

Dr. Zhuang Shao  
