Job Description
Are you passionate about solving fundamental challenges in computer vision and multimodal AI understanding?
You should consider joining a pioneering European AI Agent company as a Research Engineer to help advance the theoretical foundations of how AI systems process and understand complex interfaces. Their mission is to break new ground in AI's ability to comprehend multimodal data from real-world interfaces. The next stage of their journey is to solve core scientific challenges in visual understanding and its relationship with underlying document structures.
The company has established strong research foundations in multimodal AI, and now seeks to achieve scientific breakthroughs in visual-linguistic understanding and structured data interpretation.
Key Research Areas:
- Develop novel architectures for computer vision and structured multimodal understanding
- Advance the theoretical foundations of representation spaces for AI Agents and action models
- Pioneer new multimodal approaches for the best results
- Push the boundaries of structure understanding
Requirements:
- PhD in Computer Science, Machine Learning, or a related field, or a Master's degree plus equivalent practical industry experience
- Deep theoretical understanding of Computer Vision and multimodal learning architectures
- Experience with visual language models (e.g., LLaVA or similar)
- Ability to lead independent research initiatives
- Experience across the full research pipeline, from research and experimentation to building and training models and improving their performance
Nice to have:
- Experience working on structured understanding, document understanding or similar
- Strong publication record in top-tier conferences (NeurIPS, ICML, ICLR, CVPR)
This role offers the opportunity to conduct groundbreaking research in multimodal AI understanding. You'll be investigating fundamental questions at the intersection of vision, language, and structured data interpretation, with the goal of achieving significant scientific breakthroughs.
The company values scientific innovation and provides an environment conducive to pursuing novel research directions. They offer compensation competitive with Tier 1 AI lab salaries, plus attractive equity options.
The position is remote within Europe, with regular travel to meet the team in Paris.
If you're excited about conducting pioneering research in computer vision and multimodal AI and want to contribute to the theoretical foundations of next-generation AI systems, this could be your ideal opportunity to make a lasting impact.
Interested in being part of this AI revolution? Apply today. All applicants will receive a response.