techire ai Engagement Hub

ML Model Serving Engineer

Want to build the layer that actually makes AI usable in real time?

You’ll join a team focused on inference, where performance is the product. This is about delivering low-latency, high-throughput systems across LLMs, speech, and vision models running in production, not offline experiments.

They’re building real-time AI systems that need to respond instantly, reliably, and at scale. That means solving hard problems around batching, GPU efficiency, memory constraints, and system-level bottlenecks that most teams never fully crack.

You’ll sit at the core of the platform, working across model serving, infrastructure, and performance optimisation. A big part of the role is pushing current tooling beyond its limits, extending frameworks, profiling bottlenecks, and designing systems that hold up under real-world load.

This is not about training models. It’s about making them fast, efficient, and production-ready.

What you’ll work on:

Building high-performance serving systems for LLM, speech, and vision models
Scaling inference to production workloads with strict latency requirements
Optimising GPU utilisation and execution efficiency
Implementing techniques like continuous batching, KV cache optimisation, speculative decoding, and prefill/decode separation
Improving frameworks such as vLLM, TensorRT-LLM, Triton, and SGLang
Profiling and debugging performance across GPU, memory, and system layers

What you’ll bring:

Strong experience with ML inference or model serving systems
Deep understanding of latency and throughput optimisation in production
Solid Python and PyTorch skills, plus a systems or performance engineering mindset
Familiarity with distributed systems and production infrastructure

Exposure to CUDA, GPU profiling tools, or systems like Kubernetes and Ray is useful, but the key is knowing how to make models run efficiently at scale.

You’ll join a highly technical team with experience across major AI labs and big tech. The environment is pragmatic, focused on solving real performance problems rather than abstract research.

There’s real ownership here. You’ll help define how next-generation AI systems are served.

Package:
$220,000 – $320,000 base + equity
San Francisco, onsite 3 days per week

If you’re interested in working on the part of AI that actually determines whether it works in the real world, this is worth exploring.

All applicants will receive a response.

Location:	San Francisco, CA
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	30/04/2026
Job ID:	34247

Research Engineer, Vision

Research Engineer – Computer Vision & Machine Learning

Want to build vision systems that let machines understand the physical world as naturally as we do?

This role sits within a highly technical team developing a new class of computing devices where perception, language, and interaction are tightly integrated. Vision is a core capability. Your work will directly influence how machines see, reason about space, and collaborate with humans in real-world environments.

You’ll join a specialist vision group working across 3D computer vision and machine learning. The problems sit at the boundary between learned models and physical reality, including gaze tracking, SLAM, multi-camera geometry, and systems that explicitly model optics, refraction, and light transport. The focus is on geometry-aware, physically grounded approaches rather than purely pixel-driven modelling.

This is a hands-on research engineering role. You’ll move between reading papers, building and training models, designing datasets, running controlled experiments, and deploying onto real hardware. You’ll work closely with firmware and hardware teams to ensure models operate reliably on-device.

Your work will include:

Developing ML models across 3D perception, tracking, and spatial understanding
Designing model architectures, training pipelines, evaluation frameworks, and inference systems
Working with large-scale, multi-camera and sensor-rich datasets
Translating state-of-the-art research into robust, production-ready systems
Creating new approaches when existing methods do not meet performance or physical constraints

You’ll have genuine technical ownership. The team values clear thinking, strong experimental discipline, and the ability to make informed bets on promising ideas.

You’ll likely bring end-to-end experience building computer vision and ML models, alongside strong familiarity with modern research in 3D or geometry-aware vision. Hands-on experience with PyTorch or JAX is expected, as is comfort working with complex datasets. The ability to operate independently in ambiguous environments is important, as is clear communication across research, hardware, and product teams.

A Bachelor’s degree or higher in computer science, machine learning, computer vision, applied mathematics, or a related field is required. A Master’s or PhD is a plus, particularly if you’ve worked on geometry-aware or physically informed modelling approaches. Experience deploying ML systems into real products or working in high-ownership startup environments would be valuable.

Compensation: $190,000 - $320,000 base (depending on experience) + equity
Benefits: 401(k) matching, 100% employer-paid health, vision, and dental insurance, unlimited PTO and sick time, medical FSA matching
Location: San Francisco, on-site collaboration required

If you’re motivated by building geometry-aware vision systems that connect AI to the physical world in meaningful ways, we’d like to hear from you!

All applicants will receive a response.

Location:	San Francisco, CA
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	30/04/2026
Job ID:	34942

Staff Research Engineer

Most AI systems work in demos. Very few hold up in real customer environments.

This team is building the decision-making systems behind AI agents that operate across voice, chat, and email — where performance is measured in outcomes, not benchmarks.

You’ll work on models that need to reason over time, handle multi-step workflows, and stay consistent across entire interactions. Not just once, but repeatedly, under real-world constraints.

This is applied research that ships. You’ll take ideas from early concept through to production, owning how systems behave when deployed at scale.

The challenge is not just capability. It’s reliability — making reasoning systems that can operate across long-context interactions, manage memory, use tools, and execute workflows without breaking down.

You’ll be working closely with product and engineering teams, iterating on real-world failures, and improving systems based on how they actually perform in production.

What you’ll work on

Designing and improving reasoning systems for real-world agent workflows
Building and refining memory, retrieval, and multi-step execution systems
Developing post-training and evaluation approaches for deployed models
Iterating on systems based on real user behaviour and performance
Taking research ideas through to production environments

What they’re looking for

Experience working on LLM systems in production
Background in RL, post-training, or agent-based systems
Experience building systems involving memory, reasoning, or tool use
Strong engineering fundamentals and ability to ship end-to-end systems
Clear understanding of how models behave outside of controlled environments

Why this role

Work on systems judged by real users, not offline metrics
Direct ownership of how models behave in production
High autonomy in a fast-moving, product-driven team
Real-world complexity, not sandboxed problems

Package

📍 San Francisco or London (on-site)
💰 $200K–$400K base + equity

All applicants will receive a response.

Location:	SF, onsite
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	27/04/2026
Job ID:	35338

Research Scientist

What if AI systems could run full research loops — not just generate outputs, but form hypotheses, design experiments, and produce new scientific insight?

This team is building autonomous AI scientists that do exactly that. Their systems ingest large bodies of scientific literature, reason across them, and generate traceable outputs already used by teams in life sciences.

The problem is no longer getting models to produce plausible answers. It’s pushing them to plan, explore, and iterate across complex domains — reliably, and at scale.

You’ll join a team working at the edge of this shift, developing models that move beyond instruction following into structured, multi-step scientific reasoning.

This is not research in isolation. Your work will be deployed into real systems used by scientists, where model behaviour directly impacts what the platform can discover.

You’ll work closely with engineers and domain experts across biology and chemistry, translating open-ended problems into systems that can be trained, evaluated, and improved in production.

The company originated from one of the earliest groups working seriously on AI for science, including early language agents and AI-generated discoveries. They’re now pushing further with systems capable of long-horizon reasoning across huge amounts of data.

They’ve primarily focused on post-training and reasoning so far, and are now moving into pre-training their own models to support this end-to-end.

What you’ll work on

Developing models that can reason across long-horizon scientific problems
Designing post-training methods to improve multi-step decision making
Working on sampling, exploration, and evaluation in complex environments
Building systems that move from research ideas into production workflows
Collaborating with scientists to define problems and validate outputs

What they’re looking for

Strong background in machine learning research (RL, representation learning, or related areas)
Experience pre-training or post-training LLMs
Track record of applying ML to real-world or complex domains
Strong programming skills (PyTorch, JAX, or similar)
Ability to work across research and applied systems

Why this role

Work on systems that aim to automate scientific discovery
Direct impact on real-world research and outcomes
Small, high-calibre team across AI and science
Real traction, not just prototypes

Package

📍 San Francisco (on-site or hybrid). Other locations considered: NYC, London.
💰 $200K–$400K base + stock

All applicants will receive a response.

Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	27/04/2026
Job ID:	35536

Full Stack Software Engineer

Want to build the systems that make AI actually useful inside real companies?

This Series A startup is tackling one of the hardest problems in enterprise AI. Models are generic. Company processes aren’t. They’re building AI agents that learn how work actually happens, then run those operations end-to-end.

Backed by top-tier investors, they’ve built a deeply technical team across engineering, AI research, and strategy. The focus is simple. Build things properly, with people who care about the craft.

You’ll join as an early full stack engineer, shaping both the product and the foundations it’s built on.

The work sits around the models, not inside them. You’ll build the platform, workflows, and interfaces that make AI usable in real-world environments. That means designing systems that are reliable, observable, and genuinely pleasant to work with, both for users and other engineers.

There’s no separation between building and shipping here. You’ll take ideas from whiteboard to production, owning the outcome end to end. The bar is high, but so is the autonomy.

You’ll spend time designing clean API contracts, modelling data properly, and building frontends that don’t fight you six months later. Velocity matters, but not at the expense of quality.

Your focus will include:

Designing and building backend systems in Python using FastAPI, from API design through to database schema and infrastructure
Creating high-quality frontend experiences in TypeScript and React, with strong typing and clean component architecture
Building shared libraries, internal tooling, and component systems that improve how the whole team ships
Owning problems end to end, from shaping ambiguous requirements through to production deployment
Developing integrations, connectors, and data pipelines that tie the platform into external systems

You’ll also have real input into how the product evolves, working closely with design and product to understand how customers use what you build.

This is greenfield work. The decisions you make now will compound over time.

They’re looking for engineers who care about how things are built, not just that they work.

You enjoy writing Python and TypeScript to a high standard, with strong typing and clear structure
You think carefully about data models and take pride in getting schema design right
You’ve built libraries, SDKs, or internal tooling that improved developer experience
You’re comfortable owning problems end to end, even with ambiguity
You have good product instinct and care about how things feel to use

Experience-wise, around 3+ years is a useful guide, but what matters more is how you think and build.

You’ll join a small, high-calibre team where you can influence tooling, patterns, and technical direction from day one.

Compensation: Up to $250,000 base + equity
Location: New York (in-person)

If building foundational systems properly, with real ownership, sounds like your kind of environment, it’s worth a conversation.

Location:	NYC
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	27/04/2026
Job ID:	35899

Frontend Engineer

Want to build product experiences for AI agents that actually understand how companies operate?

This team is tackling a core limitation in enterprise AI. Models are general, but workflows are not. They’re building agents that learn how processes really run, then execute them. This is not surface-level UI work; you’re building the interface layer to systems that operate directly inside real business workflows.

They’re a Series A company backed by Sequoia, with a deeply technical team across engineering and AI. As an early frontend engineer, you won’t just build features; you’ll shape how the product feels, how it’s structured, and how other engineers build on top of it.

You’ll work primarily in TypeScript and React, building high-quality, user-facing experiences that sit on top of complex AI systems. Strong typing, clean abstractions, and thoughtful API design matter here. This is a team that values well-modelled systems over quick fixes.

There’s no separation between building and shipping. You’ll take ideas from concept to production, owning decisions across architecture, UX, and implementation. You’ll work closely with design, contributing to interface decisions and helping define how users interact with agent-driven workflows.

You’ll also have real influence on frontend architecture, tooling, and patterns. Whether that’s building a component library, shaping state management decisions, or improving how the frontend integrates with backend systems, your decisions will compound as the team scales.

What you’ll focus on:

Building production-grade frontend applications using TypeScript and React
Designing and contributing to frontend architecture, patterns, and component systems
Collaborating closely with design to shape UX and interface decisions
Owning features end-to-end, from scoping through to deployment
Debugging across the stack, including tracing issues into API and backend layers

What you’ll bring:

Strong experience with TypeScript, with a focus on well-typed, maintainable code
Solid React fundamentals, building scalable and performant interfaces
Experience designing APIs, data models, or internal tooling that improves developer workflows
Good product and interaction judgement, comfortable working closely with design
Comfort owning ambiguous problems and turning them into clear, deliverable solutions

You’ll likely have around 3+ years in software engineering, but what matters more is how you think about systems. If you enjoy building from scratch, care about clean abstractions, and take pride in code that other engineers enjoy working with, you’ll fit well here.

Bonus if you’ve worked with Next.js, built design systems, or developed libraries and SDKs. But strong fundamentals in React and TypeScript are the priority.

This is a frontend-focused role, but you’ll be expected to understand the wider system. You should be comfortable debugging issues beyond the UI when needed.

Comp: Base salary up to $250,000 + equity
Location: New York (Also growing in London)

If you’re motivated by building thoughtful systems, not just shipping features, this is the kind of environment where your work compounds over time.

Location:	NYC
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	27/04/2026
Job ID:	35866

Backend Engineer

Want to build the infrastructure that makes AI agents actually work inside real companies?

AI models are powerful, but they’re generic. Enterprise workflows aren’t. This team is solving that gap, building a learning layer that turns messy internal context into structured, executable systems that AI agents can actually use.

You’ll join a deeply technical team working on a platform that learns from tickets, Slack, emails, logs, and knowledge bases, then converts that into versionable “skills” for AI. Think of it as a “GitHub for context”, a system that makes company knowledge readable, maintainable, and executable.

This isn’t model training. It’s everything that makes models useful in production.

You’ll design and build the backend systems that power this layer: APIs, data models, integrations, and tooling that connect into real enterprise environments like ServiceNow, Jira, Zendesk, and Salesforce. The platform is already operating at serious scale, processing vast amounts of operational data across large organisations.

The work is high ownership. You won’t be handed tickets. You’ll take problems from idea to production, shaping architecture, building systems, and seeing how they perform in real-world use.

Your focus will include:

Building backend systems in Python (FastAPI), from API design through to database schema
Creating integrations, connectors, and data pipelines across enterprise tools
Developing internal tooling and libraries that improve engineering velocity
Owning systems end-to-end, including deployment and observability

You’ll enjoy this if you care about how software is built. Strong typing, clean interfaces, and well-structured data models aren’t afterthoughts here; they’re core to how the team works.

You’re likely someone who takes pride in designing schemas properly, enjoys building systems other engineers rely on, and prefers thoughtful, robust solutions over quick fixes.

The company has raised a $28M Series A led by Sequoia and is already working with large enterprise environments, processing data at significant scale. It’s still early enough that your decisions will shape the platform and engineering culture long-term.

Package:

Comp: $190K - $250K + meaningful equity
Location: New York (also expanding in London)

If you’re interested in building the systems that make AI actually usable in the real world, this is worth exploring.

Location:	NYC
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	27/04/2026
Job ID:	35833

ML Engineer

Define how large-scale AI systems for scientific discovery are actually built, trained, and run in production.

This team is building autonomous AI scientists that run full research loops — ingesting large bodies of literature, forming hypotheses, designing experiments, and producing traceable outputs already used across biotech and pharma.

The challenge isn’t just model capability. It’s building the systems that allow these models to be trained, evaluated, and deployed reliably at scale.

You’ll sit at the intersection of model training and systems — owning the infrastructure, pipelines, and experimentation platforms that make long-horizon reasoning systems possible.

This is not research in isolation. It’s building the engine that research runs on.

You’ll work closely with the wider team, translating ambiguous scientific problems into systems that can be trained, iterated on, and deployed in real-world environments.

The company comes from one of the earliest groups working seriously on AI for science, including early language agents and AI-generated biological discoveries. They’re now pushing further with systems capable of reasoning across thousands of papers and large-scale analyses, and moving toward pre-training their own models end-to-end.

The platform is already operating at scale, with tens of thousands of users and millions of queries, and is actively used in scientific workflows today.

What you’ll work on

Building and scaling training pipelines for large-scale LLM systems
Developing experimentation platforms that enable fast, reliable iteration
Designing data pipelines and systems for observability and reproducibility
Improving how training runs are orchestrated, monitored, and debugged
Supporting model deployment and inference for complex reasoning systems
Working closely with researchers to translate ideas into production systems

What they’re looking for

Experience building and scaling ML systems in production
Strong background across model training, data pipelines, and deployment
Experience with large-scale training or distributed systems
Fluency in frameworks like PyTorch, JAX, or similar
Strong engineering fundamentals and systems thinking
Ability to operate across ambiguity and own problems end-to-end

The company

~$70M raised, with another round planned
Platform already at meaningful scale (tens of thousands of users, hundreds of millions of lines of code written by the agent)
Strong commercial traction
Small, high-calibre team working at the intersection of AI and science

📍 San Francisco (on-site or hybrid, remote considered case by case)
💰 $250K–$400K base + equity
Levels: Senior, Staff, Principal
Roles available: ML Engineer, ML Infra, Research Engineers & Research Scientists

All applicants will receive a response.

Location:	San Francisco, CA
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	16/04/2026
Job ID:	35767

Frontend SWE

Want to build the interface layer for an AI scientist?

You’ll join a team building autonomous AI agents designed to accelerate scientific discovery. The premise is simple: science moves too slowly, and they’re building systems that can change that.

This isn’t a typical frontend role. The product is an integrated research environment where scientists interact directly with AI models, workflows, and generated insights. Your work defines how usable that system actually is.

You’ll sit within the Platform team, working closely with researchers and product to turn complex, often messy scientific workflows into clear, intuitive interfaces.

The challenge is translating depth into clarity without losing fidelity.

You’ll be building high-performance frontend systems where data density, responsiveness, and usability all matter. Real-time interactions, dynamic visualisations, and scalable UI patterns are core to the product.

Your focus will include:

Building performant React applications for data-heavy workflows
Designing interfaces for real-time AI interactions and streaming data
Creating modular, scalable design systems used across the platform
Translating scientific and model outputs into usable visual interfaces

You’ll need strong frontend fundamentals, but more importantly, the ability to think in systems. That means understanding how users navigate complexity, how interfaces guide decision-making, and how performance impacts usability at scale.

There’s a strong emphasis on performance engineering. You’ll be profiling rendering behaviour, optimising asset loading, and ensuring smooth interaction across browsers and devices.

The product itself sits at the intersection of AI, biology, and research tooling. If you’ve worked on complex internal tools, data platforms, or visualisation-heavy applications, this will feel familiar, just at a deeper technical level.

You’ll likely have experience building production frontend systems with React (or similar), working with TypeScript, and handling real-time data flows such as WebSockets or GraphQL subscriptions. Experience with visualisation libraries like D3, Deck.gl or Three.js is highly relevant here.

The environment is highly collaborative. You’ll work closely with researchers to anticipate how the product should evolve, not just respond to specs.

This is an onsite role based in San Francisco, working with a team focused on building something that genuinely pushes forward how science gets done.

Salary: $175,000 – $240,000 + equity
Location: San Francisco, onsite

If you’re interested in shaping how scientists interact with AI systems, apply today.

Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	01/04/2026
Job ID:	35602

Principal Full Stack Engineer

Want to build systems that actually hold up under long-running AI workloads?

Most agentic systems for science don’t fail at the model layer. They fail because the infrastructure can’t support long-horizon execution.

You’ll join a team building autonomous AI agents that run full research cycles: ingesting thousands of papers, forming hypotheses, running experiments, and producing traceable outputs used by real scientific teams.

The challenge is making that work in production.

You’ll own the systems behind it: APIs, data pipelines, and platform architecture designed for long-running workloads, large-scale ingestion, and iterative experimentation loops. This is full-stack in scope, but backend in depth, where system design decisions directly impact what the platform can do.

You’ll be working across:

Backend services in Python or Node, building scalable APIs (FastAPI/REST)
Data pipelines supporting agent execution and scientific workflows
Cloud infrastructure (AWS/GCP), containerisation (Docker, Kubernetes)
CI/CD, observability, and reliability for systems under continuous load

This isn’t a generalist full-stack role. You’ll need to understand how systems behave under heavy data and compute demands, and be comfortable making architectural trade-offs across distributed systems.

The team is small, high-calibre, and already running real workloads with revenue traction. Backed by $70M+, they’re building infrastructure that defines how AI is applied to scientific discovery.

Salary: $200,000–$350,000 + equity
Location: San Francisco (onsite)

Location:	San Francisco, CA
Job type:	Permanent
Emp type:	Full-time
Salary type:	Annual
Salary:	negotiable
Job published:	30/03/2026
Job ID:	35569
