Research Engineer
Privacy
Role Overview
We are looking for a Research Engineer to sit at the intersection of applied science and product engineering.
You will join a small team responsible for evaluating new product directions before they reach our engineering org: assessing feasibility, running rapid experiments, and owning the ML foundations that power what we build.
This is neither a pure research role nor a pure engineering role: you will read a product requirements document (PRD), determine whether the idea is worth pursuing, prove it with code, and then stay in the room to help build it.
Your focus is privacy-preserving ML. Our privacy stack offers two core avenues, query-based mechanisms and differentially private (DP) synthetic data, and you will own the research layer across both: designing experiments, iterating on pipelines, evaluating results, and advancing what our platform can offer.
The role combines rapid technical judgment, applied mathematics, disciplined experimentation, and close collaboration with product and engineering teams in a high-ownership environment.
Key Responsibilities
Feasibility & Architecture
Receive PRDs and produce clear feasibility assessments with risk-rated recommendations, including early-stage flags when a product idea is fundamentally at odds with meaningful privacy guarantees
Author architecture documents that bridge research findings and engineering implementation, forming the basis for user stories and sprint planning
Collaborate with product and engineering teams to translate research outcomes into buildable specs
Rapid Experimentation
Design and execute time-boxed experiments in sandboxed environments to validate or invalidate product hypotheses quickly
Build lightweight MVPs to demonstrate technical viability before full engineering investment
Know when to stop - ruthlessly prioritise signal over polish in the exploration phase
Differential Privacy Research & Implementation
Own the full DP experimentation lifecycle: mechanism selection, privacy budget management, utility evaluation, and formal accounting, knowing when a privacy guarantee is genuinely meaningful and when it is not
Investigate and iterate on query-based DP mechanisms, evaluating their utility and privacy tradeoffs in the context of our data access pipelines
Read and implement from current privacy-ML literature (NeurIPS, ICML, CCS, USENIX Security); translating research papers into production-ready code is a core part of the job
DP Synthetic Data
Research, implement, and iterate on DP synthetic data generation algorithms, owning the full experimental pipeline from training to evaluation
Evaluate synthetic data quality rigorously across statistical fidelity, downstream utility, and privacy leakage, including membership inference attacks
Identify promising directions in the synthetic data literature and assess their applicability to our stack
Privacy Infrastructure & Tooling
Build and maintain internal libraries for private training, privacy accounting, and audit tooling
Proactively identify gaps in our privacy infrastructure and surface them as prioritised proposals for the engineering roadmap
ML Infrastructure & Experiment Tracking
Set up and maintain experiment tracking infrastructure and enforce rigorous logging discipline across the team
Lead data curation efforts: sourcing, cleaning, versioning, and documenting datasets
Build reusable research infrastructure - evaluation harnesses, baseline suites, synthetic data benchmarks - that accelerates iteration speed
Regulatory & Compliance Liaison
Translate formal privacy guarantees into plain-language risk assessments that product, legal, and compliance stakeholders can act on
Required Qualifications
Education
MS in Computer Science, Mathematics, Physics, Statistics, or a related quantitative discipline
PhD strongly preferred
Experience & Technical Depth
Strong mathematical background: probability theory, statistics, and ideally measure theory or information theory; you are comfortable reading formal proofs and translating them into code
Demonstrable, hands-on experience implementing differential privacy in ML workflows: not just awareness, but working, production-quality code
Deep understanding of privacy accounting and composition; you know what a privacy budget means in practice and can defend your choices to a non-technical audience
Experience with DP synthetic data generation, including designing, running, and evaluating experiments, and reasoning rigorously about the privacy-fidelity tradeoff
Familiarity with query-based privacy mechanisms and their practical application in data access pipelines
Strong Python engineering skills with a genuine commitment to clean, performant, maintainable code; you apply SOLID principles in research contexts as readily as in production
Docker proficiency; our entire platform is containerised, and you must be comfortable building, debugging, and composing multi-service environments daily
Comfort operating at the boundary of engineering and applied mathematics; you write proofs and PRs in the same week
Strong written communication: your architecture docs and feasibility assessments are readable by engineers and product managers alike
Nice to Have
Experience evaluating synthetic data quality, including statistical fidelity metrics, downstream utility benchmarks, and membership inference attacks
Familiarity with complementary privacy techniques, such as secure multi-party computation or homomorphic encryption
Experience with privacy-preserving ML in sensitive domains such as healthcare, finance, or other regulated industries
Exposure to GenAI or LLM-based systems where privacy constraints apply, for example private fine-tuning or anonymisation pipelines
TypeScript proficiency; comfort working across the stack accelerates prototyping and production handoff
Tech Stack Indicators
Python
PyTorch
Opacus / Google DP Library / OpenDP
NumPy / SciPy
MLflow / W&B
Docker
TypeScript (desirable)
Languages
Fluent English (mandatory)
Italian is a plus
What We Offer
A central role in shaping privacy-preserving product directions before full engineering investment
Direct exposure to differential privacy, private data access mechanisms, and synthetic data research in a regulated healthcare-data environment
High ownership across research, architecture, experimentation, and production handoff
The opportunity to build reusable privacy research infrastructure that compounds team velocity
A flexible, international, and mission-driven working environment
