Research Statement
The agenda behind fifteen years of work on fully homomorphic encryption, privacy-preserving machine learning, and the silicon to make them practical.
Private computation should be a default, not a feature.
Most of today’s compute runs in the clear. Training data lives in a cluster, inference servers see raw inputs, and analytics pipelines need column-level access. As models scale and inference moves into regulated industries (healthcare, finance, defense), the assumption that data can be exposed to whoever computes on it becomes harder to justify.
Fully homomorphic encryption is the most general tool we have for keeping data encrypted throughout its lifecycle, including during computation. The question is no longer whether FHE works. It does. The question is whether it can be made fast, ergonomic, and infrastructure-ready.
My work has been a sustained attack on that gap, from three angles.
1. Algorithms
The RNS variants of the BFV scheme (IEEE TETC) that I implemented delivered a speedup of roughly two orders of magnitude over CPU baselines and are now standard building blocks in every modern FHE library. The first homomorphic CNN on GPUs (HCNN) (IEEE TETC) showed that encrypted deep learning is not a thought experiment but a measurable benchmark. CareNets (NeurIPS 2019) pushed that result to high-resolution images: a compact packing scheme that fits CNN inputs, weights, and activations into HE ciphertexts, delivering over 32x speedup, 45x better memory efficiency, and a 5851x reduction in transferred message size on encrypted 96x96 and 256x256 retinal images, all within 3% of the plaintext accuracy. PrivFT (IEEE Access) extended the line into text, training and serving private classification on encrypted documents.
2. Systems
Algorithms alone do not close the deployment gap. My systems work started with the first single-GPU CUDA implementation of FV/BFV (IACR TCHES 2018), the foundational GPU-FHE result that set the baseline for every accelerated FHE implementation since. The multi-GPU extension (IEEE TPDS) then scaled FHE workloads across GPU clusters, turning encrypted training and inference into a parallel workload rather than a CPU bottleneck. As a co-author of PALISADE and its successor OpenFHE, I helped build the open-source infrastructure that most FHE research and product work today depends on. As technical lead on DARPA DPRIVE, I drove a $15M effort from concept to a 12 nm, 1 GHz ASIC design for homomorphic ML, with a custom ISA designed around FHE. The architecture and methodology are documented in TREBUCHET (GOMACTech 2025). From compilers to GPU clusters to silicon, the systems work is the rest of the answer.
3. Applications
The goal of the applications work is to push what can be deployed with FHE, not just what can be demonstrated.
In healthcare, cross-institution oncology analysis under multiparty HE (PNAS) showed that federated learning on real cancer-center data is technically achievable end-to-end, and CKKS-based private pathological assessment (Springer BioData Mining, 2024) extended the same machinery to encrypted pathology classification: SVM inference on encrypted patient data paired with a compact feature-extraction pipeline runs in seconds at 128-bit security and matches the accuracy of plaintext baselines.
In financial-sector analytics, FHSVM (Neural Computing and Applications, 2022) ran homomorphic SVM inference for anti-money-laundering classification on encrypted Bitcoin-transaction datasets. Through novel CKKS packing strategies and a parallel implementation, it achieved roughly 1.25 s prediction latency on multi-core CPUs at 128-bit security with zero accuracy loss versus the plaintext model.
On the LLM side, POLARIS is the open-source, model-preserving reference framework I introduced for CKKS-based private LLM inference: encrypted BERT-Tiny and BERT-Mini under GPU acceleration, evaluated on standard, unmodified architectures without retraining or activation-function substitution. It is intended as a proof-of-concept and a shared baseline for the community, paired with the Private LLM Card System (PLCS) for standardized reporting of framework configurations and results across the field.
Cancer-center collaborations, federated learning on encrypted gradients, intrusion detection on encrypted telemetry, and private LLM inference on encrypted prompt embeddings are not toy demos. They are the cases that show what production-grade private computation actually looks like.
The next decade
The frontier is private inference for large generative models. CKKS-based frameworks have now demonstrated end-to-end encrypted inference of LLMs up to 8 billion parameters, but my recent systematization of knowledge (SoK on private LLM inference under approximate HE) identifies a runtime gap of roughly four orders of magnitude between encrypted and plaintext inference as the primary barrier to practical use. CKKS-based inference is now algorithmically feasible on standard, unmodified LLMs from BERT-Tiny up to Llama-3-8B, but it will not be operationally practical for human-facing applications until that efficiency gap is narrowed.
Closing it is not a single-discipline problem. It requires progress on all three pillars at once:
- Algorithms: FHE-friendly approximations of transformer building blocks (softmax, GELU, layer norm), packing layouts for linear and non-linear blocks, and bootstrapping schedules tuned to attention and MLP rather than generic worst-case loops.
- Systems: compilers, hybrid execution stacks, GPU acceleration, and silicon that turn the algorithmic gains into wall-clock latency a user will tolerate.
- Applications: regulated, latency-tolerant use cases (healthcare diagnostics, federated analytics, multi-party machine learning, secure inference for defense) where the privacy guarantee is worth a real latency budget today, and where deployment exercises the rest of the stack.
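To make the first pillar concrete: CKKS evaluates only additions and multiplications, so a non-polynomial activation such as GELU has to be replaced by a polynomial on a bounded input range. The sketch below is a plaintext illustration of that idea only, not the approximation used in POLARIS or in any work cited here. The interval [-4, 4], the degrees, and the helper names `fit_gelu_poly` and `max_abs_error` are assumptions chosen for illustration, and the fit is an ordinary least-squares Chebyshev fit rather than a minimax one.

```python
import math

import numpy as np
from numpy.polynomial import chebyshev as C


def gelu(x):
    """Exact GELU, x * Phi(x), with Phi the standard normal CDF."""
    erf = np.vectorize(math.erf)
    return 0.5 * x * (1.0 + erf(x / math.sqrt(2.0)))


def fit_gelu_poly(degree=8, bound=4.0, samples=2001):
    """Least-squares Chebyshev fit of GELU on [-bound, bound].

    Hypothetical helper: the interval and degree are illustrative choices,
    not parameters taken from any published scheme.
    """
    xs = np.linspace(-bound, bound, samples)
    return C.Chebyshev.fit(xs, gelu(xs), deg=degree, domain=[-bound, bound])


def max_abs_error(poly, bound=4.0, samples=4001):
    """Largest absolute deviation from exact GELU on a dense grid."""
    xs = np.linspace(-bound, bound, samples)
    return float(np.max(np.abs(poly(xs) - gelu(xs))))


if __name__ == "__main__":
    # Higher degree buys accuracy at the cost of more ciphertext multiplications.
    for degree in (4, 8, 16):
        err = max_abs_error(fit_gelu_poly(degree=degree))
        print(f"degree {degree:2d}: max abs error on [-4, 4] = {err:.2e}")
```

The trade-off the script prints is the one an encrypted pipeline has to manage: a higher degree lowers the approximation error but consumes more multiplicative depth, which in turn drives parameter sizes and how often bootstrapping is needed.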
I work across all three and I am looking for collaborators (labs, startups, program committees) who want to make encrypted AI practical.