Compiler Engineer · Low-level systems · Class of '26

Aditya Trivedi.

I build the layer between language and silicon: compiler IR, parallel runtimes, and GPU offloading. GSoC '25 at Fortran-Lang, core LFortran contributor, four research papers (one under review), and early upstream LLVM contributions. Joining Qualcomm's ARM compiler team in July 2026, chasing MLIR and ML compilers along the way.

↗ Focus: Compiler IR · Codegen
↗ Upstream: LLVM · LFortran
↗ Papers: 4 · HiPC, EuroPDP, FGCS
↗ GSoC: '25 · Fortran-Lang

Trajectory — where I am, where I'm headed

Now
Open-source compilers
LFortran core contributor · LLVM / ClangIR upstream · GSoC '25 with Fortran-Lang.
Exploring
MLIR & ML compilers
Where I'm pointing my curiosity — tensor IR, accelerator codegen, the ML-compiler stack.

01 / Featured

SECTOR 01 · FLAGSHIP
2 transmissions

The work that defined
the year.

01
TRANSMISSION

Google Summer of Code 2025 · Fortran-Lang · May → Sept

OpenMP 6.0 Infrastructure & GPU Offloading in LFortran

C++ · Fortran · OpenMP · CUDA · LLVM / ASR

LFortran is a from-scratch Fortran compiler targeting LLVM and C backends. My GSoC project built out OpenMP support deep enough to eventually enable GPU execution of Fortran, something no open-source Fortran compiler handles cleanly today.

The architectural decision: OpenMP handling was entangled with the DoConcurrentLoop path. I proposed and implemented a dedicated OMPRegion ASR node — giving OpenMP first-class representation in the IR tree. This unlocked structurally clean nested and hierarchical parallelism.
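As a rough illustration of why a dedicated region node helps, the sketch below models OpenMP regions as a small tree in which nesting is simply parent-child structure. The names `Region`, `Kind`, `make_region`, and `nesting_depth` are hypothetical and chosen for this example only; this is not LFortran's actual ASR definition of OMPRegion.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a dedicated OpenMP region node.
// NOT LFortran's real ASR schema; it only illustrates the tree shape.
enum class Kind { Parallel, Teams, Task, Loop };

struct Region {
    Kind kind;
    std::vector<std::string> clauses;           // e.g. "private(i)"
    std::vector<std::unique_ptr<Region>> body;  // nested regions
};

// Build a region node with optional clauses and an empty body.
std::unique_ptr<Region> make_region(Kind kind,
                                    std::vector<std::string> clauses = {}) {
    auto region = std::make_unique<Region>();
    region->kind = kind;
    region->clauses = std::move(clauses);
    return region;
}

// Depth of the deepest chain of nested regions, counting the root as 1.
int nesting_depth(const Region& region) {
    int deepest_child = 0;
    for (const auto& child : region.body) {
        int depth = nesting_depth(*child);
        if (depth > deepest_child) deepest_child = depth;
    }
    return 1 + deepest_child;
}

// Example hierarchy: teams -> parallel -> loop.
std::unique_ptr<Region> example_tree() {
    auto teams = make_region(Kind::Teams, {"num_teams(4)"});
    auto parallel = make_region(Kind::Parallel);
    parallel->body.push_back(make_region(Kind::Loop, {"private(i)"}));
    teams->body.push_back(std::move(parallel));
    return teams;
}
```

With regions represented this way, a pass can recurse over `body` instead of special-casing each construct inside a loop node, which is the kind of structural cleanliness the paragraph above describes.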

On that foundation I implemented 13+ constructs and 8+ clauses across the thread, team, and task models, then extended the C backend to emit compilable host-device code for OpenMP Target Offloading on NVIDIA GPUs. A custom GPU emulator under 250 lines lets CI run without physical hardware.
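One plausible way to exercise device-style code without a GPU is to replay a grid of blocks and threads sequentially on the CPU. The sketch below assumes that approach; `Dim`, `Kernel`, `emulate_launch`, and `vector_add` are hypothetical names, and the design of the emulator used in the project may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical CPU-side stand-in for a one-dimensional GPU launch.
struct Dim { unsigned x; };

// A kernel receives (block index, thread index, block size).
using Kernel = std::function<void(unsigned, unsigned, unsigned)>;

// Run every (block, thread) pair sequentially on the host.
void emulate_launch(Dim grid, Dim block, const Kernel& kernel) {
    for (unsigned b = 0; b < grid.x; ++b)
        for (unsigned t = 0; t < block.x; ++t)
            kernel(b, t, block.x);
}

// Element-wise add written in the usual "one thread per element" style.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    const unsigned threads = 4;
    const unsigned blocks =
        static_cast<unsigned>((a.size() + threads - 1) / threads);
    emulate_launch({blocks}, {threads},
                   [&](unsigned bi, unsigned ti, unsigned bdim) {
        std::size_t i = static_cast<std::size_t>(bi) * bdim + ti;
        if (i < c.size()) c[i] = a[i] + b[i];  // guard the ragged last block
    });
    return c;
}
```

Sequential replay checks indexing and bounds logic, not race conditions or performance, so it suits correctness tests in CI rather than tuning.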

13+
OpenMP constructs end-to-end
8+
Clauses — thread, team, task
<250
Lines — custom GPU emulator
12wk
Documented compiler work
02
TRANSMISSION

Open Source · LFortran · Sept 2024 → Sept 2025

LFortran Compiler — Core Contributor

C++ · Fortran · MPI · ISO_C_BINDING

Contributed to compiling POT3D — Predictive Science's MPI + OpenMP solar magnetic field solver used in real space-weather research. The 9th production-grade third-party code LFortran ever compiled.

Built a pure Fortran MPI wrapper library using ISO_C_BINDING with 30+ subroutine implementations — eliminating C-wrapper overhead. Now lives in the fortran_mpi repo under the lfortran org. Separately: 50+ compiler issues resolved across OpenMP, OOP, structs, and strings.

50+
Compiler issues resolved
30+
MPI subroutines, pure Fortran
0.95×
Compile time vs GFortran
0.75×
Runtime vs GFortran
Also upstream

Beyond the two flagship efforts, smaller upstream contributions: a single merged ClangIR x86 rdtsc / rdtscp builtins PR (#180714) in the LLVM monorepo, plus ongoing Flang OpenMP work.
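For context, this is how the timestamp-counter intrinsics are typically used from C++. The sketch assumes GCC or Clang, which provide `__rdtsc` and `__rdtscp` through `<x86intrin.h>`; the helper `ticks_for_loop` and the `HAS_TSC` macro are hypothetical, and the example does not show the ClangIR lowering itself.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtsc, __rdtscp (GCC and Clang)
#define HAS_TSC 1
#else
#define HAS_TSC 0
#endif

// Hypothetical helper: elapsed timestamp-counter ticks for a small loop.
// Returns 0 on targets without the x86 timestamp counter.
std::uint64_t ticks_for_loop(unsigned iterations) {
#if HAS_TSC
    unsigned aux = 0;                  // receives IA32_TSC_AUX from rdtscp
    std::uint64_t start = __rdtsc();
    volatile unsigned sink = 0;        // volatile keeps the loop from being removed
    for (unsigned i = 0; i < iterations; ++i) sink = sink + i;
    std::uint64_t end = __rdtscp(&aux);
    return end - start;
#else
    (void)iterations;
    return 0;
#endif
}
```

Tick counts are not wall-clock time and can vary with frequency scaling and core migration, so they are best treated as a relative measure.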

02 / Experience

SECTOR 02 · TRAJECTORY
2022 → 2026

Where I've shipped.

Jul 2026 · incoming

Qualcomm India — Compiler Engineer

ARM Compiler Team
C++ · LLVM · ARM / AArch64 codegen · backend optimization
  • Joining the ARM compiler team — backend codegen, optimization passes, and toolchain work across Qualcomm's ARM platforms.
  • Bringing low-level LLVM & IR experience into production compiler engineering at scale.
2026

LLVM / ClangIR — Upstream Contributor

x86 builtins · IR lowering
C++ · ClangIR · MLIR · LLVM
  • Merged PR #180714 into the LLVM monorepo — rdtsc / rdtscp x86 timestamp-counter builtins in ClangIR.
  • Active Flang OpenMP contributions upstream.
May → Sept 2025

Google Summer of Code 2025 · Fortran-Lang

Compiler Developer
C++ · Fortran · OpenMP · CUDA · ASR/IR
  • Designed the OMPRegion ASR node — IR abstraction giving OpenMP first-class representation, enabling scalable OpenMP 6.0 work and GPU codegen.
  • 13+ constructs, 8+ clauses — thread, team, and task models.
  • Extended C-backend for OpenMP Target Offloading on NVIDIA GPUs. [Discussion]
  • Custom GPU emulator (<250 lines) for hardware-free CI.
Sept 2024 → Sept 2025

LFortran — Compiler Development Engineer

Core Contributor
C++ · Fortran · MPI · Python
  • Helped compile POT3D solar-physics codebase — 0.95× compile, 0.75× runtime vs GFortran. [Blog]
  • Built pure Fortran MPI wrappers via ISO_C_BINDING, 30+ subroutines. [fortran_mpi]
  • 50+ issues resolved: OpenMP, OOP, structs, string handling.
2022 → 2026

Indian Institute of Technology, Jodhpur

B.Tech, Computer Science · CGPA 8+
Graduation · May 2026
  • HPC research with Prof. Dip Sankar Banerjee on dynamic graph algorithms.
  • Four papers, three years, one problem class.

03 / Research

SECTOR 03 · RESEARCH
4 papers

Four papers, one problem.

Maintaining a Maximal Independent Set on dynamic graphs as edges are inserted and deleted — at billion-edge scale — with parallel GPU and multi-core compute.

Best measured results — 15.64× on insertions and 10.57× on deletions, against the sequential baseline.
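To make the problem concrete, here is a minimal sequential sketch of repairing a Maximal Independent Set after an edge insertion. It is a textbook-style illustration, not the parallel algorithm from the papers; `DynamicMIS` and its members are hypothetical names, deletions are not handled, and the graph is assumed simple (no self-loops or duplicate edges).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential sketch only: maintains an MIS under edge insertions.
struct DynamicMIS {
    std::vector<std::vector<int>> adj;
    std::vector<bool> in_set;

    // With no edges, every vertex belongs to the set.
    explicit DynamicMIS(int n) : adj(n), in_set(n, true) {}

    bool has_neighbor_in_set(int v) const {
        for (int u : adj[v])
            if (in_set[u]) return true;
        return false;
    }

    void insert_edge(int u, int v) {
        adj[u].push_back(v);
        adj[v].push_back(u);
        // Unless both endpoints are in the set, nothing is violated.
        if (!(in_set[u] && in_set[v])) return;
        in_set[v] = false;  // evict one endpoint; u now covers v
        // Neighbors of v may have lost their only covering vertex.
        for (int w : adj[v])
            if (!in_set[w] && !has_neighbor_in_set(w)) in_set[w] = true;
    }

    // Independent (no two set vertices adjacent) and maximal
    // (every other vertex has a neighbor in the set).
    bool is_valid_mis() const {
        for (std::size_t v = 0; v < adj.size(); ++v) {
            bool covered = has_neighbor_in_set(static_cast<int>(v));
            if (in_set[v] == covered) return false;
        }
        return true;
    }
};
```

Only the neighborhood of the evicted endpoint needs re-examination after an insertion, which is what makes incremental maintenance cheaper than recomputing the set from scratch.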

2024 · Origin
IEEE HiPC · SRS

Fast MIS on Incremental Graphs

Aditya Trivedi, P. Nijhara, D.S. Banerjee
2025 · Extends
EuroMicro PDP

Fast Maximal Independent Sets on Dynamic Graphs

P. Nijhara, Aditya Trivedi, D.S. Banerjee
2025 · Refines
IEEE HiPC · SRS

Fast and Accurate MIS on Dynamic Graphs

A.H. Singh, Aditya Trivedi, N. Sharma, A. Pandey, D.S. Banerjee
2025 · Unifies
Elsevier FGCS · Under Review

ParMIS: Fast & Unified MIS Maintenance for Large-Scale Dynamic Graphs

P. Nijhara, Aditya Trivedi, A.H. Singh, N. Sharma, A. Pandey, D.S. Banerjee

04 / Projects

SECTOR 04 · CATALOG
6 objects

Systems work, bottom-up.

PRJ·001 · ARCHITECTURE · MIPS Pipeline Simulator
C++ · Computer Architecture
Cycle-accurate 5-stage MIPS pipeline with hazard detection, forwarding, and branch prediction, modeling stalls and data dependencies at the microarchitecture level. [Source]

PRJ·002 · CONCURRENCY · Multi-Threaded Web Crawler
Go · Concurrency · Channels
Concurrent crawler with a worker pool, bounded queues, and graceful backpressure, tuned for throughput without overwhelming target hosts. [Source]

PRJ·003 · HPC · LLM · Parallel LLM Inference on RISC-V
C · RISC-V · OpenMP · SIMD
Optimized transformer inference on a RISC-V target, with vectorized kernels and threading delivering a 3.42× speedup over the baseline. [Source]

PRJ·004 · ML · PRIVACY · Federated Fraud Detection
Python · Federated Learning
Privacy-preserving fraud model trained across distributed nodes without centralizing sensitive transaction data, with federated aggregation end-to-end. [Source]

PRJ·005 · FULL-STACK · WorkHubPro
React · Node · PostgreSQL
Full-stack workspace and project-management platform with auth, role-based access, and real-time collaboration features. [Source]

PRJ·006 · OPEN SOURCE · fortran_mpi
Fortran · MPI · ISO_C_BINDING
Pure-Fortran MPI wrapper library, 30+ subroutines, eliminating C-wrapper overhead. Maintained under the LFortran organization. [Source]
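The hazard detection and forwarding mentioned for PRJ·001 can be sketched with the classic textbook conditions for a 5-stage MIPS pipeline. This is an illustrative reconstruction under that assumption, not code from the simulator; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pipeline-register fields, reduced to what the checks need.
struct IfId  { std::uint8_t rs, rt; };                 // instruction being decoded
struct IdEx  { bool mem_read; std::uint8_t rs, rt; };  // instruction in EX
struct ExMem { bool reg_write; std::uint8_t rd; };
struct MemWb { bool reg_write; std::uint8_t rd; };

// Load-use hazard: a load in EX writes a register that the instruction
// in ID reads. Forwarding cannot help, so the pipeline stalls one cycle.
bool load_use_stall(const IdEx& id_ex, const IfId& if_id) {
    return id_ex.mem_read &&
           (id_ex.rt == if_id.rs || id_ex.rt == if_id.rt);
}

enum class Forward { RegisterFile, FromExMem, FromMemWb };

// Pick the source for one ALU operand. The EX/MEM result is newer than
// the MEM/WB result, so it takes priority; register 0 is never forwarded.
Forward forward_source(std::uint8_t src, const ExMem& ex_mem,
                       const MemWb& mem_wb) {
    if (ex_mem.reg_write && ex_mem.rd != 0 && ex_mem.rd == src)
        return Forward::FromExMem;
    if (mem_wb.reg_write && mem_wb.rd != 0 && mem_wb.rd == src)
        return Forward::FromMemWb;
    return Forward::RegisterFile;
}
```

In a cycle-accurate model these checks would run once per simulated cycle, with a stall inserting a bubble into ID/EX and holding the PC and IF/ID registers.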

05 / Stack

SECTOR 05 · INSTRUMENTS
5 bands

The tools I reach for.

01 · Languages
C · C++ · Fortran · Rust · Go · Kotlin · Python
02 · Parallel & HPC
OpenMP · CUDA · MPI · SIMD · Pthreads
03 · Compilers & IR
LLVM · MLIR · ClangIR · ASR / IR design · Codegen
04 · Architectures
AArch64 / ARM · ARMv9 · RISC-V · x86
05 · Exploring
MLIR · ML Compilers · Tensor IR · Accelerators

06 / Contact

Let's talk compilers.

Not job hunting — joining Qualcomm's ARM compiler team in July 2026. But always open to community conversations and open-source collaboration. If you work on MLIR, ML compilers, LLVM, or parallel runtimes — or just want to talk IR design — reach out.

Channel open · transmitting