Shuvro Chowdhury’s Personal Portfolio

Welcome

Shuvro Chowdhury, PhD

Postdoctoral Scholar
University of California Santa Barbara
Santa Barbara, CA

Hello, there! I am Dr. Shuvro Chowdhury. This site provides insights into my research interests, publications, academic background, and more.

Currently, I work as a postdoctoral scholar with Kerem Y. Camsari at the OPUS lab, University of California, Santa Barbara. My current research primarily focuses on hardware acceleration for machine learning of classical and quantum many-body systems using probabilistic bits (p-bits). These p-bits are robust, room-temperature-operable entities that fluctuate between two logic states and can be tuned via external signals. They offer a promising avenue for advancements in probabilistic and quantum computing and can be integrated using existing fabrication technologies.
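As a rough illustration of the behavior described above, the sketch below simulates a single p-bit in software. It assumes a model commonly used in the p-bit literature, in which the state is +1 or -1 and the input signal biases the fluctuation through a tanh function. The function names and parameters are hypothetical and are not taken from any specific hardware or codebase.

```python
import math
import random


def pbit_sample(input_signal, rng):
    """Return one p-bit state (+1 or -1) for a given input signal.

    Assumed model: m = sign(tanh(I) - r), with r drawn uniformly from [-1, 1).
    """
    r = rng.uniform(-1.0, 1.0)
    return 1 if math.tanh(input_signal) > r else -1


def pbit_average(input_signal, n_samples=20000, seed=0):
    """Estimate the time-averaged state of the p-bit at a fixed input."""
    rng = random.Random(seed)
    total = sum(pbit_sample(input_signal, rng) for _ in range(n_samples))
    return total / n_samples
```

Under this model, a zero input gives an unbiased fluctuation between the two states, while a large positive or negative input pins the p-bit to +1 or -1; the average state follows tanh of the input.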

My expertise spans Probabilistic and Neuromorphic Computing, Quantum Computing, Machine Learning, and Nanoscale Device Modeling and Simulation. I am an active IEEE member.

I received my Ph.D. from the Elmore Family School of Electrical and Computer Engineering at Purdue University, where I worked under the guidance of Supriyo Datta. My doctoral research focused on quantum emulation with probabilistic computers, inspired by Richard Feynman's insight:

"The only difference between a probabilistic classical world and the equations of the quantum world is that somehow or other it appears as if the probabilities would have to go negative ... "

This notion underlines the potential of quantum computing to leverage negative probabilities, though practical quantum computation remains a challenging goal today.

Feel free to explore the sections below to learn more about my work and contributions. For any inquiries or collaborations, please contact me through the links provided in the sidebar. Thank you for visiting!

Research Interests

My research interests include:

Probabilistic and Neuromorphic Computing
Quantum Computing
Machine Learning
Nanoscale Device Modeling and Simulation
Hardware Acceleration

Academic Degrees

Ph.D. in Electrical and Computer Engineering, Purdue University, 2022
M.S. in Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (Bangladesh), 2014
B.S. in Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (Bangladesh), 2011

Publications

Pushing the boundary of quantum advantage in hard combinatorial optimization with probabilistic computers

October 2025

Recent demonstrations on specialized benchmarks have reignited excitement for quantum computers, yet their advantage for real-world problems remains an open question. Here, we show that probabilistic computers, co-designed with hardware to implement Monte Carlo algorithms, provide a scalable classical pathway for solving hard optimization problems. We focus on two algorithms applied to three-dimensional spin glasses: discrete-time simulated quantum annealing and adaptive parallel tempering. We benchmark these methods against a leading quantum annealer. For simulated quantum annealing, increasing replicas improves residual energy scaling, consistent with extreme value theory. Adaptive parallel tempering, supported by non-local isoenergetic cluster moves, scales more favorably and outperforms simulated quantum annealing. Field Programmable Gate Arrays or specialized chips can implement these algorithms in modern hardware, leveraging massive parallelism to accelerate them while improving energy efficiency. Our results establish a rigorous classical baseline for assessing practical quantum advantage and present probabilistic computers as a scalable platform for real-world optimization challenges.
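For readers unfamiliar with parallel tempering, the sketch below shows the standard replica-exchange step between neighboring temperatures. It is a generic textbook version, not the adaptive variant or the isoenergetic cluster moves used in the paper, and all function and variable names are hypothetical.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Standard replica-exchange acceptance: min(1, exp((b_i - b_j) * (E_i - E_j)))."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def attempt_swaps(betas, energies, states, rng):
    """Try to exchange configurations between each pair of adjacent temperatures."""
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            # Exchange the configurations (and their energies); temperatures stay put.
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return states, energies
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse move is accepted only with an exponentially small probability.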

Accelerated quantum Monte Carlo with probabilistic computers

April 2023

Quantum Monte Carlo (QMC) techniques are widely used in a variety of scientific problems and much work has been dedicated to developing optimized algorithms that can accelerate QMC on standard processors (CPU). With the advent of various special purpose devices and domain specific hardware, it has become increasingly important to establish clear benchmarks of what improvements these technologies offer compared to existing technologies. In this paper, we demonstrate 2 to 3 orders of magnitude acceleration of a standard QMC algorithm using a specially designed digital processor, and a further 2 to 3 orders of magnitude by mapping it to a clockless analog processor. Our demonstration provides a roadmap for 5 to 6 orders of magnitude acceleration for a transverse field Ising model (TFIM) and could possibly be extended to other QMC models as well. The clockless analog hardware can be viewed as the classical counterpart of the quantum annealer and provides performance within a factor of \(< 10\) of the latter. The convergence time for the clockless analog hardware scales with the number of qubits as \(\sim N\), improving the \(\sim N^2\) scaling for CPU implementations, but appears worse than that reported for quantum annealers by D-Wave.

All-to-all reconfigurability with sparse and higher-order Ising machines

October 2024

Domain-specific hardware to solve computationally hard optimization problems has generated tremendous excitement. Here, we evaluate probabilistic bit (p-bit) based Ising Machines (IM) on the 3-regular 3-Exclusive OR Satisfiability (3R3X) problem as a representative hard optimization problem. We first introduce a multiplexed architecture that emulates all-to-all network functionality while maintaining highly parallelized chromatic Gibbs sampling. We implement this architecture in single Field-Programmable Gate Arrays (FPGA) and show that running the adaptive parallel tempering algorithm demonstrates competitive algorithmic and prefactor advantages over alternative IMs by D-Wave, Toshiba, and Fujitsu. We also implement higher-order interactions that lead to better prefactors without changing algorithmic scaling for the XORSAT problem. Even though FPGA implementations of p-bits are still not quite as fast as the best possible greedy algorithms accelerated on Graphics Processing Units (GPU), scaled magnetic versions of p-bit IMs could lead to orders of magnitude improvements over the state of the art for generic optimization.
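The chromatic Gibbs sampling mentioned in this abstract can be sketched in a few lines. The idea is to color the graph so that no two connected nodes share a color, then update all nodes of one color together, since none of them depends on another. The code below is a minimal software sketch under that assumption, using a hypothetical even-length ring as the test problem; it is not the multiplexed FPGA architecture from the paper.

```python
import math
import random


def chromatic_gibbs_sweep(spins, neighbors, biases, colors, beta, rng):
    """Run one sweep, updating each color class in turn (modifies spins in place)."""
    for color in sorted(set(colors)):
        group = [i for i, c in enumerate(colors) if c == color]
        # Same-colored nodes share no edge, so all of their inputs can be
        # computed from the same snapshot of the spins (in hardware: in parallel).
        inputs = [biases[i] + sum(w * spins[j] for j, w in neighbors[i]) for i in group]
        for i, local_field in zip(group, inputs):
            # Gibbs update: P(spin = +1) = (1 + tanh(beta * local_field)) / 2.
            spins[i] = 1 if math.tanh(beta * local_field) > rng.uniform(-1.0, 1.0) else -1
    return spins


def ring_problem(n, coupling, bias):
    """Hypothetical test problem: a ring with even n, which is 2-colorable."""
    neighbors = [[((i - 1) % n, coupling), ((i + 1) % n, coupling)] for i in range(n)]
    colors = [i % 2 for i in range(n)]
    return neighbors, [bias] * n, colors
```

With two colors, a full sweep of the ring takes two update phases regardless of its length, which is the property that makes the scheme attractive for massively parallel hardware.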

Scalable connectivity for Ising machines: Dense to sparse

July 2025

In recent years, hardware implementations of Ising machines have emerged as a viable alternative to quantum computing for solving hard optimization problems, among other applications. Unlike quantum hardware, dense connectivity can be achieved in classical systems; however, we show that dense connectivity leads to severe frequency slowdowns and interconnect congestion, scaling unfavorably with system size. As a scalable solution, we propose a systematic sparsification method for dense graphs by introducing copy nodes to limit the number of neighbors per graph node. In addition to solving interconnect congestion, this approach enables constant frequency scaling, where all spins in a network can be updated in constant time. Nonetheless, sparsification introduces new difficulties, such as constraint-breaking between copied spins and increased convergence times to solve optimization problems, especially if exact ground states are sought. Relaxing the exact-solution requirements, we find that the overheads in convergence times are milder. We demonstrate these ideas by designing probabilistic bit Ising machines using ASAP7 (a predictive 7-nm fin field-effect-transistor technology model) process design kits as well as field-programmable-gate-array-based implementations. Finally, we show how formulating problems in naturally sparse networks (e.g., by invertible logic) sidesteps challenges introduced by sparsification methods. Our results are applicable to a broad family of Ising machines using different hardware implementations.

Emulating Quantum Circuits With Generalised Ising Machines

October 2023

The primary objective of this paper is to present an exact and general procedure for mapping any sequence of quantum gates onto a network of probabilistic bits (p-bits), each of which can take on one of two values, 0 and 1. The first \(n\) p-bits represent the input qubits, while the other p-bits represent the qubits after the application of successive gating operations. We can view this structure as a Boltzmann machine whose states each represent a Feynman path leading from an initial configuration of qubits to a final configuration. Each such path has a complex amplitude \(\psi\) which can be associated with a complex energy. The real part of this energy can be used to generate samples of Feynman paths in the usual way, while the imaginary part is accounted for by treating the samples as complex entities, unlike ordinary Boltzmann machines where samples are positive. Quantum gates often have purely imaginary energy functions for which all configurations have the same probability and one cannot take advantage of sampling techniques. Typically this would require us to collect \(2^{nd}\) samples which would severely limit its utility. However, if we can use suitable transformations to introduce a real part in the energy function then powerful sampling algorithms like Gibbs sampling can be harnessed to get acceptable results with far fewer samples and perhaps even escape the exponential scaling with \(nd\). This algorithmic acceleration can then be supplemented with special-purpose hardware accelerators like Ising Machines which can obtain a very large number of samples per second through a combination of massive parallelism, pipelining, and clockless mixed-signal operation made possible by codesigning circuits and architectures to match the algorithm.
Our results for mapping an arbitrary quantum circuit to a Boltzmann machine with a complex energy function should help push the boundaries of the simulability of quantum circuits with probabilistic resources and compare them with NISQ-era quantum computers.

A full-stack view of probabilistic computing with p-bits: devices, architectures and algorithms

March 2023

The transistor celebrated its 75th birthday in 2022. The scaling of the transistor defined by Moore’s law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning (ML) and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.

Training Deep Boltzmann Networks with Sparse Ising Machines

June 2024

The slowing down of Moore's law has driven the development of unconventional computing paradigms, such as specialized Ising machines tailored to solve combinatorial optimization problems. In this paper, we show a new application domain for probabilistic bit (p-bit) based Ising machines by training deep generative AI models with them. Using sparse, asynchronous, and massively parallel Ising machines we train deep Boltzmann networks in a hybrid probabilistic-classical computing setup. We use the full MNIST and Fashion MNIST (FMNIST) datasets without any downsampling and a reduced version of the CIFAR-10 dataset in hardware-aware network topologies implemented in moderately sized Field Programmable Gate Arrays (FPGA). For MNIST, our machine using only 4,264 nodes (p-bits) and about 30,000 parameters achieves the same classification accuracy (90%) as an optimized software-based restricted Boltzmann Machine (RBM) with approximately 3.25 million parameters. Similar results follow for FMNIST and CIFAR-10. Additionally, the sparse deep Boltzmann network can generate new handwritten digits and fashion products, a task the 3.25 million parameter RBM fails at despite achieving the same accuracy. Our hybrid computer takes a measured 50 to 64 billion probabilistic flips per second, which is at least an order of magnitude faster than superficially similar Graphics and Tensor Processing Unit (GPU/TPU) based implementations. The massively parallel architecture can comfortably perform the contrastive divergence algorithm (CD-n) with up to n = 10 million sweeps per update, beyond the capabilities of existing software implementations. These results demonstrate the potential of using Ising machines for traditionally hard-to-train deep generative Boltzmann networks, with further possible improvement in nanodevice-based realizations.

Machine Learning Quantum Systems with Magnetic p-bits

May 2023

The slowing down of Moore’s Law has led to a crisis as the computing workloads of Artificial Intelligence (AI) algorithms continue skyrocketing. There is an urgent need for scalable and energy-efficient hardware catering to the unique requirements of AI algorithms and applications. In this environment, probabilistic computing with p-bits emerged as a scalable, domain-specific, and energy-efficient computing paradigm, particularly useful for probabilistic applications and algorithms. In particular, spintronic devices such as stochastic magnetic tunnel junctions (sMTJ) show great promise in designing integrated p-computers. Here, we examine how a scalable probabilistic computer with such magnetic p-bits can be useful for an emerging field combining machine learning and quantum physics.

Scalable Emulation of Sign-Problem–Free Hamiltonians with Room-Temperature p-bits

September 2019

The growing field of quantum computing is based on the concept of a q-bit, which is a delicate superposition of 0 and 1, requiring cryogenic temperatures for its physical realization along with challenging coherent coupling techniques for entangling them. By contrast, a probabilistic bit or a p-bit is a robust classical entity that fluctuates between 0 and 1 and can be implemented at room temperature using present-day technology. Here, we show that a probabilistic coprocessor built out of room-temperature p-bits can be used to accelerate simulations of a special class of quantum many-body systems that are sign-problem–free or "stoquastic," leveraging the well-known Suzuki-Trotter decomposition that maps a \(d\)-dimensional quantum many-body Hamiltonian to a \(d+1\)-dimensional classical Hamiltonian. This mapping allows an efficient emulation of a quantum system by classical computers and is commonly used in software to perform quantum Monte Carlo (QMC) algorithms. By contrast, we show that a compact, embedded magnetic tunnel junction (MTJ)-based coprocessor can serve as a highly efficient hardware accelerator for such QMC algorithms, providing an improvement in speed of several orders of magnitude compared to optimized CPU implementations. Using realistic device-level SPICE simulations, we demonstrate that the correct quantum correlations can be obtained using a classical p-circuit built with existing technology and operating at room temperature. The proposed coprocessor can serve as a tool to study stoquastic quantum many-body systems, overcoming challenges associated with physical quantum annealers.
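The Suzuki-Trotter mapping mentioned above can be checked by hand in the simplest possible case: a single spin in a transverse field, H = -Γσ^x, which maps to a classical ring of M replica spins. The sketch below is a textbook consistency check, not the coprocessor design from the paper. The symbols beta (inverse temperature), gamma (transverse field Γ), and n_replicas (M), along with the function names, are assumptions introduced for illustration.

```python
import itertools
import math


def replica_coupling(beta, gamma, n_replicas):
    """Dimensionless coupling K between neighboring replicas of the same spin.

    K = -(1/2) * ln(tanh(beta * gamma / M)), the standard Suzuki-Trotter result.
    """
    a = beta * gamma / n_replicas
    return -0.5 * math.log(math.tanh(a))


def single_spin_partition_function(beta, gamma, n_replicas):
    """Partition function of the classical replica ring, summed by brute force."""
    a = beta * gamma / n_replicas
    k = replica_coupling(beta, gamma, n_replicas)
    # Prefactor from the mapping: sqrt(sinh(2a) / 2) per replica.
    prefactor = math.sqrt(0.5 * math.sinh(2.0 * a)) ** n_replicas
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_replicas):
        # Periodic boundary along the replica (extra) dimension.
        bonds = sum(spins[i] * spins[(i + 1) % n_replicas] for i in range(n_replicas))
        total += math.exp(k * bonds)
    return prefactor * total
```

For a single spin there are no non-commuting terms to split, so the classical sum reproduces the quantum result 2 cosh(βΓ) for any number of replicas. With interacting spins the mapping becomes approximate, with an error that shrinks as M grows.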

A Probabilistic Approach to Quantum Inspired Algorithms

December 2019

Digital computing is based on the notion of deterministic bits that are 0 or 1. At the other extreme, quantum computing is based on q-bits that are delicate, coherent superpositions of 0 and 1. In this paper, we describe an intermediate approach between these two extremes, namely Probabilistic Computing, based on the notion of probabilistic or p-bits that fluctuate between 0 and 1. Hardware p-bits can be compactly realized from emerging spintronic building blocks that are smaller and more energy efficient than their traditional CMOS implementations. Probabilistic circuits built out of interconnected p-bits can be useful for a host of quantum computing inspired algorithms.

The Nonequilibrium Green Function (NEGF) Method

November 2022

The nonequilibrium Green function (NEGF) method was established in the 1960s through the classic work of Schwinger, Kadanoff, Baym, Keldysh, and others using many-body perturbation theory (MBPT) and the diagrammatic theory for nonequilibrium processes. Much of the literature is based on the original MBPT-based approach, and this makes it inaccessible to those unfamiliar with advanced quantum statistical mechanics. We obtain the NEGF equations directly from a one-electron Schrödinger equation using relatively elementary arguments. These equations have been used to discuss many problems of great interest such as quantized conductance, (integer) quantum Hall effect, Anderson localization, resonant tunneling, and spin transport without a systematic treatment of many-body effects. But the method goes beyond purely coherent transport, allowing us to include phase-breaking interactions (both momentum-relaxing and momentum-conserving as well as spin-conserving and spin-relaxing) within a self-consistent Born approximation. We believe that the scope and utility of the NEGF equations transcend the MBPT-based approach originally used to derive them. NEGF teaches us how to combine quantum dynamics with “contacts” much as Boltzmann taught us how to combine classical dynamics with “contacts,” using the word “contacts” in a broad figurative sense to denote all kinds of entropy-driven processes. We believe that this approach to “contact-ing” the Schrödinger equation should be of broad interest to anyone working on device physics or nonequilibrium statistical mechanics in general.

Experimental evaluation of simulated quantum annealing with MTJ-augmented p-bits

December 2022

The slowing down of Moore’s Law has created an exciting new era of electronics, leading to the emergence of various types of CMOS+X devices and architectures. Here, we present the first experimental demonstration of a probabilistic computer where a stochastic magnetic tunnel junction (sMTJ) drives a powerful CMOS-based field programmable gate array (FPGA) in a heterogeneous compute fabric. We use our machine to experimentally evaluate the simulated quantum annealing (SQA) algorithm, known to closely mimic the behavior of D-Wave’s quantum annealers which implement the transverse field Ising model (TFIM). Our machine matches the exact solution of the TFIM where p-bits in the FPGA are asynchronously driven by the stochastic dynamics of a magnetic tunnel junction. To compare the performance of SQA against classical annealing (CA) in hard combinatorial optimization at large scale, we also design a fully digital emulator of our asynchronous architecture in the FPGA. Our digital system uses 7,085 p-bits to factor up to 26-bit integers and is about 10X faster than optimized Tensor (TPU) and Graphics Processing Units (GPU) at lower power. Surprisingly, we find that the additional replica networks necessary for SQA do not lead to appreciably better performance over an optimized CA that is using the same computational resources. The systematic evaluation of the SQA algorithm we present will be relevant for other types of accelerators, such as photonic or electronic Ising machines, and the integrated scaling of our CMOS + sMTJ architecture could lead to orders of magnitude further improvements over TPUs and GPUs, according to experimentally-validated projections.

Quantum realization of some quaternary circuits

November 2008

We present the design of quaternary quantum versions of reversible circuits such as the Toffoli gate, modified Fredkin gate, mux, demux, and encoder-decoder using linear ion trap realizable quaternary Muthukrishnan-Stroud gates. Our realization of the quaternary Toffoli gate is more efficient than the previous realization, and the other quaternary circuits are realized for the first time in the literature.

Synthesis of GF(3) Based Reversible/Quantum Logic Circuits without Garbage Output

May 2019

We present a method of synthesizing Ternary Galois Field (GF(3)) based reversible/quantum logic circuits without any ancillary trits/qutrits and hence without any garbage outputs. We realize multi-input ternary Toffoli gate and square functions of GF(3) variables using linear ion trap realizable Muthukrishnan-Stroud (M-S) gates and shift gates in the absence of ancillary qutrits. Then, based on the Galois Field Sum of Products (GFSOP) expression of a multi-variable GF(3) function, we synthesize the corresponding circuit.

Contact Me

If you would like to get in touch, please email me at schowdhury@ucsb.edu or connect with me through the social media profiles listed in the Welcome section.