Ryohei Kobayashi received his Ph.D. in Engineering from Tokyo Institute of Technology in 2016. He then joined the Center for Computational Sciences (CCS), University of Tsukuba as an Assistant Professor in April 2016 and served until September 2024. Since October 2024, he has been an Associate Professor at the Supercomputing Research Center (SCRC), Institute of Integrated Research, Institute of Science Tokyo. He has also held a concurrent appointment as a Visiting Researcher with the Processor Research Team, RIKEN Center for Computational Science (R-CCS) since July 2021. His research interests include computer systems, high-performance computing (HPC), accelerators (GPU/FPGA), and reconfigurable computing, with a focus on GPU–FPGA cooperative computing and FPGA systems for HPC. He leads the Advanced Computing ACceleration (AC2) Laboratory, which advances hardware–software co-design for massively parallel systems based on accelerators. His honors include the HPC in Asia Poster Award (ISC 2018) and the IEICE CPSY Young Presentation Award (2015). He has served in program roles such as Proceedings Chair for HPC Asia 2026 and Publicity Co-Chair for IEEE Cluster 2025. He is a member of ACM, IEEE/IEEE CS, IPSJ, and IEICE.
Taiga Kobayashi works on communication optimization for large-scale systems, with interests in FPGAs, DPUs, machine learning, and data compression. His current work studies communication-data compression on NVIDIA BlueField DPUs for multi-GPU LLM training, with an eye toward communication substrates for future sparse and irregular accelerator workloads.
Research Interests: FPGA / DPU / Machine Learning / Data Compression
Akimasa Watanuki works on accelerator-oriented performance optimization, with interests in GPUs, CUDA, memory hierarchy, data layout optimization, and heterogeneous computing. His current work studies GNN training on GPUs and wafer-scale systems, focusing on irregular memory access, memory hierarchy behavior, and training efficiency in graph workloads.
Research Interests: Accelerators / GPUs / CUDA / Memory Hierarchy / Data Layout Optimization / Accelerating Parallel Applications / Accelerating Machine Learning OSS / Heterogeneous Computing (CPU–GPU, CPU–GPU–FPGA)