Preface I recently traveled to Taipei for ACM CCS 2025. It’s been 14 years since my last…
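The Astonishing IOMMU Overhead and the Unreasonable AQC NIC Driver
Background I added a Thunderbolt 10GbE NIC to the AMD AI MAX 395 mini PC at home, hoping to put it on a 10Gbps information superhighway so that I could quickly pull LLMs from the large-capacity storage on my home NAS. The NIC is one I bought for a Mac a few years ago, a QNAP…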
Can we trust the CPU cycles from LLVM-MCA?
Background Recently, I’ve been delving into the ARM SVE2 speed-up over pure NEON in common workloads that are…
How CCMP reduces the pressure on the branch predictor on aarch64
Preface When comparing branch MPKI (Misses Per Kilo Instructions) on aarch64 with other architectures such as RISC-V (including…
“Short-leg” of RISC-V
Preface I recently came across some code performance issues that appear only on RISC-V. Their root cause is the short…
Spacemit X60 (K1) SPECINT 2006 Benchmark
I ordered a BananaPi F3 last week and it arrived on May 6th. Using the opensbi and kernel…
T-HEAD C910 SPEC CPU Benchmark
Environment Board: Lichee Module 4A (2GHz Version) SBI: revyos/opensbi/th1520-v1.3.1 Kernel: revyos/th1520-linux-kernel/th1520-master-wip SBI and Kernel Compiled with riscv64-linux-gnu-gcc version…
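Learning TileLink Through Comparison with AMBA AXI
Background I recently studied TileLink and the TL-C coherence protocol in detail, and I want to write a quick-start introduction to TileLink for readers who already know AMBA AXI. The TileLink Spec this post refers to is https://starfivetech.com/uploads/tilelink_spec_1.8.1.pdf Variant Protocol Narrow…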
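The Impact of the Intel Data Dependent Prefetcher on SPEC CPU 2017
Background I have recently been measuring the Data Dependent Prefetcher on some commercial hardware, and happened to notice that 13th-generation Intel Core processors have a Data Dependent Prefetcher, so I started with a simple experiment. Introduction and Enable/Disable Control For an introduction, see Intel's documentation. And according to…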