
BenchCouncil: International Open Benchmark Council

 

BenchCouncil’s View On Benchmarking AI and Other Emerging Workloads (Technical Report; Slides presented by Prof. Jianfeng Zhan at the BenchCouncil SC BoF). This paper outlines BenchCouncil’s view on the challenges, rules, and vision of benchmarking modern workloads.

Since its founding, BenchCouncil has had two fundamental responsibilities. First, it encourages benchmark-based quantitative approaches to tackling multi-disciplinary challenges. Second, it incubates and hosts benchmark projects. The term "BenchCouncil projects" refers specifically to the top-level projects at BenchCouncil. BenchCouncil further encourages reliable and reproducible research that uses either these top-level projects or the incubator projects.

BenchCouncil Top Level Projects

BigDataBench: a Scalable Big Data Benchmark Suite [HPCA'14]

The current version, BigDataBench 5.0, provides 13 representative real-world data sets and 25 benchmarks. The benchmarks cover six workload types (online services, offline analytics, graph analytics, data warehouse, NoSQL, and streaming) from three important application domains: Internet services (including search engines, social networks, and e-commerce), recognition sciences, and medical sciences. The suite includes micro benchmarks, each of which is a single data motif; component benchmarks, which consist of data motif combinations; and end-to-end application benchmarks, which are combinations of component benchmarks. Data sets have a great impact on workload behavior and running performance (CGO’18), so data variety is considered across the whole spectrum of data types, including structured, semi-structured, and unstructured data. Currently, the included data sources are text, graph, table, and image data. Using real data sets as the seed, the data generators (BDGS) generate synthetic data by scaling the seed data while keeping the data characteristics of the raw data.
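The idea of scaling seed data while preserving its characteristics can be illustrated with a minimal sketch. This is not BDGS itself. It assumes a text seed and models only one characteristic, the word-frequency distribution. The function name `scale_text_seed` and its parameters are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def scale_text_seed(seed_text, target_words, rng_seed=0):
    """Generate synthetic words that follow the seed's word-frequency distribution.

    Illustrative assumption: word frequency is the only characteristic preserved.
    A real generator such as BDGS may model far richer properties.
    """
    words = seed_text.split()
    if not words:
        raise ValueError("seed_text must contain at least one word")
    counts = Counter(words)
    vocabulary = list(counts)
    weights = [counts[word] for word in vocabulary]
    rng = random.Random(rng_seed)  # fixed seed keeps the output reproducible
    # Sample with replacement so the output can be much larger than the seed.
    return rng.choices(vocabulary, weights=weights, k=target_words)


seed = "big data big benchmark data big"
synthetic = scale_text_seed(seed, target_words=10_000)
```

The output can be arbitrarily larger than the seed, while the relative frequency of each word stays close to its frequency in the seed.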

AIBench: an Industry Standard Internet Service AI Benchmark Suite [TR, Bench18]

The current version, AIBench 1.0, is the first industry-scale AI benchmark suite, developed jointly with seventeen industry partners. We present a highly extensible, configurable, and flexible benchmark framework containing multiple loosely coupled modules, such as data input, prominent AI problem domains, online inference, offline training, and automatic deployment tool modules. We analyze typical AI application scenarios from the three most important Internet service domains (search engine, social network, and e-commerce), and then abstract and identify sixteen prominent AI problem domains: classification, image generation, text-to-text translation, image-to-text, image-to-image, speech-to-text, face embedding, 3D face recognition, object detection, video prediction, image compression, recommendation, 3D object reconstruction, text summarization, spatial transformer, and learning to rank. AIBench consists of 12 micro benchmarks, 16 component benchmarks, and 2 end-to-end application benchmarks: DCMix, a datacenter AI application combination mixed with AI workloads, and E-commerce AI, an end-to-end business AI benchmark. The benchmarks are implemented not only on mainstream deep learning frameworks such as TensorFlow and PyTorch, but also on traditional programming models such as Pthreads, to enable an apples-to-apples comparison.
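The three benchmark levels used by these suites (micro, component, end-to-end) can be pictured as a simple layered data structure. The sketch below follows the BigDataBench description, in which a component benchmark combines data motifs and an end-to-end benchmark combines components. All class names and the placeholder names `motif_a`, `component_x`, and so on are hypothetical and do not come from either suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroBenchmark:
    motif: str  # a single data motif (placeholder name)


@dataclass(frozen=True)
class ComponentBenchmark:
    name: str
    motifs: tuple  # tuple of MicroBenchmark instances


@dataclass(frozen=True)
class EndToEndBenchmark:
    name: str
    components: tuple  # tuple of ComponentBenchmark instances

    def motifs(self):
        """Return the distinct motif names reached through all components, in first-seen order."""
        seen = []
        for component in self.components:
            for micro in component.motifs:
                if micro.motif not in seen:
                    seen.append(micro.motif)
        return seen


# Placeholder benchmarks, for illustration only.
motif_a = MicroBenchmark("motif_a")
motif_b = MicroBenchmark("motif_b")
motif_c = MicroBenchmark("motif_c")
component_x = ComponentBenchmark("component_x", (motif_a, motif_b))
component_y = ComponentBenchmark("component_y", (motif_b, motif_c))
application = EndToEndBenchmark("application", (component_x, component_y))
print(application.motifs())
```

Modelling the levels this way makes it easy to ask which motifs an end-to-end benchmark exercises, even when components share motifs.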

HPC AI500: A Benchmark Suite for HPC AI Systems [Bench18]

HPC AI500 provides 3 representative scientific data sets and 7 benchmarks. The benchmarks cover 3 workload types: extreme weather analysis, high energy physics, and cosmology. The suite consists of 3 micro benchmarks and 4 component benchmarks. The micro benchmarks use two software stacks, CUDA and MKL. The component benchmarks use two software stacks, TensorFlow and PyTorch.

AIoT Bench: Towards Comprehensive Benchmarking Mobile and Embedded device Intelligence [Bench18]

AIoT Bench provides 3 representative real-world data sets and 12 benchmarks. The benchmarks cover 3 application domains: image recognition, speech recognition, and natural language processing. The suite consists of 9 micro benchmarks and 3 component benchmarks. It covers different platforms, including Android devices and Raspberry Pi, and different development tools, including TensorFlow and Caffe2.

Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking [Bench18]

Edge AIBench provides 5 representative real-world data sets and 16 benchmarks. The benchmarks cover 4 application scenarios: ICU Patient Monitor, Surveillance Camera, Smart Home, and Autonomous Vehicle. The suite consists of 8 micro benchmarks and 8 component benchmarks. It also provides an edge computing AI testbed combined with federated learning.
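The benchmark counts stated for the suites above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in the text. BigDataBench is omitted because no per-level breakdown is stated for it, and AIBench has no stated overall total. The names `SUITES` and `check_totals` are hypothetical.

```python
# Counts as stated in the suite descriptions: (micro, component, end-to-end, stated total).
# A stated total of None means the text gives no overall figure.
# An end-to-end count of 0 means the text lists only micro and component benchmarks.
SUITES = {
    "AIBench": (12, 16, 2, None),
    "HPC AI500": (3, 4, 0, 7),
    "AIoT Bench": (9, 3, 0, 12),
    "Edge AIBench": (8, 8, 0, 16),
}


def check_totals(suites):
    """Return {suite: (computed_total, matches)}; matches is None when no total is stated."""
    results = {}
    for name, (micro, component, end_to_end, stated) in suites.items():
        computed = micro + component + end_to_end
        matches = None if stated is None else (computed == stated)
        results[name] = (computed, matches)
    return results


results = check_totals(SUITES)
for name, (computed, matches) in results.items():
    print(name, computed, matches)
```

For the three suites with a stated total, the per-level counts add up to that total. The AIBench levels sum to 30 benchmarks.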


BenchCouncil Incubator Projects

The Incubator is the entry path into BenchCouncil for projects and codebases wishing to become part of BenchCouncil's efforts. All code donations from external organisations and existing external projects wishing to join BenchCouncil enter through the Incubator. The BenchCouncil Incubator has two primary goals: to ensure that all donations are in accordance with BenchCouncil legal standards, and to develop new communities that adhere to BenchCouncil's guiding principles. For more about the BenchCouncil Incubator, see the Incubator website.

A Benchmark Suite for Medical AI

BenchCPU

EChip

A Benchmark Suite for Smart Grid


Other Benchmarking Proposals

BenchCouncil conferences are open to everyone who would like to contribute benchmarking proposals at any time.

We have received 8 benchmarking proposals.