AI benchmarks

Benchmarks are tests that let us measure progress in AI capabilities and check for characteristics that might pose safety risks.

I'm not totally sure whether this should exist, and whether it should be called this.
