Training runs

Overview

Run #	Reference	Summary	Currently Active	Net Numbers	Best nets
NA	Old Main	Original 192x15 “main” run	No	1 to 601	ID595
test10	Lc0 Transition	Original 256x20 test run	No	10'000 to 11'262	11250 11248
test20	Training run reset	Many changes, see blog.	No	20'001 to 22'201	22018
test30	TB rescoring	Experiments with the network initialization strategy (to address spike issues) and with tablebase rescoring	No	30'001 to 33'005	32930

LR Drop

Training Run	1st LR drop	Elo	2nd LR drop	Elo	3rd LR drop	Elo	Best Net	Elo	Current best
Old Main							ID 595	3148
Test 10	ID 10077		ID 10320		ID 11013		ID 11248	3282	*
Test 20	ID 20247	2318	ID 20493		ID 21281		ID 22018	3118
Test 30	ID 30854

Note: the IDs listed for Test 20 still need to be checked.

Sampling ratio

Most data from this sheet

AlphaZero reference paper
Uses a best guess for game length and assumes that resigning cuts game length by 30%
Old Main
Initially, new networks were generated on a fixed time schedule rather than after a fixed number of games

Item	A0 with resign	A0 w/out resign	Main up to ID xxx	Main from ID xxx	Main from IDyyy to ID598	Test 10	Test 20
Positions per training game	95	135	135	135	135	135	———–
New networks per day	———–		6	6
Training Games per day	———–		160,000	160,000
Training Games per network	———–		26,700	26,700	40,000	40,000
Total training games	44,000,000	44,000,000			25,000,000
Positions generated per day	———–	————-	21,600,000	21,600,000
Positions generated per network	———–	————-	3,600,000	3,600,000	5,400,000	5,400,000
Total positions generated	4.158 B	5.940 B
Batch size	4,096	4,096	1,024	256	256	2,048
Training steps per day	———–	————-	300,000	300,000
Training steps per network	———–	————-	50,000	50,000	10,000	2,500
Total training steps	700,000	700,000
Positions trained per day	———–	————-	307,200,000	76,800,000
Positions trained per network	———–	————-	51,200,000	12,800,000	2,560,000	5,120,000
Total positions trained	2.867 B	2.867 B
Sampling ratio	0.69	0.48	14.22	3.55	0.47	0.95	0.89
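
The sampling ratio row appears to be positions trained divided by positions generated, that is, roughly how many times each generated position is used in training. The sketch below recomputes it from the table's own figures, using run totals for the two A0 columns and per-network values for the Lc0 columns. The assignment of values to columns is an assumption made from the flattened table, and names such as `sampling_ratio` and `columns` are illustrative only. Test 20 is omitted because the table gives no inputs for it.

```python
# Sketch: sampling ratio = positions trained / positions generated.
# All figures are copied from the table above; which value belongs to
# which column is an assumption based on the flattened layout.

def sampling_ratio(positions_trained: float, positions_generated: float) -> float:
    """Positions trained divided by positions generated."""
    return positions_trained / positions_generated

GAMES_A0 = 44_000_000      # total training games (A0 columns)
POSITIONS_PER_GAME = 135   # best-guess game length without resign
RESIGN_FACTOR = 0.7        # resign assumed to cut game length by 30%

# name -> (training steps, batch size, positions generated)
columns = {
    "A0 with resign": (700_000, 4_096, GAMES_A0 * POSITIONS_PER_GAME * RESIGN_FACTOR),
    "A0 w/out resign": (700_000, 4_096, GAMES_A0 * POSITIONS_PER_GAME),
    "Main, batch 1,024": (50_000, 1_024, 3_600_000),
    "Main, batch 256": (50_000, 256, 3_600_000),
    "Main up to ID598": (10_000, 256, 5_400_000),
    "Test 10": (2_500, 2_048, 5_400_000),
}

# Positions trained = training steps * batch size.
ratios = {
    name: sampling_ratio(steps * batch, generated)
    for name, (steps, batch, generated) in columns.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Under these assumptions the results agree with the table's sampling ratio row to two decimals, except for the second Main column: the calculation gives about 3.556, so the listed 3.55 looks truncated rather than rounded.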

Last Updated: 2023-08-18