Database micro-benchmark (Discussion)

Sorry, I should have done the look up myself instead of just giving you the search. I am working on the extended list for own use and don’t plan to publish it here unless others are interested.

Bagof is a complicated beast. There is probably some explanation why this case is so much harder and there are the profiler and source code to figure it out :slight_smile:

A bit of search reveals that bind_bagof_keys/3, and notably keysort/2 cause the big slowdown. Profiling under valgrind shows almost all time goes into deRef(), the macro that follows a variable reference chain. The bind_bagof_keys/3 trick ensures consistent variable bindings and makes sure this test passes:

sets(bagof-1) :-
	List = [_,_,_],
	bagof(X, member(X, List), Xs),
	Xs == List.

The way this was written creates a 1,000 long variable reference chain from the shared variable dictionary through each of the answers. I changed this by creating the shared variable dictionary list before the findall/3 call, so each answer creates (possibly) a reference pointer into the dictionary. This has some quite drastic effect. Times on AMD 3950X, Ubuntu 19.10, normal (no PGO) optimization.

?- time((between(1,1000,_), bagof(T, pr(X,1,1,T), L), fail; true)).
% 6,322,999 inferences, 0.609 CPU in 0.609 seconds (100% CPU, 10377325 Lips)
?- time((between(1,1000,_), bagof(Z-T, pr(X,Y,Z,T), L), fail; true)).
% 6,324,007 inferences, 0.678 CPU in 0.679 seconds (100% CPU, 9321305 Lips)

And

?- time((between(1,1000,_), bagof(Z-T, pr2(X,Y,Z,T), L), fail; true)).
% 5,099,007 inferences, 0.797 CPU in 0.797 seconds (100% CPU, 6395849 Lips)
?- time((between(1,1000,_), setof(Z-T, pr2(X,Y,Z,T), L), fail; true)).
% 4,923,001 inferences, 0.934 CPU in 0.935 seconds (100% CPU, 5268815 Lips)

Ratio 117%

I thought long variable reference chains are hardly ever a problem, but apparently it is possible to find counter examples in real life. Learned something new :slight_smile: Thanks.

6 Likes

Greate, and interesting indeed!