Tuesday, December 29, 2015

The arraymath extension vs. plpgsql

After arraymath was fixed, I did some simple benchmarks against my plpgsql code from the previous posts.
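For context, the plpgsql implementation benchmarked below was along these lines — a sketch of the general pattern, with function and operator names that are my assumptions rather than the exact code from the previous posts:

```sql
-- Hedged sketch: element-wise array multiplication in plpgsql,
-- using the multi-argument unnest() form (available since 9.4)
-- to walk both arrays in lockstep. Names are illustrative.
CREATE OR REPLACE FUNCTION array_mul(a double precision[], b double precision[])
RETURNS double precision[] AS
$$
BEGIN
  RETURN array(SELECT x * y FROM unnest(a, b) AS t(x, y));
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

-- Bind it to the * operator so the benchmark queries read naturally.
CREATE OPERATOR * (
  LEFTARG   = double precision[],
  RIGHTARG  = double precision[],
  PROCEDURE = array_mul
);
```

The Array * Scalar case works the same way with a single-array unnest() and a scalar parameter.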

All tests were done with PostgreSQL 9.5rc1 on a Fujitsu Celsius H730 (Core i7 4700MQ, 16 GB RAM, 500 GB SSD), 1 million random elements per array.

Here are the results.

Array * Array, plpgsql:

select array(select random() from generate_series(1,1000000)) * array(select random() from generate_series(1,1000000));

1208 ms

Array * Array, arraymath:

select array(select random() from generate_series(1,1000000)) @* array(select random() from generate_series(1,1000000));

935 ms

Array * Scalar, plpgsql:

select array(select random() from generate_series(1,1000000)) * 2.0::double precision;

784 ms

Array * Scalar, arraymath:

select array(select random() from generate_series(1,1000000)) @* 2.0::double precision;

743 ms

So, arraymath is about 29% faster for Array * Array (1208 ms vs. 935 ms) and 6% faster for Array * Scalar (784 ms vs. 743 ms). C wins over plpgsql, but not as decisively as I initially thought. The hard work is indeed the array traversal - and arraymath has to do it too.

And it's probably not a good idea to access arrays by index in plpgsql if you intend to scan the whole array anyway.
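To illustrate, the index-based pattern to avoid looks like this — a hedged sketch of mine, not code from the previous posts:

```sql
-- Hedged sketch: element-wise scaling via explicit index access.
-- This fetches arr[i] one element at a time in a plpgsql loop,
-- which tends to be slower than a set-based unnest() query
-- when the whole array is scanned anyway.
CREATE OR REPLACE FUNCTION array_scale_indexed(arr double precision[], s double precision)
RETURNS double precision[] AS
$$
DECLARE
  result double precision[];
  i integer;
BEGIN
  FOR i IN array_lower(arr, 1) .. array_upper(arr, 1) LOOP
    result[i] := arr[i] * s;
  END LOOP;
  RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;
```

A set-based equivalent, `array(SELECT x * s FROM unnest(arr) AS t(x))`, expresses the same scan in one statement and lets the executor do the traversal.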

1 comment:

  1. Hey, my guess in your previous post on performance was right in the middle! ;)

    Did you look at using array_map(), at least for the array X element stuff? At first look I don't see anything that would make your function much faster...

    Disappointingly, it looks like the only current use for array_map is for T_ArrayCoerceExpr. I think it'd be great to have a version of array_map that handles multiple arrays. I don't think it'd be too hard to create one that would handle any number of arrays; something like array_map(regprocedure, VARIADIC anyarray) RETURNS anyarray.

    Any interest in working on a patch?
