1.10.0
User Documentation for MADlib

This module provides a set of basic array operations implemented in C. It is a support module for several machine learning algorithms that require fast array operations.

Implementation Notes

All functions (except normalize() and array_filter()) described in this module work with 2-D arrays.

These functions support several numeric types:

Several of the function require NO NULL VALUES, while others omit NULLs and return results. See details in description of individual functions.

Array Operations
array_add()

Adds two arrays. It requires that all the values are NON-NULL. Return type is the same as the input type.

sum()

Aggregate, sums vector element-wisely. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_sub()

Subtracts two arrays. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_mult()

Element-wise product of two arrays. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_div()

Element-wise division of two arrays. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_dot()

Dot-product of two arrays. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_contains()

Checks whether one array contains the other. This function returns TRUE if each non-zero element in the right array equals to the element with the same index in the left array.

array_max()

This function finds the maximum value in the array. NULLs are ignored. Return type is the same as the input type.

array_max_index()

This function finds the maximum value and corresponding index in the array. NULLs are ignored. Return type is array in format [max, index], and its element type is the same as the input type.

array_min()

This function finds the minimum value in the array. NULLs are ignored. Return type is the same as the input type.

array_min_index()

This function finds the minimum value and corresponding index in the array. NULLs are ignored. Return type is array in format [min, index], and its element type is the same as the input type.

array_sum()

This function finds the sum of the values in the array. NULLs are ignored. Return type is the same as the input type.

array_sum_big()

This function finds the sum of the values in the array. NULLs are ignored. Return type is always FLOAT8 regardless of input. This function is meant to replace array_sum() in cases when a sum may overflow the element type.

array_abs_sum()

This function finds the sum of abs of the values in the array. NULLs are ignored. Return type is the same as the input type.

array_abs()

This function takes an array as the input and finds abs of each element in the array, returning the resulting array. It requires that all the values are NON-NULL.

array_mean()

This function finds the mean of the values in the array. NULLs are ignored.

array_stddev()

This function finds the standard deviation of the values in the array. NULLs are ignored.

array_of_float()

This function creates an array of set size (the argument value) of FLOAT8, initializing the values to 0.0.

array_of_bigint()

This function creates an array of set size (the argument value) of BIGINT, initializing the values to 0.

array_fill()

This functions set every value in the array to some desired value (provided as the argument).

array_filter()

This function takes an array as the input and keep only elements that satisfy the operator on specified scalar. It requires that the array is 1-D and all the values are NON-NULL. Return type is the same as the input type. By default, this function removes all zeros.

array_scalar_mult()

This function takes an array as the input and executes element-wise multiplication by the scalar provided as the second argument, returning the resulting array. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_scalar_add()

This function takes an array as the input and executes element-wise addition of the scalar provided as the second argument, returning the resulting array. It requires that all the values are NON-NULL. Return type is the same as the input type.

array_sqrt()

This function takes an array as the input and finds square root of each element in the array, returning the resulting array. It requires that all the values are NON-NULL.

array_pow()

This function takes an array and a float8 as the input and finds power of each element in the array, returning the resulting array. It requires that all the values are NON-NULL.

array_square()

This function takes an array as the input and finds square of each element in the array, returning the resulting array. It requires that all the values are NON-NULL.

normalize()This function normalizes an array as sum of squares to be 1. It requires that the array is 1-D and all the values are NON-NULL.

Examples
  1. Create a database table with two integer array columns and add some data.
    CREATE TABLE array_tbl ( id integer,
                             array1 integer[],
                             array2 integer[]
                           );
    INSERT INTO array_tbl VALUES
                           ( 1, '{1,2,3,4,5,6,7,8,9}', '{9,8,7,6,5,4,3,2,1}' ),
                           ( 2, '{1,1,0,1,1,2,3,99,8}','{0,0,0,-5,4,1,1,7,6}' );
    
  2. Find the minimum, maximum, mean, and standard deviation of the array1 column.
    SELECT id, madlib.array_min(array1), madlib.array_max(array1),
               madlib.array_min_index(array1), madlib.array_max_index(array1),
               madlib.array_mean(array1), madlib.array_stddev(array1)
    FROM array_tbl;
    
    Result:
    id | array_min | array_max | array_min_index | array_max_index |    array_mean    |   array_stddev
    ----+-----------+-----------+---------------+---------------+------------------+------------------
      1 |         1 |         9 | {1,1}         | {9,9}         |                5 | 2.73861278752583
      2 |         0 |        99 | {0,3}         | {99,8}        | 12.8888888888889 | 32.3784050118457(2 rows)
    
  3. Perform array addition and subtraction.
    SELECT id, madlib.array_add(array1,array2),
              madlib.array_sub(array1,array2)
    FROM array_tbl;
    
    Result:
     id |          array_add           |        array_sub
     ---+------------------------------+-------------------------
      2 | {1,1,0,-4,5,3,4,106,14}      | {1,1,0,6,-3,1,2,92,2}
      1 | {10,10,10,10,10,10,10,10,10} | {-8,-6,-4,-2,0,2,4,6,8}
    (2 rows)
    
  4. Perform element-wise array multiplication and division. The row with id=2 is excluded because the divisor array contains zero, which would cause a divide-by-zero error.
    SELECT id, madlib.array_mult(array1,array2),
               madlib.array_div(array1,array2)
    FROM array_tbl
    WHERE 0 != ALL(array2);
    Result:
     id |         array_mult         |      array_div
     ---+----------------------------+---------------------
      1 | {9,16,21,24,25,24,21,16,9} | {0,0,0,0,1,1,2,4,9}
    (1 row)
    
  5. Calculate the dot product of the arrays.
    SELECT id, madlib.array_dot(array1, array2)
    FROM array_tbl;
    
    Result:
     id | array_dot
     ---+----------
      2 |       745
      1 |       165
    (2 rows)
    
  6. Multiply an array by a scalar 3.
    SELECT id, madlib.array_scalar_mult(array1,3)
    FROM array_tbl;
    
    Result:
     id |     array_scalar_mult
     ---+--------------------------
      1 | {3,6,9,12,15,18,21,24,27}
      2 | {3,3,0,3,3,6,9,297,24}
    (2 rows)
    
  7. Construct a nine-element array with each element set to the value 1.3.
    SELECT madlib.array_fill(madlib.array_of_float(9), 1.3::float);
    
    Result:
                  array_fill
     --------------------------------------
     {1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3}
    (1 row)
    

Related Topics

File array_ops.sql_in for list of functions and usage.