Apache MADlib is an open source project that endeavors to adhere in all respects to the principles of The Apache Way.

MADlib grew out of discussions between database engine developers, data scientists, IT architects and academics interested in new approaches to scalable, sophisticated in-database analytics. These discussions were written up in a paper in VLDB 2009 that coined the term “MAD Skills” for data analysis. The MADlib software project began the following year as a collaboration between researchers at UC Berkeley and engineers and data scientists at EMC/Greenplum (later Pivotal).

In September 2015, MADlib was accepted into the Apache Software Foundation Incubator, and it graduated to a Top Level Project in July 2017.

Some of the past and present participants in this project are:

If you are interested in joining our project, please consider subscribing to our User or Developer mailing lists. Everyone is welcome.

Reporting Issues

We need your feedback. If you find a bug, or would like to suggest an improvement or request a new feature, please follow the steps below to let us know.

  • Start by logging into MADlib JIRA. If you don’t have an account yet, you can create one yourself.
  • To report a bug:
    1. Create a new issue -> Bug.
    2. Fill in the required (and optional) fields.
  • To suggest an improvement or a new feature:
    1. Create a new issue -> New Feature.
    2. Fill in the required (and optional) fields.
  • Submit the issue. You should receive an email confirmation. Thanks for your feedback!

User Resources

User Forum

Our online user forum is open for discussion of any topic of interest to users of the product.

Developer Resources

Developer Forum

Our online developer forum is open for discussion of any topic of interest to open source contributors.

Contribution Guide

Step-by-step instructions guiding you through the MADlib contribution model.

Source Code Repository

Design Documents

There are several resources aimed at helping new developers understand how MADlib is designed.