Motivation

Data science involves many mundane tasks that take up most of a data scientist’s time. Much of this work revolves around formatting and transforming data into a form more amenable to learning or inference. Aloha attempts to alleviate this burden by providing a few things:

  • a DSL for feature specification based on familiar syntax
  • generic models that make use of this DSL
  • a pipeline for dataset generation using the same DSL

How does Aloha help?

Machine learning libraries and models often take linear-algebraic data structures, such as vectors or matrices, as their input type.

In Aloha, models are written generically, and different semantics implementations are provided to give meaning to the features extracted from the arbitrary input types on which the models operate.
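
To make the idea concrete, here is a minimal Scala sketch of a model written once against an abstract input type, with separate "semantics" objects giving meaning to named features for each concrete input type. Every name in it (`Semantics`, `LinearModel`, `User`, `UserSemantics`, `MapSemantics`, `GenericModelSketch`) is hypothetical and chosen for illustration; none of it is Aloha's actual API, and the linear model is only a stand-in for whatever model type is in use.

```scala
// Hypothetical sketch: these names are illustrative, not part of Aloha's API.

// A "semantics" resolves a feature name to a function that extracts a
// numeric value from an input of type A, if the name is known.
trait Semantics[A] {
  def feature(name: String): Option[A => Double]
}

// A linear model written once, generically, for any input type A.
final case class LinearModel[A](weights: Map[String, Double], semantics: Semantics[A]) {
  // Resolve each named feature up front; unknown feature names are dropped.
  private val terms: Seq[(A => Double, Double)] =
    weights.toSeq.flatMap { case (name, w) => semantics.feature(name).map(f => (f, w)) }

  // Weighted sum of the extracted feature values.
  def score(a: A): Double = terms.map { case (f, w) => w * f(a) }.sum
}

// One possible input type and its semantics.
final case class User(age: Int, numPhotos: Int)

object UserSemantics extends Semantics[User] {
  private val features: Map[String, User => Double] = Map(
    "age"        -> ((u: User) => u.age.toDouble),
    "has_photos" -> ((u: User) => if (u.numPhotos > 0) 1.0 else 0.0)
  )
  def feature(name: String): Option[User => Double] = features.get(name)
}

// A second input type: a plain key/value record, with missing keys read as 0.
object MapSemantics extends Semantics[Map[String, Double]] {
  def feature(name: String): Option[Map[String, Double] => Double] =
    Some((m: Map[String, Double]) => m.getOrElse(name, 0.0))
}

object GenericModelSketch {
  // The same weights (the "model") are reused across both input types.
  val weights: Map[String, Double] = Map("age" -> 0.5, "has_photos" -> 2.0)

  val userScore: Double = LinearModel(weights, UserSemantics).score(User(30, 3))
  val mapScore: Double =
    LinearModel(weights, MapSemantics).score(Map("age" -> 30.0, "has_photos" -> 1.0))
}
```

The model definition never mentions `User` or `Map`; only the semantics objects know how to read those types. Supporting a new input type in this sketch means writing a new `Semantics` instance rather than converting the data into vectors first.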

Maven Setup

<dependency>
  <groupId>com.eharmony</groupId>
  <artifactId>aloha-core_2.11</artifactId>
  <version>5.2.0</version>
</dependency>