Motivation
Data science is filled with many mundane tasks that take up a majority of the data scientist’s time. Much of this revolves around formatting and transforming data to a form more amenable to learning or inference. Aloha attempts to alleviate this burden by providing a few things:
- a DSL for feature specification based on familiar syntax
- generic models that make use of this DSL
- a pipeline for dataset generation using the same DSL
How does Aloha help?
Oftentimes machine learning libraries and models employ linear-algebraic data structure as their input type. For instance:
In Aloha, models are written generically, and different semantics implementations are provided to give meaning to the features extracted from the arbitrary input types on which the models operate.
Maven Setup
<dependency>
<groupId>com.eharmony</groupId>
<artifactId>aloha-core_2.11</artifactId>
<version>5.2.0</version>
</dependency>