Leejay Wu

Thesis Title: Automated Modeling and Nonlinear Axis Scaling
Degree Type: Ph.D. in Computer Science
Advisor(s): Christos Faloutsos
Graduated: May 2005

Abstract:

This thesis examines nonlinear axis scaling and its impact on the modeling of inter-attribute relationships. Using automated methods, the described system identifies candidate scaling methods, decides which attributes serve as inputs or outputs, and builds regression trees that quantify these relationships. The experiments focus on the accuracy and complexity of these models, both of which can be examined quantitatively, but the results also consider applicability to the inherently more qualitative task of rule-based outlier or anomaly detection. The results demonstrate that nonlinear axis scaling, even in an automated system, can yield significantly more accurate models than the unscaled case without proportionally higher complexity costs. It can also help reveal unusual tuples in which no individual value is unusual, but the combination of values is.
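
As a rough illustration of why axis scaling can matter, the sketch below fits a straight line to synthetic power-law data twice: once on the raw axes and once after taking logarithms of both axes. It is not the system described in the thesis, which builds regression trees and selects scalings automatically. Ordinary least squares is swapped in here only to keep the example short, and the data, the function names `fit_line` and `r_squared`, and the choice of a log-log scaling are all assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical synthetic data following y = 3 * x^2.5 (not from the thesis).
xs = [float(i) for i in range(1, 51)]
ys = [3.0 * x ** 2.5 for x in xs]

# Fit on the unscaled axes.
raw_r2 = r_squared(xs, ys)

# Fit after scaling both axes with a logarithm.
log_xs = [math.log(x) for x in xs]
log_ys = [math.log(y) for y in ys]
log_r2 = r_squared(log_xs, log_ys)
log_intercept, log_slope = fit_line(log_xs, log_ys)

print(f"R^2 on raw axes:     {raw_r2:.4f}")
print(f"R^2 on log-log axes: {log_r2:.4f}")
print(f"slope on log-log axes (the exponent): {log_slope:.4f}")
```

On the log-log axes the relationship becomes a straight line whose slope is the exponent, so a model of the same complexity fits far better than it does on the raw axes.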

Thesis Committee:
Christos Faloutsos (Chair)
Anastassia Ailamaki
Andrew Moore
Richard Caruana (Cornell University)

Jeannette Wing, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Keywords:
scaling, modeling, feature selection

CMU-CS-05-145.pdf (2.64 MB) (179 pages)