FAU College of Engineering and Computer Science
Announces the Ph.D. Dissertation Defense of
Eric Golinko
for the Degree of Doctor of Philosophy (Ph.D.)
"Generalized Feature Embedding Learning for Clustering and Classification"
Fri., June 22
12 p.m.
777 Glades Rd., EE 405
FAU Boca Raton Campus
DEPARTMENT: Computer and Electrical Engineering and Computer Science

Xingquan Zhu, Ph.D.

Ankur Agarwal, Ph.D.
Mehrdad Nojoumian, Ph.D.
Shihong Huang, Ph.D.
Dingding Wang, Ph.D.
Data comes in many different shapes and sizes. In real-life applications, it is common that the data we study has features of varied types, including numerical, categorical, and text. To model such data with machine learning algorithms, the data typically must be in numeric form. Therefore, data that is not originally numerical must be transformed before it can be used as input to these algorithms.
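As a generic illustration of this transformation (not the dissertation's specific method), one-hot encoding is a standard way to convert a categorical feature into numeric columns; the toy values below are hypothetical:

```python
import numpy as np

# A toy dataset with mixed feature types (hypothetical values for illustration).
ages = np.array([23.0, 35.0, 51.0])   # numerical: usable as-is
colors = ["red", "blue", "red"]       # categorical: must be encoded

# One-hot encode the categorical feature: one binary column per category.
categories = sorted(set(colors))      # ['blue', 'red']
one_hot = np.array([[1.0 if c == cat else 0.0 for cat in categories]
                    for c in colors])

# Concatenate into a single numeric matrix suitable for ML algorithms.
X = np.hstack([ages.reshape(-1, 1), one_hot])
print(X.shape)  # (3, 3)
```

Each categorical value becomes a sparse indicator vector, which is why high-cardinality features inflate the dimensionality of the encoded data.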

Along with this transformation, it is common that the data we study has many features relative to its number of samples. It is often desirable to reduce the number of features used to train a model, both to eliminate noise and to reduce training time. This problem of high dimensionality can be approached through feature selection, feature extraction, or feature embedding. Feature selection seeks to identify the most essential variables in a dataset, leading to a parsimonious model and high-performing results, while feature extraction and embedding apply a mathematical transformation of the data into a new representation space. As a byproduct of using this new representation, we can greatly reduce the dimension without sacrificing performance; often, using embedded features we observe a gain in performance.
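A minimal sketch of eigendecomposition-based feature extraction, using plain PCA as a stand-in (this is a textbook baseline, not the embedding method developed in the dissertation; all data here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))          # 100 samples, 10 features (synthetic)

# Center the data and compute the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(Xc) - 1)

# Eigendecomposition of the symmetric covariance matrix (eigenvalues ascending).
eigvals, eigvecs = np.linalg.eigh(cov)

# Keep the top-k eigenvectors as the new basis and project the data.
k = 3
W = eigvecs[:, -k:]                     # top-3 principal directions
Z = Xc @ W                              # embedded representation
print(Z.shape)  # (100, 3)
```

The dimension of the projected data is controlled simply by the number of eigenvectors retained, which is the same lever the abstract refers to when it says the output dimension can be adjusted arbitrarily.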

Though extraction and embedding methods can be powerful for isolated machine learning problems, they do not always generalize well. We are therefore motivated to develop a methodology that can be applied to any data type with little pre-processing. The methods we develop apply in unsupervised, supervised, incremental, and deep learning contexts. Using 28 benchmark datasets spanning different data types, we construct a framework that can be applied to general machine learning tasks.

The techniques we develop contribute to the field of dimension reduction and feature embedding. Within this framework, we make additional contributions to eigendecomposition by constructing an objective matrix with three vital components: first, a class-partitioned row-and-feature product representation of one-hot encoded data; second, a weighted adjacency matrix derived from class-label relationships; and finally, the inner product of these values, which conditions the one-hot encoded data generated from the original data prior to eigenvector decomposition. The use of class partitioning and adjacency enables subsequent projections of the data to be trained more effectively than baseline algorithms, and the dimension of the projected data can be adjusted arbitrarily. In addition, we show how these dense vectors may be used to order the features of generic data for deep learning.

In summary, this dissertation examines a general approach to dimension reduction and feature embedding that combines a class-partitioned row-and-feature representation, a weighted approach to instance similarity, and an adjacency representation. This general approach has application to unsupervised, supervised, online, and deep learning. In our experiments on 28 benchmark datasets, we show significant performance gains in clustering, classification, and training time.
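The construction described above can be sketched roughly as follows. This is only an interpretation of the abstract's outline, with every numeric detail (data, class weights, matrix shapes) assumed for illustration; the actual formulation in the dissertation may differ:

```python
import numpy as np

# Toy one-hot encoded data with class labels (hypothetical values).
X = np.array([[1., 0., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 1., 1., 0.]])
y = np.array([0, 0, 1, 1])

# Weighted adjacency based on class-label relationships: same-class instance
# pairs receive a higher weight than cross-class pairs (weights assumed).
A = np.where(y[:, None] == y[None, :], 1.0, 0.25)

# Condition the one-hot data with the adjacency via an inner product,
# folding pairwise instance weights into a symmetric feature-space matrix.
M = X.T @ A @ X

# Eigendecomposition of the conditioned matrix; the top-k eigenvectors
# define the projection into the embedded space.
eigvals, eigvecs = np.linalg.eigh(M)
k = 2
W = eigvecs[:, -k:]

# Dense, dimension-reduced embedding of the original one-hot data.
Z = X @ W
print(Z.shape)  # (4, 2)
```

The key idea this sketch tries to capture is that label information is injected through the adjacency weights before the eigendecomposition, so the resulting projection is class-aware rather than purely variance-driven as in PCA.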
  • Born in Brooklyn, New York, U.S.A.
  • B.S., Florida Atlantic University, Boca Raton, Florida, 2009
  • M.S., Florida Atlantic University, Boca Raton, Florida, 2012
  • Ph.D., Florida Atlantic University, Boca Raton, Florida, 2018
Time in Preparation: 2016 - 2018

Qualifying Examination Passed: Fall 2015

Published Papers:
Eric Golinko and Xingquan Zhu. "Generalized Feature Embedding for Supervised, Unsupervised, and Online Learning Tasks," Information Systems Frontiers (2018): 1-18. (Impact Factor: 2.521).

Eric Golinko and Xingquan Zhu. "GFEL: Generalized feature embedding learning using weighted instance matching," 2017 IEEE International Conference on Information Reuse and Integration (IRI). San Diego, CA, August 4-7, 2017.

Eric Golinko, Thomas Sonderman, and Xingquan Zhu. "CNFL: Categorical to Numerical Feature Learning for Clustering and Classification," 2017 IEEE Second International Conference on Data Science in Cyberspace (DSC), Shenzhen, China, June 26-29, 2017.