Student Research Project

 

 

Title: Nonlinear multiregressions for data with both numerical and categorical attributes.

 

 

Adviser: Zhenyuan Wang

 

 

Description: A nonlinear multiregression model based on Choquet integrals with respect to generalized fuzzy measures has recently been established. The model can capture the interaction among mixed-type predictive attributes toward the objective attribute.
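As a rough illustration of the integral underlying the model, the following sketch computes a discrete Choquet integral over a small finite set of attributes. The function name, the dictionary-based representation of the set function, and the numbers are all assumptions made for this example; they are not taken from the referenced papers.

```python
def choquet_integral(f, mu):
    """Discrete Choquet integral of f with respect to the set function mu.

    f  : dict mapping attribute name -> value of the integrand
    mu : dict mapping frozenset of attribute names -> value of the set function
         (hypothetical representation; mu of the empty set is taken to be 0)
    """
    order = sorted(f, key=f.get)          # attributes by ascending value of f
    total = 0.0
    prev = 0.0                            # convention: f at "position 0" is 0
    for i, x in enumerate(order):
        upper = frozenset(order[i:])      # attributes whose value is >= f[x]
        total += (f[x] - prev) * mu[upper]
        prev = f[x]
    return total


# Illustrative numbers only: mu is not additive, so the two attributes interact.
f = {"x1": 3.0, "x2": 1.0}
mu = {
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.5,
    frozenset({"x1", "x2"}): 1.0,
}
value = choquet_integral(f, mu)           # 1.0 * 1.0 + 2.0 * 0.2

# The same code accepts negative values of mu, as a signed set function would allow.
mu_signed = dict(mu)
mu_signed[frozenset({"x1"})] = -0.3
value_signed = choquet_integral(f, mu_signed)   # 1.0 * 1.0 + 2.0 * (-0.3)
```

Because mu({x1, x2}) differs from mu({x1}) + mu({x2}), the result is not a weighted sum of the attribute values, which is the kind of interaction the description refers to.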

 

Three aspects of the above work can be improved: (1) Replace the generalized fuzzy measure with a signed fuzzy measure so that the regression describes the relation between the objective attribute and the predictive attributes more precisely. (2) Reduce the complexity of the genetic algorithm used to search for the optimal estimate of the regression coefficients by removing a portion of the unknown regression coefficients, namely the values of the signed fuzzy measure, from the chromosome. (3) When a categorical predictive attribute has more than two states, “numericalize” it by optimally projecting its states into a partially ordered space, rather than into a totally ordered space as in the previous work.
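The description does not say how the measure values would be removed from the chromosome in point (2). One possibility, offered here only as an assumption, relies on the fact that a discrete Choquet integral is linear in the values of the set function once the integrand is fixed. The sketch below (hypothetical function name and data) builds the coefficients of that linear form for one observation; with such rows for every observation, the measure values could in principle be estimated by ordinary least squares instead of being evolved genetically.

```python
def choquet_design_row(g):
    """Coefficients z such that the Choquet integral of g equals
    sum(z[A] * mu[A]) over subsets A, for any set function mu.

    g : dict mapping attribute name -> integrand value for one observation
    Returns a dict mapping frozenset -> coefficient (zero entries omitted).
    """
    order = sorted(g, key=g.get)          # attributes by ascending value of g
    row = {}
    prev = 0.0
    for i, x in enumerate(order):
        diff = g[x] - prev
        if diff != 0.0:
            row[frozenset(order[i:])] = diff
        prev = g[x]
    return row


# Illustrative observation with three predictive attributes.
g = {"x1": 3.0, "x2": 1.0, "x3": 2.0}
row = choquet_design_row(g)

# Any set function mu then gives the integral as a plain dot product.
mu = {
    frozenset({"x1", "x2", "x3"}): 1.0,
    frozenset({"x1", "x3"}): -0.4,        # signed values are allowed
    frozenset({"x1"}): 0.7,
}
integral = sum(coef * mu[subset] for subset, coef in row.items())
```

Only n of the 2^n - 1 nonempty subsets receive a nonzero coefficient for a given observation, so each row of the resulting linear system is sparse.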

 

 

References:

 

[1]  D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.

[2]  Z. Wang, A new genetic algorithm for nonlinear multiregressions based on generalized Choquet integrals, Proc. FUZZ-IEEE2003, 819-821.

[3]  Z. Wang and G. J. Klir, Fuzzy Measure Theory, Plenum Press, New York, 1992.

[4]  K. Xu, Z. Wang, M. L. Wong, and K. S. Leung, Discover dependency pattern among attributes by using a new type of nonlinear multiregression, International Journal of Intelligent Systems 16 (2001), 949-962.

 

 

Prerequisites: MATH 4300/8306, MATH 4310/8316, MATH 8520/9110, and a programming language (e.g., C++).

 

 

Requirements: Construct a mathematical model. Develop the relevant algorithm and implement it in a program. Run the program(s) on test data. Complete a research paper on this topic that can be submitted to an international conference or international journal before April 2005. Present the paper at the MAM in the spring of 2005.