Statistical Machine Learning: Big p, Big n and Complex Data   March 20, 2015, 11 AM

302-309

 

Abstract:

 

Modern statistics deals with complex data, especially in settings where the ambient dimension p of the problem may be of the same order as, or even substantially larger than, the sample size n. It is now well understood that even under this high-dimensional scaling, statistically consistent estimators can be obtained, provided one imposes structural constraints on the statistical models.
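One standard illustration of such a structural constraint (given here only as a sketch; the talk may consider other examples) is sparsity in linear regression. With an $s$-sparse true parameter $\beta^*$, the $\ell_1$-penalized (Lasso) estimator

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \ \frac{1}{2n}\,\|y - X\beta\|_2^2 + \lambda\,\|\beta\|_1$$

attains, under standard conditions on the design $X$ and with $\lambda \asymp \sqrt{\log p / n}$, an estimation error of order $\|\hat{\beta} - \beta^*\|_2 \lesssim \sqrt{s \log p / n}$. Consistent estimation is therefore possible even when $p \gg n$, as long as $s \log p = o(n)$.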

 

Graphical models are a key toolkit for modeling such multivariate high-dimensional data. However, the popular instances of graphical models are still not well suited to some data types (for example, count or skewed data), which arise especially in recently emerging "big data" applications in genomics, social networking, and economics, to name a few. I will show how we can extend the classical modeling toolkit from univariate distributions to multivariate graphical models for various types of data (including the case where all of these types are mixed).
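One common construction along these lines (sketched here under the assumption of exponential family node-conditionals; the talk may develop a different formulation) specifies each conditional distribution of a node given its neighbors as a univariate exponential family:

$$P(X_s \mid X_{\setminus s}) \ \propto \ \exp\Big\{ B(X_s)\Big(\theta_s + \sum_{t \in N(s)} \theta_{st}\, B(X_t)\Big) + C(X_s) \Big\},$$

where $B(\cdot)$ and $C(\cdot)$ are the sufficient statistic and base measure of the chosen univariate family and $N(s)$ is the neighborhood of node $s$. Taking the Gaussian, Bernoulli, or Poisson family recovers the Gaussian graphical model, the Ising model, or a Poisson-type graphical model suited to count data, respectively.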

 

If time allows, I will also briefly discuss other interesting issues, such as closed-form estimation and robust estimation for high-dimensional data.