Boston University Libraries OpenBU

    Some methods for robust inference in econometric factor models and in machine learning

    Date Issued
    2014
    Author(s)
    Nikolaev, Nikolay Ivanov
    Permanent Link
    https://hdl.handle.net/2144/14265
    Abstract
Traditional multivariate statistical theory and applications are often based on specific parametric assumptions. For example, it is often assumed that the data follow a (nearly) normal distribution. In practice such an assumption rarely holds, and the underlying data distribution is often unknown. Violations of the normality assumption can be detrimental to inference. In particular, two areas affected by violations of assumptions are quadratic discriminant analysis (QDA), used in classification, and principal component analysis (PCA), commonly employed in dimension reduction. Both PCA and QDA involve the computation of empirical covariance matrices of the data. In econometric and financial data, non-normality is often associated with heavy-tailed distributions, and such distributions can create significant problems in computing the sample covariance matrix. Furthermore, in PCA, non-normality may lead to erroneous decisions about the number of components to retain, due to unexpected behavior of the empirical covariance matrix eigenvalues.

In the first part of the dissertation, we consider the so-called number-of-factors problem in econometric and financial data, which concerns the number of sources of variation (latent factors) that are common to a set of variables observed multiple times (as in time series). The approach commonly used in the literature is PCA together with examination of the pattern of the associated eigenvalues. We employ an existing technique for robust principal component analysis, which produces properly estimated eigenvalues that are then used in an automatic inferential procedure to identify the number of latent factors. In a series of simulation experiments we demonstrate the superiority of our approach over other well-established methods.

In the second part of the dissertation, we discuss a method to normalize the data empirically so that classical QDA for binary classification can be used. In addition, we successfully overcome the usual issue of a large dimension-to-sample-size ratio through regularized estimation of precision matrices. Extensive simulation experiments demonstrate the advantages of our approach in terms of accuracy over other classification techniques. We illustrate the efficiency of our methods in both settings by applying them to real datasets from economics and bioinformatics.
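
The eigenvalue-based number-of-factors idea from the first part can be illustrated with a minimal sketch. The code below is not the dissertation's procedure: it uses the ordinary sample covariance where a robust estimator would be substituted, and a simple cumulative-explained-variance cutoff (the explained_var parameter, an assumption for illustration) in place of the automatic inferential rule. It only shows where the eigenvalues of the (robustly estimated) covariance matrix enter the decision about the number of latent factors.

import numpy as np

def estimate_num_factors(X, explained_var=0.90):
    """Estimate the number of common factors in T x N panel data X."""
    Xc = X - X.mean(axis=0)                         # center each observed series
    cov = np.cov(Xc, rowvar=False)                  # sample covariance; a robust
                                                    # estimator would be used instead
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]        # eigenvalues, descending
    cum_share = np.cumsum(eigvals) / eigvals.sum()          # cumulative variance share
    return int(np.searchsorted(cum_share, explained_var) + 1)

# Toy usage: 3 latent factors driving 50 series observed over 200 periods.
rng = np.random.default_rng(0)
F = rng.standard_normal((200, 3))                   # latent factors
L = rng.standard_normal((3, 50))                    # factor loadings
X = F @ L + 0.3 * rng.standard_normal((200, 50))    # observed panel with noise
print(estimate_num_factors(X))                      # should recover a value close to 3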
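
The classifier in the second part can likewise be sketched under simplifying assumptions. The snippet below implements plain binary QDA with a ridge term (the ridge parameter, an assumption for illustration) added to each class covariance before inversion; this stands in generically for regularized precision-matrix estimation, and the dissertation's empirical normalization step is omitted.

import numpy as np

def fit_qda(X, y, ridge=0.1):
    """Fit class means, regularized precisions, and log-priors for labels y in {0, 1}."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov_r = np.cov(Xc, rowvar=False) + ridge * np.eye(X.shape[1])  # ridge keeps cov invertible
        prec = np.linalg.inv(cov_r)                  # regularized precision matrix
        _, logdet = np.linalg.slogdet(cov_r)         # log-determinant for the QDA score
        params[c] = (mu, prec, logdet, np.log(Xc.shape[0] / X.shape[0]))
    return params

def predict_qda(params, X):
    """Assign each row of X to the class with the larger quadratic discriminant score."""
    scores = []
    for c in (0, 1):
        mu, prec, logdet, logprior = params[c]
        d = X - mu
        quad = np.einsum('ij,jk,ik->i', d, prec, d)  # per-row quadratic form
        scores.append(-0.5 * quad - 0.5 * logdet + logprior)
    return (scores[1] > scores[0]).astype(int)

# Toy usage: two 20-dimensional Gaussian classes with only 30 samples each.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((30, 20)), rng.standard_normal((30, 20)) + 0.8])
y = np.repeat([0, 1], 30)
print(predict_qda(fit_qda(X, y), X))                 # in-sample class assignments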
    Collections
    • Boston University Theses & Dissertations [6914]

