Package 'mvnmle'

Title: ML Estimation for Multivariate Normal Data with Missing Values
Description: Finds the Maximum Likelihood (ML) Estimate of the mean vector and variance-covariance matrix for multivariate normal data with missing values.
Authors: Kevin Gross [aut], Douglas Bates [aut], Mao Kobayashi [cre]
Maintainer: Mao Kobayashi <[email protected]>
License: GPL (>= 2)
Version: 0.1-11.2
Built: 2024-11-08 06:13:39 UTC
Source: https://github.com/indenkun/mvnmle

Help Index


Worm Infestations in Apple Crops

Description

The apple data frame provides the number of apples (in 100s) on 18 different apple trees. For 12 trees, the percentage of apples with worms (x 100) is also given.

Usage

apple

Format

This data frame contains the following columns:

size

hundreds of apples on the tree.

worms

percentage (x100) of apples harboring worms.

Source

Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.

Cochran, W. G., and Snedecor, G. W. (1972) Statistical Methods, 6th ed. Ames: Iowa State University Press, ISBN:0813815606.

Examples

library(mvnmle)
data(apple)

mlest(apple)

Create likelihood function for multivariate data with missing values.

Description

getclf returns a function proportional to twice the negative log likelihood function for multivariate normal data with missing values. This is a private function used in mlest.

Usage

getclf(data, freq)

Arguments

data

A data frame sorted so that records with identical patterns of missingness are grouped together.

freq

An integer vector specifying the number of records in each block of data with identical patterns of missingness.

Details

The argument of the returned function is the vector of parameters. The parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).
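The layout of the parameter vector can be illustrated with a minimal base R sketch. The helper `split_pars` is hypothetical and not part of mvnmle; it only assumes the ordering described above, for `k` variables and a vector of length k + k(k+1)/2.

```r
# Hypothetical helper, not part of mvnmle: split a full parameter vector
# for k variables into the three pieces described in the text.
split_pars <- function(pars, k) {
  stopifnot(length(pars) == k + k * (k + 1) / 2)
  list(
    mu       = pars[seq_len(k)],      # mean vector
    log_diag = pars[k + seq_len(k)],  # log of the diagonal of the inverse Cholesky factor
    off_diag = pars[-seq_len(2 * k)]  # elements above the main diagonal
  )
}

# k = 3 gives 3 + 3 + 3 = 9 parameters.
pieces <- split_pars(c(10, 20, 30, 0, 0.5, 1, -0.1, -0.2, -0.3), k = 3)
pieces$off_diag  # positions (1,2), (1,3), (2,3), in that order
```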

Value

A function proportional to twice the negative log likelihood of the parameters given the data.

References

Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.

See Also

mlest


Obtain starting values for maximum likelihood estimation.

Description

Calculates the starting values to be passed to nlm for minimization of the negative log-likelihood for multivariate normal data with missing values. This function is private to mlest.

Usage

getstartvals(x, eps = 0.001)

Arguments

x

Multivariate data, potentially with missing values.

eps

All eigenvalues of the variance-covariance matrix less than eps times the smallest positive eigenvalue are set to eps times the smallest positive eigenvalue.

Details

Starting values for the mean vector are simply sample means. Starting values for the variance-covariance matrix are derived from the sample variance-covariance matrix, after setting eigenvalues less than eps times the smallest positive eigenvalue equal to eps times the smallest positive eigenvalue to enforce positive definiteness.
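The eigenvalue adjustment can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the function name `adjust_cov` is hypothetical, and the use of pairwise-complete observations for the sample covariance is an assumption of this sketch.

```r
# Hypothetical sketch, not part of mvnmle.
adjust_cov <- function(x, eps = 0.001) {
  x <- as.matrix(x)
  mu0 <- colMeans(x, na.rm = TRUE)            # sample means, ignoring NAs
  s <- cov(x, use = "pairwise.complete.obs")  # assumption: pairwise covariance
  e <- eigen(s, symmetric = TRUE)
  # Floor: eps times the smallest positive eigenvalue.
  floor_val <- eps * min(e$values[e$values > 0])
  vals <- pmax(e$values, floor_val)           # raise small or negative eigenvalues
  s_adj <- e$vectors %*% diag(vals, nrow = length(vals)) %*% t(e$vectors)
  list(mean = mu0, cov = s_adj)
}

x <- cbind(a = c(1, 2, 3, 4, NA, 6),
           b = c(2, 1, NA, 3, 5, 4),
           c = c(0.5, NA, 1.5, 1, 2, 3))
sv <- adjust_cov(x)
eigen(sv$cov, symmetric = TRUE)$values  # all strictly positive after the adjustment
```

Because every eigenvalue of the adjusted matrix is positive, its Cholesky factor exists and can be inverted to produce the remaining starting values.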

Value

A numeric vector, containing the mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor of the adjusted sample variance-covariance matrix, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).

See Also

mlest


Make the upper triangular matrix del from a parameter vector

Description

make.del takes a parameter vector of length $k(k+1)/2$ and returns the upper triangular $k \times k$ matrix $\Delta$. make.del is a private function intended for use inside mlest.

Usage

make.del(pars)

Arguments

pars

A numeric vector of length $k(k+1)/2$ giving the elements of $\Delta$.

Details

The first $k$ elements of pars are the log of the diagonal elements of $\Delta$. The next $k(k-1)/2$ elements are the elements above the main diagonal of $\Delta$, ordered by column (left to right), and then by row within column (top to bottom). That is to say, if $\Delta_{ij}$ is the element in the $i$th row and $j$th column of $\Delta$, then the order of the parameters is $\Delta_{11}, \Delta_{22}, \ldots, \Delta_{kk}, \Delta_{12}, \Delta_{13}, \Delta_{23}, \Delta_{14}, \ldots, \Delta_{(k-1)k}$.
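A minimal base R sketch of this construction is given here. The name `make_del_sketch` is hypothetical; this is not the package's own implementation, only an illustration of the ordering described above.

```r
# Hypothetical sketch, not the mvnmle implementation.
make_del_sketch <- function(pars) {
  # Solve k(k+1)/2 = length(pars) for k.
  k <- (sqrt(8 * length(pars) + 1) - 1) / 2
  stopifnot(isTRUE(all.equal(k, round(k))))
  k <- round(k)
  # Diagonal entries are stored on the log scale.
  del <- diag(exp(pars[seq_len(k)]), nrow = k)
  # upper.tri() indexes column by column, top to bottom within a column,
  # which matches the documented ordering of the off-diagonal elements.
  del[upper.tri(del)] <- pars[-seq_len(k)]
  del
}

make_del_sketch(c(0, 0, 0, 12, 13, 23))
#      [,1] [,2] [,3]
# [1,]    1   12   13
# [2,]    0    1   23
# [3,]    0    0    1
```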

Value

An upper triangular $k \times k$ matrix.

References

Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.

See Also

mlest


A multivariate data set with missing values.

Description

The missvals data frame has 13 rows and 5 columns. These are data from Draper and Smith (1966, ISBN:0471221708), and are included to demonstrate Maximum Likelihood (ML) estimation of mean and variance-covariance parameters of multivariate normal data when some observations are missing.

Usage

missvals

Format

This data frame contains the following columns:

x1,x2,x3,x4,x5

numeric vectors

Source

Draper, N. R., and Smith, H. (1966) Applied Regression Analysis. New York: Wiley, ISBN:0471221708.

Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.

Rubin, D. B. (1976) Comparing regressions when some predictor values are missing. Technometrics 18, 201–205, doi:10.2307/1267523.

Examples

library(mvnmle)
data(missvals)

mlest(missvals, iterlim = 400)

ML Estimation of Multivariate Normal Data

Description

Finds the Maximum Likelihood (ML) Estimates of the mean vector and variance-covariance matrix for multivariate normal data with (potentially) missing values.

Usage

mlest(data, ...)

Arguments

data

A data frame or matrix containing multivariate normal data. Each row should correspond to an observation, and each column to a component of the multivariate vector. Missing values should be coded by 'NA'.

...

Optional arguments to be passed to the nlm optimization routine.

Details

The estimate of the variance-covariance matrix returned by mlest is necessarily positive semi-definite. Internally, nlm is used to minimize the negative log-likelihood, so optional arguments that modify the details of the minimization algorithm, such as iterlim, may be passed to nlm. The likelihood is specified in terms of the inverse of the Cholesky factor of the variance-covariance matrix (see Pinheiro and Bates (2000, ISBN:1441903178)).
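The role of the inverse Cholesky factor can be illustrated with a small round trip in base R. This sketch assumes R's `chol` convention, in which the factor `r` is upper triangular and `t(r) %*% r` reproduces the covariance matrix; the variable names are illustrative only.

```r
sigma <- matrix(c(4, 2, 2, 3), 2, 2)

r <- chol(sigma)                 # upper triangular, t(r) %*% r equals sigma
delta <- backsolve(r, diag(2))   # inverse of the Cholesky factor, also upper triangular

# Unconstrained parameters: log of the diagonal, then the above-diagonal elements.
theta <- c(log(diag(delta)), delta[upper.tri(delta)])

# Map the parameters back to a covariance matrix.
delta2 <- diag(exp(theta[1:2]), nrow = 2)
delta2[upper.tri(delta2)] <- theta[-(1:2)]
r2 <- backsolve(delta2, diag(2))
sigma2 <- t(r2) %*% r2

all.equal(sigma, sigma2)
```

Any real-valued `theta` maps back to a valid covariance matrix in this way, which is why an unconstrained optimizer such as nlm can be used.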

mlest cannot handle data matrices with more than 50 variables. Each variable must also be observed at least once.
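Both restrictions can be checked before calling mlest. The helper below is a hypothetical convenience written in base R, not a function provided by mvnmle.

```r
# Hypothetical pre-check, not part of mvnmle.
check_mlest_input <- function(data, max_vars = 50) {
  x <- as.matrix(data)
  c(
    few_enough_vars   = ncol(x) <= max_vars,            # at most 50 variables
    all_vars_observed = all(colSums(!is.na(x)) > 0)     # no column is entirely NA
  )
}

# Column b is never observed, so the second check fails.
ok <- check_mlest_input(data.frame(a = c(1, NA, 3), b = c(NA, NA, NA)))
ok
```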

Value

muhat

Maximum likelihood estimate (MLE) of the mean vector.

sigmahat

MLE of the variance-covariance matrix.

value

The value of the objective function that is minimized by nlm. It is proportional to twice the negative log-likelihood.

gradient

The gradient of the objective function at the MLE, in the parameterization used internally by the optimization algorithm. This parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).

stop.code

The stop code returned by nlm.

iterations

The number of iterations used by nlm.
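The components listed above can be inspected as in the following sketch. It assumes the mvnmle package is installed and is skipped otherwise; the interpretation of stop.code follows the codes documented for nlm.

```r
res <- NULL
if (requireNamespace("mvnmle", quietly = TRUE)) {
  data(apple, package = "mvnmle")
  res <- mvnmle::mlest(apple)

  print(res$muhat)       # estimated mean vector
  print(res$sigmahat)    # estimated variance-covariance matrix
  print(res$stop.code)   # see the nlm documentation for the meaning of each code
  print(res$iterations)  # number of nlm iterations used
}
```

If stop.code indicates that the iteration limit was reached, a larger iterlim can be supplied through the ... argument.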

References

Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.

Pinheiro, J. C., and Bates, D. M. (1996) Unconstrained parametrizations for variance-covariance matrices. Statistics and Computing 6, 289–296, doi:10.1007/BF00140873.

Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.

See Also

nlm

Examples

library(mvnmle)

data(apple)
mlest(apple)

data(missvals)
mlest(missvals, iterlim = 400)

Sort a multivariate data matrix according to patterns of missingness.

Description

mysort sorts a multivariate data matrix so that records with identical patterns of missingness are adjacent to one another. mysort is a private function used inside of mlest.

Usage

mysort(x)

Arguments

x

A multivariate data matrix. Rows correspond to individual records and columns correspond to components of the multivariate vector.

Value

sorted.data

A matrix of the same size as x but with the rows re-arranged so that records with identical patterns of missingness are adjacent to one another.

freq

An integer vector giving the number of records in each block of rows with a unique pattern of missingness. The first element in freq counts the number of rows in the top block of sorted.data, and so on.
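The grouping can be sketched in base R as follows. The function `sort_by_missingness` is hypothetical and is not the package's mysort; in particular, the order in which the blocks appear here is an assumption of this sketch and may differ from the order mysort produces.

```r
# Hypothetical sketch, not the mvnmle implementation.
sort_by_missingness <- function(x) {
  x <- as.matrix(x)
  # Encode each row's pattern as a string such as "01" (1 = missing).
  pattern <- apply(is.na(x), 1, function(r) paste(as.integer(r), collapse = ""))
  ord <- order(pattern)
  list(
    sorted.data = x[ord, , drop = FALSE],
    freq = rle(pattern[ord])$lengths  # block sizes, top block first
  )
}

x <- rbind(c(1, NA), c(2, 3), c(NA, 4), c(5, NA), c(6, 7))
s <- sort_by_missingness(x)
s$sorted.data  # complete rows first, then rows missing column 2, then column 1
s$freq         # 2 2 1
```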

See Also

mlest