Title: | ML Estimation for Multivariate Normal Data with Missing Values |
---|---|
Description: | Finds the Maximum Likelihood (ML) Estimate of the mean vector and variance-covariance matrix for multivariate normal data with missing values. |
Authors: | Kevin Gross [aut], Douglas Bates [aut], Mao Kobayashi [cre] |
Maintainer: | Mao Kobayashi <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1-11.2 |
Built: | 2024-11-08 06:13:39 UTC |
Source: | https://github.com/indenkun/mvnmle |
The apple data frame provides the number of apples (in 100s) on 18 different apple trees. For 12 trees, the percentage of apples with worms (x 100) is also given.
apple
This data frame contains the following columns:
hundreds of apples on the tree.
percentage (x100) of apples harboring worms.
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Cochran, W. G., and Snedecor, G. W. (1972) Statistical Methods, 6th ed. Ames: Iowa State University Press, ISBN:0813815606.
library(mvnmle)
data(apple)
mlest(apple)
getclf returns a function proportional to twice the negative log likelihood function for multivariate normal data with missing values. This is a private function used in mlest.
getclf(data, freq)
data |
A data frame sorted so that records with identical patterns of missingness are grouped together. |
freq |
An integer vector specifying the number of records in each block of data with identical patterns of missingness. |
The argument of the returned function is the vector of parameters. The parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).
A function proportional to twice the negative log likelihood of the parameters given the data.
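The parameterization above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `neg2loglik_sketch` is a hypothetical name, and the row-by-row loop ignores the grouping by missingness pattern that `getclf` relies on. It assumes the Cholesky convention of R's `chol()` (variance-covariance matrix equal to `t(R) %*% R`), unpacks a parameter vector in the order described, and sums, over rows, the log determinant plus the quadratic form for the observed components only.

```r
# Hypothetical helper, not part of mvnmle: evaluate an objective proportional
# to twice the negative log likelihood, using only the observed entries of
# each row.
neg2loglik_sketch <- function(pars, data) {
  data <- as.matrix(data)
  p <- ncol(data)
  mu <- pars[seq_len(p)]
  # Upper triangular inverse Cholesky factor: exp(log-diagonal) on the
  # diagonal, remaining parameters fill the upper triangle column by column.
  Delta <- diag(exp(pars[p + seq_len(p)]), nrow = p)
  Delta[upper.tri(Delta)] <- pars[-seq_len(2 * p)]
  # Assumed convention: Sigma = t(R) %*% R with R = solve(Delta).
  Sigma <- crossprod(solve(Delta))
  total <- 0
  for (i in seq_len(nrow(data))) {
    obs <- !is.na(data[i, ])
    if (!any(obs)) next  # a fully missing row carries no information
    S <- Sigma[obs, obs, drop = FALSE]
    r <- data[i, obs] - mu[obs]
    log_det <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    total <- total + log_det + sum(r * solve(S, r))
  }
  total
}

# Two variables, zero mean, identity covariance; one value is missing.
dat <- matrix(c(1, NA, 2, 3), nrow = 2, byrow = TRUE)
val <- neg2loglik_sketch(c(0, 0, 0, 0, 0), dat)
```

A function of this shape could be handed to `nlm` together with a start vector, which is the role the function returned by `getclf` plays inside `mlest`.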
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Calculates the starting values to be passed to nlm for minimization of the negative log-likelihood for multivariate normal data with missing values. This function is private to mlest.
getstartvals(x, eps = 0.001)
x |
Multivariate data, potentially with missing values. |
eps |
All eigenvalues of the variance-covariance matrix less than
|
Starting values for the mean vector are simply sample means. Starting values for the variance-covariance matrix are derived from the sample variance-covariance matrix, after setting eigenvalues less than eps times the smallest positive eigenvalue equal to eps times the smallest positive eigenvalue to enforce positive definiteness.
A numeric vector, containing the mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor of the adjusted sample variance-covariance matrix, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).
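The eigenvalue adjustment and the packing order can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `startvals_sketch` is a hypothetical name, pairwise-complete observations are assumed for the sample covariance, and the Cholesky factor is taken in the upper triangular form returned by `chol()`.

```r
# Hypothetical helper, not the package's getstartvals().
startvals_sketch <- function(x, eps = 0.001) {
  x <- as.matrix(x)
  p <- ncol(x)
  mu0 <- colMeans(x, na.rm = TRUE)            # sample means, ignoring NAs
  S <- cov(x, use = "pairwise.complete.obs")  # assumption: pairwise deletion
  eig <- eigen(S, symmetric = TRUE)
  # Raise every eigenvalue below eps * (smallest positive eigenvalue).
  floor_val <- eps * min(eig$values[eig$values > 0])
  vals <- pmax(eig$values, floor_val)
  S_pd <- eig$vectors %*% diag(vals, nrow = p) %*% t(eig$vectors)
  # Inverse of the upper triangular Cholesky factor of the adjusted matrix.
  Delta <- solve(chol(S_pd))
  # Means, then log-diagonal, then upper triangle column by column.
  c(mu0, log(diag(Delta)), Delta[upper.tri(Delta)])
}

x <- matrix(c(1, 2, 3, 4,
              2, 1, NA, 3), ncol = 2)
sv <- startvals_sketch(x)
```

For two variables the result has five entries: two means, two log-diagonal terms, and one off-diagonal term.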
make.del takes a parameter vector of length $p(p+1)/2$ and returns the upper triangular $p \times p$ matrix $\Delta$. make.del is a private function intended for use inside mlest.
make.del(pars)
pars |
A length |
The first $p$ elements of pars are the log of the diagonal elements of $\Delta$. The next $p(p-1)/2$ elements are the elements above the main diagonal of $\Delta$, ordered by column (left to right), and then by row within column (top to bottom). That is to say, if $\Delta_{ij}$ is the element in the $i$th row and $j$th column of $\Delta$, then the order of the parameters is $\Delta_{12}, \Delta_{13}, \Delta_{23}, \Delta_{14}, \ldots$.
An upper triangular matrix.
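A minimal sketch of this construction in base R is given below. `make_del_sketch` is a hypothetical stand-in written for illustration, not the package's `make.del`. It assumes the vector holds the log-diagonal entries first and the above-diagonal entries after them, so that a vector of length p(p+1)/2 yields a p-by-p matrix.

```r
# Hypothetical stand-in for illustration; not the package's make.del().
make_del_sketch <- function(pars) {
  # Solve p * (p + 1) / 2 = length(pars) for the matrix dimension p.
  p <- (sqrt(8 * length(pars) + 1) - 1) / 2
  stopifnot(p == round(p))
  # Diagonal entries are stored on the log scale.
  Delta <- diag(exp(pars[seq_len(p)]), nrow = p)
  # Logical indexing fills column by column, top to bottom within a column,
  # which matches the ordering described above.
  Delta[upper.tri(Delta)] <- pars[-seq_len(p)]
  Delta
}

# Off-diagonal values are chosen so that each one names its own position.
D <- make_del_sketch(c(0, 0, 0, 12, 13, 23))
```

Storing the diagonal on the log scale keeps it strictly positive for any real parameter value, so an unconstrained optimizer can move freely over the whole parameter space.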
Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.
The missvals data frame has 13 rows and 5 columns. These are data from Draper and Smith (1966, ISBN:0471221708), and are included to demonstrate Maximum Likelihood (ML) estimation of mean and variance-covariance parameters of multivariate normal data when some observations are missing.
missvals
This data frame contains the following columns:
numeric vectors
Draper, N. R., and Smith, H. (1966) Applied Regression Analysis. New York: Wiley, ISBN:0471221708.
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Rubin, D. B. (1976) Comparing regressions when some predictor values are missing. Technometrics 18, 201–205, doi:10.2307/1267523.
library(mvnmle)
data(missvals)
mlest(missvals, iterlim = 400)
Finds the Maximum Likelihood (ML) Estimates of the mean vector and variance-covariance matrix for multivariate normal data with (potentially) missing values.
mlest(data, ...)
data |
A data frame or matrix containing multivariate normal data. Each row should correspond to an observation, and each column to a component of the multivariate vector. Missing values should be coded by 'NA'. |
... |
Optional arguments to be passed to the nlm optimization routine. |
The estimate of the variance-covariance matrix returned by mlest is necessarily positive semi-definite. Internally, nlm is used to minimize the negative log-likelihood, so optional arguments may be passed to nlm to modify the details of the minimization algorithm, such as iterlim. The likelihood is specified in terms of the inverse of the Cholesky factor of the variance-covariance matrix (see Pinheiro and Bates (2000, ISBN:1441903178)).
mlest cannot handle data matrices with more than 50 variables. Each variable must also be observed at least once.
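The way optional arguments reach the optimizer can be sketched with a small wrapper around `nlm`. The wrapper `fit_sketch` and the toy quadratic objective are assumptions made for illustration and are not taken from the package; only `nlm` and its `iterlim` argument are real.

```r
# Hypothetical wrapper: anything supplied through ... is forwarded to nlm().
fit_sketch <- function(objective, start, ...) {
  nlm(objective, start, ...)
}

# Toy objective with its minimum at (1, 2); iterlim caps the iteration count.
toy_objective <- function(b) sum((b - c(1, 2))^2)
fit <- fit_sketch(toy_objective, c(0, 0), iterlim = 400)

fit$estimate    # parameter values at the minimum found
fit$code        # nlm's integer code describing why it stopped
fit$iterations  # number of iterations performed
```

Raising `iterlim` above `nlm`'s default of 100 is useful when the optimizer stops on the iteration limit before converging, which is presumably why the missvals example passes `iterlim = 400`.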
muhat |
Maximum likelihood estimate (MLE) of the mean vector. |
sigmahat |
MLE of the variance-covariance matrix. |
value |
The objective function that is minimized by |
gradient |
The gradient of the objective function at the MLE, in the parameterization used internally by the optimization algorithm. This parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom). |
stop.code |
The stop code returned by |
iterations |
The number of iterations used by |
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Pinheiro, J. C., and Bates, D. M. (1996) Unconstrained parametrizations for variance-covariance matrices. Statistics and Computing 6, 289–296, doi:10.1007/BF00140873.
Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.
library(mvnmle)
data(apple)
mlest(apple)
data(missvals)
mlest(missvals, iterlim = 400)
mysort sorts a multivariate data matrix so that records with identical patterns of missingness are adjacent to one another. mysort is a private function used inside of mlest.
mysort(x)
x |
A multivariate data matrix. Rows correspond to individual records and columns correspond to components of the multivariate vector. |
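One way to group rows by missingness pattern is sketched below in base R. `sort_by_missingness_sketch` is a hypothetical name and the string encoding of patterns is an assumption chosen for brevity; the package's own `mysort` may order the blocks differently.

```r
# Hypothetical helper for illustration; not the package's mysort().
sort_by_missingness_sketch <- function(x) {
  x <- as.matrix(x)
  # Encode each row's pattern as a string, e.g. "10" = first column missing.
  pattern <- apply(is.na(x), 1, function(r) paste(as.integer(r), collapse = ""))
  ord <- order(pattern)
  # Run lengths of the sorted patterns give the size of each block.
  runs <- rle(pattern[ord])
  list(sorted.data = x[ord, , drop = FALSE], freq = runs$lengths)
}

x <- matrix(c(1, NA, 3, NA,
              5, 6, 7, 8), ncol = 2)
res <- sort_by_missingness_sketch(x)
```

Here rows 1 and 3 are complete and rows 2 and 4 lack the first column, so the sorted matrix holds two blocks of two rows each.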
sorted.data |
A matrix of the same size as |
freq |
An integer vector giving the number of records in each
block of rows with a unique pattern of missingness. The first
element in |