ghyper {SuppDists} | R Documentation |
Density, distribution function, quantile function, random generator and summary function for generalized hypergeometric distributions.
dghyper(x, a, k, N, log=FALSE)
pghyper(q, a, k, N, lower.tail=TRUE, log.p=FALSE)
qghyper(p, a, k, N, lower.tail=TRUE, log.p=FALSE)
rghyper(n, a, k, N)
sghyper(a, k, N)
tghyper(a, k, N) ## scalar arguments only
x,q,n |
vector of non-negative integer quantities |
p |
vector of probabilities |
a |
vector of real values giving the first column total |
k |
vector of real values giving the first row total |
N |
vector of real values giving the grand total |
log, log.p |
logical vector; if TRUE, probabilities p are given as log(p) |
lower.tail |
logical vector; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x] |
The basic representation is in terms of a two-way table:
x | k-x | k |
a-x | b-k+x | N-k |
a | b | N |
and the associated hypergeometric probability P(x) = choose(a, x) choose(b, k-x) / choose(N, k), where b = N - a.
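As a rough illustration of this formula, the probability can be evaluated directly with base R's choose(), which accepts real values for its first argument when the second is an integer. The helper name dghyper_sketch is hypothetical and not part of SuppDists; it assumes k is an integer and omits the range checks a full implementation would need.

```r
# Hypothetical helper (not part of SuppDists): evaluates
# P(x) = choose(a, x) choose(b, k - x) / choose(N, k) with b = N - a.
# Assumes k is an integer; base R's choose() rounds a non-integer second argument.
dghyper_sketch <- function(x, a, k, N) {
  b <- N - a
  choose(a, x) * choose(b, k - x) / choose(N, k)
}

# Classic integer case: a = 4, k = 4, N = 10, so x ranges over 0..4.
x <- 0:4
p_classic <- dghyper_sketch(x, a = 4, k = 4, N = 10)

# Base R's dhyper(x, m, n, k) corresponds to m = a, n = N - a, sample size k.
p_base <- dhyper(x, m = 4, n = 6, k = 4)

print(all.equal(p_classic, p_base))
print(sum(p_classic))
```

In the classic case the two vectors should agree up to floating-point error, and the probabilities should sum to one.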
The table is constrained so that rows and columns add to the margins. In all cases x is a non-negative integer, but meaningful probability distributions occur when the other parameters are real. Johnson, Kotz and Kemp (1992) give a general discussion.
Kemp and Kemp (1956) classify the possible probability distributions that can occur when real values are allowed into eight types. The classic hypergeometric with integer values forms a ninth type. Five of the eight types correspond to known distributions used in various contexts. Three of the eight types appear to have no practical applications, but they have been implemented for completeness.
The Kemp and Kemp types are defined in terms of the ranges of the a, k, and N parameters and are given in ghyper.types. The function tghyper() will give details for specific values of a, k, and N.
These distributions apply to many important problems, which has led to a variety of names:
The Kemp and Kemp types IIA and IIIA are known as:
The advantages of the conditional argument are considerable. Consider a few examples:
Example: Suppose Toronto has won 3 games and Atlanta 1 in the World Series. What is the probability that Toronto will win the series by taking 2 or more of the remaining 3 games?
Example: Suppose that only once in the last century has the high-water mark at the St. Joe bridge exceeded 12 feet. What is the probability that it will not do so in the next ten years?
Example: Suppose a lot of 100 contains 5 defectives. What is the mean number of items that must be inspected before a defective item is found?
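The three questions above can be worked through directly from the probability formula. The sketch below uses only base R and takes the parameter triples from this page's example code, (a, k, N) = (-4, 3, -6), (-2, 10, -101) and (-1, 95, -6). Pairing them with the three questions in order is an assumption, as is the reading of x given in the comments. The helper ghyper_pmf is hypothetical, not part of SuppDists, and requires an integer k.

```r
# Hypothetical helper (not part of SuppDists): pmf from the two-way table
# formula, usable here because k is an integer in all three examples.
ghyper_pmf <- function(x, a, k, N) {
  choose(a, x) * choose(N - a, k - x) / choose(N, k)
}

# World Series: x is read as the number of the remaining 3 games Toronto wins.
x <- 0:3
p_series <- ghyper_pmf(x, a = -4, k = 3, N = -6)
p_two_or_more <- sum(p_series[x >= 2])

# High-water mark: probability of no exceedance (x = 0) in the next 10 years.
p_no_exceed <- ghyper_pmf(0, a = -2, k = 10, N = -101)

# Inspection: x is read as the number of good items seen before the first
# defective, so the mean number of items inspected is E[x] + 1.
x_good <- 0:95
p_good <- ghyper_pmf(x_good, a = -1, k = 95, N = -6)
mean_inspected <- sum(x_good * p_good) + 1

print(c(p_two_or_more, p_no_exceed, mean_inspected))
```

Under these assumptions the three results work out to 5/7, 90/109 and 101/6 respectively (about 0.714, 0.826 and 16.83).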
Names for Kemp and Kemp type IV are:
One application is accidents:
Suppose accidents follow a Poisson distribution with mean L, and suppose L varies with individuals according to accident proneness, m. In particular, suppose L follows a gamma distribution with parameter r and scale factor m, and that the scale factor m itself follows a beta distribution with parameters A and B. Then the distribution of accidents, x, is beta-negative-binomial with a = -B, k = -r, and N = A - 1. See Xekalaki (1983) for a discussion of this as well as a discussion of accident models for proneness, contagion and spells.
The output values conform to the output from other such functions in R. dghyper() gives the density, pghyper() the distribution function and qghyper() its inverse. rghyper() generates random numbers. sghyper() produces a list containing parameters corresponding to the arguments: mean, median, mode, variance, sd, third central moment, fourth central moment, Pearson's skewness, skewness, and kurtosis.
The function tghyper() returns the hypergeometric type and the range of values for x.
The parameters of these functions differ from those of the hypergeometric functions of R. To translate between the two use the following as a model: phyper(x,m,n,k) = pghyper(x,k,m,m+n).
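The translation rule can be checked numerically without SuppDists by evaluating the probability formula in base R. The sketch below is only an illustration: pghyper_sketch is a hypothetical stand-in for pghyper() restricted to integer arguments, and the values m = 4, n = 6, k = 3 are arbitrary.

```r
# Hypothetical stand-in for pghyper() (not the package implementation):
# cumulative sum of choose(a, x) choose(N - a, k - x) / choose(N, k),
# restricted to the classic case with integer a, k and N.
pghyper_sketch <- function(q, a, k, N) {
  sapply(q, function(qi) {
    x <- 0:qi
    sum(choose(a, x) * choose(N - a, k - x) / choose(N, k))
  })
}

# Arbitrary illustrative values in base R's phyper() parameterisation.
m <- 4; n <- 6; k <- 3
x <- 0:3

lhs <- phyper(x, m, n, k)               # base R
rhs <- pghyper_sketch(x, k, m, m + n)   # arguments ordered as pghyper(x, k, m, m + n)

print(all.equal(lhs, rhs))
```

The agreement reflects the symmetry of the hypergeometric distribution in its row and column totals, which is why the sample size k of phyper() can take the place of a.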
Bob Wheeler bwheelerg@gmail.com
Johnson, N.L., Kotz, S. and Kemp, A. (1992). Univariate discrete distributions. Wiley, N.Y.
Kemp, C.D., and Kemp, A.W. (1956). Generalized hypergeometric distributions. Jour. Roy. Statist. Soc. B. 18. 202-211.
Xekalaki, E. (1983). The univariate generalized Waring distribution in relation to accident theory: proneness, spells or contagion. Biometrics. 39. 887-895.
tghyper(a=4, k=4, N=10)            ## classic
tghyper(a=4.1, k=5, N=10)          ## type IA(i) Real classic
tghyper(a=5, k=4.1, N=10)          ## type IA(ii) Real classic
tghyper(a=4.2, k=4.6, N=12.2)      ## type IB
tghyper(a=-5.1, k=10, N=-7)        ## type IIA
tghyper(a=-0.5, k=5.9, N=-0.7)     ## type IIB
tghyper(a=10, k=-5.1, N=-7)        ## type IIIA Negative hypergeometric
tghyper(a=5.9, k=-0.5, N=-0.7)     ## type IIIB
tghyper(a=-1, k=-1, N=5)           ## type IV Generalized Waring
sghyper(a=-1, k=-1, N=5)
plot(function(x)dghyper(x,a=-1,k=-1,N=5),0,5)
#Fisher's exact test: contingency table with rows (1,3),(3,1)
pghyper(1,4,4,8)
pghyper(3,4,4,8,lower.tail=FALSE)
#Beta-binomial applications:
#Application examples:
tghyper(-4,3,-6)
pghyper(2,-4,3,-6,lower.tail=FALSE)
pghyper(0,-2,10,-101)
sghyper(-1,95,-6)$Mean+1