summaryP {Hmisc} | R Documentation |
summaryP produces a tall and thin data frame containing numerators (freq) and denominators (denom) after stratifying the data by a series of variables. A special capability to group a series of related yes/no variables is included through the use of the ynbind function, for which the user specifies a final argument label used to label the panel created for that group of related variables.
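As a rough illustration of the tall and thin layout described above, the following base-R sketch tabulates one categorical variable within strata and stores a numerator and a denominator for each level. It does not call summaryP and is not its implementation. The helper name tall_props and the toy data are hypothetical; only the column names val, var, freq and denom are borrowed from this page.

```r
# Hypothetical helper (not part of Hmisc): numerators and denominators
# for the levels of one variable within each level of one stratifier.
tall_props <- function(data, var, by) {
  tab <- table(data[[by]], data[[var]])          # strata x levels counts
  out <- as.data.frame(tab, stringsAsFactors = FALSE)
  names(out) <- c(by, "val", "freq")             # freq = numerator
  denom <- rowSums(tab)                          # observations per stratum
  out$denom <- as.vector(denom[out[[by]]])       # look up denom by stratum name
  out$var <- var                                 # name of the summarized variable
  out[, c(by, "var", "val", "freq", "denom")]
}

# Toy data, invented for this sketch
toy <- data.frame(
  sex    = c("Female", "Male", "Male", "Female", "Male", "Female"),
  region = c("Europe", "Europe", "Europe",
             "North America", "North America", "North America")
)

tp <- tall_props(toy, var = "sex", by = "region")
tp$prop <- tp$freq / tp$denom                    # proportion = freq / denom
print(tp)
```

Each row carries its own denominator, so proportions can be recomputed for any subset of rows without going back to the raw data.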
The plot method for summaryP displays proportions as a multi-panel dot chart using the lattice package's dotplot function with a special panel function. Numerators and denominators of proportions are also included as text, in the same colors as used by an optional groups variable. The formula argument used in the dotplot call is constructed, but the user can easily reorder the variables by specifying formula, with elements named val (category levels), var (classification variable name), freq (calculated result) plus the overall cross-classification variables excluding groups.
The latex method produces one or more LaTeX tabulars containing a table representation of the result, with optional side-by-side display if groups is specified. Multiple tabulars result from the presence of non-group stratification factors.
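To make the idea of a tabular representation concrete, here is a minimal base-R sketch that formats a tall freq/denom data frame as the lines of a LaTeX tabular. It is only an assumption about what such output could look like, not the layout that the latex method actually emits. The helper to_tabular, its column headings and the toy data are hypothetical.

```r
# Hypothetical helper (not part of Hmisc): one tabular row per category level,
# showing the proportion and the underlying fraction.
to_tabular <- function(d, digits = 3) {
  prop <- formatC(d$freq / d$denom, format = "f", digits = digits)
  rows <- sprintf("%s & %s & $\\frac{%d}{%d}$ \\\\",
                  d$val, prop, as.integer(d$freq), as.integer(d$denom))
  c("\\begin{tabular}{lrr}",
    "Level & Proportion & Fraction \\\\ \\hline",
    rows,
    "\\end{tabular}")
}

# Toy data, invented for this sketch
toy_tab <- data.frame(val = c("Female", "Male"), freq = c(1, 2), denom = c(3, 3))
tex <- to_tabular(toy_tab)
cat(tex, sep = "\n")
```

The digits argument plays the same role here as the number of decimal places for proportions controlled by round in the latex method.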
summaryP(formula, data = NULL, subset = NULL,
         na.action = na.retain, exclude1=TRUE, sort=TRUE,
         asna = c("unknown", "unspecified"), ...)

## S3 method for class 'summaryP'
plot(x, formula, groups=NULL, xlim = c(-.05, 1.05), text.at=NULL,
     cex.values = 0.5,
     key = list(columns = length(groupslevels), x = 0.75, y = -0.04,
                cex = 0.9, col = trellis.par.get('superpose.symbol')$col,
                corner=c(0,1)),
     outerlabels=TRUE, autoarrange=TRUE, ...)

## S3 method for class 'summaryP'
latex(object, groups=NULL, file='', round=3, size=NULL, append=TRUE, ...)
formula |
a formula with the variables for whose levels
proportions are computed on the left hand side, and major
classification variables on the right. The formula needs to include
any variable later used as |
data |
an optional data frame |
subset |
an optional subsetting expression or vector |
na.action |
function specifying how to handle |
exclude1 |
By default, |
sort |
set to |
asna |
character vector specifying level names to consider the
same as |
x |
an object produced by |
groups |
a character string containing the name of a superpositioning variable for obtaining further stratification within a horizontal line in the dot chart. |
xlim |
|
text.at |
specify to leave unused space to the right of each
panel to prevent numerators and denominators from touching data
points. |
cex.values |
character size to use for plotting numerators and denominators |
key |
a list to pass to the |
outerlabels |
by default if there are two conditioning variables
besides |
autoarrange |
If |
... |
ignored |
object |
an object produced by |
file |
file name, defaults to writing to console |
round |
number of digits to the right of the decimal place for proportions |
size |
optional font size such as |
append |
set to |
summaryP produces a data frame of class "summaryP". The plot method produces a lattice object of class "trellis". The latex method produces an object of class "latex" with an additional attribute ngrouplevels specifying the number of levels of any groups variable.
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
bpplotM, summaryM, ynbind, pBlock
n <- 100
f <- function(na=FALSE) {
  x <- sample(c('N', 'Y'), n, TRUE)
  if(na) x[runif(100) < .1] <- NA
  x
}
set.seed(1)
d <- data.frame(x1=f(), x2=f(), x3=f(), x4=f(), x5=f(), x6=f(), x7=f(TRUE),
                age=rnorm(n, 50, 10),
                race=sample(c('Asian', 'Black/AA', 'White'), n, TRUE),
                sex=sample(c('Female', 'Male'), n, TRUE),
                treat=sample(c('A', 'B'), n, TRUE),
                region=sample(c('North America','Europe'), n, TRUE))
d <- upData(d, labels=c(x1='MI', x2='Stroke', x3='AKI', x4='Migraines',
                        x5='Pregnant', x6='Other event', x7='MD withdrawal',
                        race='Race', sex='Sex'))
dasna <- subset(d, region=='North America')
with(dasna, table(race, treat))
s <- summaryP(race + sex + ynbind(x1, x2, x3, x4, x5, x6, x7, label='Exclusions') ~
              region + treat, data=d)
# add exclude1=FALSE to include female category
plot(s, groups='treat')

plot(s, val ~ freq | region * var, groups='treat', outerlabels=FALSE)
# Much better looking if omit outerlabels=FALSE; see output at
# http://biostat.mc.vanderbilt.edu/HmiscNew#summaryP
# See more examples under bpplotM

# Make a chart where there is a block of variables that
# are only analyzed for males. Keep redundant sex in block for demo.
# Leave extra space for numerators, denominators
sb <- summaryP(race + sex +
               pBlock(race, sex, label='Race: Males', subset=sex=='Male') ~
               region, data=d)
plot(sb, text.at=1.3)
plot(sb, groups='region', layout=c(1,3), key=list(space='top'),
     text.at=1.15)

## Not run:
plot(s, groups='treat')
# plot(s, groups='treat', outerlabels=FALSE) for standard lattice output
plot(s, groups='region', key=list(columns=2, space='bottom'))
plot(summaryP(race + sex ~ region, data=d, exclude1=FALSE), col='green')

# Make your own plot using data frame created by summaryP
useOuterStrips(dotplot(val ~ freq | region * var, groups=treat, data=s,
        xlim=c(0,1), scales=list(y='free', rot=0), xlab='Fraction',
        panel=function(x, y, subscripts, ...) {
          denom <- s$denom[subscripts]
          x <- x / denom
          panel.dotplot(x=x, y=y, subscripts=subscripts, ...)
        }))

# Show marginal summary for all regions combined
s <- summaryP(race + sex ~ region, data=addMarginal(d, region))
plot(s, groups='region', key=list(space='top'), layout=c(1,2))

# Show marginal summaries for both race and sex
s <- summaryP(ynbind(x1, x2, x3, x4, label='Exclusions', sort=FALSE) ~
              race + sex, data=addMarginal(d, race, sex))
plot(s, val ~ freq | sex*race)

## End(Not run)