Efficient estimation of smooth distributions from coarsely grouped data.
Bottom Line: Optimal values of the smoothing parameter are chosen by minimizing Akaike's Information Criterion. We demonstrate the performance of this method in a simulation study and provide several examples that illustrate the approach. Wide, open-ended intervals can be handled properly.
Mentions: Consider a sequence of values $a_1, \ldots, a_J$ (for example, ages $a_1 = 0$, $a_2 = 1$, …) and let $\gamma_j$, $j = 1, \ldots, J$, be the corresponding expected counts that constitute the distribution of the values $a_j$. In a sample of size $N$, each $\gamma_j = N p_j$, where $p_j$ is the probability of the value $a_j$. If the sample were not grouped, then the number of observations at each $a_j$ would follow a Poisson distribution with mean $\gamma_j$ (13). However, although we would like to estimate the distribution $\gamma = (\gamma_1, \ldots, \gamma_J)^T$, the actually observed grouped counts $y_i$ are realizations of Poisson variables $Y_i$, $i = 1, \ldots, I$, whose expected values $\mu_i = E(Y_i)$ result from grouping the original distribution $\gamma$ into $I < J$ bins (Figure 1). Each $\mu_i$ is the sum of those values $\gamma_j$ that contribute to bin $i$ of the histogram, and the observed counts have the probability

$$P(Y_i = y_i) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!}.$$

If we combine the $\mu_i$ into a vector $\mu = (\mu_1, \ldots, \mu_I)^T$, we can write $\mu = C\gamma$, with $C$ being the $I \times J$ composition matrix

$$C = \begin{pmatrix}
1 \cdots 1 & 0 \cdots 0 & \cdots & 0 \cdots 0 \\
0 \cdots 0 & 1 \cdots 1 & \cdots & 0 \cdots 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 \cdots 0 & 0 \cdots 0 & \cdots & 1 \cdots 1
\end{pmatrix},$$

in which row $i$ has ones in the columns $j$ whose values $a_j$ fall into bin $i$ and zeros elsewhere.
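The relation between the fine-grid expected counts and the grouped means can be made concrete with a small sketch. The Python code below is an illustration only, not the authors' implementation: the function names `composition_matrix` and `group_means`, the convention of describing each bin by its starting index, and the numbers in `gamma` are all assumptions made for this example. It builds a 0/1 composition matrix for J = 6 values grouped into I = 3 bins and computes the grouped means as the matrix-vector product.

```python
def composition_matrix(bin_starts, n_values):
    """Build the I x J 0/1 matrix C; row i marks the fine-grid indices in bin i.

    bin_starts gives the first fine-grid index of each bin (a convention
    assumed for this sketch); the last bin runs to the end of the grid.
    """
    starts = list(bin_starts)
    ends = starts[1:] + [n_values]
    return [[1 if lo <= j < hi else 0 for j in range(n_values)]
            for lo, hi in zip(starts, ends)]


def group_means(C, gamma):
    """Compute mu = C gamma: each mu_i sums the gamma_j that fall in bin i."""
    return [sum(c * g for c, g in zip(row, gamma)) for row in C]


# Made-up expected counts on the fine grid (J = 6), grouped into I = 3 bins.
gamma = [4.0, 3.0, 2.0, 1.0, 0.5, 0.5]
C = composition_matrix([0, 2, 4], len(gamma))
mu = group_means(C, gamma)

for row in C:
    print(row)
print(mu)
```

Because every fine-grid value belongs to exactly one bin, each column of C contains a single one, so grouping preserves the total expected count: the entries of `mu` sum to the same total as the entries of `gamma`.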