## Statistical Analysis of Grouped Data

Hi all,

I'm trying to find out whether there is a simple way in Mathematica to deal with grouped data. For example, take a table showing the number of children in different families and the frequency of each: 3 families have 0 children, 5 have 1 child, 7 have 2, 6 have 3, 3 have 4 and 1 has 5. I'd write this as a list of {value, frequency} pairs:
`data={{0,3},{1,5},{2,7},{3,6},{4,3},{5,1}}`

If I ask for the mean, it just gives me two means, one for each column. I can't find anything in the help for Mean that will let me treat the list differently. Is there an existing function that can treat the second column as frequencies, or do I need to write functions for this myself?

Miles_Ford






### Re: Statistical Analysis of Grouped Data

Miles,
Thanks for the great question.

In Mathematica, frequency data of this kind is easily obtained from raw data with the Tally function. Tally works for both numeric and non-numeric lists. Such data is frequently used for histograms and for determining bin counts. One drawback of frequency data is that the order of the original data set is lost.
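As a small illustration of how Tally produces this kind of table, here is a sketch using a made-up raw list (the list `raw` is hypothetical and is not the data from the question):

```mathematica
(* Hypothetical raw data: number of children in each of ten families *)
raw = {2, 0, 1, 2, 3, 1, 2, 0, 4, 2};

(* Tally returns {value, count} pairs in order of first appearance *)
Tally[raw]          (* {{2, 4}, {0, 2}, {1, 2}, {3, 1}, {4, 1}} *)

(* Sort orders the pairs by value *)
Sort[Tally[raw]]    (* {{0, 2}, {1, 2}, {2, 4}, {3, 1}, {4, 1}} *)
```

Note that the tallied result no longer records which family had how many children, only how often each value occurred.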

Now to your question. I've created a short Mathematica notebook that outlines a couple of ways to calculate the Mean for such frequency data. http://download.wolfram.com/?key=5QWR11

The best might be to define a function with delayed assignment (:=), like the following:

TallyMean[data_List] :=
Total[Table[data[[n, 1]]*data[[n, 2]], {n, 1, Length[data]}]]/
Total[Table[data[[n, 2]], {n, 1, Length[data]}]]

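And then evaluate a data set using that function:

TallyMean[data]
where data is your frequency data set. Note that underscores are not allowed in Mathematica symbol names, so a name such as data_1 would be read as a pattern; use a name like data1 instead. For the list in your post, this returns 54/25, i.e. 2.16.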
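For comparison, here is a sketch of a few other ways to get the same mean. These are not taken from the linked notebook. The names `values`, `freqs` and `expanded` are just illustrative, and the WeightedData line assumes a Mathematica version recent enough to include that function.

```mathematica
data = {{0, 3}, {1, 5}, {2, 7}, {3, 6}, {4, 3}, {5, 1}};

(* Split the pairs into a list of values and a list of frequencies *)
{values, freqs} = Transpose[data];

(* Weighted mean as a dot product: Sum(value*freq)/Sum(freq) *)
(values . freqs)/Total[freqs]   (* 54/25 *)

(* Let Mean treat the second column as weights;
   assumes WeightedData is available in your version *)
Mean[WeightedData[values, freqs]]

(* Rebuild the raw list by repeating each value freq times,
   so that any ordinary descriptive statistic can be applied *)
expanded = Flatten[ConstantArray @@@ data];
Mean[expanded]     (* 54/25 *)
Median[expanded]   (* 2 *)
```

Expanding the list is the simplest route when the frequencies are small integers, since every built-in statistic then works unchanged. For large counts, the dot-product or weighted form avoids building a long list.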

Hope that this helps,
Craig

Craig_Bauling






