Given the following box plot:

A box plot representing values from 0 to 150 with the first quartile at 0, the median at 20, and the third quartile at 100
  • Think of an example (in words) where the data might fit into the above box plot. In 2-5 sentences, write down the example.
  • What does it mean to have the first and second quartiles so close together, while the second to fourth quartiles are far apart?

Below are the gross earnings of eleven movies, in millions of dollars. Construct an outlier box plot for the data. Include the five-number summary.

Movie Gross earnings in Millions of Dollars
Alone in the Dark 5
Eternal Sunshine 34
Big Fish 66
Collateral 100
Vanilla Sky 101
Last Samurai 111
The Village 114
Break-Up 116
S.W.A.T. 117
DaVinci Code 213
Pirates of the Caribbean (ii) 322

See the table and box plot below. There are two upper outliers, 213 and 322, and one lower outlier, 5.

Min 5
Q1 83
Median 111
Q3 116.5
Max 322
IQR 33.5
1.5*IQR 50.25
This is a box plot adhering to the five-number summary given above, with outliers at 5 and at 322.
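
The summary table and the outlier fences can be checked with a short Python sketch. It assumes `statistics.quantiles` with `method="inclusive"`, a choice made here only because it reproduces the Q1 and Q3 reported in the table; other quartile conventions can give slightly different values.

```python
import statistics

# Gross earnings in millions of dollars, from the table above.
earnings = [5, 34, 66, 100, 101, 111, 114, 116, 117, 213, 322]

# Assumption: the "inclusive" method is used because it matches the
# Q1 and Q3 in the summary table; other methods may differ.
q1, median, q3 = statistics.quantiles(earnings, n=4, method="inclusive")

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values beyond the fences are flagged as outliers.
low_outliers = [x for x in earnings if x < lower_fence]
high_outliers = [x for x in earnings if x > upper_fence]

print("Five-number summary:", min(earnings), q1, median, q3, max(earnings))
print("IQR:", iqr, "Fences:", lower_fence, upper_fence)
print("Low outliers:", low_outliers, "High outliers:", high_outliers)
```

The fences work out to 32.75 and 166.75, so 5 falls below the lower fence and 213 and 322 fall above the upper fence.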

The box plot and descriptive statistics below for United States youth voter turnout (source: US Census Bureau) are from the 2008 presidential election. Based on the data given, answer the following questions:

  1. 62.9% of the Minnesota youth (18-24) voted during the 2008 election. Which quartile contains the Minnesota data?
  2. The US average youth turnout was 48.5%. Which quartile contains the US overall average percent youth voter turnout?
  3. The 6 lowest and 6 highest youth turnout states were: AR (31.0), GA (25.5), IA (63.5), ME (54.7), MN (62.9), NH (57.7), OH (57), OK (41.5), TN (41.4), TX (36.6), UT (30.9), WI (57.5).
    1. Are all of these states outliers? If so, why? If not, are any of them outliers? Be specific.
    2. Are any of the states “far outliers”? If so, state which ones and why you believe they are far outliers.
Category Statistic
No. of observations 41
Minimum 25.5000
Maximum 63.5000
1st Quartile 44.1000
Median 49.9000
3rd Quartile 52.5000
Mean 48.3488
Variance (n-1) 61.6996
Standard deviation (n-1) 7.8549
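
One way to approach the outlier questions is to compute fences from the quartiles in the table. The Python sketch below assumes the usual 1.5*IQR rule for outliers and, as a labeled assumption that the exercise does not state, a 3*IQR rule for "far" outliers. The helper name `outside` is illustrative.

```python
# Quartiles from the descriptive statistics table above.
q1, q3 = 44.1, 52.5
iqr = q3 - q1

# Turnout percentages for the 12 states listed in question 3.
states = {
    "AR": 31.0, "GA": 25.5, "IA": 63.5, "ME": 54.7, "MN": 62.9, "NH": 57.7,
    "OH": 57.0, "OK": 41.5, "TN": 41.4, "TX": 36.6, "UT": 30.9, "WI": 57.5,
}

def outside(value, k):
    """Return True if value lies more than k * IQR beyond a quartile."""
    return value < q1 - k * iqr or value > q3 + k * iqr

outliers = sorted(s for s, v in states.items() if outside(v, 1.5))
# Assumption: "far outlier" means more than 3 * IQR beyond a quartile.
far_outliers = sorted(s for s, v in states.items() if outside(v, 3.0))

print("Outliers:", outliers)
print("Far outliers:", far_outliers)
```

Under these assumptions the fences are about 31.5 and 65.1, so only the states below 31.5 are flagged, and no state lies beyond the 3*IQR fences.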

Santa Clara County, CA, has approximately 27,873 Japanese-Americans. Their ages are as follows. (Source: West magazine)

Age Group Percent of Community
0-17 18.9
18-24 8.0
25-34 22.8
35-44 15.0
45-54 13.1
55-64 11.9
65+ 10.3
  • Construct a histogram of the Japanese-American community in Santa Clara County, CA. The bars will not be the same width for this example. Why not?
  • What percent of the community is under age 35?
  • Which box plot most resembles the information above?
Three box plots with values between 0 and 100.  Plot i has Q1 at 24, M at 34, and Q3 at 53; Plot ii has Q1 at 18, M at 34, and Q3 at 45; Plot iii has Q1 at 24, M at 25, and Q3 at 54.
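
Cumulative percentages help with both the "under age 35" question and the box plot comparison. The following Python sketch accumulates the percentages from the table and reports which age group contains each quartile. The helper name `group_containing` is an illustrative choice, not part of the exercise.

```python
from itertools import accumulate

# Age groups and percent of community, from the table above.
groups = ["0-17", "18-24", "25-34", "35-44", "45-54", "55-64", "65+"]
percents = [18.9, 8.0, 22.8, 15.0, 13.1, 11.9, 10.3]

# Running totals, rounded to one decimal to hide floating-point noise.
cumulative = [round(c, 1) for c in accumulate(percents)]

def group_containing(percentile):
    """Return the first age group whose cumulative percent reaches percentile."""
    for group, cum in zip(groups, cumulative):
        if cum >= percentile:
            return group
    return groups[-1]

# Everyone under 35 is covered by the groups up to and including 25-34.
under_35 = cumulative[groups.index("25-34")]
print("Percent under 35:", under_35)
for p in (25, 50, 75):
    print(f"{p}th percentile falls in age group {group_containing(p)}")
```

Comparing the groups that contain the 25th, 50th, and 75th percentiles with the Q1, M, and Q3 positions of each plot narrows down which box plot fits the table.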

The following summary statistics are for the number of pairs of jeans students in your class own: minimum = 0, Q1 = 4, median = 5, Q3 = 6, and maximum = 8. From this we know that

  1. There are no outliers in the data.
  2. There is at least one low outlier in the data.
  3. There is at least one high outlier in the data.
  4. None of the above.

b (choice 2: there is at least one low outlier in the data).
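
The answer above can be checked against the fences in a few lines of Python. This is a sketch that assumes the standard 1.5*IQR outlier rule.

```python
# Summary statistics from the exercise.
minimum, q1, median, q3, maximum = 0, 4, 5, 6, 8

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this are low outliers
upper_fence = q3 + 1.5 * iqr     # values above this are high outliers

# The minimum and maximum are enough to tell whether any outlier exists.
has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence
print("Low outlier:", has_low_outlier, "High outlier:", has_high_outlier)
```

Because the minimum of 0 lies below the lower fence of 1, at least one value is a low outlier, while the maximum of 8 stays under the upper fence of 9.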

Refer to the following box plots.

Two box plots showing data between 0 and 7.  The Data 1 box plot shows Q1 at 2, M at 4, and Q3 at some unlabeled point greater than 4, while the Data 2 plot shows Q1 at an unlabeled point between 0 and 2, M at 2, and Q3 slightly greater than 2.
  • In complete sentences, explain why each statement is false.
    • Data 1 has more data values above 2 than Data 2 has above 2.
    • The data sets cannot have the same mode.
    • For Data 1, there are more data values below 4 than there are above 4.
  • For which group, Data 1 or Data 2, is the value of “7” more likely to be an outlier? Explain why in complete sentences.

In a recent issue of the IEEE Spectrum, 84 engineering conferences were announced. Four conferences lasted two days. Thirty-six lasted three days. Eighteen lasted four days. Nineteen lasted five days. Four lasted six days. One lasted seven days. One lasted eight days. One lasted nine days. Let X = the length (in days) of an engineering conference.

  • Organize the data in a chart.
  • Find the median, the first quartile, and the third quartile.
  • Find the 65th percentile.
  • Find the 10th percentile.
  • Construct a box plot of the data.
  • The middle 50% of the conferences last from _______ days to _______ days.
  • Calculate the sample mean of days of engineering conferences.
  • Calculate the sample standard deviation of days of engineering conferences.
  • Find the mode.
  • If you were planning an engineering conference, which would you choose as the length of the conference: mean; median; or mode? Explain why you made that choice.
  • Give two reasons why you think that 3 - 5 days seem to be popular lengths of engineering conferences.
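
  • Median: 4; first quartile: 3; third quartile: 5
  • 65th percentile: 4
  • 10th percentile: 3
  • A box plot with a whisker between 2 and 3, a solid line at 3, a dashed line at 4, a solid line at 5, and a whisker between 5 and 9.
  • From 3 days to 5 days
  • Sample mean: 3.94
  • Sample standard deviation: 1.28
  • Mode: 3
  • Length to choose: the mode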
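
The numerical answers above can be reproduced from the frequency counts with Python's `statistics` module. The sketch below assumes the `"inclusive"` quantile method; percentile conventions vary, so other methods may give different values on other data sets.

```python
import statistics

# Frequency table: conference length in days -> number of conferences.
frequencies = {2: 4, 3: 36, 4: 18, 5: 19, 6: 4, 7: 1, 8: 1, 9: 1}

# Expand the table into the 84 individual observations, in sorted order.
lengths = [days for days, count in sorted(frequencies.items())
           for _ in range(count)]

# Assumption: "inclusive" quantile method (percentile conventions vary).
q1, median, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
percentiles = statistics.quantiles(lengths, n=100, method="inclusive")
p65, p10 = percentiles[64], percentiles[9]   # 65th and 10th percentiles

mean = statistics.mean(lengths)
stdev = statistics.stdev(lengths)            # sample standard deviation (n - 1)
mode = statistics.mode(lengths)

print("n:", len(lengths))
print("Q1, median, Q3:", q1, median, q3)
print("65th, 10th percentile:", p65, p10)
print("mean, stdev, mode:", round(mean, 2), round(stdev, 2), mode)
```

Expanding the frequency table into individual observations keeps the code close to the hand calculation, at the cost of some memory for large counts.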

Source:  OpenStax, Collaborative statistics using spreadsheets. OpenStax CNX. Jan 05, 2016 Download for free at http://legacy.cnx.org/content/col11521/1.23