Background

The Biggest Loser is an American reality TV show that debuted on NBC on October 19, 2004. The show features obese contestants competing to win a $250,000 cash prize by losing the highest percentage of weight relative to their initial weight. Each season of The Biggest Loser starts with a weigh-in to determine the starting weights of the contestants. The starting weight for each contestant serves as the baseline for determining the overall winner.
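
The "percentage of weight relative to initial weight" measure can be made concrete with a small calculation. The sketch below is illustrative only: the two contestants and their weights are hypothetical (not taken from the data file), and the function name percent_lost is an assumption introduced here. It shows that the contestant who loses the most pounds is not necessarily the one who loses the largest percentage.

```python
def percent_lost(start_weight, current_weight):
    """Percentage of the starting weight that has been lost."""
    return 100 * (start_weight - current_weight) / start_weight

# Hypothetical contestants (NOT from the dataset): (name, starting lb, finale lb)
contestants = [("A", 400, 280), ("B", 250, 160)]

for name, start, end in contestants:
    print(name, "pounds lost:", start - end,
          "percent lost:", round(percent_lost(start, end), 1))

# Rank the same contestants under the two competing measures.
most_pounds = max(contestants, key=lambda c: c[1] - c[2])[0]
most_percent = max(contestants, key=lambda c: percent_lost(c[1], c[2]))[0]
print("Most pounds lost:", most_pounds)
print("Highest percentage lost:", most_percent)
```

In this made-up example the heavier contestant loses more pounds, while the lighter contestant loses a larger share of the starting weight, so the two measures crown different winners.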

The number of contestants on the show varies from one season to another, ranging from 12 contestants in the pilot season to as many as 25 contestants in season 11.

The contestants are usually grouped into teams. Depending on the season, a team may have worked with a specific personal trainer, or all trainers may have worked with all contestants. The personal trainers are responsible (in conjunction with medical personnel retained by the show) for designing comprehensive workout and nutrition plans and teaching them to the contestants. The contestants are individually responsible for implementing the principles taught.

The competition rules vary from season to season. Typically, the team that tallies the lowest overall percentage of weight lost in a week is deemed the losing team for that week. The losing team must vote to eliminate one of its members from the competition. Usually, at some point in the competition, the teams are dissolved and everyone competes individually. Each week, the contestant with the lowest percentage of weight lost at the weigh-in is eliminated.

The season finale features both the contestants remaining on the show and those sent home early; the latter are brought back for the finale. Those sent home early compete for a smaller “at-home” prize of $100,000, while those still on the show compete for a $250,000 prize and the title of “The Biggest Loser.” (This means all contestants work toward losing weight for the duration of the show, even if they are eliminated earlier in the competition.)

The lengths of the weight loss competitions have varied, but a typical length has been six or seven months from the initial weigh-in to the televised finale.

Your Task

Your task is to use the data in the Biggest Loser data file (data compiled by Mary Richardson and Daniel Adrian) to answer the following research questions:

  1. In season 1, the winner of the show was determined by the total number of pounds lost. In season 2, the winner was determined by the percentage of initial weight lost. Which do you believe is the better measure? Why?
    1. What other measure might be considered to crown a Biggest Loser?
  2. In terms of the variables in the dataset, what type of contestant is most likely to win a Biggest Loser competition? (i.e. who will have lost the largest proportion of their initial weight by the end of the competition). Possible approaches include:
    1. Consider calculating a summary statistic for the “percentage of weight lost at the finale” for different subgroups of people in order to make a comparison.
    2. Consider making side-by-side box plots or multiple histograms of “percentage of weight lost at the finale” to enable comparisons of different subgroups of people.
    3. Consider comparing those who won to those who lost on various attributes.
    4. Be creative.
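
One way to approach the subgroup comparison suggested above is sketched below, using only the Python standard library. The rows are hypothetical placeholders (the real values come from the data file), the helper name summarize_by is an assumption made for this illustration, and the median is just one possible choice of summary statistic.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows for illustration only; column names follow the Variable Summary.
rows = [
    {"Gender": "Male", "Percent2": 35.0},
    {"Gender": "Male", "Percent2": 41.0},
    {"Gender": "Male", "Percent2": 29.0},
    {"Gender": "Female", "Percent2": 30.0},
    {"Gender": "Female", "Percent2": 38.0},
]

def summarize_by(rows, group_col, value_col):
    """Return the group size and median of value_col for each level of group_col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {g: {"n": len(v), "median": median(v)} for g, v in groups.items()}

summary = summarize_by(rows, "Gender", "Percent2")
for group, stats in summary.items():
    print(group, stats)
```

The same helper could be called with "Age_Group" or "Season" as the grouping column, and reporting the group size alongside the statistic helps flag subgroups that are too small to compare reliably.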

Provide a paragraph to summarize your findings/conclusion. Include 2 plots and 2 summary statistics to support your conclusion.

Extra/Bonus Question 3. In terms of the variables in the dataset, what type of trainer is most successful? (e.g. who gets better results: men trainers vs. women trainers, solo vs. team trainers, trainers with lots of experience with contestants vs. relatively little experience, etc.)

Variable Summary

  • Gender: Male, Female
  • Age_Group: < 30, 30 to 39, 40 to 49, 50 to 59, 60 Plus
  • Age: The contestant’s age in years during the show
  • Weight1: The contestant’s starting weight in pounds
  • Weight2: The contestant’s weight at the first weigh-in, usually after the first week
  • Height: The contestant’s height in inches
  • Season: The season of the show the contestant was on
  • Trainer: The trainer of the contestant at the time of the first weigh-in
  • Percent1: The percentage of weight lost at the first weigh-in
  • Percent2: The percentage of weight lost at the finale

Note: The dataset may not contain the data exactly the way you need it. Feel free to add columns to the dataset so that it contains the data coded the way you would like it.
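
As one possible illustration of adding a column, the sketch below derives a starting body mass index from Weight1 (pounds) and Height (inches) using the standard imperial formula, BMI = 703 × weight / height². The rows are hypothetical, and the column name BMI1 and the function name bmi are assumptions introduced here, not part of the data file.

```python
def bmi(weight_lb, height_in):
    """Body mass index from weight in pounds and height in inches."""
    return 703 * weight_lb / height_in ** 2

# Hypothetical rows for illustration only; column names follow the Variable Summary.
rows = [
    {"Weight1": 300, "Height": 70},
    {"Weight1": 240, "Height": 64},
]

# Add a new column (hypothetical name "BMI1") holding the starting BMI.
for row in rows:
    row["BMI1"] = round(bmi(row["Weight1"], row["Height"]), 1)

for row in rows:
    print(row)
```

Note that this standard formula uses only weight and height; any adjustment for gender or age would be a separate modeling choice.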

There have been a few contestants who, for various reasons, were not present at the finale. These contestants’ variable values were not included as part of the data set. All contestants to date have participated in the initial weigh-in.

Learning Targets/Skills Practice

  • Use graphical and numerical summaries of datasets to compare sub-populations
  • Decide which numerical summary is most appropriate
  • Practice describing both quantitative and categorical data (and deciding how to treat each variable in a dataset)
  • Consider the appropriateness of a response variable, and realize there is often more than one measure that can help us answer a research question (i.e. who lost the most weight)
  • Potential to think in multi-variate terms by combining variables (older men) instead of simply in univariate terms (variable of age and a separate distribution for gender)
  • Potential to create new variables by combining other variables (i.e. create a BMI index using weight, height, gender and age)
  • Potential to think in different data structures (i.e. data nested within seasons)
  • Should consider context outside of what is contained in the dataset. May need to collect additional facts/knowledge to help understand data in the dataset (i.e. gender of trainers)
  • Potential to practice recoding data so it is in a more useful format
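
The recoding skill in the last target can be sketched with the Age and Age_Group variables from the Variable Summary. The function below maps an age in years to the Age_Group labels listed there; the function name age_group and the sample ages are assumptions made for illustration, and the data file may have been coded differently at the boundaries.

```python
def age_group(age):
    """Recode an age in years into the Age_Group labels from the Variable Summary."""
    if age < 30:
        return "< 30"
    if age < 40:
        return "30 to 39"
    if age < 50:
        return "40 to 49"
    if age < 60:
        return "50 to 59"
    return "60 Plus"

# Hypothetical ages, one per category, for illustration only.
ages = [24, 30, 45, 59, 63]
groups = [age_group(a) for a in ages]
print(list(zip(ages, groups)))
```

The same pattern applies to other recodes, such as collapsing age groups into broader categories or flagging whether a contestant worked with a particular trainer.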