Project Outline

This project outline and background information have been provided to assist you as you complete your project. You should assume the reader of your work has no knowledge of or access to this information.

Philips LED Bulbs - source

Measurements have uncertainty or variation. How can we model that uncertainty and use those models to answer questions? In this project, we’ll fit stochastic models $f(L)$ to the data. Each model gives probability information about the lumen output $L$ of LED bulbs, measured as a percent of the initial lumens.

Task 1: Background and Data

Create an R Markdown file.

Use the following code to read in the light bulb data.

```
# Uncomment and run the line below once in the console to get the devtools package.
# install.packages("devtools")

# Uncomment and run the line below once in the console to get the data4led package.
# devtools::install_github("byuidatascience/data4led")

# Use the code below to load the data4led package to your current R session.
library(data4led)

# Use the code below to load the data for all the bulbs at a time near 2100 hours.
dist <- led_time(2100)
```

This code creates a data frame called “dist”. The dist data frame contains measurements for 202 light bulbs 2104 hours after each light bulb was turned on.

The data frame named “dist” includes the columns (1) “id”, the identification number of the bulb, (2) “hours”, the number of hours since the bulb was turned on, (3) “intensity”, the lumen output of the bulb, and (4) “percent_intensity”, the bulb intensity as a percent of the original lumens.

The stochastic models we will be fitting give the probability information for intensity measured as a percent of the initial brightness.

Use the hist() command in R to create a density histogram for the light bulb intensity as a percent of initial lumens. Use the dist data frame.
Organize your work into a cohesive analysis.
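
As a starting point for the histogram step, here is a minimal sketch. It assumes the `dist` data frame described above and its `percent_intensity` column. So that the snippet also runs where the data4led package is not installed, it falls back to simulated placeholder values; those placeholders are an assumption made for illustration and are not real bulb measurements. The variable names `intensity_pct`, `h_plot`, and `total_area` are likewise introduced only for this example.

```
# Use the real data when data4led is installed; otherwise fall back to
# simulated placeholder values (NOT real measurements) so the sketch still runs.
if (requireNamespace("data4led", quietly = TRUE)) {
  dist <- data4led::led_time(2100)
  intensity_pct <- dist$percent_intensity
} else {
  set.seed(1)
  intensity_pct <- rnorm(202, mean = 101, sd = 1)  # placeholder only
}

# freq = FALSE puts density (not counts) on the vertical axis,
# so the bar areas add up to 1.
h_plot <- hist(intensity_pct, freq = FALSE,
               main = "LED intensity near 2100 hours",
               xlab = "Intensity (% of initial lumens)")

# Sanity check: total bar area of a density histogram.
total_area <- sum(h_plot$density * diff(h_plot$breaks))
```

A density histogram (rather than a count histogram) is what lets a probability density curve be drawn on the same vertical scale later in the project.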

Task 2: Models and Parameter Exploration

Create a new R Markdown file.
Consider the following general models:
- $f_0(L; a,b) = \frac{1}{b-a}$ with $-\infty < a < L < b < \infty$ (and 0 otherwise)
- $f_1(L; h,a) = \frac{1}{\sqrt{2\pi a}}e^{-\frac{(L-h)^2}{2a}}$ with $h > 0$, $a > 0$, and $-\infty < L < \infty$
- $f_2(L; h,a,b) = \frac{b^a}{\Gamma(a)}(L-h)^{a-1}e^{-b(L-h)}$ with $a > 0$, $b > 0$, and $L \geq h$ (and 0 otherwise)
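
One way to check your reading of the three formulas is to code them directly and compare them with R's built-in densities. The sketch below is an illustration under stated assumptions: the helper names `f0`, `f1`, and `f2` simply mirror the notation above, and every parameter value is arbitrary rather than fitted. Note that in $f_1$ the parameter $a$ plays the role of a variance, so the matching `dnorm()` call uses `sd = sqrt(a)`, and $f_2$ is a gamma density shifted right by $h$ with shape $a$ and rate $b$.

```
# Direct translations of the three model formulas.
f0 <- function(L, a, b) ifelse(L > a & L < b, 1 / (b - a), 0)

f1 <- function(L, h, a) exp(-(L - h)^2 / (2 * a)) / sqrt(2 * pi * a)

f2 <- function(L, h, a, b) {
  x <- pmax(L - h, 0)  # clamp so the power is defined when L < h
  dens <- b^a / gamma(a) * x^(a - 1) * exp(-b * x)
  ifelse(L >= h, dens, 0)
}

# Grid chosen to avoid landing exactly on the uniform endpoints.
L <- seq(96.1, 103.9, by = 0.2)

# Arbitrary illustrative parameter values (assumptions, not fitted values).
same_f0 <- all.equal(f0(L, a = 98, b = 103), dunif(L, min = 98, max = 103))
same_f1 <- all.equal(f1(L, h = 101, a = 0.8),
                     dnorm(L, mean = 101, sd = sqrt(0.8)))
same_f2 <- all.equal(f2(L, h = 97, a = 16, b = 4),
                     dgamma(L - 97, shape = 16, rate = 4))
```

If each comparison returns `TRUE`, the built-in functions `dunif()`, `dnorm()`, and `dgamma()` can stand in for the hand-written formulas in later tasks.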
For each function, use the Desmos files below with sliders to dynamically explore how changing the parameter changes the behavior of the function.
For each function, observe how changing the parameters changes the behavior of the function (or model). Try to summarize your observations in terms of transformations of functions (shifts, reflections, stretches) and the mathematical behavior of the function (increasing, decreasing, constant, positive, negative, nonnegative). Identify interesting parameter values (or ranges of values) where the behavior of the function is different.
Select one of the functions to describe. In your narrative, summarize your observations about the parameters in terms of transformations of functions (shifts, reflections, stretches) and the mathematical behavior of the function (increasing, decreasing, constant, positive, negative, nonnegative). Use plot() commands in R to plot several representative curves illustrating what you learned in your parameter exploration. Use par(mfrow = c(rows, cols)) to organize your plots into one figure.
Organize your work into a cohesive analysis and submit it to Canvas.
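
The multi-panel figure described in this task could be laid out roughly as follows. This is a sketch using $f_1$ as the chosen function; the helper `f1`, the grid `L`, and all parameter values are assumptions picked for illustration, and your own exploration may call for different panels.

```
# f1 written out from its formula; the parameter a acts as a variance.
f1 <- function(L, h, a) exp(-(L - h)^2 / (2 * a)) / sqrt(2 * pi * a)
L <- seq(95, 105, length.out = 400)

old_par <- par(mfrow = c(1, 2))  # 1 row, 2 columns of panels

# Panel 1: vary h with a fixed (the curve shifts left or right).
plot(L, f1(L, h = 99, a = 1), type = "l", ylim = c(0, 1),
     xlab = "L", ylab = "f1(L)", main = "Varying h (a = 1)")
lines(L, f1(L, h = 101, a = 1), lty = 2)
lines(L, f1(L, h = 103, a = 1), lty = 3)

# Panel 2: vary a with h fixed (the curve gets taller and narrower as a shrinks).
plot(L, f1(L, h = 101, a = 0.25), type = "l", ylim = c(0, 1),
     xlab = "L", ylab = "f1(L)", main = "Varying a (h = 101)")
lines(L, f1(L, h = 101, a = 1), lty = 2)
lines(L, f1(L, h = 101, a = 4), lty = 3)

par(old_par)  # restore the previous layout
```

Keeping the same `ylim` in every panel makes the change in peak height visible when the panels are compared side by side.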

Task 3: Fit the Models (“Visual” Method)

Create a new R Markdown file.
For each model, $f_i(L)$, pick parameter values to find a visual fit of the model to the data.
Use the hist() and lines() commands in R to plot each fitted function together with the density histogram of the data. You should have 3 plots, each with one fitted function and the data. Make sure the viewing window for each plot is xlim = c(95,105) and ylim = c(0,1).
Organize your work into a cohesive analysis and submit it to Canvas.
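
The overlay of one model on the density histogram might look like the sketch below, shown for $f_1$ only. Two assumptions are worth flagging. First, the snippet falls back to simulated placeholder values (not real measurements) when the data4led package is unavailable. Second, it uses the sample mean and variance as a starting guess for $h$ and $a$; that is a swapped-in shortcut, not the visual method itself, and the values should still be adjusted by eye. The names `intensity_pct`, `h_start`, and `a_start` are introduced only for this example.

```
# Real data if data4led is installed; otherwise simulated placeholder values.
if (requireNamespace("data4led", quietly = TRUE)) {
  dist <- data4led::led_time(2100)
  intensity_pct <- dist$percent_intensity
} else {
  set.seed(1)
  intensity_pct <- rnorm(202, mean = 101, sd = 1)  # placeholder only
}

# Starting guess for f1's parameters (to be refined by eye).
h_start <- mean(intensity_pct, na.rm = TRUE)
a_start <- var(intensity_pct, na.rm = TRUE)

# Density histogram in the required viewing window.
hist(intensity_pct, freq = FALSE, xlim = c(95, 105), ylim = c(0, 1),
     main = "f1 over the data", xlab = "Intensity (% of initial lumens)")

# Draw the candidate curve on top; f1 with variance a is dnorm with sd = sqrt(a).
L <- seq(95, 105, length.out = 400)
lines(L, dnorm(L, mean = h_start, sd = sqrt(a_start)), lwd = 2)
```

The same pattern applies to $f_0$ and $f_2$ by swapping the `lines()` call for one built on `dunif()` or a shifted `dgamma()`.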

Task 4: Use the Fitted Models to Answer Questions

Create a new R Markdown file.
Use the runif(), rnorm(), and rgamma() commands in R with the parameter values you found when you fit the distribution to the data to create a sample of 5000 random measurements using each fitted model.
Use your simulated sample to approximate how many light bulbs out of 5000 will have intensity between 100.2533% and 102.6927% of their initial intensity after 2104 hours.
Calculate the probability that a light bulb will have an intensity between 100.2533% and 102.6927% of its initial intensity after 2104 hours.
Describe in 4-6 sentences how the information you get from the data depends on the general model you assume. Why is this an important concept to understand when working with models and data?
Are any of your fitted models inconsistent with the story the data tells in the density histogram of the light bulb intensities?
- If a fitted model is inconsistent with what we know about a situation, it is suspect and suggests further work should be done before trusting that model or it should not be used as a model in that situation.
Organize your work into a cohesive analysis and submit it to Canvas.
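
The simulation and probability steps in this task can be sketched as follows. All parameter values below (`u_a`, `u_b`, `n_h`, `n_a`, `g_h`, `g_a`, `g_b`) are hypothetical stand-ins for whatever you found in Task 3; only the interval endpoints and the sample size come from the task description. The sketch also assumes $f_2$ is sampled by shifting `rgamma()` draws right by $h$.

```
set.seed(2024)  # makes the simulated samples reproducible
n  <- 5000
lo <- 100.2533
hi <- 102.6927

# Hypothetical fitted parameters (replace with your own values).
u_a <- 98;  u_b <- 104            # f0: endpoints a and b
n_h <- 101; n_a <- 1              # f1: center h and variance a
g_h <- 97;  g_a <- 16; g_b <- 4   # f2: shift h, shape a, rate b

# Simulated samples from each fitted model.
sim_unif  <- runif(n, min = u_a, max = u_b)
sim_norm  <- rnorm(n, mean = n_h, sd = sqrt(n_a))
sim_gamma <- g_h + rgamma(n, shape = g_a, rate = g_b)

# Approximate counts: how many simulated bulbs fall in the interval.
count_in <- function(x) sum(x > lo & x < hi)
counts <- c(uniform = count_in(sim_unif),
            normal  = count_in(sim_norm),
            gamma   = count_in(sim_gamma))

# Probabilities from the cumulative distribution functions.
probs <- c(
  uniform = punif(hi, u_a, u_b) - punif(lo, u_a, u_b),
  normal  = pnorm(hi, n_h, sqrt(n_a)) - pnorm(lo, n_h, sqrt(n_a)),
  gamma   = pgamma(hi - g_h, shape = g_a, rate = g_b) -
            pgamma(lo - g_h, shape = g_a, rate = g_b)
)
```

Comparing `counts / n` with `probs` shows how closely each simulation tracks the probability computed from the same model, and comparing across the three models shows how much the answer depends on the model assumed.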

Project: Bringing it All Together

Create a new R Markdown file.
Use this code to read in the data for when the bulbs have been on for approximately 1 day (24 hours).
```
dist1 <- led_time(24)
```
- Visually fit $f_1(L)$ and $f_2(L)$ to this data.
- Calculate the probability that a light bulb will have an intensity between 99.31491% and 101.3705% of its initial intensity after approximately a day.
Use this code to read in the data for when the bulbs have been on for approximately 1 month (720 hours).
```
dist2 <- led_time(720)
```
- Visually fit $f_1(L)$ and $f_2(L)$ to this data.
- Calculate the probability that a light bulb will have an intensity between 99.95945% and 102.1952% of its initial intensity after approximately a month.
Use this code to read in the data for when the bulbs have been on for approximately 6 months (4320 hours).
```
dist4 <- led_time(4320)
```
- Visually fit $f_1(L)$ and $f_2(L)$ to this data.
- Calculate the probability that a light bulb will have an intensity between 100.0117% and 102.499% of its initial intensity after approximately 6 months.
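
Each probability requested above is the area under the fitted density between the two endpoints. The sketch below computes that area by numerical integration with `integrate()` and cross-checks it against the cumulative distribution functions. It uses the one-day interval from this section; the parameter values and the helper names `f1`, `f2`, `p1_area`, and `p2_area` are hypothetical and only illustrate the pattern.

```
# Model formulas written out directly.
f1 <- function(L, h, a) exp(-(L - h)^2 / (2 * a)) / sqrt(2 * pi * a)
f2 <- function(L, h, a, b) {
  x <- pmax(L - h, 0)  # clamp so the power is defined when L < h
  ifelse(L >= h, b^a / gamma(a) * x^(a - 1) * exp(-b * x), 0)
}

lo <- 99.31491
hi <- 101.3705

# Hypothetical fitted parameters (replace with your own values).
p1_area <- integrate(function(L) f1(L, h = 100.3, a = 0.5), lo, hi)$value
p2_area <- integrate(function(L) f2(L, h = 97, a = 16, b = 4), lo, hi)$value

# Cross-check with the built-in cumulative distribution functions.
p1_cdf <- pnorm(hi, 100.3, sqrt(0.5)) - pnorm(lo, 100.3, sqrt(0.5))
p2_cdf <- pgamma(hi - 97, shape = 16, rate = 4) -
          pgamma(lo - 97, shape = 16, rate = 4)
```

The two routes should agree to within the numerical tolerance of `integrate()`; a noticeable mismatch usually points to a parameter being passed in the wrong role, such as a variance where a standard deviation is expected.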
Describe in 2-4 sentences how a deterministic model differs from a stochastic model.
Organize your work from Tasks 1-5 into a cohesive analysis and submit it to Canvas. Use only the important and relevant plots, calculations, and information.
- Begin with background and an introduction to the question(s) you will be answering with the light bulb data.
Reflect on your work for this project. At the bottom of your report include the following in a brief (1-2 paragraph) reflection.
- Identify/explain 2-3 key mathematical ideas you learned (and would like to remember).
- Identify/explain 1-3 soft skills you needed/improved/learned while working on the project.
  - List of some Soft Skills
    - Dedication
    - Following Directions
    - Motivation
    - Self-directed
    - Organization
    - Planning
    - Time Management
    - Willing to Accept Feedback
    - Perseverance
    - Good attitude
    - Meets deadlines
    - Willingness to learn

Project – Class Example