Durability Testing of Stents Using Sensitivity-Based Methods in R

The current industry protocol for durability testing of vascular stents and frames involves testing many implants simultaneously at a range of different stimulus magnitudes (typically strain or stress). The test levels are spread out like a grid across the stimulus range of interest. Each implant is tested to failure or run-out at its specified level and a model is fit to the data using methods similar to those described in Meeker and Escobar.1 An example of a completed durability test of this type is shown below.2 Of interest to the designer is the distribution of stimulus levels that would be expected to cause a failure at a specified cycle count like 400 million.

This methodology works well if you have a large budget and is accepted by FDA. However, blanketing the parameter space with enough samples to fit a robust model can be very costly and requires enough lab equipment to do many tests in parallel. I recently learned that there is another way that has the potential to significantly reduce test burden and allows for some unique advantages. I will attempt to describe it below.

Sensitivity testing represents a different paradigm in which samples are tested sequentially rather than in parallel. In such testing, the outcome of the first test informs the stimulus level of the second test and so on. When an implant fails at a given stimulus, the next part is tested at a lower level. When it passes, the stimulus on the next is increased. In this way, the latent distribution of part strength is converged upon with each additional test providing improved precision.

The key questions that must be answered to execute a sequential sensitivity test:

  • “How to specify the test levels for subsequent runs in a principled and efficient way?”
  • “How to estimate the underlying failure strength distribution when the threshold for failure is not directly assessed on each run?”

Neyer was the first to present a D-Optimal answer to the first question.3 Implementation of the algorithm for determining sequential levels that minimize the test burden and converge on the true underlying strength distribution quickly is not trivial and historically has remained unavailable outside of pricey commercial software.4 Fortunately, generous and diligent researchers have recently created an open source tool in R called “gonogo” which can execute sequential testing according to the D-optimal Neyer algorithm (or its improved direct descendants).5

In the remainder of this post I will show how to use gonogo to determine the underlying strength distribution of hypothetical vascular implant and visualize uncertainty about the estimate.

Load Libraries

library(tidyverse)
library(gt)
library(plotly)
library(ggrepel)
library(here)
library(gonogo)
library(ggalt)

Accessing gonogo Functions

The functions in gonogo can be accessed in 2 ways:

  • Install and load the gonogo package using the instructions listed HERE
  • Download and source the gonogo script from Jeff Wu’s site HERE

I will be using the code sourced from Jeff’s site as the functions are more readily editable.

source(here("gonogoRiley.R"))

Simulating A Sequential Sensitivity Experiment

There are some nice built-in simulation features in gonogo using the gonogoSim function. For our walkthrough though we’ll simulate the data manually to have better visibility of what’s happening under the hood.

Suppose we have a part budget of n=25 stents from manufacturing and our goal is to estimate the underlying distribution of stress amplitude that causes fatigue fracture. First, let’s simulate the failure strength of the 25 parts using a known distribution (which is only available in the simulation). We’ll assume failure stress amplitude for these parts is normally distributed with a mean of 600 MPa and standard deviation of 13 MPa.

Simulate the Strength of Each Part

set.seed(805)
stent_strength_tbl <- tibble(part_strength_mpa = rnorm(n = 25, mean = 600, sd = 13))

stent_strength_tbl %>%
  gt_preview()
part_strength_mpa
1 602.0110
2 585.2761
3 603.3943
4 569.8526
5 601.5328
6..24
25 600.9657

See Test Levels and Input Test Results

The gonogo() function will initiate the test plan for sequential testing that prescribes test levels based on the outcome of each run. This operation happens in the console as shown in the screenshot below.

We’ll use the Neyer option by setting the “test” argument equal to 2. One key feature of sequential testing is that high and low initial guesses for the mean must be provided at the onset of testing along with an initial guess for the standard deviation. These guesses help set the scale. We’ll provide vague guesses that would be reasonable given our domain knowledge of cobalt-based alloys and their fatigue properties.

If we were really running the test on physical parts we’d want to set up a test at the prescribed stimulus level, then report the result to the console by inputing 2 numbers: the level you actually ran the test at (should be the same as what was recommended) and the outcome (1 for failure or 0 for survival) separated by a single space. After the results from the first test are entered, the recommended level of the next test is presented. When all runs are complete, the results are stored in the assigned object.

# z <- gonogo(mlo = 500, mhi = 700, sg = 10, test = 2, reso = 1) #input results in console

The resulting object z is a list that has a lot of information. The following code will pull the run history and clean it up nicely with tools and formatting from gt package.

z$d0 %>%
  gt() %>%
  cols_align("center") %>%
  opt_row_striping(row_striping = TRUE) %>%
  tab_header(
    title = "Run History of Sensitivity Test of n=25 Stents",
    subtitle = "X and EX are Stimulus Levels, Y = Outcome (1 = part failed, 0 = part survived)"
  ) %>%
  cols_width(everything() ~ px(100)) %>%
  tab_style(
    style = list(
      cell_text(weight = "bold")
    ),
    locations = cells_column_labels(columns = everything())
  )
Run History of Sensitivity Test of n=25 Stents
X and EX are Stimulus Levels, Y = Outcome (1 = part failed, 0 = part survived)
X Y COUNT RX EX TX ID
600 0 1 600 600.0000 600 B0
650 1 1 650 650.0000 650 B1
625 1 1 625 625.0000 625 B3
612 1 1 612 612.5000 612 B3
606 1 1 606 606.0000 606 B3
590 0 1 590 589.7220 590 B4
593 0 1 593 592.5733 593 B4
611 0 1 611 610.7146 611 B4
617 1 1 617 616.8731 617 II1
599 1 1 599 599.4648 599 II2
590 0 1 590 590.4793 590 II2
616 1 1 616 615.7384 616 II2
593 1 1 593 592.9877 593 II2
584 0 1 584 583.6982 584 II2
585 0 1 585 585.3147 585 II2
615 1 1 615 615.2118 615 II2
587 0 1 587 587.4391 587 II2
613 1 1 613 613.3351 613 II2
612 1 1 612 612.3336 612 II2
589 0 1 589 589.3771 589 II2
611 1 1 611 611.0667 611 II2
590 0 1 590 590.4844 590 II2
610 1 1 610 610.0713 610 II2
591 0 1 591 591.3679 591 II2
609 1 1 609 609.2816 609 II2

Gonogo has some really convenient built-in visualization tools. The run history can be plotted using gonogo’s ptest function with argument plt=1.

ptest(z, plt = 1)

## NULL

The values we really care about are the estimates of the mean and sd of the latent failure strength distribution. Numerical MLE’s for these parameters can be quickly extracted as follows:

z$musig
## [1] 600.383131   7.894673

We see that these are quite close to the true parameters of mean = 600 MPa, sd = 13 MPa! That really impressive for only 25 parts and vague initial guesses.

I like to occasionally look under the hood and see how the outputs are generated. I don’t always understand it but it helps me learn. In this case, the calculations to extract an estimate are pretty simple. gonogo is using a glm with probit link function. The results are replicated here with manual calculations.

xglm <- glm(Y ~ X, family = binomial(link = probit), data = z$d0)

ab <- as.vector(xglm$coef)
muhat <- -ab[1] / ab[2]
sighat <- 1 / ab[2]

tibble(
  true_parameter = c(600, 13),
  gonogo_estimate = c(muhat, sighat)
) %>%
  mutate(gonogo_estimate = gonogo_estimate %>% round(1)) %>%
  gt()
true_parameter gonogo_estimate
600 600.4
13 7.9

Confidence, Reliability, and Visualization

We just identified the MLE’s for the latent strength distribution. But anytime we’re working with medical devices we have to be conservative and respect the uncertainty in our methods. We don’t want to use the mean of the distribution to describe a design. Instead, we identify a lower quantile of interest and then put a confidence interval on that to be extra conservative.6 gonogo has some great tools and functions for helping us achieve these goals. I prefer to use the tidyverse framework for my data analysis and visualization so I combine the built in functions from gonogo with some helper functions that I wrote to keep things tidy. I present them here with some description of what they do but not much explanation of how they work. Sorry.

I will note however that gonogo provides options for obtaining multiple forms of confidence intervals: Fisher Matrix, GLM, and Likelihood-Based. For additional information on how it calculates each type, refer to THIS DOCUMENTATION. In the tests I ran, GLM and LR based CIs performed almost the same; FM was different. I use GLM below for my purposes.

In the following work, I assume we care about the 95% confidence bound on the .10 quantile. gonogo calculates all intervals as 2-sided, so to get the 95% confidence bound (1-sided) we request the 90% 2-sided limits.

Extract a tibble of 2-sided, pointwise confidence limits about all quantiles

This first function takes the output from the gonogo function and calculates 2-sided .90 confidence bounds on the various quantiles. The lims() function is from gonogo.

# helper function

conf_limits_fcn <- function(tbl, conf = .9) {
  conf_limits_tbl_full <- lims(ctyp = 2, dat = tbl, conf = conf, P = seq(from = .01, to = .99, by = .01)) %>%
    as_tibble() %>%
    rename(
      ciq_lower = V1,
      estimate_q = V2,
      ciq_upper = V3,
      ip_lower = V4,
      estimate_p = V5,
      cip_upper = V6
    )
}

rileysim_conf_tbl <- conf_limits_fcn(z$d0)

rileysim_conf_tbl %>%
  gt_preview()
ciq_lower estimate_q ciq_upper ip_lower estimate_p cip_upper
1 566.0095 582.0174 590.1744 0.000057 0.01 0.127107
2 569.7538 584.1695 591.6789 0.000283 0.02 0.165152
3 572.1145 585.5349 592.6478 0.000714 0.03 0.193273
4 573.8819 586.5620 593.3851 0.001369 0.04 0.216580
5 575.3128 587.3975 593.9914 0.002260 0.05 0.236954
6..98
99 610.9205 618.7489 633.9138 0.884141 0.99 0.999923

Extract a point estimate for the lower 2-sided confidence limit for the .10 quantile.

We present results to FDA using the terminology “confidence and reliability”. By extracting the lower end of the 2-sided 90% confidence band on the .10 quantile, we obtain an estimate of the 95/90 tolerance bound i.e. a bound above which we would expect 90% or more of the population to lie, with 95% confidence. This would be a conservative estimate of the endurance limit for this stent design.

lower_conf_fcn <- function(tbl, percentile) {
  lower_scalar <- tbl %>%
    mutate(qprobs = qnorm(estimate_p)) %>%
    arrange(qprobs) %>%
    filter(estimate_p == percentile) %>%
    mutate(ciq_lower = ciq_lower %>% round(1)) %>%
    pluck(1)

  lower_qnorm <- tbl %>%
    mutate(qprobs = qnorm(estimate_p)) %>%
    arrange(qprobs) %>%
    filter(estimate_p == percentile) %>%
    pluck(7)

  lower_tbl <- tibble(
    estimate_q = lower_scalar,
    estimate_p = percentile,
    qprobs = lower_qnorm
  )

  return(lower_tbl)
}

lower_conf_fcn(rileysim_conf_tbl, percentile = .1) %>% pluck(1)
## [1] 580.2

Plot CDF

Now some plotting functions to help visualize things. First, we use our confidence limit dataframe from above to construct a cdf of failure strength. With some clever use of geom_label_repel() and str_glue() we can embed our data right onto the plot for easy communication. The estimated endurance limit from our test sequence is 580 MPa.

plot_cdf_fcn <- function(tbl, percentile = .1, units = "MPa") {
  tbl %>%
    ggplot(aes(x = ciq_lower, y = estimate_p)) +
    geom_line(size = 1.3) +
    geom_line(aes(x = ciq_upper, y = estimate_p), size = 1.3) +
    geom_line(aes(x = estimate_q, y = estimate_p), linetype = 2) +
    geom_ribbon(aes(xmin = ciq_lower, xmax = ciq_upper), alpha = .1, fill = "purple") +
    geom_point(data = lower_conf_fcn(tbl, percentile = percentile), aes(estimate_q, estimate_p), color = "purple", size = 2) +
    geom_label_repel(
      data = lower_conf_fcn(tbl, percentile = percentile),
      aes(estimate_q, estimate_p),
      label = str_glue("1-sided, 95% Confidence Bound:\n{lower_conf_fcn(rileysim_conf_tbl, percentile = .1) %>% pluck(1)} {units}"),
      fill = "#8bd646ff",
      color = "black",
      segment.color = "black",
      segment.size = .5,
      nudge_y = .13,
      nudge_x = -25
    ) +
    geom_segment(data = tbl %>%
      filter(estimate_p == percentile), aes(x = ciq_lower, xend = ciq_upper, y = estimate_p, yend = estimate_p), linetype = 3) +
    theme_bw() +
    labs(
      x = "Stress (MPa)",
      y = "Percentile",
      caption = "Dotted line is estimate of failure strength distribution\nConfidence bands are 2-sided 90% (i.e. 1-sided 95 for lower)",
      title = "Sensitivity Testing of Stent Durability",
      subtitle = str_glue("placeholder")
    )
}

a <- plot_cdf_fcn(tbl = rileysim_conf_tbl, percentile = .1)

a +
  labs(subtitle = str_glue("Estimates: Mean = {muhat %>% round(1)} MPa, Sigma = {sighat %>% round(1)} MPa"))

Normal Probability Plot

Engineers in my industry have an unhealthy obsession to normal probability plots. Personally I don’t like them, but Minitab presents them by default and engineers in my industry will look for data in this format. As part of this project I taught myself how to convert distributional data from a CFD into a normal probability plot. I show the helper function here and the result.

norm_probability_plt_fcn <- function(tbl, percentile, units = "MPa") {
  conf_tbl <- tbl %>%
    mutate(qprobs = qnorm(estimate_p)) %>%
    arrange(qprobs)

  probs_of_interest <- c(0.01, 0.05, seq(0.1, 0.9, by = 0.1), 0.95, 0.99)

  probs_of_interest_tbl <- conf_tbl %>%
    filter(estimate_p %in% probs_of_interest)

  conf_tbl %>%
    ggplot() +
    geom_line(aes(x = ciq_lower, y = qprobs), size = 1.3) +
    geom_line(aes(x = ciq_upper, y = qprobs), size = 1.3) +
    geom_line(aes(x = estimate_q, y = qprobs), linetype = 2) +
    geom_ribbon(aes(xmin = ciq_lower, xmax = ciq_upper, y = qprobs), alpha = .1, fill = "purple") +
    geom_point(data = lower_conf_fcn(tbl, percentile = percentile), aes(estimate_q, qprobs), color = "purple", size = 2) +
    geom_label_repel(
      data = lower_conf_fcn(tbl, percentile = percentile),
      aes(estimate_q, qprobs),
      label = str_glue("1-sided, 95% Confidence Bound:\n{lower_conf_fcn(rileysim_conf_tbl, percentile = .1) %>% pluck(1)} {units}"),
      fill = "#8bd646ff",
      color = "black",
      segment.color = "black",
      segment.size = .5,
      nudge_y = 1,
      nudge_x = -20
    ) +
    geom_segment(data = conf_tbl %>%
      filter(estimate_p == percentile), aes(x = ciq_lower, xend = ciq_upper, y = qprobs, yend = qprobs), linetype = 3) +
    scale_y_continuous(limits = range(conf_tbl$qprobs), breaks = probs_of_interest_tbl$qprobs, labels = 100 * probs_of_interest_tbl$estimate_p) +
    theme_bw() +
    labs(
      x = "Stress (MPa)",
      y = "Percentile",
      caption = "Dotted line is estimate of failure strength distribution\nConfidence bands are 2-sided 90%",
      title = "Sensitivity Testing of Stent Durability",
      subtitle = str_glue("TBD_2")
    )
}

b <- norm_probability_plt_fcn(rileysim_conf_tbl, percentile = .1)

b +
  labs(subtitle = str_glue("Estimates: Mean = {muhat %>% round(1)} MPa, Sigma = {sighat %>% round(1)} MPa"))

Conclusion and Thanks

There you have it! A way to determine failure strength distributions using principled, sequential testing techniques guided by gonogo. If you’ve made it this far I really appreciate your attention. I would like to thank Paul Roediger for producing this amazing toolkit and for graciously offering your time to help me troubleshoot and work through some challenges on this project earlier this year. I’ll always be grateful to the open source community for making these things possible.

Session Info

sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] ggalt_0.4.0     gonogo_0.1.0    here_1.0.0      ggrepel_0.8.2  
##  [5] plotly_4.9.2.1  gt_0.2.2        forcats_0.5.0   stringr_1.4.0  
##  [9] dplyr_1.0.7     purrr_0.3.4     readr_1.4.0     tidyr_1.1.3    
## [13] tibble_3.1.4    ggplot2_3.3.5   tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] fs_1.5.0           lubridate_1.7.9.2  ash_1.0-15         RColorBrewer_1.1-2
##  [5] httr_1.4.2         rprojroot_2.0.2    tools_4.0.3        backports_1.2.0   
##  [9] bslib_0.3.0        utf8_1.2.2         R6_2.5.1           KernSmooth_2.23-18
## [13] DBI_1.1.0          lazyeval_0.2.2     colorspace_2.0-2   withr_2.4.2       
## [17] tidyselect_1.1.1   compiler_4.0.3     extrafontdb_1.0    cli_3.0.1         
## [21] rvest_0.3.6        xml2_1.3.2         labeling_0.4.2     bookdown_0.21     
## [25] sass_0.4.0         scales_1.1.1       checkmate_2.0.0    proj4_1.0-10      
## [29] digest_0.6.28      rmarkdown_2.11     pkgconfig_2.0.3    htmltools_0.5.2   
## [33] extrafont_0.17     dbplyr_2.0.0       fastmap_1.1.0      highr_0.9         
## [37] maps_3.3.0         htmlwidgets_1.5.4  rlang_0.4.11       readxl_1.3.1      
## [41] rstudioapi_0.13    jquerylib_0.1.4    generics_0.1.0     farver_2.1.0      
## [45] jsonlite_1.7.2     magrittr_2.0.1     Rcpp_1.0.7         munsell_0.5.0     
## [49] fansi_0.5.0        lifecycle_1.0.1    stringi_1.7.4      yaml_2.2.1        
## [53] MASS_7.3-53        grid_4.0.3         crayon_1.4.1       haven_2.3.1       
## [57] hms_0.5.3          knitr_1.34         pillar_1.6.2       reprex_0.3.0      
## [61] glue_1.4.2         evaluate_0.14      blogdown_0.15      data.table_1.14.0 
## [65] modelr_0.1.8       vctrs_0.3.8        Rttf2pt1_1.3.8     cellranger_1.1.0  
## [69] gtable_0.3.0       assertthat_0.2.1   xfun_0.26          broom_0.7.8       
## [73] viridisLite_0.4.0  ellipsis_0.3.2

  1. https://www.wiley.com/en-us/Statistical+Methods+for+Reliability+Data%2C+2nd+Edition-p-9781118115459↩︎

  2. https://www2.lbl.gov/ritchie/Library/PDF/2020%20Cao-Nitinol%20HCF.pdf↩︎

  3. https://en.wikipedia.org/wiki/Neyer_d-optimal_test↩︎

  4. Neyer provided a commercial software called Sentest to facilitate practitioners who wanted to use D-optimal methods without developing their own software↩︎

  5. https://arxiv.org/ftp/arxiv/papers/2011/2011.11177.pdf↩︎

  6. Since we only care about a 1-sided confidence interval, the 1-sided confidence bound on a specified quantile becomes analogous to a tolerance interval↩︎