Stats

Boundary Conditions and Anatomy - Correlated Data and Kernel Density Estimation in R

Measurements taken from patient anatomy are often correlated. For example, larger blood vessels might tend to have less curvature. Additionally, data are rarely Gaussian, instead favoring skewed shapes with some very large values and a lower bound of zero. These properties can make simulation and inference hard. In this post I will walk through a workflow for an engineering problem of the kind that comes up in my industry.
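As a taste of the approach, here is a minimal sketch of simulating correlated, right-skewed data and fitting a kernel density estimate in R. The distributions, correlation, and units below are invented for illustration and are not taken from the post itself.

```r
# Sketch: correlated, right-skewed "anatomy" data plus a kernel density
# estimate. All parameter values here are hypothetical.
library(MASS)  # for mvrnorm()

set.seed(2015)

# Draw correlated normals on the log scale, then exponentiate so both
# variables are skewed with a lower bound of zero
sigma <- matrix(c( 1.0, -0.6,
                  -0.6,  1.0), nrow = 2)   # negative correlation
log_draws <- mvrnorm(n = 500, mu = c(2, 0.5), Sigma = sigma)

vessel_diameter  <- exp(log_draws[, 1])    # hypothetical units
vessel_curvature <- exp(log_draws[, 2])    # hypothetical units

cor(vessel_diameter, vessel_curvature)     # larger vessels, less curvature

# Kernel density estimate of the skewed diameter distribution
d <- density(vessel_diameter)
plot(d, main = "KDE of simulated vessel diameter")
```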

Creating and Using a Simple, Bayesian Linear Model (in brms and R)

This post is my good-faith effort to create a simple linear model using the Bayesian framework and workflow described by Richard McElreath in his book Statistical Rethinking. As always - please view this post through the lens of the eager student and not the learned master. I did my best to check my work, but it's entirely possible that I missed something. Please let me know - I won't take it personally.
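For flavor, a minimal sketch of what a simple Bayesian linear model looks like in brms follows. The data, priors, and sampler settings are placeholders chosen for illustration, not the ones used in the post.

```r
# Sketch: a simple Bayesian linear regression in brms with explicit priors.
# The simulated data and prior choices are illustrative only.
library(brms)

set.seed(42)
d <- data.frame(x = runif(100, 0, 10))
d$y <- 2 + 0.5 * d$x + rnorm(100, sd = 1)

fit <- brm(
  y ~ 1 + x,
  data   = d,
  family = gaussian(),
  prior  = c(
    set_prior("normal(0, 5)", class = "Intercept"),
    set_prior("normal(0, 2)", class = "b"),
    set_prior("exponential(1)", class = "sigma")
  ),
  chains = 4, iter = 2000, seed = 42
)

summary(fit)  # posterior summaries and convergence diagnostics
plot(fit)     # trace plots and marginal posteriors
```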

Confounders and Colliders - Modeling Spurious Correlations in R

Like many engineers, my first models were based on Designed Experiments in the tradition of Cox and Montgomery. I hadn't seen anything like a causal diagram until I picked up The Book of Why, which explores all sorts of experimental relationships and structures I never imagined. Colliders, confounders, causal diagrams, M-bias - these concepts are all relatively new to me and I want to understand them better. In this post I will attempt to create some simple structural causal models (SCMs) for myself using the dagitty and ggdag packages and then show the potential effects of confounders and colliders on a simulated experiment adapted from here.
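To preview the idea, here is a small sketch, assuming a toy DAG with placeholder variable names (not the post's actual example), of how dagitty identifies adjustment sets and how confounders and colliders distort a naive regression.

```r
# Sketch: a toy SCM with a confounder Z and a collider C, plus a simulation
# showing how each can bias a regression. Variable names are placeholders.
library(dagitty)
library(ggdag)

dag <- dagitty("dag {
  Z -> X
  Z -> Y
  X -> C
  Y -> C
}")

# Which variables must be adjusted for to identify the effect of X on Y?
adjustmentSets(dag, exposure = "X", outcome = "Y")  # suggests { Z }

ggdag(dag) + theme_dag()                            # draw the DAG

# Confounding: Z causes both X and Y; X has no real effect on Y
set.seed(1)
n <- 1e4
Z <- rnorm(n)
X <- Z + rnorm(n)
Y <- Z + rnorm(n)
coef(lm(Y ~ X))["X"]          # spurious slope near 0.5
coef(lm(Y ~ X + Z))["X"]      # near 0 after adjusting for the confounder

# Collider bias: conditioning on a common effect C induces a false association
C <- X + Y + rnorm(n)
coef(lm(Y ~ X + Z + C))["X"]  # biased away from 0 by adjusting for the collider
```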

Modeling Particulate Counts as a Poisson Process in R

I’ve never really worked much with Poisson data and wanted to get my hands dirty. I thought that for this project I might combine a Poisson data set with the simple Bayesian methods that I’ve explored before, since it turns out the Poisson rate parameter lambda also has a nice conjugate prior (more on that later). Poisson-distributed data are counts per unit time or space - events that arrive at random intervals but at a characteristic rate, described by a parameter that equals both the mean and the variance of the distribution.
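The conjugate prior in question is the gamma distribution: with a Gamma(a, b) prior on lambda and observed counts y1, ..., yn, the posterior is Gamma(a + sum(y), b + n). A minimal sketch of that update, with invented prior parameters and simulated counts, looks like this:

```r
# Sketch: gamma-Poisson conjugate update. Prior parameters and "observed"
# particulate counts are invented for illustration.
set.seed(10)

a <- 2   # gamma shape (hypothetical prior belief about lambda)
b <- 1   # gamma rate

# Simulated particulate counts from 30 samples with true rate 4
y <- rpois(n = 30, lambda = 4)

# Conjugacy: posterior is Gamma(a + sum(y), b + n)
a_post <- a + sum(y)
b_post <- b + length(y)

a_post / b_post                                      # posterior mean of lambda
qgamma(c(0.025, 0.975), shape = a_post, rate = b_post)  # 95% credible interval
```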

Stopping Rules for Significance Testing in R

When doing comparative testing it can be tempting to stop as soon as we see the result we hoped for. In the case of null hypothesis significance testing (NHST), the desired outcome is often a p-value below 0.05. In the medical device industry, benchtop testing can cost a lot of money. Why not just recalculate the p-value after every test and stop as soon as it drops below 0.05? The reason is that the confidence statement attached to your testing is only valid for a specific stopping rule.
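A quick simulation makes the danger concrete. The sketch below, with illustrative sample sizes and peeking schedule of my own choosing, repeatedly tests two identical groups and stops the moment p < 0.05; the resulting false positive rate lands well above the nominal 5%.

```r
# Sketch: "peeking" at the p-value inflates the Type I error rate.
# The null is true in every simulated experiment.
set.seed(2020)

peek_sim <- function(n_max = 100) {
  x <- rnorm(n_max)
  y <- rnorm(n_max)                  # same distribution: no real difference
  for (n in seq(10, n_max, by = 2)) {
    p <- t.test(x[1:n], y[1:n])$p.value
    if (p < 0.05) return(TRUE)       # stopped early, declared "significance"
  }
  FALSE
}

mean(replicate(2000, peek_sim()))    # well above the nominal 0.05
```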

Assessing Design Verification Risk with Bayesian Estimation in R

Suppose our team is preparing to freeze a new implant design. In order to move into the next phase of the product development process (PDP), it is common to perform a suite of formal “Design Freeze” testing. If the results of the Design Freeze testing are acceptable, the project can advance from Design Freeze (DF) into Design Verification (DV). DV is an expensive and resource-intensive phase culminating in formal reports that are included in the regulatory submission.

Permutation Test for NHST of 2 Samples in R

As engineers, we are often asked to determine whether two different configurations of a product perform the same. Perhaps we are asked to compare the durability of a next-generation prototype to the current generation. Sometimes we are testing the flexibility of our device against a competitor’s for marketing purposes. Maybe we identify a new vendor for a raw material but must first understand whether the resulting finished product will perform any differently than when built using material from the standard supplier.
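The core of the method is simple enough to sketch in a few lines of base R. Here is a minimal two-sample permutation test, assuming made-up durability data; shuffle the group labels many times, recompute the difference in means each time, and see how extreme the observed difference is against that null distribution.

```r
# Sketch: two-sample permutation test with hypothetical durability data.
set.seed(7)

current <- rnorm(20, mean = 100, sd = 10)   # hypothetical cycles-to-failure
proto   <- rnorm(20, mean = 105, sd = 10)

obs_diff <- mean(proto) - mean(current)
pooled   <- c(current, proto)
n1       <- length(current)

# Shuffle group labels many times and recompute the difference in means
perm_diffs <- replicate(10000, {
  shuffled <- sample(pooled)
  mean(shuffled[-(1:n1)]) - mean(shuffled[1:n1])
})

# Two-sided p-value: how often a random labeling beats the observed difference
mean(abs(perm_diffs) >= abs(obs_diff))
```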