The Benjamini-Hochberg Procedure (FDR) and Adjusted P-Values Explained

How to calculate adjusted p-values in R, Python and Excel

George Pipis

Photo by Carlos Muza on Unsplash

In this tutorial, we will show you how to apply the Benjamini-Hochberg procedure to control the False Discovery Rate (FDR) and calculate adjusted p-values.

The Benjamini-Hochberg procedure, often referred to simply as the False Discovery Rate (FDR) procedure, is a statistical method used in multiple hypothesis testing to control the expected proportion of false discoveries. In many scientific studies or experiments, researchers test multiple hypotheses simultaneously. As the number of tests grows, so does the probability of obtaining at least one false positive result, which inflates the overall Type I error rate.
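To make the inflation concrete, here is a small sketch of the arithmetic. It assumes the tests are independent, that every null hypothesis is true, and that each test is run at the same significance level; the function name `family_wise_error_rate` is ours, chosen for illustration.

```python
def family_wise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across m independent
    tests, each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


# The chance of at least one false positive grows quickly with m.
for m in (1, 5, 20, 100):
    rate = family_wise_error_rate(m)
    print(f"m={m:>3}: P(at least one false positive) = {rate:.3f}")
```

With 20 independent tests at the 0.05 level, the chance of at least one false positive is already about 64%.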

The Benjamini-Hochberg procedure addresses this issue by controlling the FDR, which is defined as the expected proportion of false positives among the rejected hypotheses.
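Written as a formula, this definition is commonly stated as follows. The symbols V (the number of true null hypotheses that were rejected) and R (the total number of rejected hypotheses) are notation introduced here for illustration, and the max in the denominator is the usual convention that the proportion counts as 0 when nothing is rejected.

$$\mathrm{FDR} = \mathbb{E}\left[\frac{V}{\max(R, 1)}\right]$$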

The Benjamini-Hochberg procedure compares each p-value to a critical value that depends on its rank and on the desired FDR level: with m hypotheses and a target FDR of α, the i-th smallest p-value is compared to (i/m)·α. Adjusted p-values express the same rule in a different form, so that each adjusted p-value can be compared directly to α.
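As an illustration of the rank-dependent critical value, here is a minimal sketch of the standard step-up rule: find the largest rank i whose p-value is at most (i/m)·α, then reject every hypothesis ranked at or below it. The function name `bh_reject` and the example p-values are made up for this sketch.

```python
def bh_reject(p_values, alpha=0.05):
    """Return a list of booleans (in the input order) marking which
    hypotheses the Benjamini-Hochberg step-up rule rejects at level alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    # Largest rank whose p-value falls at or below its critical value.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank

    # Reject every hypothesis ranked 1..k, even if some of them
    # individually exceeded their own critical value.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


p_values = [0.039, 0.001, 0.6, 0.008, 0.041]  # illustrative values
print(bh_reject(p_values))  # [False, True, False, True, False]
```

Here the critical values are 0.01, 0.02, 0.03, 0.04 and 0.05, so only the two smallest p-values (0.001 and 0.008) are rejected.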

Here are the steps involved in calculating adjusted p-values using the Benjamini-Hochberg procedure:

  1. Sort the p-values in ascending order.
  2. Assign a rank or position (denoted as “i”) to each p-value based on its position in the sorted list.
  3. Calculate the adjusted p-value for each p-value using the formula:

$$p_{\text{adjusted}}(i) = \min\left\{1,\ \min_{j \ge i}\left\{\frac{m \, p(j)}{j}\right\}\right\}$$

where:
  • $p_{\text{adjusted}}(i)$ represents the adjusted p-value for the i-th ranked p-value.
  • $p(j)$ represents the j-th ranked p-value.
  • m represents the total number of hypotheses being tested.

In this formula, for each p-value at rank i, the adjusted p-value is the smaller of 1 and the minimum of the ratios m·p(j)/j (the j-th ranked p-value multiplied by the total number of hypotheses m and divided by its rank j) taken over all j greater than or equal to i. The purpose of taking the minimum ratio is to find the smallest value that controls the FDR while considering all hypotheses ranked at…
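The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function name `bh_adjust` and the example p-values are ours. It walks from the largest rank down to the smallest while keeping a running minimum of m·p(j)/j, which is one way to obtain the minimum over all j ≥ i, and starting that running minimum at 1 applies the cap at 1.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Steps 1 and 2: sort the p-values and assign ranks 1..m.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    adjusted = [0.0] * m
    running_min = 1.0  # starting at 1 caps every adjusted p-value at 1

    # Step 3: go from rank m down to rank 1 so that running_min always
    # holds the minimum of m * p(j) / j over all ranks j >= current rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, m * p_values[idx] / rank)
        adjusted[idx] = running_min
    return adjusted


p_values = [0.039, 0.001, 0.6, 0.008, 0.041]  # illustrative values
print([round(p, 5) for p in bh_adjust(p_values)])
# [0.05125, 0.005, 0.6, 0.02, 0.05125]
```

Note that 0.039 has a raw ratio of 5 × 0.039 / 3 = 0.065, but its adjusted value is 0.05125 because the next rank gives a smaller ratio. For real analyses, R's `p.adjust(p, method = "BH")` and `multipletests(..., method="fdr_bh")` from Python's statsmodels package provide this adjustment.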

George Pipis

Sr. Director, Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/