
A subset of data from the European Bioinformatics Institute ArrayExpress repository. The BMD data consist of gene expression measurements for 54,675 probes from 84 Norwegian women.

Format

A data frame with 84 rows and 101 columns.

Source

https://www.ebi.ac.uk/biostudies/files/E-MEXP-1618/Normarrayexpressdata.txt.magetab
https://www.ebi.ac.uk/biostudies/files/E-MEXP-1618/E-MEXP-1618.sdrf.txt

Details

Because the dataset contains a large number of variables, we pre-screened them to keep the subset most correlated with the outcome of interest, the total hip T-score. We first log-transformed all predictors and then computed a robust correlation estimate based on Winsorization between each predictor and the outcome. The screened data comprise measurements of $p = 100$ genes from $n = 84$ Norwegian women.
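The screening idea can be illustrated with a minimal base-R sketch on simulated data. This is not the code used to build the dataset: the helpers `winsorize()` and `wins_cor()`, the clipping constant `const = 2`, the median/MAD standardization, and the toy dimensions are all assumptions made for illustration, and the original screening may have used a different Winsorized correlation estimator.

```r
# Hypothetical helper: standardize robustly (median/MAD), then clip at +/- const.
winsorize <- function(x, const = 2) {
  z <- (x - median(x)) / mad(x)
  pmin(pmax(z, -const), const)
}

# Hypothetical helper: Pearson correlation of the Winsorized variables.
wins_cor <- function(x, y, const = 2) {
  cor(winsorize(x, const), winsorize(y, const))
}

set.seed(1)
n <- 84                                  # number of observations
p <- 500                                 # toy number of candidate predictors
X <- matrix(rexp(n * p) + 1, n, p)       # simulated positive "expression" values
y <- rnorm(n)                            # simulated outcome

logX <- log(X)                           # step 1: log-transform the predictors
r <- apply(logX, 2, wins_cor, y = y)     # step 2: robust correlation with the outcome
keep <- order(abs(r), decreasing = TRUE)[1:100]  # step 3: keep the 100 strongest

screened <- data.frame(y = y, logX[, keep])
dim(screened)
```

With these toy dimensions, `screened` has one outcome column plus 100 predictor columns, which mirrors the layout described under Format.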

Examples

data(datascreen)  # load the screened data frame
dim(datascreen)   # Format above states 84 rows and 101 columns