
epivaultr is an R package used to extract data from EpiVault, the Born in Bradford data warehouse. It can be used for quick grabs of specific variables from specific tables. It can also be used to manage an entire data request end to end, from reading a user’s requested variables, to writing output files ready for shipping.

In this guide we will install epivaultr, connect to EpiVault, and do a quick extract of a couple of variables from a single table.

Installation

epivaultr can be installed from GitHub using the devtools package:

devtools::install_github("BornInBradford/epivaultr")

If a firewall is blocking the download of the source tarball, you may be able to download and install a source zip manually as follows:

  • Navigate to the epivaultr GitHub page
  • In the right-hand sidebar, find the Releases list and click the release you wish to install
  • Under Assets click Source code (zip) to download the source zip and save it locally
  • Run the following command, substituting the local path to the source zip:
devtools::install_local("path/to/source/zip")
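
The manual install steps above can be wrapped in a small helper that fails early when the zip path is wrong. This is only a sketch: the function name `install_epivaultr_from_zip` is hypothetical and not part of epivaultr, and it assumes devtools can be installed from your default CRAN mirror if it is missing.

```r
# Hypothetical helper (not part of epivaultr): install from a downloaded source zip.
install_epivaultr_from_zip <- function(zip_path) {
  # Stop with a clear message if the path does not point to an existing file
  if (!file.exists(zip_path)) {
    stop("Source zip not found at: ", zip_path)
  }

  # Install devtools only if it is not already available
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }

  devtools::install_local(zip_path)
}

# Usage, substituting the local path to the source zip:
# install_epivaultr_from_zip("path/to/source/zip")
```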

A quick extract

The script below opens an EpiVault connection, extracts a couple of variables, and disconnects from EpiVault.

library(epivaultr)

con <- ev_connect(ev_server = "BHTS-RESRCH22DV", ev_database = "ResearchWarehouse")

dat <- ev_simple_fetch(con, 
                       project = "BiB_CohortInfo", 
                       table = "ethnicity",
                       variables = c("participant_type", "ethsource"))

ev_disconnect(con)
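
If the fetch fails partway through, the script above never reaches ev_disconnect(). One way to guard against that is to wrap the three calls in a function and register the disconnect with base R's on.exit(). The wrapper name `quick_extract` is hypothetical; the server and database defaults are simply the values used in the script above.

```r
# Hypothetical wrapper (not part of epivaultr): connect, fetch, and always disconnect.
quick_extract <- function(project, table, variables,
                          ev_server = "BHTS-RESRCH22DV",
                          ev_database = "ResearchWarehouse") {
  con <- epivaultr::ev_connect(ev_server = ev_server, ev_database = ev_database)

  # on.exit() runs when the function exits, whether normally or through an error
  on.exit(epivaultr::ev_disconnect(con), add = TRUE)

  epivaultr::ev_simple_fetch(con,
                             project = project,
                             table = table,
                             variables = variables)
}

# Usage, mirroring the quick extract above:
# dat <- quick_extract("BiB_CohortInfo", "ethnicity",
#                      c("participant_type", "ethsource"))
```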

Bigger data extracts

The ev_simple_fetch() function provides a convenient way to quickly grab a small number of variables. If, however, your project needs a large number of variables from across many tables, you may find it easier to maintain the list of variables in a separate file and fetch the data from that list.

The script below reads a variable request specification from a spreadsheet and extracts the data into a single ev_data container.

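# `con` must be an open EpiVault connection created with ev_connect();
# reconnect first if you have already called ev_disconnect()

# read the requested variables into an `ev_variables` container called `vars`
vars <- read_ev_variables("tests/example_inputs/variables.xlsx")

# extract the data and metadata tables into an `ev_data` container called `dat`
dat <- fetch_ev_data(con, vars)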

Retrieving data from a bigger extract

The data in dat can be retrieved using the get_ functions.

Get the names of the data tables

Get a data table

dat_table <- get_ev_data(dat, df_name = "proj1.tab1")

Get metadata

Various types of metadata can also be accessed from the ev_data container.

# get variable metadata
meta1 <- get_ev_metadata(dat, type = "variable")
# get category metadata
meta2 <- get_ev_metadata(dat, type = "category")
# get table metadata
meta3 <- get_ev_metadata(dat, type = "table")
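
If you want all three metadata types at once, the calls above can be collected into a named list. This is a minimal sketch: `get_all_metadata` is a hypothetical helper, not part of epivaultr, and it assumes only the three `type` values shown above. The `getter` argument exists so the helper can be tried out without a live ev_data container.

```r
# Hypothetical helper (not part of epivaultr): fetch every metadata type into a named list.
get_all_metadata <- function(dat, getter = epivaultr::get_ev_metadata) {
  # The three metadata types used in the examples above
  meta_types <- c("variable", "category", "table")

  # Call the getter once per type and name each result after its type
  meta <- lapply(meta_types, function(type) getter(dat, type = type))
  setNames(meta, meta_types)
}

# Usage with an `ev_data` container called `dat`:
# meta <- get_all_metadata(dat)
# meta$variable
```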

For more information on retrieving data and metadata from an ev_data container, see the Reference section.

For a detailed walk-through of how to run a complete data request, see the Data requests article.