Querying a GraphQL client for linked data using R

Henry Partridge edited this page Oct 19, 2017 · 1 revision

The following script uses the ghql R package (Chamberlain 2017) to query multidimensional RDF Data Cube (QB) datasets with GraphQL. The example uses the graphql-qb service at graphql-qb.publishmydata.com, which serves data from statistics.gov.scot.


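Install the ghql package from GitHub (and devtools from CRAN if it is not already installed)
# devtools cannot install itself, so fetch it from CRAN when it is missing
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")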
devtools::install_github("ropensci/ghql")
Load the necessary R packages
library(ghql) # for querying 
library(jsonlite) # for parsing the json response
library(httr) # for HTTP helpers such as request headers
library(tidyverse) # for tidying data
Initialize the GraphQL client by pointing it to the appropriate endpoint e.g. http://graphql-qb.publishmydata.com/graphql
client <- GraphqlClient$new(url = "http://graphql-qb.publishmydata.com/graphql")

No OAuth token is required for this endpoint, but the headers argument of GraphqlClient$new() can be used to supply one for endpoints that need authentication.
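For an endpoint that does need a token, a minimal sketch might look like the following. The environment variable name GRAPHQL_TOKEN and the object name auth_client are hypothetical, and the form of the headers argument is an assumption: the 2017 ghql release accepted httr::add_headers(), while later releases may expect a named list instead.

```r
library(ghql)
library(httr)

# Hypothetical environment variable holding a bearer token.
# The graphql-qb endpoint does not need one; this is for illustration only.
token <- Sys.getenv("GRAPHQL_TOKEN")

# Assumption: this ghql version accepts httr::add_headers() for `headers`.
# Later ghql releases may instead expect a named list, e.g.
# list(Authorization = paste0("Bearer ", token)).
auth_client <- GraphqlClient$new(
  url = "http://graphql-qb.publishmydata.com/graphql",
  headers = add_headers(Authorization = paste0("Bearer ", token))
)
```

Reading the token from an environment variable keeps it out of the script itself.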

Make a Query class object
qry <- Query$new()
Add your GraphQL query. The example below filters for datasets that have a gender dimension and returns the title and description of each
qry$query('query', '
  {
    datasets(dimensions: {and: ["http://statistics.gov.scot/def/dimension/gender"]}) {
      title
      description
    }
  }
')
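To run the same query against a different dimension, the query text can be built from a template. The sketch below is only one way to do this; build_datasets_query() and gender_query are hypothetical names, not part of ghql.

```r
# Hypothetical helper: fills a dimension URI into the datasets query template.
build_datasets_query <- function(dimension_uri) {
  sprintf('
  {
    datasets(dimensions: {and: ["%s"]}) {
      title
      description
    }
  }
', dimension_uri)
}

# Rebuild the gender query used above and print it for inspection.
gender_query <- build_datasets_query("http://statistics.gov.scot/def/dimension/gender")
cat(gender_query)
```

The resulting string can be passed to qry$query() in place of the literal query text.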
Execute the query and store the response
responses <- client$exec(qry$queries$query)
Convert to a data frame and inspect its columns
df <- as.data.frame(responses)
glimpse(df)
## Observations: 55
## Variables: 2
## $ data.datasets.title       <chr> "Mid-Year Population Estimates (hist...
## $ data.datasets.description <chr> "Mid-year estimates by age and gende...
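Depending on the installed ghql version, client$exec() may return the raw JSON text rather than a parsed list. In that case the response can be parsed with jsonlite first. The sketch below uses a small mock response in the shape of the query result, with two entries copied from the results on this page, so it runs without a network connection; mock_json and df_mock are hypothetical names.

```r
library(jsonlite)

# Mock response text with the same structure as the real query result.
mock_json <- '{
  "data": {
    "datasets": [
      {"title": "Pupil Attainment",
       "description": "Number of pupils who attained a given number of qualifications by level and stage."},
      {"title": "Disability Living Allowance",
       "description": "Number of Disability Living Allowance claimants by age group and gender."}
    ]
  }
}'

# fromJSON() simplifies the array of objects into a data frame
# with one column per field (title, description).
parsed <- fromJSON(mock_json)
datasets <- parsed$data$datasets

# Build a data frame with readable column names.
df_mock <- data.frame(
  Dataset = datasets$title,
  Description = datasets$description,
  stringsAsFactors = FALSE
)
str(df_mock)
```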
Change the column names
df <- rename(df, Dataset = data.datasets.title,
             Description = data.datasets.description)

The table below shows the first 6 responses.

| Dataset | Description |
| --- | --- |
| Mid-Year Population Estimates (historical geographical boundaries) | Mid-year estimates by age and gender. Higher level geographies are aggregated using 2001 data zones. |
| Pupil Attainment | Number of pupils who attained a given number of qualifications by level and stage. |
| Earnings | Median gross weekly earnings (£s) by gender and workplace/residence measure. |
| Income Support Claimants | Number of income support claimants by age and gender (age split not available for gender). |
| Healthy Life Expectancy | Years of Healthy Life Expectancy (including confidence intervals) by gender |
| Disability Living Allowance | Number of Disability Living Allowance claimants by age group and gender. |

References