Commit

Add details on controlling pathway clustering to the vignette
Quieted some function calls
willgryan committed Feb 2, 2024
1 parent 7a284a6 commit fcffec2
Showing 3 changed files with 15 additions and 6 deletions.
4 changes: 2 additions & 2 deletions R/generate_themes.R
@@ -24,11 +24,11 @@ generate_themes <-
     clust = stats::hclust(D_sim, method = hclust_method)
 
     #Dynamic tree cut to generate clusters
-    clustering = dynamicTreeCut::cutreeDynamic(clust, distM = as.matrix(D_sim), ...) %>%
+    clustering = dynamicTreeCut::cutreeDynamic(clust, distM = as.matrix(D_sim), verbose = 0, ...) %>%
       purrr::set_names(clust$labels) %>%
       tibble::enframe(name = "UniqueID", value = "Cluster") %>%
       dplyr::mutate(Cluster = as.factor(.data$Cluster)) %>%
-      dplyr::inner_join(PAVER_result$embedding_mat)
+      dplyr::inner_join(PAVER_result$embedding_mat, by = "UniqueID")
 
     #Average the embeddings within each cluster
     avg_cluster_embeddings = clustering %>%
2 changes: 1 addition & 1 deletion R/prepare_data.R
@@ -31,7 +31,7 @@ prepare_data <- function(input, embeddings, term2name) {
   #Generate an embedding table keyed by unique pathway IDs
   embedding_mat = embeddings[prepared_data$GOID,] %>%
     magrittr::set_rownames(prepared_data$UniqueID) %>%
-    tibble::as_tibble(rownames = "UniqueID", .name_repair = "universal")
+    tibble::as_tibble(rownames = "UniqueID", .name_repair = "universal_quiet")
 
   #Compute the UMAP of the embedding matrix
   custom.config = umap::umap.defaults
15 changes: 12 additions & 3 deletions vignettes/PAVER.Rmd
@@ -35,14 +35,23 @@ embeddings = readRDS(url("https://github.com/willgryan/PAVER_embeddings/raw/main
 term2name = readRDS(url("https://github.com/willgryan/PAVER_embeddings/raw/main/2023-03-06/term2name_2023-03-06.RDS"))
 PAVER_result = prepare_data(input, embeddings, term2name)
 ```
 
 # Identifying and Naming Pathways Clusters
 
-After preparing your data, PAVER can generate a set of pathway clusters and identify the most representative pathway (theme) for each cluster. The following code chunk demonstrates how to generate pathway clusters using the example data provided in the PAVER package. To control the minimum number of pathways in a cluster, we pass the `minClusterSize` argument to (dynamicTreeCut)[https://cran.r-project.org/package=dynamicTreeCut].
+After preparing your data, PAVER can generate a set of pathway clusters and identify the most representative pathway (theme) for each cluster. The following code chunk demonstrates how to generate pathway clusters using the example data provided in the PAVER package. To constrain the pathway clustering, we pass the following arguments to (dynamicTreeCut)[https://cran.r-project.org/package=dynamicTreeCut]. Increasing `minClusterSize` will result in fewer clusters, while increasing `maxCoreScatter` will result in more clusters.
+<!-- https://stackoverflow.com/questions/19734381/cutting-dendrogram-into-n-trees-with-minimum-cluster-size-in-r -->
 ```{r}
-PAVER_result = generate_themes(PAVER_result, minClusterSize = 40)
+minClusterSize = 5
+maxCoreScatter = 0.33
+minGap = (1 - maxCoreScatter) * 3 / 4
+PAVER_result = generate_themes(
+  PAVER_result,
+  maxCoreScatter = maxCoreScatter,
+  minGap = minGap,
+  minClusterSize = minClusterSize
+)
+#
 ```
 
 # Visualization
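
A note on the vignette change in this commit: the new chunk derives `minGap` from `maxCoreScatter` as `(1 - maxCoreScatter) * 3 / 4` and forwards both, along with `minClusterSize`, to `dynamicTreeCut::cutreeDynamic` through `generate_themes`. The sketch below is one way to see the effect of these arguments on toy data, outside of PAVER. It is illustrative only: `derive_min_gap`, the simulated points, and the `"average"` linkage are assumptions made for this example and are not taken from the package.

```r
# Hypothetical helper (not part of PAVER): derive minGap from maxCoreScatter
# with the same relationship used in the vignette chunk.
derive_min_gap <- function(maxCoreScatter) {
  stopifnot(maxCoreScatter >= 0, maxCoreScatter <= 1)
  (1 - maxCoreScatter) * 3 / 4
}

# Toy data: three 2-D blobs of 20 points each (assumed, for illustration only).
set.seed(1)
centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
toy_points <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(20, centers[i, 1], 0.3), rnorm(20, centers[i, 2], 0.3))
}))
rownames(toy_points) <- paste0("pathway_", seq_len(nrow(toy_points)))

# Hierarchical clustering on Euclidean distances; "average" linkage is an
# assumption here, not necessarily what generate_themes uses.
D <- stats::dist(toy_points)
clust <- stats::hclust(D, method = "average")

# Compare two maxCoreScatter settings, if dynamicTreeCut is installed.
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  for (mcs in c(0.33, 0.95)) {
    labels <- dynamicTreeCut::cutreeDynamic(
      clust,
      distM = as.matrix(D),
      maxCoreScatter = mcs,
      minGap = derive_min_gap(mcs),
      minClusterSize = 5,
      verbose = 0  # suppress progress output, as this commit does
    )
    # Label 0 marks unassigned points, so it is excluded from the count.
    n_clusters <- length(setdiff(unique(labels), 0))
    cat("maxCoreScatter =", mcs, "-> clusters:", n_clusters, "\n")
  }
}
```

Running the loop with different `minClusterSize` values as well gives a quick feel for how the settings interact before applying them to real pathway embeddings.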
