After clustering has been performed, we can evaluate the clusters in a few different ways.
Use CompareClusters.Rmd to compute aggregate gene expression per cluster using known marker genes and to render a dot plot.
Alternatively, run the query gene_expression_by_cluster.sql manually in the BigQuery Web UI, with the bq command line tool, or by any other client. Replace anything written in {{ JINJA MARKUP }} with the actual tables or values you want to use.
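If you would rather not edit the placeholders by hand, the substitution can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the placeholders are plain `{{ NAME }}` variables; the template text, the placeholder names `CLUSTER_TABLE` and `MAX_ROWS`, and the table name are hypothetical and are not taken from gene_expression_by_cluster.sql.

```python
import re

# Matches simple placeholders such as {{ CLUSTER_TABLE }}, with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(sql_template, values):
    """Replace every {{ NAME }} in sql_template with str(values[NAME])."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than send a half-filled query to BigQuery.
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, sql_template)


# Hypothetical template and values, for illustration only.
template = "SELECT cluster, gene FROM `{{ CLUSTER_TABLE }}` LIMIT {{ MAX_ROWS }}"
query = fill_placeholders(
    template,
    {"CLUSTER_TABLE": "my-project.my_dataset.clusters", "MAX_ROWS": 10},
)
print(query)
```

This handles variable substitution only. If a query template uses other Jinja features such as loops or filters, a full template engine such as the jinja2 package would be needed instead.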
BigQuery supports a wide range of functions and operators. It also supports JavaScript user-defined functions.
Testing_R_vs_JavaScript.Rmd demonstrates one way to test mathematical results generated via BigQuery against those from an alternate environment such as R.
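The core of such a test is a tolerance-based comparison, because two environments rarely agree to the last floating-point digit. The sketch below shows that idea in Python rather than R; it is not the method used in the notebook. The function name `compare_results`, the tolerances, and the gene names and values are all hypothetical.

```python
import math


def compare_results(bigquery_values, reference_values, rel_tol=1e-6, abs_tol=1e-12):
    """Return (key, bigquery_value, reference_value) for every disagreement.

    A key counts as a disagreement when it is missing from either side or
    when the two numbers differ by more than the given tolerances.
    """
    mismatches = []
    for key in sorted(set(bigquery_values) | set(reference_values)):
        bq = bigquery_values.get(key)
        ref = reference_values.get(key)
        if bq is None or ref is None:
            mismatches.append((key, bq, ref))
        elif not math.isclose(bq, ref, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((key, bq, ref))
    return mismatches


# Hypothetical p-values keyed by gene, for illustration only.
from_bigquery = {"GeneA": 0.0312500001, "GeneB": 0.5}
from_reference = {"GeneA": 0.03125, "GeneB": 0.25, "GeneC": 0.125}

mismatches = compare_results(from_bigquery, from_reference)
print(mismatches)  # [('GeneB', 0.5, 0.25), ('GeneC', None, 0.125)]
```

GeneA passes because the two values differ only in the tenth decimal place, which is inside the relative tolerance. Choosing the tolerance is a judgment call that depends on how the numbers are computed on each side.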
The specific example we demonstrate here is the differential expression implemented in https://github.com/broadinstitute/BipolarCell2016/blob/master/BCanalysis.pdf ported to BigQuery. To run it:
(1) Upload binomial_distribution.js to a Google Cloud Storage bucket.
(2) Use DifferentialExpression.Rmd to compute differential expression of one particular cluster compared to all others. It will materialize the result of differential_expression_one_vs_the_rest.sql to a new table.
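For orientation, the sketch below shows the kind of quantity a binomial distribution helper provides. The contents of binomial_distribution.js are not reproduced here, so this is an assumption based on the file name: the Python functions `binomial_pmf` and `binomial_upper_tail` are hypothetical names and are not the UDF's actual interface.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Illustrative numbers only: 10 cells, each expressing a gene with
# probability 0.5, and at least 8 of them observed to express it.
print(binomial_pmf(5, 10, 0.5))         # 0.24609375
print(binomial_upper_tail(8, 10, 0.5))  # 0.0546875
```

One plausible use in a one-versus-rest comparison, again stated as an assumption, is to take p as the fraction of cells outside the cluster that express a gene and ask how surprising it is to see k or more expressing cells among the n cells inside the cluster. Direct summation is adequate for small n; for clusters with many thousands of cells a log-space or library implementation would be more robust numerically.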
To run the BigQuery query integration tests:
- Install the test framework via
    pip install git+https://github.com/verilylifesciences/[email protected]
- and then run the test like so
    python cell_metrics_test.py