Question regarding ood run and predicted output #37

Open
piyushjo15 opened this issue Oct 23, 2024 · 3 comments

Comments

@piyushjo15

Hi,

Not an issue, but a few questions regarding processing and output.

I am using CellOT on scGen embeddings with the sciplex3 data. Basically, for unseen single-cell tumor data, I want to predict which drug can bring the cells closer to a normal cell type. These tumors are not like the cancer types used in the sciplex3 data, so I am sure the predictions will have serious accuracy issues. However, I still want to give it a try.

  1. I have 5 samples, but I cannot figure out how to make predictions for each sample separately without training the model again. Currently I am combining all the data together, which brings the total number of cells in the holdout sample to ~20k.
  2. When I set test=0.6, does it mean that 60% of the holdout sample is used for perturbation prediction? It doesn't seem so. How can I predict the perturbed cell state for more cells?
  3. Is it possible to run only the scGen part to obtain the embedding, rather than running the entire training process with the scGen model? It is quite time-consuming and I don't need the scGen output.
  4. I am obtaining the prediction in data_space, and hence getting a matrix of predicted expression, which I am hoping is log-transformed. However, I see a lot of negative values. Should I set expression values below 0 to 0? Is the matrix z-scored?

Thanks for your help!

@JeanRadig

Hey @piyushjo15, could you let me know how you managed to run CellOT on scGen's embeddings? I am not sure how to modify the config files or change the command lines for this purpose. Thank you! :)

@piyushjo15
Author

Hi @JeanRadig, not sure if this is helpful, but I ran CellOT like this. I wish the main developer could answer, but I will not count on it.

I was basically interested in identifying how the drugs would impact an unseen sample, so I used the OOD mode.


echo "running CellOT for drug: $drug in ood mode"
mode="ood"
# $CELLOT is the location of the cellot GitHub folder

echo "first generating scGen latent representation..."

python $CELLOT/scripts/train.py \
--outdir $CELLOT/results/sciplex3PA/${drug}/model-scgen \
--config $CELLOT/configs/tasks/sciplex3-PA-ood.yaml \
--config $CELLOT/configs/models/scgen.yaml \
--config.data.target $drug \
--config.datasplit.mode $mode

I set the drug in $drug and set $mode to "ood". I created sciplex3-PA-ood.yaml to be very similar to the provided example crossspecies-ood.yaml.

Then I ran CellOT using the scGen embeddings:

echo "now running CellOT model..."

python $CELLOT/scripts/train.py \
--outdir $CELLOT/results/sciplex3PA/${drug}/model-cellot \
--config $CELLOT/configs/tasks/sciplex3-PA-ood.yaml \
--config $CELLOT/configs/models/cellot.yaml \
--config.data.target $drug \
--config.datasplit.mode $mode \
--config.data.ae_emb.path $CELLOT/results/sciplex3PA/${drug}/model-scgen

--config.data.ae_emb.path defines the path to the scGen embedding (the scGen output directory from the first step).
Then I predicted the changed expression:

echo "now running prediction for CellOT model.."

python $CELLOT/scripts/evaluate.py --outdir $CELLOT/results/sciplex3PA/${drug}/model-cellot --setting ood --where data_space
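
For reference, the three steps above could be wrapped into one function and looped over several drugs. This is only a sketch that reuses the exact commands and config paths from this comment. The `run` helper, the `DRY_RUN` switch, the default value of `CELLOT`, and the drug names `drugA` and `drugB` are hypothetical and not part of cellot. By default it only prints the commands it would run.

```shell
#!/bin/sh
set -eu

# Assumption: CELLOT points at the cloned cellot repository (placeholder default).
CELLOT="${CELLOT:-./cellot}"
# Hypothetical switch: DRY_RUN=1 prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Train scGen, train CellOT on the scGen embedding, then predict in data space.
# $1 is the drug name passed as --config.data.target.
run_cellot_ood() {
    task="$CELLOT/configs/tasks/sciplex3-PA-ood.yaml"
    out="$CELLOT/results/sciplex3PA/$1"

    echo "# scGen latent representation for drug: $1"
    run python "$CELLOT/scripts/train.py" \
        --outdir "$out/model-scgen" \
        --config "$task" \
        --config "$CELLOT/configs/models/scgen.yaml" \
        --config.data.target "$1" \
        --config.datasplit.mode ood

    echo "# CellOT model on the scGen embedding"
    run python "$CELLOT/scripts/train.py" \
        --outdir "$out/model-cellot" \
        --config "$task" \
        --config "$CELLOT/configs/models/cellot.yaml" \
        --config.data.target "$1" \
        --config.datasplit.mode ood \
        --config.data.ae_emb.path "$out/model-scgen"

    echo "# prediction in data space"
    run python "$CELLOT/scripts/evaluate.py" \
        --outdir "$out/model-cellot" \
        --setting ood \
        --where data_space
}

# Placeholder drug names; replace with the targets of interest.
for drug in drugA drugB; do
    run_cellot_ood "$drug"
done
```

Setting `DRY_RUN=0` and exporting `CELLOT` to the real repository location would execute the commands instead of printing them. Each drug still gets its own scGen and CellOT training run, so this does not address predicting per sample without retraining.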

I was able to obtain results, but the predicted expression data had negative values. Since I didn't get much help from the developer on whether I can really use it for my data, I gave up on trying it further.

@JeanRadig

Thank you very much for the pointers!
Concerning the negative values, I am not sure whether this is best practice, but I have seen some people set negative values to zero and then continue with downstream analysis.
