SurvLIMEpy implements SurvLIME algorithm (Survival Local Interpretable Model-agnostic Explanation), a local interpretable algorithm for Survival Analysis, which was proposed in this paper.
The publication in which we introduce this package will soon be available.
SurvLIMEpy can be installed from PyPI:
pip install survlimepy
from survlimepy import SurvLimeExplainer
from survlimepy.load_datasets import Loader
from sksurv.linear_model import CoxPHSurvivalAnalysis
# Load the dataset
loader = Loader(dataset_name='veterans')
X, events, times = loader.load_data()
# Train a model
train, test = loader.preprocess_datasets(X, events, times)
model = CoxPHSurvivalAnalysis()
model.fit(train[0], train[1])
# Use SurvLimeExplainer class to find the feature importance
training_features = train[0]
training_events = [event for event, _ in train[1]]
training_times = [time for _, time in train[1]]
explainer = SurvLimeExplainer(
training_features=training_features,
training_events=training_events,
training_times=training_times,
model_output_times=model.event_times_,
)
# explanation variable will have the computed SurvLIME values
explanation = explainer.explain_instance(
data_row=test[0].iloc[0],
predict_fn=model.predict_cumulative_hazard_function,
num_samples=500,
)
print(explanation)
# Display the weights
explainer.plot_weights()
Our package can manage multiple types of survival models as long as the functionality that predicts is implemented as a function that takes a vector of size
We choose to ensure the integration of these algorithms with SurvLIMEpy as they are the most predominant in the field. Note that if a new survival package is developed, SurvLIMEpy will support it as long as the output provided by the predict function is a vector of length
Please if you use this package, do not forget to cite us:
@article{PACHONGARCIA2024121620,title = {SurvLIMEpy: A Python package implementing SurvLIME},
journal = {Expert Systems with Applications},
volume = {237},
pages = {121620},
year = {2024},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2023.121620},
url = {https://www.sciencedirect.com/science/article/pii/S095741742302122X},
author = {Cristian Pachón-García and Carlos Hernández-Pérez and Pedro Delicado and Verónica Vilaplana},
keywords = {Interpretable machine learning, eXplainable artificial intelligence, Survival analysis, Machine learning, },
abstract = {In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method allows to compute local feature importance for machine learning algorithms designed for modelling Survival Analysis data. The presented implementation uses a matrix-wise formulation, which allows to speed up the execution time. Additionally, SurvLIMEpy assists the user with visualisation tools to better understand the result of the algorithm. The package supports a wide variety of survival models, from the Cox Proportional Hazards Model to deep learning models such as DeepHit or DeepSurv. Two types of experiments are presented in this paper. First, by means of simulated data, we study the ability of the algorithm to capture the importance of the features. Second, we use three open source survival datasets together with a set of survival algorithms in order to demonstrate how SurvLIMEpy behaves when applied to different models.}
}
SurvLIME: Maxim S. Kovalev, Lev V. Utkin, & Ernest M. Kasimov (2020). SurvLIME: A method for explaining machine learning survival models. Knowledge-Based Systems, 203, 106164.
Cox Proportional Hazards Model: Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187–202.
Random Survival Forest: Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics, 2(3), 841–860. doi:10.1214/08-AOAS169
Survival regression with accelerated failure time model in XGBoost: Barnwal, A., Cho, H., & Hocking, T. (2022). Survival Regression with Accelerated Failure Time Model in XGBoost. Journal of Computational and Graphical Statistics, 0(0), 1–11. doi:10.1080/10618600.2022.2067548
DeepHit: Lee, C., Zame, W., Yoon, J., & van der Schaar, M. (2018). DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). doi:10.1609/aaai.v32i1.11842
DeepSurv: Katzman, J., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (02 2018). DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18. doi:10.1186/s12874-018-0482-1
PBC: Therneau, T.M., Grambsch, P.M. (2000). Expected Survival. In: Modeling Survival Data: Extending the Cox Model. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-3294-8_10
Lung: Loprinzi CL. Laurie JA. Wieand HS. Krook JE. Novotny PJ. Kugler JW. Bartel J. Law M. Bateman M. Klatt NE. et al. Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group. Journal of Clinical Oncology. 12(3):601-7, 1994.
UDCA: (1) T. M. Therneau and P. M. Grambsch, Modeling survival data: extending the Cox model. Springer, 2000; (2) K. D. Lindor, E. R. Dickson, W. P Baldus, R.A. Jorgensen, J. Ludwig, P. A. Murtaugh, J. M. Harrison, R. H. Weisner, M. L. Anderson, S. M. Lange, G. LeSage, S. S. Rossi and A. F. Hofman. Ursodeoxycholic acid in the treatment of primary biliary cirrhosis. Gastroenterology, 106:1284-1290, 1994.
Veterans: D Kalbfleisch and RL Prentice (1980), The Statistical Analysis of Failure Time Data. Wiley, New York.
sksurv: Sebastian Polsterl (2020). scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. Journal of Machine Learning Research, 21(212), 1-6.
xgbse: Davi Vieira, Gabriel Gimenez, Guilherme Marmerola, & Vitor Estima. (2020). XGBoost Survival Embeddings: improving statistical properties of XGBoost survival analysis implementation.
pycox: Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and Cox regression. Journal of Machine Learning Research, 20(129):1–30, 2019.
This research was supported by the Spanish Research Agency (AEI) under projects PID2020-116294GB-I00 and PID2020-116907RB-I00 of the call MCIN/ AEI /10.13039/501100011033, the project 718/C/2019 with code 201923-31 funded by Fundació la Marato de TV3 and the grant 2020 FI SDUR 306 funded by AGAUR.