Home
Welcome to the CodeXGLUE wiki!
This repo includes datasets and code for CodeXGLUE, a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark for CODE. It includes 14 datasets for 10 diversified programming language tasks covering code-code (clone detection, defect detection, cloze test, code completion, code refinement, and code-to-code translation), text-code (natural language code search, text-to-code generation), code-text (code summarization), and text-text (documentation translation) scenarios. We provide three baseline models to support these tasks: a BERT-style pre-trained model (i.e. CodeBERT), a GPT-style pre-trained model which we call CodeGPT, and an Encoder-Decoder framework. Most of these are based on large pre-trained models like CodeBERT.