Merge pull request #9 from ml-stat-Sustech/development

update scores
ml-stat-Sustech · Dec 24, 2023 · 3416992
2 parents 6f0c70e + df02ac3
commit 3416992
Showing 23 changed files with 170 additions and 202 deletions.
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
@@ -2,7 +2,7 @@ name: Publish Python 🐍 distributions 📦 to PyPI
 
 
 on:
-# automatically running github actions when push a tag
+  # automatically running github actions when push a tag
   push:
     tags:
       - '*'
@@ -20,27 +20,27 @@ jobs:
       id-token: write
       contents: read
     steps:
-    - uses: actions/checkout@master
-    - name: Set up Python 3.10
-      uses: actions/setup-python@v3
-      with:
-        python-version: '3.10'
-    - name: Install pypa/setuptools
-      run: >-
-        python -m
-        pip install wheel
-        pip install readme_renderer[md]
-    - name: Build a binary wheel
-      run: >-
-        python setup.py sdist bdist_wheel
-#    - name: Publish distribution 📦 to TestPyPI
-#      uses: pypa/gh-action-pypi-publish@release/v1
-#      with:
-#        user: __token__
-#        password: ${{ secrets.jianguo_test_pypi_password }}
-#        repository_url: https://test.pypi.org/legacy/
-    - name: Publish distribution 📦 to PyPI
-      uses: pypa/gh-action-pypi-publish@release/v1
-      with:
-        user: __token__
-        password: ${{ secrets.jianguo_pypi_password }}
+      - uses: actions/checkout@master
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v3
+        with:
+          python-version: '3.10'
+      - name: Install pypa/setuptools
+        run: >-
+          python -m
+          pip install wheel
+          pip install readme_renderer[md]
+      - name: Build a binary wheel
+        run: >-
+          python setup.py sdist bdist_wheel
+      #    - name: Publish distribution 📦 to TestPyPI
+      #      uses: pypa/gh-action-pypi-publish@release/v1
+      #      with:
+      #        user: __token__
+      #        password: ${{ secrets.jianguo_test_pypi_password }}
+      #        repository_url: https://test.pypi.org/legacy/
+      - name: Publish distribution 📦 to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          user: __token__
+          password: ${{ secrets.jianguo_pypi_password }}
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -4,9 +4,11 @@ Thank you considering contributing to TorchCP!
 
 This document provides brief guidelines for potential contributors.
 
-Please use pull requests for new features, bug fixes, new examples, etc. If you work on something with significant efforts, please mention it in early stage using issues.
+Please use pull requests for new features, bug fixes, new examples, etc. If you work on something with significant
+efforts, please mention it in early stage using issues.
 
-We ask that you follow the `PEP8` coding style in your pull requests, [`flake8`](http://flake8.pycqa.org/) is used in continuous integration to enforce this.
+We ask that you follow the `PEP8` coding style in your pull requests, [`flake8`](http://flake8.pycqa.org/) is used in
+continuous integration to enforce this.
 
 ---
 

diff --git a/README.md b/README.md
@@ -1,11 +1,15 @@
-TorchCP is a Python toolbox for conformal prediction research on deep learning models, using PyTorch. Specifically, this toolbox has implemented some representative methods (including posthoc and training methods) for
-classification and regression tasks. We build the framework of TorchCP based on [`AdverTorch`](https://github.com/BorealisAI/advertorch/tree/master). This codebase is still under construction. Comments, issues, contributions, and collaborations are all welcomed! 
-
-
+TorchCP is a Python toolbox for conformal prediction research on deep learning models, using PyTorch. Specifically, this
+toolbox has implemented some representative methods (including posthoc and training methods) for
+classification and regression tasks. We build the framework of TorchCP based
+on [`AdverTorch`](https://github.com/BorealisAI/advertorch/tree/master). This codebase is still under construction.
+Comments, issues, contributions, and collaborations are all welcomed!
 
 # Overview
+
 TorchCP has implemented the following methods:
+
 ## Classification
+
 | Year | Title                                                                                                                                            | Venue   | Code Link                                                                         |
 |------|--------------------------------------------------------------------------------------------------------------------------------------------------|---------|-----------------------------------------------------------------------------------|
 | 2023 | [**Class-Conditional Conformal Prediction with Many Classes**](https://arxiv.org/abs/2306.09335)                                                 | NeurIPS | [Link](https://github.com/tiffanyding/class-conditional-conformal)                |
@@ -18,15 +22,15 @@ TorchCP has implemented the following methods:
 | 2013 | [**Applications of Class-Conditional Conformal Predictor in Multi-Class Classification**](https://ieeexplore.ieee.org/document/6784618)          | ICMLA   |                                                                                   |
 
 ## Regression
+
 | Year | Title                                                                                                                                          | Venue   | Code Link                                            |
 |------|------------------------------------------------------------------------------------------------------------------------------------------------|---------|------------------------------------------------------|
 | 2021 | [**Adaptive Conformal Inference Under Distribution Shift**](https://arxiv.org/abs/2106.00170)                                                  | NeurIPS | [Link](https://github.com/isgibbs/AdaptiveConformal) |
 | 2019 | [**Conformalized Quantile Regression**](https://proceedings.neurips.cc/paper_files/paper/2019/file/5103c3584b063c431bd1268e9b5e76fb-Paper.pdf) | NeurIPS | [Link](https://github.com/yromano/cqr)               |
 | 2016 | [**Distribution-Free Predictive Inference For Regression**](https://arxiv.org/abs/1604.04173)                                                  | JASA    | [Link](https://github.com/ryantibs/conformal)        |
 
-
-
 ## TODO
+
 TorchCP is still under active development. We will add the following features/items down the road:
 
 | Year | Title                                                                                                           | Venue   | Code Link                                                                  |
@@ -37,24 +41,24 @@ TorchCP is still under active development. We will add the following features/it
 | 2022 | [**Conformal Prediction Sets with Limited False Positives**](https://arxiv.org/abs/2202.07650)                  | ICML    | [Link](https://github.com/ajfisch/conformal-fp)                            |
 | 2021 | [**Optimized conformal classification using gradient descent approximation**](https://arxiv.org/abs/2105.11255) | Arxiv   |                                                                            |
 
-
-
-
-
 ## Installation
 
 TorchCP is developed with Python 3.9 and PyTorch 2.0.1. To install TorchCP, simply run
+
 ```
 pip install torchcp
 ```
+
 To install from TestPyPI server, run
+
 ```
 pip install --index-url https://test.pypi.org/simple/ --no-deps torchcp
 ```
 
 ## Examples
 
 Here, we provide a simple example for a classification task, with THR score and SplitPredictor.
+
 ```python
 from torchcp.classification.scores import THR
 from torchcp.classification.predictors import SplitPredictor
@@ -88,19 +92,21 @@ result_dict = predictor.evaluate(test_dataloader)
 print(result_dict["Coverage_rate"], result_dict["Average_size"])
 
 ```
+
 You may find more tutorials in [`examples`](https://github.com/ml-stat-Sustech/TorchCP/tree/master/examples) folder.
 
 ## Documentation
 
 The documentation webpage is on readthedocs https://torchcp.readthedocs.io/en/latest/index.html.
 
-
 ## License
+
 This project is licensed under the LGPL. The terms and conditions can be found in the LICENSE and LICENSE.GPL files.
 
 ## Citation
 
-We will release the technical report of TorchCP recently. If you find our repository useful for your research, please consider citing our paper:
+We will release the technical report of TorchCP recently. If you find our repository useful for your research, please
+consider citing our paper:
 
 ```
 @article{huang2023conformal,
@@ -110,6 +116,7 @@ We will release the technical report of TorchCP recently. If you find our reposi
   year={2023}
 }
 ```
+
 ## Contributors
 
 * [Hongxin Wei](https://hongxin001.github.io/)

diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -7,9 +7,11 @@
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
 import os
 import sys
+
 sys.path.insert(0, os.path.abspath('../../'))
 
 from unittest.mock import Mock  # noqa: F401, E402
+
 # from sphinx.ext.autodoc.importer import _MockObject as Mock
 Mock.Module = object
 sys.modules['torch'] = Mock()
@@ -49,8 +51,6 @@
 with open(os.path.join(os.path.abspath('../../'), 'torchcp/VERSION')) as f:
     version = f.read().strip()
 
-
-
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
 
@@ -78,7 +78,6 @@
 # The master toctree document.
 master_doc = 'index'
 
-
 # -- Options for HTML output -------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
 
@@ -88,7 +87,6 @@
 html_theme = 'sphinx_rtd_theme'
 html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
 
-
 # A list of files that should not be packed into the epub file.
 epub_exclude_files = ['search.html']
 

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -1,7 +1,7 @@
 .. TorchCP documentation master file, created by
-   sphinx-quickstart on Fri Dec 22 16:28:31 2023.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
+sphinx-quickstart on Fri Dec 22 16:28:31 2023.
+You can adapt this file completely to your liking, but it should at least
+contain the root `toctree` directive.
 
 Welcome to TorchCP
 ===================================

diff --git a/examples/clip/clip.py b/examples/clip/clip.py
@@ -15,15 +15,14 @@
 
 try:
     from torchvision.transforms import InterpolationMode
+
     BICUBIC = InterpolationMode.BICUBIC
 except ImportError:
     BICUBIC = Image.BICUBIC
 
-
 if packaging.version.parse(torch.__version__) < packaging.version.parse("1.7.1"):
     warnings.warn("PyTorch version 1.7.1 or higher is recommended")
 
-
 __all__ = ["available_models", "load", "tokenize"]
 _tokenizer = _Tokenizer()
 
@@ -57,7 +56,8 @@ def _download(url: str, root: str):
             warnings.warn(f"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file")
 
     with urllib.request.urlopen(url) as source, open(download_target, "wb") as output:
-        with tqdm(total=int(source.info().get("Content-Length")), ncols=80, unit='iB', unit_scale=True, unit_divisor=1024) as loop:
+        with tqdm(total=int(source.info().get("Content-Length")), ncols=80, unit='iB', unit_scale=True,
+                  unit_divisor=1024) as loop:
             while True:
                 buffer = source.read(8192)
                 if not buffer:
@@ -91,7 +91,8 @@ def available_models() -> List[str]:
     return list(_MODELS.keys())
 
 
-def load(name: str, device: Union[str, torch.device] = "cuda" if torch.cuda.is_available() else "cpu", jit: bool = False, download_root: str = None):
+def load(name: str, device: Union[str, torch.device] = "cuda" if torch.cuda.is_available() else "cpu",
+         jit: bool = False, download_root: str = None):
     """Load a CLIP model
 
     Parameters
@@ -202,7 +203,8 @@ def patch_float(module):
     return model, _transform(model.input_resolution.item())
 
 
-def tokenize(texts: Union[str, List[str]], context_length: int = 77, truncate: bool = False) -> Union[torch.IntTensor, torch.LongTensor]:
+def tokenize(texts: Union[str, List[str]], context_length: int = 77, truncate: bool = False) -> Union[
+    torch.IntTensor, torch.LongTensor]:
     """
     Returns the tokenized representation of given input string(s)
 

diff --git a/examples/clip/model.py b/examples/clip/model.py
@@ -224,7 +224,9 @@ def forward(self, x: torch.Tensor):
         x = self.conv1(x)  # shape = [*, width, grid, grid]
         x = x.reshape(x.shape[0], x.shape[1], -1)  # shape = [*, width, grid ** 2]
         x = x.permute(0, 2, 1)  # shape = [*, grid ** 2, width]
-        x = torch.cat([self.class_embedding.to(x.dtype) + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device), x], dim=1)  # shape = [*, grid ** 2 + 1, width]
+        x = torch.cat(
+            [self.class_embedding.to(x.dtype) + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device),
+             x], dim=1)  # shape = [*, grid ** 2 + 1, width]
         x = x + self.positional_embedding.to(x.dtype)
         x = self.ln_pre(x)
 
@@ -401,12 +403,14 @@ def build_model(state_dict: dict):
 
     if vit:
         vision_width = state_dict["visual.conv1.weight"].shape[0]
-        vision_layers = len([k for k in state_dict.keys() if k.startswith("visual.") and k.endswith(".attn.in_proj_weight")])
+        vision_layers = len(
+            [k for k in state_dict.keys() if k.startswith("visual.") and k.endswith(".attn.in_proj_weight")])
         vision_patch_size = state_dict["visual.conv1.weight"].shape[-1]
         grid_size = round((state_dict["visual.positional_embedding"].shape[0] - 1) ** 0.5)
         image_resolution = vision_patch_size * grid_size
     else:
-        counts: list = [len(set(k.split(".")[2] for k in state_dict if k.startswith(f"visual.layer{b}"))) for b in [1, 2, 3, 4]]
+        counts: list = [len(set(k.split(".")[2] for k in state_dict if k.startswith(f"visual.layer{b}"))) for b in
+                        [1, 2, 3, 4]]
         vision_layers = tuple(counts)
         vision_width = state_dict["visual.layer1.0.conv1.weight"].shape[0]
         output_width = round((state_dict["visual.attnpool.positional_embedding"].shape[0] - 1) ** 0.5)
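Note on the ViT branch of `build_model` above: the reformatted lines still recover the model geometry purely from tensor shapes and key names. The sketch below replays that logic on a toy dictionary. `FakeTensor`, `infer_vit_geometry`, and the toy keys and shapes are hypothetical stand-ins for illustration only; a real checkpoint holds `torch.Tensor` values.

```python
from collections import namedtuple

# Hypothetical stand-in that exposes only `.shape`, so the sketch runs without torch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def infer_vit_geometry(state_dict):
    """Mirror the shape-based inference in the ViT branch of build_model."""
    # One attention in-projection weight per transformer block in the vision tower.
    vision_layers = len(
        [k for k in state_dict if k.startswith("visual.") and k.endswith(".attn.in_proj_weight")])
    # The patch size is the spatial extent of the patch-embedding convolution kernel.
    vision_patch_size = state_dict["visual.conv1.weight"].shape[-1]
    # The positional embedding has grid_size ** 2 patch rows plus one class-token row.
    grid_size = round((state_dict["visual.positional_embedding"].shape[0] - 1) ** 0.5)
    image_resolution = vision_patch_size * grid_size
    return vision_layers, vision_patch_size, grid_size, image_resolution


# Toy shapes chosen for illustration (two blocks, 32x32 patches, 7x7 grid).
toy_state_dict = {
    "visual.conv1.weight": FakeTensor((768, 3, 32, 32)),
    "visual.positional_embedding": FakeTensor((50, 768)),
    "visual.transformer.resblocks.0.attn.in_proj_weight": FakeTensor((2304, 768)),
    "visual.transformer.resblocks.1.attn.in_proj_weight": FakeTensor((2304, 768)),
}

geometry = infer_vit_geometry(toy_state_dict)
print(geometry)  # (2, 32, 7, 224)
```

With 50 positional rows, one row belongs to the class token and the remaining 49 form a 7 x 7 grid, so 32-pixel patches imply a 224-pixel input.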

diff --git a/examples/clip/simple_tokenizer.py b/examples/clip/simple_tokenizer.py
@@ -23,13 +23,13 @@ def bytes_to_unicode():
     To avoid that, we want lookup tables between utf-8 bytes and unicode strings.
     And avoids mapping to whitespace/control characters the bpe code barfs on.
     """
-    bs = list(range(ord("!"), ord("~")+1))+list(range(ord("¡"), ord("¬")+1))+list(range(ord("®"), ord("ÿ")+1))
+    bs = list(range(ord("!"), ord("~") + 1)) + list(range(ord("¡"), ord("¬") + 1)) + list(range(ord("®"), ord("ÿ") + 1))
     cs = bs[:]
     n = 0
-    for b in range(2**8):
+    for b in range(2 ** 8):
         if b not in bs:
             bs.append(b)
-            cs.append(2**8+n)
+            cs.append(2 ** 8 + n)
             n += 1
     cs = [chr(n) for n in cs]
     return dict(zip(bs, cs))
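Note on `bytes_to_unicode` above: the whitespace-only change leaves its behavior intact. A minimal sketch of what the table provides is given below. The function body is copied from the hunk, while `encode_bytes` and `decode_bytes` are hypothetical helpers added only to show the round trip.

```python
def bytes_to_unicode():
    """Map each byte value 0..255 to a distinct printable unicode character."""
    # Bytes that are already printable keep their own code point.
    bs = list(range(ord("!"), ord("~") + 1)) + list(range(ord("¡"), ord("¬") + 1)) + list(range(ord("®"), ord("ÿ") + 1))
    cs = bs[:]
    n = 0
    for b in range(2 ** 8):
        if b not in bs:
            # Whitespace and control bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(2 ** 8 + n)
            n += 1
    cs = [chr(n) for n in cs]
    return dict(zip(bs, cs))


byte_encoder = bytes_to_unicode()
byte_decoder = {v: k for k, v in byte_encoder.items()}


def encode_bytes(text):
    """Hypothetical helper: UTF-8 bytes of `text` rendered through the lookup table."""
    return "".join(byte_encoder[b] for b in text.encode("utf-8"))


def decode_bytes(mapped):
    """Hypothetical helper: inverse of encode_bytes."""
    return bytes(byte_decoder[c] for c in mapped).decode("utf-8")


sample = "héllo wörld"
mapped = encode_bytes(sample)
print(len(byte_encoder), mapped, decode_bytes(mapped) == sample)
```

Because the table is a bijection over all 256 byte values, any UTF-8 string can be mapped to printable characters and recovered exactly, which is what lets the BPE code avoid raw whitespace and control characters.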
@@ -64,30 +64,32 @@ def __init__(self, bpe_path: str = default_bpe()):
         self.byte_encoder = bytes_to_unicode()
         self.byte_decoder = {v: k for k, v in self.byte_encoder.items()}
         merges = gzip.open(bpe_path).read().decode("utf-8").split('\n')
-        merges = merges[1:49152-256-2+1]
+        merges = merges[1:49152 - 256 - 2 + 1]
         merges = [tuple(merge.split()) for merge in merges]
         vocab = list(bytes_to_unicode().values())
-        vocab = vocab + [v+'</w>' for v in vocab]
+        vocab = vocab + [v + '</w>' for v in vocab]
         for merge in merges:
             vocab.append(''.join(merge))
         vocab.extend(['<|startoftext|>', '<|endoftext|>'])
         self.encoder = dict(zip(vocab, range(len(vocab))))
         self.decoder = {v: k for k, v in self.encoder.items()}
         self.bpe_ranks = dict(zip(merges, range(len(merges))))
         self.cache = {'<|startoftext|>': '<|startoftext|>', '<|endoftext|>': '<|endoftext|>'}
-        self.pat = re.compile(r"""<\|startoftext\|>|<\|endoftext\|>|'s|'t|'re|'ve|'m|'ll|'d|[\p{L}]+|[\p{N}]|[^\s\p{L}\p{N}]+""", re.IGNORECASE)
+        self.pat = re.compile(
+            r"""<\|startoftext\|>|<\|endoftext\|>|'s|'t|'re|'ve|'m|'ll|'d|[\p{L}]+|[\p{N}]|[^\s\p{L}\p{N}]+""",
+            re.IGNORECASE)
 
     def bpe(self, token):
         if token in self.cache:
             return self.cache[token]
-        word = tuple(token[:-1]) + ( token[-1] + '</w>',)
+        word = tuple(token[:-1]) + (token[-1] + '</w>',)
         pairs = get_pairs(word)
 
         if not pairs:
-            return token+'</w>'
+            return token + '</w>'
 
         while True:
-            bigram = min(pairs, key = lambda pair: self.bpe_ranks.get(pair, float('inf')))
+            bigram = min(pairs, key=lambda pair: self.bpe_ranks.get(pair, float('inf')))
             if bigram not in self.bpe_ranks:
                 break
             first, second = bigram
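Note on the slice `merges[1:49152 - 256 - 2 + 1]` in the hunk above: the arithmetic determines the final vocabulary size. The sketch below works it through, assuming the BPE file contains at least that many merge lines after its first line. The variable names are illustrative and do not appear in the tokenizer.

```python
# Number of merge rules kept by merges[1:49152 - 256 - 2 + 1].
slice_start = 1
slice_stop = 49152 - 256 - 2 + 1
num_merges = slice_stop - slice_start  # 48894, assuming the file is long enough

# Vocabulary as assembled in __init__:
num_byte_symbols = 256        # values of bytes_to_unicode()
num_end_of_word_symbols = 256  # the same symbols with '</w>' appended
num_special_tokens = 2        # '<|startoftext|>' and '<|endoftext|>'

vocab_size = num_byte_symbols + num_end_of_word_symbols + num_merges + num_special_tokens
print(num_merges, vocab_size)  # 48894 49408
```

The constant 49152 therefore counts the byte symbols once; adding the 256 `'</w>'` variants gives the total of 49408 entries in `self.encoder`.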
@@ -102,8 +104,8 @@ def bpe(self, token):
                     new_word.extend(word[i:])
                     break
 
-                if word[i] == first and i < len(word)-1 and word[i+1] == second:
-                    new_word.append(first+second)
+                if word[i] == first and i < len(word) - 1 and word[i + 1] == second:
+                    new_word.append(first + second)
                     i += 2
                 else:
                     new_word.append(word[i])