Fetch PDB from PDB code #6

Open
universvm opened this issue Feb 26, 2020 · 2 comments

Comments

@universvm
Contributor

Hi there!

I'm currently using ampal in a project, and I need to download several PDB files to handle with it. I looked at the load_pdb function in pdb_parser and was thinking about adding an optional argument, e.g. fetch_pdb=True, with a condition right before PdbParser is called that downloads the PDB file into a specific folder and passes its path to PdbParser.

The fetching logic could be something like:


import urllib.request
from pathlib import Path

PDB_REQUEST_URL = 'https://files.rcsb.org/download/'
PROTEIN_DATA_FOLDER = Path('protein_data')

def download_pdb(pdb_id, output_folder):
    """
    Downloads the .pdb1.gz file for pdb_id into output_folder.
    """
    output_folder = Path(output_folder)
    # Create the target folder if it does not exist yet.
    output_folder.mkdir(parents=True, exist_ok=True)
    file_name = f'{pdb_id}.pdb1.gz'
    # Save into the folder passed as an argument, not the module-level constant.
    urllib.request.urlretrieve(PDB_REQUEST_URL + file_name, filename=output_folder / file_name)

download_pdb('3qy1', PROTEIN_DATA_FOLDER)

Let me know what you think :)

@ChrisWellsWood
Contributor

I'm in favour of this, but I think it should be a separate function alongside load_pdb called fetch_pdb, which mirrors the command names in PyMOL. I'd prefer PDBe as the endpoint rather than RCSB, mainly because we can fetch biological units from there.
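
For illustration, a standalone fetch_pdb along those lines might look like the sketch below. This is only a rough outline under a few assumptions, not the implementation that ended up in ampal: the PDBe URL template is my best guess and should be checked against the PDBe documentation, and the url_template and output_folder parameters are hypothetical names introduced here.

```python
import urllib.request
from pathlib import Path

# Assumption: URL pattern for PDBe entry files. Verify against the PDBe
# documentation before relying on it.
PDBE_URL_TEMPLATE = 'https://www.ebi.ac.uk/pdbe/entry-files/download/pdb{code}.ent'


def fetch_pdb(pdb_code, output_folder='.', url_template=PDBE_URL_TEMPLATE):
    """Download the file for pdb_code into output_folder and return its path.

    url_template is a hypothetical parameter; it must contain a '{code}'
    placeholder that is filled with the lower-cased PDB code.
    """
    code = pdb_code.lower()
    # PDB codes are four alphanumeric characters, e.g. '3qy1'.
    if len(code) != 4 or not code.isalnum():
        raise ValueError(f'Not a valid 4-character PDB code: {pdb_code!r}')
    output_folder = Path(output_folder)
    output_folder.mkdir(parents=True, exist_ok=True)
    output_path = output_folder / f'{code}.pdb'
    # Skip the request if the file was already downloaded earlier.
    if not output_path.exists():
        urllib.request.urlretrieve(url_template.format(code=code), filename=output_path)
    return output_path
```

The returned path could then be handed to load_pdb, e.g. load_pdb(str(fetch_pdb('3qy1', 'protein_data'))), which keeps downloading and parsing as two separate steps.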

@universvm
Contributor Author

I'll open a pull request in the next couple of days. I'll add a test case as well. While I'm at it, should I also fix the incorrect documentation in the tests or should I open another issue for that?

    def test_parse_1ek9(self):
        """Check that **3qy1** has been parsed correctly."""
        test_file_path = str(TEST_FILE_FOLDER / '1ek9.pdb')
        self.check_ampal_contents(test_file_path)

    def test_parse_2ht0(self):
        """Check that **3qy1** has been parsed correctly.""" 
        test_file_path = str(TEST_FILE_FOLDER / '2ht0.pdb')
        self.check_ampal_contents(test_file_path)

I.e. the docstring of test_parse_2ht0 should refer to 2ht0 rather than 3qy1 (and likewise 1ek9 for test_parse_1ek9).
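
As for the test case for the download itself, one possible shape is sketched below. It patches urllib.request.urlretrieve with unittest.mock so the test never touches the network. The download_pdb function is repeated here (with a return value added) only to keep the example self-contained, and the DownloadPdbTestCase name is hypothetical.

```python
import tempfile
import unittest
import urllib.request
from pathlib import Path
from unittest import mock

PDB_REQUEST_URL = 'https://files.rcsb.org/download/'


def download_pdb(pdb_id, output_folder):
    """Downloads the .pdb1.gz file for pdb_id into output_folder."""
    output_folder = Path(output_folder)
    output_folder.mkdir(parents=True, exist_ok=True)
    output_path = output_folder / f'{pdb_id}.pdb1.gz'
    urllib.request.urlretrieve(PDB_REQUEST_URL + f'{pdb_id}.pdb1.gz', filename=output_path)
    return output_path


class DownloadPdbTestCase(unittest.TestCase):
    """Hypothetical test case; no network access is needed."""

    def test_download_3qy1(self):
        """Check that 3qy1 is requested from the expected URL and saved in the given folder."""
        with tempfile.TemporaryDirectory() as tmp:
            # Replace the real download with a mock that records its arguments.
            with mock.patch('urllib.request.urlretrieve') as fake_retrieve:
                path = download_pdb('3qy1', tmp)
            fake_retrieve.assert_called_once_with(
                PDB_REQUEST_URL + '3qy1.pdb1.gz',
                filename=Path(tmp) / '3qy1.pdb1.gz',
            )
            self.assertEqual(path, Path(tmp) / '3qy1.pdb1.gz')
```

Mocking the request keeps the test fast and independent of the remote server; a separate, optional integration test could still hit the real endpoint.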
