Add IBM Cloud cluster resource fetcher (#54)

ComplianceAsCode · May 28, 2021 · 07e20cd
1 parent ee72cf0
commit 07e20cd

Showing 7 changed files with 248 additions and 4 deletions.
diff --git a/CHANGES.md b/CHANGES.md
@@ -1,3 +1,7 @@
+# [0.16.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.16.0)
+
+- [ADDED] IBM Cloud cluster resources fetcher added.
+
 # [0.15.0](https://github.com/ComplianceAsCode/auditree-arboretum/releases/tag/v0.15.0)
 
 - [ADDED] Github org permissions check added to `permissions`.

diff --git a/arboretum/__init__.py b/arboretum/__init__.py
@@ -14,4 +14,4 @@
 # limitations under the License.
 """Arboretum - Checking your compliance & security posture, continuously."""
 
-__version__ = '0.15.0'
+__version__ = '0.16.0'
diff --git a/arboretum/ibm_cloud/README.md b/arboretum/ibm_cloud/README.md
@@ -62,6 +62,85 @@ to include the fetchers and checks from this library in your downstream project.
    from arboretum.ibm_cloud.fetchers.fetch_cluster_list import ClusterListFetcher
    ```
 
+### Cluster Resource
+
+* Class: [ICClusterResourceFetcher][fetch-ibm-cloud-cluster-resource]
+* Purpose: Write the resources of **managed** Kubernetes clusters to the evidence locker.
+* Behavior: Retrieve managed Kubernetes cluster resource data based on clusters gathered by the [IBM Cloud cluster list fetcher][fetch-cluster-list].  TTL is set to 1 day.
+* NOTE:
+   * Do not use this fetcher for stand-alone clusters. For stand-alone Kubernetes clusters, use the [Kubernetes cluster resource fetcher][fetch-kube-cluster-resource].
+   * This fetcher depends on evidence gathered by the [IBM Cloud cluster list fetcher][fetch-cluster-list],
+     so the IBM Cloud cluster list fetcher must also be imported for this fetcher to work.
+
+* Configuration elements:
+  * `org.ibm_cloud.accounts`
+    * Required
+    * List of accounts as strings
+    * Each account is an arbitrary name describing the IBM Cloud account. It is used to match the API key provided in the
+      credentials file so that the fetcher can retrieve content from IBM Cloud for that account.
+  * `org.ibm_cloud.cluster_resources.types`
+    * Optional
+    * List of resource types as strings
+      * NOTE: For core group API resources, the resource name must be in
+        _plural form_ (e.g., `secrets`).
+      * NOTE: For other named group resources including custom API
+        resources, the resource name must be in the following format:
+        `APIGROUP/VERSION/NAME`. You can compose this by first executing
+        `kubectl api-resources` and `kubectl api-versions` and then combining
+        the results into your resource name.  Using `cronjobs` as an example:
+
+        ```sh
+        $ kubectl api-resources -o name | fgrep cronjobs
+        cronjobs.batch
+        $ kubectl api-versions | grep batch
+        batch/v1
+        batch/v1beta1
+        ```
+
+        In this example `batch/v1` is the more stable version, so the
+        `APIGROUP/VERSION/NAME` resource name is `batch/v1/cronjobs`.
+* Expected configuration:
+
+  ```json
+  {
+    "org": {
+      "ibm_cloud": {
+        "accounts": [
+          "myaccount1", "myaccount2"
+        ],
+        "cluster_resources": {
+          "types": [
+            "secrets", "batch/v1/cronjobs", "apigroup.example.com/v1/mycustom"
+          ]
+        }
+      }
+    }
+  }
+  ```
+
+* Required credentials:
+  * `ibm_cloud` credentials with read/view permissions are needed for this fetcher to successfully retrieve the evidence.
+    * `XXX_api_key`: API key string for account `XXX`.
+    * Example credential file entry:
+
+      ```ini
+      [ibm_cloud]
+      myaccount1_api_key=your-ibm-cloud-api-key-for-myaccount1
+      myaccount2_api_key=your-ibm-cloud-api-key-for-myaccount2
+      ```
+
+    * NOTE: API keys can be generated using the [IBM Cloud CLI][ic-api-key-create] or the [IBM Cloud Console][ibm-cloud-gen-api-console]. For example, to create an API key with the IBM Cloud CLI:
+
+      ```sh
+      ibmcloud iam api-key-create your-iks-api-key-for-acct-x
+      ```
+
+* Import statement:
+
+   ```python
+   from arboretum.ibm_cloud.fetchers.fetch_cluster_resource import ICClusterResourceFetcher
+   ```
+
 ## Checks
 
 Checks coming soon...
@@ -72,4 +151,6 @@ Checks coming soon...
 [ic-api-key-create]: https://cloud.ibm.com/docs/cli/reference/ibmcloud?topic=cloud-cli-ibmcloud_commands_iam#ibmcloud_iam_api_key_create
 [fetch-cluster-list]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/ibm_cloud/fetchers/fetch_cluster_list.py
 [ibm-cloud-api]: https://containers.cloud.ibm.com/
-[ibm-cloud-gen-api-console]: https://cloud.ibm.com/docs/account?topic=account-userapikey#create_user_key
+[ibm-cloud-gen-api-console]: https://cloud.ibm.com/docs/account?topic=account-userapikey#create_user_key
+[fetch-ibm-cloud-cluster-resource]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/ibm_cloud/fetchers/fetch_cluster_resource.py
+[fetch-kube-cluster-resource]: https://github.com/ComplianceAsCode/auditree-arboretum/tree/main/arboretum/kubernetes#cluster-resource
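
The naming rules for `org.ibm_cloud.cluster_resources.types` in the README hunk above mirror the Kubernetes REST layout: core group resources are served under `/api/v1`, and named group resources under `/apis/APIGROUP/VERSION`. The following is a minimal sketch of that mapping. The helper `resource_api_path` is hypothetical and not part of arboretum; `get_cluster_resources` may build its request paths differently.

```python
def resource_api_path(resource_type):
    """Map a configured resource type to a cluster-wide Kubernetes API path.

    Hypothetical helper for illustration only; it is not arboretum code.
    """
    parts = resource_type.split('/')
    # Core group resources are configured as a bare plural name, e.g. "secrets".
    if len(parts) == 1 and parts[0]:
        return f'/api/v1/{parts[0]}'
    # Named group resources are configured as APIGROUP/VERSION/NAME.
    if len(parts) == 3 and all(parts):
        api_group, version, name = parts
        return f'/apis/{api_group}/{version}/{name}'
    raise ValueError(
        f'expected NAME or APIGROUP/VERSION/NAME, got {resource_type!r}'
    )


if __name__ == '__main__':
    for rtype in ('secrets', 'batch/v1/cronjobs'):
        print(rtype, '->', resource_api_path(rtype))
```

Rejecting anything that is neither one nor three segments surfaces a misconfigured type (for example `batch/cronjobs`) early instead of as a 404 from the cluster.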
diff --git a/arboretum/ibm_cloud/fetchers/fetch_cluster_resource.py b/arboretum/ibm_cloud/fetchers/fetch_cluster_resource.py
@@ -0,0 +1,152 @@
+# -*- mode:python; coding:utf-8 -*-
+# Copyright (c) 2021 IBM Corp. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""IBM Cloud cluster resource fetcher."""
+
+import io
+import json
+import pathlib
+import tempfile
+import zipfile
+
+from arboretum.common.iam_ibm_utils import get_tokens
+from arboretum.common.kube_constants import RESOURCE_TYPES_DEFAULT
+from arboretum.common.kube_utils import get_cluster_resources
+
+from compliance.evidence import (
+    DAY, RawEvidence, get_evidence_dependency, store_raw_evidence
+)
+from compliance.fetch import ComplianceFetcher
+
+import yaml
+
+
+class ICClusterResourceFetcher(ComplianceFetcher):
+    """Fetch resources of IBM Cloud Kubernetes clusters."""
+
+    @classmethod
+    def setUpClass(cls):
+        """Initialize the fetcher object with configuration settings."""
+        cls.config.add_evidences(
+            [
+                RawEvidence(
+                    'cluster_resources.json',
+                    'ibm_cloud',
+                    DAY,
+                    'IBM Cloud Kubernetes cluster resources'
+                )
+            ]
+        )
+        cls.resource_types = cls.config.get(
+            'org.ibm_cloud.cluster_resources.types', RESOURCE_TYPES_DEFAULT
+        )
+        cls.tempdir = tempfile.TemporaryDirectory()
+        return cls
+
+    @classmethod
+    def tearDownClass(cls):
+        """Cleanup class."""
+        cls.tempdir.cleanup()
+
+    @store_raw_evidence('ibm_cloud/cluster_resources.json')
+    def fetch_cluster_resource(self):
+        """Fetch cluster resources."""
+        cluster_list_evidence = get_evidence_dependency(
+            'raw/ibm_cloud/cluster_list.json', self.locker
+        )
+        cluster_list = cluster_list_evidence.content_as_json
+        resources = {}
+        for account in cluster_list:
+            api_key = self.config.creds.get('ibm_cloud', f'{account}_api_key')
+            access_token, refresh_token = get_tokens(api_key)
+            headers = {
+                'Accept': 'application/json',
+                'Authorization': f'Bearer {access_token}',
+                'X-Auth-Refresh-Token': refresh_token
+            }
+            self.session('https://containers.cloud.ibm.com', **headers)
+
+            resources[account] = []
+            for cluster in cluster_list[account]:
+                self.session('https://containers.cloud.ibm.com', **headers)
+                config_url = f'/global/v1/clusters/{cluster["id"]}/config'
+                resp = self.session().get(config_url)
+                resp.raise_for_status()
+                cluster_config = zipfile.ZipFile(io.BytesIO(resp.content))
+                if cluster['type'] == 'kubernetes':
+                    cluster_token, ca_cert = self._get_iks_credentials(
+                        cluster_config
+                    )
+                elif cluster['type'] == 'openshift':
+                    cluster_token = self._get_roks_credentials(
+                        cluster, api_key
+                    )
+                    ca_cert = None
+                self.session(cluster['serverURL'], **headers)
+                cluster['resources'] = get_cluster_resources(
+                    self.session(),
+                    cluster_token,
+                    self.resource_types,
+                    ca_cert
+                )
+                resources[account].append(cluster)
+
+        return json.dumps(resources)
+
+    def _get_iks_credentials(self, cluster_config):
+        """Get credentials for an IKS cluster.
+
+        This function implements the procedure described in
+        https://cloud.ibm.com/apidocs/kubernetes#getclusterconfig
+        """
+        for name in cluster_config.namelist():
+            p = pathlib.PurePath(name)
+            if p.name.startswith('kube-config'):
+                kubeconfig = yaml.safe_load(cluster_config.read(name))
+                usr = kubeconfig['users'][0]['user']
+                cluster_token = usr['auth-provider']['config']['id-token']
+            if p.suffix == '.pem':
+                cluster_config.extract(name, path=self.tempdir.name)
+                ca_cert_filepath = str(
+                    pathlib.PurePath(self.tempdir.name, name)
+                )
+        return cluster_token, ca_cert_filepath
+
+    def _get_roks_credentials(self, cluster, api_key):
+        """Get credentials for a ROKS cluster.
+
+        This function implements the procedure described in
+        https://cloud.ibm.com/docs/openshift?topic=openshift-access_cluster#access_automation
+        """
+        s = self.session(cluster['serverURL'])
+        oauth_path = '/.well-known/oauth-authorization-server'
+        resp = s.get(oauth_path)
+        resp.raise_for_status()
+        token_endpoint = resp.json()['token_endpoint']
+        oauth_server = token_endpoint.split('/')[2]
+        s = self.session(f'https://{oauth_server}')
+        token_path = (
+            '/oauth/authorize?client_id='
+            'openshift-challenging-client&response_type=token'
+        )
+        resp = s.get(
+            token_path,
+            auth=('apikey', api_key),
+            headers={'X-CSRF-Token': 'a'},
+            allow_redirects=False
+        )
+        location = resp.headers['Location']
+        cluster_token = location.split('access_token=', 1)[1].split('&', 1)[0]
+
+        return cluster_token
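
`_get_roks_credentials` above takes the cluster token from the `Location` header of the OAuth redirect by splitting on `access_token=`. The same extraction can be sketched with `urllib.parse`, assuming the token arrives in the URL fragment or, failing that, in the query string. The helper name `extract_access_token` and the example URL are hypothetical and not part of the commit.

```python
from urllib.parse import parse_qs, urlsplit


def extract_access_token(location):
    """Return the access_token parameter carried by a redirect URL.

    Hypothetical helper for illustration only; it is not arboretum code.
    """
    parts = urlsplit(location)
    # Implicit-grant redirects usually carry the token in the fragment;
    # fall back to the query string in case a server puts it there.
    for component in (parts.fragment, parts.query):
        tokens = parse_qs(component).get('access_token')
        if tokens:
            return tokens[0]
    raise ValueError('redirect location carries no access_token')


if __name__ == '__main__':
    # Made-up redirect target, used only to exercise the helper.
    location = (
        'https://oauth.example.com/oauth/token/implicit'
        '#access_token=abc123&expires_in=86400&token_type=Bearer'
    )
    print(extract_access_token(location))
```

Unlike the plain string split, `parse_qs` percent-decodes values and raises a clear error path when the parameter is missing, so a failed login shows up as a `ValueError` rather than an `IndexError`.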
diff --git a/arboretum/kubernetes/README.md b/arboretum/kubernetes/README.md
@@ -21,7 +21,9 @@ to include the fetchers and checks from this library in your downstream project.
 * Class: [ClusterResourceFetcher][fetch-cluster-resource]
 * Purpose: Write the resources of **stand-alone** Kubernetes clusters to the
 evidence locker.  **NOTE:** Do not use this fetcher for managed clusters.
-Instead use the [IBM Cloud cluster list fetcher][ibm-cloud-cluster-list-fetcher].
+For IBM Cloud clusters, use the
+[IBM Cloud cluster list fetcher][ibm-cloud-cluster-list-fetcher] and the
+[IBM Cloud cluster resource fetcher][ibm-cloud-cluster-resource-fetcher].
 * Behavior: Retrieve stand-alone Kubernetes cluster resource data for the provided
 list of clusters.  TTL is set to 1 day.
 * Configuration elements:
@@ -119,4 +121,5 @@ Checks coming soon...
 [usage]: https://github.com/ComplianceAsCode/auditree-arboretum#usage
 [fetch-cluster-resource]: https://github.com/ComplianceAsCode/auditree-arboretum/blob/main/arboretum/kubernetes/fetchers/fetch_cluster_resource.py
 [ibm-cloud-cluster-list-fetcher]: https://github.com/ComplianceAsCode/auditree-arboretum/tree/main/arboretum/ibm_cloud#cluster-list
+[ibm-cloud-cluster-resource-fetcher]: https://github.com/ComplianceAsCode/auditree-arboretum/tree/main/arboretum/ibm_cloud#cluster-resource
 [kube-rbac-docs]: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
diff --git a/devel.json b/devel.json
@@ -86,7 +86,10 @@
       ]
     },
     "ibm_cloud": {
-      "accounts": ["my_ic_account_one", "my_ic_account_two"]
+      "accounts": ["my_ic_account_one", "my_ic_account_two"],
+      "cluster_resources": {
+        "types": ["pods"]
+      }
     },
     "kubernetes": {
       "cluster_resources": {

diff --git a/setup.cfg b/setup.cfg
@@ -23,6 +23,7 @@ packages = find:
 install_requires =
     auditree-framework>=1.2.3
     auditree-harvest>=1.0.0
+    pyyaml<5.4
 
 [options.packages.find]
 exclude =