Skip to content

Commit

Permalink
Merge pull request #93 from ksenyako/release/ccl_2021.10
Browse files Browse the repository at this point in the history
Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.10
  • Loading branch information
srirajpaul authored Jul 19, 2023
2 parents 0db8329 + 3f9d767 commit b4c31ba
Show file tree
Hide file tree
Showing 204 changed files with 11,962 additions and 3,130 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -302,7 +302,7 @@ file(GLOB spv_kernels "${PROJECT_SOURCE_DIR}/src/kernels/kernels.spv")
endif()

set(CCL_MAJOR_VERSION "2021")
set(CCL_MINOR_VERSION "9")
set(CCL_MINOR_VERSION "10")
set(CCL_UPDATE_VERSION "0")
set(CCL_PRODUCT_STATUS "Gold")
string(TIMESTAMP CCL_PRODUCT_BUILD_DATE "%Y-%m-%dT %H:%M:%SZ")
Expand Down
2 changes: 1 addition & 1 deletion INSTALL.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
- Ubuntu* 18
- GNU*: C, C++ 4.8.5 or higher.

Refer to [System Requirements](https://software.intel.com/content/www/us/en/develop/articles/oneapi-collective-communication-library-system-requirements.html) for more details.
Refer to [System Requirements](https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-collective-communication-library-system-requirements.html) for more details.

### SYCL support <!-- omit in toc -->
Intel(R) oneAPI DPC++/C++ Compiler with L0 v1.0 support
Expand Down
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# oneAPI Collective Communications Library (oneCCL) <!-- omit in toc --> <img align="right" width="100" height="100" src="https://spec.oneapi.io/oneapi-logo-white-scaled.jpg">

[Installation](#installation)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Usage](#usage)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Release Notes](https://software.intel.com/content/www/us/en/develop/articles/oneapi-collective-communication-library-ccl-release-notes.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Documentation](https://oneapi-src.github.io/oneCCL/)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[How to Contribute](CONTRIBUTING.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[License](LICENSE)
[Installation](#installation)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Usage](#usage)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Release Notes](https://www.intel.com/content/www/us/en/developer/articles/release-notes/oneapi-collective-communication-library-ccl-release-notes.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Documentation](https://oneapi-src.github.io/oneCCL/)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[How to Contribute](CONTRIBUTING.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[License](LICENSE)

oneAPI Collective Communications Library (oneCCL) provides an efficient implementation of communication patterns used in deep learning.

Expand Down Expand Up @@ -30,7 +30,7 @@ oneCCL is part of [oneAPI](https://oneapi.io).
- Ubuntu* 18
- GNU*: C, C++ 4.8.5 or higher.

Refer to [System Requirements](https://software.intel.com/content/www/us/en/develop/articles/oneapi-collective-communication-library-system-requirements.html) for more details.
Refer to [System Requirements](https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-collective-communication-library-system-requirements.html) for more details.

### SYCL support <!-- omit in toc -->
Intel(R) oneAPI DPC++/C++ Compiler with Level Zero v1.0 support.
Expand Down
59 changes: 42 additions & 17 deletions deps/hwloc/include/hwloc.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/*
* Copyright © 2009 CNRS
* Copyright © 2009-2021 Inria. All rights reserved.
* Copyright © 2009-2022 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2020 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
Expand Down Expand Up @@ -29,7 +29,7 @@
* THAT IS IN THE PDF/HTML THAT IS ***NOT*** IN hwloc.h!
*
* There are entire paragraph-length descriptions, discussions, and
* pretty prictures to explain subtle corner cases, provide concrete
* pretty pictures to explain subtle corner cases, provide concrete
* examples, etc.
*
* Please, go read the documentation. :-)
Expand Down Expand Up @@ -93,7 +93,7 @@ extern "C" {
* Two stable releases of the same series usually have the same ::HWLOC_API_VERSION
* even if their HWLOC_VERSION are different.
*/
#define HWLOC_API_VERSION 0x00020500
#define HWLOC_API_VERSION 0x00020800

/** \brief Indicate at runtime which hwloc API version was used at build time.
*
Expand Down Expand Up @@ -517,7 +517,7 @@ struct hwloc_obj {
* objects).
*
* If the ::HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED configuration flag is set,
* some of these CPUs may not be allowed for binding,
* some of these CPUs may be online but not allowed for binding,
* see hwloc_topology_get_allowed_cpuset().
*
* \note All objects have non-NULL CPU and node sets except Misc and I/O objects.
Expand Down Expand Up @@ -549,7 +549,7 @@ struct hwloc_obj {
* nodes more precisely.
*
* If the ::HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED configuration flag is set,
* some of these nodes may not be allowed for allocation,
* some of these nodes may be online but not allowed for allocation,
* see hwloc_topology_get_allowed_nodeset().
*
* If there are no NUMA nodes in the machine, all the memory is close to this
Expand Down Expand Up @@ -642,7 +642,7 @@ union hwloc_obj_attr_u {
unsigned char revision;
float linkspeed; /* in GB/s */
} pcidev;
/** \brief Bridge specific Object Attribues */
/** \brief Bridge specific Object Attributes */
struct hwloc_bridge_attr_s {
union {
struct hwloc_pcidev_attr_s pci;
Expand Down Expand Up @@ -971,7 +971,7 @@ HWLOC_DECLSPEC const char * hwloc_obj_type_string (hwloc_obj_type_t type) __hwlo
*
* If \p size is 0, \p string may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
HWLOC_DECLSPEC int hwloc_obj_type_snprintf(char * __hwloc_restrict string, size_t size,
Expand All @@ -986,7 +986,7 @@ HWLOC_DECLSPEC int hwloc_obj_type_snprintf(char * __hwloc_restrict string, size_
*
* If \p size is 0, \p string may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
HWLOC_DECLSPEC int hwloc_obj_attr_snprintf(char * __hwloc_restrict string, size_t size,
Expand Down Expand Up @@ -1089,7 +1089,7 @@ HWLOC_DECLSPEC int hwloc_obj_add_info(hwloc_obj_t obj, const char *name, const c
*
* Some operating systems only support binding threads or processes to a single PU.
* Others allow binding to larger sets such as entire Cores or Packages or
* even random sets of invididual PUs. In such operating system, the scheduler
* even random sets of individual PUs. In such operating system, the scheduler
* is free to run the task on one of these PU, then migrate it to another PU, etc.
* It is often useful to call hwloc_bitmap_singlify() on the target CPU set before
* passing it to the binding function to avoid these expensive migrations.
Expand Down Expand Up @@ -1167,7 +1167,7 @@ typedef enum {
* CPUs are idle, operating systems may execute the thread/process
* on those other CPUs instead of the designated CPUs, to let them
* progress anyway. Strict binding means that the thread/process
* will _never_ execute on other cpus than the designated CPUs, even
* will _never_ execute on other CPUs than the designated CPUs, even
* when those are busy with other tasks and other CPUs are idle.
*
* \note Depending on the operating system, strict binding may not
Expand Down Expand Up @@ -1204,7 +1204,7 @@ typedef enum {
HWLOC_CPUBIND_NOMEMBIND = (1<<3)
} hwloc_cpubind_flags_t;

/** \brief Bind current process or thread on cpus given in physical bitmap \p set.
/** \brief Bind current process or thread on CPUs given in physical bitmap \p set.
*
* \return -1 with errno set to ENOSYS if the action is not supported
* \return -1 with errno set to EXDEV if the binding cannot be enforced
Expand All @@ -1219,7 +1219,7 @@ HWLOC_DECLSPEC int hwloc_set_cpubind(hwloc_topology_t topology, hwloc_const_cpus
*/
HWLOC_DECLSPEC int hwloc_get_cpubind(hwloc_topology_t topology, hwloc_cpuset_t set, int flags);

/** \brief Bind a process \p pid on cpus given in physical bitmap \p set.
/** \brief Bind a process \p pid on CPUs given in physical bitmap \p set.
*
* \note \p hwloc_pid_t is \p pid_t on Unix platforms,
* and \p HANDLE on native Windows platforms.
Expand Down Expand Up @@ -1250,7 +1250,7 @@ HWLOC_DECLSPEC int hwloc_set_proc_cpubind(hwloc_topology_t topology, hwloc_pid_t
HWLOC_DECLSPEC int hwloc_get_proc_cpubind(hwloc_topology_t topology, hwloc_pid_t pid, hwloc_cpuset_t set, int flags);

#ifdef hwloc_thread_t
/** \brief Bind a thread \p thread on cpus given in physical bitmap \p set.
/** \brief Bind a thread \p thread on CPUs given in physical bitmap \p set.
*
* \note \p hwloc_thread_t is \p pthread_t on Unix platforms,
* and \p HANDLE on native Windows platforms.
Expand Down Expand Up @@ -1914,8 +1914,9 @@ HWLOC_DECLSPEC int hwloc_topology_set_components(hwloc_topology_t __hwloc_restri
enum hwloc_topology_flags_e {
/** \brief Detect the whole system, ignore reservations, include disallowed objects.
*
* Gather all resources, even if some were disabled by the administrator.
* Gather all online resources, even if some were disabled by the administrator.
* For instance, ignore Linux Cgroup/Cpusets and gather all processors and memory nodes.
* However offline PUs and NUMA nodes are still ignored.
*
* When this flag is not set, PUs and NUMA nodes that are disallowed are not added to the topology.
* Parent objects (package, core, cache, etc.) are added only if some of their children are allowed.
Expand Down Expand Up @@ -2059,24 +2060,48 @@ enum hwloc_topology_flags_e {
* not change to due thread binding changes on Windows
* (see ::HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING).
*/
HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING = (1UL<<6)
HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING = (1UL<<6),

/** \brief Ignore distances.
*
* Ignore distance information from the operating systems (and from XML)
* and hence do not use distances for grouping.
*/
HWLOC_TOPOLOGY_FLAG_NO_DISTANCES = (1UL<<7),

/** \brief Ignore memory attributes.
*
* Ignore memory attribues from the operating systems (and from XML).
*/
HWLOC_TOPOLOGY_FLAG_NO_MEMATTRS = (1UL<<8),

/** \brief Ignore CPU Kinds.
*
* Ignore CPU kind information from the operating systems (and from XML).
*/
HWLOC_TOPOLOGY_FLAG_NO_CPUKINDS = (1UL<<9)
};

/** \brief Set OR'ed flags to non-yet-loaded topology.
*
* Set a OR'ed set of ::hwloc_topology_flags_e onto a topology that was not yet loaded.
*
* If this function is called multiple times, the last invokation will erase
* If this function is called multiple times, the last invocation will erase
* and replace the set of flags that was previously set.
*
* The flags set in a topology may be retrieved with hwloc_topology_get_flags()
* By default, no flags are set (\c 0).
*
* The flags set in a topology may be retrieved with hwloc_topology_get_flags().
*/
HWLOC_DECLSPEC int hwloc_topology_set_flags (hwloc_topology_t topology, unsigned long flags);

/** \brief Get OR'ed flags of a topology.
*
* Get the OR'ed set of ::hwloc_topology_flags_e of a topology.
*
* If hwloc_topology_set_flags() was not called earlier,
* no flags are set (\c 0 is returned).
*
* \return the flags previously set with hwloc_topology_set_flags().
*/
HWLOC_DECLSPEC unsigned long hwloc_topology_get_flags (hwloc_topology_t topology);
Expand Down
17 changes: 13 additions & 4 deletions deps/hwloc/include/hwloc/autogen/config.h
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/* include/hwloc/autogen/config.h. Generated from config.h.in by configure. */
/* -*- c -*-
* Copyright © 2009 CNRS
* Copyright © 2009-2020 Inria. All rights reserved.
* Copyright © 2009-2022 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
Expand All @@ -12,11 +12,20 @@
#ifndef HWLOC_CONFIG_H
#define HWLOC_CONFIG_H

#define HWLOC_VERSION "2.7.0rc1-git"
#define HWLOC_VERSION "2.9.0rc2-git"
#define HWLOC_VERSION_MAJOR 2
#define HWLOC_VERSION_MINOR 7
#define HWLOC_VERSION_MINOR 9
#define HWLOC_VERSION_RELEASE 0
#define HWLOC_VERSION_GREEK "rc1"
#define HWLOC_VERSION_GREEK "rc2"

/* #undef HWLOC_PCI_COMPONENT_BUILTIN */
/* #undef HWLOC_OPENCL_COMPONENT_BUILTIN */
/* #undef HWLOC_CUDA_COMPONENT_BUILTIN */
/* #undef HWLOC_NVML_COMPONENT_BUILTIN */
/* #undef HWLOC_RSMI_COMPONENT_BUILTIN */
/* #undef HWLOC_LEVELZERO_COMPONENT_BUILTIN */
/* #undef HWLOC_GL_COMPONENT_BUILTIN */
/* #undef HWLOC_XML_LIBXML_COMPONENT_BUILTIN */

#if (__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 95))
# define __hwloc_restrict __restrict
Expand Down
14 changes: 7 additions & 7 deletions deps/hwloc/include/hwloc/bitmap.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/*
* Copyright © 2009 CNRS
* Copyright © 2009-2020 Inria. All rights reserved.
* Copyright © 2009-2022 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
Expand Down Expand Up @@ -112,7 +112,7 @@ HWLOC_DECLSPEC int hwloc_bitmap_copy(hwloc_bitmap_t dst, hwloc_const_bitmap_t sr
*
* If \p buflen is 0, \p buf may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
HWLOC_DECLSPEC int hwloc_bitmap_snprintf(char * __hwloc_restrict buf, size_t buflen, hwloc_const_bitmap_t bitmap);
Expand All @@ -137,7 +137,7 @@ HWLOC_DECLSPEC int hwloc_bitmap_sscanf(hwloc_bitmap_t bitmap, const char * __hwl
*
* If \p buflen is 0, \p buf may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
HWLOC_DECLSPEC int hwloc_bitmap_list_snprintf(char * __hwloc_restrict buf, size_t buflen, hwloc_const_bitmap_t bitmap);
Expand All @@ -161,7 +161,7 @@ HWLOC_DECLSPEC int hwloc_bitmap_list_sscanf(hwloc_bitmap_t bitmap, const char *
*
* If \p buflen is 0, \p buf may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
HWLOC_DECLSPEC int hwloc_bitmap_taskset_snprintf(char * __hwloc_restrict buf, size_t buflen, hwloc_const_bitmap_t bitmap);
Expand Down Expand Up @@ -357,11 +357,11 @@ HWLOC_DECLSPEC int hwloc_bitmap_last_unset(hwloc_const_bitmap_t bitmap) __hwloc_
* The loop must start with hwloc_bitmap_foreach_begin() and end
* with hwloc_bitmap_foreach_end() followed by a terminating ';'.
*
* \p index is the loop variable; it should be an unsigned int. The
* first iteration will set \p index to the lowest index in the bitmap.
* \p id is the loop variable; it should be an unsigned int. The
* first iteration will set \p id to the lowest index in the bitmap.
* Successive iterations will iterate through, in order, all remaining
* indexes set in the bitmap. To be specific: each iteration will return a
* value for \p index such that hwloc_bitmap_isset(bitmap, index) is true.
* value for \p id such that hwloc_bitmap_isset(bitmap, id) is true.
*
* The assert prevents the loop from being infinite if the bitmap is infinitely set.
*
Expand Down
4 changes: 2 additions & 2 deletions deps/hwloc/include/hwloc/deprecated.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/*
* Copyright © 2009 CNRS
* Copyright © 2009-2021 Inria. All rights reserved.
* Copyright © 2009-2022 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
Expand Down Expand Up @@ -55,7 +55,7 @@ hwloc_topology_insert_misc_object_by_parent(hwloc_topology_t topology, hwloc_obj
*
* If \p size is 0, \p string may safely be \c NULL.
*
* \return the number of character that were actually written if not truncating,
* \return the number of characters that were actually written if not truncating,
* or that would have been written (not including the ending \\0).
*/
static __hwloc_inline int
Expand Down
9 changes: 5 additions & 4 deletions deps/hwloc/include/hwloc/distances.h
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright © 2010-2021 Inria. All rights reserved.
* Copyright © 2010-2022 Inria. All rights reserved.
* See COPYING in top-level directory.
*/

Expand Down Expand Up @@ -35,8 +35,8 @@ extern "C" {
* from a core in another node.
* The corresponding kind is ::HWLOC_DISTANCES_KIND_FROM_OS | ::HWLOC_DISTANCES_KIND_FROM_USER.
* The name of this distances structure is "NUMALatency".
* Others distance structures include and "XGMIBandwidth", "XGMIHops"
* and "NVLinkBandwidth".
* Others distance structures include and "XGMIBandwidth", "XGMIHops",
* "XeLinkBandwidth" and "NVLinkBandwidth".
*
* The matrix may also contain bandwidths between random sets of objects,
* possibly provided by the user, as specified in the \p kind attribute.
Expand Down Expand Up @@ -160,7 +160,8 @@ hwloc_distances_get_by_type(hwloc_topology_t topology, hwloc_obj_type_t type,
* Usually only one distances structure may match a given name.
*
* The name of the most common structure is "NUMALatency".
* Others include "XGMIBandwidth", "XGMIHops" and "NVLinkBandwidth".
* Others include "XGMIBandwidth", "XGMIHops", "XeLinkBandwidth",
* and "NVLinkBandwidth".
*/
HWLOC_DECLSPEC int
hwloc_distances_get_by_name(hwloc_topology_t topology, const char *name,
Expand Down
5 changes: 1 addition & 4 deletions deps/hwloc/include/hwloc/helper.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/*
* Copyright © 2009 CNRS
* Copyright © 2009-2021 Inria. All rights reserved.
* Copyright © 2009-2022 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory.
Expand Down Expand Up @@ -886,9 +886,6 @@ enum hwloc_distrib_flags_e {
* \p flags should be 0 or a OR'ed set of ::hwloc_distrib_flags_e.
*
* \note This function requires the \p roots objects to have a CPU set.
*
* \note This function replaces the now deprecated hwloc_distribute()
* and hwloc_distributev() functions.
*/
static __hwloc_inline int
hwloc_distrib(hwloc_topology_t topology,
Expand Down
Loading

0 comments on commit b4c31ba

Please sign in to comment.