Releases: spcl/dace
DaCe 0.13.3
What's Changed
- Better integration with Visual Studio Code: Calling `sdfg.view()` inside a VSCode console or debug session will open the file directly in the editor!
- Code generator for the Snitch RISC-V architecture (by @Noah95 and @AM-Ivanov)
- Minor hotfixes to Python frontend, transformations, and code generation (with @orausch)
Full Changelog: v0.13.2...v0.13.3
DaCe 0.13.2
What's Changed
- New API for SDFG manipulation: Passes and Pipelines. More about that in the next major release!
- Various fixes to frontend, type inference, and code generation.
- Support for more numpy and Python functions: `arange`, `round`, etc.
- Better callback support:
  - Support callbacks with keyword arguments
  - Support literal lists, tuples, sets, and dictionaries in callbacks
- New transformations: move loop into map, on-the-fly-recomputation map fusion
- Performance improvements to frontend
- Better Docker container compatibility via fixes for config files without a home directory
- Add an interface to check whether code is running in a DaCe parsing context (#998):

      def potentially_parsed_by_dace():
          if not dace.in_program():
              print('Called by Python interpreter!')
          else:
              print('Compiled with DaCe!')

- Support compressed (gzipped) SDFGs. Loads normally, saves with:

      sdfg.save('myprogram.sdfgz', compress=True)  # or just run gzip on your old SDFGs

- SDFV: Add web serving capability by @orausch in #1013. Use for interactively debugging SDFGs on remote nodes with: `sdfg.view(8080)` (or any other port)
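
The note about running gzip on old SDFGs can be sketched with the Python standard library alone. This is a minimal illustration, not part of DaCe: it assumes an `.sdfg` file is JSON text and that an `.sdfgz` file is simply a gzip-compressed copy of it. The helper name `compress_sdfg_file` and the placeholder file contents are hypothetical.

```python
import gzip
import json
import os
import shutil
import tempfile


def compress_sdfg_file(src_path: str) -> str:
    """Write a gzip-compressed copy of src_path with an .sdfgz extension."""
    dst_path = os.path.splitext(src_path)[0] + '.sdfgz'
    with open(src_path, 'rb') as src, gzip.open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


# Stand-in for an existing uncompressed file (placeholder JSON, not a real SDFG)
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, 'myprogram.sdfg')
with open(old_path, 'w') as f:
    json.dump({'type': 'SDFG', 'placeholder': True}, f)

new_path = compress_sdfg_file(old_path)

# Decompressing yields the original JSON text again
with gzip.open(new_path, 'rt') as f:
    restored = json.load(f)
```

Streaming with `shutil.copyfileobj` avoids reading a large file into memory at once, which may matter for big graphs.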
Full Changelog: v0.13.1...v0.13.2
DaCe 0.13.1
What's Changed
- Python frontend: Bug fixes for closures and callbacks in nested scopes
- Bug fixes for several transformations (`StateFusion`, `RedundantSecondArray`)
- Fixes for issues with FORTRAN ordering of numpy arrays
- Python object duplicate reference checks in SDFG validation
Full Changelog: v0.13...v0.13.1
DaCe 0.13
New Features
Cutout:
Cutout allows developers to take large DaCe programs and reliably cut out subgraphs to create a runnable sub-program. This sub-program can then be used to check for correctness, benchmark, and transform a part of a program without having to run the full application.
* Example usage from Python:

      import dace
      from dace.sdfg.analysis import cutout

      def my_method(sdfg: dace.SDFG, state: dace.SDFGState):
          nodes = [n for n in state if isinstance(n, dace.nodes.LibraryNode)]  # Cut every library node
          cut_sdfg: dace.SDFG = cutout.cutout_state(state, *nodes)
          # The cut SDFG now includes each library node and all the necessary arrays to call it with

Also available in the SDFG editor.
Data Instrumentation:
Just like node instrumentation for performance analysis, data instrumentation allows users to set access nodes to be saved to an instrumented data report, and loaded later for exact reproducible runs.
* Data instrumentation natively works with CPU and GPU global memory, so there is no need to copy data back
* Combined with Cutout, this is a powerful interface to perform local optimizations in large applications with ease!
* Example use:

      import dace
      import numpy as np
      from dace import nodes

      @dace.program
      def tester(A: dace.float64[20, 20]):
          tmp = A + 1
          return tmp + 5

      sdfg = tester.to_sdfg()
      for node, _ in sdfg.all_nodes_recursive():  # Instrument every access node
          if isinstance(node, nodes.AccessNode):
              node.instrument = dace.DataInstrumentationType.Save

      A = np.random.rand(20, 20)
      result = sdfg(A)

      # Get instrumented data from report
      dreport = sdfg.get_instrumented_data()
      assert np.allclose(dreport['A'], A)
      assert np.allclose(dreport['tmp'], A + 1)
      assert np.allclose(dreport['__return'], A + 6)

Logical Groups:
SDFG elements can now be grouped by any criteria, and they will be colored during visualization by default (by @phschaad).
Changes and Bug Fixes
- Samples and tutorials have now been updated to reflect the latest API
- Constants (added with `sdfg.add_constant`) can now be used as access nodes in SDFGs. The constants are hard-coded into the generated program, so you can run code with the best performance possible.
- View nodes can now use the `views` connector to disambiguate which access node is being viewed
- Python frontend: `else` clause is now handled in for and while loops
- Scalars have been removed from the `__dace_init` generated function signature (by @orausch)
- Multiple clock signals in the RTL codegen (by @carljohnsen)
- Various fixes to frontends, transformations, and code generators
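
For readers unfamiliar with the loop `else` clause mentioned in the list above: in Python, the `else` block of a `for` or `while` loop runs only when the loop ends without hitting `break`. The sketch below is plain Python and does not use DaCe. The function names are hypothetical, and whether a particular loop is accepted inside a `@dace.program` still depends on the frontend.

```python
def first_negative_index(values):
    """Return the index of the first negative entry, or -1 if there is none."""
    for i, v in enumerate(values):
        if v < 0:
            result = i
            break
    else:
        # Runs only when the loop finished without hitting `break`
        result = -1
    return result


def halve_until_odd(n, max_steps):
    """Halve n while it is even; report whether it became odd within max_steps."""
    steps = 0
    while steps < max_steps:
        if n % 2 == 1:
            break
        n //= 2
        steps += 1
    else:
        # The loop condition became false without a `break`
        return n, False
    return n, True
```

Calling `first_negative_index([3, 1, -2, -5])` takes the `break` path, while `first_negative_index([1, 2])` falls through to the `else` block.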
Full Changelog available at v0.12...v0.13
DaCe 0.12
API Changes
Important: Pattern-matching transformation API has been significantly simplified. Transformations using the old API must be ported! Summary of changes:
- Transformations now extend either the `SingleStateTransformation` or `MultiStateTransformation` classes instead of using decorators
- Patterns must be registered as class variables called `PatternNode`s
- Nodes in matched patterns can then be accessed in `can_be_applied` and `apply` directly using `self.nodename`
- The name `strict` is now replaced with `permissive` (False by default). Permissive mode allows transformations to match in more cases, but may be dangerous to apply (e.g., create race conditions).
- `can_be_applied` is now a method of the transformation
- The `apply` method accepts a graph and the SDFG.
Example of using the new API:

    import dace
    from dace import nodes
    from dace.sdfg import utils as sdutil
    from dace.transformation import transformation as xf

    class ExampleTransformation(xf.SingleStateTransformation):
        # Define pattern nodes
        map_entry = xf.PatternNode(nodes.MapEntry)
        access = xf.PatternNode(nodes.AccessNode)

        # Define matching subgraphs
        @classmethod
        def expressions(cls):
            # MapEntry -> Access
            return [sdutil.node_path_graph(cls.map_entry, cls.access)]

        def can_be_applied(self, graph: dace.SDFGState, expr_index: int, sdfg: dace.SDFG, permissive: bool = False) -> bool:
            # Returns True if the transformation can be applied on a subgraph
            if permissive:  # In permissive mode, we will always apply this transformation
                return True
            return self.map_entry.schedule == dace.ScheduleType.CPU_Multicore

        def apply(self, graph: dace.SDFGState, sdfg: dace.SDFG):
            # Apply the transformation using the SDFG API
            pass

Simplifying SDFGs is renamed from `sdfg.apply_strict_transformations()` to `sdfg.simplify()`.
AccessNodes no longer have an `AccessType` field.
Other changes
- More nested SDFG inlining opportunities by default with the multi-state inline transformation
- Performance optimizations of the DaCe framework (parsing, transformations, code generation) for large graphs
- Support for Xilinx Vitis 2021.2
- Minor fixes to transformations and deserialization
Full Changelog: v0.11.4...v0.12
DaCe 0.11.4
What's Changed
- If a Python call cannot be parsed into a data-centric program, DaCe will automatically generate a callback into Python. Supports CPU arrays and GPU arrays (via CuPy) without copying!
- Python 3.10 support
- CuPy arrays are supported when calling `@dace.program`s in JIT mode
- Fix various issues in Python frontend and code generation
Full Changelog: v0.11.3...v0.11.4
DaCe 0.11.3
DaCe 0.11.2
DaCe 0.11.1
What's Changed
- More flexible Python frontend: you can now call functions and object methods, use fields and globals in `@dace` programs! Some examples:
  - There is no need to annotate called functions
  - `@dataclass` and general object field support
  - Loop unrolling: implicit and explicit (with the `dace.unroll` generator)
  - Constant folding and explicit constant arguments (with `dace.constant` as a type hint)
  - Debuggability: all functions (e.g. `dace.map`, `dace.tasklet`) work in pure Python as well
  - and many more features
- NumPy semantics are followed more closely, e.g., subscripts create array views
- Direct CuPy and `torch.tensor` integration in `@dace` program arguments
- Auto-optimization (preview): use `@dace.program(auto_optimize=True, device=dace.DeviceType.CPU)` to automatically run some transformations, such as turning loops into parallel maps.
- ARM SVE code generation support by @sscholbe (#705)
- Support for MLIR tasklets by @Berke-Ates (#747)
- Source Mapping by @benibenj (#756)
- Support for HBM on Xilinx FPGAs by @jnice-81 (#762)
Miscellaneous:
- Various performance optimizations to calling `@dace` programs
- Various bug fixes to transformations, code generator, and frontends
Full Changelog: v0.10.8...v0.11.1
DaCe 0.10.8
What's New?
- Various bug fixes and more stable Python/NumPy frontend
- Support for running DaCe programs within the Python interpreter
- (experimental) Support for automatic optimization passes (more coming soon!)