Commit

trademark
Sergey Rybakov committed Dec 29, 2023
1 parent 85028d1 commit dc6b551
Showing 10 changed files with 125 additions and 122 deletions.
26 changes: 13 additions & 13 deletions README.md
@@ -1,4 +1,4 @@
# Deker
# DEKER™

![image](docs/deker/images/logo_50.png)

@@ -8,22 +8,22 @@
[![codecov](https://codecov.io/gh/openweathermap/deker/branch/main/graph/badge.svg?token=Z040BQWIOR)](https://codecov.io/gh/openweathermap/deker)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

Deker is pure Python implementation of petabyte-scale highly parallel data storage engine for
DEKER™ is pure Python implementation of petabyte-scale highly parallel data storage engine for
multidimensional arrays.

Deker name comes from term *dekeract*, the [10-cube](https://en.wikipedia.org/wiki/10-cube).
DEKER™ name comes from term *dekeract*, the [10-cube](https://en.wikipedia.org/wiki/10-cube).

Deker was made with the following major goals in mind:
DEKER™ was made with the following major goals in mind:

* provide intuitive interface for storing and accessing **huge data arrays**
* support **arbitrary number of data dimensions**
* be **thread and process safe** and as **lean on RAM** use as possible

Deker empowers users to store and access a wide range of data types, virtually anything that can be
DEKER™ empowers users to store and access a wide range of data types, virtually anything that can be
represented as arrays, like **geospatial data**, **satellite images**, **machine learning models**,
**sensors data**, graphs, key-value pairs, tabular data, and more.

Deker does not limit your data complexity and size: it supports virtually unlimited number of data
DEKER™ does not limit your data complexity and size: it supports virtually unlimited number of data
dimensions and provides under the hood mechanisms to **partition** huge amounts of data for
**scalability**.

@@ -40,7 +40,7 @@ dimensions and provides under the hood mechanisms to **partition** huge amounts

## Code and Documentation

Open source implementation of Deker storage engine is published at
Open source implementation of DEKER™ storage engine is published at

* https://github.com/openweathermap/deker

@@ -52,9 +52,9 @@ API documentation and tutorials for the current release could be found at

### Dependencies

Minimal Python version for Deker is 3.9.
Minimal Python version for DEKER™ is 3.9.

Deker depends on the following third-party packages:
DEKER™ depends on the following third-party packages:

* `numpy` >= 1.18
* `attrs` >= 23.1.0
@@ -63,7 +63,7 @@ Deker depends on the following third-party packages:
* `h5py` >= 3.8.0
* `hdf5plugin` >= 4.0.1

Also please note that for flexibility few internal Deker components are published as separate
Also please note that for flexibility few internal DEKER™ components are published as separate
packages:

* [`deker-local-adapters`](https://github.com/openweathermap/deker-local-adapters)
@@ -72,17 +72,17 @@ packages:

### Install

To install Deker run:
To install DEKER™ run:

```bash
pip install deker
```
Please refer to documentation for advanced topics such as running on Apple silicon or using Xarray
with Deker API.
with DEKER™ API.

### First Steps

Now you can write simple script to jump into Deker development:
Now you can write simple script to jump into DEKER™ development:

```python
from deker import Client, ArraySchema, DimensionSchema, TimeDimensionSchema
4 changes: 2 additions & 2 deletions docs/deker/api/modules.rst
@@ -1,5 +1,5 @@
Deker public API
=================
DEKER™ public API
==================

.. toctree::
:maxdepth: 4
50 changes: 25 additions & 25 deletions docs/deker/collection_schema.rst
@@ -6,10 +6,10 @@ Collection Schema
Introduction
============

In some aspects Deker is similar to other database management systems. It has *collections* which
In some aspects DEKER™ is similar to other database management systems. It has *collections* which
are equivalent to tables in relational databases or collections in MongoDB.

Collection stores one of two flavors of *arrays* supported by Deker. We would look into difference
Collection stores one of two flavors of *arrays* supported by DEKER™. We would look into difference
between them later in this tutorial, but for now it is important to understand that *array* is
defined by the *schema* associated with the *collection* where it is stored.

@@ -31,8 +31,8 @@ Collection *schema* consists from several components:
Understanding Array Flavors
===========================

Two flavors of *arrays* supported by Deker are ``Array`` and ``VArray``. Those objects represent
core concept of Deker storage. Hereafter we will describe their structure, differences and
Two flavors of *arrays* supported by DEKER™ are ``Array`` and ``VArray``. Those objects represent
core concept of DEKER™ storage. Hereafter we will describe their structure, differences and
commonalities and give overview of when either of them should be used.


@@ -61,8 +61,8 @@ layers with particular weather characteristic values, as shown in the legend.
In the illustration above single ``Array`` has 4 cells in each dimension, in other words its
*shape* is ``(4, 4, 4)``.

Deker will store each ``Array`` data in a separate file, and when we retrieve this ``Array`` object
from ``Collection`` and access its data, all operations will affect this file only.
DEKER™ will store each ``Array`` data in a separate file, and when we retrieve this ``Array``
object from ``Collection`` and access its data, all operations will affect this file only.


VArray
@@ -80,7 +80,7 @@ something really huge like whole Earth surface satellite image. Let's say that s
would be 300000x200000 px. If stored in single file it will produce large filesystem objects
that will impose limitations on concurrent read-write access thus impeding storage scalability.

To optimize this type of data storage, Deker uses tiling, i.e. it splits large ``VArray`` objects
To optimize this type of data storage, DEKER™ uses tiling, i.e. it splits large ``VArray`` objects
into a series of smaller ``Array`` objects and transparently joins them for user access as a
virtual array. It probably would still be impossible to access this huge array as a whole, but it
enables efficient access to digestible parts of it piece by piece.
@@ -92,7 +92,7 @@ enables efficient access to digestible parts of it piece by piece.
into separate *tiles* (``Array`` objects) with regular *grid*.

If ``Collection`` is defined to contain ``VArray`` objects, you don't have to worry about tiling,
Deker would transparently manage this for you under the hood.
DEKER™ would transparently manage this for you under the hood.

When some slice of data is queried from the ``VArray``, it automatically calculates which files
need to be opened to retrieve it and what part of requested slice data bounds belong to each of
@@ -111,8 +111,8 @@ Let's query the following slice of it: ``[1:3, :, :]``

Here you can see, that all 4 tile files will be affected, but only the highlighted pieces of them
will be actually read or written. All different files reads or writes could be done in parallel.
In case you are retrieving data, Deker will transparently combine each read piece into subset with
requested shape and return it to you. If you use these bounds to write data, Deker will
In case you are retrieving data, DEKER™ will transparently combine each read piece into subset with
requested shape and return it to you. If you use these bounds to write data, DEKER™ will
automatically split the slice you have provided into pieces and write them in parallel to
corresponding files.
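
As a rough illustration of the tile arithmetic described above, the sketch below computes which tiles a slice touches and which local bounds apply inside each of them. It is plain Python, not DEKER™ code: the helper ``tiles_for_slice``, the ``(4, 4, 4)`` virtual shape and the ``(2, 2, 4)`` tile shape are assumptions, chosen so that the slice ``[1:3, :, :]`` touches 4 tiles.

```python
from itertools import product


def tiles_for_slice(bounds, tile_shape):
    """Map global (start, stop) bounds per axis to local bounds per tile.

    Hypothetical helper for illustration only; it assumes stop > start
    on every axis and a regular grid of equally shaped tiles.
    """
    per_axis = []
    for (start, stop), size in zip(bounds, tile_shape):
        pieces = []
        # Indices of the first and last tile touched along this axis.
        for tile_idx in range(start // size, (stop - 1) // size + 1):
            tile_start = tile_idx * size
            # Clip the requested bounds to this tile and make them local.
            lo = max(start, tile_start) - tile_start
            hi = min(stop, tile_start + size) - tile_start
            pieces.append((tile_idx, (lo, hi)))
        per_axis.append(pieces)

    result = {}
    # Every combination of per-axis pieces is one affected tile.
    for combo in product(*per_axis):
        tile_position = tuple(idx for idx, _ in combo)
        result[tile_position] = tuple(local for _, local in combo)
    return result


# Slice [1:3, :, :] of an assumed (4, 4, 4) virtual array with (2, 2, 4) tiles.
touched = tiles_for_slice(bounds=[(1, 3), (0, 4), (0, 4)], tile_shape=(2, 2, 4))
for position, local_bounds in sorted(touched.items()):
    print(position, local_bounds)
```

Each entry pairs a tile position in the grid with the part of that tile that would be read or written; since the entries are independent, they could be processed in parallel.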

@@ -234,15 +234,15 @@ If a dimension has a real regular scale, we may indicate it::
),
]

As you can see, regular scale can be defined either with Python ``dict`` or with Deker ``Scale``
As you can see, regular scale can be defined either with Python ``dict`` or with DEKER™ ``Scale``
named tuple. The keyword ``name`` is optional. Scale values shall be always defined as ``floats``.

The parameters ``step`` and ``start_value`` may be negative as well. For example, ``era5`` weather
model has a geo grid shaped ``(ys=721, xs=1440)`` with step ``0.25`` degrees per cell. The
zero-point of the ``map`` is north-west or left-upper corner. In other words ``era5`` grid point
``(0, 0)`` is set to coordinates ``(lat=90.0, lon=-180.0)``.
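
The relation between a cell index and its scale value is linear. The plain-Python sketch below checks the ``era5`` numbers above; the helper names ``scale_value`` and ``scale_index`` are hypothetical and not part of the DEKER™ API, and the steps of ``-0.25`` along latitude and ``0.25`` along longitude are assumptions that follow from the north-west zero-point.

```python
def scale_value(index, start_value, step):
    """Coordinate of a cell on a regular scale (hypothetical helper)."""
    return start_value + index * step


def scale_index(value, start_value, step):
    """Inverse mapping: cell index of a coordinate on a regular scale."""
    return round((value - start_value) / step)


# era5 grid: 721 latitude rows and 1440 longitude columns.
first_point = (scale_value(0, 90.0, -0.25), scale_value(0, -180.0, 0.25))
last_lat = scale_value(720, 90.0, -0.25)     # southernmost row
last_lon = scale_value(1439, -180.0, 0.25)   # easternmost column
equator_row = scale_index(0.0, 90.0, -0.25)  # row holding latitude 0.0
```

A negative latitude step makes the values decrease as the row index grows, which matches a grid whose origin is the upper-left corner of the map.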

Here is an example of how this grid can be bound to real geographical coordinates in Deker::
Here is an example of how this grid can be bound to real geographical coordinates in DEKER™::

dimensions = [
DimensionSchema(
@@ -348,7 +348,7 @@ let you set an individual start point for each new ``Array`` or ``VArray`` at it

.. attention::
For ``start_value`` you can pass a datetime value with any timezone (e.g. your local timezone),
but you should remember that Deker converts and stores it in the UTC timezone.
but you should remember that DEKER™ converts and stores it in the UTC timezone.

Before querying some data from ``TimeDimension``, you should convert your local time to UTC to
be sure that you get a pack of correct data. You can do it with ``get_utc()`` function from
@@ -381,7 +381,7 @@ All databases provide some additional obligatory and/or optional information con
example, in SQL there are primary keys which indicate that data cannot be inserted without passing
them.

For this purpose Deker provides **primary** and **custom attributes** which shall be defined as a
For this purpose DEKER™ provides **primary** and **custom attributes** which shall be defined as a
list (or a tuple) of ``AttributeSchema``::

from deker import AttributeSchema
@@ -428,7 +428,7 @@ Primary Attributes
It is highly recommended to define at least one **primary** attribute in every schema.

Primary attributes are a strictly ordered sequence. They are used for ``Array`` or ``VArray``
filtering. When Deker is building its file system, it creates symlinks for main data files using
filtering. When DEKER™ is building its file system, it creates symlinks for main data files using
primary attributes in the symlink path. If you need to get a certain ``Array`` or ``VArray`` from a
``Collection``, you have two options how to do it:

@@ -483,8 +483,8 @@ dimensions schemas and ``dtype``. You may optionally pass a list of attributes s
Data Type
---------

Deker has a strong data typing. All the values of all the ``Array`` or ``VArray`` objects in one
``Collection`` shall be of the same data type. Deker accepts numeric data of the following Python
DEKER™ has a strong data typing. All the values of all the ``Array`` or ``VArray`` objects in one
``Collection`` shall be of the same data type. DEKER™ accepts numeric data of the following Python
and NumPy data types:

.. list-table:: Integers
@@ -649,7 +649,7 @@ That's why we need something that will fill them in.
Rules are the following:

1. ``fill_value`` **shall not be significant** for your data.
2. ``fill_value`` **is optional** - you may not provide it. In this case Deker will choose it
2. ``fill_value`` **is optional** - you may not provide it. In this case DEKER™ will choose it
automatically basing on the provided ``dtype``. For ``integer`` and ``unsigned integer`` data
types it will be the lowest value for the correspondent data type bit capacity. For example,
it will be ``-128`` for ``numpy.int8``. For ``float`` and ``complex`` data types it will be
@@ -658,8 +658,8 @@ Rules are the following:
passed to the ``dtype`` parameter. If all the values of the correspondent ``dtype`` are
significant for you, you shall choose a data type of a greater bit capacity. For example, if all
the values in the range ``[-128; 127]`` are valid for your dataset, you'd better choose
``numpy.int16`` instead of ``numpy.int8`` and set ``-129`` as ``fill_value`` or let Deker to set
it automatically. The other workaround is to choose any floating data type, e.g.
``numpy.int16`` instead of ``numpy.int8`` and set ``-129`` as ``fill_value`` or let DEKER™ to
set it automatically. The other workaround is to choose any floating data type, e.g.
``numpy.float16``, and have ``numpy.nan`` as a ``fill_value``.
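
The integer rule can be checked with NumPy alone. This is a minimal sketch, not the library's implementation: ``lowest_value`` is a hypothetical helper, and the sample data is invented to show why a wider data type frees a value for ``fill_value``.

```python
import numpy as np


def lowest_value(dtype):
    """Lowest representable value of an integer NumPy dtype (hypothetical helper)."""
    return int(np.iinfo(dtype).min)


int8_default = lowest_value(np.int8)    # lowest value for signed 8-bit integers
uint8_default = lowest_value(np.uint8)  # lowest value for unsigned 8-bit integers

# Invented sample where the lowest int8 value is real data,
# so it cannot double as a fill_value.
data = np.array([-128, 0, 127], dtype=np.int8)
collides = bool((data == int8_default).any())

# Widening to int16 frees values outside the int8 range, e.g. -129.
widened = data.astype(np.int16)
fill_value = -129
is_free = not bool((widened == fill_value).any())
```

The trade-off is storage size: ``numpy.int16`` doubles the bytes per value compared to ``numpy.int8``.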

Now, let's create once again some simple dimensions and attributes for both types of schemas::
@@ -797,7 +797,7 @@ Creating Collection

``Client`` is responsible for creating connections and its internal context.

As far as Deker is a file-based database, you need to provide some path to the storage, where your
As far as DEKER™ is a file-based database, you need to provide some path to the storage, where your
collections will be kept.


@@ -806,9 +806,9 @@ URI

There is a universal way to provide paths and connection options: an URI.

The scheme of URI string for embedded Deker databases, stored on your local drive, is ``file://``.
The scheme of URI string for embedded DEKER™ databases, stored on your local drive, is ``file://``.
It shall be followed by a path to the directory where the storage will be located. If this
directory (or even full path to it) does not exist, Deker will create it at ``Client``
directory (or even full path to it) does not exist, DEKER™ will create it at ``Client``
initialization.
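
For illustration, a ``file://`` URI for a POSIX storage directory can be assembled and inspected with the standard library only. This sketch makes no DEKER™ calls; the path ``/tmp/deker`` is just an example directory.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Absolute path to the storage directory (example value).
storage_path = PurePosixPath("/tmp/deker")

# The scheme is followed by the absolute path, which yields three slashes in a row.
uri = f"file://{storage_path}"

# Splitting the URI back into its parts.
parsed = urlparse(uri)
print(parsed.scheme, parsed.path)
```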

.. note::
@@ -824,7 +824,7 @@ In this documentation we will use a reference to a temporary directory ``/tmp/de
Client
------

Now open the ``Client`` for interacting with Deker::
Now open the ``Client`` for interacting with DEKER™::

from deker import Client

@@ -918,7 +918,7 @@ some world-wide weather data::

**We did it!**

Now there is a new path ``/tmp/deker/collections/weather`` on your local drive where Deker will
Now there is a new path ``/tmp/deker/collections/weather`` on your local drive where DEKER™ will
store the data relative to the ``Collection`` named ``weather``. Each ``Array`` will contain a pack
of daily 24-hours weather data for each entire latitude and longitude degree: ``temperature``,
``humidity``, ``pressure`` and ``wind_speed``.
15 changes: 8 additions & 7 deletions docs/deker/connecting_to_server.rst
@@ -5,32 +5,33 @@ Connecting to Server
.. _OpenWeather: https://openweathermap.org
.. _Installation page: installation.html

To access remotely the data stored on OpenWeather_ managed Deker server infrastructure, you need
To access remotely the data stored on OpenWeather_ managed DEKER™ server infrastructure, you need
to use server adapters.

It is an original OpenWeather plugin, based on `httpx <https://www.python-httpx.org/>`_
with HTTP 2.0 support, that allows your local client to communicate with remote OpenWeather
public server instances of Deker.
public server instances of DEKER™.

Deker will automatically find and initialize this plugin if it is installed in current environment.
DEKER™ will automatically find and initialize this plugin if it is installed in current
environment.

.. attention::
You must install ``deker-server-adapters`` package , for details refer to the `Installation page`_


Usage
=========
To use server version, you have to initialize Deker's Client with an uri which contains
To use server version, you have to initialize DEKER™ Client with an uri which contains
``http/https`` scheme.

.. code-block:: python
from deker import Client
client = Client("http://{url-to-deker-server}") # As simple as that
And now the client will communicate with Deker server.
And now the client will communicate with DEKER™ server.

If authentication is enabled on the Deker server, you can provide credentials by adding it
If authentication is enabled on the DEKER™ server, you can provide credentials by adding it
to the url like this:

.. code-block:: python
@@ -41,7 +42,7 @@ to the url like this:
Configuration
=============
Server adapters use ``httpx client`` under the hood. You can configure its behaviour by passing
keyword arguments to the ``httpx_conf`` parameter of the Deker's Client:
keyword arguments to the ``httpx_conf`` parameter of the DEKER™ Client:

.. code-block:: python
8 changes: 4 additions & 4 deletions docs/deker/data_access.rst
@@ -286,7 +286,7 @@ slicing parameters::

.. _`official documentation`: https://numpy.org/doc/stable/user/basics.indexing.html

Deker allows you to index and slice its ``Array`` and ``VArray`` not only with integers, but with
DEKER™ allows you to index and slice its ``Array`` and ``VArray`` not only with integers, but with
the ``types`` by which the dimensions are described.

But let's start with a **constraint**.
@@ -561,7 +561,7 @@ Read Xarray
-----------

.. warning::
``xarray`` package is not in the list of the Deker default dependencies. Please, refer to the
``xarray`` package is not in the list of the DEKER™ default dependencies. Please, refer to the
Installation_ chapter for more details

Xarray_ is a wonderful project, which provides special objects for working with multidimensional
@@ -601,8 +601,8 @@ It provides even more opportunities. Refer to ``xarray.DataArray`` API_ for deta

Locks
======
Deker is thread and process safe. It uses its own locks for the majority of operations.
Deker locks can be divided into two groups: **read** and **write** locks
DEKER™ is thread and process safe. It uses its own locks for the majority of operations.
DEKER™ locks can be divided into two groups: **read** and **write** locks

**Read locks** can be shared between threads and processes with no risk of data corruption.
