
Mplfinance Time Axis Concerns, and Internals of Displaying or Not Displaying Non Trading periods

Daniel Goldfarb edited this page Apr 1, 2022 · 8 revisions

Who should read this article:

In this article we describe internal workings of the time axis (the x-axis) when plotting Time-Series data in general, and Financial Markets Data in particular.

This information is not necessary for most mplfinance users. The user simply provides Pandas Timestamps, or python datetimes, or strings representing a date or datetime, and mplfinance handles the rest.

However, this article is important for mplfinance users who directly access the mplfinance Axes objects in a way that requires use of the x-axis.

As a reminder, direct access of mplfinance Axes objects is discouraged. Doing so will always require more code (see below). There are cases, however, where accessing Axes is necessary. Presently, there is an enhancement underway to make it much less necessary. In the meantime, this article can provide a better understanding of the time axis, in order to help those users who must directly access the mplfinance Axes objects in a way that requires use of the x-axis.


Mplfinance and the Matplotlib Time Axis:

The first thing to understand is that Matplotlib datetimes are not the same as Pandas Timestamps nor python datetimes (which in turn are also different from each other). Although it is relatively simple to convert from one to the other, mplfinance does this conversion for you. Users should not have to worry or think about matplotlib datetimes.

The exception is when users directly access the mplfinance Axes objects. In such cases users must be aware that the x-axis data may be matplotlib datetimes and that they may have to convert their own data.

  • Notice that we said may be matplotlib datetimes: There is another wrinkle that requires mplfinance to internally deal with yet another representation or mapping of datetimes.

Market Data and the Time Axis:

Financial Market Data (that one can obtain from any number of market data sources) typically does not include rows for periods when the market was closed. For example, daily data will not include data rows for weekends. Hourly data, or minute-by-minute data, for more than one day, will not include data rows for nighttime hours when the market is closed.


For example, some daily data might look like this:

Notice that there are no rows for 03/19/2022 and 03/20/2022:

Date Open High Low Close
03/23/2022 446.9100 448.4900 443.7100 443.8000
03/22/2022 445.8600 450.5800 445.8600 449.5900
03/21/2022 444.3400 446.4600 440.6800 444.3900
03/18/2022 438.0000 444.8600 437.2200 444.5200
03/17/2022 433.5900 441.0700 433.1900 441.0700

Similarly, the following minute-by-minute data set contains no data from the market close at 16:00 on 3/29 until the market open at 09:30 on 3/30:

DateTime Open High Low Close
03/30 09:33 460.6900 460.8299 460.6000 460.6700
03/30 09:32 460.6250 460.7900 460.5700 460.6700
03/30 09:31 460.6858 460.7300 460.5400 460.6250
03/30 09:30 460.3400 460.7300 460.2900 460.6800
03/29 15:59 461.2900 461.6200 461.1100 461.5300
03/29 15:58 461.2500 461.4100 461.2300 461.3000
03/29 15:57 461.0850 461.2900 460.9200 461.2500
03/29 15:56 461.0550 461.1500 461.0400 461.0900

When we tell Matplotlib that the x-axis is time, Matplotlib assumes the time axis is continuous (a reasonable assumption).

This means that the time axis will display ALL times between the first time and the last time in the data set.

The result is gaps in the plot where there is NO trading data:

[Figure: daily data plotted with non-trading gaps (nontrading_gaps)]

Similarly with minute-by-minute data over several trading days:

[Figure: intraday data plotted with non-trading gaps (iday_nontrading_gaps)]
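
To make the gaps concrete, here is a minimal sketch (pandas only, using the dates from the daily sample above) that lists the calendar days a continuous time axis would display even though the data has no rows for them. The variable names are illustrative and not part of mplfinance.

```python
import pandas as pd

# Trading dates from the daily sample above (oldest first).
trading_days = pd.DatetimeIndex(
    ["2022-03-17", "2022-03-18", "2022-03-21", "2022-03-22", "2022-03-23"]
)

# A continuous time axis covers every calendar day from first to last.
all_days = pd.date_range(trading_days[0], trading_days[-1], freq="D")

# Days on the axis that have no data row: these become the visible gaps.
gap_days = all_days.difference(trading_days)
print(list(gap_days.strftime("%m/%d/%Y")))  # ['03/19/2022', '03/20/2022']
```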


However, most people working with financial markets data prefer not to see these gaps.

For this reason, in mplfinance, the default value for show_nontrading is False:

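[Figure: daily data plotted without non-trading gaps (no_nontrading_gaps)]

[Figure: intraday data plotted without non-trading gaps (iday_no_nontrading_gaps)]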


The problem with NOT displaying Non-Trading periods is that, mathematically, THE TIME AXIS IS NOW DISCONTINUOUS with respect to time.

So, while we don't see gaps in the plotted data, there are DISCONTINUITIES in the x-axis itself.


What this means for mplfinance:

As mentioned above, when we squeeze out the non-trading gaps from market data (for example we put Monday right after Friday, or we put 09:30 Tuesday right after 16:00 Monday) we create discontinuities in the time-axis.

This means that a particular small length of x-axis may correspond to 1 day at one place on the axis, but the same length of axis may correspond to 2 days or 3 days or even 4 days at another place on the same axis. (This has implications for drawing trend lines, and for interpolating between data points, as we will discuss below).
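
As a small illustration of this point (a sketch using only pandas and the daily sample dates from earlier in this article), the calendar time covered by each one-row step along the squeezed axis can be computed directly:

```python
import pandas as pd

trading_days = pd.DatetimeIndex(
    ["2022-03-17", "2022-03-18", "2022-03-21", "2022-03-22", "2022-03-23"]
)

# Adjacent rows are drawn the same distance apart on the squeezed axis,
# but the calendar time between them is not constant.
days_per_step = trading_days.to_series().diff().dropna().dt.days.tolist()
print(days_per_step)  # [1, 3, 1, 1]
```

The step from Friday 03/18 to Monday 03/21 spans 3 days, while every other step spans 1 day, even though all four steps occupy the same length of axis.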

The underlying matplotlib Artists (graphics) code (as far as I know) does not support a discontinuous axis. It treats every axis within an Axes object as continuous, from its minimum to its maximum. Therefore, in order to plot a discontinuous time series with no gaps, we need to provide the Axes object with a set of continuous data to use for the x-axis. This is simple to do as long as that continuous data somehow maps to the discontinuous time series.

Mplfinance does this by using the row number of the DataFrame as the x-axis variable. For example, if the DataFrame contains 90 rows of data, then the x-axis variable will be a floating point number ranging from 0.0 to 89.0. Mplfinance then displays datetimes along the x-axis by internally maintaining a mapping between the row number and the datetime.
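
The idea of such a mapping can be sketched as follows. This is only an illustration of the concept, not mplfinance's actual formatter code; `row_to_label` is a hypothetical helper, and the rounding and out-of-range behavior are assumptions made for the example.

```python
import pandas as pd

dates = pd.DatetimeIndex(
    ["2022-03-17", "2022-03-18", "2022-03-21", "2022-03-22", "2022-03-23"]
)

# The continuous x values handed to the Axes: 0.0, 1.0, ..., len(dates) - 1.
x_values = [float(i) for i in range(len(dates))]

def row_to_label(x, fmt="%Y-%m-%d"):
    """Map an x-axis value (row number) back to a date label.

    Hypothetical helper: rounds to the nearest row and returns an empty
    string for positions that fall outside the data.
    """
    row = int(round(x))
    if row < 0 or row >= len(dates):
        return ""
    return dates[row].strftime(fmt)

print([row_to_label(x) for x in x_values])
```

A function of this shape is what a tick formatter needs: the Axes sees only evenly spaced floats, and the date text is looked up at label time.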


What this means for mplfinance users:

Most mplfinance users can be blissfully ignorant of these internal workings. The user provides Pandas Timestamps (in the form of a DatetimeIndex within a Pandas DataFrame) and mplfinance handles the rest.

In some cases (for example with vlines and alines kwargs) the user may provide not only Pandas Timestamps, but also python datetimes, or even strings representing a Date or Timestamp (for example "03-30-2022 13:00"). Again mplfinance handles this, converting the strings or datetimes, and even handling the case where the placement of vlines or alines requires time-axis interpolation between trading points in the OHLC data. (The interested user can see the code here.)

The cases where this does affect mplfinance users are those that involve directly accessing the mplfinance Axes objects in a way that requires use of the x-axis.

In such a case, the first rule is to be aware of the show_nontrading setting. If this kwarg is not specified, then it defaults to False. Then the following applies:

  • If show_nontrading is False, then the x-axis variable is a floating point representing the row number of the data in the dataframe.
  • If show_nontrading is True, then the x-axis variable is the matplotlib date.

When specifying x-axis data to an Axes object directly (i.e. not through mplfinance), the user must convert and pass the appropriate data (row number as a float, or matplotlib date).
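
The rule above can be summarized in a short sketch. `to_axis_x` is a hypothetical helper, not an mplfinance function; for simplicity it assumes a unique DatetimeIndex and requires the datetime to match a row exactly when show_nontrading is False.

```python
import pandas as pd
import matplotlib.dates as mdates

def to_axis_x(dtindex, when, show_nontrading=False):
    """Hypothetical helper: convert `when` to the x value the Axes expects."""
    ts = pd.Timestamp(when)
    if show_nontrading:
        # Continuous time axis: x is a matplotlib date (float number of days).
        return float(mdates.date2num(ts.to_pydatetime()))
    # Squeezed axis: x is the row number. get_loc() raises KeyError
    # if `ts` is not one of the rows.
    return float(dtindex.get_loc(ts))

dtindex = pd.DatetimeIndex(
    ["2022-03-17", "2022-03-18", "2022-03-21", "2022-03-22", "2022-03-23"]
)
print(to_axis_x(dtindex, "2022-03-21"))                        # 2.0
print(to_axis_x(dtindex, "2022-03-21", show_nontrading=True))  # a matplotlib date
```

The returned value is what one would pass as the x coordinate to a matplotlib call on the Axes (for example `ax.axvline(x=...)`), assuming the Axes was obtained from mplfinance with the same show_nontrading setting.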


Example Converting to floating point row numbers:

  • If a range of dates or datetimes is stored in a Pandas DatetimeIndex, then any Timestamp or datetime within that range can be easily converted to the floating point row number (interpolating for fractions of a row). We can do this using Pandas' ability to find one or more rows by the keys in a Pandas Index.
  • The simplest way to do this is to first convert the DatetimeIndex into a Pandas Series of datetimes (Timestamps) indexed by those same datetimes (Timestamps). If dtindex is the DatetimeIndex, then we simply do: dtseries = dtindex.to_series().
  • After this, the code looks something like this:
    • Note that (for simplicity) this code does not truly interpolate, but rather takes the midpoint if the datetime falls between two rows.
    • For an example of code that linearly interpolates between rows see function _date_to_iloc_linear() (in file src/mplfinance/_utils.py).
def _date_to_iloc(dtseries, date):
    '''Convert `date` to a row location, given a date series with a datetime index.
       If `date` does not exactly match a date in the series, return the midpoint
       of the two surrounding row locations.
       If `date` is outside the range of dates in the series, raise a ValueError.
    '''
    # Rows at or after `date`; the first of these is the upper neighbor.
    d1s = dtseries.loc[date:]
    if len(d1s) < 1:
        # Use .iloc for positional access; dtseries[0] would be a label lookup.
        sdtrange = str(dtseries.iloc[0]) + ' to ' + str(dtseries.iloc[-1])
        raise ValueError('User specified line date "' + str(date) +
                         '" is beyond (greater than) range of plotted data (' + sdtrange + ').')
    d1 = d1s.index[0]
    # Rows at or before `date`; the last of these is the lower neighbor.
    d2s = dtseries.loc[:date]
    if len(d2s) < 1:
        sdtrange = str(dtseries.iloc[0]) + ' to ' + str(dtseries.iloc[-1])
        raise ValueError('User specified line date "' + str(date) +
                         '" is before (less than) range of plotted data (' + sdtrange + ').')
    d2 = d2s.index[-1]
    # If there are duplicate dates in the series, for example in a renko plot
    # then .get_loc(date) will return a slice containing all the dups, so:
    loc1 = dtseries.index.get_loc(d1)
    if isinstance(loc1, slice):
        loc1 = loc1.start
    loc2 = dtseries.index.get_loc(d2)
    if isinstance(loc2, slice):
        loc2 = loc2.stop - 1
    # Exact match: loc1 == loc2. Otherwise: midpoint between the neighbors.
    return (loc1 + loc2) / 2.0
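
For comparison with the midpoint approach above, here is a minimal sketch of a true linear interpolation by elapsed time. It is not the `_date_to_iloc_linear()` implementation from mplfinance; `date_to_row_linear` is a hypothetical name, and the sketch assumes a sorted, unique, timezone-naive DatetimeIndex.

```python
import numpy as np
import pandas as pd

def date_to_row_linear(dtindex, date):
    """Return a fractional row number for `date`, interpolating by elapsed time."""
    ts = pd.Timestamp(date)
    if ts < dtindex[0] or ts > dtindex[-1]:
        raise ValueError(f'"{date}" is outside the range of plotted data '
                         f'({dtindex[0]} to {dtindex[-1]}).')
    origin = dtindex[0]
    one_sec = pd.Timedelta(seconds=1)
    # Express every row, and the requested date, as seconds since the first row.
    xp = ((dtindex - origin) / one_sec).to_numpy()
    x = (ts - origin) / one_sec
    rows = np.arange(len(dtindex), dtype=float)
    # Piecewise-linear interpolation between the two surrounding rows.
    return float(np.interp(x, xp, rows))

dtindex = pd.DatetimeIndex(
    ["2022-03-17", "2022-03-18", "2022-03-21", "2022-03-22", "2022-03-23"]
)
# Noon on Saturday is halfway through the 3-day gap between rows 1 and 2.
print(date_to_row_linear(dtindex, "2022-03-19 12:00"))  # 1.5
```

Note that the same half-row offset corresponds to 36 hours across the weekend but only 12 hours between two consecutive weekdays, which is the discontinuity described earlier.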

Example Converting to matplotlib datetimes:

  • Pandas DatetimeIndex objects store their datetimes as Pandas Timestamp objects. Timestamps are similar to but different from Python datetime objects.
  • Matplotlib dates (datetimes) are simply floating point numbers where the whole number portion is the number of days since the matplotlib epoch (1970-01-01 UTC by default as of Matplotlib 3.3; earlier versions used 0000-12-31), and the fractional portion is the fraction of a day since the start of the day (midnight).
  • AFAIK, it is not possible to convert directly from Timestamps to Matplotlib dates; rather, Timestamps must first be converted to Python datetimes, which are then converted to matplotlib dates (datetimes).
  • The code looks something like this:
import pandas as pd
import matplotlib.dates as mdates
import datetime
def _date_to_mdate(date):
    # Whether a string or Timestamp, first convert to a Python datetime
    if isinstance(date,str):
        # use pandas to convert the string to a pydatetime
        # (this could be done with module dateutil, but we're already using pandas)
        pydt = pd.to_datetime(date).to_pydatetime()
    elif isinstance(date,pd.Timestamp):
        pydt = date.to_pydatetime()
    elif isinstance(date,(datetime.datetime,datetime.date)):
        pydt = date
    else:
        return None
    # convert Python datetime to matplotlib datetime:
    return mdates.date2num(pydt)
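
The properties of matplotlib dates described above can be checked with a short sketch that uses only `matplotlib.dates` and pandas. The specific timestamps are illustrative; the absolute value of a matplotlib date depends on the epoch in use, so only epoch-independent quantities are examined here.

```python
import datetime
import matplotlib.dates as mdates
import pandas as pd

ts = pd.Timestamp("2022-03-30 13:00")
mdate = float(mdates.date2num(ts.to_pydatetime()))

# Fractional part = fraction of the day elapsed since midnight (13:00 -> 13/24).
day_fraction = mdate % 1.0

# Differences between matplotlib dates are measured in days, whatever the epoch.
friday = mdates.date2num(datetime.datetime(2022, 3, 18))
monday = mdates.date2num(datetime.datetime(2022, 3, 21))
days_apart = float(monday - friday)

# num2date() converts back; it returns a timezone-aware (UTC by default) datetime.
round_trip = mdates.num2date(mdate).replace(tzinfo=None)
print(day_fraction, days_apart, round_trip)
```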