Skip to content

Latest commit

 

History

History
563 lines (435 loc) · 27.1 KB

README.md

File metadata and controls

563 lines (435 loc) · 27.1 KB

Codacy Badge Maven Central CircleCI GitHub Top Languages GitHub GitHub last commit GitHub issues GitHub issues by-label

Number

This project is about numbers and their mathematics. The chief features of this library are:

  • all numbers are exact wherever it is possible, including $e$ (Unicode: xD835DF00) and $\pi$ (Unicode: xD835DED1);
  • inexact numbers are represented along with their error bounds;
  • lazy evaluation of expressions to help avoid temporary inexact values from becoming part of a result;
  • there are several domains of Number (expressed with different "factors") to support angles, logarithms, roots.

There is no such thing as accidental loss of precision (at least, provided that code follows the recommendations). For example, if you write:

val x = 1 / 2

your x will be an Int of value 0, because of the way Java-style operators work (in this case, integer division).

However, if you write the idiomatically correct form:

import com.phasmidsoftware.number.core.Number.NumberOps
val x = 1 :/ 2

then x will be a Number with value exactly one half.

You probably want to see some code: so go to the worksheets package and take a look, starting with NumberWorksheet.sc, Foucault1.sc, Newton.sc, and so on.

Introduction

There are three articles on Medium regarding this library. They are Number (part 1), Number (part 2), and Fuzzy, lazy, functional numeric computing in Scala

The Number project provides mathematical utilities where error bounds are tracked (and not forgotten). All functions handle the transformation or convolution of error bounds appropriately. When the error bound is sufficiently large compared to a number, that number is considered to be zero (see Comparison). This implies that, when comparing numbers, any significant overlap of their error bounds will result in them testing as equal (according to the compare function, but not the equals function).

The values of Numbers are represented internally as either Int, Rational, or Double. Rational is simply a case class with BigInt elements for the numerator and denominator. It is of course perfectly possible to use the Rational classes directly, without using the Number (or Expression) classes.

There are four domains of values, each identified by a Factor (see Factors below). These allow the exact representation of roots, logarithmic numbers, radians, and pure numbers.

Java API

In addition to the Scala API, version 1.0.14 introduces a Java API where it is harder to invoke the Scala classes directly from Java. These situations involve classes which have similar names (or have no Java equivalent).

Here are the current API specifications:

RationalJ:

def bigDecimalToRational(x: java.math.BigDecimal): Rational
def rationalToBigDecimal(r: Rational): java.math.BigDecimal
def bigIntegerToRational(x: BigInteger): Rational
def rationalToBigInteger(r: Rational): BigInteger
def longToRational(l: java.lang.Long): Rational
def rationalToLong(r: Rational): java.lang.Long
def doubleToRational(x: java.lang.Double): Rational
def rationalToDouble(r: Rational): java.lang.Double
def stringToRational(s: String): Rational

NumberJ

def bigDecimalToNumber(x: java.math.BigDecimal): Number
def numberToBigDecimal(x: Number): java.math.BigDecimal
def bigIntegerToNumber(x: BigInteger): Number
def numberToBigInteger(x: Number): BigInteger
def longToNumber(l: java.lang.Long): Number
def numberToLong(x: Number): java.lang.Long
def doubleToNumber(x: java.lang.Double): Number
def stringToNumber(s: String): Number

ExpressionJ

def add(x: Expression, y: Expression): Expression
def multiply(x: Expression, y: Expression): Expression

Parsing

A String representing a number with two or fewer decimal places is considered exact--a number with more than two decimal places is considered fuzzy, unless it ends in two zeroes in which case it is considered exact. Here are some examples:

  • Real("1.00"): exact
  • Real("1.0100"): exact
  • Real("1.100"): exact
  • Real("1.010"): fuzzy

You can always override this behavior by adding "*" or "..." to the end of a number with fewer than two DPs, or by adding two 0s to the end of a number with more than two decimal places.

  • Real("1.100*")" fuzzy

See RealWorksheet.sc

The rules are a little different if you define a number using a floating-point literal such as Number(1.23400), the compiler will treat that as a fuzzy number, even though it ends with two zeroes because the compiler essentially ignores them. However, Real(1.23) will be considered exact while Real(1.234) will not. It's best always to use a String if you want to override the default behavior.

In general, the form of a number to be parsed from a String is:

number ::= value? factor?
factor ::= "Pi" | "pi" | "PI" | 𝛑 | 𝜀
value ::= sign? nominalValue fuzz* exponent*
nominalValue ::= integerPart ( "." fractionalPart )? | rational
rational ::= digits "/" digits
integerPart ::= digits
fractionalPart ::= digits
fuzz ::= "..." | "*" | "(" fuzz ")" | "[" fuzz "]"
exponent ::= E sign? digits
fuzz ::= one or two digits

Note that the e and pi symbols are, respectively, (in unicode): \uD835\uDF00 and \uD835\uDED1 (�� and ��)
A number must have at least one of either the value or the factor components. If no explicit factor is specified, then the number will be a Scalar (an ordinary number). If you want to get exact trigonometric values, then it's important to specify the factor as pi (or e).

Number creation

Parsing, described above is really the most precise way of specifying numerical values. But, of course, it's a lot easier to write code that uses numerical literals. For Int and Long, these give us no problems, of course. Neither is there any issue with Rational, BigDecimal, and BigInt. BigDecimal values are represented internally by Rational. There are two ways to specify Rational numbers:

  • one is to create a String of the form r"n/d" where n and d represent the numerator and the denominator;
  • the other way is simply to write n:/d (again n and d are as above).

Either of these methods will require importing the appropriate implicit classes from Rational. It's probably the simplest just to include:

import Rational._

Doubles are where the trickiest conversions apply. Writing something like Number(3.1415927) will result in a FuzzyNumber with error bounds of 5 * 10^-7. To be consistent with the String representation, Number(1.25) will result in an ExactNumber represented internally by a Rational of 5/4. However, if you want to force a number like 3.1415927 to be exact, then you will need to write

Number("3.141592700")

For fuzzy numbers in standard scientific notation, there is an operator "~" which, when following a Double, will add the next one or two integer digits as the standard deviation. For example, the proton-electron mass ratio:

1836.15267343~11

Rendering

The render method is defined in the trait NumberLike and thus is defined by all subtypes, including Field, Number, Rational, etc. For the prettiest output, you should use render rather than toString.

Generally speaking, the output String corresponding to a Number will be the same as the input String, although that is not guaranteed. Numeric quantities followed by "(xx)" show standard scientific notation where xx represents the standard deviation of the error with respect to the last two digits (sometimes there is only one x which corresponds to the last digit). If a number is followed by "[x]" or "[xx]" this corresponds to a "box" (i.e. truncated uniform) probability density function. It's unlikely that you'll need to use this form since box is the default shape when specifying fuzzy numbers with a String.

For Rational numbers, it is most likely that the number will be rendered as exactly as possible. For values which are exactly renderable using decimal notation, that will be the result. For values which have a repeating sequence in decimal notation, the repeating sequence will be enclosed within < and >. If the repeating sequence is too long (or too hard to identify), and if the denominator is less than 100,000, the number will render as a rational, i.e. numerator/denominator. Otherwise, the number will render as many digits as possible, with "..." added to the end.

Fuzzy

The Fuzzy[X] trait defines a typeclass which adds fuzziness to any object type. There is exactly one method defined and that is same:

def same(p: Double)(x1: X, x2: X): Boolean

Given a confidence value p (a probability between 0 and 1), this method will determine if any two objects of type X can be considered the same. If p is 0, then all Fuzzy quantities will be considered the same (i.e. same returns true). If p is 1, then Fuzzy quantities will only be considered the same if the numbers actually are exactly the same (in practice, this generally means that passing 1 for p will result in a false return).

The fuzzyCompare method of FuzzyNumber does use the same method.

Note that the Fuzzy trait assumes nothing at all about the representation of X, or even if X is numeric. The spec file shows an example where X is represents a color. In the vast majority of cases, the X of Fuzzy will be Double.

Comparison

Comparison between Numbers is based on their values, providing that they belong to the same domain (see Factor, below). If they are from different domains, one number will be converted to the domain of the other. If, after any conversion is taken into account, the two values compare equal, then the Numbers are equal. For ExactNumber, comparison ends there.

However, for FuzzyNumber, it is then determined whether there is significant overlap between the fuzz of the two numbers. See Fuzzy, above. The FuzzyNumber object has a method fuzzyCompare, which invokes same for two fuzzy numbers, given a confidence value (p). This, in turn, is invoked by fuzzyCompare of GeneralNumber, which compares this with another Number.

If the overlap is sufficient that there is deemed to be a 50% probability that the numbers are really the same, then the comparison yields 0 (equal). Additionally, each of the comparison methods involved has a signature which includes a p value (the confidence probability). The compare(Number) method of FuzzyNumber (arbitrarily) sets the p value to be 0.5.

Mill

The Mill trait allows expressions to be evaluated using RPN (Reverse Polish Notation). For example:

val eo: Option[Expression] = Mill.parseMill("42 37 + 2 *").toOption.flatMap(_.evaluate)

yields the optional Expression with a materialized value of 158. See the code for other methods for defining Mill operations.

The Mill.parse method in turn invokes methods of MillParser.

The Mill offers two parsers: one is a pure RPN parser (as described above). The other is an infix parser which uses Dijkstra's Shunting Yard algorithm to build a Mill.

Some of the operators of Mill are as follows:

^: Power
+: Add
-: Subtract
*: Multiply (also ×)
/: Divide (also ÷)
v: Sqrt
<>: Swap
: Noop
(: Open
): Close

Additional operators include clr, chs, inv, ln, exp, sin, cos.

Field

The most general form of mathematical quantity is represented by a Field. See Field. A field supports operations such as addition, subtraction, multiplication, and division. We also support powers because, at least for integer powers, raising to a power is simply iterating over a number of multiplications.

Field extends Numerical which, in turn, extends NumberLike (see definitions below).

The two types of Field supported are Real and Complex. Real is a wrapper around a Number (see below) while Complex (see below) is a wrapper around two _Number_s (more or less).

Number

Number is a trait which extends Numerical (but not Field).

There are two subtypes of Number: ExactNumber and FuzzyNumber. Each of these types extends an abstract type called GeneralNumber (although this relationship is expected to change at some future point). GeneralNumber has three members:

  • value (type Value): the numerical value of the number;
  • factor (type Factor): the domain of the value (scalar, radian, log, root, etc.);
  • (optional) fuzz (type Fuzz[Double]): the fuzziness of the number (always None for an ExactNumber).

Value

The "value" of a Number is represented by the following type:

type Value = Either[Either[Option[Double], Rational], Int]

Thus, an integer x is represented by Right(x). A Rational x is represented by a Left(Right(x)). A Double x is represented by a Left(Left(Some(x))). There is also an invalid Number case (NaN) which is represented by Left(Left(Left(None))).

This Value is always of the rightmost type possible: given the various possible specializations. Thus, an Int x which is in range will be represented by Right(x). Thus, a Rational with numerator x and unit denominator, where x is in the range of an Int, will be represented by Right(x). It is also possible that a Double x will be represented by a Left(Right(Rational(x))). For this to happen, the value in question must have fewer than three decimal places (similar to the parsing scheme).

Complex

There are two types of Complex: ComplexCartesian and ComplexPolar. Complex numbers support all the Field operations, as well as modulus, argument, rotate, and conjugate. It is easy to convert between the two types of Complex.

The ComplexPolar object has an additional member (as well as the real and imaginary parts): the number of branches. For example, the square root of 2 should have two branches, yielding: $±√2$.

There are two ways to parse a String as a Complex (in each case, the parsing of the String is identical):

  • Complex.parse(String): will return a Try[Complex];
  • C"...": will return a Complex (or throw an exception).

For example (see also Complex.sc),

C"1i0" : ComplexCartesian(1,0)
C"1-i1" : ComplexCartesian(1,-1)
C"1ipi" : ComplexPolar(1,pi)

Factor

There are three types of "factor:"

  • PureNumber, in particular, Scalar (for ordinary dimensionless numbers), Radian (used to represent radians or any multiple of $\pi$);
  • Logarithmic, in particular, NatLog, Log2, and Log10;
  • Root(n), in particular: Root2 (for square roots) and Root3 (for cube roots).

These allow certain quantities to be expressed exactly, for example, $sin(\frac{\pi}{3})$ is the square root of $\frac{3}{4}$. The true (Scalar) values of the logarithmic numbers are $e^x$, $2^x$, and $10^x$ respectively where $x$ is the "value" of the Number.

Trigonometrical functions are designed to work with Radian quantities. Such values are limited (modulated) to be in the range $-\pi...\pi$. However, this modulation typically happens inside operations or as part of render, so it is still possible to define a value of $2\pi$. For example, if you want to check that the sine of $\frac{\pi}{2}$ is equal to 1 exactly, then you should write the following:

val target = (Number.pi/2).sin
target shouldBe Number.one

Similarly, if you use the atan method on a Scalar number, the result will be a number (possibly exact) whose factor is Radian.

The 𝜀 factor works quite differently. It is not a simple matter of scaling. A Number of the form Number(x, NatLog) actually evaluates to $e^x$ rather than $e x$.

It would be possible to implement $\pi$ values similarly to 𝜀 values (as evaluations of $e^{ix}$). However, this is not currently done (future enhancement?). See Complex numbers.

Constants

Constant values of fields are defined in the Constants object. Many of the values are dependent on constants in the Number class which defines values for pi, $\pi$, e, one, zero, i, etc.

The Constants object also contains a number of fundamental (physical and mathematical) constant definitions, in addition to those defined by Number. For example: c (speed of light), alpha (fine structure constant), etc.

NumberLike

NumberLike defines behavior which is of the most general number-like nature. The specific methods defined are:

def isExact(maybeFactor: Option[Factor]): Boolean // determines if this object is exact in the domain of the (optional) factor
def isExact: Boolean = isExact(None)
def asNumber: Option[Number]
def render: String

Additionally, there are two methods relating to the Set of which this NumberLike object is a member, such as: the integers ($\mathbb{Z}$)

def memberOf: Option[NumberSet]
def memberOf(set: NumberSet): Boolean

Numerical

Numerical extends NumberLike. Additional methods include:

def isSame(x: Numerical): Boolean // determines if this and x are equivalent, numerically.
def isInfinite: Boolean
def isZero: Boolean
def isUnity: Boolean
def signum: Int
def unary_- : Field
def invert: Field
def normalize: Field
def asComplex: Complex
def asReal: Option[Real]

Lazy Evaluation

Version 1.0.3 supports lazy evaluation via a trait called Expression. The advantage of lazy evaluation is not so much performance. That's going to be neither here nor there. But it is in avoiding precision loss in some circumstances. The simplification mechanism which is invoked when materializing an expression goes to great lengths to cancel out any loss of precision.

An example of this is the expression (√3 + 1)(√3 - 1). It is easy to see that this should have a value of exactly 2. However, it is not trivial to do the appropriate matching to achieve this simplification. This is why Number uses the Matchers package (https://github.com/rchillyard/Matchers).

The simplification mechanism uses its own ExpressionMatchers, which is an extension of Matchers. The current set of expression optimizations is somewhat limited, but it catches the most important cases.

For example, suppose an expression you are working on involves the square root of, say, 7. However, you don't particularly pay attention to the fact that later on in the calculation, you square everything. If you don't use lazy evaluation, your final result will have an error bound, even though the true value should be proportional to exactly 7.

It's important to realize that, to get the benefit of this behavior, you must use the Expression mechanism (not a pure Number).

  it should "give precise result for sqrt(7)^2" in {
    val x: Expression = Number(7)
    val y = x.sqrt
    val z = y ^ 2
    z.materialize shouldBe Number(7)
  }
  it should "show ^2 and sqrt for illustrative purposes" in {
    val seven = Number(7)
    val x = seven.sqrt
    val y = x power 2
    y shouldEqual Number(7)
    y shouldBe Number(7)
  }

The second test fails with "7.000000000000001 was not equal to 7," although if we do a fuzzy compare, using a custom equality test, we can at least make y shouldEqual 7 work.

NOTE: from V1.0.12 onwards, there are more special cases implemented in the Number code and so many of these issues which required the use of Expressions will now work just using Numbers. This is particularly true of the example above involving the square root of 7.

There is an implicit class ExpressionOps which provides methods which allow Number operations to behave as expressions. So, for example, you can write:

val x = Number(1) + 2

For this to compile properly, you will need to import the ExpressionOps class.

The one drawback of the Expression mechanism is that, when you want to convert back to a Number, it is a little awkward. You can use the asNumber method (which returns an Option[Number]) or you can use an implicit converter (in which case you will need to ensure that you have Number._ imported). If you use the latter mechanism, keep in mind that it's possible that an exception will be thrown.

Error Bounds (Fuzziness)

The error bounds are represented by the Fuzz[Double] class. A Number with None for the fuzz is an ExactNumber, otherwise, FuzzyNumber. There are three major attributes of fuzz: shape, style (either relative or absolute), and the value (called magnitude when absolute, and tolerance when relative). Shape describes the probability density function (PDF) of values compared to the nominal value. There are currently only two types of shape:

  • Box: a truncated uniform probability density function--magnitude/tolerance relate to half the width of the non-zero probability section.
  • Gaussian: a normal probability density function: the nominal value is at the mean, and magnitude/tolerance is the standard deviation.

It's easy to convert between these four different possibilities. Generally speaking, when doing addition (or when a Number is first defined), it's convenient for the fuzz to be absolute. When performing any other operation, it's most convenient for the fuzz to be relative. It's not possible to combine two Box-shaped fuzzes: it would be possible if we allowed for trapezoids as well as rectangles, but that's far too complicated. So, whenever we combine fuzz (using convolution), we operate on Gaussian PDFs which can easily be combined.

So, why is relative fuzz usually the best? Well consider scaling--multiplying by a constant. The relative fuzz doesn't change at all. In the following, $f$ is a constant factor. Let's assume that $y=fx$.

Differentiating, we get,

$$Δy=fΔx$$

Dividing both sides by y, yields

$$\frac{Δy}{y}=\frac{Δx}{x}$$

Thus, the relative fuzz of y is equal to the relative fuzz of x.

When we multiply two fuzzy numbers together, we add the relative fuzzes together:

$$z+Δz=(x+Δx)(y+Δy)$$

Therefore, (ignoring the term which is $ΔxΔy$),

$$Δz=yΔx+xΔy$$

Dividing both sides by $z$:

$$\frac{Δz}{z}=\frac{Δx}{x}+\frac{Δy}{y}$$

Thus, the relative fuzz of z is equal to the sum of the relative fuzzes of x and y.

But, when Δx and Δy are taken from a Gaussian probability density function, the convolution of those two PDFs, is given by slightly different expressions depending on whether the PDFs are independent or correlated. See the code (Fuzz) for details.

Things get only slightly more complex when applying monadic (single operand) functions or applying a function such as $z=x^y$:

In general, when we apply a monadic operator $y=f(x)$ (such as constant factor, as above, or power, or one of the trigonometric operators), the formula for the relative fuzz of the result $\frac{Δy}{y}$ based on the relative fuzz of the input $\frac{Δx}{x}$ is:

$$\frac{Δy}{y}=\frac{x \frac{dy}{dx}(x)}{f(x)}\frac{Δx}{x}$$

Constants cancel, powers survive as is, and so on.

For example, if $y=e^x$ then

$$\frac{Δy}{y}=x\frac{Δx}{x}$$

Again, these formulas can be looked up in the code.

Comparing two fuzzy numbers involves subtracting the two numbers and then determining if the probability at zero is sufficiently high to consider the difference to be zero. If the probability is greater than 50% (the default--although there are method signatures that allow for different values), then we consider that the different is zero (method isZero) or that it has a signum of 0.

Approximation

The Approximation object provides a method solve which will implement the Newton-Raphson method of approximation and also Halley's method (if you need it). See Newton.sc for examples.

Continued Fractions

This library includes a facility to create continued fractions which can be used to define (or approximate) constant values. See the worksheet ContinuedFractions.sc.

For example, the golden ratio ($\phi$) can be evaluated using an infinite continued fraction where the coefficients are all 1. Continued fractions can be used to generate "convergents" which are rational numbers and which approximate a value. For example, the convergents for $\pi$ include with the familiar 22/7, 355/113, etc.

Versions

  • Version 1.1.0: Significant refactoring:
    • Number is no longer a subtype of Field. Code should use the wrapper Real(number) to form a Field.
    • Some of the worksheets were broken and have been fixed.
  • Version 1.0.17: Minor changes.
  • Version 1.0.16: Added C-interpolator for Complex objects; various other fixes, including radian values now range from -pi to pi.
  • Version 1.0.15: Significant improvements to the rendering of rational numbers.
  • Version 1.0.14: ComplexPolar now keeps track of branches; introduced Real type. Java API.
  • Version 1.0.13: Mostly cleanup together with some fixes related to Root factors and rendering of fuzziness.
  • Version 1.0.12: Mostly cleanup together with some fixes related to the new factors.
  • Version 1.0.11: Changes to the factors: renamed Pi as Radian, E as NatLog, and added Log2, Log10, Root2 and Root3.
  • Version 1.0.10: Many improvements and fixes:
    • added Constants,
    • implicit converter from Expression to Number,
    • refactored structure of classes,
    • totally reworked the expression matchers.
  • Version 1.0.9: Added complex numbers; improved simplifications somewhat; use version 1.0.4 of Matchers (now in main).
  • Version 1.0.8: This includes better simplification and, in particular, evaluates (√3 + 1)(√3 - 1) as exactly 2.
  • Version 1.0.7: added Matchers.
  • Version 1.0.6: added Mill (RPN evaluator).
  • Version 1.0.5: reimplement the e factor.
  • Version 1.0.4 Made improvements to Rational, removed BigInt from Value, and effected many refactorings.
  • Version 1.0.3 implements lazy evaluation.
  • Version 1.0.2 Included fixing the bug mentioned in 1.0.1 (actually a Rational bug), as well as adding the :/ operator and many other fixes/features.
  • Version 1.0.1 Fixed many issues with minor inconsistencies. Most important, perhaps, was the implementation of compare, along with signum and isZero. Each of these has, significantly, a signature with a confidence value (the default value is 0.5).
  • Initial version is 1.0.0

Future Upgrades