Skip to content

Commit

Permalink
Merge branch 'main' into main
Browse files Browse the repository at this point in the history
  • Loading branch information
lemire authored Jan 28, 2024
2 parents 9c8891e + f320619 commit 620376a
Show file tree
Hide file tree
Showing 10 changed files with 1,044 additions and 34 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.9)

project(fast_float VERSION 5.3.0 LANGUAGES CXX)
project(fast_float VERSION 6.0.0 LANGUAGES CXX)
option(FASTFLOAT_TEST "Enable tests" OFF)
if(FASTFLOAT_TEST)
enable_testing()
Expand Down
2 changes: 1 addition & 1 deletion CONTRIBUTORS
Original file line number Diff line number Diff line change
Expand Up @@ -7,4 +7,4 @@ Fabio Pellacini
Lénárd Szolnoki
Jan Pharago
Maya Warrier
Taha Khokhar
Taha Khokhar
55 changes: 50 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,17 +4,21 @@
[![Ubuntu 22.04 CI (GCC 11)](https://github.com/fastfloat/fast_float/actions/workflows/ubuntu22.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/ubuntu22.yml)

The fast_float library provides fast header-only implementations for the C++ from_chars
functions for `float` and `double` types. These functions convert ASCII strings representing
decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding (including
functions for `float` and `double` types as well as integer types. These functions convert ASCII strings representing decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding (including
round to even). In our experience, these `fast_float` functions many times faster than comparable number-parsing functions from existing C++ standard libraries.

Specifically, `fast_float` provides the following two functions with a C++17-like syntax (the library itself only requires C++11):
Specifically, `fast_float` provides the following two functions to parse floating-point numbers with a C++17-like syntax (the library itself only requires C++11):

```C++
from_chars_result from_chars(const char* first, const char* last, float& value, ...);
from_chars_result from_chars(const char* first, const char* last, double& value, ...);
```
You can also parse integer types:
The return type (`from_chars_result`) is defined as the struct:
```C++
struct from_chars_result {
Expand Down Expand Up @@ -103,6 +107,43 @@ We support Visual Studio, macOS, Linux, freeBSD. We support big and little endia
We assume that the rounding mode is set to nearest (`std::fegetround() == FE_TONEAREST`).
## Integer types
You can also parse integer types using different bases (e.g., 2, 10, 16). The following code will
print the number 22250738585072012 three times:
```C++
uint64_t i;
const char str[] = "22250738585072012";
auto answer = fast_float::from_chars(str, str + strlen(str), i);
if (answer.ec != std::errc()) {
std::cerr << "parsing failure\n";
return EXIT_FAILURE;
}
std::cout << "parsed the number "<< i << std::endl;
const char binstr[] = "1001111000011001110110111001001010110100111000110001100";
answer = fast_float::from_chars(binstr, binstr + strlen(binstr), i, 2);
if (answer.ec != std::errc()) {
std::cerr << "parsing failure\n";
return EXIT_FAILURE;
}
std::cout << "parsed the number "<< i << std::endl;
const char hexstr[] = "4f0cedc95a718c";
answer = fast_float::from_chars(hexstr, hexstr + strlen(hexstr), i, 16);
if (answer.ec != std::errc()) {
std::cerr << "parsing failure\n";
return EXIT_FAILURE;
}
std::cout << "parsed the number "<< i << std::endl;
```

## C++20: compile-time evaluation (constexpr)

In C++20, you may use `fast_float::from_chars` to parse strings
Expand Down Expand Up @@ -264,7 +305,7 @@ The fast_float library provides a performance similar to that of the [fast_doubl

## Users

The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the number parsing speed by two or three times. It is also used by [Yandex ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet).
The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the number parsing speed by two or three times. It is also used by [ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet). It is part of GCC (as of GCC 12). It is part of WebKit (Safari).


## How fast is it?
Expand Down Expand Up @@ -330,7 +371,11 @@ the command line help.

You may directly download automatically generated single-header files:

https://github.com/fastfloat/fast_float/releases/download/v5.3.0/fast_float.h
https://github.com/fastfloat/fast_float/releases/download/v6.0.0/fast_float.h

## RFC 7159

If you need support for RFC 7159 (JSON standard), you may want to consider using the [fast_double_parser](https://github.com/lemire/fast_double_parser/) library instead.

## Credit

Expand Down
107 changes: 104 additions & 3 deletions include/fast_float/ascii_number.h
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#include <cstdint>
#include <cstring>
#include <iterator>
#include <limits>
#include <type_traits>

#include "float_common.h"
Expand Down Expand Up @@ -115,7 +116,7 @@ FASTFLOAT_SIMD_RESTORE_WARNINGS
#if defined(_MSC_VER) && _MSC_VER <= 1900
template <typename UC>
#else
template <typename UC, FASTFLOAT_ENABLE_IF(!has_simd_opt<UC>())>
template <typename UC, FASTFLOAT_ENABLE_IF(!has_simd_opt<UC>()) = 0>
#endif
// dummy for compile
uint64_t simd_read8_to_u64(UC const*) {
Expand Down Expand Up @@ -223,15 +224,15 @@ FASTFLOAT_SIMD_RESTORE_WARNINGS
#if defined(_MSC_VER) && _MSC_VER <= 1900
template <typename UC>
#else
template <typename UC, FASTFLOAT_ENABLE_IF(!has_simd_opt<UC>())>
template <typename UC, FASTFLOAT_ENABLE_IF(!has_simd_opt<UC>()) = 0>
#endif
// dummy for compile
bool simd_parse_if_eight_digits_unrolled(UC const*, uint64_t&) {
return 0;
}


template <typename UC, FASTFLOAT_ENABLE_IF(!std::is_same<UC, char>::value)>
template <typename UC, FASTFLOAT_ENABLE_IF(!std::is_same<UC, char>::value) = 0>
fastfloat_really_inline FASTFLOAT_CONSTEXPR20
void loop_parse_if_eight_digits(const UC*& p, const UC* const pend, uint64_t& i) {
if (!has_simd_opt<UC>()) {
Expand Down Expand Up @@ -439,6 +440,106 @@ parsed_number_string_t<UC> parse_number_string(UC const *p, UC const * pend, par
return answer;
}

template <typename T, typename UC>
fastfloat_really_inline FASTFLOAT_CONSTEXPR20
from_chars_result_t<UC> parse_int_string(UC const* p, UC const* pend, T& value, int base)
{
from_chars_result_t<UC> answer;

UC const* const first = p;

bool negative = (*p == UC('-'));
if (!std::is_signed<T>::value && negative) {
answer.ec = std::errc::invalid_argument;
answer.ptr = first;
return answer;
}
#ifdef FASTFLOAT_ALLOWS_LEADING_PLUS // disabled by default
if ((*p == UC('-')) || (*p == UC('+'))) {
#else
if (*p == UC('-')) {
#endif
++p;
}

UC const* const start_num = p;
while (*p == UC('0')) {
++p;
}
const bool has_leading_zeros = p > start_num;

UC const* const start_digits = p;

uint64_t i = 0;
if (base == 10) {
loop_parse_if_eight_digits(p, pend, i); // use SIMD if possible
}
while (p != pend) {
uint8_t digit = ch_to_digit(*p);
if (digit >= base) {
break;
}
i = uint64_t(base) * i + digit; // might overflow, check this later
p++;
}

size_t digit_count = size_t(p - start_digits);

if (digit_count == 0) {
if (has_leading_zeros) {
value = 0;
answer.ec = std::errc();
answer.ptr = p;
}
else {
answer.ec = std::errc::invalid_argument;
answer.ptr = first;
}
return answer;
}

answer.ptr = p;

// check u64 overflow
size_t max_digits = max_digits_u64(base);
if (digit_count > max_digits) {
answer.ec = std::errc::result_out_of_range;
return answer;
}
// this check can be eliminated for all other types, but they will all require a max_digits(base) equivalent
if (digit_count == max_digits && i < min_safe_u64(base)) {
answer.ec = std::errc::result_out_of_range;
return answer;
}

// check other types overflow
if (!std::is_same<T, uint64_t>::value) {
if (i > uint64_t(std::numeric_limits<T>::max()) + uint64_t(negative)) {
answer.ec = std::errc::result_out_of_range;
return answer;
}
}

if (negative) {
#ifdef FASTFLOAT_VISUAL_STUDIO
#pragma warning(push)
#pragma warning(disable: 4146)
#endif
// this weird workaround is required because:
// - converting unsigned to signed when its value is greater than signed max is UB pre-C++23.
// - reinterpret_casting (~i + 1) would work, but it is not constexpr
// this is always optimized into a neg instruction.
value = T(-std::numeric_limits<T>::max() - T(i - std::numeric_limits<T>::max()));
#ifdef FASTFLOAT_VISUAL_STUDIO
#pragma warning(pop)
#endif
}
else { value = T(i); }

answer.ec = std::errc();
return answer;
}

} // namespace fast_float

#endif
8 changes: 7 additions & 1 deletion include/fast_float/fast_float.h
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ namespace fast_float {
* to determine whether we allow the fixed point and scientific notation respectively.
* The default is `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
*/
template<typename T, typename UC = char>
template<typename T, typename UC = char, typename = FASTFLOAT_ENABLE_IF(is_supported_float_type<T>())>
FASTFLOAT_CONSTEXPR20
from_chars_result_t<UC> from_chars(UC const * first, UC const * last,
T &value, chars_format fmt = chars_format::general) noexcept;
Expand All @@ -36,6 +36,12 @@ template<typename T, typename UC = char>
FASTFLOAT_CONSTEXPR20
from_chars_result_t<UC> from_chars_advanced(UC const * first, UC const * last,
T &value, parse_options_t<UC> options) noexcept;
/**
* from_chars for integer types.
*/
template <typename T, typename UC = char, typename = FASTFLOAT_ENABLE_IF(!is_supported_float_type<T>())>
FASTFLOAT_CONSTEXPR20
from_chars_result_t<UC> from_chars(UC const * first, UC const * last, T& value, int base = 10) noexcept;

} // namespace fast_float
#include "parse_number.h"
Expand Down
85 changes: 81 additions & 4 deletions include/fast_float/float_common.h
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ using parse_options = parse_options_t<char>;
#define FASTFLOAT_NEON 1
#endif

#if defined(FASTFLOAT_SSE2) || defined(FASTFLOAT_ARM64)
#if defined(FASTFLOAT_SSE2) || defined(FASTFLOAT_NEON)
#define FASTFLOAT_HAS_SIMD 1
#endif

Expand Down Expand Up @@ -173,7 +173,7 @@ using parse_options = parse_options_t<char>;
// rust style `try!()` macro, or `?` operator
#define FASTFLOAT_TRY(x) { if (!(x)) return false; }

#define FASTFLOAT_ENABLE_IF(...) typename std::enable_if<(__VA_ARGS__), int>::type = 0
#define FASTFLOAT_ENABLE_IF(...) typename std::enable_if<(__VA_ARGS__), int>::type


namespace fast_float {
Expand All @@ -186,6 +186,20 @@ fastfloat_really_inline constexpr bool cpp20_and_in_constexpr() {
#endif
}

template <typename T>
fastfloat_really_inline constexpr bool is_supported_float_type() {
return std::is_same<T, float>::value || std::is_same<T, double>::value;
}

template <typename UC>
fastfloat_really_inline constexpr bool is_supported_char_type() {
return
std::is_same<UC, char>::value ||
std::is_same<UC, wchar_t>::value ||
std::is_same<UC, char16_t>::value ||
std::is_same<UC, char32_t>::value;
}

// Compares two ASCII strings in a case insensitive manner.
template <typename UC>
inline FASTFLOAT_CONSTEXPR14 bool
Expand Down Expand Up @@ -270,10 +284,10 @@ uint64_t umul128_generic(uint64_t ab, uint64_t cd, uint64_t *hi) {
uint64_t ad = emulu((uint32_t)(ab >> 32), (uint32_t)cd);
uint64_t bd = emulu((uint32_t)ab, (uint32_t)cd);
uint64_t adbc = ad + emulu((uint32_t)ab, (uint32_t)(cd >> 32));
uint64_t adbc_carry = !!(adbc < ad);
uint64_t adbc_carry = (uint64_t)(adbc < ad);
uint64_t lo = bd + (adbc << 32);
*hi = emulu((uint32_t)(ab >> 32), (uint32_t)(cd >> 32)) + (adbc >> 32) +
(adbc_carry << 32) + !!(lo < bd);
(adbc_carry << 32) + (uint64_t)(lo < bd);
return lo;
}

Expand Down Expand Up @@ -674,6 +688,69 @@ constexpr char32_t const * str_const_inf<char32_t>()
{
return U"infinity";
}


template <typename = void>
struct int_luts {
static constexpr uint8_t chdigit[] = {
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 255, 255, 255, 255, 255,
255, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 255, 255, 255, 255, 255,
255, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255
};

static constexpr size_t maxdigits_u64[] = {
64, 41, 32, 28, 25, 23, 22, 21,
20, 19, 18, 18, 17, 17, 16, 16,
16, 16, 15, 15, 15, 15, 14, 14,
14, 14, 14, 14, 14, 13, 13, 13,
13, 13, 13
};

static constexpr uint64_t min_safe_u64[] = {
9223372036854775808ull, 12157665459056928801ull, 4611686018427387904, 7450580596923828125, 4738381338321616896,
3909821048582988049, 9223372036854775808ull, 12157665459056928801ull, 10000000000000000000ull, 5559917313492231481,
2218611106740436992, 8650415919381337933, 2177953337809371136, 6568408355712890625, 1152921504606846976,
2862423051509815793, 6746640616477458432, 15181127029874798299ull, 1638400000000000000, 3243919932521508681,
6221821273427820544, 11592836324538749809ull, 876488338465357824, 1490116119384765625, 2481152873203736576,
4052555153018976267, 6502111422497947648, 10260628712958602189ull, 15943230000000000000ull, 787662783788549761,
1152921504606846976, 1667889514952984961, 2386420683693101056, 3379220508056640625, 4738381338321616896
};
};

template <typename T>
constexpr uint8_t int_luts<T>::chdigit[];

template <typename T>
constexpr size_t int_luts<T>::maxdigits_u64[];

template <typename T>
constexpr uint64_t int_luts<T>::min_safe_u64[];

template <typename UC>
fastfloat_really_inline
constexpr uint8_t ch_to_digit(UC c) { return int_luts<>::chdigit[static_cast<unsigned char>(c)]; }

fastfloat_really_inline
constexpr size_t max_digits_u64(int base) { return int_luts<>::maxdigits_u64[base - 2]; }

// If a u64 is exactly max_digits_u64() in length, this is
// the value below which it has definitely overflowed.
fastfloat_really_inline
constexpr uint64_t min_safe_u64(int base) { return int_luts<>::min_safe_u64[base - 2]; }

} // namespace fast_float

#endif
Loading

0 comments on commit 620376a

Please sign in to comment.