Update implementation
1. Conform to the standard library
2. Update documentation
3. Fix warnings with MSVC STL
HenryAWE committed Oct 16, 2024
1 parent 7280859 commit fc92038
Showing 12 changed files with 147 additions and 41 deletions.
7 changes: 7 additions & 0 deletions doc/en/build.md
@@ -16,3 +16,10 @@ Run `build.sh` from the project root directory. For Windows, run `build.ps1` ins

## Install the Library
Run `cmake --install` in the `build/` directory.

## Use the Library in `CMakeLists.txt`
```cmake
find_package(Papilio REQUIRED)
target_link_libraries(main PRIVATE papilio::papilio)
```
9 changes: 5 additions & 4 deletions doc/en/builtin_accessor.md
@@ -12,9 +12,9 @@ Example: Given string `"hello world!"`, then
- `[-1]`: Returns `'!'`.
- `[-100]`: Returns null character.

### Slicing
The left-closed, right-open interval `[begin, end)` consists of index pair `begin`, `end`
Default values are `0` and `.length`, respectively
### Indexing by Range
The left-closed, right-open range `[begin, end)` is defined by the index pair `begin`, `end`. The default values are `0` and `.length`, respectively.

Example: Given string "hello world!"
- `[:]`: Returns `"hello world!"`
- `[:-1]`: Returns `"hello world"`
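
Note on the range semantics above: a minimal standard C++ sketch of `[begin:end]` resolution with negative indices and defaults. It is not Papilio's implementation. `slice` is a hypothetical helper, it indexes bytes rather than characters (so it only matches the documented behaviour for ASCII input), and clamping out-of-range indices is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Papilio): resolves a half-open range
// [begin, end) where a missing bound means "start" or "end of string"
// and a negative bound counts from the end.
std::string_view slice(
    std::string_view s,
    std::optional<long> begin = std::nullopt,
    std::optional<long> end = std::nullopt
) {
    const long n = static_cast<long>(s.size());

    // Map a possibly negative index into [0, n]. Clamping is an assumption.
    auto resolve = [n](long idx) {
        if(idx < 0)
            idx += n;
        return std::clamp(idx, 0L, n);
    };

    const long b = resolve(begin.value_or(0));
    const long e = resolve(end.value_or(n));
    if(b >= e)
        return {}; // Empty range

    return s.substr(static_cast<std::size_t>(b), static_cast<std::size_t>(e - b));
}
```

With this sketch, `slice("hello world!", std::nullopt, -1)` yields `"hello world"`, matching the `[:-1]` entry in the list.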
@@ -24,7 +24,8 @@ Example: Given string "hello world!"
### Attributes
- `size`: The number of *elements* in the string. That is, the string is regarded as a container whose value type is `char` (or another character type), and the result is its number of elements.
- `length`: The number of *characters* in the string.
For string containing non-ASCII characters, these two values may not be equal. For example, the `size` of string `"ü"` is `2`, but its `length` is `1`; for string `L"ü"` (`wchar_t` string), its `size` and `length` are both `1`.
For strings containing non-ASCII characters, these two values may not be equal.
For example, the `size` of string `"ü"` is `2`, but its `length` is `1`. For string `L"ü"` (`wchar_t` string), its `size` and `length` are both `1`.
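
Note on `size` versus `length`: the difference can be reproduced in standard C++ for a UTF-8 encoded `char` string. This is only a sketch. `utf8_length` is a hypothetical helper, not a Papilio function, and it assumes well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Papilio): counts code points in a
// well-formed UTF-8 string by skipping continuation bytes (10xxxxxx).
std::size_t utf8_length(std::string_view s)
{
    std::size_t count = 0;
    for(unsigned char b : s)
    {
        if((b & 0xC0) != 0x80)
            ++count; // Lead byte or ASCII byte starts a new character
    }
    return count;
}
```

For `"\xC3\xBC"` (the UTF-8 encoding of `ü`), `std::string_view::size()` reports 2 elements while `utf8_length` reports 1 character.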

## Tuples (`tuple` and `pair`)
### Indexing by Integer
43 changes: 40 additions & 3 deletions doc/en/builtin_formatter.md
@@ -1,11 +1,18 @@
# Built-In Formatter
The format specification of most built-in formatters is compatible with the usage of the corresponding parts of the standard library `<format>`.
See the [standard library documentation](https://en.cppreference.com/w/cpp/utility/format/spec) for a more detailed explanation.

## Format Specification for Common Types
Used by fundamental types, characters, and strings.
```
fill-and-align sign # 0 width .precision L type
```
These arguments are all optional.

- **Note:**
In most cases, the syntax is similar to the C-style `%` formatting of the `printf` family, with the addition of `{}` and with `:` used instead of `%`.
For example, `%03.2f` can be translated to `{:03.2f}`.

### Fill and Align
Fill can be any character, followed by an align option, which is one of `<`, `>`, and `^`.
Align Option:
@@ -72,6 +79,7 @@ This option is only available for some types. It may cause the output to be affe
### Type
#### String
- None, `s`: Copy the string to the output.
- `?`: Copy escaped string (see below) to the output.
#### Integral Type (Except `bool` type)
- `b`: Binary output.
@@ -83,6 +91,7 @@ This option is only available for some types. It may cause the output to be affe
#### Character Type
- None, `c`: Copy the character to the output.
- `?`: Copy the escaped character (see below) to the output.
- `b`, `B`, `d`, `o`, `x`, `X`: Format the character as an integer, using the value `static_cast<std::uint32_t>(value)`.
#### `bool` Type
@@ -106,6 +115,34 @@ Infinite values and NaN are formatted to `inf` and `nan`, respectively.
- None, `s`: Copy the corresponding string of the enumeration value to the output.
- `b`, `B`, `d`, `o`, `x`, `X`: Format the enumeration value as an integer, using the value `static_cast<std::underlying_type_t<Enum>>(value)`.
Note: The `enum_name` function defined in `<papilio/utility.hpp>` uses compiler extension to retrieve string from enumeration value. It has following limitations:
1. Only support enumeration values within the `[-128, 128)` range.
2. Output result of multiple enumerations with same value is compiler dependent.
Note: The `enum_name` function defined in `<papilio/utility.hpp>` uses a compiler extension to retrieve the string from an enumeration value. It has the following limitations:
1. It requires a compiler extension. If the compiler supports it, the implementation defines the `PAPILIO_HAS_ENUM_NAME` macro.
2. It only supports enumeration values within the `[-128, 128)` range.
3. The output for multiple enumerators with the same value is compiler-dependent.
# Formatting escaped characters and strings
A character or string can be formatted as escaped to make it more suitable for debugging or for logging.
For a character `C`:
- If `C` is one of the characters in the following table, the corresponding escape sequence is used.
| Character | Escape sequence | Notes |
| --------------- | --------------- | ----------------------------------------------- |
| Horizontal tab | `\t` | |
| New line | `\n` | |
| Carriage return | `\r` | |
| Double quote | `\"` | Used only if the output is double-quoted string |
| Single quote | `\'` | Used only if the output is single-quoted string |
| Backslash | `\\` | |
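- If `C` and the following characters form a sequence that is not printable, the sequence is written as `\u{hex}`, where `hex` is its value in hexadecimal.
- If `C` and the following characters cannot form a valid code point, the invalid sequence is written as `\x{hex}`, using the hexadecimal digits of each invalid code unit.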
## Example
```c++
papilio::format("{:?} {:?}", '\'', '"'); // Returns "\\' \""
papilio::format("{:?}", "hello\n"); // Returns "hello\\n"
papilio::format("{:?}", std::string("\0 \n \t \x02 \x1b", 9)); // Returns "\\u{0} \\n \\t \\u{2} \\u{1b}"
// Invalid UTF-8
papilio::format("{:?}", "\xc3\x28"); // Returns "\\x{c3}("
```
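
Note on the rules above: they can be sketched in standard C++ without Papilio. This is a simplified illustration, not the library's implementation. `escape_for_debug` is a hypothetical name, the output is always treated as a double-quoted string, only control characters below U+0020 count as non-printable, and the UTF-8 check looks only at the lead and continuation byte patterns (no overlong or surrogate checks).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Papilio): escapes a UTF-8 string
// following the documented rules, in a simplified form.
std::string escape_for_debug(std::string_view s)
{
    // Builds "\u{hex}" or "\x{hex}" depending on the prefix.
    auto hex = [](const char* prefix, unsigned v) {
        char buf[16];
        std::snprintf(buf, sizeof buf, "%s{%x}", prefix, v);
        return std::string(buf);
    };

    std::string out;
    std::size_t i = 0;
    while(i < s.size())
    {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if(b < 0x80)
        {
            switch(b)
            {
            case '\t': out += "\\t"; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '"': out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            default:
                if(b < 0x20)
                    out += hex("\\u", b); // Valid but not printable
                else
                    out += static_cast<char>(b);
                break;
            }
            ++i;
            continue;
        }

        // Expected length of the sequence, judged from the lead byte only.
        const std::size_t n =
            (b & 0xE0) == 0xC0 ? 2 :
            (b & 0xF0) == 0xE0 ? 3 :
            (b & 0xF8) == 0xF0 ? 4 : 0;
        bool ok = n != 0 && i + n <= s.size();
        for(std::size_t k = 1; ok && k < n; ++k)
            ok = (static_cast<unsigned char>(s[i + k]) & 0xC0) == 0x80;

        if(ok)
        {
            out.append(s.substr(i, n)); // Copy the well-formed sequence
            i += n;
        }
        else
        {
            out += hex("\\x", b); // Invalid code unit
            ++i;
        }
    }
    return out;
}
```

For the invalid UTF-8 input `"\xc3\x28"`, this sketch produces `\x{c3}(`, the same text the example above documents for `{:?}`.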
2 changes: 1 addition & 1 deletion doc/en/formatter.md
@@ -126,7 +126,7 @@ format("{:S}", used_adl_ex{}); // Returns "ADL (EX)"
```

# Overloaded `operator<<`
If a type does not implement the above format methods, but it has a overloaded `operator<<` of traditional C++, that overload will be used for outputting.
If a type does not implement any of the above formatting methods but has an overloaded `operator<<` for legacy stream output, that overload will be used for output.

# Disabled Formatter
Explicitly prevent a type from being formatted:
7 changes: 7 additions & 0 deletions doc/zh-CN/build.md
@@ -16,3 +16,10 @@

## Install the Library
Run `cmake --install .` in the `build/` directory.

## Use the Library in `CMakeLists.txt`
```cmake
find_package(Papilio REQUIRED)
target_link_libraries(main PRIVATE papilio::papilio)
```
10 changes: 5 additions & 5 deletions doc/zh-CN/builtin_accessor.md
@@ -11,11 +11,10 @@
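- `[-1]`: Returns `'!'`
- `[-100]`: Returns null character

### Slicing
The left-closed, right-open interval `[begin, end)` formed by the index pair `begin`, `end`
Default values are `0` and `.length`, respectively
Example: Given string `"hello world!"`
### Indexing by Range
The left-closed, right-open interval `[begin, end)` formed by the index pair `begin`, `end`. Default values are `0` and `.length`, respectively.

Example: Given string `"hello world!"`
- `[:]`: Returns `"hello world!"`
- `[:-1]`: Returns `"hello world"`
- `[6:-1]`: Returns `"world"`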
@@ -24,7 +23,8 @@
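### Attributes
- `size`: The number of **elements** in the string. That is, the string is regarded as a container whose value type is `char` (or another character type), and the result is its number of elements.
- `length`: The number of **characters** in the string.
For strings containing non-ASCII characters, these two values may not be equal. For example, the `size` of string `"ü"` is `2`, while its `length` is `1`; for string `L"ü"` (`wchar_t` string), its `size` and `length` are both `1`.
For strings containing non-ASCII characters, these two values may not be equal.
For example, the `size` of string `"ü"` is `2`, while its `length` is `1`; for string `L"ü"` (`wchar_t` string), its `size` and `length` are both `1`.

## Tuples (`tuple` and `pair`)
### Indexing by Integer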
43 changes: 40 additions & 3 deletions doc/zh-CN/builtin_formatter.md
@@ -1,11 +1,18 @@
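# Built-In Formatter
The format specifications of the vast majority of built-in formatters are compatible with the usage of the corresponding parts of the standard library `<format>`.
See the [standard library documentation](https://zh.cppreference.com/w/cpp/utility/format/spec) for a more detailed explanation.

## Format Specification for Common Types
Used by fundamental types, characters, and strings.
```
fill-and-align sign # 0 width .precision L type
```
These arguments are all optional.

- **Note:**
In most cases, this syntax is similar to C-style (`printf` family) `%` formatting, with only `{}` added and `:` used in place of `%`.
For example, `%03.2f` can be translated to `{:03.2f}`.

### Fill and Align
Fill can be any character, followed by an align option, which is one of `<`, `>`, and `^`.
Align Option: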
@@ -72,6 +79,7 @@ papilio::format("{:.<5.5s}", "文文文"); // "文文."
### Type
#### String
- None, `s`: Copy the string to the output
- `?`: Copy the escaped string (see below) to the output
#### Integral Type (Except `bool` type)
- `b`: Binary output
@@ -83,6 +91,7 @@ papilio::format("{:.<5.5s}", "文文文"); // "文文."
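#### Character Type
- None, `c`: Copy the character to the output
- `?`: Copy the escaped character (see below) to the output
- `b`, `B`, `d`, `o`, `x`, `X`: Represent the character as an integer with the value `static_cast<std::uint32_t>(value)`
#### `bool` Type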
@@ -106,6 +115,34 @@ papilio::format("{:.<5.5s}", "文文文"); // "文文."
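- None, `s`: Copy the string corresponding to the enumeration value to the output
- `b`, `B`, `d`, `o`, `x`, `X`: Use the integer representation with the value `static_cast<std::underlying_type_t<Enum>>(value)`
Note: The `enum_name` function defined in `<papilio/utility.hpp>` uses a compiler extension to retrieve the string from an enumeration value. It has the following limitations:
1. Only enumeration values within the `[-128, 128)` range are supported
2. The output for multiple enumerators with the same value depends on the compiler
Note: The `enum_name` function defined in `<papilio/utility.hpp>` uses a compiler extension to retrieve the string from an enumeration value. It has the following limitations:
1. A compiler extension is required. If the compiler supports it, the implementation defines the `PAPILIO_HAS_ENUM_NAME` macro
2. Only enumeration values within the `[-128, 128)` range are supported
3. The output for multiple enumerators with the same value depends on the compiler
# Formatting Escaped Characters and Strings
A character or string can be escaped during formatting to make it more suitable for debugging or logging.
For a character `C`:
- If `C` is one of the characters in the following table, the corresponding escape sequence is used:
| Character       | Escape sequence | Notes                                               |
| --------------- | --------------- | --------------------------------------------------- |
| Horizontal tab  | `\t`            |                                                     |
| New line        | `\n`            |                                                     |
| Carriage return | `\r`            |                                                     |
| Double quote    | `\"`            | Used only if the output is a double-quoted string   |
| Single quote    | `\'`            | Used only if the output is a single-quoted string   |
| Backslash       | `\\`            |                                                     |
- If `C` and the following characters form a sequence that is not printable, the sequence is written as `\u{hex}`, where `hex` is its value in hexadecimal.
- If `C` and the following characters cannot form a valid code point, the invalid sequence is written as `\x{hex}`, using the hexadecimal digits of each invalid code unit.
## Example
```c++
papilio::format("{:?} {:?}", '\'', '"'); // Returns "\\' \""
papilio::format("{:?}", "hello\n"); // Returns "hello\\n"
papilio::format("{:?}", std::string("\0 \n \t \x02 \x1b", 9)); // Returns "\\u{0} \\n \\t \\u{2} \\u{1b}"
// Invalid UTF-8
papilio::format("{:?}", "\xc3\x28"); // Returns "\\x{c3}("
```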
2 changes: 1 addition & 1 deletion doc/zh-CN/formatter.md
@@ -126,7 +126,7 @@ format("{:S}", used_adl_ex{}); // Returns "ADL (EX)"
```

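# Overloaded `operator<<`
If a type implements none of the above formatting methods but provides a traditional C++ `operator<<` overload, that overload will be used for output.
If a type implements none of the above formatting methods but provides a legacy `operator<<` overload for stream output, that overload will be used for output.

# Disabled Formatter
Explicitly prevent a type from being formatted: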
50 changes: 31 additions & 19 deletions include/papilio/core.hpp
@@ -2251,13 +2251,24 @@ class format_context_traits
format_context_traits() = delete;

private:
static void append_hex_digits(context_type& ctx, int_type val)
static void append_hex_digits(context_type& ctx, int_type val, bool is_valid)
{
format_to(
ctx,
PAPILIO_TSTRING_VIEW(char_type, "\\u{{{:x}}}"),
val
);
if(is_valid)
{
format_to(
ctx,
PAPILIO_TSTRING_VIEW(char_type, "\\u{{{:x}}}"),
val
);
}
else
{
format_to(
ctx,
PAPILIO_TSTRING_VIEW(char_type, "\\x{{{:x}}}"),
val
);
}
}

template <bool DoubleQuote, bool SingleQuote>
@@ -2267,7 +2278,7 @@ class format_context_traits
{
default:
other_ch:
append_hex_digits(ctx, val);
append_hex_digits(ctx, val, true);
break;

case '\t':
@@ -2316,7 +2327,8 @@ class format_context_traits
val == '\r' ||
val == '\\' ||
(DoubleQuote && val == '"') ||
(SingleQuote && val == '\'');
(SingleQuote && val == '\'') ||
val < U' ';
}

public:
@@ -2459,12 +2471,12 @@ class format_context_traits
{
if(PAPILIO_NS utf::is_leading_byte(str[i]))
{
std::uint8_t size_bytes = utf::byte_count(str[i]);
std::uint8_t size_bytes = PAPILIO_NS utf::byte_count(str[i]);
if(i + size_bytes > str.size())
{
for(std::size_t j = i; j < str.size(); ++j)
{
append_hex_digits(ctx, static_cast<std::uint8_t>(str[j]));
append_hex_digits(ctx, static_cast<std::uint8_t>(str[j]), false);
}
return;
}
Expand Down Expand Up @@ -2497,7 +2509,7 @@ class format_context_traits

for(auto it = str.begin() + i; it != stop; ++it)
{
append_hex_digits(ctx, static_cast<std::uint8_t>(*it));
append_hex_digits(ctx, static_cast<std::uint8_t>(*it), false);
}

i += std::distance(str.begin() + i, stop);
@@ -2508,7 +2520,7 @@ class format_context_traits
}
else
{
append_hex_digits(ctx, str[i]);
append_hex_digits(ctx, str[i], false);
++i;
}
}
@@ -2526,16 +2538,16 @@ class format_context_traits
append_as_esc_seq<true, false>(ctx, ch);
++i;
}
else if(utf::is_high_surrogate(ch))
else if(PAPILIO_NS utf::is_high_surrogate(ch))
{
if(i + 1 >= str.size())
{
append_hex_digits(ctx, ch);
append_hex_digits(ctx, ch, false);
return;
}
else if(!utf::is_low_surrogate(str[i + 1]))
else if(!PAPILIO_NS utf::is_low_surrogate(str[i + 1]))
{
append_hex_digits(ctx, ch);
append_hex_digits(ctx, ch, false);
}
else
{
@@ -2546,9 +2558,9 @@ class format_context_traits
}
else
{
if(utf::is_low_surrogate(ch))
if(PAPILIO_NS utf::is_low_surrogate(ch))
{
append_hex_digits(ctx, ch);
append_hex_digits(ctx, ch, false);
}
else
{
@@ -2586,7 +2598,7 @@ class format_context_traits
* @code{.cpp}
* append_escaped(ctx, "hello\t"); // Appends "hello\\t"
* // Invalid UTF-8
* append_escaped(ctx, "\xc3\x28"); // Appends "\u{c3}("
* append_escaped(ctx, "\xc3\x28"); // Appends "\x{c3}("
* @endcode
*/
static void append_escaped(context_type& ctx, string_view_type str)
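
Note on the UTF-16 hunks above: the surrogate handling in this file passes `false` to `append_hex_digits` for code units that do not form a valid pair. A standalone sketch of that detection in standard C++ follows. The helper names here are local to the sketch and only mirror the `utf::is_high_surrogate` / `utf::is_low_surrogate` names used in the diff. `lone_surrogates` is hypothetical and reports indices instead of writing escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Local stand-ins for the library's surrogate predicates.
constexpr bool is_high_surrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: returns the indices of UTF-16 code units that are
// unpaired surrogates, i.e. the units that would be reported as invalid.
std::vector<std::size_t> lone_surrogates(std::u16string_view s)
{
    std::vector<std::size_t> result;
    std::size_t i = 0;
    while(i < s.size())
    {
        if(is_high_surrogate(s[i]))
        {
            if(i + 1 < s.size() && is_low_surrogate(s[i + 1]))
            {
                i += 2; // Valid surrogate pair, skip both units
                continue;
            }
            result.push_back(i); // High surrogate without a low one
        }
        else if(is_low_surrogate(s[i]))
        {
            result.push_back(i); // Low surrogate without a preceding high one
        }
        ++i;
    }
    return result;
}
```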
8 changes: 4 additions & 4 deletions include/papilio/utf/string.hpp
@@ -985,16 +985,16 @@ class basic_string_container<CharT> : public detail::str_impl<CharT, basic_strin
string_type buf;
if constexpr(std::ranges::sized_range<R>)
buf.reserve(std::ranges::size(r));
for(CharT ch : r)
buf.push_back(ch);
for(auto&& ch : r)
buf.push_back(static_cast<CharT>(ch));

assign(std::move(buf));
}
else if constexpr(std::convertible_to<std::ranges::range_reference_t<R>, utf::codepoint>)
{
string_type buf;
for(utf::codepoint cp : r)
cp.append_to(buf);
for(auto&& cp : r)
utf::codepoint(cp).append_to(buf);

assign(std::move(buf));
}
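
Note on the loop change above: iterating with `auto&&` and converting explicitly with `static_cast<CharT>` makes the element conversion visible at the call site. Given the commit message, this is presumably what addresses the MSVC STL warnings, although the page does not say which warning was involved. A minimal standalone sketch of the same pattern, with `collect` as a hypothetical function name:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Papilio): copies any range whose
// elements are convertible to CharT into a string, converting each
// element explicitly instead of relying on an implicit conversion.
template <typename CharT, typename Range>
std::basic_string<CharT> collect(const Range& r)
{
    std::basic_string<CharT> buf;
    for(auto&& ch : r)
        buf.push_back(static_cast<CharT>(ch));
    return buf;
}
```

For example, `collect<char>(std::vector<int>{104, 105})` builds the string `"hi"`.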
2 changes: 1 addition & 1 deletion test/test_core/format_context.cpp
@@ -134,7 +134,7 @@ TYPED_TEST(format_context_suite, append_escaped)
{
context_t::append_escaped(ctx, reinterpret_cast<const TypeParam*>("\xc3\x28"));

const auto expected_str = PAPILIO_TSTRING(TypeParam, "\\u{c3}(");
const auto expected_str = PAPILIO_TSTRING(TypeParam, "\\x{c3}(");
EXPECT_EQ(result, expected_str);
}
}