Add chapter on Lee and Ready algorithm + proofreading 🔢 (#154)
Addresses #10 and #9
KarelZe authored Feb 12, 2023
1 parent e4ed15b commit 3ef3b29
Showing 40 changed files with 348 additions and 1,957 deletions.
2 changes: 1 addition & 1 deletion references/obsidian/.obsidian/graph.json
@@ -17,6 +17,6 @@
"repelStrength": 10,
"linkStrength": 1,
"linkDistance": 250,
"scale": 1.476986098068651,
"scale": 2.070709249745304,
"close": true
}
60 changes: 45 additions & 15 deletions references/obsidian/.obsidian/workspace.json
@@ -6,15 +6,46 @@
{
"id": "2785db0e51818bf2",
"type": "tabs",
"dimension": 49.568965517241374,
"dimension": 26.791013141161507,
"children": [
{
"id": "94415710fad4ab13",
"type": "leaf",
"state": {
"type": "markdown",
"state": {
"file": "📖chapters/🔢Tick Test.md",
"file": "📑notes/🔢Hybrid rules notes.md",
"mode": "source",
"source": false
}
}
},
{
"id": "e87aa9ab943386a7",
"type": "leaf",
"state": {
"type": "markdown",
"state": {
"file": "📖chapters/🔢LR Algorithm.md",
"mode": "source",
"source": false
}
}
}
]
},
{
"id": "8ced9dafaca4ec30",
"type": "tabs",
"dimension": 26.791013141161507,
"children": [
{
"id": "48b0c3f58330dbd5",
"type": "leaf",
"state": {
"type": "markdown",
"state": {
"file": "📑notes/🔢Hybrid rules notes.md",
"mode": "source",
"source": false
}
@@ -25,7 +56,7 @@
{
"id": "98f847228f79bc8c",
"type": "tabs",
"dimension": 50.431034482758626,
"dimension": 46.41797371767698,
"children": [
{
"id": "f35a250acbe1a2dc",
@@ -79,8 +110,7 @@
}
],
"direction": "horizontal",
"width": 399.5110321044922,
"collapsed": true
"width": 399.5110321044922
},
"right": {
"id": "9db2ac4605daf425",
@@ -96,7 +126,7 @@
"state": {
"type": "backlink",
"state": {
"file": "📖chapters/🔢Tick Test.md",
"file": "📑notes/🔢Hybrid rules notes.md",
"collapseAll": false,
"extraContext": false,
"sortOrder": "alphabetical",
@@ -124,7 +154,7 @@
"state": {
"type": "outline",
"state": {
"file": "📖chapters/🔢Tick Test.md"
"file": "📑notes/🔢Hybrid rules notes.md"
}
}
},
@@ -157,15 +187,15 @@
},
"active": "94415710fad4ab13",
"lastOpenFiles": [
"📑notes/🍪Selection of supervised approaches notes.md",
"📖chapters/🔢Tick Test.md",
"📑notes/🔢Tick test notes.md",
"📑notes/🔢Hybrid rules notes.md",
"❓Questions.md",
"📑notes/🔢Basic rules notes.md",
"📖chapters/🔢Basic rules.md",
"📖chapters/🔢Quote Rule.md",
"📖chapters/🔢Hybrid rules.md",
"📑notes/🔢Hybrid rules notes.md",
"📑notes/🔢LR algorithm notes.md",
"🖼️Media/pseudocode-of-algorithms.png",
"📖chapters/🔢Hybrid rules.md"
"📖chapters/🔢Basic rules.md",
"🖼️Media/visualization-of-quote-and-tick.png",
"📖chapters/🔢LR Algorithm.md",
"📖chapters/🔢Tick Test.md",
"📖chapters/🏅Feature importance measure.md"
]
}
3 changes: 2 additions & 1 deletion references/obsidian/❓Questions.md
@@ -1,6 +1,7 @@
## Open

- Progress slowed down lately. Currently, I have written roughly 20 pages (14 Transformer, 2 related work + other). However, I plan to revise some chapters / rewrite them from scratch, as the why remains unclear for the Transformers and some paragraphs are hard to understand.
- Progress slowed down lately. Currently, I have written roughly 29 pages (9 classical algorithms, 14 Transformer, 2 related work + other). However, I plan to revise the Transformer chapters / rewrite them from scratch, as the why remains unclear for the Transformers and some paragraphs are hard to understand.
- Ask about classification rules. Do you also want a short discussion of the different views on the trade initiator? Do you like the mix of formal definition, intuition etc.? Do you regard your stacking approach as another hybrid algorithm?
- Ask about scope of related work. Currently, trade classification in option markets (i) and trade classification with machine learning (ii). Already spent two days improving the chapters, but still not satisfied.
- Ask for CBOE for unlabelled data. Poses a major risk as I'm not sure regarding performance / training times etc.

1 change: 1 addition & 0 deletions references/obsidian/📑notes/🅰️attention notes.md
@@ -1,3 +1,4 @@
https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html

![[context-xl-transformer.png]]
(found in [[@daiTransformerXLAttentiveLanguage2019]])
25 changes: 23 additions & 2 deletions references/obsidian/📑notes/🔢Basic rules notes.md
@@ -1,8 +1,29 @@
Tags: #trade-classification

Denote sells with $0$ and buys with $1$.
In absence of the .

We start with the popular quote rule and tick test in Section ... and continue with two recent alternatives.

Because every trade has both a buyer and a seller, it is necessary to classify the “active” side of each option transaction.

- requires separating buys from sells in the raw trading data

While the information about the initiator of a trade is missing in public data sets ... we infer the

**Possible openings:**

“The improved ability to discern whether a trade was a buy order or a sell order is of particular importance” ([[@leeInferringTradeDirection1991]], p. 1)
“Therefore, trade classification rules (TCR) have been developed in order to classify trades as buyer- or seller-initiated, when the true originator is unknown” (Frömmel et al., 2021, p. 4)

“The trade indicator is a binary variable stating whether the buyer or seller of an asset has initiated the trade by submitting a market order or an immediately executed limit order.” ([[@frommelAccuracyTradeClassification2021]], p. 4)

“The goal of the trade side classification is to determine the initiator of the transaction and to classify trades as being either buyer or seller motivated. However, a formal definition of a trade initiator is rarely stated in the literature.” ([[@olbrysEvaluatingTradeSide2018]], p. 4)

“Trade classification rules (hereafter referred to as TCR) are intended to indicate the party to a trade who initiates a transaction. It may by either a buyer or a seller. Such indication made directly from the data is nowadays in mostly cases inaccessible, since the majority of public databases including transaction data do not contain information of trade initiators and trade direction.” ([[@nowakAccuracyTradeClassification2020]], p. 65)


“Methods of inferring trade direction can be classified as: tick tests, which use changes in trade prices; the quote method, which compares trade prices to quotes;” (Finucane, 2000, p. 557)

“The trade indicator is a binary variable stating whether the buyer or seller of an asset has initiated the trade by submitting a market order or an immediately executed limit order.” (Frömmel et al., 2021, p. 4)


<mark style="background: #FFB86CA6;">“Methods of inferring trade direction can be classified as: tick tests, which use changes in trade prices; the quote method, which compares trade prices to quotes;” (Finucane, 2000, p. 557)
18 changes: 17 additions & 1 deletion references/obsidian/📑notes/🔢CLNV method notes.md
@@ -1,4 +1,20 @@
Tags: #trade-classification #CLVN
Tags: #trade-classification #CLNV

Long form:
$$
  \begin{equation}
    \text{Trade}_{i,t}=
    \begin{cases}
      \operatorname{tick}(), & \text{if}\ p_{i, t} \in \left(a_{i, t}, \infty\right) \\
      1, & \text{if}\ p_{i, t} \in \left[\frac{3}{10} b_{i,t} + \frac{7}{10} a_{i,t}, a_{i, t}\right] \\
      \operatorname{tick}(), & \text{if}\ p_{i, t} \in \left(\frac{7}{10} b_{i,t} + \frac{3}{10} a_{i,t}, \frac{3}{10} b_{i,t} + \frac{7}{10} a_{i,t}\right) \\
      0, & \text{if}\ p_{i, t} \in \left[ b_{i,t}, \frac{7}{10} b_{i,t} + \frac{3}{10} a_{i,t}\right] \\
\operatorname{tick}(), & \text{if} \ p_{i, t} \in \left(-\infty, b_{i, t}\right) \\
    \end{cases}
  \end{equation}
$$
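
The case distinction above could be sketched in Python as follows. This is a minimal illustration, not code from the repository: the function name `clnv_rule` and the `tick_label` argument (the result of a separately computed tick test) are assumptions, and buys are encoded as $1$ and sells as $0$ as in the basic rules notes.

```python
from typing import Optional

BUY, SELL = 1, 0  # encoding taken from the basic rules notes


def clnv_rule(price: float, bid: float, ask: float,
              tick_label: Optional[int]) -> Optional[int]:
    """Sketch of the CLNV case distinction for a single trade.

    `tick_label` is assumed to be the tick-test result for this trade
    (hypothetical argument; it may be None if the tick test is undefined).
    """
    upper = 0.3 * bid + 0.7 * ask  # lower edge of the top 30 % of the spread
    lower = 0.7 * bid + 0.3 * ask  # upper edge of the bottom 30 % of the spread
    if upper <= price <= ask:
        return BUY            # close to the ask: classified as a buy
    if bid <= price <= lower:
        return SELL           # close to the bid: classified as a sell
    return tick_label         # middle of the spread or outside the quotes
```

With a bid of 10.0 and an ask of 11.0, a trade at 10.9 would be signed as a buy and a trade at 10.1 as a sell, while trades at 10.5, 11.2, or 9.8 would fall back to the tick test.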


Long form:
$$
12 changes: 6 additions & 6 deletions references/obsidian/📑notes/🔢EMO rule notes.md
@@ -24,15 +24,15 @@ Turn it into simple formula, instead of lengthy algorithm

Similar to LR algorithm in [[@carrionTradeSigningFast2020]]:

The tick rule (TICK) relies solely on trade prices for classifying trades and does not use any quote data. To classify trades, the tick rule compares the current trade price to the price of the preceding trade. A trade is classified as a buy if the trade price is higher than the preceding trade price (uptick). Likewise, a trade is classified as a sell if the trade price is lower than the preceding trade price (downtick). If the preceding trade price is the same, then the tick rule looks back to the last different price to classify the trade. Likewise, a trade is classified as a sell if it occurs on a zero-downtick. Formally denoting the trade price of security $i$ at time $t$ as $P_{i, t}$ and $\Delta P_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta P_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta P_{i, t}<0$, Trade ${i, t}=$ Sell,
If $\Delta P_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.
The tick rule (TICK) relies solely on trade prices for classifying trades and does not use any quote data. To classify trades, the tick rule compares the current trade price to the price of the preceding trade. A trade is classified as a buy if the trade price is higher than the preceding trade price (uptick). Likewise, a trade is classified as a sell if the trade price is lower than the preceding trade price (downtick). If the preceding trade price is the same, then the tick rule looks back to the last different price to classify the trade. Likewise, a trade is classified as a sell if it occurs on a zero-downtick. Formally denoting the trade price of security $i$ at time $t$ as $p_{i, t}$ and $\Delta p_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta p_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta p_{i, t}<0$, Trade $_{i, t}=$ Sell,
If $\Delta p_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.
The LR algorithm is based on a combination of the tick rule and the quote rule. Using the quote rule, a trade is classified as a buy if the price is above the midpoint of the quoted bid and ask and as a sell if the price is below the midpoint. Denoting the midpoint of the quoted spread by $m_{i, t}$, the predicted trade direction as per the quote rule is as follows:
$$
\begin{aligned}
& \text { If } P_{i, t}>m_{i, t}, \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } P_{i, t}<m_{i, t} \text { Trade }_{i, t}=\text { Sell. }
& \text { If } p_{i, t}>m_{i, t}, \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } p_{i, t}<m_{i, t}, \text { Trade }_{i, t}=\text { Sell. }
\end{aligned}
$$

5 changes: 3 additions & 2 deletions references/obsidian/📑notes/🔢Hybrid rules notes.md
@@ -7,8 +7,9 @@ Tags: #trade-classification
- “The paper further shows that, while the Lee and Ready (1991) algorithm has been the default choice among the traditional trade classification algorithm—possibly partly due to being automatically supplied by data vendors, partly due to its simplicity—the similar simplistic algorithms of Chakrabarty et al. (2007) and Ellis et al. (2000) tend to perform better and may be preferred in certain applications.” (Jurkatis, 2022, p. 23)
- use the problems of the single tick test to motivate extended rules like EMO / LR?
- that lead to a fine-grained fragmentation?
![[visualization-of-quote-and-tick.png]]
(image copied from [[@poppeSensitivityVPINChoice2016]]) ^3d69f3

![[viz-rules.png]]
(similar to [[@poppeSensitivityVPINChoice2016]]) ^3d69f3
“Fig. 1. Classification algorithms. This chart illustrates the functioning of three different trade-by-trade classification algorithms: LR by Lee and Ready (1991), EMO by Ellis et al. (2000) and CLNV by Chakrabarty et al. (2007).” ([Pöppe et al., 2016, p. 167](zotero://select/library/items/5A83SDDB)) ([pdf](zotero://open-pdf/library/items/4XIK47X6?page=3&annotation=8XUJ32R2))

“Sophisticated algorithms combine the quote and tick rule: Thus, trades at the midpoint are always classified by the tick rule, and trades at the best bid or ask are classified by the quote rule. The three most common algorithms differ in how they divide the remaining trades between quote and tick rule, as illustrated in Fig. 1.” ([Pöppe et al., 2016, p. 166](zotero://select/library/items/5A83SDDB)) ([pdf](zotero://open-pdf/library/items/4XIK47X6?page=2&annotation=4A3YAHN2))
16 changes: 10 additions & 6 deletions references/obsidian/📑notes/🔢LR algorithm notes.md
@@ -1,23 +1,27 @@
Tags: #trade-classification #lee-ready



Accuracy has been tested in [[@odders-whiteOccurrenceConsequencesInaccurate2000]], [[@finucaneDirectTestMethods2000]] and [[@leeInferringInvestorBehavior2000]] on TORQ data set which contains the true label. (see [[@bessembinderIssuesAssessingTrade2003]])

“Additionally, trade direction may not always be unambiguously determined. While LR assume that trades generally occur only when a market buy or sell order arrives, trades that do not involve market orders also can occur, such as when two limit orders are crossed. Although the trade can be classified by the tick test or LR's algorithm, the true direction of the trade is ambiguous. Classifying such trades as buys or sells may lead to erroneous conclusions in empirical studies.” (Finucane, 2000, p. 559)

**Algorithm:** See [[@leeInferringTradeDirection1991]]
**Algorithm:** Use a combination of quote and tick rule. Use tick rule to classify trades at midpoint and use the quote rule elsewhere.
![[lr-algo.png]]

- “The tick test will only misclassify the second midpoint trade after the arrival of the standing order if it misclassifies the first midpoint trade and the second trade is in the same direction as the first trade (i.e., another buy).” (Lee and Ready, 1991, p. 11)

Precise description from [[@carrionTradeSigningFast2020]] (Similar efforts in [[@jurkatisInferringTradeDirections2022]] or [[@olbrysEvaluatingTradeSide2018]]):
The tick rule (TICK) relies solely on trade prices for classifying trades and does not use any quote data. To classify trades, the tick rule compares the current trade price to the price of the preceding trade. A trade is classified as a buy if the trade price is higher than the preceding trade price (uptick). Likewise, a trade is classified as a sell if the trade price is lower than the preceding trade price (downtick). If the preceding trade price is the same, then the tick rule looks back to the last different price to classify the trade. Likewise, a trade is classified as a sell if it occurs on a zero-downtick. Formally denoting the trade price of security $i$ at time $t$ as $P_{i, t}$ and $\Delta P_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta P_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta P_{i, t}<0$, Trade $_{i, t}=$ Sell,
If $\Delta P_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.
The tick rule (TICK) relies solely on trade prices for classifying trades and does not use any quote data. To classify trades, the tick rule compares the current trade price to the price of the preceding trade. A trade is classified as a buy if the trade price is higher than the preceding trade price (uptick). Likewise, a trade is classified as a sell if the trade price is lower than the preceding trade price (downtick). If the preceding trade price is the same, then the tick rule looks back to the last different price to classify the trade. Likewise, a trade is classified as a sell if it occurs on a zero-downtick. Formally denoting the trade price of security $i$ at time $t$ as $p_{i, t}$ and $\Delta p_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta p_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta p_{i, t}<0$, Trade $_{i, t}=$ Sell,
If $\Delta p_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.
The LR algorithm is based on a combination of the tick rule and the quote rule. Using the quote rule, a trade is classified as a buy if the price is above the midpoint of the quoted bid and ask and as a sell if the price is below the midpoint. Denoting the midpoint of the quoted spread by $m_{i, t}$, the predicted trade direction as per the quote rule is as follows:
$$
\begin{aligned}
& \text { If } P_{i, t}>m_{i, t}, \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } P_{i, t}<m_{i, t} \text { Trade }_{i, t}=\text { Sell. }
& \text { If } p_{i, t}>m_{i, t}, \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } p_{i, t}<m_{i, t}, \text { Trade }_{i, t}=\text { Sell. }
\end{aligned}
$$
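
The combination described in these notes (quote rule away from the midpoint, tick rule at the midpoint) might be sketched as below. The function name `lr_algorithm` and the `tick_label` argument are illustrative assumptions, not part of the cited papers; buys are encoded as $1$ and sells as $0$.

```python
from typing import Optional

BUY, SELL = 1, 0  # encoding taken from the basic rules notes


def lr_algorithm(price: float, bid: float, ask: float,
                 tick_label: Optional[int]) -> Optional[int]:
    """Sketch of the LR algorithm for a single trade.

    `tick_label` is assumed to hold the tick-rule result for the same trade
    (hypothetical argument, computed elsewhere).
    """
    mid = 0.5 * (bid + ask)   # midpoint m of the quoted spread
    if price > mid:
        return BUY            # quote rule: above the midpoint
    if price < mid:
        return SELL           # quote rule: below the midpoint
    return tick_label         # exactly at the midpoint: defer to the tick rule
```

The sketch leaves out how the quote is matched to the trade in time; it only shows the order in which the two rules are applied.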

4 changes: 2 additions & 2 deletions references/obsidian/📑notes/🔢Quote rule notes.md
@@ -4,8 +4,8 @@ Tags: #trade-classification #quote-rule
**Notation:** For notation see [[@carrionTradeSigningFast2020]] or [[@jurkatisInferringTradeDirections2022]] or [[@olbrysEvaluatingTradeSide2018]]. Denoting the midpoint of the quoted spread by $m_{i, t}$, the predicted trade direction as per the quote rule is as follows:
$$
\begin{aligned}
& \text { If } P_{i, t}>m_{i, t} \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } P_{i, t}<m_{i, t}, \text { Trade }_{i, t}=\text { Sell. } \\
& \text { If } p_{i, t}>m_{i, t}, \text { Trade }_{i, t}=\text { Buy, } \\
& \text { If } p_{i, t}<m_{i, t}, \text { Trade }_{i, t}=\text { Sell. } \\
&
\end{aligned}
$$
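
As a rough sketch, the quote rule above could be written in Python like this. The function name `quote_rule` is an assumption for illustration; returning `None` for midpoint trades is also a choice made here, since the formula leaves that case unclassified. Buys are $1$, sells are $0$.

```python
from typing import Optional

BUY, SELL = 1, 0  # encoding taken from the basic rules notes


def quote_rule(price: float, bid: float, ask: float) -> Optional[int]:
    """Sketch of the quote rule for a single trade."""
    mid = 0.5 * (bid + ask)   # midpoint m of the quoted spread
    if price > mid:
        return BUY
    if price < mid:
        return SELL
    return None               # midpoint trades are not covered by the rule
```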
8 changes: 4 additions & 4 deletions references/obsidian/📑notes/🔢Tick test notes.md
@@ -9,10 +9,10 @@ One of the first works who mention the tick test is [[@holthausenEffectLargeBloc
“his approach eschews any distributional assumptions and relies instead on the basic notion that buys raise prices and sells lower them. But how well this approximation works to infer trades, or underlying information, is debatable,” (Easley et al., 2016, p. 271)

**Algorithm:** Formal description in [[@olbrysEvaluatingTradeSide2018]] and [[@carrionTradeSigningFast2020]] (see below) and [[@jurkatisInferringTradeDirections2022]]:
Formally denoting the trade price of security $i$ at time $t$ as $P_{i, t}$ and $\Delta P_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta P_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta P_{i, t}<0$, Trade $_{i, t}=$ Sell,
If $\Delta P_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.
Formally denoting the trade price of security $i$ at time $t$ as $p_{i, t}$ and $\Delta p_{i, t}$ as the price change between two successive trades and the assigned trade direction at time $t$ as Trade, we have:
If $\Delta p_{i, t}>0$, Trade $_{i, t}=$ Buy,
If $\Delta p_{i, t}<0$, Trade $_{i, t}=$ Sell,
If $\Delta p_{i, t}=0$, Trade $_{i, t}=$ Trade $_{i, t-1}$.

**Informal description:** Tick tests use changes in trade prices and look at previous trade prices to infer trade direction. If the trade occurs at a higher price than the previous trade (an uptick), it is classified as buyer-initiated. If the trade occurs at a lower price (a downtick), it is classified as seller-initiated. If the price change is zero, the last price that differs from the current price is used instead. (see e.g., [[@grauerOptionTradeClassification2022]] or [[@finucaneDirectTestMethods2000]] or [[@leeInferringTradeDirection1991]] for similar framing)
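
A minimal Python sketch of the tick test for one security, assuming the trade prices arrive as a time-ordered sequence. The function name `tick_test` and the use of `None` for trades without any earlier price change are assumptions made for illustration; buys are $1$, sells are $0$.

```python
from typing import List, Optional, Sequence

BUY, SELL = 1, 0  # encoding taken from the basic rules notes


def tick_test(prices: Sequence[float]) -> List[Optional[int]]:
    """Sketch of the tick test for a time-ordered price series."""
    labels: List[Optional[int]] = []
    last_label: Optional[int] = None  # sign of the last non-zero price change
    for k, price in enumerate(prices):
        if k > 0:
            change = price - prices[k - 1]
            if change > 0:
                last_label = BUY      # uptick
            elif change < 0:
                last_label = SELL     # downtick
            # zero change: keep the label implied by the last different price
        labels.append(last_label)
    return labels
```

Carrying the last label forward is equivalent to comparing the current price with the last different price, because all prices in between are equal to the current one.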
