diff --git a/.vscode/settings.json b/.vscode/settings.json index 7cb8a0e..a91b1ed 100644 --- a/.vscode/settings.json +++ b/.vscode/settings.json @@ -1,11 +1,13 @@ { "cSpell.words": [ + "Caswell", "defun", "funcs", "iframe", "inlinehilite", "kroki", "longjmp", + "nilly", "Nuvoton", "oldenv", "phoo", diff --git a/build.py b/build.py index 6322e0d..56bcef8 100755 --- a/build.py +++ b/build.py @@ -122,14 +122,23 @@ def lv_fence(source, language, css_class, options, md, **kwargs): + 'width="{options.get("width", 600)}">') -def kroki_fence(source, language, css_class, options, md, **kwargs): +def kroki_fence(source, language, css_class, options, md, classes, attrs, + **kwargs): data = base64.urlsafe_b64encode(zlib.compress( source.encode("utf-8"), 9)).decode("ascii") lang = options.get("type", options.get("name", "svgbob")) attr = "" if "width" in options and "height" in options: attr = f' width="{options["width"]}" height="{options["height"]}"' - return f'' + if classes: + divopen = "
" + divclose = "
" + else: + divopen = divclose = "" + attr += "".join(f'{k}="{v}"' for k, v in attrs.items()) + return (divopen + + f'' + + divclose) def named_kroki(name): diff --git a/docs/2022/how-i-came-up-with-phoo/index.html b/docs/2022/how-i-came-up-with-phoo/index.html index c9680bd..f6c1cec 100644 --- a/docs/2022/how-i-came-up-with-phoo/index.html +++ b/docs/2022/how-i-came-up-with-phoo/index.html @@ -138,11 +138,11 @@

Related Posts

+ + + + + + + + + + + + + + + + + + + + + + + + + +
+
Patrick the purple dragon

dragoncoder047’s blog

random thoughts about nonrandom things

+ +
+
+

A Hash-Mapped Mess

+ + + +

This post is part 10 of the pickle series:

+
    +
  1. + Pickles! +
  2. +
  3. + Manual Memory Management Madness +
  4. +
  5. + Pickle Tokenizer +
  6. +
  7. + Yet Another Garbage Collector +
  8. +
  9. + Powerful PICKLE Pattern Matching +
  10. +
  11. + PICKLE Has Regular Expressions, Apparently +
  12. +
  13. + It's September!! +
  14. +
  15. + Continuations and the thunk queue +
  16. +
  17. + The Lesser of Two Evils +
  18. +
  19. + A Hash-Mapped Mess +
  20. +
+

It has not been a good week. I set out on Tuesday to actually add real objects to PICKLE, with a hashmap of properties and multiple inheritance and everything. Suffice to say, that wasn’t easy. Between null pointer dereferencing, sloppy APIs, and an incomplete algorithm, it took several hours’ work total to root out all the bugs.

+

The hashmap design I used was a sort of “binary trie”, based on this gist by Tim Caswell. The design seems simple enough – each node can store a key-value pair, and to find an element you first check whether the root node holds the key you’re looking for. If it doesn’t, you treat the hash value of the key as a bit string, take its next-most-significant bit, and recurse on the left or right child node depending on whether that bit is a 0 or a 1.
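As a rough sketch of that lookup (in Python rather than PICKLE's C++, and not taken from the gist), it might look something like this. The `Node` layout, the 32-bit hash width, and the `toy_hash` stand-in are all assumptions made for illustration; keys whose hashes collide on every bit are not handled.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

HASH_BITS = 32  # assumed hash width; the post does not say


@dataclass
class Node:
    key: Optional[str] = None       # None marks an empty (cleared) node
    value: Any = None
    left: Optional["Node"] = None   # followed when the next hash bit is 0
    right: Optional["Node"] = None  # followed when the next hash bit is 1


def toy_hash(key: str) -> int:
    """Stand-in 32-bit FNV-1a hash; PICKLE's real hash function is not shown here."""
    h = 0x811C9DC5
    for byte in key.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def find(root: Optional[Node], key: str,
         hash_fn: Callable[[str], int] = toy_hash) -> Any:
    """Return the value stored under key, or None if the key is absent."""
    h = hash_fn(key)
    node = root
    for depth in range(HASH_BITS + 1):
        if node is None:
            return None                # fell off the tree
        if node.key == key:
            return node.value
        if depth == HASH_BITS:
            break                      # every hash bit has been consumed
        bit = (h >> (HASH_BITS - 1 - depth)) & 1  # most significant bit first
        node = node.right if bit else node.left
    return None


# Hand-built tree with made-up hashes: "bar" on the 0 branch, "baz" on the 1 branch.
hashes = {"foo": 0x00000000, "bar": 0x00000000, "baz": 0x80000000, "qux": 0x80000000}
root = Node("foo", 1, left=Node("bar", 2), right=Node("baz", 3))
print(find(root, "baz", hashes.__getitem__))  # prints 3
```

Passing the hash function in as a parameter is only there so the example tree can be built by hand with known bit patterns.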

+

Except, in practice, it’s not that simple.

+

Obviously, hashmaps for object properties aren’t static – they will have properties inserted, updated, and removed. Updating an existing property is almost trivial – you just find the corresponding node in the hashmap and replace the value.

+

The algorithms outlined for adding or deleting a property end up causing problems. Deleting a property just clears the node; it doesn’t remove the node from the tree, so the property-addition algorithm can check for and re-use cleared nodes instead of creating new children. This means that a simple add-or-update implementation might end up accidentally inserting the value twice, because it found a cleared node higher up in the tree than the existing node’s position and stopped too soon.

+
+

Consider what happens if you have, say, three nodes called foo, bar, and baz. First foo is inserted into an empty tree, so it becomes the root node. Then bar and baz are added, and become children of foo.

+
+

Now foo is deleted. The first node matching it is cleared - no problem. There are no foos in the tree.

+
+

bar is updated. Since there is a free node above the old bar, there end up being two bar nodes.

+

Up until now, there isn’t any problem. Finding any node, even in the tree with the duplicated bar, finds the correct value.

+
+

The problem arises when you try to delete bar on this duplicated tree – and since the old bar node wasn’t ever deleted, this “shadow” node now rears its ugly head and causes the key bar to revert to its old value, instead of being deleted like it was supposed to be.
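The whole foo/bar/baz sequence can be replayed in a few lines of Python. This is only a sketch of the failure mode, not PICKLE's actual code: the `Node` class, the 8-bit hashes in `HASHES`, and the function names are all made up for the demonstration, and it assumes no two keys share a full hash.

```python
HASH_BITS = 8
# Hypothetical hashes, chosen so "bar" takes the 0 branch and "baz" the 1 branch.
HASHES = {"foo": 0b01000000, "bar": 0b00000000, "baz": 0b10000000}


class Node:
    def __init__(self):
        self.key = None               # None marks a free (cleared) node
        self.value = None
        self.children = [None, None]  # index 0 or 1, picked by the next hash bit


def bit_at(key, depth):
    """Bit of key's hash used at the given depth, most significant first."""
    return (HASHES[key] >> (HASH_BITS - 1 - depth)) & 1


def find(root, key):
    node = root
    for depth in range(HASH_BITS + 1):
        if node is None:
            return None
        if node.key == key:
            return node
        if depth < HASH_BITS:
            node = node.children[bit_at(key, depth)]
    return None


def delete(root, key):
    node = find(root, key)
    if node is not None:
        node.key = node.value = None  # cleared, but still linked into the tree


def naive_set(root, key, value):
    """Add-or-update that stops at the first free node or matching key."""
    node = root
    for depth in range(HASH_BITS):
        if node.key is None or node.key == key:
            break
        bit = bit_at(key, depth)
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.key, node.value = key, value


root = Node()
naive_set(root, "foo", 1)    # fills the root
naive_set(root, "bar", 2)    # new child on the 0 branch
naive_set(root, "baz", 3)    # new child on the 1 branch
delete(root, "foo")          # root is now free
naive_set(root, "bar", 20)   # lands in the root; the old bar node survives below
delete(root, "bar")          # clears only the root copy
print(find(root, "bar").value)  # prints 2: the shadow node is back
```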

+

I spent a long time trying to figure out how to combat this problem. The easiest solution, which I implemented, is to traverse the entire hash’s search path, not just stopping at the first free node, when updating a value. If the new value is set by filling a free node (rather than simply updating the existing node with the same key), there may be shadow nodes under it, so the rest of the tree has to be traversed, and these shadow nodes cleared. This guarantees there will only be one node for any given key in use at the same time.
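Here is one way that fix could look, again as a hedged Python sketch rather than the real PICKLE implementation. The names (`Node`, `HASHES`, `set_value`) and the 8-bit hashes are hypothetical, and a full hash collision just raises an error instead of being handled.

```python
HASH_BITS = 8
# Hypothetical hashes, chosen so "bar" takes the 0 branch and "baz" the 1 branch.
HASHES = {"foo": 0b01000000, "bar": 0b00000000, "baz": 0b10000000}


class Node:
    def __init__(self):
        self.key = None               # None marks a free (cleared) node
        self.value = None
        self.children = [None, None]


def bit_at(key, depth):
    return (HASHES[key] >> (HASH_BITS - 1 - depth)) & 1


def find(root, key):
    node = root
    for depth in range(HASH_BITS + 1):
        if node is None:
            return None
        if node.key == key:
            return node
        if depth < HASH_BITS:
            node = node.children[bit_at(key, depth)]
    return None


def delete(root, key):
    node = find(root, key)
    if node is not None:
        node.key = node.value = None


def set_value(root, key, value):
    """Add-or-update that keeps walking the search path to clear any shadow."""
    node, placed = root, False
    for depth in range(HASH_BITS + 1):
        if node.key == key:
            if placed:
                node.key = node.value = None   # stale shadow below the new copy
            else:
                node.value = value             # ordinary in-place update
            return
        if node.key is None and not placed:
            node.key, node.value = key, value  # reuse the free node...
            placed = True                      # ...but keep looking for a shadow
        if depth == HASH_BITS:
            break
        bit = bit_at(key, depth)
        if node.children[bit] is None:
            if placed:
                return                         # path ends, so no shadow exists
            node.children[bit] = Node()
        node = node.children[bit]
    if not placed:
        raise RuntimeError("full hash collision: not handled in this sketch")


root = Node()
for k, v in (("foo", 1), ("bar", 2), ("baz", 3)):
    set_value(root, k, v)
delete(root, "foo")
set_value(root, "bar", 20)   # fills the root and clears the old bar beneath it
delete(root, "bar")
print(find(root, "bar"))     # prints None: bar is really gone this time
```

Because at most one node per key exists before each call, the walk can stop as soon as it meets either the existing node or the end of the path.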

+
+

The one last bug, that I still haven’t fixed, is the way these hashmaps work with the garbage collector. When you delete a node, you don’t actually delete the node, you just clear it / mark it as free. The memory is still in use as far as the garbage collector is concerned, and is never freed. Even if an object has no properties currently, if the object had, say, a thousand different properties at some point in the past, there will now be a thousand unused nodes in the hashmap that the garbage collector just won’t collect.

+

I am not sure how to fix this. If I go start removing empty nodes willy-nilly, how will that impact looking up existing non-empty nodes? How do I put the tree back together after I delete a node? I don’t know quite yet. A search trie is (yet again) something new to me, and I’ll have to do more research. Every day is a learning experience – that is to be celebrated.

+
+

Related Posts

+ + +
+
+ +
+ +
+
+ + + \ No newline at end of file diff --git a/docs/2024/depends-on-your-definition-of-viral/index.html b/docs/2024/depends-on-your-definition-of-viral/index.html index fdab0f4..ddd7d3b 100644 --- a/docs/2024/depends-on-your-definition-of-viral/index.html +++ b/docs/2024/depends-on-your-definition-of-viral/index.html @@ -77,6 +77,11 @@

+ A Hash-Mapped Mess + → +
Posted diff --git a/docs/2024/order-up/index.html b/docs/2024/order-up/index.html index 2687191..9bce054 100644 --- a/docs/2024/order-up/index.html +++ b/docs/2024/order-up/index.html @@ -113,8 +113,8 @@

LILduino
  • uLisp Thoughts
  • Two Down, A Zillion More To Go
  • +
  • A Hash-Mapped Mess
  • The Lesser of Two Evils
  • -
  • Continuations and the thunk queue
  • + + + + + + + + + + + + + + + + + + + + + + + +
    +
    Patrick the purple dragon

    dragoncoder047’s blog

    random thoughts about nonrandom things

    + +
    +
    +

    Articles tagged with language-design

    + +
    +

    Pickles!

    +
    + +
    By + dragoncoder047 +
    +
    +

    I’ve been playing around a little bit with LIL on my ESP32 arduino. It works, but there are a few things I don’t like. LIL isn’t object-oriented by default, so I can’t do a lot of what I am used to writing code in Javascript and …

    +
    +
    +

    TEHSSL

    +
    + +
    By + dragoncoder047 +
    +
    +

    I started writing a new programming language, TEHSSL, a few days ago. Starting from scratch (again!) wasn’t easy, and I’m nowhere near done yet. I have got two things working so far: the garbage collector, and the tokenizer. I have no idea how to handle the glue in …

    +
    +
    +

    uLisp Thoughts

    +
    + +
    By + dragoncoder047 +
    +
    +

    For a while I have been trying to work out some bugs in David Johnson-Davies’ uLisp interpreter for Arduinos. I ported some macro and quasiquote extensions for an older version of uLisp to the current version, and apparently I did not do something right – it crashes whenever I try to …

    +
    +
    +

    How I came up with Phoo

    +
    + +
    By + dragoncoder047 +
    +
    +

    Several months ago I stumbled upon Gordon Charlton’s Quackery language while perusing Github for something. Usually I don’t pay much attention to obviously irrelevant search results, but this one seemed worth a look. I found Quackery to be a simple stack-based semi-compiled language that makes infrequent use of …

    +
    +

    + <<First + Page 2 of 2 +

    +
    +
    + +
    + +
    +
    + + + \ No newline at end of file diff --git a/docs/tag/programming/index.html b/docs/tag/programming/index.html index 6a2e645..6140422 100644 --- a/docs/tag/programming/index.html +++ b/docs/tag/programming/index.html @@ -96,6 +96,16 @@

    Articles tagged with programming

    +

    A Hash-Mapped Mess

    +
    + +
    By + dragoncoder047 +
    +
    +

    It has not been a good week. I set out on Tuesday to actually add real objects to PICKLE, with a hashmap of properties and multiple inheritance and everything. Suffice to say, that wasn’t easy. Between null pointer dereferencing, sloppy APIs, and an incomplete algorithm, it took several hours …

    +
    +

    The Lesser of Two Evils

    @@ -185,16 +195,6 @@

    Powerful PICKLE Pattern Matching

    -
    - -
    By - dragoncoder047 -
    -
    -

    I did a lot of work on Tinobsy, the garbage collector for PICKLE. It’s pretty robust now, and passes all my tests – plus I translated it to C++ so I can take advantage of C++’s syntactic sugar for objects. All I think that I’ll be doing with …

    -

    Page 1 of 4 Next> diff --git a/docs/tag/programming/page2/index.html b/docs/tag/programming/page2/index.html index 31e98bc..ee14527 100644 --- a/docs/tag/programming/page2/index.html +++ b/docs/tag/programming/page2/index.html @@ -96,6 +96,16 @@

    Articles tagged with programming

    +

    Powerful PICKLE Pattern Matching

    +
    + +
    By + dragoncoder047 +
    +
    +

    I did a lot of work on Tinobsy, the garbage collector for PICKLE. It’s pretty robust now, and passes all my tests – plus I translated it to C++ so I can take advantage of C++’s syntactic sugar for objects. All I think that I’ll be doing with …

    +
    +

    Yet Another Garbage Collector

    @@ -188,17 +198,6 @@

    Debugger, Almost

    -
    - -
    By - dragoncoder047 -
    -
    -

    Today I started work on the Phoo debugger. As-is, it is very simple – I already programmed in a “tick” function into Phoo that gets called every item, and so my debugger only needs to patch itself into this function.

    -

    However, the three buttons – “Into”, “Over”, and “Out” – caused me some …

    -

    <<First Page 2 of 4 diff --git a/docs/tag/programming/page3/index.html b/docs/tag/programming/page3/index.html index b763f2e..1b94b39 100644 --- a/docs/tag/programming/page3/index.html +++ b/docs/tag/programming/page3/index.html @@ -96,6 +96,17 @@

    Articles tagged with programming

    +

    Debugger, Almost

    +
    + +
    By + dragoncoder047 +
    +
    +

    Today I started work on the Phoo debugger. As-is, it is very simple – I already programmed in a “tick” function into Phoo that gets called every item, and so my debugger only needs to patch itself into this function.

    +

    However, the three buttons – “Into”, “Over”, and “Out” – caused me some …

    +
    +

    I Still Have No Idea

    @@ -196,17 +207,6 @@

    Langton's Ant Music

    -
    - -
    By - dragoncoder047 -
    -
    -

    Over the weekend I joined the conwaylife.com forums because I am interested in cellular automata. I find watching the mechanisms mesmerizing, and building them exciting.

    -

    I also have an interest in music, and so a year or two ago I tried to generate music from cellular automata. I used …

    -

    <<First <Previous diff --git a/docs/tag/programming/page4/index.html b/docs/tag/programming/page4/index.html index a5f8210..e1b1d5b 100644 --- a/docs/tag/programming/page4/index.html +++ b/docs/tag/programming/page4/index.html @@ -96,6 +96,17 @@

    Articles tagged with programming

    +

    Langton's Ant Music

    +
    + +
    By + dragoncoder047 +
    +
    +

    Over the weekend I joined the conwaylife.com forums because I am interested in cellular automata. I find watching the mechanisms mesmerizing, and building them exciting.

    +

    I also have an interest in music, and so a year or two ago I tried to generate music from cellular automata. I used …

    +
    +

    Gah... I broke it!

    diff --git a/docs/tags/index.html b/docs/tags/index.html index 6bed24b..273bfdb 100644 --- a/docs/tags/index.html +++ b/docs/tags/index.html @@ -54,18 +54,18 @@

    Tags for dragoncoder047’s blog

    • arduino (4)
    • armdroid (1)
    • -
    • c (16)
    • +
    • c (17)
    • cellular-automata (2)
    • css (1)
    • electronics (11)
    • game-design (4)
    • -
    • garbage-collector (2)
    • +
    • garbage-collector (3)
    • javascript (10)
    • -
    • language-design (13)
    • +
    • language-design (14)
    • lisp (2)
    • machine-learning (1)
    • phoo (9)
    • -
    • programming (38)
    • +
    • programming (39)
    • python (8)
    • reverse-engineering (5)
    • robotics (1)
    • diff --git a/markdown/pickle_slow_progress.md b/markdown/pickle_slow_progress.md new file mode 100644 index 0000000..7b4e199 --- /dev/null +++ b/markdown/pickle_slow_progress.md @@ -0,0 +1,58 @@ +Title: A Hash-Mapped Mess +Date: 2024-04-05 +Series: pickle +Tags: programming, c, garbage-collector, language-design + +It has not been a good week. I set out on Tuesday to actually add real objects to PICKLE, with a hashmap of properties and multiple inheritance and everything. Suffice to say, that wasn't easy. Between null pointer dereferencing, sloppy APIs, and an incomplete algorithm, it took several hours' work total to root out all the bugs. + +The hashmap design I used was a sort of "binary trie", based on [this gist][hmap] by Tim Caswell. The design seems simple enough -- each node can store a key-value pair, and to find an element you first check whether the root node holds the key you're looking for. If it doesn't, you treat the hash value of the key as a bit string, take its next-most-significant bit, and recurse on the left or right child node depending on whether that bit is a 0 or a 1. + +[hmap]: https://gist.github.com/creationix/3ea0d27dd100c5b53ca8546a2084ad47 + +Except, in practice, it's not that simple. + +Obviously, hashmaps for object properties aren't static -- they will have properties inserted, updated, and removed. Updating an existing property is almost trivial -- you just find the corresponding node in the hashmap and replace the value. + +The algorithms outlined for adding or deleting a property end up causing problems. Deleting a property just clears the node; it doesn't remove the node from the tree, so the property-addition algorithm can check for and re-use cleared nodes instead of creating new children. This means that a simple add-or-update implementation might end up accidentally inserting the value twice, because it found a cleared node higher up in the tree than the existing node's position and stopped too soon. 
+ +```{.mermaid .float-right} +graph TD + foo --> bar + foo --> baz +``` + +Consider what happens if you have, say, three nodes called `foo`, `bar`, and `baz`. First `foo` is inserted into an empty tree, so it becomes the root node. Then `bar` and `baz` are added, and become children of `foo`. + +```{.mermaid .float-right} +graph TD + foo[ ] --> bar + foo --> baz +``` + +Now `foo` is deleted. The first node matching it is cleared - no problem. There are no `foo`s in the tree. + +```{.mermaid .float-right} +graph TD + foo["bar (new)"] --> bar["bar (old)"] + foo --> baz +``` + +`bar` is updated. Since there is a free node above the old `bar`, there end up being two `bar` nodes. + +Up until now, there isn't any problem. Finding any node, even in the tree with the duplicated `bar`, finds the correct value. + +```{.mermaid .float-right} +graph TD + foo[ ] --> bar["bar (old)"] + foo --> baz +``` + +The problem arises when you try to delete `bar` on this duplicated tree -- and since the old `bar` node wasn't ever deleted, this "shadow" node now rears its ugly head and causes the key `bar` to revert to its old value, instead of being deleted like it was supposed to be. + +I spent a long time trying to figure out how to combat this problem. The easiest solution, which I implemented, is to traverse the entire hash's search path, not just stopping at the first free node, when updating a value. If the new value is set by filling a free node (rather than simply updating the existing node with the same key), there may be shadow nodes under it, so the rest of the tree has to be traversed, and these shadow nodes cleared. This guarantees there will only be one node for any given key in use at the same time. + +--- + +The one last bug, that I still haven't fixed, is the way these hashmaps work with the garbage collector. When you delete a node, you don't actually *delete* the node, you just clear it / mark it as free. 
The memory is still in use as far as the garbage collector is concerned, and is never freed. Even if an object has no properties currently, if the object had, say, a thousand different properties at some point in the past, there will now be a thousand unused nodes in the hashmap that the garbage collector just won't collect. + +I am not sure how to fix this. If I go start removing empty nodes willy-nilly, how will that impact looking up existing non-empty nodes? How do I put the tree back together after I delete a node? I don't know quite yet. A search trie is (yet again) something new to me, and I'll have to do more research. Every day is a learning experience -- that is to be celebrated.