Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race condition with docc preview that appears to be acerbated by including snippets in DocC content #1084

Open
2 tasks done
heckj opened this issue Nov 4, 2024 · 10 comments · May be fixed by #1091
Open
2 tasks done
Labels
bug Something isn't working

Comments

@heckj
Copy link
Member

heckj commented Nov 4, 2024

Description

Originally logged in swift-docc-plugin as swiftlang/swift-docc-plugin#99

I found that adding snippet connect (Snippets/somefile.swift) to a caused the preview-documentation command to intermittently fail in terms of rendering all the content. The top-level module symbol graph file (in my case, voxels.json) is missing in some iterations of invoking this command, and present in others.

After iterating with various plugin versions and nailing down that this was, in fact, an intermittent issue, I found the plugin to be invoking the following command:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/docc preview /Users/heckj/src/Voxels/Sources/Voxels/Documentation.docc --emit-lmdb-index --fallback-display-name Voxels --fallback-bundle-identifier Voxels --additional-symbol-graph-dir /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/.build/symbol-graphs/unified-symbol-graphs/Voxels-7 --output-path /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/Voxels.doccarchive

I'm Opening this bug on swift-docc to continue to debug, and closing the issue on swift-docc-plugin since it doesn't appear to be related to the plugin content.

Checklist

  • If possible, I've reproduced the issue using the main branch of this package.
  • This issue hasn't been addressed in an existing GitHub issue.

Expected Behavior

When invoking swift package --disable-sandbox preview-documentation --target MyTarget on a package using swift-docc-plugin and containing any snippet content, I expect the documentation archive to be fully rendered. In my example. showing something akin to:

========================================
Starting Local Preview Server
	 Address: http://localhost:8080/documentation/voxels
========================================

Actual behavior

Intermittently, and accerbated by having snippet content in the project, the preview fails to render, showing instead:

========================================
Starting Local Preview Server
	 Address: http://localhost:8080/documentation/
========================================

And inside the docs-archive output path that preview used, the top-level target symbol graph JSON file (in my case, voxels.json) is missing. But this is intermittent, and repeated invocations will get different results.

Steps To Reproduce

  • Get example Swift package (public repository)
git clone https://github.com/heckj/voxels
cd voxels
  • Check out content before snippets were added:
git checkout 0.2.4

(commit shows 7048693)

  • repeatedly invoke the preview command:
swift package --disable-sandbox preview-documentation --target Voxels

For 5 runs:

  • content rendered

  • content rendered

  • content rendered

  • content rendered

  • content rendered

  • Check out content with snippet content added:

git checkout 6089364
  • repeatedly invoke the preview command:
swift package --disable-sandbox preview-documentation --target Voxels

For 5 runs:

  • content missing
  • content rendered
  • content rendered
  • content rendered
  • content missing

Swift-DocC Version Information

'6.0.2' (included with Xcode 16.1)

Swift Compiler Version Information

swift-driver version: 1.115 Apple Swift version 6.0.2 (swiftlang-6.0.2.1.2 clang-1600.0.26.4)
Target: arm64-apple-macosx15.0
@heckj
Copy link
Member Author

heckj commented Nov 4, 2024

repeating with main branch of docc:

export DOCC_HTML_DIR=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/share/docc/render/
/Users/heckj/src/swift-project/swift-docc/.build/debug/docc preview /Users/heckj/src/Voxels/Sources/Voxels/Documentation.docc --emit-lmdb-index --fallback-display-name Voxels --fallback-bundle-identifier Voxels --additional-symbol-graph-dir /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/.build/symbol-graphs/unified-symbol-graphs/Voxels-7 --output-path /Users/heckj/src/Voxels/.build/plugins/Swift-DocC\ Preview/outputs/Voxels.doccarchive

5 runs:

  • no content
  • content rendered
  • no content
  • no content
  • content rendered

So the same behavior exists with the main development branch.

@d-ronnqvist
Copy link
Contributor

d-ronnqvist commented Nov 4, 2024

I'd be curious to see if it also reproduces with the changes from #1070 or if the race conditions fixed in that PR (aimed at fixing flaky tests) also fix this issue. 🤔

I can give that a try tomorrow.

@heckj
Copy link
Member Author

heckj commented Nov 4, 2024

I can easily give it a shot later today - I've continued debugging it. Away from my laptop for lunch, but I'll give that specific PR a run and see how it does.

UPDATE:

I switched to that branch, did a build, and then tried the "direct" docc preview command another 5 times. I'm afraid it doesn't appear to solve the issue:

  • rendered
  • rendered
  • missing file
  • missing file
  • rendered

I read through the PR, and it looks like it didn't touch any of the internals of the converters, more focused on catching errors and dealing with handing out log handles. My current hypothesis is that there's a top-level writer that finishes up with the "module" JSON rendering, that isn't being explicitly "wrapped up" - I haven't found my way through the code to verify, but something perhaps writing on deinit or a defer, that is taking a bit longer or has an unclear deadline. All of the internal JSON symbolgraph files are written out, those are there no issue - it's the top level module one that isn't getting dropped into place occasionally. It seems to reliably get written on "convert" alone, but when convert is followed by spinning up the preview server, it's occasionally going awry.

I thought maybe it had it's own DocumentationNode instance, a root of the tree, that was processed in the converter, but as I added in my own debugging code to see the names of the symbols processed as they went by, I never saw one that appeared to represent the top module - so I haven't found the right place to look as yet. I'll keep looking tomorrow and dig back into the code a bit more to see if I can spot where this missing file gets written, and track-back from there to see why it might not be always getting written.

@heckj
Copy link
Member Author

heckj commented Nov 5, 2024

Narrowing this down a bit more, I dropped into ConvertActionConverter.convert() and added a bit of extra code to understand what was flowing, and what wasn't.

Within the concurrentPerform iteration that does the lifting, I looked for an entity for the module, and sometimes it appeared, and when the result was "missing content" it wouldn't.

Shimmed in code, and I tacked a breakpoint onto the print statement

                    if entity.name.plainText == "Voxels" {
                        print("FOUND IT!: \(entity.name)")
                    }

This is leading me to believe there's a possible issue with context.knownPages.concurrentPerform() - in some iterations, it's never returning the identifier for the top-level module.

@heckj
Copy link
Member Author

heckj commented Nov 5, 2024

continuing my poking, and it's above/earlier than concurrentPerform - I checked the contents of context.knownPages - and in some cases that doesn't include the expected resolved page.

The context doesn't appear to be be mutated at all in this method, so I'm looking upstream. In the docc preview use case, this whole method - context passed in - is called from ConvertAction

@heckj
Copy link
Member Author

heckj commented Nov 5, 2024

Tracking it back to line 281 in ConvertAction:

let context = try DocumentationContext(bundle: bundle, dataProvider: dataProvider, diagnosticEngine: diagnosticEngine, configuration: configuration)

I've got a stupid bit of printf debugging running:

for page in context.knownPages {
    //print(page.absoluteString)
    if page.url.absoluteString == "doc://Voxels/documentation/Voxels" {
        print("PING")
    }
}

And in the race condition cases where the node is missing, it's missing from the start of the context generation.

Looking into DocumentationContext.init(...), the heavy lifting in there is the
try register(bundle) which looks to be the source of the racing here.

I've tracked it back within the sequence of the bits in flight during register(), and when it's appearing correctly, it's in place right after the call to try registerSymbols(from: bundle, symbolGraphLoader: symbolGraphLoader, documentationExtensions: documentationExtensions) (line 2284) - which makes sense, as it seems that function is the one that establishes and builds out the topicGraph structure inside the context.

@heckj
Copy link
Member Author

heckj commented Nov 5, 2024

After digging back into the registerSymbols() function (line 1142) where the topic is set up in it's tree, I realized that the key piece that's being established that wrong in the race condition is the node's isVirtual property is resolving to true (which prevents it from being renderered/included) in some cases.

I've tracked back how/where that's set up to the iteration over symbolGraph files (

for (moduleName, unifiedSymbolGraph) in symbolGraphLoader.unifiedGraphs {
) that loads up the information.

Pausing for now - the flip of isVirtual being true or false is happening somewhere inside SymbolGraphLoader as it processes its graphs, and the resulting data is getting placed into the loader's unifiedGraphs property.

@heckj
Copy link
Member Author

heckj commented Nov 6, 2024

The source of the isVirtual are the upstream symbol graphs that come from the compiler (or from symbol-graph-extract). I haven't sorted out entirely how they're merged, but notably when you have a Snippet, the symbol graph for that snippet lists its module with isVirtual=true.

My updated hypothesis is that the graphs are being processed in a non-deterministic order, and as the relevant top-level module symbols are merged together into a unified set of symbols that include the snippets, the top-level module gets arbitrarily picked from those, rather than "if any are non-virtual", prefer that.

@heckj
Copy link
Member Author

heckj commented Nov 6, 2024

Following up with how SymbolGraph operates internally, there's a private function that provides an indicator of if the symbolgraph in question is the "main" one or not, that used the logic:

let isMainSymbolGraph = !url.lastPathComponent.contains("@")

This doesn't account for a snippets module graph, which also appears to be main. However, updating the private function in question (private static func moduleNameFor(_ symbolGraph: SymbolGraph, at url: URL) -> (String, Bool) ) to do a bit more exclusion didn't resolve the issue.

Turns out there's almost identical logic in SymbolKit's GraphCollector, which likewise doesn't account for the possibility of snippet modules.

The function public static func moduleNameFor(_ graph: SymbolGraph, at url: URL) -> (String, Bool)
An instance of GraphCollector is maintained in the SymbolGraphLoader, and this same logic is hit when calling it's
public func mergeSymbolGraph(_ inputGraph: SymbolGraph, at url: URL, forceLoading: Bool = false) function. It then uses that "Does an @ symbol exist in the module name" to determine extension (@ symbol exists) or primary - and because there's multiple primaries, (and it grabs the first one it finds), the value looked up gets indeterministic.

I think the ultimate resolution here would be to extend the logic in SymbolKit's GraphCollector.moduleNameFor() needs some tweaking to account for a 3rd kind of collection of symbols - the snippets - and record that as an extension kind rather than a primary kind. Just looking for "snippets" in the name of the URL is a potential solution, but suboptimal if any module name actually contains the string "snippets", at which point it's a false positive. That said, the SymbolGraph itself has metadata (the generator) that can be used to also identify where the symbolGraph came from that would be a good double-check on that logic.

@heckj
Copy link
Member Author

heckj commented Nov 6, 2024

Since this digging extends down into SymbolKit, I opened swiftlang/swift-docc-symbolkit#88 with an explanation or what was incorrect there. In addition to the upstream logic fix, which seems needed to resolve this issue, the replicated logic in SymbolGraphLoader should likely be removed, and the logic from SymbolKit (when fixed) used in it's place in order to localize the knowledge and not have to solve this in two places.

The local logic in swift-dock's SymbolGraphLoader doesn't appear to be causing an issue, but it is a bit incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants