Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for using existing Dash.app docsets #53

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

smackesey
Copy link
Contributor

Per my comment on #52:

My approach here was pretty simple, I added a new configuration variable DASHT_USE_DASH_APP that makes dasht use the Dash.app docsets. This also causes all of the docset management commands (install, update, remove, extract) to exit with a (new) exit code of 46. This seemed appropriate because if you are going to use the Dash.app library, it is best to just let Dash.app manage the docsets rather than trying to make sure dasht and Dash.app handle things exactly the same way.

One thing I wasn't sure of was whether to commit updated manpages. The Development section of the README says to rebuild the manpages, but the man directory is also in the gitignore. I can rebuild them and add them to this PR if you like.

Copy link
Owner

@sunaku sunaku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! 👍 Looks good overall, but needs a few tweaks. :neckbeard:

bin/dasht Outdated Show resolved Hide resolved
bin/dasht Outdated Show resolved Hide resolved
bin/dasht-docsets Outdated Show resolved Hide resolved
bin/dasht-docsets-extract Outdated Show resolved Hide resolved
bin/dasht-docsets-extract Outdated Show resolved Hide resolved
@@ -95,7 +105,12 @@ status=44 # (default) exit with a nonzero status when no results are found
trap 'status=0' USR1 # override default exit status when results are found

dasht-docsets "$@" | while read -r docset; do
database="$DASHT_DOCSETS_DIR/$docset".docset/Contents/Resources/docSet.dsidx
if $DASHT_USE_DASH_APP; then
relpath=("$DASHT_DOCSETS_DIR"/*/*/"$docset".docset)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the ( array ) notation? Should be removed.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
bin/dasht Outdated
# documents extracted. Dash.app leaves the documents in the tarball by default.
if $DASHT_USE_DASH_APP; then
dasht-docsets "$@" | while read docset; do
root=("$DASHT_DOCSETS_DIR/"*/*"/${docset}.docset")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't need ( array ) syntax.

Copy link
Owner

@sunaku sunaku Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your comment:

those particular globs won't expand in my bash v3.2.57 without the array syntax. Something about putting an * next to a quote, I think.

Could you please try surrounding the stars with slashes (by moving them out of the surrounding strings)? Like this:

root="$DASHT_DOCSETS_DIR"/*/*/"$docset".docset

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if BASH is requiring you to add ( array ) syntax because the glob can potentially match more than one file, then it would be better to convert this expression into a nested loop:

for root in "$DASHT_DOCSETS_DIR"/*/*/"$docset".docset; do
  # ...
done

@smackesey
Copy link
Contributor Author

smackesey commented Nov 15, 2020

Hey, sorry this took me a little bit. I addressed all your requests except for the one about unnecessary array syntax. I'm not sure why exactly, but those particular globs won't expand in my bash v3.2.57 without the array syntax. Something about putting an * next to a quote, I think.

Also, did you want me to rebuild the manpages? See my question in my first PR comment.

@sunaku
Copy link
Owner

sunaku commented Nov 20, 2020

Thanks, I'll try to review this (free time is also scarce for me these days 😅) but yes: please include the updated man pages.

bin/dasht Outdated
# documents extracted. Dash.app leaves the documents in the tarball by default.
if $DASHT_USE_DASH_APP; then
dasht-docsets "$@" | while read docset; do
root=("$DASHT_DOCSETS_DIR/"*/*"/${docset}.docset")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if BASH is requiring you to add ( array ) syntax because the glob can potentially match more than one file, then it would be better to convert this expression into a nested loop:

for root in "$DASHT_DOCSETS_DIR"/*/*/"$docset".docset; do
  # ...
done

bin/dasht Outdated
dasht-docsets "$@" | while read docset; do
root=("$DASHT_DOCSETS_DIR/"*/*"/${docset}.docset")
if ! test -d "${root}/Contents/Resources/Documents"; then
dasht-docsets-extract "$root/Contents/Resources/tarix.tgz"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this extraction would better fit in a different script, say dasht-docsets-inherit, which would focus on preparing docsets from other apps for use with dasht. This way, the dasht script won't exceed its original purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I made the requested changes and tested it out a bit, I think all is working well. BTW, you should update gitignore and instructions re manpages:

  • /man is in gitignore despite being included in the repo
  • I am not familiar with binman-rake, but I think it's behavior changed since you posted the instructions. I ran it and it generated a bunch of html, markdown, and a CSS flie. I deleted all of the excess files it generated and just included the changes to the manpage-format one and the dasht-docsets-inherit.

Copy link
Owner

@sunaku sunaku Jul 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback. Actually, both of those are exactly as they're meant to be:

  • For gitignore, /man is ignored (line 2) but some of its contents aren't (line 1). 🤓
  • For binman-rake, the excess files it generates are ignored by gitignore line 2.
!/man/man?/*.?
/man

@sunaku sunaku mentioned this pull request Jul 22, 2021
Copy link
Owner

@sunaku sunaku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for revising this PR. 👍 It's coming along well, but I think it would benefit from some additional changes that I have requested via the GitHub code review system. In case I'm asking for too much, 😅 let me know -- I don't mind doing these myself. 🤓

dasht-docsets-extract "$root/Contents/Resources/tarix.tgz"
fi
done
dasht-docsets-inherit "$@"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good refactoring, but I don't think dasht should be calling the dasht-docsets-inherit script at all -- that should be left up to the user to run, if they choose. So I would remove this entire if-statement from dasht.


: ${DASHT_USE_DASH_APP:=false}
if $DASHT_USE_DASH_APP; then
: ${DASHT_DOCSETS_DIR:="$HOME/Library/Application Support/Dash"}
Copy link
Owner

@sunaku sunaku Jul 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the user to set the DASHT_USE_DASH_APP flag; this inheritance script should simply inherit from Dash for macOS if it's found. This would pave the way for further enhancement of this script in the future, say, to inherit from additional sources such as Zeal docsets and custom user docsets.

# Ensure any docsets we are accessing from the Dash.app library have had their
# documents extracted. Dash.app leaves the documents in the tarball by default.
dasht-docsets "$@" | while read docset; do
for root in "$DASHT_DOCSETS_DIR"/*/*/"$docset".docset; do
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think $DASHT_DOCSETS_DIR shouldn't be redefined, because the user may have set it to something special. Instead, the loop should try looking in $HOME/Library/Application Support/Dash (which can be factored out as follows).

KAPELI_DASH_DIR="$HOME/Library/Application Support/Dash"
# ...
if test -e "$KAPELI_DASH_DIR"; then
  for root in "$KAPELI_DASH_DIR"/*/*/"$docset".docset; do
    # ...

Copy link
Contributor Author

@smackesey smackesey Jul 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably better for you to make the changes, because it seems like we have somewhat different architectural ideas about the interface, and I think if I made just the changes you requested, it would leave the project with a weird mixture-- if you want to move in the requested direction, you'll probably need to adjust some of my other changes as well. So I'll explain the overall design I followed, then you can do what you want with it.

When making my changes, this was my thinking: the user is using dasht in one of two "modes": regular mode or Dash.app mode. Dash.app mode is activated by setting the DASHT_USE_DASH_APP flag. All of the other scripts dealing with docset management (install,extract,remove,update) behave conditionally on this flag-- that is why dasht-docsets-inherit does as well, it would be weird to have only script that didn't.

Similarly, dasht-docsets-inherit just follows the behavior of the other scripts re DASHT_DOCSETS_DIR as well-- it sets a default value, the macOS location (so a user setting would not be changed). In general, I don't expect people to simultaneously set DASHT_DOCSETS_DIR and DASHT_USE_DASH_APP-- the exception is if Dash.app is in some exotic location.

I was also a bit confused by the name dasht-docsets-inherit that you requested. To me, this suggests a kind of transferring of Dash.app docsets into dasht, e.g. Dash.app docsets would be copied over and extracted into the dedicated DASHT_DOCSETS_DIR directory. If that were the case, I would understand why you wouldn't want this called in the main dasht executable-- it would be a kind of separate import operation. But this is not how it works at present. All dasht-docsets-inherit does is perform in-place extraction of Dash.app docsets within their existing directory. This is a one-time operation that is necessary when accessing Dash.app docsets. Given the current design, I don't understand the rationale for running this separately. As it is, if the user downloads some datasets in Dash.app, they are seamlessly available in dasht. But if you make them run dasht-docsets-inherit separately, then every time they download a new Dash docset, they have to remember to run dasht-docsets-inherit, for no benefit that I can see.

It seems to me that your vision is better suited to a system where, rather than having Dash.app mode and regular mode (as I designed it), dasht always just reads from it's own dedicated docset database/directory, and you can import into this directory from either Dash.app or Zeal/custom whatever. That sounds like a good design, but that implies bigger changes (removing DASHT_USE_DASH_APP entirely).

Copy link
Owner

@sunaku sunaku Jul 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well said! 💯 You've understood my vision correctly and, thanks to your clear explanation, I now realize that I didn't quite understand your approach in this PR until now. 😅 Let me think more about how to reconcile our different approaches. 🤔

One of the reasons for my knowledge gap is that I don't have access to (and have never actually used) the Dash for macOS app firsthand. So I don't know how Dash organizes its docsets on the filesystem nowadays, other than from a helpful (but now possibly outdated) user-provided tip in #15 (comment).

In particular, since you intended to extract Dash's docsets directly within its own internal area (as opposed to dasht's DASHT_DOCSETS_DIR area), I'm wondering how Dash operates: 🤔 it either has its own internal area where it extracts docset tarballs (in which case, couldn't we just point to that instead of re-extracting for dasht?) or it has some kind of virtual filesystem adapter that dynamically exposes the contents of docset tarballs as if they were normal extracted files on disk.

I'm intrigued by the latter possibility 🤓 and am curious to try implementing a very basic, shell scripted version thereof. A direct consequence of this would be that dasht would need to become web-server based by default, since direct file access can't be intercepted by mere shell scripts: that would require a real VFS, which is beyond the scope of this project.

Could you help me understand how Dash organizes its docsets on your system? A simple diagram, such as the output of the tree command, as exemplified in #15 (comment)), would be sufficient. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad I could clear things up!

Since the initial work on this was done close to a year ago, I don't remember everything I learned about Dash.app's internal organization of docsets when I started work on this, but I remember being surprised that I couldn't find extracted versions of the docsets in Dash's saved data. I don't know how it accesses the tarball content, I assumed it uses some library I was unaware of for reading tarballs directly, and since this was a shell script project, I didn't investigate further.

Here's an example of the structure of a DocSet as Dash stores it:

── SciPy.docset
    └── Contents
        ├── Info.plist
        └── Resources
            ├── docSet.dsidx
            ├── optimizedIndex.dsidx
            ├── tarix.tgz
            └── tarixIndex.db

And attached is the full output of ~/Library/Application Support/Dash.
tree.txt

Copy link
Owner

@sunaku sunaku Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed report! 🙇 There seems to be a mixture of fully extracted docsets (which is what dasht currently uses) as well as unextracted tarix.tgz ones.

Could you inspect the tarixIndex.db file further? I'm thinking that it's either a repackaged version of tarix.tgz (to avoid the overhead of gunzip before accessing the contents of the underlying tar) or a searchable index thereof (to avoid the same gunzip overhead, but meant for traversing the underlying tar). 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That mixture is there because I use my fork of dasht, which does the extraction in place on-demand. I haven't searched every docset in my Dash library with dasht, so not every docset is extracted.

Here's a link to a tarixIndex.db file, it's a SQLite database (Github won't let me attach this file type directly to the issue). https://file.io/1Lsva6fX1MLA

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants