Draft of feat: allow "negative" text queries #19

ramayer · 2021-09-28T08:06:42Z

This is a draft of a patch for something similar to issue #18 -- but it was not implemented exactly according to the requirements described in that issue.

Instead of a single string containing both the positive and negative clauses, I think it would be cleaner if the additive and subtractive phrases used separate command line parameters, like:

rclip zebra --minus="black and white" --plus="red and blue"

More details are mentioned in a comment under #18 .

If you think this is a good direction, I could clean it up more (add examples to docs; and remove a no-longer-used method) and re-submit it.

yurijmikhalevich · 2021-09-28T16:33:33Z

Thank you! I like your suggestion, responded in the GHI: #18 (comment)

rclip/utils.py

rclip/model.py

rclip/main.py

ramayer · 2021-09-29T03:14:36Z

I updated the branch and rebased them to a single commit.

I think I incorporated most of the feedback, but pycodestyle was pretty picky about long lines, so a couple of the lines are wrapped uncomfortably. Let me know if there's more cleanup needed.

rclip/model.py

rclip/utils.py

rclip/model.py

yurijmikhalevich · 2021-09-29T08:54:17Z

Looks great! Thank you 😄 Can you please address a few nits, change "phrase" to "query" we will ship.

yurijmikhalevich · 2021-09-29T09:13:26Z

Do you have the time to add some tests? It will be great to have one that checks that we don't break the search when introducing changes like this one. It will make sense to PR this test separately.

ramayer · 2021-09-30T05:22:29Z

I'd be happy to add some tests if there was a template/framework for me to add to --- even if the initial template is just the equivalent of assert(1>0). If there already is such a framework in the project, sorry I didn't notice it. Googling it seems there are enough different ways of adding tests to python packages I'm not sure which you'd prefer. Also not sure if you'd want to add test images to the project; or download them from somewhere at test time.

I think I incorporated the recent feedback in the pull request; but I do agree with the idea of adding tests first before merging.

yurijmikhalevich · 2021-09-30T11:08:37Z

@ramayer, OK, let's leave tests aside for now. I'll take a look at them when I have time. It makes sense for me to store a dozen test images in git repo; just let's make sure that they are not heavy.

rclip/main.py

rclip/model.py

yurijmikhalevich · 2021-09-30T11:25:17Z

Thank you! A few more comments regarding the computation of the features.

ramayer · 2021-10-01T02:40:00Z

I agree with all your suggestions, and thanks for teaching me better numpy tricks. Too busy with work to get to them today; but I should have a cleaner pull request on the weekend.

ramayer · 2021-10-01T03:10:15Z

It makes sense for me to store a dozen test images in git repo; just let's make sure that they are not heavy.

Clip's preprocess tranform seems to scale everything to 244x244 (at least at the settings rclip uses), it seems it should work with pretty light test images ....

..... but hmm..... that means CLIP is missing small details from high resolution pictures.... which makes me want to consider another possible feature request. Maybe I want to index multiple different CLIP vectors from different crops of my larger pictures......

yurijmikhalevich · 2021-10-01T05:44:19Z

Too busy with work to get to them today; but I should have a cleaner pull request on the weekend.

Sure. No rush. Thank you for doing this! :-)

yurijmikhalevich · 2021-10-01T05:49:46Z

CLIP is missing small details from high resolution pictures.... which makes me want to consider another possible feature request. Maybe I want to index multiple different CLIP vectors from different crops of my larger pictures......

You'll be surprised how well 244x244 works. It will be interesting to experiment with different resolutions or crops, but I suspect that 244x244 should be good enough for rclip. And it is much faster compared to using multiple crops or a higher resolution. Usually, researchers tend to go with the smallest resolution that produces decent results.

ramayer · 2021-10-04T05:57:42Z

Updated with the most recent feedback. Now using np.add.reduce(...) instead of functools.reduce(...) and now passing lists to clip's encode_text.

rclip/model.py

yurijmikhalevich · 2021-10-04T08:09:27Z

rclip/model.py

-    similarities = (text_features @ item_features.T).squeeze(0).tolist()
+    positive_features = self.compute_text_features(positive_queries)
+    negative_features = self.compute_text_features(negative_queries)
+    text_features = np.add.reduce(positive_features) - np.add.reduce(negative_features)


This is good enough for now 👍 , but we may have done something like: np.add.reduce(self.compute_text_features(all_queries) * [1,1,1,-1,-1]) to compute all of the features at once. The array of ones and negative ones should be pre-created.

I thought about that - but I think the multiply is more expensive. If this times it correctly:

import numpy as np import timeit npos = nneg = 1000 pos = np.random.rand(npos,512) neg = np.random.rand(nneg,512) timeit.timeit(lambda: np.add.reduce(pos) - np.add.reduce(neg),number=1000) all = np.random.rand(npos + nneg,512) signs = np.array([1] * npos + [-1] * nneg) timeit.timeit(lambda: np.add.reduce(signs * all.T),number=1000)

I'm getting

>>> timeit.timeit(lambda: np.add.reduce(signs * all.T),number=1000) 2.642763003001164 >>> timeit.timeit(lambda: np.add.reduce(pos) - np.add.reduce(neg),number=1000) 0.7372207119988161

I also don't find the array of signs as easy to read (but maybe that's just me).

@ramayer, thank you for the benchmark! I agree that readibility suffers from the "sign array" approach. My main concern was with the performance difference of calling compute_text_features once vs. twice.

yurijmikhalevich · 2021-10-04T08:10:12Z

Hi! Thank you for the update. One nit.

yurijmikhalevich · 2021-10-04T08:21:14Z

Can you, please, also rebase on top of master?

talpay · 2021-10-04T09:30:29Z

2 questions on the latest PR:

Isn't add redundant as a CL parameter since it is the same as query? What content is supposed to go into add that is not in query (which is required)?

e.g. rclip "zebra black and white" -s "red and blue"
and rclip "zebra" -a "black and white" -s "red and blue"
are identical, right?

Wouldn't it make more sense to get rid of the redundant -a option to simplify things for users?

Queries without subtract (e.g. rclip "query") are now throwing:

Traceback (most recent call last):
  File "main.py", line 58, in <module>
    main()
  File "main.py", line 22, in main
    result = rclip.search(args.query, current_directory, args.top, args.add, args.subtract)
  File "/rclip/search.py", line 131, in search
    sorted_similarities = self._model.compute_similarities_to_text(features, positive_queries, negative_queries)
  File "/rclip/model.py", line 46, in compute_similarities_to_text
    negative_features = self.compute_text_features(negative_queries)
  File "/rclip/model.py", line 35, in compute_text_features
    text_encoded = self._model.encode_text(clip.tokenize(text).to(self._device))
  File "/anaconda3/lib/python3.8/site-packages/clip/model.py", line 344, in encode_text
    x = self.transformer(x)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.8/site-packages/clip/model.py", line 199, in forward
    return self.resblocks(x)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.8/site-packages/clip/model.py", line 186, in forward
    x = x + self.attention(self.ln_1(x))
  File "/anaconda3/lib/python3.8/site-packages/clip/model.py", line 183, in attention
    return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 978, in forward
    return F.multi_head_attention_forward(
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 4265, in multi_head_attention_forward
    k = k.contiguous().view(-1, bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0, 64] because the unspecified dimension size -1 can be any value and is ambiguous

I think the problem is that the current code assumes that there is always a negative query. One solution would be something like:

 def compute_similarities_to_text(
            self, item_features: np.ndarray,
            positive_queries: List[str], negative_queries: List[str]) -> List[Tuple[float, int]]:

        positive_features = self.compute_text_features(positive_queries)
        if len(negative_queries) > 0:
            negative_features = self.compute_text_features(negative_queries)
            text_features = np.add.reduce(positive_features) - np.add.reduce(negative_features)
        else:
            text_features = np.add.reduce(positive_features)

ramayer · 2021-10-04T10:08:58Z

Good point. I think the "--add" option is unnecessary. I'm happy to submit a version without it if we prefer
Thanks for catching that - fixing it now.

Rebased with: updates based on github code review feedback Github issue yurijmikhalevich#18 Co-authored-by: Yurij Mikhalevich <[email protected]>

ramayer · 2021-10-04T10:38:01Z

I think I fixed the bug.

I also created a branch without the explicit --add parameter here:
https://github.com/ramayer/rclip/tree/without_explicit_add_param
I don't have a strong opinion either way on that one.

I'm not sure how to re-open the pull request, though :( It seems it was closed when I was updating the branch.

ramayer · 2021-10-04T10:57:04Z

attempt at reopening using this technique: https://gist.github.com/guille-moe/cd41fdbc8969b15428a50af2543a5cfa --- sorry about my trouble with github

yurijmikhalevich · 2021-10-04T12:35:43Z

Hi @talpay, thank you for commenting.

This is where I was standing initially, but I changed my opinion recently.

I've checked and rclip "zebra black and white" and rclip "zebra" -a "black and white produce pretty similar, but different vectors.

More importantly, --add param will come in handy when we will start using images instead of the textual "query". Then -a will make perfect sense, for example: rclip ./daylight-forest-picture.jpg --add "night".

I believe that you will almost always be able to substitute --add with a properly phrased single query, but I think that we should leave it anyway for 1) future similar images lookup support 2) for flexibility. If you don't think that you need additions, don't use them, the change is non-breaking, and rclip will continue to support the "original" interface.

rclip/model.py

yurijmikhalevich · 2021-10-04T12:55:34Z

@ramayer, merged! Thank you! Great job! 😄

talpay · 2021-10-04T15:36:22Z

@yurijmikhalevich Thanks for the reply (and the awesome project). I've had a closer look and you're right, they're quite different and it is actually a very important distinction that should be clearly communicated:

rclip "zebra black and white"
leads to embedding(zebra black and white)

rclip "zebra" -a "black and white":
leads to embedding(zebra) + embedding(black and white)

These two might still get similar results but if we shift one more word, it should become clear:

rclip "zebra black" -a "and white":
leads to embedding(zebra black) + embedding(and white)

The first 2 queries will also return grayscale ("black and white") images of zebras whereas the last query will most likely not because "black and white" is not part of the embeddings (you can test this on @ramayer 's web ui).

After checking the code, I agree that this syntax is better but I would suggest making this behavior clearer by adding some good examples to the README.md, e.g. like above but also showing how you can chain multiple subtractions together like this:

rclip "zebra" -a "black and white" -s "car" -s "animal":
leads to embedding(zebra) + embedding(black and white) - embedding(car) - embedding(animal)

Maybe also adding the fact that we can actually use quotation marks in queries e.g. 'this is a "quoted" query' since this functionality is quite important to CLIP and also discussed in the paper).

On a final note, I could (subjectively) still imagine a single-string-syntax like rclip '+(positive word) +URL -word -("quoted" word)' to be more intuitive but implementing this parser is a bit trickier (e.g., to ensure correct grouping, that "-" can still function as a hyphen, and probably other stuff I'm overlooking that might turn out to be a headache).

yurijmikhalevich · 2021-10-04T17:54:37Z

@talpay, thank you! I am happy to hear that you like rclip. And thanks for a more detailed explanation. This is exactly what I was talking about.

I'll definitely add examples to the README before releasing the v2. And I agree with your outlook on a single-string syntax. I was considering something like that too and came to a similar conclusion. I also think that separate arguments just look cleaner.

ramayer · 2021-10-04T19:45:30Z

Here's a great example of the differences of some ways of processing phrases:

The phrase horse fly.
The phrase fly horse.
The vector for the word "horse" plus the vector for the word "fly"

Also interestingly:

The phrase "horse fly" with quotation marks. CLIP seems to understand that the intent is not a literal horse fly.
The phrase 'horse fly' - it infers something different depending on the kind of quotation marks.

On a final note, I could (subjectively) still imagine a single-string-syntax like rclip '+(positive word) +URL -word -("quoted" word)' to be more intuitive but implementing this parser is a bit trickier (e.g., to ensure correct grouping, that "-" can still function as a hyphen, and probably other stuff I'm overlooking that might turn out to be a headache).

I'm kinda forced to do that on the web UI (unless I want a multi-field advanced-search form) - but it becomes hard to remember, and excruciatingly painful to communicate. I even started going down the path of a crazy math for embeddings; starting with +2.5(dragon) +1.5(castle) to give different weights to the different phrases; and started looking into other operators (rotate the vector for "dragon" 30% into the direction of the vector of "castle").

And as we're discussing -- so many symbols mean something to CLIP like the phrase ❤️🌭 that it becomes hard to express different strings you might want to express without a syntax that's really robust at quoting weird characters. That path leads to madness or json or lisp.

I'm even thinking of just switching to a JSON syntax for all my web UI's math operators (which I already do in places like this that I wasn't bothered to come up with a clean syntax to express).

yurijmikhalevich · 2021-10-05T10:28:02Z

@ramayer, these are great examples indeed. It's also interesting to see how CLIP "distinguishes" between different kinds of quotes. I suspect that it's an artifact/bias learned from a dataset that inconsistently used both types of quotes.

ramayer mentioned this pull request Sep 28, 2021

feat: allow "negative" text queries #18

Closed

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch from f1c1aef to 8af4280 Compare September 28, 2021 08:13

yurijmikhalevich self-requested a review September 28, 2021 16:00

yurijmikhalevich linked an issue Sep 28, 2021 that may be closed by this pull request

feat: allow "negative" text queries #18

Closed

yurijmikhalevich reviewed Sep 28, 2021

View reviewed changes

rclip/utils.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 28, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 28, 2021

View reviewed changes

rclip/main.py Outdated Show resolved Hide resolved

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch 2 times, most recently from 373ee7b to 6d26173 Compare September 29, 2021 03:02

yurijmikhalevich reviewed Sep 29, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 29, 2021

View reviewed changes

rclip/utils.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 29, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch from 1959d1f to af08bd8 Compare September 30, 2021 05:18

yurijmikhalevich reviewed Sep 30, 2021

View reviewed changes

rclip/main.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 30, 2021

View reviewed changes

rclip/main.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 30, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Sep 30, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch 2 times, most recently from 66da726 to f79e2de Compare October 4, 2021 05:32

yurijmikhalevich reviewed Oct 4, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

yurijmikhalevich reviewed Oct 4, 2021

View reviewed changes

yurijmikhalevich previously approved these changes Oct 4, 2021

View reviewed changes

ramayer dismissed yurijmikhalevich’s stale review via 27b5e5d October 4, 2021 09:19

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch 2 times, most recently from 27b5e5d to 4874780 Compare October 4, 2021 09:28

feat: allow "negative" text queries

33dd995

Rebased with: updates based on github code review feedback Github issue yurijmikhalevich#18 Co-authored-by: Yurij Mikhalevich <[email protected]>

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch from 4874780 to 33dd995 Compare October 4, 2021 10:14

ramayer closed this Oct 4, 2021

ramayer deleted the allow_negative_text_queries_using_command_line_arguments branch October 4, 2021 10:27

ramayer reopened this Oct 4, 2021

ramayer force-pushed the allow_negative_text_queries_using_command_line_arguments branch from 0aee0f7 to 33dd995 Compare October 4, 2021 10:58

yurijmikhalevich reviewed Oct 4, 2021

View reviewed changes

rclip/model.py Outdated Show resolved Hide resolved

nit: if in python don't need braces

967f8bf

yurijmikhalevich approved these changes Oct 4, 2021

View reviewed changes

yurijmikhalevich merged commit acdecbe into yurijmikhalevich:main Oct 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Draft of feat: allow "negative" text queries #19

Draft of feat: allow "negative" text queries #19

ramayer commented Sep 28, 2021

yurijmikhalevich commented Sep 28, 2021

ramayer commented Sep 29, 2021

yurijmikhalevich commented Sep 29, 2021

yurijmikhalevich commented Sep 29, 2021 •

edited

Loading

ramayer commented Sep 30, 2021

yurijmikhalevich commented Sep 30, 2021

yurijmikhalevich commented Sep 30, 2021

ramayer commented Oct 1, 2021

ramayer commented Oct 1, 2021

yurijmikhalevich commented Oct 1, 2021

yurijmikhalevich commented Oct 1, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich Oct 4, 2021

ramayer Oct 4, 2021

yurijmikhalevich Oct 4, 2021

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021

talpay commented Oct 4, 2021 •

edited

Loading

ramayer commented Oct 4, 2021

ramayer commented Oct 4, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

talpay commented Oct 4, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich commented Oct 5, 2021

Draft of feat: allow "negative" text queries #19

Draft of feat: allow "negative" text queries #19

Conversation

ramayer commented Sep 28, 2021

yurijmikhalevich commented Sep 28, 2021

ramayer commented Sep 29, 2021

yurijmikhalevich commented Sep 29, 2021

yurijmikhalevich commented Sep 29, 2021 • edited Loading

ramayer commented Sep 30, 2021

yurijmikhalevich commented Sep 30, 2021

yurijmikhalevich commented Sep 30, 2021

ramayer commented Oct 1, 2021

ramayer commented Oct 1, 2021

yurijmikhalevich commented Oct 1, 2021

yurijmikhalevich commented Oct 1, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich Oct 4, 2021

Choose a reason for hiding this comment

ramayer Oct 4, 2021

Choose a reason for hiding this comment

yurijmikhalevich Oct 4, 2021

Choose a reason for hiding this comment

yurijmikhalevich commented Oct 4, 2021 • edited Loading

yurijmikhalevich commented Oct 4, 2021

talpay commented Oct 4, 2021 • edited Loading

ramayer commented Oct 4, 2021

ramayer commented Oct 4, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich commented Oct 4, 2021 • edited Loading

yurijmikhalevich commented Oct 4, 2021 • edited Loading

talpay commented Oct 4, 2021 • edited Loading

yurijmikhalevich commented Oct 4, 2021

ramayer commented Oct 4, 2021

yurijmikhalevich commented Oct 5, 2021

yurijmikhalevich commented Sep 29, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

talpay commented Oct 4, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

yurijmikhalevich commented Oct 4, 2021 •

edited

Loading

talpay commented Oct 4, 2021 •

edited

Loading