
How to remove the contents of the images? #1346

Answered by JorjMcKie
yusheng0104 asked this question in Q&A

I don't think I understand your problem.
There are several ways to ignore images and extract only text. The page method get_text takes an option string as its first argument and a keyword parameter flags; together they control this:

  • option = "text" will only extract text
  • switching off TEXT_PRESERVE_IMAGES in flags will extract only text for the other option values ("blocks", "dict", etc.)
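
The two options above could look roughly like this in practice. This is only a sketch: the helper names clear_flag and text_without_images and the pdf_path argument are made up for illustration, and the starting flag combination is an assumption rather than a statement of what your PyMuPDF version uses by default.

```python
def clear_flag(flags: int, flag: int) -> int:
    """Return `flags` with the bit(s) of `flag` switched off."""
    return flags & ~flag


def text_without_images(pdf_path: str):
    """Return a list of (plain_text, dict_blocks) tuples, one per page.

    Hypothetical helper for illustration; `pdf_path` is any PDF on disk.
    """
    import fitz  # PyMuPDF; imported here so clear_flag works without it

    # Assumed starting flags for illustration; check your version's defaults.
    flags = (
        fitz.TEXT_PRESERVE_LIGATURES
        | fitz.TEXT_PRESERVE_WHITESPACE
        | fitz.TEXT_PRESERVE_IMAGES
    )
    # Switch off image extraction for "dict", "blocks", etc.
    flags = clear_flag(flags, fitz.TEXT_PRESERVE_IMAGES)

    results = []
    doc = fitz.open(pdf_path)
    for page in doc:
        # Option 1: the "text" option returns plain text only.
        plain = page.get_text("text")
        # Option 2: a structured option with the image flag cleared.
        blocks = page.get_text("dict", flags=flags)["blocks"]
        results.append((plain, blocks))
    doc.close()
    return results
```

Calling text_without_images("input.pdf") (the file name is a placeholder) should then give, for each page, the plain text and a block list that contains text blocks only.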

Please be more specific.


This discussion was converted from issue #1344 on October 27, 2021 23:26.