Ahmed Hisham Ismail edited this page Jun 27, 2014 · 2 revisions

The class spdx.parsers.lexers.tagvalue.Lexer converts an SPDX file in Tag/Value format into a stream of tokens for the parser. It defines a token for every valid tag. By default, the Lexer ignores empty lines and lines starting with a # character.

It is implemented using ply.lex.

Fields

  • reserved: A dict of (keyword, token) pairs. It contains all of the valid tags, plus the special values UNKNOWN, NOASSERTION, NONE, SOURCE, BINARY, OTHER and ARCHIVE.
  • states: A tuple of states for the lexer.
  • tokens: A list of all token types, so that the parser knows them.
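
As a rough sketch of the shape of these three fields, the snippet below uses a small, made-up subset of tags. The token names (DOC_VERSION, PKG_NAME, NO_ASSERT, UNKNOWN_VALUE, LINE) and the token_type helper are assumptions for illustration and may not match the names used in the library; the (name, type) form of each states entry follows the ply.lex convention.

```python
# Illustrative subset only: the token names on the right are assumptions,
# not necessarily the ones the real Lexer uses.
reserved = {
    "SPDXVersion": "DOC_VERSION",
    "PackageName": "PKG_NAME",
    "NOASSERTION": "NO_ASSERT",
    "NONE": "NONE",
    "UNKNOWN": "UNKNOWN_VALUE",
}

# ply.lex expects each extra state as a (name, type) pair.
# The INITIAL state is implicit and is not listed.
states = (("text", "exclusive"),)

# Token types the parser needs to know about: a few generic ones
# plus one per reserved keyword.
tokens = ["TEXT", "LINE"] + list(reserved.values())


def token_type(word):
    """Hypothetical helper: look up a word, falling back to LINE."""
    return reserved.get(word, "LINE")
```

Looking a word up in reserved and falling back to a generic token type is a common way to keep the number of regex rules small in a ply lexer.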

Lexer States

There are two states: INITIAL, which is the default initial state, and text, which recognizes free-form text between <text> and </text> tags. The text state is exclusive: its rules apply only inside it, and the INITIAL rules do not apply while the lexer is in it.
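
The two-state idea can be illustrated without ply. The following is a standard-library sketch, not the Lexer's actual code: the scan function and its (state, chunk) output format are hypothetical. It switches to the text state at <text> and back to INITIAL at </text>, and while in text it looks for nothing except the closing tag, so a # inside the block is kept as ordinary text.

```python
import re

TEXT_OPEN = re.compile(r"<text>")
TEXT_CLOSE = re.compile(r"</text>")


def scan(data):
    """Yield (state, chunk) pairs, switching state at <text> and </text>."""
    state = "INITIAL"
    pos = 0
    while pos < len(data):
        if state == "INITIAL":
            match = TEXT_OPEN.search(data, pos)
            end = match.start() if match else len(data)
            chunk = data[pos:end].strip()
            if chunk:
                yield ("INITIAL", chunk)
            if match is None:
                return
            pos = match.end()
            state = "text"
        else:
            # Only the closing tag is recognized here; everything else,
            # including '#' and newlines, is kept verbatim.
            match = TEXT_CLOSE.search(data, pos)
            if match is None:
                raise ValueError("unterminated <text> block")
            yield ("text", data[pos:match.start()])
            pos = match.end()
            state = "INITIAL"
    if state == "text":
        raise ValueError("unterminated <text> block")


sample = "PackageComment: <text>line one\n# not a comment</text>\nPackageName: x"
chunks = list(scan(sample))
```

Here chunks holds three pairs: the tag before the block, the raw text between the tags, and the line that follows it.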

Rules

Rules are defined as a series of methods with the t_ prefix; the regex that recognizes each token is the method's docstring. The order of the methods determines the precedence of the rules.
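
To show how this convention works, here is a toy imitation written with the standard library only. It is not ply and not the Lexer's code: MiniLexer, its two rules and the tuple return values are all hypothetical, and real ply rule functions receive and return a token object instead. The sketch collects the t_ methods in definition order, compiles each docstring as a regex, and tries the rules in that order.

```python
import re


class MiniLexer:
    """Toy lexer mimicking the t_ convention; this is not ply itself."""

    def t_KEYWORD(self, value):
        r"[A-Za-z]+:"
        return ("KEYWORD", value[:-1])

    def t_WORD(self, value):
        r"[^\s]+"
        return ("WORD", value)

    def rules(self):
        # The class namespace preserves definition order, so the list
        # below is already sorted by rule precedence.
        return [
            (re.compile(func.__doc__), func)
            for name, func in vars(type(self)).items()
            if name.startswith("t_")
        ]

    def tokenize(self, data):
        rules = self.rules()
        pos = 0
        while pos < len(data):
            if data[pos].isspace():
                pos += 1
                continue
            for pattern, func in rules:
                match = pattern.match(data, pos)
                if match:
                    yield func(self, match.group())
                    pos = match.end()
                    break
            else:
                raise ValueError(f"illegal character at position {pos}")


result = list(MiniLexer().tokenize("PackageName: spdx"))
```

Because t_KEYWORD is defined before t_WORD, "PackageName:" is reported as a keyword even though the more general word pattern would also match it. Swapping the two methods would change the result.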
