DevGuide

ShellCheck Dev Guide

Want to write a new test?

Some familiarity with Haskell helps. Most checks just use pattern matching and function calls. Grokking monads is generally not required, but do notation may come in handy.

Feel free to skip ahead to "ShellCheck in Practice".

ShellCheck Theory

Here's the basic flow of code through ShellCheck:

Parsing (emits parser warnings (1xxx))
AST Analysis (emits other warnings (2xxx))
Formatting and output

Of these, AST analysis is the most relevant, and where most of the interesting checks happen.

Parsing

The parser turns a string into an AST and zero or more warnings.

Parser warnings come in two flavors: problems and notes.

Notes are only emitted when parsing succeeds (they are stored in the Parsec user state). For example, a note is emitted when adding spaces around = in assignments, because if the parser later fails (i.e. it's not actually an assignment), we want to discard the suggestion:

when (hasLeftSpace || hasRightSpace) $
    parseNoteAt pos ErrorC 1068 "Don't put spaces around the = in assignments."

On the other hand, problems are always emitted, even when parsing fails (they are stored in a StateT higher than Parsec in the transformer stack). For example, a problem is emitted for unicode quotes, because this issue is likely to cause parsing to fail:

when (single && '\n' `elem` space) $
    parseProblemAt pos ErrorC 1080 "When breaking lines in [ ], you need \\ before the linefeed."

So basically, notes are emitted for non-fatal warnings while problems are emitted for fatal ones.

There's a distinction because often you can emit useful information even when parsing fails (suggestions for how to fix it). Likewise, there's often issues that only make sense in context, and shouldn't be emitted if the result does not end up being used. There are probably better solutions for this.

Here are the full types of the parser:

                              v-- Read real/mocked files  v-- Stores parse problems
type SCBase m = Mr.ReaderT (SystemInterface m) (Ms.StateT SystemState m)
type SCParser m v = ParsecT String UserState (SCBase m) v
                                   ^-- Stores parse notes and token offsets

AST Analysis

AST analysis comes in two primary flavors: checks that run on the root node (sometimes called "tree checks"), and checks that run on every node (sometimes called "node checks"). Due to poor planning, these can't be distinguished by type because they both just take a Token parameter.

Here's a simple check designed to run on each node, using pattern matching to find backticks:

checkBackticks _ (T_Backticked id list) | not (null list) =
    style id 2006 "Use $(..) instead of legacy `..`."
checkBackticks _ _ = return ()

A lot of checks are just like this, though usually with a bit more matching logic.

Each check is preceded by some mostly self-explanatory unit tests:

prop_checkBackticks1 = verify checkBackticks "echo `foo`"
prop_checkBackticks2 = verifyNot checkBackticks "echo $(foo)"
prop_checkBackticks3 = verifyNot checkBackticks "echo `#inlined comment` foo"

There are a few specialized test types for efficiency reasons.

For example, many tests trigger only for certain commands. This could be done by N tests like the above, each matching command nodes and checking that the command name applies (N node patches, N command name extractions, N comparisons). It's more efficient to just have 1 node match, 1 name extraction, and then a map lookup to find one or more command handlers. Such checks just register to handle a command name, and can be found in Checks/Command.hs.

Similarly, some checks only trigger for a certain shell. This could be done by N tree checks that optionally iterate the tree, or N node checks that match a node and skip emitting for certain shells, but it's more efficient to iterate the tree once with all applicable checks. Such checks just register to handle nodes for a certain shell, and can be found in Checks/ShellSupport.hs.

Formatting

ShellCheck has multiple output formatters. These take parsing results and outputs them as JSON, XML or human readable output. They rarely need tweaking. Anyone looking for a different output format should consider transforming one of the existing ones (with XSLT or python or similar) rather than writing a new formatter.

ShellCheck in Practice

Let's say that we have a pet peeve: people who use tmp as a temporary filename. We want to warn about statements like sort file > tmp && mv tmp file, and suggest using mktemp instead.

We can start by looking at the AST for sort file > tmp. In a ShellCheck source directory, run cabal build to generate necessary files, then run ghci ShellCheck/Parser.hs:

$ ghci ShellCheck/Parser.hs
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /mnt/backup/live/devices/vm-home/haskell/shpell/.ghci
[1 of 7] Compiling Paths_ShellCheck ( dist/build/autogen/Paths_ShellCheck.hs, interpreted )
[2 of 7] Compiling ShellCheck.Regex ( ShellCheck/Regex.hs, interpreted )
[3 of 7] Compiling ShellCheck.AST   ( ShellCheck/AST.hs, interpreted )
[4 of 7] Compiling ShellCheck.ASTLib ( ShellCheck/ASTLib.hs, interpreted )
[5 of 7] Compiling ShellCheck.Interface ( ShellCheck/Interface.hs, interpreted )
[6 of 7] Compiling ShellCheck.Data  ( ShellCheck/Data.hs, interpreted )
[7 of 7] Compiling ShellCheck.Parser ( ShellCheck/Parser.hs, interpreted )
Ok, modules loaded: ShellCheck.Parser, ShellCheck.AST, ShellCheck.ASTLib, ShellCheck.Data, ShellCheck.Interface, ShellCheck.Regex, Paths_ShellCheck.
*ShellCheck.Parser>

This has given us a REPL where we can call parsing functions. There's a convenient debugParse function that will take a parser and a string, and give the result. The main parser function is readScript:

*ShellCheck.Parser> debugParse readScript "sort file > tmp"
Right (T_Annotation (Id 1) [] (T_Script (Id 0) "" [T_Pipeline (Id 3) [] [T_Redirecting (Id 4) [T_FdRedirect (Id 10) "" (T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))] (T_SimpleCommand (Id 5) [] [T_NormalWord (Id 6) [T_Literal (Id 7) "sort"],T_NormalWord (Id 8) [T_Literal (Id 9) "file"]])]]))
*ShellCheck.Parser>

Not very pretty, but we can see the part we're interested in:

(T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"]))

We can compare this with the definition in AST.hs:

              v-- Redirection operator (T_Greater)
| T_IoFile Id Token Token
                    ^-- Filename (T_NormalWord)

Let's just add a check to Analytics.hs:

  checkTmpFilename _ token =
      case token of
        T_IoFile id operator filename  ->
          warn id 9999 $ "We found this node: " ++ (show token)
        _ -> return ()

and then append checkTmpFilename to the list of node checks at the top of the file:

  nodeChecks :: [Parameters -> Token -> Writer [TokenComment] ()]
  nodeChecks = [
      checkUuoc
      ,checkPipePitfalls
      ,checkForInQuoted
      ...
      ,checkTmpFilename  -- Here
    ]

Now we can compile and build to see the check apply:

cabal build && dist/build/shellcheck/shellcheck - <<< "sort file > tmp"

Alternatively, we can run it in interpreted mode, which is often way faster:

./quickrun - <<< "sort file > tmp"

In either case, our warning now shows up:

In - line 1:
sort file > tmp
^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.
          ^-- SC9999: We found this node: T_IoFile (Id 11) (T_Greater (Id 12)) (T_NormalWord (Id 13) [T_Literal (Id 14) "tmp"])

Now we can flesh out the check. See ASTLib.hs and AnalyzerLib.hs for convenient functions to work with AST nodes, such as getting the name of an invoked command, getting a list of flags using canonical flag parsing rules, or in this case, getting the literal string of a T_NormalWord so that it doesn't matter if we use > 'tmp', > "tmp" or > "t"'m'p:

  checkTmpFilename _ token =
      case token of
        T_IoFile id operator filename  ->
          when (getLiteralString filename == Just "tmp") $
            warn (getId filename) 9999 $ "Please use mktemp instead of the filename 'tmp'."
        _ -> return ()

We can also prepend a few unit tests that will automatically be picked up if they start with prop_:

prop_checkTmpFilename1 = verify checkTmpFilename "sort file > tmp"
prop_checkTmpFilename2 = verifyNot checkTmpFilename "sort file > $tmp"

We can run these tests with cabal test, or in interpreted mode with ./quicktest. If the command exits with success, it's good to go.

If we wanted to submit this test, we could run ./nextnumber which will output the next unused SC2xxx code, e.g. 2213 as of writing.

We now have a completely functional test, yay!

Enjoy what I am trying to create as I am the creator

ShellCheck

Installation
- CentOS6
- More Installation Guides
Usage
- Directive
- Ignore
Integrating and extending
- Misc. contributions
- Azure Pipelines
- Buildkite
- CircleCI
- DevGuide
- Integration
- JUnit
- Phabricator
- TravisCI
- GitLab CI
- Drone
- GitHub Actions
Severity
Template

Each individual ShellCheck warning has its own wiki page like SC1000. Use GitHub Wiki's "Pages" feature above to find a specific one, or see Checks.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly