Optional indentation layout: insert NEWLINE / INDENT / DEDENT tokens.

#include <cstddef>
#include <limits>
#include <span>
#include <stdexcept>
#include <string>
#include <vector>
#include "token.hpp"

Classes
class	scilex::layout_error
	Thrown when a line's indentation matches no enclosing level.

Namespaces
namespace	scilex
	The SciLex public API (scilex::lexer, scilex::rule, scilex::token).

Functions
std::vector< token >	scilex::layout (std::span< const token > tokens, const std::vector< bool > &mode_significant={})
	Rewrites `tokens` with NEWLINE / INDENT / DEDENT inserted.

Variables
constexpr int	scilex::newline {std::numeric_limits<int>::min() + 1}
	Reserved kind: end of a logical line.

constexpr int	scilex::indent {std::numeric_limits<int>::min() + 2}
	Reserved kind: indentation increased (start of a deeper block).

constexpr int	scilex::dedent {std::numeric_limits<int>::min() + 3}
	Reserved kind: indentation decreased (end of a block).
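
The three reserved kinds occupy consecutive values just above the minimum `int`. As a rough illustration of why that placement is convenient, the sketch below mirrors the documented values in a separate namespace and adds a range check. The namespace `kinds_sketch` and the helper `is_layout_kind` are hypothetical names used only here, and the idea that a grammar's own kinds are small non-negative integers is an assumption, not something this page states.

```cpp
#include <limits>

namespace kinds_sketch {

// Mirrors of the documented reserved kinds (same expressions as above).
constexpr int newline{std::numeric_limits<int>::min() + 1};
constexpr int indent{std::numeric_limits<int>::min() + 2};
constexpr int dedent{std::numeric_limits<int>::min() + 3};

// Hypothetical helper: true only for the three layout kinds, which form
// one contiguous range at the bottom of the int range.
constexpr bool is_layout_kind(int kind) noexcept {
    return kind >= newline && kind <= dedent;
}

static_assert(newline < indent && indent < dedent, "kinds are distinct");
static_assert(is_layout_kind(indent), "indent is a layout kind");
static_assert(!is_layout_kind(0), "an ordinary kind such as 0 is not");

}  // namespace kinds_sketch
```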

Detailed Description

Optional indentation layout: insert NEWLINE / INDENT / DEDENT tokens.

Some languages (Python-like, e.g. SciLang) make indentation significant. This opt-in pass turns a flat token stream into a layout-aware one: it inserts a scilex::newline at each logical line end, and scilex::indent / scilex::dedent where the leading indentation changes.

It works purely from token positions — every scilex::token already carries its source line and (byte) column — so the base lexer needs no change and may keep skipping whitespace. Lines with no token (blank or comment-only) carry no structure and are naturally ignored.

Indentation width is the byte column of a line's first token, so a tab and a space each count as one column. The pass does not police mixed tabs and spaces, and by default (with no significance policy) there is no implicit line continuation inside brackets.
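
The positional idea described so far can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in layout.hpp: `layout_sketch::tok` is a hypothetical stand-in for scilex::token, the `end_of_input` value is assumed, `std::runtime_error` stands in for scilex::layout_error, the first line is assumed to set the base level, and closing the stream with a NEWLINE plus one DEDENT per open level is an assumption borrowed from Python-like tokenizers. The positions given to inserted tokens are arbitrary here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

namespace layout_sketch {

// Hypothetical stand-in for scilex::token: only the fields this pass reads.
struct tok {
    int kind;
    std::size_t line;
    std::size_t column;  // byte column of the token's first byte
};

// Reserved kinds as documented; the end_of_input value is an assumption.
constexpr int end_of_input = std::numeric_limits<int>::min();
constexpr int newline = std::numeric_limits<int>::min() + 1;
constexpr int indent = std::numeric_limits<int>::min() + 2;
constexpr int dedent = std::numeric_limits<int>::min() + 3;

inline std::vector<tok> positional_layout(const std::vector<tok>& in) {
    // The input must be end-of-input-terminated, hence never empty.
    assert(!in.empty() && in.back().kind == end_of_input);

    std::vector<tok> out;
    std::vector<std::size_t> levels;  // stack of enclosing indentation columns
    const tok* prev = nullptr;        // previous source token, if any

    for (std::size_t i = 0; i + 1 < in.size(); ++i) {
        const tok& t = in[i];
        if (prev == nullptr) {
            levels.push_back(t.column);  // assumption: first line sets the base
        } else if (t.line != prev->line) {
            // A new line starts: close the previous logical line first.
            out.push_back({newline, prev->line, prev->column});
            if (t.column > levels.back()) {
                levels.push_back(t.column);
                out.push_back({indent, t.line, t.column});
            } else {
                while (levels.size() > 1 && t.column < levels.back()) {
                    levels.pop_back();
                    out.push_back({dedent, t.line, t.column});
                }
                if (t.column != levels.back()) {
                    throw std::runtime_error("indentation matches no enclosing level");
                }
            }
        }
        out.push_back(t);
        prev = &t;
    }

    const tok& eof = in.back();
    if (prev != nullptr) {
        // Assumption: end the last line and close every block still open.
        out.push_back({newline, prev->line, prev->column});
        for (; levels.size() > 1; levels.pop_back()) {
            out.push_back({dedent, eof.line, eof.column});
        }
    }
    out.push_back(eof);  // the terminal end_of_input is preserved
    return out;
}

// Convenience for inspecting a result: just the kinds, in order.
inline std::vector<int> kinds(const std::vector<tok>& ts) {
    std::vector<int> ks;
    for (const tok& t : ts) ks.push_back(t.kind);
    return ks;
}

}  // namespace layout_sketch
```

Because blank and comment-only lines produce no tokens, the loop never sees them, which is why they need no special case in a position-driven pass.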

This pass is positional. With no significance policy it is mode-blind: every token shapes indentation, which is byte-for-byte the original behaviour.

A per-mode significance policy (Layout Awareness Level A) lets a mode be marked insignificant, so its tokens pass through without affecting layout. This is how a multi-line flow collection (examples/yaml.hpp) and implicit line continuation inside brackets (examples/python.hpp) avoid spurious structure. A mode marked insignificant must be self-delimited (entered and left by its own tokens). Block scalars | / > and heredocs are a deeper case (a reference indent in the frame); that is Layout Awareness Level B, still to come.

Two invariants hold:

(1) An empty policy ⇒ byte-for-byte the positional pass, at zero cost.
(2) The mode is the single source of truth for the policy: there is no per-rule flag (e.g. ignore_layout); significance is derived from the mode, never beside it.
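
One way to picture the per-mode policy is as a lookup applied before any positional logic runs. The sketch below is illustrative only and makes several assumptions: `policy_sketch::tok` is a hypothetical token that records the index of the lexer mode that produced it, the policy vector is indexed by that mode, and a mode index beyond the policy's size is treated as significant. None of these details are confirmed by this page beyond the `mode_significant` parameter itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace policy_sketch {

// Hypothetical token: assumes each token records the index of its lexer mode.
struct tok {
    int kind;
    std::size_t mode;
    std::size_t line;
    std::size_t column;
};

// An empty policy makes every mode significant, which reproduces the
// mode-blind positional pass. Assumption: a mode index past the end of the
// policy is also treated as significant.
inline bool significant(std::size_t mode, const std::vector<bool>& policy) {
    return mode >= policy.size() || policy[mode];
}

// Returns the tokens that would shape indentation under `policy`.
// Tokens from insignificant modes are skipped here; in a full pass they
// would still be copied to the output, just without influencing layout.
inline std::vector<tok> shaping_tokens(const std::vector<tok>& in,
                                       const std::vector<bool>& policy) {
    assert(!in.empty());  // an end-of-input-terminated stream is never empty
    std::vector<tok> out;
    for (const tok& t : in) {
        if (significant(t.mode, policy)) out.push_back(t);
    }
    return out;
}

}  // namespace policy_sketch
```

Deriving significance from the mode alone, as in `significant`, keeps the decision in one place, which matches the second invariant: no token or rule carries its own layout flag.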

Input must be an end-of-input-terminated token sequence (the lexer's eof_policy::append); the terminal scilex::end_of_input is preserved.

Definition in file layout.hpp.
