-*- mode: org -*- Guile-syntax-highlight is a general-purpose syntax highlighting library for GNU Guile. It can parse code written in various programming languages into a simple s-expression that can be easily converted to HTML (via SXML) or any other format for rendering. * Supported Languages - C - CSS - Scheme - XML * Example #+BEGIN_SRC scheme (use-modules (syntax-highlight) (syntax-highlight scheme) (sxml simple)) (define code "(define (square x) \"Return the square of X.\" (* x x))") ;; Get raw highlights list. (define highlighted-code (highlight lex-scheme code)) ;; Convert to SXML. (define highlighted-sxml (highlights->sxml highlighted-code)) ;; Write HTML to stdout. (sxml->xml highlighted-sxml) (newline) #+END_SRC * Implementation details Very simple monadic parser combinators (supporting only regular languages) are used to tokenize the characters within a string or port and return a list consisting of two types of values: strings and two element tagged lists. A tagged list consists of a symbol designating the type of the text (symbol, keyword, string literal, etc.) and a string of the text fragment itself. #+BEGIN_SRC scheme ((open "(") (special "define") " " (open "(") (symbol "square") " " (symbol "x") (close ")") " " (string "\"Return the square of X.\"") " " (open "(") (symbol "*") " " (symbol "x") " " (symbol "x") (close ")") (close ")")) #+END_SRC The term "parse" is used loosely here as the general act of reading text and building a machine readable data structure out of it based on a set of rules. These parsers perform lexical analysis; they are not intended to produce the abstract syntax-tree for any given language. The parsers, or lexers, attempt to tokenize and tag fragments of the source. A "catch all" rule in each language's highlighter is used to deal with text that doesn't match any recognized syntax and simply produces an untagged string. Most syntax highlighters use lots of regular expressions to do their magic, but guile-syntax-highlight uses a purely functional, monadic parser combinator interface instead. This makes it easy for developers to build complex parsers by creating compositions of many simpler ones. Additionally, rather than working with raw strings or Guile's file ports, the input code is represented as a lazy stream of characters using the SRFI-41 streams library. By using streams, parsers do not have to worry about things like reverting the character index from which a file is being read upon a failure, or any other state management. Parsers simply return the thing they parsed, if any, and the remaining stream to be consumed by another parser. * Requirements - GNU Guile >= 2.0.9 * Building Guile-syntax-highlight uses the familiar GNU build system and requires GNU Make to build. ** From tarball After extracting the tarball, run: #+BEGIN_SRC sh ./configure make make install #+END_SRC ** From Git In addition to GNU Make, building from Git requires GNU Automake and Autoconf. #+BEGIN_SRC sh git clone git@dthompson.us:guile-syntax-highlight.git cd guile-syntax-highlight ./bootstrap ./configure make make install #+END_SRC * License LGPLv3 or later. See =COPYING= for the full license text.