summaryrefslogtreecommitdiff
path: root/README
blob: 7b4ee3e881e8a8bf14b7ddf01a6b54594c4f5435 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
-*- mode: org -*-

Guile-syntax-highlight is a general-purpose syntax highlighting
library for GNU Guile.  It can parse code written in various
programming languages into a simple s-expression that can be easily
converted to HTML (via SXML) or any other format for rendering.

* Supported Languages

  - C
  - CSS
  - Scheme
  - XML

* Example

  #+BEGIN_SRC scheme
    (use-modules (syntax-highlight)
                 (syntax-highlight scheme)
                 (sxml simple))

    (define code
      "(define (square x) \"Return the square of X.\" (* x x))")

    ;; Get raw highlights list.
    (define highlighted-code
      (highlight lex-scheme code))

    ;; Convert to SXML.
    (define highlighted-sxml
      (highlights->sxml highlighted-code))

    ;; Write HTML to stdout.
    (sxml->xml highlighted-sxml)
    (newline)
  #+END_SRC

* Implementation details

  Very simple monadic parser combinators (supporting only regular
  languages) are used to tokenize the characters within a string or
  port and return a list consisting of two types of values: strings
  and two element tagged lists.  A tagged list consists of a symbol
  designating the type of the text (symbol, keyword, string literal,
  etc.) and a string of the text fragment itself.

  #+BEGIN_SRC scheme
    ((open "(")
     (special "define")
     " "
     (open "(")
     (symbol "square")
     " "
     (symbol "x")
     (close ")")
     " "
     (string "\"Return the square of X.\"")
     " "
     (open "(")
     (symbol "*")
     " "
     (symbol "x")
     " "
     (symbol "x")
     (close ")")
     (close ")"))
  #+END_SRC

  The term "parse" is used loosely here as the general act of reading
  text and building a machine readable data structure out of it based
  on a set of rules.  These parsers perform lexical analysis; they are
  not intended to produce the abstract syntax-tree for any given
  language.  The parsers, or lexers, attempt to tokenize and tag
  fragments of the source.  A "catch all" rule in each language's
  highlighter is used to deal with text that doesn't match any
  recognized syntax and simply produces an untagged string.

  Most syntax highlighters use lots of regular expressions to do their
  magic, but guile-syntax-highlight uses a purely functional, monadic
  parser combinator interface instead.  This makes it easy for
  developers to build complex parsers by creating compositions of many
  simpler ones.  Additionally, rather than working with raw strings or
  Guile's file ports, the input code is represented as a lazy stream
  of characters using the SRFI-41 streams library.  By using streams,
  parsers do not have to worry about things like reverting the
  character index from which a file is being read upon a failure, or
  any other state management.  Parsers simply return the thing they
  parsed, if any, and the remaining stream to be consumed by another
  parser.

* Requirements

  - GNU Guile >= 2.0.9

* Building

  Guile-syntax-highlight uses the familiar GNU build system and
  requires GNU Make to build.

** From tarball

   After extracting the tarball, run:

   #+BEGIN_SRC sh
     ./configure
     make
     make install
   #+END_SRC

** From Git

   In addition to GNU Make, building from Git requires GNU Automake
   and Autoconf.

   #+BEGIN_SRC sh
     git clone git@dthompson.us:guile-syntax-highlight.git
     cd guile-syntax-highlight
     ./bootstrap
     ./configure
     make
     make install
   #+END_SRC

* License

  LGPLv3 or later.  See =COPYING= for the full license text.