RedPen Server

The RedPen server delivers the majority of RedPen’s functionality via a simple HTTP REST API.

Starting the RedPen server

Please refer to the Commands page for details on how to start the RedPen server.

RedPen Server API

Configuration

/rest/config/redpens

Returns the configurations of the available preconfigured redpens.

GET Parameters:

  • lang=xx restricts the returned configurations to those that match the specified language. By default, all configurations are returned.
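
As a minimal sketch, the request URL can be assembled with Python's standard library. The host and port (localhost:8080) are taken from the curl examples on this page. The helper name config_url is hypothetical and is not part of RedPen.

```python
from urllib.parse import urlencode

# Assumption: the server listens on localhost:8080, as in the curl examples.
BASE_URL = "http://localhost:8080"


def config_url(lang=None, base_url=BASE_URL):
    """Build the URL for /rest/config/redpens (hypothetical helper).

    When lang is given it is sent as the lang GET parameter;
    otherwise the URL asks for all configurations.
    """
    url = base_url + "/rest/config/redpens"
    if lang:
        url += "?" + urlencode({"lang": lang})
    return url
```

The resulting string can be passed to urllib.request.urlopen, or to curl, once a server is running.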

The JSON response is as follows:

{
  "version": "1.1.2",
  "documentParsers": ["PLAIN", "MARKDOWN", "WIKI"],
  "redpens": {
     "en": {
      "lang": "en",
      "tokenizer": "cc.redpen.tokenizer.WhiteSpaceTokenizer",
      "validators": {
         "CommaNumber": { "languages": [], "properties": {} },
         "Contraction": { "languages": ["en"], "properties": {} },
         "DoubledWord": { "languages": [], "properties": {} },
         "EndOfSentence": { "languages": ["en"], "properties": {} },
         "InvalidExpression": { "languages": [], "properties": {} },
         "InvalidSymbol": { "languages": [], "properties": {} },
         "InvalidWord": { "languages": ["en"], "properties": {} },
         "ParagraphNumber": { "languages": [], "properties": {} },
         "Quotation": { "languages": ["en"], "properties": {} },
         "SectionLength": { "languages": [], "properties": {"max_char_num": "2000"} },
         "SentenceLength": { "languages": [], "properties": {"max_len": "200"} },
         "SpaceBetweenAlphabeticalWord": { "languages": [], "properties": {} },
         "Spelling": { "languages": [], "properties": {} },
         "StartWithCapitalLetter": { "languages": ["en"], "properties": {} },
         "SuccessiveWord": { "languages": [], "properties": {} },
         "SymbolWithSpace": { "languages": [], "properties": {} },
         "WordNumber": { "languages": [], "properties": {} }
      }
    },
    "ja": {
      "lang": "ja",
      "tokenizer": "cc.redpen.tokenizer.JapaneseTokenizer",
      "validators": {
         "CommaNumber": { "languages": [], "properties": {} },
         "DoubledWord": { "languages": [], "properties": {} },
         "HankakuKana": { "languages": ["ja"], "properties": {} },
         "InvalidSymbol": { "languages": [], "properties": {} },
         "KatakanaEndHyphen": { "languages": ["ja"], "properties": {} },
         "KatakanaSpellCheck": { "languages": ["ja"], "properties": {} },
         "ParagraphNumber": { "languages": [], "properties": {} },
         "SectionLength": { "languages": [], "properties": {"max_num": "1500"} },
         "SentenceLength": { "languages": [], "properties": {"max_len": "100"} },
         "SpaceBetweenAlphabeticalWord": { "languages": [], "properties": {} },
         "SuccessiveWord": { "languages": [], "properties": {} }
      }
    }
  }
}
  • The version property indicates the version of RedPen.

  • The documentParsers array lists all supported document parsers.

  • The redpens object shows the available preconfigured redpens and how they are configured. Within each object:

    • lang specifies the language the redpen is designed for.

    • tokenizer specifies the tokenizer class used by the redpen.

    • validators shows which validators are configured within the redpen. This object is in a format suitable for the document/validate/json request below. For each validator:

      • The languages array lists the languages for which the validator is suitable. An empty array indicates all languages.
      • The properties object specifies the currently configured properties for this validator, as described in Supported Validators.
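
The structure described above can be explored with a short Python sketch. It assumes the response has already been decoded from JSON into a dictionary, and it relies only on the field names shown in the sample response. The function names applicable_validators and build_json_request are hypothetical, as are the default parser and format values.

```python
def applicable_validators(validators, lang):
    """Return the names of validators suitable for lang.

    An empty languages array means the validator suits all languages.
    """
    return [
        name
        for name, settings in validators.items()
        if not settings.get("languages") or lang in settings["languages"]
    ]


def build_json_request(config_response, lang, document,
                       document_parser="PLAIN", fmt="json2"):
    """Reuse a preconfigured redpen as the config of a JSON validation request.

    The validators object is passed through unchanged, since it is
    described as being in a format suitable for document/validate/json.
    """
    redpen = config_response["redpens"][lang]
    return {
        "document": document,
        "format": fmt,
        "documentParser": document_parser,
        "config": {"lang": redpen["lang"], "validators": redpen["validators"]},
    }
```

Serializing the returned dictionary with json.dumps gives a request body in the shape shown in the Request format section of document/validate/json.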

Document Validation

/rest/document/validate

This POST request validates a document and returns the errors.

POST Parameters:

  • document contains the text of the document that RedPen is to validate.

  • documentParser specifies which parser should be used to parse the document. Valid options are:
    • PLAIN
    • MARKDOWN
    • WIKI
  • lang specifies the language used to tokenize the document. Currently, values of ja (Japanese) and en (English/Whitespace) are supported.

  • The optional format field determines the format for the results. It can be one of json (the default), json2, plain, plain2 or xml.

  • The optional config field contains the contents of a RedPen XML configuration file.
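
The parameters above map onto a form-encoded POST body. The following Python sketch builds such a request without sending it. The URL is the one used in the curl examples on this page; the function name build_validate_request and its default values for lang and documentParser are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: same host, port and path as in the curl examples.
VALIDATE_URL = "http://localhost:8080/rest/document/validate/"


def build_validate_request(document, lang="en", document_parser="PLAIN",
                           fmt=None, config_xml=None, url=VALIDATE_URL):
    """Build (but do not send) a POST request for document validation."""
    fields = {
        "document": document,
        "documentParser": document_parser,
        "lang": lang,
    }
    # format and config are optional, so they are sent only when supplied.
    if fmt is not None:
        fields["format"] = fmt
    if config_xml is not None:
        fields["config"] = config_xml
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen sends the request. Because urlencode escapes every field, a config value containing characters such as & or + should arrive intact.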

Examples using curl and document/validate

$ curl --data document="Twas brillig and the slithy toves did gyre and gimble in the wabe" \
     --data lang=en --data format=plain2 \
     --data config="`cat ./redpen-server/target/classes/conf/redpen-conf.xml`" \
     localhost:8080/rest/document/validate/
Line: 1, Offset: 0
    Sentence: Twas brillig and the slithy toves did gyre and gimble in the wabe
        Spelling: Found possibly misspelled word "brillig".
        Spelling: Found possibly misspelled word "slithy".
        Spelling: Found possibly misspelled word "toves".
        Spelling: Found possibly misspelled word "gyre".
        Spelling: Found possibly misspelled word "gimble".
        Spelling: Found possibly misspelled word "wabe".
        DoubledWord: Found repeated word "and".
$ curl -s --data document="古池や,蛙飛び込む水の音" \
          --data config="`cat ./redpen-server/target/classes/conf/redpen-conf-ja.xml`" \
          localhost:8080/rest/document/validate/ | json_reformat
{
    "errors": [
        {
            "sentence": "古池や,蛙飛び込む水の音",
            "endPosition": {
                "offset": 4,
                "lineNum": 1
            },
            "validator": "InvalidSymbol",
            "lineNum": 1,
            "sentenceStartColumnNum": 0,
            "message": "Found invalid symbol \",\".",
            "startPosition": {
                "offset": 3,
                "lineNum": 1
            }
        }
    ]
}
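
A response in the default json format, like the one above, can be condensed into one line per error. This Python sketch uses only the field names visible in that sample; the function name summarize_errors is hypothetical, and other formats such as json2 may be structured differently.

```python
import json


def summarize_errors(response_text):
    """Return one summary line per error in a JSON validation response."""
    result = json.loads(response_text)
    lines = []
    for error in result.get("errors", []):
        start = error["startPosition"]
        end = error["endPosition"]
        lines.append(
            "{}: line {}, offsets {}-{}: {}".format(
                error["validator"],
                start["lineNum"],
                start["offset"],
                end["offset"],
                error["message"],
            )
        )
    return lines
```

A response with an empty errors array yields an empty list, which makes the function convenient as a pass/fail check in a script.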

/rest/document/validate/json

This POST request takes a validation request specified in JSON and returns the resulting errors in one of the supported result formats.

Request format:

{
  "document": "Theyre is a blak rownd borl.",
  "format": "json2",
  "documentParser": "PLAIN",
  "config": {
    "lang": "en",
    "validators": {
      "CommaNumber": {},
      "Contraction": {},
      "DoubledWord": {},
      "EndOfSentence": {},
      "InvalidExpression": {},
      "InvalidSymbol": {},
      "InvalidWord": {},
      "ParagraphNumber": {},
      "Quotation": {},
      "SectionLength": {
        "properties": {
          "max_char_num": "2000"
        }
      },
      "SentenceLength": {
        "properties": {
          "max_len": "200"
        }
      },
      "SpaceBetweenAlphabeticalWord": {},
      "Spelling": {},
      "StartWithCapitalLetter": {},
      "SuccessiveWord": {},
      "SymbolWithSpace": {},
      "