org.vishia.zbnf
Class ZbnfParser

java.lang.Object
  extended by org.vishia.zbnf.ZbnfParser

public class ZbnfParser
extends java.lang.Object

An instance of ZbnfParser holds a syntax prescript and can parse a text against it: it tests the syntax and produces a tree of the information found in the input text.
It is typically invoked as follows:

 // Needs java.text.ParseException. reportConsole, writeError() and evaluateResult()
 // are supplied by the calling application.
 ZbnfParser parser = new ZbnfParser(reportConsole);
 try { parser.setSyntax(syntaxString); }
 catch(ParseException exception)
 { writeError("parser reading syntax error: " + exception.getMessage());
   return;
 }
 if(!parser.parse(inputString))
 { writeError(parser.getSyntaxErrorReport());
 }
 else
 { // step through the result items, starting at the top-level item
   ZbnfParseResultItem resultItem = parser.getFirstParseResult();
   while(resultItem != null)
   { evaluateResult(resultItem);
     resultItem = resultItem.next(null);
   }
 }

The syntax

The syntax given as argument of setSyntax(StringPart) must be defined in the Semantic Backus Naur Form (ZBNF; the Z is a reversed S for Semantic). It is given as a String or StringPart. The method setSyntax reads the string and converts it into internal data. The input string (usually read from a file) may consist of a sequence of variables beginning with $ and of syntax terms. A syntax term is described at the class ZbnfSyntaxPrescript, because that class converts a syntax term into an internal tree of syntax nodes. Below is an example of a syntax file or string with all variables.
 <?ZBNF-www.vishia.org version="1.0" encoding="iso-8859-1" ?>  ##this first line is not prescribed but possible.
 $setLinemode.                                 ##if set, then the newline char \n is not treated as whitespace
 $endlineComment=##.                           ##defines the string introducing a comment to end of line, default is //
 $comment=[*...*].                             ##all chars between [* ... *] are ignored, default is /*...*/
 $keywords=if|else.                            ##these words are not accepted as identifiers when parsing with <$?...>
 $inputEncodingKeyword="encoding".             ##names the keyword by which the input file declares its encoding
 $inputEncoding="UTF-8".                       ##defines the encoding of the input file (usable outside the parser core)
 $xmlns:nskey="value".                         ##defines a namespace key for XML output (usable outside the parser core)
 
 component::=<$?name>=<#?number> { <value> , }. ##The first syntax term is the toplevel syntax.
 value::= val = [ a | b | c].         ##another syntax term
 

White space and comment handling when parsing

Whitespace and/or comments may or may not be skipped while parsing. The following rules apply:

Evaluate the parsers result

Each call of parse() creates a new result buffer. The result buffer contains entries with the parsed information, each associated with the semantic named in the syntax prescript. Evaluation of the result starts with getFirstParseResult(), which returns the top-level item.


Constructor Summary
ZbnfParser(Report report)
          Creates an empty parser instance.
ZbnfParser(Report report, int maxParseResultEntriesOnError)
          Creates an empty parser instance.
 
Method Summary
 java.lang.String getExpectedSyntaxOnError()
          Returns the expected syntax on error position.
 ZbnfParseResultItem getFirstParseResult()
          Returns the first parse result item to start stepping through the results.
 java.lang.String getFoundedInputOnError()
          Returns about 50 chars of the input string found at the parsing error position.
 java.nio.charset.Charset getInputEncoding()
          Returns the setting of $inputEncoding="...".
 java.lang.String getInputEncodingKeyword()
          Returns the setting of $inputEncodingKeyword="...".
 long getInputPositionOnError()
          Returns the position of error in input string.
 java.lang.String getLastFoundedResultOnError()
          Returns the result found so far at the error position.
 java.lang.String getSyntaxErrorReport()
          Assembles a string with a user-readable syntax error message.
 java.util.TreeMap<java.lang.String,java.lang.String> getXmlnsFromSyntaxPrescript()
          Returns a TreeMap of all xmlns keys and strings.
 boolean parse(java.lang.String input)
          Parses a given input and produces a parse result.
 boolean parse(StringPart input)
          Parses a given input and produces a parse result.
 boolean parse(StringPart input, java.util.List<java.lang.String> additionalInfo)
          Parses a given input, see parse(StringPart), but writes additional semantic information into the first parse result (into the top-level component).
 void reportStore(Report report)
          Reports the whole content of the parse result in the Report.fineInfo-level.
 void reportStore(Report report, int reportLevel)
           
 void reportStore(Report report, int reportLevel, java.lang.String sTitle)
          Reports the whole content of the parse result.
 void reportSyntax(Report report, int reportLevel)
          Reports the syntax.
 void setLinemode(boolean bTrue)
          Sets the line mode or not.
 void setReportIdents(int identError, int identInfo, int identComponent, int identFine)
          Sets the ident numbers for reporting the progress of parsing.
 void setSkippingComment(java.lang.String sCommentStringStart, java.lang.String sCommentStringEnd, boolean bStoreComment)
          Sets the mode of skipping comments.
 void setSkippingEndlineComment(java.lang.String sCommentStringStart, boolean bStoreComment)
          Sets the mode of skipping comments to end of line.
 boolean setStoringConstantSyntax(boolean bStore)
          Determines whether or not constant syntax (terminal syntax items or terminal morphs) should also be stored in the result buffer.
 void setSyntax(java.io.File fileSyntax)
           
 void setSyntax(java.lang.String syntax)
          Sets the syntax from the given string.
 void setSyntax(StringPart syntax)
          Sets the syntax from the given string.
 void setWhiteSpaces(java.lang.String sWhiteSpaces)
          Sets the chars which are recognized as white spaces.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ZbnfParser

public ZbnfParser(Report report)
Creates an empty parser instance.

Parameters:
report - A report output

ZbnfParser

public ZbnfParser(Report report,
                  int maxParseResultEntriesOnError)
Creates an empty parser instance.

Parameters:
report - A report output
maxParseResultEntriesOnError - If 0, no parse result is stored. If >0, the last found parse result is stored to support better analysis of syntax errors, but the parser is slower.

Method Detail

setSyntax

public void setSyntax(java.lang.String syntax)
               throws java.text.ParseException
Sets the syntax from the given string. For further explanation, see setSyntax(vishia.StringScan.StringPart).

Parameters:
syntax - The ZBNF-Syntax.
Throws:
java.text.ParseException

setSyntax

public void setSyntax(java.io.File fileSyntax)
               throws java.nio.charset.IllegalCharsetNameException,
                      java.nio.charset.UnsupportedCharsetException,
                      java.io.FileNotFoundException,
                      java.io.IOException,
                      java.text.ParseException
Throws:
java.nio.charset.IllegalCharsetNameException
java.nio.charset.UnsupportedCharsetException
java.io.FileNotFoundException
java.io.IOException
java.text.ParseException

setSyntax

public void setSyntax(StringPart syntax)
               throws java.text.ParseException
Sets the syntax from the given string. The string should contain the syntax in ZBNF format. It is parsed and converted into a tree of objects of class SyntaxPrescript. The class SyntaxPrescript is private inside the parser, but its principle is explained here.
The class SyntaxPrescript contains either a list of elements (listSyntaxElements) or a list of such listSyntaxElements. The list of listSyntaxElements is used if there are alternatives.
The listSyntaxElements contains objects of type String, SyntaxPrescript, Component or Repetition. It is the sequence of syntax elements of one syntax path in ZBNF. An object of type String represents a terminal symbol (constant string). An element of type SyntaxPrescript is an option construction [...|..|..] or a simple option [...]. A Repetition represents the {...?...} construction; it contains one or two objects of type SyntaxPrescript, for the forward syntax and the optional backward syntax. These syntax prescripts may in turn be nested in the same way.
An object of type Component in the listSyntaxElements represents a construction <...?...>. It may contain the semantic information, and it may contain a reference to another SyntaxPrescript if one is required in the form <syntax.... It is also built if a construction of the kind <!regex..., <$..., <#... or the like is given.
During the syntax test the tree of SyntaxPrescript is traversed in search of the matching path; see method parse().

Parameters:
syntax - The syntax in ZBNF-Format.
Throws:
java.text.ParseException - If the ZBNF string contains a syntax error. The message describes the error location in textual form.

setSkippingComment

public void setSkippingComment(java.lang.String sCommentStringStart,
                               java.lang.String sCommentStringEnd,
                               boolean bStoreComment)
Sets the mode of skipping comments. If it is set, comments are always skipped on every parse operation. This mode may or should be combined with setIgnoreWhitespace.

Parameters:
sCommentStringStart - The start chars of the comment string, for example '/*'
sCommentStringEnd - The end chars of the comment string, for example '*/'
bStoreComment - If true, the comment string is stored in the ParserStore and can be evaluated by the user.

setSkippingEndlineComment

public void setSkippingEndlineComment(java.lang.String sCommentStringStart,
                                      boolean bStoreComment)
Sets the mode of skipping comments to end of line. If it is set, comments to end of line are always skipped on every parse operation. This mode may or should be combined with setIgnoreWhitespace.

Parameters:
sCommentStringStart - The start chars of a comment to end of line, for example '//'
bStoreComment - If true, the comment string is stored in the ParserStore and can be evaluated by the user.
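
The skipping behaviour described for setSkippingComment and setSkippingEndlineComment can be pictured with a small stand-alone sketch. It is not taken from the ZbnfParser source: the class CommentSkipSketch and the method skipComments are hypothetical names, and the sketch assumes that an unterminated block comment runs to the end of the input and that an end-of-line comment stops before the newline character.

```java
// Illustration only, not the ZbnfParser implementation.
final class CommentSkipSketch {

    // Returns the index of the first character at or after pos that does not
    // belong to a comment. Either comment style may be disabled by passing null
    // as its start string.
    static int skipComments(String input, int pos,
                            String blockStart, String blockEnd,
                            String endlineStart) {
        boolean skipped = true;
        while (skipped && pos < input.length()) {
            skipped = false;
            if (blockStart != null && input.startsWith(blockStart, pos)) {
                // Assumption: an unterminated block comment runs to end of input.
                int end = input.indexOf(blockEnd, pos + blockStart.length());
                pos = (end < 0) ? input.length() : end + blockEnd.length();
                skipped = true;
            } else if (endlineStart != null && input.startsWith(endlineStart, pos)) {
                // Assumption: the newline itself is left for whitespace handling.
                int eol = input.indexOf('\n', pos);
                pos = (eol < 0) ? input.length() : eol;
                skipped = true;
            }
        }
        return pos;
    }
}
```

In this sketch the newline after an end-of-line comment is deliberately not consumed, so that a line-oriented syntax could still see it.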

setWhiteSpaces

public void setWhiteSpaces(java.lang.String sWhiteSpaces)
Sets the chars which are recognized as white space. The default without calling this method is " \t\r\n\f", that is: space, tab, carriage return, new line, form feed. Calling this method is equivalent to using the syntax prescript variable $Whitespaces.

Parameters:
sWhiteSpaces - The chars that are recognized as white space.
See Also:
setSyntax(String).

setLinemode

public void setLinemode(boolean bTrue)
Switches the line mode on or off. In line mode a newline character is not recognized as whitespace; it must be considered in the syntax prescript as a significant element. Calling this method is equivalent to using the syntax prescript variable $setLinemode.

See Also:
setSyntax(String).
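
How the whitespace set and the line mode interact can be shown with a short stand-alone sketch. It does not reproduce the parser's code: WhitespaceSketch and skipWhitespace are hypothetical names, and the sketch assumes that line mode exempts only '\n' from skipping.

```java
// Illustration only, not the ZbnfParser implementation.
final class WhitespaceSketch {

    // The documented default: space, tab, carriage return, new line, form feed.
    static final String DEFAULT_WHITESPACES = " \t\r\n\f";

    // Returns the index of the first character at or after pos that is not
    // skipped as whitespace. In line mode a newline stops the skipping, so the
    // syntax has to match it explicitly.
    static int skipWhitespace(String input, int pos,
                              String whiteSpaces, boolean lineMode) {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (lineMode && c == '\n') {
                break;                       // newline is significant in line mode
            }
            if (whiteSpaces.indexOf(c) < 0) {
                break;                       // not a whitespace character
            }
            pos++;
        }
        return pos;
    }
}
```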

setReportIdents

public void setReportIdents(int identError,
                            int identInfo,
                            int identComponent,
                            int identFine)
Sets the ident numbers for reporting the progress of parsing. If the idents are >0 and < Report.fineDebug, they are used directly as report level.

Parameters:
identError - ident for error and warning outputs.
identInfo - ident for progress information output.
identComponent - ident for output when a component is being parsed.
identFine - ident for fine parsing outputs.

parse

public boolean parse(java.lang.String input)
Parses a given input and produces a parse result. See parse(StringPart).

Parameters:
input - The source to be parsed.
Returns:
true if the input matches the syntax, otherwise false.

parse

public boolean parse(StringPart input)
Parses a given input and produces a parse result. The method setSyntax(vishia.StringScan.StringPart) should be called before. While parsing, the paths in the tree of SyntaxPrescript are tested. If a matching path is found, the method returns true, otherwise false. The result of parsing is stored inside the parser (private internal class ParserStore). To evaluate the parse result, see getFirstParseResult().

Parameters:
input - The source to be parsed.
Returns:
true if the input matches the syntax, otherwise false.

parse

public boolean parse(StringPart input,
                     java.util.List<java.lang.String> additionalInfo)
Parses a given input, see parse(StringPart), but writes additional semantic information into the first parse result (into the top-level component).

Parameters:
input - The text to parse
additionalInfo - Pairs of semantic idents and the appropriate information content. The elements [0], [2] etc. contain the semantic identifier, whereas the elements [1], [3] etc. contain the information content.
Returns:
true if the input matches the syntax, otherwise false.
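
The alternating layout of additionalInfo is easy to get wrong when built by hand, so the following sketch flattens a map into such a list. AdditionalInfoSketch and toPairList are hypothetical helper names, not part of the ZBNF library; only the layout (identifier at even indices, content at odd indices) comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the ZBNF library.
final class AdditionalInfoSketch {

    // Flattens semantic-ident/content pairs into the list layout described for
    // additionalInfo: [0] = ident, [1] = content, [2] = ident, [3] = content, ...
    // Pass a map with a defined iteration order (e.g. LinkedHashMap) if the
    // order of the pairs matters.
    static List<String> toPairList(Map<String, String> info) {
        List<String> pairs = new ArrayList<>(info.size() * 2);
        for (Map.Entry<String, String> entry : info.entrySet()) {
            pairs.add(entry.getKey());    // even index: semantic identifier
            pairs.add(entry.getValue());  // odd index: information content
        }
        return pairs;
    }
}
```

The resulting list could then be handed to parse(input, additionalInfo) as its second argument.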

reportSyntax

public void reportSyntax(Report report,
                         int reportLevel)
Reports the syntax.


reportStore

public void reportStore(Report report,
                        int reportLevel,
                        java.lang.String sTitle)
Reports the whole content of the parse result. The report is grouped into components. A component is represented by its own syntax prescript, written in the current syntax prescript via <ident...>. A new nested component forces a deeper level.
The output is written in the form:
 parseResult:  <?semanticIdent> Component
 parseResult:   <?semanticIdent> ident="foundedString"
 parseResult:   <?semanticIdent> number=foundedNumber
 parseResult:  </?semanticIdent> Component
 
Every line is exactly one entry in the parser's store.

Parameters:
report - The report output instance
reportLevel - The level of the report. This level is shown in the output. If the currently valid report level of report is less than this parameter, nothing is reported.

reportStore

public void reportStore(Report report,
                        int reportLevel)

reportStore

public void reportStore(Report report)
Reports the whole content of the parse result in the Report.fineInfo-level.

Parameters:
report - The report output instance.
See Also:
reportStore(Report report, int reportLevel).

getInputEncodingKeyword

public java.lang.String getInputEncodingKeyword()
Returns the setting of $inputEncodingKeyword="...". in the syntax prescript, or null if no such entry is given.

Returns:
The keyword, or null if no such entry is given.

getInputEncoding

public java.nio.charset.Charset getInputEncoding()
Returns the setting of $inputEncoding="...". in the syntax prescript, or null if no such entry is given.

Returns:
The charset, or null if no such entry is given.
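
Since the encoding is written in the syntax prescript as a name but returned as a java.nio.charset.Charset, a conversion like the following has to happen somewhere. This is only a sketch under that assumption: InputEncodingSketch and resolve are hypothetical names, and the code is not the parser's own. Charset.forName is the standard Java call and throws the two unchecked charset exceptions that setSyntax(java.io.File) lists.

```java
import java.nio.charset.Charset;

// Illustration only, not the ZbnfParser implementation.
final class InputEncodingSketch {

    // Maps an encoding name such as "UTF-8" to a Charset.
    // Returns null if no name is given, mirroring the "null if no such entry"
    // contract. Charset.forName throws IllegalCharsetNameException for a
    // malformed name and UnsupportedCharsetException for an unknown one.
    static Charset resolve(String encodingName) {
        if (encodingName == null) {
            return null;
        }
        return Charset.forName(encodingName);
    }
}
```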

getExpectedSyntaxOnError

public java.lang.String getExpectedSyntaxOnError()
Returns the expected syntax at the error position. This position corresponds to the report of getFoundedInputOnError(). Because the syntax may allow many different continuations, far more than a single deterministic string, the returned syntax is only one possibility and may be ambiguous. It is only meant as a help to find the error. This is the same problem as with error messages of compilers.

Returns:
A possible expected syntax.

getLastFoundedResultOnError

public java.lang.String getLastFoundedResultOnError()
Returns the result found so far at the error position. This position corresponds to the report of getFoundedInputOnError() and getExpectedSyntaxOnError().

Returns:
The result found so far, or null if this feature is not switched on.

getFoundedInputOnError

public java.lang.String getFoundedInputOnError()
Returns about 50 chars of the input string found at the parsing error position. If the error position is at or near the end of file, this string ends with the chars "<<
Returns:
The part of input on error position.
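
A rough picture of such an excerpt, assuming it is simply a bounded substring starting at the error position: ErrorExcerptSketch and excerptAt are hypothetical names, the length of exactly 50 is an assumption based on "about 50 chars", and the end-of-file marker mentioned above is left out of the sketch.

```java
// Illustration only, not the ZbnfParser implementation.
final class ErrorExcerptSketch {

    // Assumed excerpt length; the documentation only says "about 50 chars".
    static final int EXCERPT_LENGTH = 50;

    // Returns up to EXCERPT_LENGTH characters of input starting at errorPos.
    // Positions outside the input are clamped, so the result may be shorter
    // or empty near the end of the input.
    static String excerptAt(String input, long errorPos) {
        int start = (int) Math.max(0L, Math.min(errorPos, (long) input.length()));
        int end = Math.min(input.length(), start + EXCERPT_LENGTH);
        return input.substring(start, end);
    }
}
```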

getInputPositionOnError

public long getInputPositionOnError()
Returns the position of the error in the input string. It is the same number as shown in the report.


getSyntaxErrorReport

public java.lang.String getSyntaxErrorReport()
Assembles a string with a user-readable syntax error message. This method is useful if the user should be informed about the error and the application should continue under the user's direction.

Returns:
String with syntax error message.

getFirstParseResult

public ZbnfParseResultItem getFirstParseResult()
Returns the first parse result item to start stepping through the results. See samples at interface ZbnfParseResultItem.

Returns:
The first parse result item.

getXmlnsFromSyntaxPrescript

public java.util.TreeMap<java.lang.String,java.lang.String> getXmlnsFromSyntaxPrescript()
Returns a TreeMap of all xmlns keys and strings. This is the result of detecting $xmlns:ns="string". expressions in the syntax prescript.
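
What such a map contains can be illustrated by collecting the $xmlns entries from a syntax string with a regular expression. This is a sketch only: XmlnsSketch and collectXmlns are hypothetical names, the pattern for the namespace key is an assumption, and the real parser reads these entries as part of setSyntax rather than with a regex.

```java
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only, not the ZbnfParser implementation.
final class XmlnsSketch {

    // Matches entries of the form  $xmlns:key="value".
    // The character set allowed for the key is an assumption.
    private static final Pattern XMLNS =
        Pattern.compile("\\$xmlns:([A-Za-z_][\\w.-]*)\\s*=\\s*\"([^\"]*)\"\\s*\\.");

    // Collects all namespace keys and their strings, sorted by key.
    static TreeMap<String, String> collectXmlns(String syntax) {
        TreeMap<String, String> result = new TreeMap<>();
        Matcher m = XMLNS.matcher(syntax);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }
}
```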


setStoringConstantSyntax

public boolean setStoringConstantSyntax(boolean bStore)
Determines whether or not constant syntax (terminal syntax items or terminal morphs) should also be stored in the result buffer.

Parameters:
bStore - true if they should be stored, false if not.
Returns:
The old value of this setting.
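
Returning the old value makes it possible to change the setting temporarily and restore it afterwards. The sketch below shows that pattern on a stand-in class; ParserSettingsStandIn and isStoringConstantSyntax are hypothetical, and the initial value false is an assumption, since the default is not documented here.

```java
// Stand-in for illustration; not the ZbnfParser class.
final class ParserSettingsStandIn {

    // Assumption: constant syntax is not stored by default.
    private boolean storeConstantSyntax = false;

    // Sets the new mode and returns the previous one, as the documented
    // setStoringConstantSyntax(boolean) does.
    boolean setStoringConstantSyntax(boolean bStore) {
        boolean old = storeConstantSyntax;
        storeConstantSyntax = bStore;
        return old;
    }

    // Hypothetical getter, only used to observe the state in this sketch.
    boolean isStoringConstantSyntax() {
        return storeConstantSyntax;
    }

    // Runs an action with constant syntax stored, then restores the old mode.
    void withConstantSyntaxStored(Runnable action) {
        boolean old = setStoringConstantSyntax(true);
        try {
            action.run();
        } finally {
            setStoringConstantSyntax(old);   // restore whatever was set before
        }
    }
}
```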