/*----------------------------------------------------------------------------*/
/*                                                                            */
/* Description: Very simple class to parse XML.                               */
/*                                                                            */
/* Copyright (c) 2006 Rexx Language Association. All rights reserved.         */
/*                                                                            */
/* This program and the accompanying materials are made available under       */
/* the terms of the Common Public License v1.0 which accompanies this         */
/* distribution. A copy is also available at the following address:           */
/* http://www.ibm.com/developerworks/oss/CPLv1.0.htm                          */
/*                                                                            */
/* Redistribution and use in source and binary forms, with or                 */
/* without modification, are permitted provided that the following            */
/* conditions are met:                                                        */
/*                                                                            */
/* Redistributions of source code must retain the above copyright             */
/* notice, this list of conditions and the following disclaimer.              */
/* Redistributions in binary form must reproduce the above copyright          */
/* notice, this list of conditions and the following disclaimer in            */
/* the documentation and/or other materials provided with the distribution.   */
/*                                                                            */
/* Neither the name of Rexx Language Association nor the names                */
/* of its contributors may be used to endorse or promote products             */
/* derived from this software without specific prior written permission.      */
/*                                                                            */
/* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS        */
/* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT          */
/* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS          */
/* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT   */
/* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,      */
/* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED   */
/* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,        */
/* OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY     */
/* OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING    */
/* NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS         */
/* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.               */
/*                                                                            */
/* Author: W. David Ashley                                                    */
/*                                                                            */
/*----------------------------------------------------------------------------*/


/*----------------------------------------------------------------------------*/
/*                                                                            */
/* Notes:                                                                     */
/*                                                                            */
/* The xmlparser class is a very simple parser for XML files. It is not a     */
/* fully conforming XML parser because it has a number of limitations,        */
/* most of which you probably could not care less about.                      */
/*                                                                            */
/* 1. The parser only understands ASCII, which is a valid subset of UTF-8.    */
/*    It processes the document one byte at a time and does not decode        */
/*    any multi-byte encoding.                                                */
/* 2. It does not test that the document is well-formed. It assumes that the  */
/*    document is a well-formed XML document.                                 */
/* 3. The parser does not know how to handle XML processing instructions. It  */
/*    passes those instructions through the passthrough method intact so      */
/*    that the user can try to make sense of them.                            */
/*                                                                            */
/* To use the xmlparser you need to be aware of the following.                */
/*                                                                            */
/* 1. The parser uses a SAX-like interface, but methods of the class are used */
/*    instead of a call-back mechanism. The default methods perform no        */
/*    actions. The user will need to subclass the xmlparser class and         */
/*    override the call-back methods in order to insert their own actions     */
/*    for each XML chunk type.                                                */
/* 2. The call-back methods use the xmlchunk class to pass data to the        */
/*    methods. This is a very simple class and is used as a container for     */
/*    specific types of XML chunks.                                           */
/* 3. Text chunks (character data) are passed to the text method with the     */
/*    &gt;, &lt; and &amp; entities translated but otherwise intact. If the   */
/*    text spans multiple lines, the text method is invoked once per line;    */
/*    blank lines are skipped. White space is not collapsed.                  */
/* 4. XML tags are collapsed. This means that if a tag crosses a line         */
/*    boundary then its lines are joined with a single blank. This is         */
/*    important for processing instruction tags, comment tags and other       */
/*    special tags.                                                           */
/*                                                                            */
/*----------------------------------------------------------------------------*/
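
/*----------------------------------------------------------------------------*/
/*                                                                            */
/* Example (a minimal sketch, not part of the parser itself):                 */
/*                                                                            */
/* The subclass name tagprinter and the file names xmlparser.cls and          */
/* sample.xml are assumptions made for illustration only. The sketch          */
/* subclasses xmlparser and overrides the call-back methods it needs.         */
/*                                                                            */
/*    parser = .tagprinter~new                                                */
/*    msg = parser~parse_file('sample.xml')                                   */
/*    if msg <> '' then say 'Parse failed:' msg                               */
/*    exit                                                                    */
/*                                                                            */
/*    ::requires 'xmlparser.cls'                                              */
/*                                                                            */
/*    ::class tagprinter subclass xmlparser                                   */
/*                                                                            */
/*    ::method start_element                                                  */
/*    use arg chunk                                                           */
/*    say 'start:' chunk~tag                                                  */
/*    if chunk~attr \== .nil then do name over chunk~attr                     */
/*       say '  ' name '=' chunk~attr[name]                                   */
/*       end                                                                  */
/*                                                                            */
/*    ::method end_element                                                    */
/*    use arg chunk                                                           */
/*    say 'end:  ' chunk~tag                                                  */
/*                                                                            */
/*    ::method text                                                           */
/*    use arg chunk                                                           */
/*    say 'text: ' chunk~text~strip                                           */
/*                                                                            */
/*    ::method error                                                          */
/*    use arg xmlerror                                                        */
/*    say 'error:' xmlerror~text                                              */
/*                                                                            */
/*----------------------------------------------------------------------------*/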


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLPARSER                                                           */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::class xmlparser subclass object public


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLPARSER                                                           */
/*        Private methods                                                     */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::method parserver   attribute private -- the version of this parser
::method src         attribute private -- the array of xml lines to be parsed
::method lineidx     attribute private -- the index into the array of xml lines
::method charidx     attribute private -- the index into an xml line
::method errortxt    attribute private -- error text
::method eof         attribute private -- done parsing?


/*----------------------------------------------------------------------------*/
/* Method: create_error                                                       */
/* Description: creates an xmlerror instance.                                 */
/*----------------------------------------------------------------------------*/

::method create_error private
use arg msg, errline, errpos
xmlerror = .xmlerror~new
xmlerror~text = msg
xmlerror~filename = self~filename
xmlerror~line = errline
xmlerror~charpos = errpos
return xmlerror


/*----------------------------------------------------------------------------*/
/* Method: xlatetext                                                          */
/* Description: translate the &gt;, &lt; and &amp; entities to characters.    */
/*----------------------------------------------------------------------------*/

::method xlatetext private
use arg text
text = text~changestr('&gt;', '>')
text = text~changestr('&lt;', '<')
text = text~changestr('&amp;', '&') -- always do this one last!
return text


/*----------------------------------------------------------------------------*/
/* Method: currentchar                                                        */
/* Description: return the current character.                                 */
/*----------------------------------------------------------------------------*/

::method currentchar private
expose src lineidx charidx
if lineidx > src~items then return .nil
return src[lineidx]~substr(charidx, 1)


/*----------------------------------------------------------------------------*/
/* Method: getchar                                                            */
/* Description: get a single character from the xml document.                 */
/*----------------------------------------------------------------------------*/

::method getchar private
expose src lineidx charidx eof
character = src[lineidx]~substr(charidx, 1)
charidx = charidx + 1
if charidx > src[lineidx]~length then do
   lineidx = lineidx + 1
   charidx = 1
   end
if lineidx > src~items then do
   eof = .true
   return ''
   end
return character


/*----------------------------------------------------------------------------*/
/* Method: getchunk                                                           */
/* Description: returns a chunk of the xml document.                          */
/*----------------------------------------------------------------------------*/

::method getchunk private
expose src lineidx charidx errortxt eof
errlineidx = lineidx
errcharidx = charidx
chunk = .xmlchunk~new
if self~currentchar() <> '<' then do
   /* we found some CDATA */
   chunk~text = ''
   curline = lineidx
   do while eof = .false & self~currentchar() <> '<'
      -- Do NOT collapse the white space and newlines out of the chunk!
      -- We leave that task up to the client of this class.
      -- Instead, we return each line of text individually.
      chunk~text = chunk~text || self~getchar()
      if curline <> lineidx then leave
      end
   if eof = .true & chunk~text~strip <> '' then do
      errortxt = 'Error line' errlineidx 'column 'errcharidx': EOF within CDATA.'
      self~error(self~create_error(errortxt, errlineidx, errcharidx)) -- call the public override method
      return .nil
      end
   chunk~text = self~xlatetext(chunk~text)
   if chunk~text~strip <> '' then ,
    self~text(chunk) -- call the public override method
   return chunk
   end
/* we found an XML tag, process it */
character = self~getchar() -- skip the '<'
element = ''
curline = lineidx
nestlevel = 0
do while eof = .false
   if element~substr(1, 1) = '!' then do
      -- It is possible for tags to be contained within other tags in XML
      -- processing tags. The next two IF statements take care of that nesting
      -- possibility. It will be up to the user to parse out the contained
      -- tags.
      if self~currentchar() = '<' then nestlevel = nestlevel + 1
      if self~currentchar() = '>' & nestlevel > 0 then nestlevel = nestlevel - 1
      end
   element = element || self~getchar()
   if curline <> lineidx then do
      element = element || ' '
      curline = lineidx
      end
   if self~currentchar() = '>' & nestlevel = 0 then leave
   end
if eof = .true then do
   errortxt = 'Error line' errlineidx': EOF within an XML tag.'
   self~error(self~create_error(errortxt, errlineidx, errcharidx)) -- call the public override method
   return .nil
   end
element = element~strip()
select
   when element~substr(1, 1) = '/' then do
      chunk~tag = element~substr(2)
      self~end_element(chunk) -- call the public override method
      end
   when pos('?', element~substr(1, 1)) > 0 then do
      chunk~tag = ''
      chunk~text = element
      self~passthrough(chunk) -- call the public override method
      end
   when pos('!--', element~substr(1, 3)) > 0 then do
      chunk~tag = ''
      chunk~text = element
      self~passthrough(chunk) -- call the public override method
      end
   when pos('!', element~substr(1, 1)) > 0 then do
      chunk~tag = ''
      chunk~text = element
      self~passthrough(chunk) -- call the public override method
      end
   when pos(element~substr(1, 1), xrange('a', 'z') || xrange('A', 'Z')) > 0 then do
      parse var element tag element
      chunk~tag = tag
      /* process the attributes */
      if element~length > 0 then chunk~attr = .directory~new
      do while element~length() > 0
         if pos('=', element~word(1)) > 0 then do
            parse var element attrname '="' attrvalue '"' element
            attrname = attrname~strip()
            attrvalue = attrvalue~strip()
            attrvalue = self~xlatetext(attrvalue)
            chunk~attr[attrname] = attrvalue
            end
         else do
            parse var element attrname element
            if attrname <> '/' then do
               -- do not allow attributes without values!
               errortxt = 'Error line' errlineidx 'column' errcharidx || ,
                          ': Invalid tag attribute' attrname'.'
               self~error(self~create_error(errortxt, errlineidx, errcharidx)) -- call the public override method
               /* stop parsing */
               eof = .true
               return .nil
               end
            end
         end
      self~start_element(chunk) -- call the public override method
      if attrname = '/' then do
         endchunk = .xmlchunk~new
         endchunk~tag = tag
         self~end_element(endchunk) -- call the public override method
         end
      end
   otherwise do
      errortxt = 'Error line' errlineidx 'column 'errcharidx': Invalid tag name.'
      self~error(self~create_error(errortxt, errlineidx, errcharidx)) -- call the public override method
      /* stop parsing */
      eof = .true
      return .nil
      end
   end
character = self~getchar() -- skip the '>'
return chunk


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLPARSER                                                           */
/*        Public methods                                                      */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::method filename attribute private -- the XML file name, if known


/*----------------------------------------------------------------------------*/
/* Method: init                                                               */
/* Description: instance initialization                                       */
/*----------------------------------------------------------------------------*/

::method init
expose src
if arg() > 0 then raise syntax 93.902 array (0)
self~parserver = '0.3'
self~filename = ''
return


/*----------------------------------------------------------------------------*/
/* Method: start_element                                                      */
/* Description: called when a start element tag has been encountered.         */
/* Arguments:   an xmlchunk instance.                                         */
/*----------------------------------------------------------------------------*/

::method start_element
/* this method is designed to be overridden by a subclass */
use arg chunk
return


/*----------------------------------------------------------------------------*/
/* Method: end_element                                                        */
/* Description: called when an end element tag has been encountered.          */
/* Arguments:   an xmlchunk instance.                                         */
/*----------------------------------------------------------------------------*/

::method end_element
/* this method is designed to be overridden by a subclass */
use arg chunk
return


/*----------------------------------------------------------------------------*/
/* Method: text                                                               */
/* Description: called when character data has been encountered.              */
/* Arguments:   an xmlchunk instance.                                         */
/*----------------------------------------------------------------------------*/

::method text
/* this method is designed to be overridden by a subclass */
use arg chunk
return


/*----------------------------------------------------------------------------*/
/* Method: passthrough                                                        */
/* Description: called when a comment, a processing instruction or another    */
/*              special (<!...>) tag has been encountered.                    */
/* Arguments:   an xmlchunk instance.                                         */
/*----------------------------------------------------------------------------*/

::method passthrough
/* this method is designed to be overridden by a subclass */
use arg chunk
return


/*----------------------------------------------------------------------------*/
/* Method: error                                                              */
/* Description: called on an error.                                           */
/* Arguments:   an xmlerror instance.                                         */
/*----------------------------------------------------------------------------*/

::method error
/* this method is designed to be overridden by a subclass */
use arg xmlerror
return


/*----------------------------------------------------------------------------*/
/* Method: getversion                                                         */
/* Description: return the version of this class.                             */
/*----------------------------------------------------------------------------*/

::method getversion
return self~parserver


/*----------------------------------------------------------------------------*/
/* Method: parse_array                                                        */
/* Description: parse the specified array of XML code.                        */
/*----------------------------------------------------------------------------*/

::method parse_array
expose src lineidx charidx errortxt eof
if arg() < 1 then raise syntax 93.901 array (1)
if arg() > 1 then raise syntax 93.902 array (1)
use arg src
eof = .false
/* make sure this is an xml document */
if src[1]~pos('<?xml') <> 1 then do
   errortxt = 'Error: Invalid XML document.'
   self~error(self~create_error(errortxt, 1, 1))
   return errortxt
   end
/* parse the xml array */
lineidx = 1
charidx = 1
errortxt = ''
do while eof = .false
   chunk = self~getchunk()
   end
return errortxt


/*----------------------------------------------------------------------------*/
/* Method: parse_file                                                         */
/* Description: parse the specified file of XML code.                         */
/*----------------------------------------------------------------------------*/

::method parse_file
expose errortxt filename
if arg() < 1 then raise syntax 93.901 array (1)
if arg() > 1 then raise syntax 93.902 array (1)
use arg xmlfile
filename = xmlfile -- remember the file name so create_error can report it
tfile = .stream~new(xmlfile)
errortxt = tfile~open('read')
if errortxt <> 'READY:' then do
   tfile~close()
   return errortxt
   end
lines = tfile~arrayin()
tfile~close()

errortxt = self~parse_array(lines)
return errortxt


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLCHUNK                                                            */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::class xmlchunk subclass object public


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLCHUNK                                                            */
/*        Private methods                                                     */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLCHUNK                                                            */
/*        Public methods                                                      */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::method text        attribute         -- the text
::method tag         attribute         -- the xml tag name
::method attr        attribute         -- the tag attributes


/*----------------------------------------------------------------------------*/
/* Method: init                                                               */
/* Description: instance initialization                                       */
/*----------------------------------------------------------------------------*/

::method init
self~text = .nil  -- For the passthrough method this contains the entire text
                  -- string enclosed within the '<' and '>' brackets. For the
                  -- text method it contains a single line of character data.
                  -- For the other call-back methods it stays .nil.
self~tag = .nil   -- For the start_element and end_element methods this is
                  -- the XML element (tag) name. For the end_element method the
                  -- leading '/' character is not a part of this string.
self~attr = .nil  -- For the start_element method this is an ooRexx directory
                  -- instance holding each attribute name and value. It stays
                  -- .nil when nothing follows the tag name.
return


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLERROR                                                            */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::class xmlerror subclass object public


/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLERROR                                                            */
/*        Private methods                                                     */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/
/* Class: XMLERROR                                                            */
/*        Public methods                                                      */
/*----------------------------------------------------------------------------*/
/*----------------------------------------------------------------------------*/

::method text        attribute         -- the error message text, if any
::method filename    attribute         -- the xml file name, if known
::method line        attribute         -- the error line number
::method charpos     attribute         -- the error character position


/*----------------------------------------------------------------------------*/
/* Method: init                                                               */
/* Description: instance initialization                                       */
/*----------------------------------------------------------------------------*/

::method init
self~text = ''
self~filename = ''
self~line = 0
self~charpos = 0
return

