www.destructor.de


XML Scanner Reference Documentation

TXmlScanner is a VCL/CLX component wrapper for TXmlParser. Simply drop the non-visual TXmlScanner component onto your form or data module and use it.

Methods: LoadFromFile, LoadFromBuffer, SetBuffer, Execute

Properties: Filename, Normalize

Events: OnAttList, OnCData, OnComment, OnContent, OnDtdError, OnDtdRead, OnElement, OnEmptyTag, OnEndTag, OnEntity, OnLoadExternal, OnNotation, OnPI, OnStartTag, OnTranslateEncoding, OnXmlProlog

How to use it


TEasyXmlScanner

The TEasyXmlScanner component is easier to use than TXmlScanner: it omits the events that standard applications rarely need, so the Object Inspector shows only the events you are most likely to use. Everything else is the same.


LoadFromFile

PROCEDURE LoadFromFile (Filename : STRING); 

Loads the given file into the internal buffer of TXmlScanner. If the file does not exist, the internal buffers are cleared and a subsequent call to Execute will do nothing.

Parameters

Filename The name of the XML file to read in

See also

LoadFromBuffer, SetBuffer, Filename property


LoadFromBuffer

PROCEDURE LoadFromBuffer (Buffer : PChar); 

Loads the given buffer into the internal buffer of TXmlScanner. Buffer must be a null terminated string containing the XML document.

Parameters

Buffer Pointer to the buffer to read in

See also

LoadFromFile, SetBuffer


SetBuffer

PROCEDURE SetBuffer (Buffer : PChar);

Makes the given buffer the data buffer where the parser gets the data from. This buffer must not be deallocated as long as Execute is working on it.

The Buffer must be null terminated.

The parser will not modify the contents of the buffer.

Parameters

Buffer Pointer to the buffer to use
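
Example

The difference between LoadFromBuffer and SetBuffer can be sketched as follows. This is only an illustration under assumptions: XmlScanner1 is a TXmlScanner dropped on the form, the method names ScanCopy and ScanInPlace are hypothetical, and PChar is assumed to match the string type of your Delphi version.

```pascal
// Hypothetical form methods; XmlScanner1 is a TXmlScanner on the form.

procedure TForm1.ScanCopy (const Xml : string);
begin
  // LoadFromBuffer loads the text into the scanner's internal buffer.
  XmlScanner1.LoadFromBuffer (PChar (Xml));
  XmlScanner1.Execute;
end;

procedure TForm1.ScanInPlace (const Xml : string);
begin
  // SetBuffer makes the scanner read directly from the given memory:
  // Xml must stay allocated and unchanged until Execute has returned.
  // PChar of a Delphi string is null terminated, as SetBuffer requires.
  XmlScanner1.SetBuffer (PChar (Xml));
  XmlScanner1.Execute;
end;
```

SetBuffer avoids loading a second copy of a large document, at the price of having to keep the buffer alive yourself.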

See also

LoadFromFile, LoadFromBuffer


Execute

PROCEDURE Execute;

This is where the scanning takes place. Execute runs through the XML document and triggers one of the Events for every XML part it finds in the document.

When Execute returns, the whole document has been scanned.

Event handlers are called synchronously (no threading, one after the other), so when Execute returns you can be sure that all of your event handler methods have finished running.

Parameters

None
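
Example

A minimal sketch of a complete scan, written as a console program that prints the element names as an indented outline. Several things here are assumptions and not taken from this page: the unit names LibXmlParser and LibXmlComps, the file name books.xml, the helper class TOutline, and creating the component at run time with Create (nil) instead of dropping it on a form. The handler signatures follow the event types documented below.

```pascal
program ScanOutline;
{$APPTYPE CONSOLE}

uses
  SysUtils,
  LibXmlParser,   // assumed unit name (TAttrList)
  LibXmlComps;    // assumed unit name (TXmlScanner)

type
  // Hypothetical helper class that owns the event handlers
  TOutline = class
  private
    FDepth : Integer;
    procedure Print (const Text : string);
  public
    procedure StartTag (Sender : TObject; TagName : string; Attributes : TAttrList);
    procedure EmptyTag (Sender : TObject; TagName : string; Attributes : TAttrList);
    procedure EndTag   (Sender : TObject; TagName : string);
  end;

procedure TOutline.Print (const Text : string);
begin
  WriteLn (StringOfChar (' ', 2 * FDepth) + Text);
end;

procedure TOutline.StartTag (Sender : TObject; TagName : string; Attributes : TAttrList);
begin
  Print (TagName);
  Inc (FDepth);         // children are printed one level deeper
end;

procedure TOutline.EmptyTag (Sender : TObject; TagName : string; Attributes : TAttrList);
begin
  Print (TagName);      // <name/> is reported once, so the depth stays the same
end;

procedure TOutline.EndTag (Sender : TObject; TagName : string);
begin
  Dec (FDepth);
end;

var
  Scanner : TXmlScanner;
  Outline : TOutline;
begin
  Outline := TOutline.Create;
  Scanner := TXmlScanner.Create (nil);
  try
    Scanner.OnStartTag := Outline.StartTag;
    Scanner.OnEmptyTag := Outline.EmptyTag;
    Scanner.OnEndTag   := Outline.EndTag;
    Scanner.LoadFromFile ('books.xml');   // hypothetical file name
    Scanner.Execute;                      // all handlers have run when this returns
  finally
    Scanner.Free;
    Outline.Free;
  end;
end.
```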

See also

LoadFromFile, LoadFromBuffer, SetBuffer, Normalize Property


Filename

PROPERTY Filename : STRING READ GetFilename WRITE LoadFromFile;

Gets/Sets the filename of the scanner. If you set the filename, the file is read into TXmlScanner's internal buffer.

See also

LoadFromFile


Normalize

PROPERTY Normalize : BOOLEAN READ GetNormalize WRITE SetNormalize;

If you set Normalize to TRUE, the following normalization on element text content will take place:

Whitespace, according to the XML 1.0 specification, is one of the following characters: space (#x20), tab (#x9), carriage return (#xD), line feed (#xA).
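
Example

The XML 1.0 whitespace set (space, tab, carriage return, line feed) can be tested with a small helper like the one below. This is a stand-alone sketch for your own code; the function name IsXmlWhitespace is hypothetical and this is not the routine the parser uses internally.

```pascal
program XmlWhitespaceDemo;
{$APPTYPE CONSOLE}

// Returns True if C is one of the four XML 1.0 whitespace characters.
function IsXmlWhitespace (C : Char) : Boolean;
begin
  case C of
    #$09, #$0A, #$0D, #$20 : Result := True;    // tab, LF, CR, space
    else                     Result := False;
  end;
end;

begin
  WriteLn (IsXmlWhitespace (' '));
  WriteLn (IsXmlWhitespace (#9));
  WriteLn (IsXmlWhitespace ('x'));
end.
```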

See also

Execute


OnAttList

TElementEvent = PROCEDURE (Sender : TObject; ElemDef : TElemDef) OF OBJECT;
PROPERTY OnAttList : TElementEvent READ FOnAttList WRITE FOnAttList;

OnAttList is fired when the parser has found an <!ATTLIST> definition in the DTD. It gets passed the complete element definition as the ElemDef parameter which is a name/value pair list (TNvpList) containing the Attribute definitions (TAttrDef).

Parameters

Sender The TXmlScanner instance which triggers the event
ElemDef Definition of the element the attributes belong to, complete with its attribute definitions.

Example

// Assumed local declarations: VAR I : INTEGER; AD : TAttrDef; S : STRING;
FOR I := 0 TO ElemDef.Count-1 DO BEGIN
    AD := TAttrDef (ElemDef [I]);
    // Notation and enumeration attributes carry their own type definition text
    IF AD.AttrType IN [atNotation, atEnumeration]
      THEN S := AD.Name + ': ' + AD.TypeDef + ' Default=' + AnsiQuotedStr (AD.Value, '''')
      ELSE S := AD.Name + ': ' + CAttrType_Name [AD.AttrType] + ' Default=' + AnsiQuotedStr (AD.Value, '''');
    Memo1.Lines.Add (S);
    END;

OnCData

TContentEvent = PROCEDURE (Sender : TObject; Content : STRING) OF OBJECT;
PROPERTY OnCData : TContentEvent READ FOnCData WRITE FOnCData;

Is called whenever the Execute method has parsed a CDATA section.

Parameters

Sender The TXmlScanner instance which triggers the event
Content The text content of the CDATA section.

See also

Execute, OnStartTag, OnContent


OnComment

TCommentEvent = PROCEDURE (Sender : TObject; Comment : STRING) OF OBJECT;
PROPERTY OnComment : TCommentEvent READ FOnComment WRITE FOnComment;

Is called whenever the Execute method has parsed a comment.

Parameters

Sender The TXmlScanner instance which triggers the event
Comment The comment

See also

Execute


OnContent

TContentEvent = PROCEDURE (Sender : TObject; Content : STRING) OF OBJECT;
PROPERTY OnContent : TContentEvent READ FOnContent WRITE FOnContent;

Is called whenever the Execute method has parsed element text content.

Parameters

Sender The TXmlScanner instance which triggers the event
Content The text content. Entities have already been resolved.
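
Example

A common pattern is to collect the text of one particular element by combining OnStartTag, OnContent and OnEndTag. The sketch below assumes a form with a Memo1 and two hypothetical fields, FInTitle and FTitle; the element name 'title' and the handler names are made up for illustration.

```pascal
// Hypothetical form fields:  FInTitle : Boolean;  FTitle : string;

procedure TForm1.XmlScanner1StartTag (Sender : TObject; TagName : string; Attributes : TAttrList);
begin
  if TagName = 'title' then begin    // 'title' is a made-up element name
    FInTitle := True;
    FTitle   := '';
  end;
end;

procedure TForm1.XmlScanner1Content (Sender : TObject; Content : string);
begin
  // Append rather than assign, in case the text arrives in several pieces
  // (for example when it is interrupted by a comment or a child element).
  if FInTitle then
    FTitle := FTitle + Content;
end;

procedure TForm1.XmlScanner1EndTag (Sender : TObject; TagName : string);
begin
  if TagName = 'title' then begin
    FInTitle := False;
    Memo1.Lines.Add ('Title: ' + FTitle);
  end;
end;
```

Because OnCData uses the same TContentEvent type, the same handler method can be assigned to OnCData if CDATA sections should be collected as well.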

See also

Execute, OnStartTag, OnCData


OnDtdError

TErrorEvent = PROCEDURE (Sender : TObject; ErrorPos : PChar) OF OBJECT;
PROPERTY OnDtdError : TErrorEvent READ FOnDtdError WRITE FOnDtdError;

Gets called whenever there is an error found in the DTD.

Parameters

Sender The TXmlScanner instance which triggers the event
ErrorPos Points to the position of the error in the DTD
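
Example

Since ErrorPos is a pointer into the text, a handler can show the first few characters at the error position. This is a sketch under the assumption that ErrorPos points into a null-terminated buffer, as required for the buffers passed to the scanner; Memo1, the handler name and the limit of 40 characters are arbitrary choices.

```pascal
procedure TForm1.XmlScanner1DtdError (Sender : TObject; ErrorPos : PChar);
const
  MaxShown = 40;                            // arbitrary number of characters to display
var
  Len     : Integer;
  Snippet : string;
begin
  if ErrorPos = nil then Exit;
  Len := Integer (StrLen (ErrorPos));       // length of the rest of the buffer
  if Len > MaxShown then Len := MaxShown;
  SetString (Snippet, ErrorPos, Len);       // copy at most MaxShown characters
  Memo1.Lines.Add ('DTD error near: ' + Snippet);
end;
```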

OnDtdRead

TDtdEvent = PROCEDURE (Sender : TObject; RootElementName : STRING) OF OBJECT;
PROPERTY OnDtdRead : TDtdEvent READ FOnDtdRead WRITE FOnDtdRead;

Is called whenever the Execute method has finished parsing the document type declaration.

Parameters

Sender The TXmlScanner instance which triggers the event
RootElementName The name of the DTD root element

See also

Execute


OnElement

TElementEvent = PROCEDURE (Sender : TObject; ElemDef : TElemDef) OF OBJECT;
PROPERTY OnElement : TElementEvent READ FOnElement WRITE FOnElement;

OnElement is fired when the parser has found an <!ELEMENT> definition in the DTD. It gets passed the complete element definition as the ElemDef parameter. The Attribute list is empty when OnElement gets fired.

Parameters

Sender The TXmlScanner instance which triggers the event
ElemDef Element definitions of the element

Example

Memo1.Lines.Add ('OnElement: '+ElemDef.Name+': '+ElemDef.Definition);

OnEmptyTag

TStartTagEvent = PROCEDURE (Sender : TObject; TagName : STRING; Attributes : TAttrList) OF OBJECT;
PROPERTY OnEmptyTag : TStartTagEvent READ FOnEmptyTag WRITE FOnEmptyTag;

Is called whenever the Execute method has read in an Empty-Element Tag of the form <name/>.

When an empty element is written as a tag pair of the form <name></name>, the events OnStartTag and OnEndTag are triggered instead.

You can access the attributes by name or by scanning through the list of attributes (Value and Name are STRING variables, i is of type INTEGER):

Value := Attributes.Value ('name');         // Access by name
for i := 0 to Attributes.Count-1 do begin   // Scan through attributes
  Name  := Attributes.Name (i);
  Value := Attributes.Value (i);
  end;

Note that all names in XML are case-sensitive. This includes attribute names. The list of attributes is always sorted by name by the parser.

Parameters

Sender The TXmlScanner instance which triggers the event
TagName The name of the Empty-Element Tag. Case sensitive.
Attributes List of attributes

See also

Execute, OnStartTag, OnEndTag


OnEndTag

TEndTagEvent = PROCEDURE (Sender : TObject; TagName : STRING) OF OBJECT;
PROPERTY OnEndTag : TEndTagEvent READ FOnEndTag WRITE FOnEndTag;

Is called whenever the Execute method has parsed an End Tag.

Parameters

Sender The TXmlScanner instance which triggers the event
TagName The name of the End Tag. Case sensitive.

See also

Execute, OnStartTag


OnEntity

TEntityEvent = PROCEDURE (Sender : TObject; EntityDef : TEntityDef) OF OBJECT;
PROPERTY OnEntity : TEntityEvent READ FOnEntity WRITE FOnEntity;

This event is fired when the parser has found an <!ENTITY> definition in the DTD.

Parameters

Sender The TXmlScanner instance which triggers the event
EntityDef A TEntityDef instance defining the contents of the entity

OnLoadExternal

TExternalEvent = PROCEDURE (Sender : TObject; 
                            SystemId, PublicId, NotationId : STRING;
                            VAR Result : TXmlParser) OF OBJECT;
PROPERTY OnLoadExternal : TExternalEvent READ FOnLoadExternal WRITE FOnLoadExternal;

When your XML document contains references to External Entities or External DTDs, the external entity must be loaded into memory. When the parser finds an External Entity reference or an external DTD reference, it fires the OnLoadExternal event.

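The event handler must create a new TXmlParser instance, load the external entity into it, and return the instance in the Result parameter.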

Parameters

Sender The TXmlScanner instance which triggers the event
SystemId The SYSTEM identifier of the external entity. This is always given and is the URI of the file to load
PublicId The PUBLIC identifier of the external entity. This is optional.
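NotationId The NOTATION identifier of the external entity. Only applicable for General Unparsed External Entities.
Result The TXmlParser instance, created by the event handler, that holds the loaded external entity.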

Example

The DTD declares an external entity:

<!ENTITY blah SYSTEM "myentity.xml">

In this case, "myentity.xml" is passed as the SYSTEM identifier. This is what the event handler looks like:

Result := TXmlParser.Create;
Result.LoadFromFile (SystemId);

Please note that TXmlParser is not able to do HTTP or FTP transfers. In case you need this, you must implement it yourself (e.g. by using Indy or other components that come with Delphi). When you have downloaded from HTTP or FTP, you can use TXmlParser.LoadFromBuffer or TXmlParser.SetBuffer to load the XML into the parser instance.
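
When the data does not come straight from a file, the handler can fetch it into a string first and pass it on with LoadFromBuffer. The sketch below is an assumption-laden illustration: FetchExternal is a hypothetical helper (here it merely reads a local file through a TStringList from the Classes unit; your own HTTP or FTP download code would go in its place), and the handler name is made up.

```pascal
// Hypothetical helper: returns the text behind SystemId. Replace the body
// with your own download code; this placeholder just reads a local file.
function FetchExternal (const SystemId : string) : string;
var
  Lines : TStringList;                      // requires the Classes unit
begin
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile (SystemId);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;

procedure TForm1.XmlScanner1LoadExternal (Sender : TObject;
  SystemId, PublicId, NotationId : string; var Result : TXmlParser);
var
  Data : string;
begin
  Data   := FetchExternal (SystemId);
  Result := TXmlParser.Create;
  // LoadFromBuffer is used instead of SetBuffer because Data is a local
  // variable that is released when this handler returns.
  Result.LoadFromBuffer (PChar (Data));
end;
```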


OnNotation

TNotationEvent = PROCEDURE (Sender : TObject; NotationDef : TNotationDef) OF OBJECT;
PROPERTY OnNotation : TNotationEvent READ FOnNotation WRITE FOnNotation;

OnNotation gets called when the parser has found a <!NOTATION> definition in the DTD.

Parameters

Sender The TXmlScanner instance which triggers the event
NotationDef The Definition of the Notation (Name, Value=System ID, Public ID)

OnPI

TPIEvent = PROCEDURE (Sender : TObject; Target, Content: STRING; Attributes : TAttrList) OF OBJECT;
PROPERTY OnPI : TPIEvent READ FOnPI WRITE FOnPI;

Is called whenever the Execute method has parsed a Processing Instruction (PI).

A PI can have any format after the target. It is common practice to specify "pseudo" attributes like the ones in start tags. For this reason, you are passed a list of attributes here, which is of course only valid and usable if the PI contains pseudo attributes.

Parameters

Sender The TXmlScanner instance which triggers the event
Target The PI target name
Content The content of the PI, not including the target name and the final ?>
Attributes Pseudo attributes
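
Example

A handler can use the pseudo attributes for targets that are known to carry them and fall back to the raw content otherwise. The sketch below uses the xml-stylesheet PI, whose href pseudo attribute names the stylesheet; Memo1 and the handler name are assumptions.

```pascal
procedure TForm1.XmlScanner1PI (Sender : TObject; Target, Content : string; Attributes : TAttrList);
begin
  if Target = 'xml-stylesheet' then
    // e.g. <?xml-stylesheet type="text/xsl" href="style.xsl"?>
    Memo1.Lines.Add ('Stylesheet: ' + Attributes.Value ('href'))
  else
    // Free-form PI: ignore Attributes and use the raw content instead
    Memo1.Lines.Add ('PI ' + Target + ': ' + Content);
end;
```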

See also

Execute


OnStartTag

TStartTagEvent = PROCEDURE (Sender : TObject; TagName : STRING; Attributes : TAttrList) OF OBJECT;
PROPERTY OnStartTag : TStartTagEvent READ FOnStartTag WRITE FOnStartTag;

Is called whenever the Execute method has read in a Start Tag.

You can access the attributes by name or by scanning through the list of attributes (Value and Name are STRING variables, i is of type INTEGER):

Value := Attributes.Value ('name');         // Access by name
for i := 0 to Attributes.Count-1 do begin   // Scan through attributes
  Name  := Attributes.Name (i);
  Value := Attributes.Value (i);
  end;

Note that all names in XML are case-sensitive. This includes attribute names. The list of attributes is always sorted by name by the parser.

Parameters

Sender The TXmlScanner instance which triggers the event
TagName The name of the Start Tag. Case sensitive.
Attributes List of attributes

See also

Execute, OnEndTag


OnTranslateEncoding

TEncodingEvent = FUNCTION  (Sender : TObject; CurrentEncoding, Source : STRING) : STRING OF OBJECT;
PROPERTY OnTranslateEncoding : TEncodingEvent READ FOnTranslateEncoding WRITE FOnTranslateEncoding;

The XML Specification states that every XML parser must be able to handle UTF-8 and UTF-16 documents. Besides these, parsers should be able to handle other encodings. The encoding for a document is defined in the XML Prolog (for entire XML Documents) or in a Text Declaration at the beginning of Parsed External Entities or External DTD subsets.

So there is a source encoding (the encoding of the document and its external parts) and a destination encoding (the encoding your application wishes to process). For every content string which is passed to your application (Text Content between Tags, CDATA sections, Attribute values) the OnTranslateEncoding event is fired. Your handler determines the current source encoding from the CurrentEncoding argument and returns the passed Source string translated into the desired destination encoding.

The TranslateEncoding method that is built into TXmlParser assumes that the destination encoding is the Windows ANSI encoding used in Windows apps. It can handle UTF-8 and ISO-8859-1 as source encodings. Note: It is assumed here that ISO-8859-1 and "Windows ANSI" are the same, which is not exactly true for some characters.

UTF-8 is correctly translated into the single-byte ANSI Windows-1252 format.

At the time of this writing, TXmlParser and TXmlScanner are not able to handle multi-byte character strings. This is likely to change in the future.

Parameters

Sender The TXmlScanner instance which triggers the event
CurrentEncoding The name of the encoding of the current part of XML, for example UTF-8, ISO-8859-1 or WINDOWS-1252. Encoding names are not case-sensitive in XML; the parser always passes the name in uppercase.
Source The source string which must be translated

Example

Let's assume that the CurrentEncoding is UTF-8 and you want to translate it to the Windows ANSI 1252 character set:

IF CurrentEncoding = 'UTF-8'
  THEN Result := Utf8ToAnsi (Source)   // Utf8ToAnsi provided by LibXmlParser
  ELSE Result := Source;               // Pass all other encodings through unchanged

OnXmlProlog

TXmlPrologEvent = PROCEDURE (Sender : TObject; XmlVersion, Encoding: STRING; Standalone : BOOLEAN) OF OBJECT;
PROPERTY OnXmlProlog : TXmlPrologEvent READ FOnXmlProlog WRITE FOnXmlProlog;

Is called when the Execute method has parsed an XML Declaration or a Text Declaration.

Parameters

Sender The TXmlScanner instance which triggers the event
XmlVersion The XML version number specified in the prolog. The current version of XML is 1.0
Encoding The character encoding specified in the prolog. There are a lot of possible values here. Usual values are: UTF-8, ISO-8859-1. You can assume UTF-8 if there is no encoding specified by the prolog.
Standalone True if there is a standalone='yes' in the prolog. Please refer to the XML spec for an exact definition of the standalone declaration.
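
Example

A handler might record the declared encoding and log the prolog. In this sketch FDocEncoding is a hypothetical form field, Memo1 and the handler name are assumptions, and it is also an assumption that Encoding arrives as an empty string when the prolog does not declare one.

```pascal
// Hypothetical form field:  FDocEncoding : string;

procedure TForm1.XmlScanner1XmlProlog (Sender : TObject; XmlVersion, Encoding : string; Standalone : Boolean);
begin
  if Encoding = ''                        // assumption: empty when not declared
    then FDocEncoding := 'UTF-8'          // default suggested above
    else FDocEncoding := UpperCase (Encoding);
  Memo1.Lines.Add (Format ('XML %s, encoding %s, standalone: %s',
    [XmlVersion, FDocEncoding, BoolToStr (Standalone, True)]));
end;
```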

See also

Execute