- * XML Classes
- ** Abstract
- The XML library is used by several areas of Mono, such as ADO.NET and XML
- Digital Signature (xmldsig). Here I describe System.Xml.dll and
- related tools. This page does not cover classes that live in other
- assemblies, such as XmlDataDocument.
- Note that the current corlib has its own XML parser class, Mono.Xml.MiniParser.
- System.Xml.dll is feature-complete, or nearly so, so
- I write this page mainly to list bugs and hints for improvement.
- ** System.Xml namespace
- *** Document Object Model (Core)
- The DOM is already implemented, but some features are still missing:
- <ul>
- * ID constraint support is problematic because the W3C DOM does not specify
- how ID attributes on non-adapted elements are handled. (MS.NET also
- looks incomplete in this area.)
- * I think the event feature is not fully tested. There is no concrete
- description of which events are raised, so we have to run some
- experiments on MS.NET.
- </ul>
- *** Xml Writer
- Here XmlWriter is almost equivalent to XmlTextWriter. If you want to see
- another implementation, check XmlNodeWriter.cs, which is used in monodoc.
- XmlTextWriter is complete. However, it appears to be nearly twice as slow as
- the MS.NET implementation (I tested against 1.1).
- *** XmlResolver
- Currently XmlTextReader uses the specified XmlResolver. If none is supplied,
- it uses XmlUrlResolver. XmlResolver is used to load external DTDs,
- imported XSL stylesheets, schemas and so on.
- However, XmlUrlResolver is still buggy (mainly because System.Uri is also
- incomplete), and this results in several loading errors.
- XmlSecureResolver, which was introduced in MS .NET Framework 1.1, is basically
- implemented, but it requires the CAS (code access security) feature. We need to
- fix up this class once the ongoing CAS effort is working.
- *** XmlNameTable
- XmlNameTable itself is implemented. However, several classes that should
- use it do not actually do so yet. The table only pays off when both names
- being compared are in it: when that is known to be the case, the names
- should simply be compared using ReferenceEquals(). (Currently, when the
- names are different, the comparison is still inefficient.)
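To illustrate why reference comparison pays off, here is a minimal Python sketch of the interning idea behind a name table. It is not Mono's XmlNameTable code: the class `NameTable` and the helper `same_name` are hypothetical names used only for illustration, and the sketch assumes that both names come from the same table.

```python
class NameTable:
    """Toy stand-in for a name table: equal strings map to one shared object."""

    def __init__(self):
        self._names = {}

    def add(self, name):
        # Return the instance already stored for an equal string,
        # or store this instance and return it.
        return self._names.setdefault(name, name)

    def get(self, name):
        # Return the stored instance, or None if the name was never added.
        return self._names.get(name)


def same_name(a, b):
    # Identity check only. This is valid ONLY when both a and b were
    # obtained from the same NameTable (assumption of this sketch).
    return a is b


table = NameTable()
first = table.add("".join(["it", "em"]))   # built at run time
second = table.add("".join(["it", "em"]))  # equal text, separate object
matches = same_name(first, second)          # no character-by-character scan
```

Python's `is` plays the role of ReferenceEquals() here: once every name has gone through `add()`, both matching and non-matching names are decided by a single identity check.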
- *** Xml Stream Reader
- For an ASCII document, it does not matter which encoding we use.
- However, XmlTextReader must honor the encoding specified in the XML
- declaration. So we have an internal XmlStreamReader class (and currently
- an XmlInputStream class, which may disappear since XmlStreamReader is enough to
- handle this problem).
- However, these classes have some problems when reading from a network
- stream (especially on Linux). This should be fixed soon.
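As a rough illustration of why the reader has to inspect the raw bytes before it can decode them, the following Python sketch sniffs the encoding from a byte-order mark or from the XML declaration. It is not the algorithm used by Mono's XmlStreamReader. The function names `sniff_xml_encoding` and `decode_xml` are hypothetical, and the sketch assumes an ASCII-compatible encoding when there is no BOM, ignores UTF-32, and falls back to UTF-8.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the input, e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
_DECL_ENCODING = re.compile(
    rb'''^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def sniff_xml_encoding(head: bytes) -> str:
    """Guess a codec name from the first bytes of an XML document."""
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec also strips the BOM when decoding
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to pick the byte order
    match = _DECL_ENCODING.match(head)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"          # default when nothing is declared


def decode_xml(data: bytes) -> str:
    """Decode a whole document using the sniffed encoding."""
    return data.decode(sniff_xml_encoding(data[:256]))
```

A pure ASCII document decodes identically under the fallback and under most declared encodings, which is why the problem only shows up with non-ASCII content.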
- *** XML Reader
- XmlTextReader, XmlNodeReader and XmlValidatingReader are almost finished.
- - Most of the OASIS conformance tests pass, as they do with Microsoft's
- implementation, but the W3C test results are not perfect.
- - I won't add any XDR support to XmlValidatingReader. (I have never
- seen XDR used anywhere other than Microsoft's BizTalk Server 2000,
- and now they have the 2003 version with XML Schema support.)
- XmlTextReader and XmlValidatingReader should be faster than they are now. Currently
- XmlTextReader appears to be nearly twice as slow as MS.NET's, and XmlValidatingReader
- (which uses this slow XmlTextReader) appears to be nearly three times slower. (Note
- that XmlValidatingReader is not slow in itself. It uses a schema validating
- reader and a DTD validating reader.)
- **** Some Advantages
- The design of Mono's XmlValidatingReader is radically different from
- that of Microsoft's implementation. Under MS.NET, the DTD content validation
- engine is in fact a simple replacement for the XML Schema validation engine.
- Mono's DTD validation is designed to be fully separate and validates
- as a normal XML parser does. For example, Mono allows non-deterministic DTDs
- (a content model such as ((b, c) | (b, d)) is non-deterministic).
- Another advantage of this XmlValidatingReader is support for *any* XmlReader.
- Microsoft supports only XmlTextReader.
- I added an extra support interface named "IHasXmlParserContext", which is
- taken into account in XmlValidatingReader.ResolveEntity(). Microsoft failed to
- design XmlReader to support pluggable use (i.e. wrapping
- another XmlReader), since XmlParserContext is required to support both
- entity resolution and the namespace manager. (In .NET 1.2, Microsoft also
- added an interface similar to IHasXmlParserContext, named IXmlNamespaceResolver,
- but it still does not provide any DTD information.)
- We also have a RELAX NG validating reader. See mcs/class/Commons.Xml.Relaxng.
- ** System.Xml.Schema
- *** Schema Object Model
- Basically it is implemented. Some features still need fixing:
- - Complete facet support. Currently some facets are missing. David Sheldon
- has recently been making several fixes to them.
- - Complete derivation by restriction (DBR) support. In particular,
- substitution groups won't work with it. (However, I would not recommend
- using either substitution groups or DBR, regardless of this incompleteness.)
- Some bugs remain, but when I ran the W3C XML Schema test suite
- with bug fixes applied (to the test suite), only 69 out of 7581 cases failed. With my test
- suite fixes, MS.NET failed 48 cases.
- *** Validating Reader
- XML Schema validation is (currently) implemented in
- Mono.Xml.Schema.XsdValidatingReader, which is used internally by
- XmlValidatingReader.
-
- Basically this is implemented and its features are almost complete,
- but I have only tested the validation feature. So we have to write more
- tests for properties, methods, and events (validation errors).
- ** System.Xml.Serialization
- Lluis rules ;-)
- Well, in fact XmlSerializer is almost finished and is in the bug-fix phase.
- However, more tests are required, especially for the schema import and export
- features. Please try xsd.exe to create classes from a schema, or a schema
- from classes. If you find any problems, please file them in Bugzilla.
- ** System.Xml.XPath and System.Xml.Xsl
- There are two implementations of XSLT. One (the historical one)
- is based on libxslt. We now use a fully implemented managed XSLT engine.
- Putting aside bug fixes, we have to support:
- - embedded scripts (such as VB, C#, JScript). Without this, some packages like
- the latest NAnt (for MS.NET) won't compile.
- It would be nice if we could support <a href="http://www.exslt.org/">EXSLT</a>.
- <a href="http://msdn.microsoft.com/WebServices/default.aspx?pull=/library/en-us/dnexxml/html/xml05192003.asp">Microsoft has already done it</a>, but it
- is not good code since it depends on internal concrete derivatives of the
- XPathNodeIterator class. In general, .NET's "extension objects" are not
- usable for returning node-sets, so if we support EXSLT, it has to be done
- internally inside our System.Xml.dll. Volunteers are welcome.
- Our managed XSLT implementation is still inefficient. XslTransform.Load()
- and .Transform() appear to be three times slower than MS.NET's (however, they depend on
- XmlTextReader, which is also slow, so we are starting optimization from
- that class, not XSLT itself). These numbers are only for specific cases,
- and there might be more critical points in the XSLT engine (mainly
- XPathNodeIterator).
- ** Miscellaneous Class Libraries
- *** RELAX NG
- I implemented an experimental RelaxngValidatingReader. It is far from
- complete, especially the simplification stuff (see RELAX NG spec chapter 4),
- some constraints (in chapter 7), and datatype handling.
- I am planning improvements (starting with renaming classes, giving
- friendlier error messages, supporting the compact syntax and even object mapping),
- but for now this is just my wishlist.
- ** Tools
- *** xsd.exe
- xsd.exe is used to:
- 1) generate class source code from a schema
- 2) generate DataSet class source code from a schema
- 3) generate schema documents from an assembly (classes)
- 4) infer schema documents from an XML instance
- 5) convert XDR into XSD
- As described above, I won't work on 5), the XDR stuff.
- The current xsd.exe supports 1) and 3).
- As for 2) and 4), there is currently no work on them. (This inference
- feature is DataSet-specific rather than general-purpose.)
- Microsoft has another inference class, from XmlReader to XmlSchemaCollection.
- It may be useful, but implementing it won't be so easy.
- Any volunteers?