## What is XML?

XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding
documents in a format that is both human-readable and machine-readable. It is designed to store
and transport data, and it is a widely used standard for data interchange between different
applications and platforms.

XML files have a hierarchical structure, where data is organized into elements, attributes,
and text nodes. Elements are the primary building blocks of XML, and they can be nested within each
other to create a tree-like structure. Attributes provide additional information about elements and
are defined within the opening tag of an element. Text nodes store the actual content of the elements.

Similar to XML, JSON (JavaScript Object Notation) is another popular format for data interchange,
which also uses a hierarchical structure to organize data. Both XML and JSON are designed to be
human-readable and machine-readable, making them well-suited for exchanging data between different
programming languages and platforms.

However, there are some differences between XML and JSON. While XML relies on tags and attributes to
define and structure data, JSON uses key-value pairs grouped into objects (denoted by curly braces) and
arrays (denoted by square brackets). In general, JSON is considered to be more compact and less
verbose than XML, which can result in faster parsing and reduced data size for transmission.

Here is a comparison of XML and JSON representations for the same data:

XML:
```xml
<bookstore>
  <book id="1">
    <title>Book Title 1</title>
    <author>Author 1</author>
    <price>19.99</price>
  </book>
  <book id="2">
    <title>Book Title 2</title>
    <author>Author 2</author>
    <price>24.99</price>
  </book>
</bookstore>
```
JSON:

```json
{
  "bookstore": {
    "book": [
      {
        "id": "1",
        "title": "Book Title 1",
        "author": "Author 1",
        "price": 19.99
      },
      {
        "id": "2",
        "title": "Book Title 2",
        "author": "Author 2",
        "price": 24.99
      }
    ]
  }
}
```
## XML by Example

We are going to explore the fundamentals of XML by creating an example book store application.
This application will manage a list of books, including their titles, authors, publication years,
prices, and other relevant details. Through this example, we will demonstrate how XML is used to
organize and structure data, making it easier to store, exchange, and process information.

Our example book store will maintain an XML file, `books.xml`, which contains the following book data:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
```
__Source:__ https://www.w3schools.com/xml/books.xml

In this XML file, each book is represented as a separate element node, with the books nested within
the parent "bookstore" element. Element nodes can contain attributes, such as "category" or "lang",
and they can wrap either text or child element nodes.

For instance, the last book in our example has a "category" attribute with the value "web" and a
"cover" attribute with the value "paperback". The third book, "XQuery Kick Start", lists several
authors, which demonstrates that an element node can contain multiple child nodes with the same name.
#### Nodes

In XML, the DOM (Document Object Model) represents everything within an XML file as a node.
A node is a fundamental component of an XML document, and there are various types of nodes,
including element nodes, attribute nodes, and text nodes. Understanding these different node
types is crucial for effectively working with XML data.

* **Element nodes** : These are the primary building blocks of an XML document, representing
the XML tags and their content. In our example, `<book>` is an element node. Element nodes can
be nested within one another, creating a tree-like structure that organizes the data.
* **Attribute nodes** : These nodes provide additional information about an element node and are
defined within the opening tag of the element. For instance, the `category` attribute in the `<book>`
element is an attribute node. Attribute nodes are associated with a specific element and cannot exist
on their own.
* **Text nodes** : These nodes store the actual content within element nodes. In our example, the
text "Everyday Italian" within the `<title>` element is a text node.

For simplicity, we will often refer to them as elements, attributes, and text.
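
As a minimal sketch of how the three node types surface in code, the following builds a single `<title>` element in memory and reads back its name, its attribute, and its text. It only uses #TxmlDoc and #TxmlNode methods that are introduced later in this guide; the variable names are illustrative.

```blitzmax
SuperStrict
Import text.xml

' Build <book><title lang="en">Everyday Italian</title></book> in memory
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local bookNode:TxmlNode = TxmlNode.newNode("book")
doc.setRootElement(bookNode)

Local titleNode:TxmlNode = bookNode.addChild("title", "Everyday Italian")
titleNode.setAttribute("lang", "en")

' Element node: identified by its tag name
Local elementName:String = titleNode.getName()
' Attribute node: looked up by name on the element it belongs to
Local langValue:String = titleNode.getAttribute("lang")
' Text node: the content wrapped by the element
Local textValue:String = titleNode.getContent()

Print "element:   " + elementName
Print "attribute: " + langValue
Print "text:      " + textValue

doc.Free()
```
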
#### Analyzing an XML file

Examining an XML file allows us to extract various valuable information, such as:

* **File encoding, XML version, and root node name** : These metadata details are essential for processing
and interpreting the XML file. In our example, the file encoding is UTF-8, the XML version is 1.0,
and the root node name is "bookstore".
* **Attributes of nodes** : Attributes provide extra information about elements, such as
categorization, language, or other properties. In our book store example, the "category" and
"lang" attributes are used to classify books and specify the language of the titles, respectively.
* **Content of nodes** : The content within nodes can be either text or other nodes, representing
data or further organization of the XML structure. In our example, the `<title>` element contains
a text node (the book title), while the `<book>` element contains child element nodes (e.g., the `<author>` element).
* **Number of nodes, attributes in a node, and children of a node** : Counting and understanding the
relationships between nodes, their attributes, and their children is vital for navigating and manipulating
XML data effectively. For instance, knowing the number of books in the bookstore or the number of
authors for a particular book can be useful for various operations and analyses.
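
To illustrate the counting mentioned in the last point, here is a small sketch that counts the books in a bookstore and the authors of a book. `CountChildren()` is a hypothetical helper written for this example, not a function of the XML module, and the document is built in memory so that the sketch does not depend on a file.

```blitzmax
SuperStrict
Import text.xml

' Hypothetical helper: counts the direct children of a node with the given name
Function CountChildren:Int(node:TxmlNode, name:String)
    Local count:Int = 0
    For Local child:TxmlNode = EachIn node.getChildren()
        If child.getName() = name Then count :+ 1
    Next
    Return count
End Function

' Build a reduced version of the bookstore with a single book
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local root:TxmlNode = TxmlNode.newNode("bookstore")
doc.setRootElement(root)

Local bookNode:TxmlNode = root.addChild("book")
bookNode.setAttribute("category", "web")
bookNode.addChild("title", "XQuery Kick Start")
bookNode.addChild("author", "James McGovern")
bookNode.addChild("author", "Per Bothner")

Local rootName:String = doc.getRootElement().getName()
Local bookCount:Int = CountChildren(root, "book")
Local authorCount:Int = CountChildren(bookNode, "author")

Print "Root node: " + rootName
Print "Books:     " + bookCount
Print "Authors:   " + authorCount

doc.Free()
```

Counting by name rather than taking the size of the child list keeps the result correct even if a node contains children of other kinds.
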
### Reading XML Files

To read data from an existing XML file, you first need to create a #TxmlDoc instance by parsing
the `books.xml` file.

```blitzmax
Local xmlDoc:TxmlDoc = TxmlDoc.parseFile("books.xml")
```
However, we also want to store the book data as objects in BlitzMax,
so we will create a custom #Type for it first. A book title can be given in different languages,
so we need to be able to identify each title by the `lang`
attribute of its `<title>` element. We can achieve this with a key-value mapping using a #TStringMap.
Authors are not identified by a key, so a simple #String array is enough to store them.

```blitzmax
Type TBook
    Field category:String
    Field cover:String
    ' maps a language code (e.g. "en") to the title in that language
    Field title:TStringMap = New TStringMap
    Field authors:String[]
    Field year:Int
    Field price:Float
End Type
```

_(Note: Using floating-point numbers for financial values should generally be avoided,
as it can lead to rounding errors, which is undesirable in financial contexts. For the sake
of simplicity, we will use `Float` in this example.)_
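
For readers who want to avoid `Float` for prices, one common alternative is to store the amount as an integer number of cents. The following is a sketch under that assumption; `PriceToCents()` is a hypothetical helper that is not used elsewhere in this guide, and it only handles non-negative values with "." as the decimal separator.

```blitzmax
SuperStrict

' Hypothetical helper: converts a decimal price string such as "29.99" to cents.
' Assumes a non-negative value that uses "." as the decimal separator.
Function PriceToCents:Int(priceText:String)
    priceText = priceText.Trim()
    Local dotPos:Int = priceText.Find(".")
    ' no decimal separator: the whole string is the amount in full units
    If dotPos = -1 Then Return priceText.ToInt() * 100

    Local whole:String = priceText[..dotPos]
    ' pad the fraction with zeros, then keep exactly two digits
    Local fraction:String = priceText[dotPos + 1..] + "00"
    fraction = fraction[..2]
    Return whole.ToInt() * 100 + fraction.ToInt()
End Function

Print PriceToCents("29.99")
Print PriceToCents("30")
```

With this approach the `price` field would be declared as `Int`, and the `<price>` content would be passed through the helper instead of being cast with `Float()`.
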

Now that we have parsed the XML file and created the `TBook` type to store the information,
we are ready to read the data from the XML and store it in our custom type.

To extract information from the nodes in the parsed XML file, we can start by retrieving
the root node using the `getRootElement()` method provided by the #TxmlDoc instance.

```blitzmax
Local rootNode:TxmlNode = xmlDoc.getRootElement()
```

#TxmlNode objects have various methods for setting and retrieving values. While reading our
XML file, the following methods might prove useful:

* **Finding elements and attributes**: `getAttribute()`, `getAttributeList()`, `hasAttribute()`, `getChildren()`,
`nextSibling()`, `previousSibling()`, and `findElement()` (if you know what you're looking for).
* **Retrieving content**: `getContent()`.
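
As a brief sketch of how two of these methods can work together, the example below uses `hasAttribute()` to check whether an optional attribute exists before reading it with `getAttribute()`. The `AttributeOr()` helper and its fallback parameter are assumptions made for this illustration and are not part of the XML module.

```blitzmax
SuperStrict
Import text.xml

' Hypothetical helper: returns the attribute value, or a fallback if it is absent
Function AttributeOr:String(node:TxmlNode, name:String, fallback:String)
    If node.hasAttribute(name) Then Return node.getAttribute(name)
    Return fallback
End Function

' Build <bookstore><book category="web" /></bookstore> in memory
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local root:TxmlNode = TxmlNode.newNode("bookstore")
doc.setRootElement(root)
Local bookNode:TxmlNode = root.addChild("book")
bookNode.setAttribute("category", "web")

' "category" is set, "cover" is not
Local category:String = AttributeOr(bookNode, "category", "unknown")
Local cover:String = AttributeOr(bookNode, "cover", "unknown")

Print "category: " + category
Print "cover:    " + cover

doc.Free()
```
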

Now that we have the root node - which is the `<bookstore>` element - we can access its
children, which in this case are the `<book>` elements. To iterate over all the books,
we can use the root node's `getChildren()` method, which returns a #TList. We can then
iterate over this list using a `For EachIn` loop.

```blitzmax
For Local childNode:TxmlNode = EachIn rootNode.getChildren()
    ' read data of interest from childNode
Next
```
Let's take a closer look at a typical book entry in our XML file:

```xml
<book category="web">
  <title lang="en">XQuery Kick Start</title>
  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>
```

To extract information from this book entry, we will use the following approaches:

* **Category attribute** : Use `getAttribute()` to retrieve the category value.
* **Title element and `lang` attribute** : Use `getAttribute()` to obtain the language key of the
title, and `getContent()` to get the title text.
* **Author, year, and price elements** : These elements only contain text, so we can use `getContent()`
to read their values.

For each book, we will create a new `TBook` instance and add it to a #TList that stores all our books.
In your own projects, you might want to add conditional checks before adding a book to the list.
For example, you could skip adding a book if certain required data is missing from the XML file
(e.g., both "title" and "author" are absent). For the purpose of this example, we will skip checking
for duplicates or incomplete entries.
```blitzmax
Local allBooks:TList = New TList

For Local childNode:TxmlNode = EachIn rootNode.getChildren()
    ' create a new book instance
    Local book:TBook = New TBook

    book.category = childNode.getAttribute( "category" )
    ' this stays empty if the attribute is not set
    book.cover = childNode.getAttribute( "cover" )

    ' loop over all child nodes and handle them according to
    ' their name
    For Local subNode:TxmlNode = EachIn childNode.getChildren()
        Select subNode.getName()
            Case "title"
                Local key:String = subNode.getAttribute( "lang" ).ToLower()
                Local value:String = subNode.getContent()
                book.title.Insert(key, value)
            Case "author"
                ' add the "single entry array" to the authors array
                book.authors :+ [ subNode.getContent() ]
            Case "year"
                ' cast the string content value to Int
                book.year = Int( subNode.getContent() )
            Case "price"
                ' cast the string content value to Float
                book.price = Float( subNode.getContent() )
        End Select
    Next

    ' add the book to our book storage list
    allBooks.AddLast( book )
Next

' free the TxmlDoc instance once it is no longer needed
xmlDoc.Free()
```
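
The conditional check mentioned before the code, skipping a book when both "title" and "author" are absent, could look roughly like the following sketch. `HasRequiredData()` is a hypothetical helper, and the reduced `TBook` type only carries the two fields needed for the check.

```blitzmax
SuperStrict
Import brl.map

' Reduced version of the TBook type, limited to the fields the check needs
Type TBook
    Field title:TStringMap = New TStringMap
    Field authors:String[]
End Type

' Hypothetical check: a book is kept if it has at least one title or one author
Function HasRequiredData:Int(book:TBook)
    If Not book.title.IsEmpty() Then Return True
    Return book.authors.Length > 0
End Function

Local incomplete:TBook = New TBook

Local complete:TBook = New TBook
complete.title.Insert("en", "Learning XML")

Print HasRequiredData(incomplete)
Print HasRequiredData(complete)
```

In the reading loop, the call to `allBooks.AddLast( book )` would then be wrapped in `If HasRequiredData(book) ... EndIf`.
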

Now that we have successfully extracted all the book information from the XML file and stored
it in our application, it is time to take the next step. Let's explore how to save our book data back
to a new XML file.
### Saving Changes to XML Files

Changes made to the objects in memory are not written back to the XML file automatically, so they
need to be saved explicitly. In this section, we will explore the process of saving objects to XML
files, otherwise known as "serialization". This involves converting the desired data into a text
format that is compatible with the structure of an XML file.

In our book example, we only have to handle numbers and text data, which simplifies the serialization process.

To save our data, we will need a #TxmlDoc object as the foundation and a root node to add our book nodes to.
When creating a new #TxmlDoc, it is good practice to use the `newDoc()` helper function, which allows you
to specify the XML version to use. This ensures that your output file is consistent with the original
`books.xml` file format.

For demonstration purposes, we will also print the content of the document using the #TxmlDoc's `saveFile()`
method (passing "-" as the file name skips saving and writes the content to the standard output instead):

```blitzmax
Local exportXmlDoc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local exportRootNode:TxmlNode = TxmlNode.newNode("bookstore")

' Set the newly created node as the root for the empty document
exportXmlDoc.setRootElement(exportRootNode)

' Print the content in a formatted manner
exportXmlDoc.saveFile("-", True, True)
```

This code will output the following:

```xml
<?xml version="1.0" encoding="utf-8"?>
<bookstore />
```
As you can see, we've successfully created an XML document that contains only the empty root element.

In order to save each book, we need to create a new #TxmlNode for each book, along with additional
nodes or attributes to represent the properties. The BlitzMax XML module simplifies this process
by allowing you to pass values directly when creating a new child node using `addChild(name, value)`.

```blitzmax
' `book` is one of the TBook instances stored in our allBooks list

' Create and attach a new book node under the root node
Local bookNode:TxmlNode = exportRootNode.addChild("book")
bookNode.setAttribute("category", book.category)
' Only write the optional cover attribute if it is set
If book.cover
    bookNode.setAttribute("cover", book.cover)
EndIf

' Iterate over title keys and store the corresponding values
' Using method chaining, we can save storing the node first
For Local lang:String = EachIn book.title.Keys()
    Local title:String = String(book.title.ValueForKey(lang))
    bookNode.addChild("title", title).addAttribute("lang", lang)
Next

For Local author:String = EachIn book.authors
    bookNode.addChild("author", author)
Next

bookNode.addChild("year", book.year)
bookNode.addChild("price", book.price)
```
Now, we can combine all of the code blocks we've discussed so far to accomplish the following tasks:

1. Load the `books.xml` file.
2. Save all the books as a new file named `books.new.xml`.

By combining these code blocks, you can create a complete solution to read, manipulate, and save
XML data using BlitzMax.
```blitzmax
SuperStrict

' xml.mod is not automatically imported like brl.mod and pub.mod, so you need to manually import it
Import text.xml

Type TBook
    Field category:String
    Field cover:String
    Field title:TStringMap = New TStringMap
    Field authors:String[]
    Field year:Int
    Field price:Float
End Type

Local allBooks:TList = New TList

' === LOADING ===
' Load and parse the books.xml file
Local xmlDoc:TxmlDoc = TxmlDoc.parseFile("books.xml")
' Retrieve the root element (bookstore)
Local rootNode:TxmlNode = xmlDoc.getRootElement()

' Loop through all child nodes (books)
For Local childNode:TxmlNode = EachIn rootNode.getChildren()
    ' Create a new TBook instance
    Local book:TBook = New TBook

    ' Get the book attributes
    book.category = childNode.getAttribute("category")
    book.cover = childNode.getAttribute("cover") ' This remains empty if the attribute is not set

    ' Loop through all child nodes of the current book and handle them according to their name
    For Local subNode:TxmlNode = EachIn childNode.getChildren()
        Select subNode.getName()
            Case "title"
                ' Store the title in the TStringMap using the language as the key
                Local key:String = subNode.getAttribute("lang").ToLower()
                Local value:String = subNode.getContent()
                book.title.Insert(key, value)
            Case "author"
                ' Add the author to the authors array
                book.authors :+ [subNode.getContent()]
            Case "year"
                ' Convert the year from a string to an integer
                book.year = Int(subNode.getContent())
            Case "price"
                ' Convert the price from a string to a float
                book.price = Float(subNode.getContent())
        End Select
    Next

    ' Add the TBook instance to the allBooks list
    allBooks.AddLast(book)
Next

' Release resources associated with xmlDoc
xmlDoc.Free()

' === SAVING ===
' Create a new TxmlDoc with version 1.0
Local exportXmlDoc:TxmlDoc = TxmlDoc.newDoc("1.0")
' Create a new root node (bookstore) for the export document
Local exportRootNode:TxmlNode = TxmlNode.newNode("bookstore")
' Set the new root node as the document's root element
exportXmlDoc.setRootElement(exportRootNode)

' Loop through all books in the allBooks list
For Local book:TBook = EachIn allBooks
    ' Create a new book node under the root node
    Local bookNode:TxmlNode = exportRootNode.addChild("book")

    ' Set the book's attributes
    bookNode.setAttribute("category", book.category)
    If book.cover
        bookNode.setAttribute("cover", book.cover)
    EndIf

    ' Add title nodes with their corresponding language attributes
    For Local lang:String = EachIn book.title.Keys()
        Local title:String = String(book.title.ValueForKey(lang))
        bookNode.addChild("title", title).addAttribute("lang", lang)
    Next

    ' Add author nodes
    For Local author:String = EachIn book.authors
        bookNode.addChild("author", author)
    Next

    ' Add year and price nodes
    bookNode.addChild("year", book.year)
    bookNode.addChild("price", book.price)
Next

' Save the XML file as books.new.xml with formatting
exportXmlDoc.saveFile("books.new.xml", True, True)

' Release resources associated with exportXmlDoc
exportXmlDoc.Free()

' Now, the new XML file has been created with the processed book data from the original XML file.
' You can check the output in the "books.new.xml" file.
```
This is the content of the generated `books.new.xml` file:

```xml
<?xml version="1.0" encoding="utf-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.0000000</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.9899998</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.9900017</price>
  </book>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.9500008</price>
  </book>
</bookstore>
```
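
The `<price>` values in this output show the `Float` rounding artifacts noted earlier (for example `29.9899998` instead of `29.99`). One possible way to avoid them when saving, sketched here with a hypothetical `FormatPrice()` helper that is not part of the XML module, is to round the value to whole cents and build the string manually. The sketch assumes non-negative prices.

```blitzmax
SuperStrict

' Hypothetical helper: formats a non-negative price with exactly two decimals
Function FormatPrice:String(price:Float)
    ' round to the nearest whole cent
    Local cents:Int = Int(price * 100 + 0.5)
    ' the part after the decimal point, padded to two digits
    Local fraction:String = String(cents Mod 100)
    If fraction.Length < 2 Then fraction = "0" + fraction
    Return String(cents / 100) + "." + fraction
End Function

Print FormatPrice(29.99)
Print FormatPrice(30)
```

In the saving loop, the price node would then be written with `bookNode.addChild("price", FormatPrice(book.price))`.
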
### Summary

Throughout this short guide, we have demonstrated how to create a bookstore application using XML
files to store and manipulate book data. Here's a recap of the key steps we covered:

1. **Understanding XML Structure** : We examined the hierarchical structure of XML files, which consists
of elements, attributes, and text nodes. We also touched upon the similarities and differences between XML and JSON.
2. **Creating a Bookstore Example** : We introduced an XML file representing a bookstore that contains
information about various books, including their categories, titles, authors, publication years, and prices.
3. **XML Nodes** : We explored the concept of nodes in XML and their various types, including element
nodes, attribute nodes, and text nodes.
4. **Reading XML Files** : We demonstrated how to read an existing XML file using the #TxmlDoc class
and parse the book data into our custom `TBook` type.
5. **Storing Book Data** : We created a custom `TBook` type to store book data within our application and
added the book instances to a #TList.
6. **Accessing XML Elements** : We retrieved specific elements and their attributes from the XML file
using various #TxmlNode methods.
7. **Saving XML Files** : We discussed how to serialize our `TBook` objects and save them into a new
XML file, creating a new #TxmlNode for each book and setting its attributes and child nodes accordingly.

By following these steps, we successfully built a bookstore application that reads, processes, and saves book
data using XML files.