XML - WinMerge Development Wiki

XML

Nowadays XML is everywhere: many file and data formats are XML-based. So it is natural for WinMerge to have good XML support too.

Comparing XML files

WinMerge versions 2.4 and 2.6 come with a plugin that eases comparing XML files.

But it would be good to have a more effective XML compare than that. So how should we compare XML files?

Project files

WinMerge project files are XML files. Currently a project file contains the paths to compare, plus an optional filter name and the subfolder inclusion setting. In the future we want to store at least some compare options there too.

XML parser

Internal parser

WinMerge 2.4 and 2.6 have a mix of code for handling XML files. There is a partial installation of the expat parser files in /Src/ExpatLib/ and /Src/ExpatMapLib/. Then there is custom parser code in the form of Jochen's markdown and Perry's XmlDoc class built on top of it.

Expat + SCEW

WinMerge 2.7.x now uses the expat XML parser and the SCEW wrapper for XML handling. Together they allow flexible, DOM-like handling of the XML tree.

Using SCEW

Unfortunately the documentation for SCEW isn't very rich, and it is a bit hard to read too. So here is a short intro to using SCEW (in WinMerge):

For starters, add the include line:

#include <scew/scew.h>

Opening the XML File (or creating a new file)

Use the C library functions for opening files, since the SCEW functions only accept ANSI paths and WinMerge must support Unicode:

FILE * fp = _tfopen(path, _T("r")); // "r" for reading, "w" for writing

Parsing the XML File

To parse a file, we need to create a parser:

scew_parser* parser = NULL;
parser = scew_parser_create();

Set it to ignore whitespace (generally a good idea):

scew_parser_ignore_whitespaces(parser, 1);

Parse the open file:

if (scew_parser_load_file_fp(parser, fp))
{
 ...
}

Get the XML tree:

scew_tree* tree = NULL;
tree = scew_parser_tree(parser);

Get the root element as the starting point:

scew_element * root = scew_tree_root(tree);

Now you can traverse the tree with the many SCEW functions, including:

scew_element* scew_element_by_name(scew_element const* parent, XML_Char const* name);
unsigned int scew_element_count(scew_element const* parent);
scew_element* scew_element_next(scew_element const* parent, scew_element const* element);

More functions are listed in scew/element.h.

As these functions return pointers to existing elements in the tree, there is no need to free/release those elements.

But remember to free the tree and the parser, and to close the file pointer, after parsing:

scew_tree_free(tree);
scew_parser_free(parser);
fclose(fp);

Creating a new XML File

When creating a new XML file, you start by creating the tree and then adding the root element to it:

scew_tree* tree = scew_tree_create();
scew_element * root = scew_tree_add_root(tree, root_element_name);

Now you can use the scew/element.h functions to add elements, including:

scew_element* scew_element_add(scew_element const* parent, XML_Char const* name);
scew_element* scew_element_add_elem(scew_element* element, scew_element* new_elem);

You set the element data with:

XML_Char const* scew_element_set_contents(scew_element* element, XML_Char const* data);

Once the tree is built, set the encoding to UTF-8 (which is what we want to use in WinMerge):

scew_tree_set_xml_encoding(tree, _T("UTF-8"));

And mark it as a stand-alone XML file:

scew_tree_set_xml_standalone(tree, 1);

Then you can write the file:

scew_writer_tree_fp(tree, fp);

Note: No parser is created when writing the data!

And remember to free the tree and close the file once you are done with the data and the file.

Examples

There are a couple of simple examples in scew/examples. The project file code in Src/ProjectFile.cpp and Src/ProjectFile.h works as an example too.

Development

Some future development ideas:

  • convert filter definition files to XML
  • options import/export to XML files
  • use XML files as options storage instead of registry
  • better XML file compare support. What does this mean in practice? Can we compare individual elements in the XML tree?
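
On the last question, one conceivable approach is to walk two parsed trees in parallel and report the first element where they differ. The sketch below is only an illustration under stated assumptions: `Node` is a hypothetical stand-in for a parsed element (it is not a SCEW or WinMerge type), `FirstDifference` is a made-up name, children are compared strictly in document order, and attributes are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML element; not a SCEW or WinMerge type.
struct Node {
    std::string name;
    std::string contents;
    std::vector<Node> children;
};

// Returns an empty string when the two subtrees are equal. Otherwise returns
// a slash-separated path (built from the left tree's element names) to the
// first element whose name, contents or number of children differs.
std::string FirstDifference(const Node& left, const Node& right,
                            const std::string& parentPath = "")
{
    const std::string path = parentPath + "/" + left.name;

    if (left.name != right.name || left.contents != right.contents)
        return path;
    if (left.children.size() != right.children.size())
        return path;

    // Same shape at this level: descend into the children pairwise.
    for (std::size_t i = 0; i < left.children.size(); ++i) {
        const std::string diff =
            FirstDifference(left.children[i], right.children[i], path);
        if (!diff.empty())
            return diff;
    }
    return std::string();
}
```

Comparing children by position is the simplest choice and treats reordered elements as different; whether that is the right behaviour for an XML compare is exactly the open question.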

Convert Using Expat + wrapper

We'll convert the XML code to use the expat XML parser and the SCEW wrapper for it. This allows simple handling of XML files in C/C++ code.

- Conversion is now done for project files.

There are several steps needed for this:

  • check whether we can drop markdown as well; it might be used by archive support/Merge7z?

Jtuc: WinMerge uses markdown for 7-Zip version detection, and for encoding detection of XML and HTML files. Besides, I used to have plans for a UniFile derived class that employs markdown to canonicalize whitespace in XML files on the fly while populating the CrystalEdit.
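
For XML files, one standard source of encoding information is the XML declaration at the top of the file. The sketch below shows roughly what reading it involves. It is not the markdown-based code: `XmlDeclaredEncoding` is a hypothetical name, the start of the file is assumed to be already in a buffer in an ASCII-compatible encoding, and byte-order-mark handling is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not WinMerge code. Returns the value of the encoding
// pseudo-attribute of an XML declaration, or an empty string if the buffer
// does not start with a declaration or the declaration names no encoding.
std::string XmlDeclaredEncoding(const std::string& head)
{
    // The declaration, if present, must be the very first thing in the file.
    if (head.compare(0, 5, "<?xml") != 0)
        return std::string();

    const std::size_t declEnd = head.find("?>");
    if (declEnd == std::string::npos)
        return std::string();
    const std::string decl = head.substr(0, declEnd);

    std::size_t pos = decl.find("encoding");
    if (pos == std::string::npos)
        return std::string();

    // The value may be quoted with either single or double quotes.
    pos = decl.find_first_of("\"'", pos);
    if (pos == std::string::npos)
        return std::string();
    const char quote = decl[pos];
    const std::size_t close = decl.find(quote, pos + 1);
    if (close == std::string::npos)
        return std::string();

    return decl.substr(pos + 1, close - pos - 1);
}
```

An empty result means the caller has to fall back on other detection methods or on a default encoding.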

Unit testing:

  • Project file loading cases are in repository and all tests pass.
  • Project file saving tests are not yet written.