XML parsing for FileAttachments

XML FileAttachments can be parsed by first reading them as text and then passing the text to a parser such as DOMParser, which yields a DOM node tree that is convenient to work with.
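The two-step approach described above can be sketched as a small helper. This is only a sketch: the name `parseXML` is hypothetical and not part of the Observable standard library, and the parser is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: turn XML text into a DOM Document.
// `parser` defaults to the browser's DOMParser; any object with a
// compatible parseFromString(text, mimeType) method can be supplied instead.
function parseXML(text, parser = new DOMParser()) {
  const doc = parser.parseFromString(text, "application/xml");
  // DOMParser does not throw on malformed XML; it returns a document
  // containing a <parsererror> element, so check for that explicitly.
  const error = doc.querySelector("parsererror");
  if (error) throw new Error(`XML parse error: ${error.textContent}`);
  return doc;
}

// In a notebook cell this might be used as follows
// ("data.xml" is a placeholder file name):
//   doc = parseXML(await FileAttachment("data.xml").text())
```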

Since XML is (still?) used a lot, it may make sense to extend FileAttachment itself with a new .xml() method that returns the parsed result directly. After all, FileAttachment now supports not just the parsers provided by fetch but also csv, arrow, sqlite, zip … Also, d3-fetch supports XML parsing (based on DOMParser).

Since @mbostock provided the nice LocalFile example, I tried a probably simplistic implementation in a fork:

It works in principle, except that of course the default value does not get parsed if it is a FileAttachment. If the default value is a LocalFile itself, there is an error with the url.

I do not understand enough to fix the default value issue. It may be easy or it may be hard. So any hints appreciated.

But I think the example shows that it may be feasible to add XML parsing to FileAttachment in the standard library. I cannot really think of a reason why it is missing. Is it because DOMParser is a browser-only API?

Here is the FileAttachment implementation:

I added feature feedback here:

The feedback points to a small patch to fileAttachment.js. I was able to test it using devtools and the debugger on observablehq.com.

It seems easier to patch the source than to wrap or extend the library.

It is unclear if this could be added to FileAttachment since the feedback issue was closed.

In the meantime I made some progress extending FileAttachment from a notebook:

That solution is as stripped down as I could figure out and I think would work well enough. There are two drawbacks:

  • Access to the FileAttachment by its internal URL is blocked by the CORS policy of the notebook page. That seems strange, but there may be a reason for it. It can be worked around by using the long download URL for the file in the notebook. It looks a little ugly but is not too bad.

  • Instead of invoking FileAttachment as a function, it is necessary to use the new keyword to create an instance of a class. This may also be OK, and it may be possible to find a wrapper function.
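To illustrate the second point, here is a minimal sketch of a class with an xml() method plus a wrapper function that hides the new keyword. It is not the notebook's actual code: the names `XMLAttachment` and `xmlAttachment`, the constructor arguments, and the choice to fetch by URL are all assumptions made for illustration.

```javascript
// Hypothetical class: loads a file by URL and parses it as XML.
// `fetchImpl` is injectable so the class does not depend on a global fetch.
class XMLAttachment {
  constructor(url, fetchImpl = (u) => fetch(u)) {
    this.url = url;
    this._fetch = fetchImpl;
  }
  async text() {
    const response = await this._fetch(this.url);
    if (!response.ok) {
      throw new Error(`Unable to load ${this.url}: ${response.status}`);
    }
    return response.text();
  }
  // `parser` defaults to the browser's DOMParser.
  async xml(parser = new DOMParser()) {
    return parser.parseFromString(await this.text(), "application/xml");
  }
}

// Wrapper so callers can write xmlAttachment(url) instead of new XMLAttachment(url).
function xmlAttachment(url, fetchImpl) {
  return new XMLAttachment(url, fetchImpl);
}
```

With such a wrapper, the call site looks much like a regular FileAttachment call, except that it takes the file's download URL rather than its name.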

I am not sure why the name property remains undefined. On the other hand, the name will always be known to the notebook anyway.

Any feedback or comment much appreciated.

Of course, instead of importing such a class from a notebook, one could also import a simple ‘toXML’ function and pass it the result of .text().

Native XML parsing has been added! See “add xml support for file attachments” by CobusT, Pull Request #246 in observablehq/stdlib on GitHub.

I updated the notebook above to show that it works.
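For reference, a minimal sketch of how the built-in method might be used. The helper name `describeXML` and the file name in the comment are hypothetical; the only thing assumed about the attachment is that it has an xml() method returning a promise of a parsed document.

```javascript
// Hypothetical helper: summarize the root element of an XML attachment.
// `attachment` is expected to be a FileAttachment, or any object
// with a compatible async xml() method.
async function describeXML(attachment) {
  const doc = await attachment.xml(); // resolves to the parsed document
  const root = doc.documentElement;
  return { root: root.tagName, children: root.children.length };
}

// In a notebook cell ("data.xml" is a placeholder file name):
//   summary = describeXML(FileAttachment("data.xml"))
```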