Converting XMLDocument to JSON or data wrangling with XML

It seems that XML is not a format of choice for data wrangling the way JSON is. Unfortunately, all my data is in XML, and I would like to get it into a format that tutorials can help me with.

Code

The URL needs a service key, so I used Secrets. The web service also doesn't have CORS enabled, so I set up the Heroku app as instructed.

xmlData = d3.xml(`https://ciscorucinski-cors.herokuapp.com/http://openapi.data.go.kr/openapi/service/rest/Covid19/getCovid19SidoInfStateJson?serviceKey=${Secret('KEY_DATA.GO.KR')}&pageNo=1&numOfRows=100&startCreateDt=20200920&endCreateDt=20200920`)

Is there a better way of getting this data via Observable?

Anyway, I see that I can get a list of all the item tags, but I cannot easily view the data.

xmlData.getElementsByTagName('item')

But I still cannot see the data. I haven't been able to find good tutorials on how to convert XML to JSON (or even on just dealing with XML as-is).

XML from webservice (Snippet)

<response>
    <header>
        <resultCode>00</resultCode>
        <resultMsg>NORMAL SERVICE.</resultMsg>
    </header>
    <body>
        <items>
            <item>
                <createDt>2020-09-21 10:04:45.368</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>1451</defCnt>
                <gubun>검역</gubun>
                <gubunCn>隔離區</gubunCn>
                <gubunEn>Lazaretto</gubunEn>
                <incDec>10</incDec>
                <isolClearCnt>1355</isolClearCnt>
                <isolIngCnt>96</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>10</overFlowCnt>
                <qurRate>-</qurRate>
                <seq>4472</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.368</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>58</defCnt>
                <gubun>제주</gubun>
                <gubunCn>济州</gubunCn>
                <gubunEn>Jeju</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>45</isolClearCnt>
                <isolIngCnt>13</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>8.65</qurRate>
                <seq>4471</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.368</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>285</defCnt>
                <gubun>경남</gubun>
                <gubunCn>庆南</gubunCn>
                <gubunEn>Gyeongsangnam-do</gubunEn>
                <incDec>2</incDec>
                <isolClearCnt>254</isolClearCnt>
                <isolIngCnt>31</isolIngCnt>
                <localOccCnt>2</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>8.48</qurRate>
                <seq>4470</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>55</deathCnt>
                <defCnt>1512</defCnt>
                <gubun>경북</gubun>
                <gubunCn>庆北</gubunCn>
                <gubunEn>Gyeongsangbuk-do</gubunEn>
                <incDec>1</incDec>
                <isolClearCnt>1411</isolClearCnt>
                <isolIngCnt>46</isolIngCnt>
                <localOccCnt>1</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>56.79</qurRate>
                <seq>4469</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>167</defCnt>
                <gubun>전남</gubun>
                <gubunCn>全南</gubunCn>
                <gubunEn>Jeollanam-do</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>118</isolClearCnt>
                <isolIngCnt>49</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>8.96</qurRate>
                <seq>4468</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>115</defCnt>
                <gubun>전북</gubun>
                <gubunCn>全北</gubunCn>
                <gubunEn>Jeollabuk-do</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>88</isolClearCnt>
                <isolIngCnt>27</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>6.33</qurRate>
                <seq>4467</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>2</deathCnt>
                <defCnt>468</defCnt>
                <gubun>충남</gubun>
                <gubunCn>忠南</gubunCn>
                <gubunEn>Chungcheongnam-do</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>348</isolClearCnt>
                <isolIngCnt>118</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>22.05</qurRate>
                <seq>4466</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>1</deathCnt>
                <defCnt>158</defCnt>
                <gubun>충북</gubun>
                <gubunCn>忠北</gubunCn>
                <gubunEn>Chungcheongbuk-do</gubunEn>
                <incDec>2</incDec>
                <isolClearCnt>136</isolClearCnt>
                <isolIngCnt>21</isolIngCnt>
                <localOccCnt>2</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>9.88</qurRate>
                <seq>4465</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>3</deathCnt>
                <defCnt>217</defCnt>
                <gubun>강원</gubun>
                <gubunCn>江原</gubunCn>
                <gubunEn>Gangwon-do</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>195</isolClearCnt>
                <isolIngCnt>19</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>14.09</qurRate>
                <seq>4464</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>63</deathCnt>
                <defCnt>4174</defCnt>
                <gubun>경기</gubun>
                <gubunCn>京畿</gubunCn>
                <gubunEn>Gyeonggi-do</gubunEn>
                <incDec>18</incDec>
                <isolClearCnt>3457</isolClearCnt>
                <isolIngCnt>654</isolIngCnt>
                <localOccCnt>18</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>31.50</qurRate>
                <seq>4463</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.367</createDt>
                <deathCnt>0</deathCnt>
                <defCnt>70</defCnt>
                <gubun>세종</gubun>
                <gubunCn>世宗</gubunCn>
                <gubunEn>Sejong</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>64</isolClearCnt>
                <isolIngCnt>6</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>20.45</qurRate>
                <seq>4462</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>2</deathCnt>
                <defCnt>142</defCnt>
                <gubun>울산</gubun>
                <gubunCn>蔚山</gubunCn>
                <gubunEn>Ulsan</gubunEn>
                <incDec>0</incDec>
                <isolClearCnt>117</isolClearCnt>
                <isolIngCnt>23</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>12.38</qurRate>
                <seq>4461</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>3</deathCnt>
                <defCnt>356</defCnt>
                <gubun>대전</gubun>
                <gubunCn>大田</gubunCn>
                <gubunEn>Daejeon</gubunEn>
                <incDec>2</incDec>
                <isolClearCnt>291</isolClearCnt>
                <isolIngCnt>62</isolIngCnt>
                <localOccCnt>2</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>24.15</qurRate>
                <seq>4460</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>3</deathCnt>
                <defCnt>486</defCnt>
                <gubun>광주</gubun>
                <gubunCn>光州</gubunCn>
                <gubunEn>Gwangju</gubunEn>
                <incDec>1</incDec>
                <isolClearCnt>423</isolClearCnt>
                <isolIngCnt>60</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>1</overFlowCnt>
                <qurRate>33.36</qurRate>
                <seq>4459</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>8</deathCnt>
                <defCnt>881</defCnt>
                <gubun>인천</gubun>
                <gubunCn>仁川</gubunCn>
                <gubunEn>Incheon</gubunEn>
                <incDec>2</incDec>
                <isolClearCnt>764</isolClearCnt>
                <isolIngCnt>109</isolIngCnt>
                <localOccCnt>1</localOccCnt>
                <overFlowCnt>1</overFlowCnt>
                <qurRate>29.80</qurRate>
                <seq>4458</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>193</deathCnt>
                <defCnt>7125</defCnt>
                <gubun>대구</gubun>
                <gubunCn>大邱</gubunCn>
                <gubunEn>Daegu</gubunEn>
                <incDec>1</incDec>
                <isolClearCnt>6873</isolClearCnt>
                <isolIngCnt>59</isolIngCnt>
                <localOccCnt>0</localOccCnt>
                <overFlowCnt>1</overFlowCnt>
                <qurRate>292.43</qurRate>
                <seq>4457</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>4</deathCnt>
                <defCnt>385</defCnt>
                <gubun>부산</gubun>
                <gubunCn>釜山</gubunCn>
                <gubunEn>Busan</gubunEn>
                <incDec>8</incDec>
                <isolClearCnt>324</isolClearCnt>
                <isolIngCnt>57</isolIngCnt>
                <localOccCnt>8</localOccCnt>
                <overFlowCnt>0</overFlowCnt>
                <qurRate>11.28</qurRate>
                <seq>4456</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.366</createDt>
                <deathCnt>48</deathCnt>
                <defCnt>4995</defCnt>
                <gubun>서울</gubun>
                <gubunCn>首尔</gubunCn>
                <gubunEn>Seoul</gubunEn>
                <incDec>23</incDec>
                <isolClearCnt>3985</isolClearCnt>
                <isolIngCnt>962</isolIngCnt>
                <localOccCnt>21</localOccCnt>
                <overFlowCnt>2</overFlowCnt>
                <qurRate>51.32</qurRate>
                <seq>4455</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
            <item>
                <createDt>2020-09-21 10:04:45.365</createDt>
                <deathCnt>385</deathCnt>
                <defCnt>23045</defCnt>
                <gubun>합계</gubun>
                <gubunCn>合计</gubunCn>
                <gubunEn>Total</gubunEn>
                <incDec>70</incDec>
                <isolClearCnt>20248</isolClearCnt>
                <isolIngCnt>2412</isolIngCnt>
                <localOccCnt>55</localOccCnt>
                <overFlowCnt>15</overFlowCnt>
                <qurRate>44.45</qurRate>
                <seq>4454</seq>
                <stdDay>2020년 09월 21일 00시</stdDay>
                <updateDt>null</updateDt>
            </item>
        </items>
        <numOfRows>100</numOfRows>
        <pageNo>1</pageNo>
        <totalCount>19</totalCount>
    </body>
</response>

Webservice (Extra Info If Needed)

It's in Korean, so, if needed, search the page for "Javascript" to find sample code for acquiring the data. It needs a service key, so you won't be able to test it.

Hi @ciscorucinski,

Recently someone here had a similar issue while trying to parse an RSS feed, which is essentially an XML document. @visnup then made this notebook with a helper that converts XML to JSON, which is the most convenient format for working with your data on Observable.

You can import the helper:

import { xmlToJSON } from '@visnup/xml-to-json'

Load the XML text:

xmlString = fetch( `https://ciscorucinski-cors.herokuapp.com/http://openapi.data.go.kr/openapi/service/rest/Covid19/getCovid19SidoInfStateJson?serviceKey=${Secret('KEY_DATA.GO.KR')}&pageNo=1&numOfRows=100&startCreateDt=20200920&endCreateDt=20200920`
).then(res => res.text())

Parse the data:

data = xmlToJSON(xmlString)
items = data.response.body.items.item
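
The values in `items` will most likely still be text, since everything in the XML is a string. Here is a minimal sketch for cleaning one item, assuming each entry is a plain object whose values are strings (I have not checked exactly what the helper returns). `coerceItem` is a hypothetical name of mine, and the field list is taken from the XML snippet in the question.

```javascript
// Sketch only: assumes `item` is a plain object with string values.
function coerceItem(item) {
  // Count and rate fields, as named in the XML snippet from the question.
  const numericFields = [
    "deathCnt", "defCnt", "incDec", "isolClearCnt", "isolIngCnt",
    "localOccCnt", "overFlowCnt", "qurRate", "seq"
  ];
  const result = { ...item };
  for (const field of numericFields) {
    if (field in result) {
      const value = Number.parseFloat(result[field]);
      // "-" (seen in qurRate) and any other non-numeric text become null.
      result[field] = Number.isNaN(value) ? null : value;
    }
  }
  // The service sends the literal text "null" for a missing updateDt.
  if (result.updateDt === "null") result.updateDt = null;
  return result;
}
```

In a notebook this could then be used in its own cell as `cleanItems = items.map(coerceItem)`.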

Here is a live notebook in case you need it: https://observablehq.com/d/40d489d25fcf288b
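
If you would rather work with the XML document directly (the `xmlData` that `d3.xml` gives you in the question), the `item` elements can also be turned into plain objects with standard DOM calls. This is only a sketch and not what the notebook above does: `itemsToObjects` is a hypothetical name, and it assumes every `<item>` contains only leaf elements, as in your snippet.

```javascript
// Sketch only: turn every <item> element of a parsed XML Document into a
// plain object keyed by child tag name. Nested elements are not handled.
function itemsToObjects(xmlDoc, tagName = "item") {
  return Array.from(xmlDoc.getElementsByTagName(tagName), (element) =>
    Object.fromEntries(
      // Each child such as <defCnt>58</defCnt> becomes ["defCnt", "58"].
      Array.from(element.children, (child) => [child.tagName, child.textContent])
    )
  );
}
```

In a notebook cell that would be `items = itemsToObjects(xmlData)`. All values come out as strings, so counts still need converting to numbers afterwards.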
