Transforming JSON to Tables: the Impedance Problem
This is a note from 2022 I found by accident today. Teaser: It will become relevant again, soon, so I want to share the rough idea with you today.
JSON data containing nested object lists within object lists, namely, arrays of objects within other arrays, cannot be mapped directly to a 2D table representation.
You need to cut off the depth at a certain level and make compromises.
The following tablular data, represented with Markdown table syntax:
| title | prices |
| ----- | ------- |
| Foo | [1,2,3] |
| Bar | [4,5] |
… could be mapped to JSON like so:
[{"title": "Foo",
"prices": [1,2,3]},
{"title": "Bar",
"prices": [4,5]}]
Using this technique, you can also incorporate one extra object level into a comparable table structure, but this is the extent of its capacity.
Here’s an array with an object that basically includes JSON similar to the one above:
[{"shop": "Foo",
"articles": [{"title": "Bar",
"price": 100},
{"title": "Baz",
"price": 200}]
}]
You can flatten the array of objects associated with "articles"
by collecting both values for the "title"
and "price"
keys:
| shop | articles.title | articles.price |
| ---- | -------------- | -------------- |
| Foo | ["Bar","Baz"] | [100,200] |
But it quickly grows beyond a scale that is human-readable as a table. You can imagine adding more properties to objects nested in "articles"
, e.g. a "date"
key, that becomes the column "articles.date"
in the table.
Adding extra columns is feasible – however, expanding the nested object array to encompass more than a few items renders the table format unwieldy. As shown above, it’s clear that "Bar"
and 100
, and "Baz"
and 200
correlate. But with 10 items? Identifying the 5th or 7th in the sequence becomes arduous, particularly if the titles lengthen.
Although it’s technically feasible to depict a 2-level deep nested JSON structure in such a manner, the method is inadequate.
This is the impedance problem all over again. An object–relational impedance mismatch. How do you map complex objects into (relational) table structures?
JSON itself is already a flexible object store doesn’t have (or need?) a table schema. It quickly grows out of a 2D projection, and using 2D projections limits the real world use of this attempt to very few .json
files.