May 20, 2014

On D3 Components

“The Ideas are boundaries or limits, they are definite, as opposed to indefinite Space, and impress or imprint themselves like rubber-stamps, or better, like moulds, upon Space, thus generating sensible things”

— Karl Popper, The Open Society

Discussion of web components is a contentious and shifting landscape. Over the last couple of years, we’ve built a lot of components and explored various approaches. This is a (now relatively outdated) article I wrote to capture and share some distilled thoughts on one particular approach, which we found to be surprisingly powerful in many respects. It is by no means a silver bullet, but hopefully you will find some of the techniques and concepts useful.

Enter The Closure #

To begin with, it’s worth mentioning that this article follows on from, and assumes familiarity with, the pattern outlined in this article (which won’t be re-evaluated here). In fact, we leverage a lot of philosophy from D3 as a foundation, which is why these are sometimes called “D3 Components”, or why repos are prefixed with “d3-”. The conclusion of that argument is best summarised as: implement components as “(I) closures with (II) getter-setter methods”.

Closures are used to bind the configuration to the function.
Getter-setter methods are used for reconfiguration that also enables method chaining.

A typical skeletal component would thus look like:

function createChart() {
  /* internal variables, with defaults */
  var width = 720
    , height = 80

  function chart() {
    /* generate chart here, using `width` and `height` */
  }

  /* getter-setter accessor methods: */
  chart.width = function(value) {
    if (!arguments.length) return width
    width = value
    return chart
  }

  chart.height = function(value) {
    if (!arguments.length) return height
    height = value
    return chart
  }

  return chart
}

And a typical method of invocation would look like:

var myChart = createChart()
  .width(720)
  .height(80)

d3.select(".chart").call(myChart)
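To see both halves of the pattern working without a browser, here is a minimal sketch that swaps the DOM for plain objects. The `fakeSelection` helper and the `createBox` component are hypothetical stand-ins invented for illustration (a real D3 selection offers far more than `.call`); the accessor bodies mirror the skeleton above.

```javascript
// Hypothetical stand-in for a D3 selection: only supports .call()
function fakeSelection(node) {
  var selection = {
    node: node,
    call: function(fn) {
      fn(selection)      // hand the selection to the component
      return selection   // allow further chaining
    }
  }
  return selection
}

function createBox() {
  // configuration captured by the closure, with defaults
  var width = 720
    , height = 80

  // the component: stamps the current configuration onto the node
  function box(selection) {
    selection.node.width = width
    selection.node.height = height
  }

  box.width = function(value) {
    if (!arguments.length) return width
    width = value
    return box
  }

  box.height = function(value) {
    if (!arguments.length) return height
    height = value
    return box
  }

  return box
}

// each call to createBox() gets its own private configuration
var small = createBox().width(100).height(10)
var large = createBox()

var a = {}, b = {}
fakeSelection(a).call(small)
fakeSelection(b).call(large)
// a is now { width: 100, height: 10 }, b is { width: 720, height: 80 }
```

Because the configuration lives in the closure rather than on the node, `small` and `large` cannot interfere with each other.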

The Hollywood Principle #

The first thing to notice is that we never store the selection. Being stateless is one desirable quality we’ll discuss later, but the key maxim here is actually “transformation, not representation”. These components should be viewed as transformation functions. More tangibly, the best metaphor I’ve found is “rubber-stamping”: a function that rubber-stamps the DOM and moves on. This makes it a very transparent layer (sometimes called representational transparency), one that interoperates cleanly with web standards (HTML, SVG, CSS) rather than hiding them away. It shines the primary spotlight on the fact that your HTML is a very public part of your API, encouraging more semantic markup and a more thoughtful DOM footprint. This is in contrast to traditional components, which are more encapsulation-heavy, act as a bag of state and contain more arbitrary abstractions, because all the attention is on the component layer.

Another interpretation of this architectural principle is the Hollywood Principle (“don’t call us, we’ll call you”), “a useful paradigm that assists in the development of code with high cohesion and low coupling that is easier to debug, maintain and test”. Normally, configuration and control would be handed over to a component which internally creates an explicit linear control path. This is the easiest thing to do in the short term, but turns your component code into spaghetti in the long term as the logic starts branching and it begins to worry excessively about cross-dependencies on the inside.

Update Pattern #

One phenomenon this sacrifice of monopoly of control visibly gives rise to is the inverted update idiom. For example, you might be used to the following update pattern:

var myChart = $('.chart').initChart()
myChart.width(300).mode(3).type(1)
myChart.update()

But this muddies the water a little around what a chart function is. Currently it is a function that operates on a selection (any selection). The update function turns this nice Hollywood Principle on its head and clings on to a selection like it’s special. What, for example, would happen if the chart was called on two selections, and then update was called?

Keep it simple: director.call(outOfWorkActor)
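Under the inverted idiom, an update is nothing more than reconfiguring and calling again, on whichever target you choose. A rough sketch, with plain objects standing in for DOM nodes and a hypothetical `createLabel` component (neither is part of D3):

```javascript
// Hypothetical component following the closure + getter-setter pattern
function createLabel() {
  var text = 'untitled'

  // takes a node (standing in for a selection), stamps it, keeps no reference
  function label(node) {
    node.textContent = text
  }

  label.text = function(value) {
    if (!arguments.length) return text
    text = value
    return label
  }

  return label
}

var label = createLabel().text('first')
var left = {}, right = {}

// the same component can stamp any number of targets
label(left)
label(right)

// "update" = reconfigure, then call again on the target you mean
label.text('second')
label(left)
// left.textContent is 'second'; right.textContent is still 'first'
```

There is no ambiguity about which target gets refreshed, because the caller names it each time.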

Lean Sub-Components #

The above is a way of thinking, rather than just a change in API invocations. So it applies recursively throughout the entire component. In fact, you can fractally unpack a whole app in this manner very neatly like lego blocks, rather than a traditional jenga tower. At every stage, each part is deliberately made blind to its brethren parts (low coupling), so that each one solely has one concern (high cohesion).

Take for example the following code which spawns 3 boxes for a filter sentence:

d3.select('ul')
  .selectAll('li')
  .data([{...}, {...}, {...}])
  .enter()
  .append('li')

ul
  li
  li
  li

The following renderer function is now given only a selection (this) to operate on, and the datum (d) relevant to it. This means it can now be tested and evolve in isolation without really affecting other parts of the code.

d3.selectAll('li')
  .each(renderer)

function renderer(d){
  // operate just on the node using 'this' and 'd'
}
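Since the renderer depends only on `this` and `d`, it can be exercised without D3 or a browser at all. A small sketch, where the body of `renderer` and the `label`/`active` fields are made up for illustration:

```javascript
// Hypothetical renderer: derives everything from the node and its datum
function renderer(d) {
  this.textContent = d.label
  this.className = d.active ? 'is-active' : ''
}

// A plain object stands in for the DOM node; .call() supplies `this`,
// which is what selection.each does for every node in the selection
var node = {}
renderer.call(node, { label: 'Country', active: true })
// node is now { textContent: 'Country', className: 'is-active' }
```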

Loosely Decoupled #

Breaking down large systems into modular pieces is a common refrain. But how often do we see this in existing web components? Rarely, because the approach usually taken fundamentally does not lend itself to good modularity. Rather, it encourages an outcome where components turn into one epically long super-transformer. It then becomes nigh on impossible to make a change without first understanding the entire code, and how you will be affecting 102389 other features that depend on the change you are about to make. In our case, by contrast, modularity is highly incentivised. Consider the lookups as an example. A lookup is essentially made up of two parts: a text-field and a drop-down. The main component wrapper simply “mixes-and-matches” different sub-components to create a whole “family of components”:

// a simple lookup
d3.select(".text-field").call(textField)
d3.select(".drop-down").call(singleDropdown)

// a grouped lookup
d3.select(".text-field").call(textField)
d3.select(".drop-down").call(groupedDropdown)

In this scenario, any developer can effortlessly contribute more drop-down or text-field sub-components to enrich the range of available flavours.

It’s worth mentioning that an insightful yet skeptical attitude to making decisions on functionality and modularity is still required. It is still logically conceivable to over-bloat a component into a super-transformer with unrelated sugary features or break down parts too finely into shrapnel. An example of the former would be adding remote data loading capabilities into a lookup component, whose primary concern is much more basic, and should in itself remain indifferent to where the array of data it receives comes from.

Declarative, not Imperative #

As mentioned, one of the biggest difficulties with writing components is how to deal with exponential complexity as they grow over time. This problem of unmaintainability and brittleness stems in large part from the strongly imperative nature of common programming styles. For example, if I have a tree and click on a leaf, I instantly want to add a tick to all its parents. What do I do? Obviously reach for the $ and tick ‘em!

.grand-father
  .parent
    .child
    .child
  .parent
    .child
.grand-mother
...

function onClicked(){
  $('*').removeClass('ticked')
  $(this).addClass('ticked')
  recursivelyTick(this.parentNode)
}

// takes the node explicitly: `this` is not carried over by a plain call
function recursivelyTick(node){
  if (isInsideTree(node)) {
    $(node).addClass('ticked')
    recursivelyTick(node.parentNode)
  }
}

In contrast, declarative programming does not explicitly say what to do step-by-step. This principle very strongly underscores the idea of Data-Driven Documents (D3). With D3 and its concept of joins, you would commonly define a render sequence only once, and when called upon, that would be responsible for mutating the DOM to match a given dataset. We take this idea to heart in our components by having only one render sequence responsible for making changes. Thus, the more declarative solution to the above problem would be to not overreach our jurisdiction from that single item (creating unnecessary coupling), but from the confines of our context, simply change the datum, then call the redraw.

function onClicked(d){
  d.selected = true
  draw()
}

function draw(){
  ...
  li.classed('is-selected', isSelected)
    .classed('is-selected', isChildSelected)  
  ...
}

function isSelected(d) { 
  return d.selected 
}

function isChildSelected(d) {
  var children = [d].reduce(flatten, [])
    , ticked = children.some(isSelected)

  return ticked
}
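The `flatten` reducer is not defined in the snippet above. One possible implementation, assuming each datum keeps its descendants in a `children` array (an assumption; the data shape is not specified here), collects a node and everything beneath it:

```javascript
// Assumed data shape: { selected: Boolean, children: [ ...same shape ] }
function flatten(list, d) {
  list.push(d)
  return (d.children || []).reduce(flatten, list)
}

function isSelected(d) {
  return d.selected
}

// true if the datum itself or any of its descendants is selected
function isChildSelected(d) {
  return [d].reduce(flatten, []).some(isSelected)
}

var leaf = { selected: false }
var tree = {
  selected: false,
  children: [
    { selected: false, children: [leaf] },
    { selected: false }
  ]
}

isChildSelected(tree) // false
leaf.selected = true  // what onClicked would do
isChildSelected(tree) // true, the ancestors pick it up on the next draw
```

No handler ever walks up the tree: each node answers the question for itself from the data when the render sequence runs.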

Idempotent Components #

D3 joins are usually used for data joins, but this is largely because that’s how they are portrayed in simple sketches where the rest of the structure is taken for granted. In reusable components, we take the idea of joins to the max by also using them to generate the structure of the component itself (typically, .data([0])). With a traditional component like a jQuery plugin, if you try to initialise an element twice ($('div').initChart().initChart()), it would likely try to insert those structural elements twice and barf, or perhaps be blocked explicitly by an if guard. But repeatedly rubber-stamping an element with a D3 component is completely innocuous, and relatively cheap.
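The idempotence itself can be illustrated without D3. The sketch below is a toy “join” over plain objects, not D3’s implementation; `stamp` and the `name` field are hypothetical. It creates only the missing parts (enter), leaves existing ones alone (update) and drops unwanted ones (exit):

```javascript
// Not D3's implementation: a toy join over plain objects, for illustration only
function stamp(node, names) {
  node.children = node.children || []

  // enter: create only the parts that do not exist yet
  names.forEach(function(name) {
    var exists = node.children.some(function(child) {
      return child.name === name
    })
    if (!exists) node.children.push({ name: name })
  })

  // exit: drop parts that are no longer wanted
  node.children = node.children.filter(function(child) {
    return names.indexOf(child.name) !== -1
  })

  return node
}

var el = {}
stamp(el, ['header', 'body'])
var header = el.children[0]

stamp(el, ['header', 'body']) // second stamp finds everything in place
// el.children.length is still 2, and el.children[0] is the same object as before
```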

One effect this gives rise to is called the “re-render everything principle”. Because D3 calculates the difference between the current state and the desired state, then selectively applies operations (if at all), “re-render everything” suddenly becomes much less expensive (but not free).

This concept is somewhat analogous to React’s virtual DOM diffing algorithm, which achieves the same goal and behaviour, but with its own custom API. React, however, wraps this in a nice abstraction and makes it much more performant by picking up even the most vagrant of techniques (like reparenting nodes), whereas none of that comes for free with D3. Beyond the tripartite division of enter/update/exit, further performance gains can be achieved by crafting further “subselections”. Performance aside, there are also semantic reasons for doing this:

rows.filter(isNormal).each(normal)
rows.filter(isCritical).each(critical)
rows.filter(isWarning).each(warning)

Event Communication #

Another major player in bringing about this unique decoupled environment is the event-messaging system. Parts of the component communicate with each other by sending and receiving messages. For this, we (almost) just use the native d3.dispatch and d3.rebind techniques. This simplifies the main component into a middleware-like pattern in which the order of registering middleware becomes unimportant:

var lookup = createLookup()
  .textField(textField)
  .dropDown(dropDown)
  .anotherPlugin(anotherPlugin)
  ...

I say “almost”, since events are a bit harder to proxy because there’s no native equivalent for ‘rebinding’ events from different dispatchers. You can roll your own little utility for that, but the basic practice of proxying dispatchers by manually subscribing to child events is normally fine.

These events are then broadcast on the component itself, rather than the DOM, since it’s the component that takes the raw browser events, ‘understands’ them, and decides what they mean and how they should be represented from its API perspective. It’s also more honest to not proxy the custom events through the node, as it can become hard to understand where that particular event really came from.
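Manual proxying can be sketched as follows. To keep the example self-contained, `createDispatcher` is a hand-rolled stand-in for d3.dispatch (it is not D3’s API), and `createDropDown`, `pick` and the event names `select` and `change` are hypothetical:

```javascript
// Hypothetical stand-in for d3.dispatch: one listener list per event type
function createDispatcher() {
  var listeners = {}
  return {
    on: function(type, fn) {
      listeners[type] = listeners[type] || []
      listeners[type].push(fn)
    },
    emit: function(type, payload) {
      var fns = listeners[type] || []
      fns.forEach(function(fn) { fn(payload) })
    }
  }
}

// Hypothetical sub-component: turns a raw interaction into a 'select' event
function createDropDown() {
  var events = createDispatcher()
  function dropDown() { /* render options here */ }
  dropDown.on = function(type, fn) { events.on(type, fn); return dropDown }
  // stands in for a click handler on an option
  dropDown.pick = function(value) { events.emit('select', value) }
  return dropDown
}

// Hypothetical parent: proxies the child's event by subscribing manually
function createLookup() {
  var events = createDispatcher()
  function lookup() { /* render text field and drop-down here */ }
  lookup.on = function(type, fn) { events.on(type, fn); return lookup }
  lookup.dropDown = function(child) {
    child.on('select', function(value) {
      // re-broadcast on the component, in the component's own vocabulary
      events.emit('change', { value: value })
    })
    return lookup
  }
  return lookup
}

var received = []
var dropDown = createDropDown()
var lookup = createLookup()
  .dropDown(dropDown)
  .on('change', function(e) { received.push(e.value) })

dropDown.pick('Paris')
// received is now ['Paris']
```

Consumers only ever listen on `lookup`; which child originally raised the event stays an internal detail.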

Living Consensus #

These are some of the concepts we have found useful for authoring components, but nothing is ever set in stone. Our approach currently seems quite refined, although it may continue to change and adapt. There are a number of places where it does start to creak a little, and below are a few of the main areas for further discussion:

State Storage: Admittedly, these are not pure transformation functions. They are decoupled from the DOM (.select) and the data (.datum), but some transient configuration state is still held in the function itself. Is the force layout the best example to follow? Or should the configuration state be devolved to the DOM too with __component__, similar to __data__ and __chart__?
Decoupling Events: More precisely, event sources from actions, using queues.
Referentially Apparent: It would be more in line with functional principles if the accessors didn’t mutate a configurable function, but returned a new immutable object every time.
Reactive: FRP is a powerful concept and these transformation functions could lend themselves very well to the ability to react to changes in data.
Real-Time Performance: Where realtime performance is important, use requestAnimationFrame to batch and defer redraws.
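For the last point, one way to batch redraws is to coalesce any number of requests into a single callback per frame. A minimal sketch: `createBatcher` is a hypothetical helper, and the `setTimeout` fallback is an assumption for environments without requestAnimationFrame.

```javascript
// Wraps `draw` so that many requests within one frame cause a single redraw.
// `schedule` is injectable; it defaults to requestAnimationFrame when present.
function createBatcher(draw, schedule) {
  var pending = false
  schedule = schedule || (typeof requestAnimationFrame === 'function'
    ? requestAnimationFrame
    : function(fn) { return setTimeout(fn, 16) })

  return function requestDraw() {
    if (pending) return        // a redraw is already queued for this frame
    pending = true
    schedule(function() {
      pending = false
      draw()
    })
  }
}

// usage with a manual scheduler so the behaviour is easy to observe
var queue = []
var draws = 0
var requestDraw = createBatcher(
  function() { draws++ },
  function(fn) { queue.push(fn) }
)

requestDraw()
requestDraw()
requestDraw()          // three requests, one queued frame
queue.shift()()        // the "frame" fires
// draws is 1
```

In a component, the event handlers would change the data and call `requestDraw()` rather than calling the render sequence directly.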

Twitter: @pemrouz
