mgrafeeds [fuelled by mgrafit.com]

Reviews, snippets, opinions and other gems on data analysis and visualization in clinical pharmacology (and else)

Nice programming fonts

Droid sans mono: The Droid font family (available for download here) is a nice font family. It’s got a bit of flair, and stands out among the other monospace fonts I’ve listed, and its only real flaw is the lack of a slashed zero.

Inconsolata: Inconsolata is designed to be used with anti-aliasing enabled, but it’s surprisingly legible even at very small sizes.

Upcoming talk on Interactive Graphics (incl. d3js examples) at ISCB34

The International Society for Clinical Biostatistics was founded in 1978 to stimulate research into the principles and methodology used in the design and analysis of clinical research and to increase the relevance of statistical theory to the real world of clinical medicine.

Between Aug 25th and Aug 29th, 2013, the 34th annual conference of the ISCB will be held in Munich, Germany.

The program is rich, the audience massive. On the afternoon of Monday 25th, I will have the pleasure of presenting a short talk (15 min) on “Interactive graphics to improve communication of statistics models” in Session C 05.

The objective of my talk is to raise awareness of interactive graphics, which are VERY seldom used in clinical biostatistics. I hope this talk will spark some desire to know more about interactive graphics, and D3.js, in the clinical biostatisticians’ community.

Hopefully this after-lunch talk will be “refreshing”, so please come and meet me on the occasion of this event, in Munich, on Aug 25th 2013 from 2 to 3 pm.

Learning D3 by the book

D3 books are piling up! Here are the references available for whoever is interested in learning d3.js by the book:

  1. “Getting Started with D3” – O’Reilly, by Mike Dewar, approx. $US 18.00, introduces you to D3 through baby steps. The code repository accompanying the book is available here.
  2. “Interactive Data Visualization for the Web” – O’Reilly, by Scott Murray (alignedleft), FREE. This book covers the same introductory material as Mike Dewar’s and goes deeper in certain chapters, like Updates, Transitions and Motion, and Interactivity. Some of the examples covered in this book are available online here.
  3. “Developing a D3.js edge – constructing reusable D3 components and charts” – Bleeding Edge Press, by Christophe Viau (d3visualization), Andrew Thornton (graftdata), Ger Hobbelt (ger_hobbelt) and Roland Dunn (roland_dunn), approx. $US 15.00, is more advanced reading for people who are already proficient with JS and D3.js. The objective is to give you the keys to developing your own reusable charts using D3.
  4. “D3 Tips and Tricks” – Leanpub, by Malcolm Maclean (d3noob), almost FREE (actually suggested price $US 1.00), with >270 pages, is a rather comprehensive document which provides code examples for various types of graphics (basic bar chart as well as Sankey diagram, bullet chart, etc.).

Coming soon (August/September 2013)

  • “Data Visualization with d3.js” – Packt Publishing, by Swizec Teller, approx. $US 30.00; ISBN: 978-1-78216-000-7; 175 pages.
  • “Data Visualization with D3 Cookbook”, by Nick Qi Zhu (author of the dc.js library), ISBN: 978-1-78216-216-2. The source code is available online here.

D3js (meta-)libraries: a contrasted landscape

As D3 is gaining momentum, more people are joining the D3-community and more training material is being shared (especially training material for beginners). Yet, the initial D3 learning phase remains a major hurdle for many. As a consequence, potential users (especially those who are not so literate in web-programming) could fail to embrace this new technology.

D3.js is a versatile library to build (simple or) sophisticated graphics with animations and/or interactions, and other ‘visual effects’ in a web page. However, such sophistication comes at a certain price: the JS code can be long, easily >300 lines, and its development labor-intensive.

To address these two issues, D3 “meta-libraries” have been developed to make the programming of interactive graphics simpler (less verbose) and therefore shorter (fewer lines), which means less error-prone as well. Here is the list (in alphabetical order) of D3 meta-libraries that are currently available (AFAIK):

The main limitation when you use these meta-libraries is a reduced flexibility in designing new charts:

  • the choice of charts which can be built is inevitably less than what can be achieved with D3 itself,
  • the aesthetic features are often fixed or difficult to change,
  • the interaction or animation modalities are limited.
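To make this trade-off concrete, here is a minimal sketch of the idea behind a meta-library: drawing code wrapped in a function that exposes only a few options through chainable getter/setter accessors. It is plain JavaScript producing an SVG string, not D3 and not the API of any library reviewed here; the names barChart, width and color are hypothetical and chosen for illustration only.

```javascript
// Hypothetical sketch of a configurable chart function (not a real library API).
function barChart() {
  // The only options this "library" exposes; everything else is fixed.
  let width = 200;
  let color = "steelblue";
  const barHeight = 20; // not configurable: a typical meta-library limitation

  // Render an array of non-negative numbers as horizontal bars in an SVG string.
  function chart(data) {
    const max = Math.max(...data) || 1; // fall back to 1 if all values are 0
    const bars = data.map((d, i) => {
      const w = (d / max) * width;
      return `<rect x="0" y="${i * barHeight}" width="${w}" ` +
             `height="${barHeight - 1}" fill="${color}"/>`;
    });
    return `<svg width="${width}" height="${data.length * barHeight}">` +
           bars.join("") + "</svg>";
  }

  // Accessors: return the current value when called without argument,
  // otherwise set the value and return the chart to allow chaining.
  chart.width = function (value) {
    if (!arguments.length) return width;
    width = value;
    return chart;
  };
  chart.color = function (value) {
    if (!arguments.length) return color;
    color = value;
    return chart;
  };

  return chart;
}

// Usage: two lines of configuration instead of hand-written drawing code.
const svg = barChart().width(100).color("tomato")([5, 10]);
console.log(svg);
```

The user gets a chart in a couple of lines, but can only change what the accessors expose: here the bar height is out of reach without editing the function itself.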

Nevertheless, it may well be that 30% to 50% (this is a guesstimate) of the charts you are preparing on a daily basis could be prepared using one of these libraries.

These various tools have been designed by different people to address different needs, probably targeting different audiences. Hence, it is difficult to compare them directly, and this post deserves a word of caution: the best way to decide whether you should use one of the above libraries, and if so which one, is to try them all :)).

While polychart, nvd3, datawrapper, rickshaw and dc are relatively mature libraries, (it is our impression that) d3.chart, d3-generator, dexcharts, xcharts, vega, and (to some extent) dimple are still in their infancy (or adolescence). As a consequence, we decided to focus our review on the first 5.

Polychart.js was developed by Polychart (the company founded in Jan-2012 and led by Lisa Zhang, Fravic Fernando and Samson Hu). Polychart.js is a commercial product ($99/license). This product depends on d3.js and raphael.js. Attention! There is a non-commercial library called Polychart2, but this version depends on raphael.js, underscore.js and moment.js … totally different story!

The documentation of polychart.js is available in a wiki page on github. The examples provided on github are diverse and simple, although one would like to see more of them (there are 18 case studies). Similarly, the API documentation is very concise and would probably benefit from further explanations and examples – easier said than done, I know!

A neat feature of polychart is faceting (i.e. the possibility to create small-multiple plots).
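For readers new to the term, faceting boils down to splitting the data by the values of one variable and drawing the same plot once per subset. The sketch below shows only the splitting step in plain JavaScript; it is not Polychart's API, the helper facetBy is hypothetical, and the data are made up for illustration.

```javascript
// Hypothetical helper: group rows into one panel per distinct value of `key`.
function facetBy(rows, key) {
  const panels = new Map(); // a Map keeps panels in order of first appearance
  for (const row of rows) {
    const label = row[key];
    if (!panels.has(label)) panels.set(label, []);
    panels.get(label).push(row);
  }
  return panels;
}

// Made-up concentration-time data, for illustration only.
const rows = [
  { dose: "10 mg", time: 0, conc: 0.0 },
  { dose: "10 mg", time: 1, conc: 1.2 },
  { dose: "50 mg", time: 0, conc: 0.0 },
  { dose: "50 mg", time: 1, conc: 5.8 },
  { dose: "50 mg", time: 2, conc: 4.1 },
];

// Each entry would become one small plot sharing the same axes.
const panels = facetBy(rows, "dose");
for (const [label, subset] of panels) {
  console.log(`${label}: ${subset.length} points`);
}
```

A faceting feature in a charting library then takes care of the rest: laying out the panels in a grid and keeping the scales consistent across them.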

The development of Polychart.js seems to have slowed down these days. It is unclear what the strategy of the company is. Maybe they will put more effort into consolidating Polychart2 rather than its D3 commercial counterpart.

Dc.js consists of a thin layer of functions added on top of D3 to develop reusable charts rapidly. This free (non-commercial) charting library was built with crossfilter and D3. Version 1.0 was released 7 months ago, so this library is still young. The main author is Nick Qi Zhu. The API documentation consists of a detailed walk-through of the various options available for each graph type (dc.pieChart, dc.barChart, dc.rowChart, dc.lineChart, dc.bubbleChart, dc.geoChoroplethChart, dc.dataTable). Unlike Polychart or Datawrapper, Dc.js requires a basic knowledge of D3.

Here again, the API documentation could perhaps be improved by presenting each option as a standalone feature, independent of any particular graph rather than bound to a specific chart. More examples would also increase the value of this library, as would better aesthetics … although this is a matter of taste of course.

Rickshaw.js is another free and open source D3 meta-library for creating interactive “time series” graphs, says the headline on the website. Actually, it is not specific to time series; any series of data would do. This toolkit will allow you to render area charts, line plots, bar charts, or scatterplots, with control over some other features like axes and tick marks, interactive legends, annotations, or range sliders. It was developed at Shutterstock, mainly by David Chester, with a first version created in October 2011.

The API documentation is clear and links to standalone examples, which is nice. However, the grammar of Rickshaw is quite different from that of D3, which may be a bit unsettling and takes some time to learn.

The other weakness of this library comes again from the limited number of examples; only 15 case studies are offered.

The strategy of the rickshaw team regarding further developments of this library is also not clear (to the end-users at least) at the moment.

NVD3.js was initially created by Bob Monteverde; then his company, Novus Partners, took over control of the project and decided to continue the development of this free and open-source library. This is the oldest and perhaps the most accomplished D3 meta-library, as of today. The development of this library started in 2011 and the current version is nvd3-v1.0.0-beta. There is no API documentation. Instead, 12 online examples are presented together with the corresponding code, and about 30 examples are included in the Example folder when the library is downloaded. The range of charts covered by this library is pretty large: simple line or area charts, scatterplots, bar plots (horizontal or vertical, stacked or side-by-side), pie charts, bullet charts, or combinations of these. The design and ergonomics of this library are pretty upscale, with nice tooltips on mouse over.

Datawrapper was created by Mirko Lorenz, Nicolas Kayser-Bril and Gregor Aisch. The project started at the same time as NVD3, in Feb-2011, and was released in Jan-2012. Version 1.0 was launched in Nov-2012 and rapidly spread throughout the community of data journalists. The tool is open source and designed to create simple charts, embeddable in a webpage, in a couple of minutes.

One caveat is that you have to upload your data to the Datawrapper website before proceeding with the preparation of the graphic via a GUI. Three types of graphical object can be displayed: lines, bars (or columns), and pies (or donuts). The design and ergonomics of this tool are elegant and successful, with nice tooltips on mouse over, like in NVD3.

What is great about Datawrapper is the large number of examples. It looks like journalists are big fans of bar charts and pie charts … not so much of scatterplots though!

Some comments

These libraries are useful complements to D3 for several reasons. We can use them to

  • save time in sketching some ideas of interactive graphics
  • buy some time! e.g. sketching on nvd3, refining in D3
  • become familiar with (core) D3

As recently illustrated by Mike Bostock, examples are crucial for introducing new users to a new technology. More examples of charts prepared with nvd3, rickshaw and dc would probably be welcome!

It seems difficult to develop a library without a solid supporting team and management endorsement. Novus (nvd3), Shutterstock (rickshaw) and Trifacta (vega) are concrete examples of such engagement.

Finally, I would like to thank and congratulate the authors of these libraries for their commitment and efforts, in particular those who are not directly funded for this work.

I hope this post will encourage people to try some of these libraries.

Several projects have been released in the past 6 months which present data on Paris using interactive maps. The most recent and most impressive one (IMHO) is a comprehensive map http://dataparis.io/#, prepared by 4 students from HETIC. Despite some minor flaws (like the initial pseudo 3D effect which distorts the map or the suboptimal color coding of the dots), this is a nice display of data coming from the 2009 INSEE census. In particular, it offers the possibility to display data along a subway line, giving the user a sense of how the population changes while navigating through the capital.

Another interesting project was presented earlier this year by Etienne Come, http://www.comeetie.fr/, who did extensive work on “velib”, the bicycle-sharing system implemented in Paris. His work is rooted in data mining applied to transportation systems, and it led him to develop an interactive map displaying the (bike) traffic density over time (hour/day). For instance, for those of you who know Paris, it makes sense to see that the Bac station is always busy, while the ButtesChaumont station is clear during the week and booked over the weekends.

Finally, two similar projects were dedicated to the subway in Paris, mainly answering the questions: What’s the best option to go from point A to point B using the metro? How fast will it be? These two projects were released by Jerome Cukier (http://www.jeromecukier.net/projects/metro/map.html ) in January this year, and two months later by Dataveyes (http://metropolitain.io/). I’m not a big fan of the “re-centered” maps, as I feel a bit lost with these new subway maps which are disconnected from any geographic reality. Two nice features in the Dataveyes work, though, are the so-called ‘Time View’ and ‘Crowd view’, which provide useful information for Paris metro users.

 

Some additional links:

  • "Ou manger en terrasse a Paris" - link
  •  Paris branche - link

 

Rolling out the R versions

How many lives for the R-project? See how many versions have been released so far: R timeline … Quite amazing actually.