Building a Data Driven Visualization Web App

This post is a kind of scratchpad where I try to sketch out the goal and the steps needed to build a data-driven visualization web app. I have two completely unrelated datasets in mind, both of which I am reasonably comfortable with, which I feel is important.

The first is a dataset of NBA players and their statistics; the second is foreign trade data for a few countries. In this post I will focus on the NBA dataset.

Let me quickly describe what I have in mind. I want to be able to compare two, three or four players at a glance (I'll figure out the best number along the way) by charting their key stats: points per game, assists, rebounds, shooting percentage and so on.
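To make that a little more concrete, here is a rough TypeScript sketch of the data shape such a comparison implies. Everything in it is an assumption on my part: the field names, the cap of four players, and the placeholder numbers, which are not real NBA stats. It pivots a handful of players into one row per stat, which is roughly the layout a grouped bar chart wants.

```typescript
// Hypothetical shape for one player's season averages (field names are assumptions).
interface PlayerStats {
  name: string;
  pointsPerGame: number;
  assistsPerGame: number;
  reboundsPerGame: number;
  fieldGoalPct: number; // fraction between 0 and 1
}

// Every numeric field of PlayerStats can be charted.
type StatKey = Exclude<keyof PlayerStats, "name">;

// One row per stat, holding one value per player.
interface ComparisonRow {
  stat: StatKey;
  values: { name: string; value: number }[];
}

const MAX_PLAYERS = 4; // assumed upper bound, still to be decided

function comparePlayers(players: PlayerStats[], stats: StatKey[]): ComparisonRow[] {
  if (players.length < 2 || players.length > MAX_PLAYERS) {
    throw new Error(`expected 2 to ${MAX_PLAYERS} players, got ${players.length}`);
  }
  return stats.map((stat) => ({
    stat,
    values: players.map((p) => ({ name: p.name, value: p[stat] })),
  }));
}

// Placeholder players with made-up numbers, purely for illustration.
const exampleRows = comparePlayers(
  [
    { name: "Player A", pointsPerGame: 20, assistsPerGame: 5, reboundsPerGame: 7, fieldGoalPct: 0.5 },
    { name: "Player B", pointsPerGame: 15, assistsPerGame: 8, reboundsPerGame: 4, fieldGoalPct: 0.45 },
  ],
  ["pointsPerGame", "fieldGoalPct"],
);
```

Here `exampleRows` holds two rows, one for `pointsPerGame` and one for `fieldGoalPct`, each with a value for Player A and Player B.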

I will start by examining the data available from various sources, chiefly the NBA website, and figuring out ways to scrape it periodically or on demand into a database, SQL or NoSQL. The second phase, once we establish what kind of data is available, will be to draw or sketch (by hand!) the necessary charts: bar charts, scatter plots, maybe some custom chart. We'll figure out what works best, trying as much as possible to stay within the recommendations and style guides of a few selected books that I will duly cite in the post(s).

Finally, the tech stack! I have recently discovered a way of implementing D3.js visualizations that is new to me: "enclosing" the data visualization in a React app. While the process isn't straightforward, the benefits seem to be overwhelming, especially when it comes to implementing interactivity. React hooks provide a clean and neat mechanism for loading, selecting and filtering data, while D3.js provides its own toolset of visualizations, DOM manipulations and effects. As we will see later (and I will link a couple of really cool and informative videos), the path should be interesting and there will be numerous decisions to make along the way. React and D3 overlap a fair amount when it comes to managing the elements on the page, and I will try to split the work between them in a way that optimizes clarity and development time.
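To make the division of labour a bit more concrete: one option I'm considering is to keep the "which players are selected" logic as a plain reducer function that React's `useReducer` hook could drive, leaving D3 to worry only about scales and shapes. The sketch below is just that reducer, with no React or D3 imports. The action names and the cap of four players are my own placeholders, not something dictated by either library.

```typescript
// Actions the UI could dispatch (hypothetical names).
type SelectionAction =
  | { type: "toggle"; player: string }
  | { type: "clear" };

const MAX_SELECTED = 4; // assumed cap on players compared at once

// Pure reducer: takes the current selection and an action, returns the next selection.
// In a component it could be wired up with React's useReducer(selectionReducer, []).
function selectionReducer(
  selected: readonly string[],
  action: SelectionAction,
): readonly string[] {
  if (action.type === "clear") {
    return [];
  }
  // Toggling an already selected player removes them.
  if (selected.indexOf(action.player) !== -1) {
    return selected.filter((name) => name !== action.player);
  }
  // Ignore additions beyond the cap instead of evicting an earlier pick.
  if (selected.length >= MAX_SELECTED) {
    return selected;
  }
  return [...selected, action.player];
}
```

Because the reducer never touches the DOM, it can be tested on its own, and the chart component only has to re-render whenever the selection array changes.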

As far as the backend is concerned, the scraped NBA data has to go somewhere and there we face several choices:

  • a simple CSV or JSON file that would contain all the scraped data and could be parsed directly by D3.js
  • a CSV file turned into an API using something like FastAPI (an ultra hip Python REST framework)
  • a plain old SQL database, I'm thinking SQLite
  • a MongoDB instance with JSON output

Since I will definitely need to do some querying, i.e. pull the data for only a single player, or for three players in order to compare them visually, I will eventually need some kind of API that lets me do so. I want to take the path of least resistance and get something up and running as fast as possible, so I will probably start with a static CSV or JSON file and later figure out a way to serve the data in a smarter way.
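With a static file, the "querying" can simply happen in the browser. Here is a minimal sketch of that idea, under a few loud assumptions: the CSV has a header row and no quoted fields, the player name lives in a column I'm calling `player`, and the sample rows are placeholders rather than real stats. In the actual app, D3's own CSV parsing (for example `d3.csvParse`) would most likely replace the hand-rolled parser.

```typescript
type CsvRow = Record<string, string>;

// Parse simple comma-separated text (header row, no quoted fields) into row objects.
function parseSimpleCsv(text: string): CsvRow[] {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",").map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: CsvRow = {};
    header.forEach((column, i) => {
      row[column] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// Keep only the rows whose name column matches one of the requested players
// (case-insensitive). The default column name "player" is an assumption.
function rowsForPlayers(rows: CsvRow[], names: string[], nameColumn = "player"): CsvRow[] {
  const wanted = names.map((n) => n.toLowerCase());
  return rows.filter((row) => wanted.indexOf((row[nameColumn] ?? "").toLowerCase()) !== -1);
}

// Placeholder file contents, not real statistics.
const sampleCsv = `player,ppg,apg
Player A,20.0,5.0
Player B,15.5,7.2
Player C,9.1,2.3`;

const picked = rowsForPlayers(parseSimpleCsv(sampleCsv), ["player a", "Player C"]);
```

Here `picked` contains the rows for Player A and Player C, in file order. All values stay strings at this stage; converting `ppg` and friends to numbers would be a separate step before charting.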