The Big Table Issue

by Jin, 01-08-09 // 25 comments

My friend/ex-coworker Sam the Wonder Boy used to send me late night AIM messages comprised of only three letters, “M.F.R.” This would then send chills up my spine, and I’d curl up on the floor in fetal position and weep nonstop.

OK, I’m exaggerating a quite bit, and Sam doesn’t do that anymore.

“M.F.R.” stands for “Monthly Forecast Report.” It was one of the many modules of a huge intranet web application we worked on years ago. As the name implies, it was a report. Before arriving to the actual report screen, the user could select some criteria such as date range, products, etc. Depending on the selection, the report can have up to sixty columns and thousands of rows. It was quite a challenge both on the backend and frontend. I initially created it using server side Excel API and dumped it to the frontend as an excel sheet. In version 2 I made it as an HTML table with the Excel export option. In version 3 I ditched HTML and went for Crystal Report, in version 3.5 it became Active Report. In version 4 we rewrote the whole application as a .NET client app, with the report section being Excel again. In version 5, well there wasn’t a version 5. The whole project got outsourced to India and the team was disbanded. But that’s a blog for another day.

The Big Table Problem

In the perfect world of a web developer’s life, a data table would have a few columns and rows so it’d fit perfectly in our content area and blends with the rest of the UI elements seamlessly. But in reality, a data table can get very big and ugly especially in financial web applications.

big table

The problem is when the table gets big it no longer fits in the viewport. A viewport can be the whole area of the browser window, or a container area within the browser window(e.g. a scrollable DIV). To see the rest of the data the user would have to scroll. However this causes the user to lose sight of the column headers and the first column.

Another problem is performance. A huge recordset takes a while to generate on the backend, and even longer to render to the frontend.

The Approach

I feel as UI developers/designers, our first role is a problem solver. This means the solution doesn’t always lie in HTML/CSS/JS. Looking at the situation at hand, is HTML the best medium? I have to say no, that’s why I opted for Excel first. The benefit of using Excel:

However, there are downsides to using Excel:

Let’s look at other alternatives.

Active Reports and Crystal Reports are two popular enterprise level reporting tools. I used both of them with .Net. Both software have RIA report viewers using ActiveX.

Pros:

Cons:

Of course, the approaches I listed are for Microsoft shops. There may be other non-MS solutions I’m not familiar with.

Excel is my favorite so far. However, there are times when you’re stuck doing it the HTML way, which is what the rest of this article is about.

HTML Solutions

We have three issues: vertical, horizontal, and performance.

Vertical is the easiest to solve. First, we can lock the column headers, so when the user scrolls down they’d still see what data they’re looking at.

When the table has hundreds or thousands of rows, it’s best not to display them all at once. Not only is it a performance hog, but it’s also rather meaningless from a usability POV. Pagination is the most commonly used. One of the best examples of pagination combined with AJAX I’ve seen is done by Blizzard’s World of Warcraft Armory page. The filtering feature allows the user quickly find data on the fly.

Warcraft Armory Table

Another method without using pagination is, to load just enough rows to over fill the viewport area. Then when the user scrolls a bit more, it loads additional rows using AJAX. Google reader does this when loading feeds. I feel this is the best way for front-load performance. However, it may not be feasible where the user needs to see additional data quickly.

Google Reader

Horizontal scrolling issue, unfortunately is something I haven’t been able to solve despite much research. Ideally, we should be able to lock the first column(typically the identifier column), so when scrolling sideways, the user can easily identify which row they’re looking at. As of now, we can’t freeze column easily. During my research, I’ve run into a few sites that came close, but all with drawbacks.

My Proposal

I think the best way is to have the browser to have the ability to lock columns natively. Unfortunately as of now, COL and COLGROUP have very little support other than few CSS attributes. Ideally, to freeze a column, I’d like it to be as easy as:

COL.freeze, COLGROUP.freeze {
   position: fixed/relative; /* lock */
   display: none/inline/block etc /*so we can toggle */
   left: x;
   stacking-order: n; /* allows multiple freezing */
}

I’ve been reading up on HTML5 and CSS3 specs. I don’t see any evidence yet that much, or any improvement will be done with COL/COLGROUP in the near future. One may argue that this is asking too much, after all HTML/CSS is for presentation only. But seeing the advanced selectors in CSS3, I believe the consensus is to have CSS replace some of the workarounds that are currently done by Javascript.

If you have any suggestions/solutions to this problem, I’d LOVE to hear from you.

update: this post has been translated to Serbo-Croatian language, courtesy of Jovana Milutinovich.

25 comments

also feel free to contact me on twitter or via email
Victor Boba 01-08-09

AHhhhhhhhhhhh….the lovely MFR functionality. As one of the middle-tier developers on that project you mentioned, I never did quite understand WHY it took you front end guys to get around to coming up with a solution. Now that I too am working in the UI more and more I understand. The web is great for basic display of data, just not TOO much data. It seems that the web is still not up to what you can do on a windows client installed application. Maybe someday. Then all of us Windows guys will be out of a job I guess.

Good job Jin. I look forward to working with you again someday soon.

Jin 01-08-09

Hey Victor, good times. To be honest, I don’t think tabular is the only way to go. But in the situation of MFR it was specifically requested by the stake holders, who were Excel powerusers. They wanted HTML to mimic Excel.

Charlie 01-08-09

Web designers are horizontally challenged! True story.

BobJ 01-08-09

“The web is great for basic display of data, just not TOO much data. It seems that the web is still not up to what you can do on a windows client installed application.”

QFT 100% — we are constantly using the Web for what it was 100% not meant to do in large data delivery systems. Jin can attest to that big time when he was back here in NC; some of the stuff we put out defies reason. And trying to get smart clients promoted… dont get me started.

But whats the goal though, a “general” solution? I mean, you can certainly find “client-like-grid-controls” for .NET that are ok to use (minus that you have to learn someone other person’s code methodology), but that’s obiously no good if you aren’t in a .NET environment.

Generally if it fits, I like just seperating the X and Y axis out of any table/div and then placing the grid (Data Only) where it fits, IF it will line up ok. But that removes any ability to dynamically expand/contract cell size

Janko 01-08-09

Now that’s quite challenging problem (I mean horizontal issue). I am fully aware of it since I develop financial-related apps :)

Some of the examples you mentioned uses row selection (painting selected row bg color) but they also uses nested tables which just won’t help. So, making row selection is quite good (although not good enough) solution.

Anyway, great articile, Jin!

Joel Laumans 01-08-09

Thanks for this post Jin, good read about tables!

Just wanted to let you know I posted a link to your site on: http://creatinginspiration.net

[...] The Big Table Problem | 8164 What is the best way to display a huge table (vertically and horizontally)? (tags: webdesign webdevelopment css table javascript) [...]

vvvlad 01-09-09

Hi,
I use combination of couple of ideas to solve the issue of horizontal length.
My solution is not perfect for any situation, but whenever I needed, it was perfect.

1. If you cant show the whole row – show it in 2,3,4… rows
This solves number of problems. one of the the horizontal length, the other is the problem of always showing the title row.
2. Not all info is equally important, so some of it can be hidden.

So what I got:

———————————————————————–
Visible area button
———————————————————————–
========================================

Hidden area (hide/reveal by clicking the button)

========================================

Hope this sounds reasonable,
Vlad

Jin 01-09-09

@Charlie, that’d be true, in this case. :)

@Joel, thanks for the link. I corrected your URL and combined your comments.

@Bob, @Janko, I think the issue here is, trying to make webapps, act and behave like client apps people are used to. HTML started out to have some basic client app features. Form elements are good examples.

Of course as the web now is a framework and this calls for more complex support on CSS/HTML side. I’ve done a lot of “hacks” in the past to get things done. But I always wished HTML would just support it natively. Then again, this begs the question “How far do we go with HTML?” Shouldn’t RIA be the solution? I honestly don’t have the right answer for that. But I believe in using the right tool for the right job. For intranet app, I don’t think any RIA would be a problem as long as it’s the company standard. The challenge lies in making the webapp for general public, that’s when compatible/accessibility becomes an issue.

@Vlad, I’m definitely for not showing all Columns, unless absolutely needed. I like your solution. This is something I’ve tried in the past. Instead of showing data in a single row in tabular format, I converted the row information into something more “organic.” Then this can be done as a template in a .net repeater, instead of using a datagrid.

I think visually it’s more pleasing, however as you said, it all depends on the situation. In my original financial report situation, the users definitely wanted a more spreadsheet look, where if i converted them to multiple rows it take up way too much V space. Also the sheet had Rowspans too. very nasty.

Thanks for your comments.

Martinos 01-09-09

Maybe you could use the google visualization table, I didn’t try it, but it may help you.
http://code.google.com/apis/visualization/documentation/gallery/table.html

Steven Clark 01-10-09

From an experience point of view the real important question is understandability from the user’s perspective, rather than simply displaying four columns at a time or technically achieving simulation. For example, in a similar situation with large data sets a team I worked with were jamming about 50 columns and 100 rows of intense numeric data into a scrolling section about 200px by 300px – which made for a very poor user experience. The real challenge is in representing the data in a way that correlations remain and can be logically retained by most humans.

So large data sets aren’t really the solution IMO, at a certain point it might be better passing it back to the client with a question about why this for content?

In the case of the example I mentioned, maybe 1 scientist would look at that data set a year (optimistically) and be disappointed. Now the cost of doing that for that one person? The web and computer interfaces aren’t necessarily the place to emulate applications like Excel, really. Unfortunately it’s business pressures and managers who will insist – my only defense is really the one about “show me the users” and “translate that to cost”.

Another example I’d suggest. If this dataset were put into a book with each page having say 4 columns and 20 rows – how would that book read? Would it still be understandable? Or just pages of hard to relate back gibberish that had you flipping back and forward?

From a user experience perspective the large dataset, if it’s not going to be used or usable by many people, is just interface noise the 99% have to negotiate.

Which brings us back to the question – what do you do with clients who want everything on the web? So, given my head at least, I’d be inclined to toss large datasets into a downloadable file, or omit them.

BTW picked up my nifty fifty a few days ago and went to the market yesterday. Awesome.

Jin 01-10-09

@Martinos, thanks for the link. It has some useful information, however it doesn’t have the solution to freeze panes. A good read regardless.

@Steven, Agree the grid format isn’t always the first solution to be considered. My take on it: it depends on the type of data and users in question. For example, if most data are pure numerical, and each row shows some sort of relationship to each other (e.g. monthly stats, price increments etc) then I think a tabular form is still the better way to go. This allows quick scan to see the trends. Also a filtering system in place is nice too, this allows the user to customize what needs to be displayed.

On the other hand, if each row is pure individual record, for example (first/last name, address, email, url, phone, add. info1, 2, 3 etc) then I think a more user friendly way would be displaying the data in a non-tabular form. Simply a well formatted block would do, with clear distinction from the next record.

Grats on your 50mm! I’m having a lot of fun with it. Finally getting used to having to walk in/out to adjust. the kit lens spoiled me somewhat… :)

Steven Clark 01-10-09

Yes I’m enjoying the 50mm, it adds an immediacy to the photography process so you can’t stand outside of it. I’m trying to get used to that immediacy in street photos… slowly building my nerve.

My point with the tabular data is more an Edward Tufte question. If you can’t see column 4 for reference from column 12, for example ,then it quickly becomes quite a useless dataset anyway. Similarly your column headers and row headers would need to stay in the view… it quickly becomes like the analogy to the book where you have to flick back and forth.

But true, it depends on the data set. It also depends on the size of the user’s screen and the expectation of usefulness – for example, is there a valid user persona who would be used to justify the dataset on the website / in the application? If so, does the cost of generation and display add up in dollars to the effort involved in achieving it. Lots of times those two answers would exclude the dataset (hopefully).

The only time I’ve seen these occur, and my experience there is rather limited I confess, is when management or some driving owner of the dataset is just convinced it will WOW everyone. The Edward Tufte questions then would be “is it still understandable in context when displayed to a general user” and probably also “is it real content that users want or something that the organisation feels compelled to provide”?

However, we can think of a hundred sites straight off the bat where all the questions are yes yes big dataset is required… essential even – census statistics, weather, sports, etc. So I’m probably more generalising about this… :)

steve 01-11-09

you obviously didnt read his post, but I dont blame you.

Not only can’t browsers handle thousands of rows at a time, humans cant either!

It all boils down to what I like to call, “ACTIONABLE INTELLIGENCE” and a table with 20,000 results is just not actionable, i.e. you cant DO anything with it. you can barely read it, and you cant visually verify the data ’cause there is just too much (Using intelligent statistical sampling methods you can take a few hundred records and verify them independently).

It needs to be presented as:
1) A data dump download, readable and parseable by client-side tools.
2) A summary that describes the data
: We have 50 columns, Name,sex,age,height…. and 10,000 rows
: we can drill down or filter/sort by Name,sex,height and reigion and any combination of those
…. in essence… a THUMBNAIL! :)

Steven Clark 01-12-09

Yep I read the post, and I agree with what you’re saying… I just said it slightly differently. I like your term “actionable intelligence”, it fits well and I seem to have heard it before somewhere… My point really was that large data sets are unusable, and even more often they are there for stakeholder appeasement rather than desire to meet a user need. Apologies if I rambled that into a confused muddle…

Steven Clark 01-12-09

Although, I confess it was a scan read because it was a long post. I didn’t spend that much time Steve :)

Martinos 01-12-09
Jin 01-13-09

Hi Martinos. I’ve experimented with YUI before. I didn’t see any solutions they have for freezing columns.

Dmitry 01-13-09

Hmm.. I’ve never had to create an interface for such large data sets, so I’ve honestly no idea how to best approach this, especially the column locking issue. However, I think Steven and steve are spot on. The data here is overwhelming and I don’t see how a human can process it if it can’t even fit on one screen.

To design an interface, you need to know exactly what people want to do with it. You don’t need to focus on the content as much as the process that the user will go through. What are they going to look at on this table? What are they going to compare? etc. Once you know what people really need, the resulting interface may not even be a table. Maybe all they actually want is a set of charts.

I quite like the way Google Analytics presents their data. The data itself is overwhelming, but Google crunches it down into focused graphs and charts, showing you how different areas of your website perform. You can customize the graphics on the fly, selecting the fields you want to see, as well as their presentation (bar, pie, line etc).

I think the best UI for a huge table isn’t a table at all, but a presentation interface and a search interface. The faster you can get meaningful data out, rather than all of the data, the more useful it would be.

Just my opinion though ;-)

[...] The Big Table Issue [...]

Telerik has a great grid control with column & row freeze that works on most major browsers. Here’s a live demo: http://demos.telerik.com/aspnet-ajax/grid/examples/client/scrolling/defaultcs.aspx

Marion 08-11-09

I had found a great option for this, but now I can’t find it! Basically, it enabled the viewer to collapse a series of columns- which was perfect for what I have in mind….

While trying to find that solution again, I came across this one:
http://www.massless.org/%5Ftests/grid1/?s=1&r=40&c=40

[...] I missed something please let me know! I also recommend you reading two more articles about tables: Big Table issue that tries to find an solution for tables that are so big they no longer fit in the viewport, and [...]

Junior 10-04-12

Sure, but any kind of CSV format, once ictsenped, is still easier to write a parser for than PDF. And, arguably, better than XML for most applications. Perl’s module, which comes with the distribution by default, handles all the issues you describe above except those that are Excel’s fault. And quoting fields that aren’t really numeric (eg. zipcodes, phone numbers) will fix most of those.It’s not that you can’t make some godawful messes with CSV, but they’re *still* more parsable than PDF.