Archive for the ‘Charting’ Category.

Updated Box and Whisker Chart Utility

In New Charting Utility - Box and Whisker Charts I introduced a utility for making simple box and whisker charts. In the intervening 14 months I have made a number of enhancements to this utility, including the ability to display outliers like the big expensive stats packages and a number of additional formatting options, and I’ve eliminated some bugs and enabled the utility to run without crashing in non-English versions of Excel (the utility uses English labels in all versions, but at least it doesn’t crash).

The Box and Whisker Charts utility is designed to work in Excel versions 2000, 2002, and 2003. It has not been tested in any Macintosh version or in Excel 97, and because of the older version of VBA in these products, the utility is not expected to run. The utility has only been lightly tested in Excel 2007, but it seems to work. This Box and Whisker Charts utility should be considered a beta version, and it is available at no cost in exchange for useful feedback.

The next version will most likely be incorporated into a commercial Advanced Charting utility. In addition to the features shown here, it will allow charting from computed statistics (the utility now uses the raw data as its input), and it will allow VBA programmers to call its methods from other VBA procedures.

The Box and Whisker Charts dialog is shown below:

Box Chart Utility Dialog

Typical output, in the form of a vertically oriented chart and a table, is shown here:

Vertical Box Chart

The chart is also available horizontally oriented:

Horizontal Box Chart

There are three styles available: Box and Whisker Quartiles, Four-Box Quartiles, and Box and Whisker with Outliers:

Box and Whisker QuartilesFour-Box QuartilesBox and Whisker with Outliers

The utility can be downloaded from Box and Whisker Plots in the form of a zipped exe installation file. This page has instructions for installation and use. Any comments and suggestions are appreciated.

The Great Chart Debate

J-Walk and The Skeptical Optimist have been discussing charts lately. Methods in Excel even has a recent post about charting, but I doubt there’s any controversy over his findings. It’s high time I offered my unsolicited opinion.

Steve linked to an SAP guide to charts, which I hadn’t seen before but which looked interesting. I generally agree with everything on that page. If the data is showing proportions to a whole, a pie chart is better suited than a bar chart, in my opinion. When I see a pie chart, I know that the circle represents the whole of something and that each slice is someone or something’s share of it. I don’t think that’s abundantly clear with a bar chart. Therefore, I think a pie chart is most appropriate for the national debt data.

What’s not appropriate for any pie chart is too many data points. I think the limit is about six. More than that and it’s too hard to tell what the proportions are.

The first question that you should ask yourself when creating a chart is “What am I trying to sell and to whom am I trying to sell it?”. Not trying to sell something? Then why make a chart. You may be trying to convince you readers that the foreign owned portion of the national debt is not as high as they may have heard. You may be trying to convince your boss that your budget overages were necessary. You may be trying to convince yourself that you can afford that new tablet pc.

I like the pie-to-bar chart that J-Walk made, but the bar portion is unnecessary for this application. The sell job is the proportion of foreign owned debt, not how much is owned by Japan or China - so don’t put that information on there. If you think it’s interesting, make another chart. Here’s my proposal:

two charts

As you can see, my chart formatting skills are lacking, so consider this a prototype. Thoughts?

XY Chart Labeler Version 7.0

I’ve just posted version 7.0 of the XY Chart Labeler on my web site. This is the first significant upgrade to this utility in almost five years. Because the XY Chart Labeler was already a mature, stable utility prior to version 7, most new features in this version are background improvements that provide better performance for pre-existing features. However, several new minor features make their debut in version 7, including:

  • Smart Selection - If you select the chart series, data point or label you want to operate on, the XY Chart Labeler will recognize the selection and pre-select the appropriate entries in its dialogs.
  • Active Selection - As you select chart series and data points in the various chart labeler dialog dropdowns, the objects you select will also be selected in the chart below, so you can see exactly what you’ll be modifying.
  • Larger Dialogs - The width of all XY Chart Labeler dialogs has been increased in order to improve the display of wide entries.
  • Automation - In this version, the core chart series labeling logic has been exposed as a public function that can be called from other Excel VBA programs.
  • Help File - I’ve written a comprehensive help file and example files for this version.

I’m making no promises regarding Excel 2007 compatability other than the chart labeler will work in the simplest cases. I’ll evaluate this policy again once we see a service pack or two. There will be a German translation available shortly courtesy of Excel MVP Thomas Ramel. You can get the latest version of the XY Chart Labeler here:

http://www.appspro.com/Utilities/ChartLabeler.htm

The Future of Excel Charting

With the recent Excel beta, which now has become a released product, I’ve been thinking about how Microsoft can enhance the charting experience. In Excel 2007, there were precious few features added to charts. The new chart formatting is certainly nice, but the new user interface is awkward. The only real enhancement to charts in Excel 2007 is that the minimum and maximum of a logarithmic axis scale are no longer tied to powers of ten. However, we’ve given up a few things, some perhaps just to bugs which crept in during the complete restructuring of the charting module. The Size with Window setting for chart sheets was deprecated purposefully, and I’m hoping they bring it back. In Excel 2007, among the things I consider bugs is the inability of an XY series to be plotted on a line chart’s category axis, which means you need to use a secondary axis for the XY series, and that means the secondary axis is no longer available for another series that provides an additional effect.

Part of my thinking has been about new chart types that should be built into Excel. I’ve decided that Microsoft should not spend much effort to recreate the chart types that clever people can build already with combinations of existing Excel chart types. One type of chart that Excel does not do is a 3D XYZ type of chart. To fake this requires some real heroics, and I think Microsoft should develop a 3D XYZ chart type, with points only and also with a surface. To fake a histogram, waterfall chart, box and whisker chart, and other similar charts, on the other hand, requires only careful assembly of individual series, and I think Microsoft’s efforts can be more wisely spent in other ways, rather than trying to make everyone happy with the difficult design decisions. Add examples, templates, and tutorials to the MS Knowledge Base, and include links to pages on the web sites of charting heroes.

My feeling is that Microsoft should add features to charts that enable more of the fancy combinations we can come up with. I have a fuzzy list of items in my mind, which change in importance depending on what I’m doing this week. I’ll describe some of them, and ask for additional suggestions.

1. Give us real text boxes for chart and axis titles and data labels, which can be resized and which have VBA properties like height and width.

2. Enhance a series, so it has more associated ranges. Allow us to more directly interact in VBA with the ranges describing the X and Y values of a series, without painful and error-prone parsing of addresses. Formalize the association of a series to its custom error bar values, so we can deal with error bar ranges in VBA. Add a range to a series that contains data labels for the series.

3. Allow more chart types to be combined. I’d like to be able to combine area charts with bubble charts, so the bubbles have a nice patterned background to lie on. Let me add an XY series to a contour chart. Don’t ask me why I want this, just enable it and a hundred users will have a hundred whys.

4. Combine Line and XY series into a single type. If the X values are labels, then treat the series as today’s line series. If the Y values are categories, then treat the series as a dot plot (a rotated line chart). If X and Y are both labels, well, we’ve heard people asking how to plot, say, classroom along X and letter grades along Y.

5. Construct an Axis object model, so we can apply any arbitrary funtion (or even range) to describe the axis and define its scale and ticks. Then such things as probability axes, reciprocal axes, even broken axes will be much easier to construct.

6. Let any numerical and text aspects of a chart be linked to a worksheet cell or range. For text, we can do this with titles and data labels, but not with axis tick labels. We should be able to link axis scales to the worksheet and use our favorite algorithms to provide autmatic scaling that suits our needs.

I could go on and on, but I’ve said enough to prime the pump.

I know that this is pretty much out of the box thinking, but if features like these are added, they provide flexibility that users like you and I can use to make a much wider array of great charts than a handful of added chart types could provide. What kind of features would you like to see? What would allow you to make better charts? What would make Excel 2010 a kick-ass charting product?

Modifying Shapes (and Charts) With UDFs

Do you know that you can write VBA worksheet functions that modify shapes on a worksheet? I didn't know this until about an hour ago.

Paste the following UDF into a VBA module:

Function ModifyShape(ShapeNumber, ShapeType, Vis)
    With ActiveSheet.Shapes(ShapeNumber)
        .AutoShapeType = ShapeType
        .Visible = Vis
    End With
End Function

Then, add a shape to the worksheet and enter this formula into any cell:

=ModifyShape(1,55,TRUE)

The first argument is the shape's index number. The second argument is a value that represents the shape's type (values from 1-138 are supported). The last argument determines whether the shape is visible. The arguments, of course, could use cell references. Change the second argument and watch the shape change its shape. Change the third argument to FALSE and watch the shape disappear.

This sort of thing is a lot more useful in Excel 2007, because embedded charts are contained in shapes. Therefore, you can write formulas that manipulate the chart object properties such as size, position, and visibility. Even better, your UDF can even access the Chart object contained in the shape, and manipulate that. Here's a simple example:

Function ChangeChartType(CName, CType)
'   Excel 2007 only
    ActiveSheet.Shapes(CName).Chart.ChartType = CType
End Function

This function assumes an embedded chart. It uses two arguments: the ChartObject's name, and the chart type (e.g., 5 is xlPie, -4100 is xl3DColumn, etc.).

Even better -- a function that allows you to specify a chart's min and max scale values:

Function ChangeChartAxisScale(CName, lower, upper)
'   Excel 2007 only
    With ActiveSheet.Shapes(CName).Chart.Axes(xlValue)
        .MinimumScale = lower
        .MaximumScale = upper
    End With
End Function

The ability to adjust a chart's axes based on calculations has long been on the Excel wish list. I wonder if Microsoft even knows this is now possible?

Website update

I imagine this is the way it is for most people:

Vast number of ideas of what I'd like to do
    A few get implemented as add-ins, utilities, tutorials, what have you
        Even fewer find their way to the website
            And, it takes even longer to find a proper location for very new topics.

I spent the better part of today adding one new tutorial, one new template, and hooking up several old pages that were already part of the www.tushar-mehta.com website but were never part of the site navigation structure. In the process, I added a new high level category called "Templates." The following introduces each item.

Enhanced Find: This is a source form utility that I wrote for my own use. I imagine it would have been much simpler if Microsoft had added the UI 'Find All' feature to the Excel object model. But, it didn't, and, in any case, the utility does a lot more than just 'find all.' To those who will want to tell me I should have used the TreeView control in the userform used to display the result: I did. {grin} Then, I decided it would be prudent to forgo it since I don't know how that control finds its way onto a particular machine.

Chart Image to Data: I wrote this utility to help out someone who wanted to convert an image of a graph into the associated data points. No, it doesn't use artificial intelligence, or OCR, or some fancy heuristic to figure out the data. {grin} Instead, it relies on the user first calibrating the image and then clicking on various points on a series to convert the click location into a data value.

The templates hierarchy: Several templates that I created for various reasons have languished on the website with no good place to put them. So, I finally broke down and added a high level category, Templates. As usual, whenever a file moves from one directory to another, I leave -- or at least try to leave -- a stub in the old location pointing to the new one.

Solver template to find a subset of entries that add up to a given total: One would expect that when someone makes a payment against a list of different charges (invoices), they would list which charges invoices are being paid. Yet, every so often we find a request for help from someone who has a total and now wants to find a possible subset of pending payments that total the payment amount. This template contains the necessary Solver set up for the task. Obviously, this is not an easy task and the ability to find an acceptable solution is limited both by the requirements and limitations placed on Solver and of the algorithm used by Solver.

A calendar template: Even if I say so myself, I think this is worth checking out. It's a template that lets one generate a calendar for any year from 1900 to 3000 with the starting-day-of-week set to any weekday. The result can be in one of 2 formats -- one month-per-page or 12 months-per-page. The design is clean and simple. And, best of all, no programming!

Compare sizes of TVs, computer monitors, and broadcasts: Over the past month or so, I replaced my dying TV with a HD TV and added a widescreen computer monitor. In the process, I spent a fair amount of time understanding what's what with widescreen devices and HD broadcasts. This template addresses one specific aspect of that research: comparing the sizes of various devices. One thing I realized very quickly in my research was that a widescreen device would have to be somewhat larger than its standard counterpart if it was to have at least the same height. I did most of my work with algebra. But, then decided I'd share some of it in the form of a template. While creating the template I also realized it could be used to explain the bands that appear when one views a broadcast in one format (say widescreen) on an incompatible TV (i.e., a standard size TV). I also realized that the work in the template would form a good tutorial for data validation.

Data Validation I: There's a whole bunch of stuff I have on data validation that I'd like to share. Given the big weakness in Excel's native data validation (copy+paste into a cell to wipe out the validation criterion in it!), I prefer alternatives. This tutorial documents some simple ways to validate data without Excel's Data Validation or VBA. There's a whole bunch more I will share in the weeks/months to come.

Userform coding interface: I started work on a VBA chapter -- and it is far from complete -- that addresses two favorite topics of mine: (1) The structure of the interface between the code that manages a userform and the code that does the actual task of a utility, and (2) The RefEdit control including a workaround that lets one simulate its functionality in a modeless userform, together with a class module that lets one "drop" the solution into any userform.