Getting Array Data from a Filtered List in VBA

Getting data from an Excel sheet into an array is usually best accomplished with a statement that looks something like this:

vMyArray = Sheet1.UsedRange.Value

I thought it would be keen to fill an array from a filtered list, so I coded

Sub ArrFilteredList()
   
    Dim vArr As Variant
   
    vArr = Sheet1.UsedRange.SpecialCells(xlCellTypeVisible).Value
   
    Stop
   
End Sub

on this list

The SpecialCells returns a range of only those cells that are visible, i.e. unfiltered in this case. The problem, it turns out, is that this method doesn’t work with noncontiguous ranges and that’s just what SpecialCells returns. I put the Stop in there so I could check the Locals Window.

It filled from the first Area of the range, then stopped. I confirmed that it was the lack of continutity of the range with this code

Sub ArrNonContiguous()
   
    Dim vArr As Variant
   
    vArr = Union(Sheet1.Range("A1:C1"), Sheet1.Range("A4:C6")).Value
   
    Stop
   
End Sub

which returned similar results. So I’m stuck iterating through the range, I guess. But then my array is backward; column, row instead row, column because I can’t change the first element of an array with Redim Preserve.

Sub ArrFilteredList2()
   
    Dim rRow As Range
    Dim aArr() As String
    Dim i As Long
    Dim lCount As Long
   
    ReDim aArr(1 To 3, 1 To Sheet1.UsedRange.Rows.Count)
    lCount = 0
   
    For Each rRow In Sheet1.UsedRange.Rows
        If rRow.Hidden = False Then
            lCount = lCount + 1
            For i = 1 To 3
                aArr(i, lCount) = rRow.Cells(i).Value
            Next i
        End If
    Next rRow
   
    ReDim Preserve aArr(1 To 3, 1 To lCount)
   
    Stop
   
End Sub

Is there a better way?

Plain and Pretty

Mike Alexander tells me that I have to sell, so sell I shall.


First, the pretty:


Microsoft MVPs Jon Peltier (Peltier Technical Services) and Mike Alexander (DataPig Technologies) are joining together again to bring you our acclaimed Excel Dashboard and Visualization Bootcamp!

This 3-day boot camp is designed for Excel users who need to more effectively synthesize data into meaningful dashboards, charts, and visualizations. The topics presented during this boot camp will introduce you to advanced techniques that will help you build and manage better reporting mechanisms. Going beyond simple tables and charts, you will learn to:

* Synthesize data in meaningful views with advanced charting techniques
* Create reports and dashboards that communicate and get noticed
* Create interactive reporting mechanisms
* Implement macro-charged reporting
* Automate the creation of PowerPoint slides directly from Excel
* Integrate external data into your reports


And for those of us who love data and don’t need pretty pictures, the plain:


Microsoft MVPs Dick Kusleika (Daily Dose of Excel) and Mike Alexander (DataPig Technologies) bring you our first ever Excel and Access Power User Workshop!

This 3-day workshop is designed for power users who what to expand their skill-set and get more out of Excel and Access. During this workshop, you’ll be introduced to a wide array of tips and techniques that will muscle up your skills in Data Reporting, Automation, and Application Development.

The topics presented during this 3-day workshop will help you go beyond basic spreadsheets and databases, to to robust professional-grade solutions. Learn how to:

* Move Data between Excel and Access using MSQuery and SQL
* Use ADO Scripting to build robust data entry models in Excel
* Automate Excel from Access
* Run Access Processes from Excel
* Automate Outlook Interactions
* Building client-side solutions that use SQL server as the back end


And now the closer:

Sign up before April 1st and save $200 on your registration. On April 1st, prices return to $800.

OK, I feel sufficiently dirty now. But these are going to great workshops, so go sign up now!

Office Automation

Someone asked if anyone automates Office applications any more. That struck me as kind of funny. If I had to guess, I would guess that there are more lines of VBA code (poorly written or not) that automate Office applications than all the COBOL, VB6, PHP, and any other language you can think of doing anything. I don’t have any evidence of that. It’s in the Outlook-is-the-largest-data-store vein. Because there are so many Excel, Word, Access, etc documents out there, I think all the code in them would add up to a bunch. What do you think?

There are more lines of VBA code automating Office apps than all other kinds of code doing all other tasks
True
False
Can’t/Won’t Guess

  
Free polls from Pollhost.com

PowerPivot - Part 3 of 4: Conditional Shape Colors

Part 1: http://www.dailydoseofexcel.com/archives/2010/02/26/powerpivot-part-1-of-4/

Part 2: http://www.dailydoseofexcel.com/archives/2010/02/27/powerpivot-part-2-of-4-prepping-the-census-data/

As mentioned in the Part 1, I had developed this technique to conditionally color shapes some time ago. Based on customer feedback and my own experience I made changes to simplify the various connections required to make it all work.

Also, documenting the technique was easier to do with a webpage for two reasons: 1) the length of the post, and 2) I find the UI for a blog post somewhat cumbersome for managing images.

The updated solution is at http://www.tushar-mehta.com/publish_train/xl_vba_cases/0301-dashboard-conditional_shape_colors.htm

In Part 4, the last part of this multi-part post, I will look at using PowerPivot from within Excel to analyze the 18 million row data set generated in Part 2 and on integrating the result with the Conditional Shape Color solution of this post.

picture11

Web Query Lessons

I’ve recently learned some things about Web Queries that I’d like to share with you. But first a little background. You create a Web Query in Excel 2007 by clicking From Web on the Data tab. This launches a browser within Excel through which you navigate to a web page and select a “table”. Tables on web pages can be a few different things. I think the common “tables” are html tables and html div tags.

Lesson 1: The web browser Excel uses is some variant of Internet Explorer. If you don’t use IE, and I don’t, then you haven’t bothered to set the home page. In that case, the home page will be msn.com and you might see errors like this:

Awesome. Why did I ever stop using IE? If I open IE8, set my home page to google.com (or anything else), then the errors go away. Bonus April Fools day tip: Find a co-worker who uses Firefox and set his IE home page to hawtness.com. It’s marginally NSFW, so it will be funny when he does a web query, but he won’t get fired (probably).

Lesson 2: If you try to refresh a Web Query for a web page that requires you to log in, it won’t work (sometimes). You’ll get

This Web query returned no data. To modify the query, click OK, click the name of the external data range in the name box on the formula bar, and then click Edit Query on the External Data toolbar.

I have a workbook with a couple dozen web queries all pointing to the same site. If I ‘edit’ one of the queries, Excel’s instance of IE remembers that I have logged in (cookies maybe? that aren’t shared with normal IE?) and I can refresh all the remaining queries without error. So I created a sheet near the beginning called ‘Credentials’ where I have a Web Query that brings in a small table near the top of the home page. I ‘edit’ that query to log in, but I don’t have to navigate around much to bring it in. Then I run my code which updates all the URLs and refreshes all the queries. The poor man’s IE automation, I guess.

IsHex Function

I have a need to verify some user input is hexadecimal (0-9 and A-F).

Public Function IsHex(sInput As String) As Boolean
   
    Dim lResult As Long
   
    On Error Resume Next
        lResult = CLng("&H" & sInput)
       
    IsHex = sInput = "0" Or lResult > 0
       
End Function

With help from this Mr. Excel thread. So where did &H come from? Are there more like it?

PowerPivot - Part 2 of 4: Prepping the census data

Part 1: http://www.dailydoseofexcel.com/archives/2010/02/26/powerpivot-part-1-of-4/

Since Excel 2007 (and later) can handle a million rows of data, I figured it made sense to explore PowerPivot (PP) with a data set larger than that. In addition, I have had a long time curiosity about the US census data. So, this seemed an appropriate time to combine the two. After some exploration of www.census.gov, I learned of http://usa.ipums.org/usa/index.shtml This is a project of the Minnesota Population Center of the University of Minnesota.

Do note that this post is not meant to be a step-by-step tutorial on how to use the IPUMS website. While it took some time, it took even more effort to understand the census data but an explanation of the data is also not my intent with this post. The focus is on loading a large data set in PowerPivot.

The final data set I downloaded expanded to nearly 18 million rows of data. But, it took a while before I got to the point of getting that data set prepped for PP.

A relatively superficial exploration of the IPUMS site led to two conclusions. One, the result of ‘analyze data online’ would yield results online but the result sets were intentionally limited in size. Two, trying to get detailed data through the ‘Create an Extract’ or the ‘Download or Revise Extracts’ led to large data sets but of the kind that were intended for use with SPSS, SAS, or STATA. The data set itself consisted of fixed length columns. In addition, there was a delimiter at the end of each row record (discovered through trial and error).

In addition to the data set, the website provided the command sets to load the data into SPSS, SAS, and STATA.

picture1

The SPSS instructions were fairly helpful in understanding the layout and the content of the data (other than the secret delimiter).

So, for one data set the columns were

 year           1-4
 datanum        5-6
 serial         7-14
 numprec       15-16
 hhwt          17-20
 statefip      21-22
 county        23-26
 urban         27
 city          28-31
 citypop       32-36
 urbpop        37-41
 ownershpd     42-43
 mortotal      44-47
 mortamt1      48-52
 rentgrs       53-56
 hhincome      57-63
 pernum        64-67
 perwt         68-71
 age           72-74
 sex           75
 marst         76
 raced         77-79
 gradeattd     80-81
 occ1950       82-84
 ind1950       85-87
 inctot        88-94
 ftotinc       95-101

In addition, the SPSS instructions also contained information about the text values associated with the numeric codes in the data set. For example, ownershpd values were:

value labels ownershpd
00 "N/A"
 10 "Owned or being bought"
 11 "Check mark (owns?)"
 12 "Owned free and clear"
 13 "Owned with mortgage or loan"
 20 "Rented"
 21 "No cash rent"
 22 "With cash rent"

The above information is important since we can create Excel tables of these number-to-text maps and link them to the data table in PP.

What made the process of getting the data in PP challenging was that PP will not read fixed column width data from a text file though it will read comma delimited text files. I would have thought this would be elementary but I couldn’t find any way to get PP to do so. Maybe, I missed something that would have made the rest of the exercise below unnecessary.

So, as far as I could tell, I had to convert the fixed-column-width text file into a file where the columns were delimited by some character. I picked the comma. But, I couldn’t use Excel (too many records) or Word (too large a file) to do the conversion. I also tried Notepad and WordPad before falling back on true and tried VB(A).

I did not start with the 18 million record data set. I used the smaller data set for a single census from the 19th century to explore the structure of the data. Through trial and error I discovered the secret delimiter at the end of each record. I also discovered that the software did not summarize the data in any way. The code that I felt confident I could use for data from all the censuses combined is below.

It’s not elegant. It’s not flexible. It’s not meant to be either. At some point I might be tempted to paste the information about the column names and widths into a Excel worksheet and have the code automagically work off that. But, for now, the goal was to prep the data for PP.

One interesting thing I did learn was that using FreeFile() twice in succession yields the same integer value. That, of course, leads to an error when opening the 2nd file. So, the correct approach is to get a free file number, open that file, and then get a second free file number for another file.

Also, if one were to add up the column widths of the various fields one would get 101 but the record being read has 102 characters. The last byte is the delimiter that I want to drop.

Option Explicit

Sub writeOneRec(OutF As Long, Rslt As String)
    Dim Yr As Integer, DataNum As Integer, Serial As Long, _
        NumPrec As Integer, HHwt As Integer, _
        StateFIP As Integer, County As Integer, Urban As Integer, _
        City As Integer, CityPop As Long, _
        UrbPop As Integer, _
        OwnerShip As Integer, MorgTotal As Integer, _
        MorgAmt1 As Integer, RentGross As Integer, _
        HHIncome As Long, _
        PerNum As Long, PerWt As Integer, Age As Integer, _
        Sex As Integer, MarSt As Integer, Race As Integer, _
        GradeAtt As Integer, Occ1950 As Integer, Ind1950 As Integer, _
        IncTot As Long, FamTotInc As Long
    On Error Resume Next
    Yr = Left(Rslt, 4)
    DataNum = Mid(Rslt, 5, 2)
    Serial = Mid(Rslt, 7, 8)
    NumPrec = Mid(Rslt, 15, 2)
    HHwt = Mid(Rslt, 17, 4)
    StateFIP = Mid(Rslt, 21, 2)
    County = Mid(Rslt, 23, 4)
    Urban = Mid(Rslt, 27, 1)
    City = Mid(Rslt, 28, 4)
    CityPop = Mid(Rslt, 32, 5)
    UrbPop = Mid(Rslt, 37, 5)
    OwnerShip = Mid(Rslt, 42, 2)
    MorgTotal = Mid(Rslt, 44, 4)
    MorgAmt1 = Mid(Rslt, 48, 5)
    RentGross = Mid(Rslt, 53, 4)
    HHIncome = Mid(Rslt, 57, 7)
    PerNum = Mid(Rslt, 64, 4)
    PerWt = Mid(Rslt, 68, 4)
    Age = Mid(Rslt, 72, 3)
    Sex = Mid(Rslt, 75, 1)
    MarSt = Mid(Rslt, 76, 1)
    Race = Mid(Rslt, 77, 3)
    GradeAtt = Mid(Rslt, 80, 2)
    Occ1950 = Mid(Rslt, 82, 3)
    Ind1950 = Mid(Rslt, 85, 3)
    IncTot = Mid(Rslt, 88, 7)
    FamTotInc = Mid(Rslt, 95, 7)
   
    On Error GoTo 0
    Write #OutF, Yr, DataNum, Serial, NumPrec, HHwt, StateFIP, _
        County, Urban, City, CityPop, UrbPop, OwnerShip, _
        MorgTotal, MorgAmt1, RentGross, HHIncome, _
        PerNum, PerWt, Age, Sex, MarSt, Race, _
        GradeAtt, Occ1950, Ind1950, IncTot, FamTotInc

    End Sub
Sub openDB9()
    Dim X, Rslt As String, OutF As Long
    X = FreeFile()
    Open "c:\tushar\work\usa_00009.dat\usa_00009.dat" _
        For Binary As #X Len = 102
    OutF = FreeFile()
    Open "c:\tushar\work\usa_00009.txt" _
        For Output As #OutF
       
    Write #OutF, "Yr", "DataNum", "Serial", _
        "NumPrec", "HHwt", _
        "StateFIP", "County", "Urban", _
        "City", "CityPop", _
        "UrbPop", _
        "OwnerShip", "MorgTotal", _
        "MorgAmt1", "RentGross", _
        "HHIncome", _
        "PerNum", "PerWt", "Age", _
        "Sex", "MarSt", "Race", _
        "GradeAtt", "Occ1950", "Ind1950", _
        "IncTot", "FamTotInc"
    Rslt = String(102, " ")
    Get #X, , Rslt
    Do While Not EOF(X)
        writeOneRec OutF, Rslt
        Get #X, , Rslt
        Loop
    'Stop
   Close #X: Close #OutF
    End Sub

And, so now, I have a CSV data set ready to load into PowerPivot.

PowerPivot - Part 1 of 4

After the MVP Summit I was motivated to further explore a Microsoft add-in for Excel 2010 called PowerPivot. I don’t know all its capabilities but at the very least it can analyze millions of rows of data. If you are using Excel 2010 you can download the add-in from http://www.microsoft.com/downloads/details.aspx?FamilyID=48a5b47b-8c9c-450f-ab6e-178600a733ca&displaylang=en

Do keep one thing in mind. The Community Technical Preview (CTP) version of the PowerPivot addin that is currently available is incompatible with the currently available version of Office 2010, i.e., the Release Candidate (RC). The problem seems to lie in that the data in the PowerPivot tables are not stored in the Excel file as they should be. So, on reopening the file, there is no database for PP to use! I assume (hope?) Microsoft will fix this incompatibility soon.

A day after returning from the Summit, I ran into a contest organized by Microsoft on creative uses of PowerPivot. I scrambled to put together an entry since the contest closed in 2 days. First, I had to install PowerPivot. Then, I decided to use census data as the source for some kind of analysis. It took me quite a while to figure out how to get that data. Then, I analyzed the data set with PP. Finally, I decided to marry the “%change in population from one census to the next” with a solution I had posted on my web site to conditionally color shapes in an Excel worksheet (http://www.tushar-mehta.com/excel/charts/0301-dashboard-conditional%20shape%20colors.htm).

I will post more detailed reports on my experience with PP and the census data. Part 2 will be about my experience getting the census data. Part 3 will be about the add-in to conditionally color shapes, and Part 4 will be about using PP in a relatively basic manner while integrating the result with the add-in.

For now, here is an image representing the result.

powerpivot

For those curious about the contest (now closed), here is a link: http://wildfireapp.com/website/6/contests/19877/voteable_entries/3165298