Posted in hacks and kludges on Monday, October 26 2015

One unfortunate thing about using excel for coding at work is the difficulty in version controlling spreadsheets. Excel files are big binary blobs and, unlike the source files for most programming languages which are text files, cannot be easily diffed. This raises all sorts of issues with the maintainability of the spreadsheets and the macros that are embedded in them. I hacked together a simple solution to this problem in powershell, for newer versions of Excel (I have Excel 2013 on my work computer and it works there!)

Unfortunately all the VBA code is stored in a vbaproject.bin file inside that zip file. This is a binary blob, an OLE file, and has all the same maintainability issues as the spreadsheet as a whole. There are various tools online for viewing and extracting code from this file. I looked at what it would take to read/write directly into the file and ugh no. Too much work given that Excel can import and export the contents of a VBA project easily and with a few keystrokes.

Putting all this together, I took my spreadsheet for doing compressible pipeflow calculations unpacked it and put it on github. Then I wrote a short powershell script to do the following:

1. Take the unpacked directory of xml files and zip it into temp.zip
2. Rename temp.zip to temp.xlsm, it is now a real spreadsheet (minus the vba code)
3. Open Excel in background and open the temp.xlsm file
4. Walk through the directory of basic modules and import each into the Excel file
5. Walk through the directory of class modules and import each into the Excel file
6. Save and close temp.xlsm, quit Excel
7. Rename temp.xlsm to Pipeflow.xlsm

This short script takes that directory and re-makes a spreadsheet. By unpacking and re-packing the spreadsheet I can maintain it with a normal version control system, tracking only the small changes to the individual text files.

So far it works pretty well, the only giant caveat is that Excel, by default, does not allow outside programs to muck about in the VBA side (probably for good security reasons). So you will have to open your own copy of Excel, migrate through the options and turn on "Trust access to the VBA project object model". (Also you need windows powershell, and have its permissions set so you can run scripts (duh!))

$wb_src = Join-Path (pwd) Pipeflow.xlsm.extracted\$wb_dst = Join-Path (pwd) temp.zip
$wb_temp = Join-Path (pwd) temp.xlsm$wb_final = Join-Path (pwd) Pipeflow.xlsm

[System.IO.Compression.ZipFile]::CreateFromDirectory($wb_src,$wb_dst)

Rename-Item -Path $wb_dst -NewName$wb_temp

$xl = New-Object -comobject Excel.Application$xl.Visible = $False$wb = $xl.Workbooks.Open($wb_temp)

foreach ($i in get-childitem Modules\*.bas) {$wb.VBProject.VBComponents.Import($i.fullname) | Out-Null } foreach ($i in get-childitem Classes\*.cls) {
$wb.VBProject.VBComponents.Import($i.fullname) | Out-Null
}

$wb.Save()$xl.Quit()

Rename-Item -Path $wb_temp -NewName$wb_final

tags: excel, vba, powershell,