Wednesday, July 30, 2014

Using the OpenXML SDK Productivity Tool to "decompile" Office documents (Turn *X files into the C# OpenXML SDK code that would generate them)

Ode To Code - Easily Generate Microsoft Office Files From C#

"...

These days, Office files are no longer in a proprietary binary format, and we can create the files directly without using COM automation. A .docx Word file, for example, is a collection of XML documents zipped into a single file. The official name of the format is Open XML.

There is an SDK to help with reading and writing OpenXML, and a Productivity Tool that can generate C# code for a given file. All you need to do is load a document, presentation, or workbook into the tool and press the “Reflect Code” button.


The downside to this tool is that even a simple document will generate 4,000 lines of code. Another downside is that the generated code assumes it will write directly to the file system; however, it is easy to pass in an abstract Stream object instead.

So while this code isn’t perfect, the code does produce a valid document and..."

I've been blogging about the OpenXML SDK for years now, but I think this is the first time I've seen this part of it, this utility. And like he says, 4K LoC is, well, a lot, but it does look like an awesome way to learn the low-level OpenXML SDK ins and outs.
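
Two points from the quote above are easy to see for yourself without the SDK at all: a .docx really is just a zip of XML parts, and nothing forces the output onto the file system. What follows is only a minimal sketch in Python's standard library, not the SDK and not the code the Productivity Tool reflects. The helper names (write_minimal_docx, list_parts, read_text) are made up for illustration, and the three parts are the bare minimum I'd expect a Word package to need.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Declares the content type of each part in the package.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Points the package at its main document part.
RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# The main document part: one paragraph, one run, one text node.
DOCUMENT_TEMPLATE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>
</w:document>"""


def write_minimal_docx(stream, text):
    """Write a one-paragraph Word package to any writable, seekable binary stream."""
    with zipfile.ZipFile(stream, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml",
                         DOCUMENT_TEMPLATE.format(text=escape(text)))


def list_parts(stream):
    """Return the names of the parts stored inside the package."""
    with zipfile.ZipFile(stream) as package:
        return package.namelist()


def read_text(stream):
    """Concatenate the text of every w:t element in the main document part."""
    with zipfile.ZipFile(stream) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return "".join(node.text or "" for node in root.iter("{%s}t" % W_NS))


# Build the document entirely in memory; an opened file would work the same way.
buffer = io.BytesIO()
write_minimal_docx(buffer, "Hello, Open XML")
print(list_parts(buffer))
print(read_text(buffer))
```

Because write_minimal_docx only asks for a stream, the same call works with an in-memory buffer, a file opened with open(path, "wb"), or a web response body. That is the same change the quote suggests making to the generated C#.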

 

Related Past Post XRef:
Open Sesame - Open XML SDK is now open source

Using OpenXML to load an Excel Worksheet into a DataTable (or just how different OpenXML is from the old Excel API we're used too)

Using OpenXML SDK to generate Word documents via templates (and without Word being installed)
Checking for Microsoft Word DocX/DocM Revisions/Track Changes without using Word... (via OpenXML SDK, LINQ to XML or XML DOM)
LINQ to XlsX... Using VB.Net, LINQ, the OpenXML SDK and a little C# helper, to query an Excel XlsX
Using native OpenXML to create an XlsX (Which provides an example of why I highlight tools that make OpenXML easier...)
Generating Xlsx's on the Server? You're using OpenXML, right? With help from the PowerTools for OpenXML?

Official boat-load, as in supertanker, sized OpenXML content list (Insert "One OpenXML content list to rule them all" here)
So how do I get from here to OpenXML? Got a map for you, an Open XML SDK Blog Map…
Where to go to scratch your OpenXML dev info itch…
"Open XML Explained" Free eBook (PDF)
The Noob's Guide to Open XML Dev (If you know how to spell OpenXML but that's about it, this is your Getting Started guide...)

Reusing the PowerShell PowerTools for Open XML in your C# or VB.Net world
PowerShell, OpenXML, WMI and the PowerTools for OpenXML = Doc generation for our inner geek
Because it’s a PowerShell kind of day… PowerTools for Open XML V1.1 Released
OpenXML PowerTools updated – Cell your Excel via PowerShell
Powering into OpenXML with PowerShell

Open XML SDK 2.0 for Microsoft Office Released – Automate Office documents without Office

Open XML 2.0 Code Snippets for VS2010 (and VS2008 too)
Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008 – 52 C#/VB Code Snippets to help ease your Open XML coding
Open XML File Format Code Snippets for Visual Studio 2005 (Office 2007 NOT required)

Open XML SDK v1 Released

OpenXML Viewer 1.0 Released – Open source DocX to HTML conversion, with IE, Firefox and Opera (and/or command line) support
