Wednesday, August 03, 2005

Office 2003 Add-in: Word Redaction

Download details: Office 2003 Add-in: Word Redaction

"Redaction is the careful editing of a document to remove confidential information.

The Microsoft Office Word 2003 Redaction Add-in makes it easy for you to mark sections of a document for redaction. You can then redact the document so that the sections you specified are blacked out. You can either print the redacted document or use it electronically.

Sensitive government documents, confidential legal documents, insurance contracts, and other sensitive documents are often redacted before being made available to the public. With the Word 2003 Redaction Add-in, users of Microsoft Office Word 2003 now have an effective, user-friendly tool to help them redact confidential text in Word documents.

Notes:
We recommend that you carefully review any documents redacted using this tool to confirm that all the information that you intended to redact was successfully redacted."


I sure hope MS did their homework and closely reviewed this. It would not be pretty for them if someone "important" released a redacted document and another party was able to remove the redaction. Redactions on produced/delivered documents, by their very nature, have to be secure, permanent and inviolable.


I took a quick two-second pass at trying to break this, and so far it looks like MS did a good job.


What happens is that in the original Word document you mark sections/words/etc. for redaction. Then you "Redact Document". This creates a new Word document with the redactions in place. During the Redact Document step, the Add-in appears to actually replace the redacted characters with ASCII character 124, aka the vertical bar or pipe symbol (|).
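To make the "replace, don't hide" idea concrete, here is a minimal Python sketch of redaction by overwriting. It is not the Add-in's code: the `redact` function, the span format, and the one-pipe-per-character behavior are all assumptions for illustration.

```python
def redact(text, spans, mask=chr(124)):
    """Return a copy of text with every character in each span overwritten.

    spans is a list of (start, end) character offsets, end exclusive.
    The function name and span format are hypothetical; the real Add-in
    works on marked ranges inside a Word document.
    """
    chars = list(text)
    for start, end in spans:
        # Overwrite in place so nothing of the original span survives.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask
    return "".join(chars)


original = "Pay John Smith $500 today."
redacted = redact(original, [(4, 14)])  # offsets 4..13 cover "John Smith"
print(redacted)
```

The point of the design is that the output is built from the mask character only, so there is no hidden copy of the original text to recover from the new document.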


I focused on this new, redacted document to see if there was an easy way to get the original text back out.

First, can I select, cut and paste out the original text? Nope.

Second, using the Word Object Model/API can I get the original text? Nope (tested via Office Spy).

Third, what about SaveAs *.txt? Nope, still can't get the original text.

Fourth, what about opening the DOC in Notepad? Nope, the original text is not easily viewable.

Finally, Track Changes. I turned Track Changes on for the original document, redacted some text, and ran "Redact Document". Now, in the new document, can I turn off the redaction, reject it, or view the original text? Nope. Still redacted.


So while it looks like this is a good, workable tool, if you are still concerned you might want to wait for forensic professionals to weigh in before using it.

3 comments:

Anonymous said...

For years, I've conjectured that the championship prize for EDD will go to whoever figures out how to simulate the specialized tasks of the document discovery process in (copies of) the original electronic documents themselves rather than in electronic images of the "printed" pages. The EDD Killer App is the one that satisfies the needs of the legal community but without also requiring that documents first be virtually printed ("pageified"). Being able to redact the text of digital documents in any format will be a huge win for the legal profession. The Microsoft Word Redaction technology is a baby step in that direction.

Does the add-on obscure (encrypt and hide) or obliterate (remove and replace with placeholders) the text? By definition, it should be doing the latter.

Anonymous said...

Another quick test, if you have a Unix box at your disposal, is to run the strings command on the document. It prints whatever plaintext exists in a file, so it can be a useful way to quickly extract strings from a binary document (e.g., PDF, DOC). You sometimes get garbage and command strings, but redacted information can be found that way too.
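For anyone without a Unix box handy, the idea behind strings can be roughly approximated in a few lines of Python. This is a simplified sketch, not the real tool: `extract_strings` is a made-up name, it only looks for runs of printable single-byte ASCII, and the sample bytes are invented for the demo.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


# Invented sample standing in for the raw bytes of a binary document.
sample = b"\x00\x01secret memo\x00\xffab\x00||||||||\x02"
found = list(extract_strings(sample))
print(found)
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them in. Text stored in a two-byte encoding will not show up in this single-byte scan, so an empty result is not proof that the text is gone.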

Greg said...

Thanks Joe, nice tip...

What's cool is that the free Windows Services for Unix, http://www.microsoft.com/windowsserversystem/sfu/default.mspx, includes a "strings" implementation.

Running strings on a test redacted document reveals that the redacted text is indeed gone and replaced with |'s.
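If you know what was supposed to be redacted, a slightly more targeted check is to search the raw bytes of the redacted file for those exact phrases. The sketch below is only an illustration: `find_leaks` is a hypothetical helper, the sample bytes are invented, and it assumes the text might be stored either as 8-bit (cp1252) or as UTF-16LE characters.

```python
def find_leaks(data: bytes, secrets):
    """Return the secrets that appear in data as cp1252 or UTF-16LE text."""
    leaked = []
    for secret in secrets:
        candidates = [secret.encode("utf-16-le")]
        try:
            candidates.append(secret.encode("cp1252"))
        except UnicodeEncodeError:
            pass  # secret has characters cp1252 cannot represent
        if any(candidate in data for candidate in candidates):
            leaked.append(secret)
    return leaked


# Invented sample: one phrase survives as UTF-16LE, the other was overwritten.
sample = b"\x00header" + "John Smith".encode("utf-16-le") + b"||||||||"
leaks = find_leaks(sample, ["John Smith", "Jane Doe"])
print(leaks)
```

A clean result only means these exact byte sequences were not found. Compressed, fragmented, or differently encoded leftovers would slip past it, which is one more reason to let forensic folks weigh in.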

So all is still good with this add-in.