Wednesday, July 02, 2008

Got some RTF? Want to create your own parser? Don’t want to re-invent the wheel?

CodeProject - Writing your own RTF Converter

“…

The component introduced in this article has been designed with the following goals in mind:

  • Support for the current RTF Specification 1.9.1
  • Open source C# code
  • Unlimited usage in console, WinForms, WPF, and ASP.NET applications
  • Independence of third party components
  • Possibility to analyze RTF data on various levels
  • Separation of parsing and the actual interpretation of the RTF data
  • Extensibility of parser and interpreter
  • Providing simple predefined conversion modules for text, images, XML, and HTML
  • Ready-to-Use RTF converter applications for text, images, XML, and HTML
  • Open architecture for simple creation of custom RTF converters

Please keep the following shortcomings in mind:

  • The component offers no high-level functionality to create RTF content.
  • The present RTF interpreter is restricted to content data and basic formatting options.

    There is no special support for the following RTF layout elements:

    • Tables
    • Lists
    • Automatic numbering
    • All features which require knowledge of how MS-Word might mean it ...

In general, this should not pose a big problem for many areas of use. A conforming RTF writer should always write content with readers in mind that do not know about tags and features which were introduced later in the standards history. As a consequence, a lot of the content in an RTF document is stored several times (at least if the writer cares about other applications). This is taken advantage of by the interpreter here, which just simply focuses on the visual content. Some writers in common use, however, improperly support this alternate representation which will result in differences in the resulting output.

Thanks to its open architecture, the RTF parser is a solid base for development of an RTF converter which focuses on layout.

…”
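To make the "stored several times" point from the quote a bit more concrete: one place RTF does this is Unicode text, where a `\uN` control word is followed by fallback characters for readers that predate Unicode support, and `\ucN` says how many fallback characters follow each `\uN` (the default is 1). Below is a rough Python sketch of that one idea, not the CodeProject component's C# API. The names `fragment_to_text` and `unicode_aware` are made up for illustration, and the sketch ignores group scoping, escaped braces, and nearly everything else a real parser has to handle.

```python
import re

# One token per match: a control word, a hex-escaped byte, a plain
# character, or anything else (braces, backslashes, newlines).
TOKEN = re.compile(
    r"\\(?P<word>[a-zA-Z]+)(?P<num>-?\d+)? ?"  # control word, e.g. \u8364 or \b
    r"|\\'(?P<hex>[0-9a-fA-F]{2})"             # hex-escaped byte, e.g. \'80
    r"|(?P<char>[^\\{}\r\n])"                  # a plain text character
    r"|(?P<other>.)",                          # everything else is dropped
    re.S,
)


def fragment_to_text(fragment, unicode_aware=True):
    """Extract plain text from a tiny RTF fragment (hypothetical helper).

    unicode_aware=True  acts like a newer reader: use \\uN, skip the fallback.
    unicode_aware=False acts like an older reader: ignore \\uN, keep the fallback.
    """
    out = []
    fallback_len = 1  # RTF default when no \ucN has been seen
    to_skip = 0       # fallback characters still to be skipped
    for m in TOKEN.finditer(fragment):
        word = m.group("word")
        if word is not None:
            num = m.group("num")
            if word == "uc" and num is not None:
                fallback_len = int(num)
            elif word == "u" and num is not None and unicode_aware:
                code = int(num)
                if code < 0:  # values above 32767 are written as negatives
                    code += 65536
                out.append(chr(code))
                to_skip = fallback_len
            continue  # all other control words are ignored in this sketch
        if m.group("hex") is not None:
            # Assumes the Windows-1252 code page for hex-escaped bytes.
            ch = bytes.fromhex(m.group("hex")).decode("cp1252", errors="replace")
        elif m.group("char") is not None:
            ch = m.group("char")
        else:
            continue
        if to_skip > 0:
            to_skip -= 1  # this character is only a fallback for \uN
        else:
            out.append(ch)
    return "".join(out)


sample = r"Price: 5 \u8364? only"
print(fragment_to_text(sample, unicode_aware=True))   # Price: 5 € only
print(fragment_to_text(sample, unicode_aware=False))  # Price: 5 ? only
```

The same fragment yields different text depending on which representation the reader picks, which is presumably the kind of difference the article warns about when a writer gets the alternate representation wrong.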

The thought of writing my own RTF parser makes my brain hurt. That is a wheel I don’t think I’d EVER want to re-invent.
