Wednesday, July 02, 2008

Got some RTF? Want to create your own parser? Don’t want to re-invent the wheel?

CodeProject - Writing your own RTF Converter

“…

The component introduced in this article has been designed with the following goals in mind:

  • Support for the current RTF Specification 1.9.1
  • Open source C# code
  • Unlimited usage in console, WinForms, WPF, and ASP.NET applications
  • Independence of third party components
  • Possibility to analyze RTF data on various levels
  • Separation of parsing and the actual interpretation of the RTF data
  • Extensibility of parser and interpreter
  • Providing simple predefined conversion modules for text, images, XML, and HTML
  • Ready-to-Use RTF converter applications for text, images, XML, and HTML
  • Open architecture for simple creation of custom RTF converters

Please keep the following shortcomings in mind:

  • The component offers no high-level functionality to create RTF content.
  • The present RTF interpreter is restricted to content data and basic formatting options.

    There is no special support for the following RTF layout elements:

    • Tables
    • Lists
    • Automatic numbering
    • All features which require knowledge of how MS-Word might mean it ...

In general, this should not pose a big problem for many areas of use. A conforming RTF writer should always write content with readers in mind that do not know about tags and features which were introduced later in the standards history. As a consequence, a lot of the content in an RTF document is stored several times (at least if the writer cares about other applications). This is taken advantage of by the interpreter here, which just simply focuses on the visual content. Some writers in common use, however, improperly support this alternate representation which will result in differences in the resulting output.

Thanks to its open architecture, the RTF parser is a solid base for development of an RTF converter which focuses on layout.

…”

The thought of writing my own RTF parser makes my brain hurt. That is a wheel I don’t think I’d EVER want to re-invent.
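To give a feel for what the quoted "separation of parsing and the actual interpretation of the RTF data" might look like in miniature, here is a rough Python sketch. It is not the article's C# component and does not use its API. The names `tokenize`, `to_text`, and `SKIP_DESTINATIONS` are made up for illustration. It only handles control words, control symbols, groups, and plain text, and ignores things a real parser must deal with (`\uN` Unicode escapes, `\'hh` hex escapes, binary data, code pages).

```python
import re

# One token is: a control word (\name, optional numeric parameter, one optional
# trailing space delimiter), a control symbol (\ plus one non-letter), a group
# brace, or a run of plain text.
TOKEN_RE = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # control word, e.g. \par or \b0
    r"|\\([^a-zA-Z])"            # control symbol, e.g. \* or \{
    r"|([{}])"                   # group open / close
    r"|([^\\{}]+)"               # plain text
)

# Destinations whose content is not visible text (an assumed, incomplete list).
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}


def tokenize(rtf):
    """Parsing step: turn raw RTF into (kind, value, param) tuples."""
    for m in TOKEN_RE.finditer(rtf):
        word, param, symbol, brace, text = m.groups()
        if word is not None:
            yield ("word", word, int(param) if param else None)
        elif symbol is not None:
            yield ("symbol", symbol, None)
        elif brace is not None:
            yield ("group", brace, None)
        else:
            # Raw line breaks in the file are not content; \par is.
            yield ("text", text.replace("\r", "").replace("\n", ""), None)


def to_text(rtf):
    """Interpretation step: keep visible text, drop everything else."""
    out = []
    depth = 0
    skip_depth = None  # depth of the group currently being skipped, if any
    for kind, value, _param in tokenize(rtf):
        if kind == "group":
            if value == "{":
                depth += 1
            else:
                if skip_depth is not None and depth == skip_depth:
                    skip_depth = None  # the skipped group just closed
                depth -= 1
            continue
        if skip_depth is not None:
            continue
        if kind == "symbol":
            if value == "*":
                skip_depth = depth  # {\*\...} marks an ignorable destination
            elif value in "\\{}":
                out.append(value)   # escaped literal character
        elif kind == "word":
            if value in SKIP_DESTINATIONS:
                skip_depth = depth
            elif value == "par":
                out.append("\n")
            elif value == "tab":
                out.append("\t")
            # every other control word (formatting etc.) is ignored
        else:
            out.append(value)
    return "".join(out)


SAMPLE = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}{\*\generator Foo;}\f0 Hello, \b world\b0 !\par}"
```

Calling `to_text(SAMPLE)` on that little document gives back `"Hello, world!\n"`: the font table and the ignorable `\*\generator` group are skipped, and the bold on/off control words are dropped. Skipping `\*` groups is the same "readers that do not know about newer features" idea the quote describes. Even at this toy size, tables, lists, and numbering are nowhere in sight, which is about where the brain-hurting starts.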
