Background

The document conversion library and the responsive design document editor have shipped as components of UX Write on the iOS App Store since February 2013. Both components have been under continuous development since then, and within UX Write they provide a stable and reliable codebase. As an open source project, however, Corinthia is completely new: thanks to a grant from UXproductivity, it is now moving from a single-developer commercial project to an open, community-based project. We believe that this is the most beneficial path forward for the technology, enabling it to be developed to its full potential and made available to anyone who needs to deal with multiple document formats or provide editing functionality on web, mobile, or desktop platforms.

Goal

The goal of Corinthia is to provide a responsive design office document editor, as well as a toolkit that performs a defined conversion between different office document formats. Responsive design adapts the layout to the device in use, whether tablet or desktop. The editor is lightweight: an extension of, not a replacement for, the desktop editor.

Many office document programs claim to read and write the ISO open standards for office documents, OpenDocument Format (ODF) and Office Open XML (OOXML), but do not document which parts they leave unimplemented. Furthermore, the standards contain a large number of "implementation-defined" parts, which makes real-world compatibility unpredictable. The Corinthia project wants to bring this unacknowledged problem into the open and provide "compliance sheets" for document formats, of the kind known from industry computer protocols.

Corinthia aims to generate a large set of test documents that can be used to verify the "compliance sheets". The code can also serve as a test case for other applications (or for entities tendering for OOXML/ODF-based systems).
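As a rough illustration of how test documents could be used to verify a compliance sheet, the following Python sketch compares claimed support against observed results. The feature names, the "full"/"none" support levels, and the find_mismatches helper are all hypothetical; Corinthia does not define a compliance sheet format in this proposal.

```python
# Hypothetical compliance sheet: feature name -> claimed support level.
compliance_sheet = {
    "headings": "full",
    "tables": "full",
    "footnotes": "none",
}

# Hypothetical outcome of running one test document per feature through
# an application: feature name -> whether the feature survived intact.
test_results = {
    "headings": True,
    "tables": False,
    "footnotes": False,
}


def find_mismatches(sheet, results):
    """Return (feature, reason) pairs where the results contradict the sheet."""
    mismatches = []
    for feature, claimed in sorted(sheet.items()):
        observed = results.get(feature)
        if observed is None:
            # The sheet makes a claim that no test document exercises.
            mismatches.append((feature, "no test document"))
        elif (claimed == "full") != observed:
            outcome = "pass" if observed else "fail"
            mismatches.append((feature, "claimed %s, observed %s" % (claimed, outcome)))
    return mismatches


mismatches = find_mismatches(compliance_sheet, test_results)
```

In this made-up data set, only "tables" is reported, because the sheet claims full support while the test document did not survive conversion.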

The foundation of Corinthia and its toolkit is the DocFormats library, which converts between different office document file formats. It currently supports .docx (part of the OOXML specification), HTML, and LaTeX (export only). Alongside it is an editor that manipulates HTML files in a web browser or embedded web view, and that can be used together with DocFormats to edit documents in all supported formats.

The design of DocFormats is based on the idea of bidirectional transformation (BDT), in which a specific document (the original file in its source format) is converted into an abstract document (in the destination format). A modified version of the abstract document can then be used to update the specific document non-destructively: because the original file is modified rather than replaced, all parts of the file that the abstract format does not support are kept intact.
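The two directions of a bidirectional transformation can be sketched in a few lines. This is a minimal Python illustration only: DocFormats itself is written in C, and the get and put functions, the SUPPORTED set, and the dictionary-based document model are assumptions made for this example, not part of its API.

```python
import copy

# Assumed set of features that the abstract format can represent.
SUPPORTED = {"paragraphs", "styles"}


def get(specific):
    """Forward direction: derive the abstract document from the specific one."""
    return {key: copy.deepcopy(value)
            for key, value in specific.items() if key in SUPPORTED}


def put(specific, abstract):
    """Backward direction: update the specific document in place.

    Only supported parts are touched; everything else in the original
    file is left exactly as it was.
    """
    for key in SUPPORTED:
        if key in abstract:
            specific[key] = copy.deepcopy(abstract[key])
        else:
            # The part was deleted in the abstract document.
            specific.pop(key, None)
    return specific


# Stand-in for an original file in its source format.
docx = {
    "paragraphs": ["Hello"],
    "styles": {"Normal": "12pt"},
    "embedded_macro": b"\x00\x01",  # not representable in the abstract format
}

html = get(docx)                     # abstract document, without the macro
html["paragraphs"].append("World")   # edit the abstract document
put(docx, html)                      # fold the edit back into the original
```

After put, the specific document contains the new paragraph while the unsupported embedded_macro entry is unchanged. A conversion that rebuilt the file from the abstract document alone would have lost it.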

Description

Corinthia is a toolkit and editor for converting between and editing common office file formats, with an initial focus on word processing.

Corinthia is designed to cater for multiple classes of platforms - web, mobile and desktop - and relies on web technologies such as HTML, CSS, and JavaScript for representing and manipulating documents.

The toolkit is small, portable, and flexible, with minimal dependencies. The target audience is developers wishing to include office viewing, conversion, and editing functionality in their applications. The file format conversion library is implemented in highly portable C and can be easily embedded in native applications, with bindings for other programming languages planned. The library allows two-way conversion between different formats, and avoids irreversible loss of content or formatting unsupported in a target format by updating the source format in a way that makes only the minimal changes necessary.

The editor is implemented in JavaScript, and runs in a browser runtime - either an actual web browser, or a web view embedded in a native app. It follows the philosophy of responsive design, popular on the web, where layout of a document is automatically adapted to suit the screen size and orientation, enabling the same content to be viewed on mobile phones, tablets, and desktop systems. All layout is handled by the browser's own engine; the editor works solely with the document's HTML structure and CSS styles. Currently the editor only operates in an embedded web view, but we plan to have it run in all major web browsers, and provide a clean API for easy integration into various native apps.

Importantly, document viewing and editing in Corinthia operate on the intermediate form (HTML and CSS) and are limited to common, widely supported features. Corinthia is not a comprehensive substitute for format-specific authoring, editing, and final-form printing/production software. It is intended to complement, not compete with, major office suites.

Identifying and confirming which features of different formats are inter-convertible, so that import and export are dependable, requires developing extensive test documents in the different formats. This work includes profiling the extent to which standardized formats are supported in practice, and identifying deviations and implementation-dependent choices that affect convertibility.