r4 - 19 Oct 2005 - 19:28:20 - WendellPiez

Towards Schema Design Requirements

(nb: please use the plural "schemas" not "schemata", which are something different)

See the Technology Commitments page for why we prefer RelaxNG for each of these schemas. In process, each of these can be supplemented by one or more Schematron assertion sets (to say nothing of validation by context).
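To illustrate the kind of constraint an assertion set can add on top of a grammar, here is a minimal sketch in plain Python. It is neither RelaxNG nor Schematron, only a stand-in for the idea of a rule checked across the whole document. The element and attribute names (article, section, ref, id, target) are assumptions made for the example and are not taken from any DHQ schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative assumptions.
SAMPLE = """
<article>
  <section id="intro"><p>See <ref target="methods"/>.</p></section>
  <section id="methods"><p>See <ref target="results"/>.</p></section>
</article>
"""

def dangling_refs(root):
    """Return the targets of <ref> elements that match no id in the document.

    A grammar can say that <ref> carries a target attribute; a check that the
    target actually exists somewhere else is the sort of cross-document rule
    that an assertion layer is typically used for.
    """
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    return [ref.get("target") for ref in root.iter("ref")
            if ref.get("target") not in ids]

root = ET.fromstring(SAMPLE)
print(dangling_refs(root))  # prints ['results']
```

The sample is structurally fine, yet one reference points nowhere; a second validation pass of this kind catches what the grammar alone would let through.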

DHQ schemas: Scope and Goals

Authoring schema: DHQauthor

See the DHQauthor Schema page for details on the DHQauthor schema.

Scope:

  • Articles for DH Quarterly (Zone I: see FrameWorkNotes for this stuff on zones) and references to articles and pieces in Zones II or III (via a "writeup" with a metadata wrapper)
Goals:
  • Provide suitable tagging to authors for articles, reviews and editorials for DHQ
  • Validate tagging constructs as tightly as possible to DHQ "house style" while offering the widest practical range of expression to authors (outside house style, use the crayonbox schema or go to Zone II or III)
  • Be easy to use and lightweight
  • Be consonant with tagging constructs familiar from TEI (to the extent possible; processing semantics can take priority but TEI should be used when its semantics fit)
  • Capture whatever useful metadata is easily or best captured from authors
Non-goals:
  • Support production in detail (DHQpublish can provide support for details of production or processing)

Editorial, production and publication Schema: DHQpublish

See the DHQpublish Schema page for details.

Scope:

  • Articles for DH Quarterly (in any of zones I-III: see FrameWorkNotes)
Goals:
  • Maximum compatibility with DHQauthor (an easy transform at most)
  • Provide "lights-out" generation of articles and resources in HTML, PDF and other target formats (RSS etc.) from source using freely available tools (or cheaply accessible ones)
  • Provide flexible and robust metadata structures for
    • cataloging and online access
    • process (workflow) control
    • archiving
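
The "easy transform at most" goal above can be pictured with a small sketch. It uses Python in place of the XSLT that the processing layer would presumably use, and every name in it (article, body, publicationMeta, volume, issue, the derive_publish function) is hypothetical. The point is only that the derived document keeps the authored content intact and adds a metadata block for production.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical DHQauthor-like source; tag names are assumptions.
AUTHOR_DOC = "<article><title>Sample</title><body><p>Text.</p></body></article>"

def derive_publish(author_root, volume, issue):
    """Return a copy of the authored tree with a publication metadata block.

    The source tree is left untouched, so the authored document remains the
    stable input from which the publishing form can be regenerated.
    """
    publish_root = copy.deepcopy(author_root)
    meta = ET.Element("publicationMeta",
                      {"volume": str(volume), "issue": str(issue)})
    publish_root.insert(0, meta)  # metadata goes first, content follows unchanged
    return publish_root

author_root = ET.fromstring(AUTHOR_DOC)
publish_root = derive_publish(author_root, volume=1, issue=1)
print(ET.tostring(publish_root, encoding="unicode"))
```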

Crayon Box schema: DHQcrayonbox

Scope:
  • DH articles in Zone I too "experimental" for DHQauthor
Goals:
  • Support wilder experiments in tagging than DHQauthor
  • Validate DHQauthor documents freely, and then some
  • Presume custom stylesheets may be applied (instead of or in addition to regular stylesheets), to ease requirements for work-alikeness with DHQpublish
  • Provide a DHQ header like other documents (in Zones II and III) to assure compatibility in access

Any of these

Non-goals:
  • Support retrospective tagging of arbitrary documents already extant (TEI can be used for this, or works in other formats can be referenced in Zone II or III)

Operational context

In brief, a document published by DHQ (article, review, editorial) will pass through several successive stages:

  1. [authoring format]
  2. DHQauthor: basic text + authorial metadata
  3. DHQpublish: enhanced with tagging for publication
  4. [production formats]

Authoring formats include any of DHQ's accepted formats, as described in the Submission Guidelines. In particular, since we wish to encourage authoring and submission in XML formats, we provide DHQauthor itself as a viable format for the original creation of submissions. We will be sharing tools and stylesheets to create, test, and use DHQauthor. Because works submitted in DHQauthor will be less laborious for us to produce, we expect to publish them more quickly than papers requiring conversion.
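
The four stages, and the shortcut available to submissions already in DHQauthor, can be sketched as follows. This is an illustration under stated assumptions: the Stage enumeration and the remaining_stages function are names invented for the example and do not correspond to any DHQ tool.

```python
from enum import IntEnum

class Stage(IntEnum):
    AUTHORING = 1    # any accepted submission format
    DHQ_AUTHOR = 2   # basic text + authorial metadata
    DHQ_PUBLISH = 3  # enhanced with tagging for publication
    PRODUCTION = 4   # HTML, PDF, RSS and the like

def remaining_stages(submitted_as_dhqauthor):
    """List the stages a submission still has to reach.

    A work written directly in DHQauthor enters at stage 2 and so skips
    the conversion from some other authoring format.
    """
    start = Stage.DHQ_AUTHOR if submitted_as_dhqauthor else Stage.AUTHORING
    return [stage for stage in Stage if stage > start]

print([s.name for s in remaining_stages(False)])
# prints ['DHQ_AUTHOR', 'DHQ_PUBLISH', 'PRODUCTION']
print([s.name for s in remaining_stages(True)])
# prints ['DHQ_PUBLISH', 'PRODUCTION']
```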

The DHQauthor Schema will be documented for general use outside DHQ. The DHQpublish Schema will be documented and specified separately, as it will be only for internal use by DHQ.

Read more about the operational context:

  • Technical Specs presents an outline of our design areas, differentiated by functional area: front end (HTML to start with) and presentation; processing layer (XML/XSLT); and anything we care to write up about storage and file management (hah)
  • Technology Commitments - what platform(s) and software are we running on?
  • For background: see some Architecture Notes

Development model

As described in the Architecture Notes and FrameWork Notes documents, DHQ will be published in an XML-based framework, taking input in a range of formats (ranked by preference) and creating a range of outputs (HTML, PDF, RSS and the like). The unifying format in the center is "DHQ ML".

We are getting at DHQ ML from two ends: at the front, we are designing the "body" or presentation format - to start with, in HTML+CSS, but over time, in other formats as well. ("Body" here is meant in the sense that we might say "auto body", meaning the shell or skin.) Simultaneously, in back, we are developing DHQauthor, the format through which we will bring into DHQ those works that can be described in it (which may correlate to "text-based submissions" as described in the Submission Guidelines). Having the "uniform center" of DHQ material entirely in this single XML format will improve its potential for reuse, search/retrieval, analysis and heuristics, and corpus research, along with its long-term accessibility. Consider this format the "engine" of DHQ, if you like.

When DHQ materials do not prove to be usefully reformatted as DHQ ML, they will be presented in their native format, but provided with DHQauthor descriptors ("stubs"). (This is likely to be the case with many or most of the "media-based" formats also described in Submission Guidelines, though it may not be limited to them.) This metadata set will allow them to be tracked and handled within the DHQ production framework, in most respects as if they were themselves in DHQ ML. (Being not as consistently or richly tagged, they may not be quite as transparent to the system.)
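
As a rough picture of what such a stub might carry, the sketch below builds a small metadata wrapper that points at a work kept in its native format. Nothing here reflects the actual DHQauthor design: the element and attribute names (stub, header, title, author, resource, mediaType, href) and the make_stub function are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

def make_stub(title, author, media_type, href):
    """Build a hypothetical descriptor for a work kept in its native format.

    The header holds the metadata the production framework would track;
    the resource element only links to the work rather than containing it.
    """
    stub = ET.Element("stub")
    header = ET.SubElement(stub, "header")
    ET.SubElement(header, "title").text = title
    ET.SubElement(header, "author").text = author
    ET.SubElement(stub, "resource", {"mediaType": media_type, "href": href})
    return stub

stub = make_stub("An Example Work", "A. Author",
                 "video/mpeg", "media/example.mpg")
print(ET.tostring(stub, encoding="unicode"))
```

Because the header has the same shape whether the body is present or merely linked, indexes and tables of contents could be generated from stubs and full documents alike.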

In practical terms, this means that all materials published by DHQ will either be represented in some fashion by XML documents in DHQ ML format (whether that representation be complete, or only by a link), or will be derivatives from such representations (such as directories, indexes and tables of contents).

To achieve this goal, we are approaching the design as follows:

  1. Deploy a prototype schema (DHQauthor beta)
  2. Test this model by developing sample documents for a range of use cases
  3. Finalize DHQauthor beta
  4. Build stylesheets targeting the initial DHQ front end design (a separate design process)
    • Develop DHQpublish in answer to requirements of publishing system
      including HTML/CSS, PDF, RSS, OAI etc. etc.... whatever we go with
    • DHQpublish documents should be straightforwardly derived from DHQauthor source
  5. Finalize, finish documenting, and deploy DHQauthor 1.0 version for general use by production staff and DHQ contributors
  6. Run the system a few times!
  7. Rev subsequent designs in view of developments to front end, other interfaces
    • Develop DHQcrayonbox in answer to requirements for experimentation

WendellPiez - 24 Sep (19 Oct)
