Open source, open standards, open access

Posted by Dorothea Salo | Uncategorized | Monday 26 February 2007 6:33 pm

Reading the MIRACLE project report on institutional repositories today, I came across a survey question that asked respondents to rate various factors involved in choosing a repository platform. The exact phrasing of one factor: “Adherence to open-access standards.”

This betrays a common and sometimes distracting confusion, which I will try to clear up.

The phrase “open source” applies to software. The exact parameters surrounding what software is open source and what isn’t are (still!) a matter of debate, but the general idea is that open-source software makes its source code available for (re)distribution and modification, whereas closed-source software does not. (The Open Source Initiative’s definition is available for those who wish to learn more.)

The phrase “open standards” applies to specifications: specifications for software, data formats, metadata, and so on. The Wikipedia page offers links to several definitions of an open standard, but again, the general idea is that an open standard is a set of rules surrounding a particular data or metadata format (or type of software, or whatever) that is available to the world at large for examination and implementation.

Unlike open-source software, an open standard may not be free-as-in-beer to the implementor, and it is almost certainly not easily modifiable by the world at large. Open standards are typically created and maintained either by an interest group such as the Open Archives Initiative, or by a formally constituted national or international standards organization such as the International Organization for Standardization (ISO).

The phrase “open access” applies to the research literature. Again, the exact parameters are a matter of debate, but “open access” literature is generally considered to be that made available on the open Web without fees, subscription or membership requirements, or other access barriers aside from gaining access to the Web itself. Peter Suber has written a lucid introduction to the concept.

“Open-access standards” with reference to institutional-repository software is at best ambiguous, and at worst meaningless. Does it mean that the IR software implements open technical standards related to access provision, such as OAI-PMH? Or does it simply mean that the software provides open access to the repository’s contents?
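To make the first reading concrete: a repository that implements OAI-PMH answers plain HTTP requests that carry a "verb" parameter. The Python sketch below only builds such a request URL and fetches nothing. The base URL and the helper name oai_request_url are hypothetical placeholders of mine, not any real repository or library API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; substitute a real repository's base URL.
BASE_URL = "https://repository.example.edu/oai"


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


# ListRecords with the oai_dc metadata prefix asks for records as unqualified Dublin Core.
url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

Whether a platform answers requests like this one is a question about open standards. Whether the items it describes are free to read is a separate question about open access.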

Survey designers especially need to pay careful attention to wording to ensure useful results. I hope that future IR surveys address this confusion appropriately.

CMSes, WYSIWYG — why learn HTML?

Posted by Dorothea Salo | Uncategorized | Monday 19 February 2007 3:25 pm

Many libraries may wonder whether on-staff web-design expertise is truly necessary, given the proliferation of content-management systems and WYSIWYG tools such as FrontPage and Dreamweaver.

I’m here to say—maybe.

Major concerns for any library website migration fall into two groups:

  • Patron concerns: usability, accessibility for the print-disabled, visual-design quality, printability.
  • Staff concerns: maintainability, ease of content creation and updating, compliance with legal requirements.

In practice, ease of content creation has tended to trump all other concerns in libraries, which has produced some truly unfortunate results when measured against the other criteria above. Most other industries have abandoned Microsoft FrontPage, for example; the HTML code it produces is thoroughly abominable for accessibility, printability, and maintainability. FrontPage is alive and well in libraries, though. If you’re using it, you need to stop as soon as you reasonably can; you’ll thank me in a year or five, I promise.

(And if you’re using Microsoft Word to produce HTML—please, please, please stop. I hate to beg, but Microsoft Word was an unconscionable thing to do to a poor innocent markup language like HTML. Don’t even cut-and-paste out of Word into an HTML document if you can avoid it! Learn to love text editors; this is what they’re for!)

Dreamweaver, on the other hand, can produce tolerably responsible HTML. The problem is, Dreamweaver has to be tweaked into doing so, and the tweaker needs to understand HTML to perform the tweaking properly. Anyone who relies on Dreamweaver’s WYSIWYG interface exclusively is almost certainly not producing web pages that will meet legal accessibility standards, never mind future-proofing or easy maintainability.
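One small example of what checking the output can look like: the Python sketch below lists images that carry no alt attribute at all, a common accessibility failure in WYSIWYG-generated pages. It is a single narrow check under my own assumptions, not an accessibility audit, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # An empty alt="" is acceptable for decorative images,
            # so only a completely absent attribute is flagged.
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images lacking an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


sample = '<p><img src="logo.gif"><img src="map.png" alt="Branch map"></p>'
print(images_missing_alt(sample))  # prints ['logo.gif']
```

A check like this can say that alt text is missing. It cannot say whether the alt text that is present is any good; that still takes a person who understands the markup.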

Content management systems are much like Dreamweaver. They can produce HTML so beautiful it makes text artisans weep for joy—or dross almost as horrid as Microsoft Word. To embrace the former and avoid the latter, there’s no getting around a need for HTML expertise in the CMS selection and customization/design process.

Be duly wary about trusting your CMS vendor/consultant’s expertise, for all the obvious reasons. If you can’t hire or develop web-design expertise on your staff, at least find an independent contractor to give the results a swift once-over.

Once you have a responsible CMS up and running, though… the need for HTML expertise on your staff may go down precipitously. With a few simple precautions (such as never, ever pasting content from Microsoft Word!), a CMS can simply meander along doing the right thing, while staff with no web expertise whatsoever create and alter content in TinyMCE textareas or wiki markup or whatever’s easiest.
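The Word precaution can even be partly automated. The Python sketch below flags markup that looks pasted from Word by searching for a few markers that Word-generated HTML commonly leaves behind. The marker list is my own and not exhaustive, and the function name is hypothetical rather than a feature of any particular CMS.

```python
import re

# Markers commonly left behind by Word-generated HTML (assumed, not exhaustive).
WORD_MARKERS = [
    re.compile(r'class="?Mso', re.IGNORECASE),      # e.g. class=MsoNormal
    re.compile(r'mso-[a-z-]+\s*:', re.IGNORECASE),  # mso-* style properties
    re.compile(r'</?o:p\b', re.IGNORECASE),         # <o:p> Office namespace tags
]


def looks_like_word_paste(html):
    """Return True if the markup contains any known Word marker."""
    return any(pattern.search(html) for pattern in WORD_MARKERS)


pasted = '<p class=MsoNormal style="mso-margin-top-alt:auto">Hours<o:p></o:p></p>'
clean = "<p>Hours</p>"
print(looks_like_word_paste(pasted))  # prints True
print(looks_like_word_paste(clean))   # prints False
```

Something along these lines could run as a warning when content is saved, so that staff learn about the problem before patrons do.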

For a small library, or indeed any library with limited IT expertise, that may be the Panglossian ideal. It’s certainly a goal worth pursuing—especially if you’re still using the horror that is FrontPage.