About the NZETC
The New Zealand Electronic Text Centre has four aims:
- To create a digital library providing open access to significant New Zealand and Pacific Island texts and materials. This encompasses both digitised heritage material and born-digital resources.
- To effectively partner with other organisations, as a collaborator and service provider, on a variety of digitisation and digital content projects.
- To build a wider community skilled in the use and creation of digital materials through teaching and training activities and by publishing and presenting the results of research.
- To work at the intersection of computing tools with textual material and investigate how these tools may be used to make new knowledge from our cultural inheritance.
Acting on these goals, NZETC is engaged in an ongoing programme of digitisation and hosts an expanding online library. The standards-based collection is delivered through an Open Source framework and offers full and free access to a range of materials in multiple formats for download or online browsing. Today the NZETC collection contains over 2,600 texts (around 65,000 pages) and receives over 10,000 visits each day. Information on the NZETC selection policy for digitisation is available here.
Since it was created in 2002, the NZETC has worked successfully with a range of partners on a variety of external projects. More information about those projects can be found here. As part of Victoria University of Wellington Library, one of the Centre's key relationships is with the wider University. A large part of the funding for the Centre is provided by the University, and the NZETC works closely with Library colleagues and academic staff to identify and deliver digital content which will support research within the institution. The NZETC actively collaborates with VUW staff and students on digital humanities research projects. NZETC staff undertake teaching work in digital resource management and electronic publishing, and support internships within the Centre for students wishing to learn about various aspects of digital text.
The NZETC is a founding contributing partner in the Matapihi project. We are active members of the National Digital Forum, the Text Encoding Initiative Consortium, and the Australia New Zealand Digital Encyclopedias Group.
News
To read the news on developments at the NZETC go to the NZETC blog. Here you can find out what texts have been added to the collection and what projects we have been working on. The blog also acts as an archive, with news stories going back to the creation of the Centre in 2002.
Subscribe to the NZETC Blog RSS Feed
Research papers and reports produced by NZETC staff can be found in VUW's ResearchArchive
Projects
Projects with partners within the University
- Each year students from the International Institute of Modern Letters work with the NZETC to produce Turbine, a literary journal, and the annual collection of Best New Zealand Poems
- The School of English, Film, Theatre and Media Studies has collaborated with the NZETC on several research and publishing projects including an electronic edition of the poetry of William Golder, and Kotare, an online journal of New Zealand Studies
- Tidal Pools is a joint project between the NZETC and Va'aomanū Pasifika to make available texts of interest to those researching Pacific islands history, language, culture and politics.
- The University Library sponsored a project for the School of Biological Sciences to digitise "Tuatara: The Journal of the Biological Society"
- The NZETC has worked with Victoria University Press on several projects, notably the creation of an online archive of the Sport literary journal
- The J C Beaglehole Room is an important partner for the NZETC and a major source of the heritage material that is digitised to form the online collection.
- Wai-te-ata Press, the NZETC and the J C Beaglehole Room collaborated on the Print History Project
- The NZETC is responsible for the development of an institutional repository for the University Library called ResearchArchive
Projects with external partners
In addition to internal VUW projects, the NZETC provides expertise and services to a range of heritage institutions, government departments and commercial organisations in the areas of cultural heritage digitisation and e-publishing.
- The NZETC has worked with the National Library of New Zealand and the Alexander Turnbull Library on several projects including Te Ao Hou and the Transactions and Proceedings of the Royal Society of New Zealand 1868-1961
- Auckland War Memorial Museum and the NZETC were awarded Community Partnership Funding in 2007 to digitise the Embarkation Rolls from the First World War
- The Australian State Library of Victoria worked with the NZETC to produce a full-text online archive of their La Trobe Journal
- Learning Media uses NZETC services to optimise the publishing processes
Technology
XML and TEI are the document mark-up standards which underpin the work of the NZETC. Information on TEI can be found through the Text Encoding Initiative. Other key technologies used at the NZETC include topic maps and XTM, XSLT, Apache Cocoon and Lucene. More information is given below.
Books, images, and collections are navigable through a dynamically-generated semantic framework, which represents the first release of a large-scale XML Topic Map (XTM) site in New Zealand. Users are able to move around the resources on the site tracking topics of interest rather than merely browsing the material linearly or through text searching. In a topic map, web-based resources are grouped around items called "topics", each of which represents some subject of interest. In the NZETC topic map, the topics represent books, chapters, and illustrations, and also people and places mentioned in those books.
Topics in a topic map are linked together with hyperlinks called "associations". There can be different types of association in a topic map, representing the different kinds of relationship in the real world. For instance, in the NZETC topic map, the topic which represents a particular person may be linked to a topic which represents a chapter of a book which mentions that person. This association would be labelled to indicate that it represents a "mention". Similarly, the same person's topic might be linked to a particular photograph topic, via a "depiction" association.
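The relationship between topics and typed associations can be sketched in a few lines of Python. This is only an illustrative model, not the XTM format or the NZETC's own code: the class names, the identifiers such as "person-1" and the sample names are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topic:
    """A subject of interest: a book, chapter, illustration, person or place."""
    id: str
    kind: str
    name: str


@dataclass(frozen=True)
class Association:
    """A typed link between two topics, e.g. a "mention" or a "depiction"."""
    kind: str
    source: str  # id of the topic that mentions or depicts
    target: str  # id of the topic being mentioned or depicted


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: set = field(default_factory=set)

    def add_topic(self, topic):
        self.topics[topic.id] = topic

    def associate(self, kind, source, target):
        self.associations.add(Association(kind, source, target))

    def related(self, topic_id, kind=None):
        """Return topics linked to topic_id, optionally filtered by association type."""
        found = []
        # Sort so that results come back in a stable order.
        for a in sorted(self.associations, key=lambda a: (a.kind, a.source, a.target)):
            if kind is not None and a.kind != kind:
                continue
            if a.source == topic_id:
                found.append(self.topics[a.target])
            elif a.target == topic_id:
                found.append(self.topics[a.source])
        return found


# Invented sample data: one person, mentioned in a chapter and shown in a photograph.
tm = TopicMap()
tm.add_topic(Topic("person-1", "person", "Example Person"))
tm.add_topic(Topic("chapter-1", "chapter", "Example Chapter"))
tm.add_topic(Topic("photo-1", "illustration", "Example Photograph"))
tm.associate("mention", "chapter-1", "person-1")
tm.associate("depiction", "photo-1", "person-1")

mentions = [t.name for t in tm.related("person-1", kind="mention")]
everything = [t.name for t in tm.related("person-1")]
```

Following associations from the person's topic leads to the chapter and the photograph, which is the kind of topic-to-topic navigation described above.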
To construct our topic map, we use XSLT stylesheets to extract metadata from each of our XML text files, and express it in the XTM format. In this way we automatically create hundreds of topic maps, each of which describes one of our texts. We also harvest information about people, places and organisations from an entity authority file which we construct from what is mentioned in our collection. Finally we merge the harvested topic maps together to create a unified topic map which describes our entire website.
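The harvest-then-merge approach can be illustrated with a rough Python sketch. Note the assumptions: it uses Python's standard xml.etree module in place of XSLT, it builds plain dictionaries rather than XTM, and the two sample texts are invented and far simpler than real TEI (no namespaces, a minimal header). The "key" attribute stands in for a hypothetical authority-file identifier.

```python
import xml.etree.ElementTree as ET

# Two invented, heavily simplified TEI-like texts.
TEXT_A = """<TEI><teiHeader><title>Book A</title></teiHeader>
<text><div type="chapter" n="1"><name key="person-1">E. Person</name></div></text></TEI>"""
TEXT_B = """<TEI><teiHeader><title>Book B</title></teiHeader>
<text><div type="chapter" n="1"><name key="person-1">Person, E.</name></div></text></TEI>"""


def harvest(text_id, xml_source):
    """Build a small topic map (topics plus associations) describing one text."""
    root = ET.fromstring(xml_source)
    topics = {text_id: {"kind": "book", "names": {root.findtext("teiHeader/title")}}}
    associations = set()
    for div in root.iter("div"):
        chapter_id = f"{text_id}-ch{div.get('n')}"
        topics[chapter_id] = {"kind": "chapter", "names": {f"Chapter {div.get('n')}"}}
        associations.add(("part-of", chapter_id, text_id))
        for name in div.iter("name"):
            key = name.get("key")  # hypothetical authority-file identifier
            topics.setdefault(key, {"kind": "person", "names": set()})
            topics[key]["names"].add(name.text)
            associations.add(("mention", chapter_id, key))
    return topics, associations


def merge(maps):
    """Merge per-text topic maps: topics sharing an id become a single topic."""
    all_topics, all_associations = {}, set()
    for topics, associations in maps:
        for topic_id, topic in topics.items():
            merged = all_topics.setdefault(topic_id, {"kind": topic["kind"], "names": set()})
            merged["names"] |= topic["names"]
        all_associations |= associations
    return all_topics, all_associations


topics, associations = merge([harvest("book-a", TEXT_A), harvest("book-b", TEXT_B)])
```

Because both sample texts refer to the same identifier, the merged map holds a single person topic that carries both name forms and is associated with a chapter in each book.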
Each page on the website represents one of these topics, along with any associated topics.
The Topic Map framework for the NZETC website was presented at the launch of the new information architecture on 5 May 2005. PowerPoint slides from the presentation are available.
Papers on the NZETC technical infrastructure are available through the Victoria University ResearchArchive
We use the open source TM4J Topic Map engine for merging and querying our topic map.
We use an XML publishing framework called Apache Cocoon to publish the NZETC website.
Cocoon is a Java servlet and hence it can be deployed on a wide variety of systems. We run Cocoon inside the Apache Tomcat servlet container (the official reference implementation for the Java Servlet specification), using JVM version 1.4 from Sun Microsystems.
Cocoon offers a flexible environment based on the separation of concerns between content, logic and style.
Cocoon can deliver documents in a variety of formats, including HTML, PDF, RTF, SVG, JPEG, PNG, and any other XML-based format. We have also integrated software to produce Microsoft's eBook Reader format.
We use Cocoon to transform our XML texts into readable documents using XSLT stylesheets.
Cocoon can perform these transformations on demand; i.e. when a request is received from a web browser. Each request is handled by reading the appropriate XML document or documents, and processing the XML data in a succession of stages, first applying logical, then presentational transformations. Each stage is distinct and can be effectively managed by different people. Our web designer can edit the look of the site, the web developer can edit the structure of the site, and the text-editors can edit the content of the site (the e-texts), all independently of each other.
To install a new text, the editors can simply upload the XML document and associated image files into the webserver via FTP. The document will then be automatically converted to HTML and divided into separate pages for each chapter, and scaled-down thumbnail versions of the JPEG graphics will be created using the XML graphics format SVG.
To change the overall look of the site, the web-designer can upload new design elements such as CSS stylesheets, new versions of the logo, navigation menu, etc., in the same way. When a document is displayed to the reader, the content will be automatically inserted into this new design.
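The idea of staged, on-demand transformation can be sketched in Python as follows. This is not Cocoon configuration or the NZETC's stylesheets: the function names, the sample document and the stylesheet file names are assumptions made for illustration, and plain Python functions stand in for XSLT transformations.

```python
import xml.etree.ElementTree as ET
from html import escape

# An invented, minimal source text with two chapters.
SOURCE = """<text><title>Example Text</title>
<chapter><head>One</head><p>First chapter.</p></chapter>
<chapter><head>Two</head><p>Second chapter.</p></chapter></text>"""


def logical_stage(xml_source):
    """Structure: split one document into a page of content per chapter."""
    root = ET.fromstring(xml_source)
    title = root.findtext("title")
    return [
        {"title": title, "heading": ch.findtext("head"),
         "paragraphs": [p.text for p in ch.findall("p")]}
        for ch in root.findall("chapter")
    ]


def presentational_stage(page, stylesheet="site.css"):
    """Look: insert one page of content into the current site design."""
    body = "".join(f"<p>{escape(p)}</p>" for p in page["paragraphs"])
    return (
        f'<html><head><title>{escape(page["title"])}</title>'
        f'<link rel="stylesheet" href="{escape(stylesheet)}"/></head>'
        f'<body><h1>{escape(page["heading"])}</h1>{body}</body></html>'
    )


def handle_request(xml_source, chapter_number, stylesheet="site.css"):
    """Run the stages in order for one request, as an on-demand pipeline would."""
    pages = logical_stage(xml_source)
    return presentational_stage(pages[chapter_number - 1], stylesheet)


html_page = handle_request(SOURCE, 2)
restyled = handle_request(SOURCE, 2, stylesheet="new-design.css")
```

Because the two stages only share the intermediate page structure, the design can change (here, a different stylesheet name) without touching the source text or the chapter-splitting logic.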
We use Lucene for searching. Lucene is a full-text search engine written entirely in Java, published by the Apache Software Foundation.
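Full-text search engines of this kind are built around an inverted index, which maps each term to the documents that contain it. The Python sketch below shows the general idea only; it is not Lucene's API, it omits ranking and analysis features, and the class name, document identifiers and sample texts are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class InvertedIndex:
    def __init__(self):
        # term -> set of ids of the documents containing that term
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(matches)


index = InvertedIndex()
index.add("text-1", "A journal of the voyage to the Pacific")
index.add("text-2", "Poems of the Pacific islands")

both = index.search("pacific")
poems_only = index.search("Pacific poems")
nothing = index.search("tuatara")
```

A search looks up each query term in the index and intersects the resulting document sets, so the cost depends on the number of matching documents rather than on the size of the whole collection.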
Services to External Partners
The NZETC provides expertise and services to other institutions, including commercial organisations, in XML, document conversion and repurposing, digitisation, Open Source e-publishing, metadata, digital imaging, and digitisation project development and management.
The income earned from commercial projects is used to support research and digitisation.
Digitisation, project-management and consultancy services
- Full imaging services
- Conversion from print to XML or other desired formats
- Expertise across a range of metadata standards and protocols, including TEI, EAD, Dublin Core, MADS, OAI-PMH, MARC
- Migration between metadata standards, dynamic creation of multiple metadata records
- Expertise in XML and XSLT (stylesheets for manipulating and delivering XML)
- Expertise in configuring proven open source XML-based platforms for managing and delivering digital content online. Features can include:
- Version control repository
- Multiple format delivery, accessible formats for visually impaired community
- dynamic thumbnail generation
- full text searching
- image galleries and searching
- topic mapping of resources (semantic web)
XML-based publishing solutions
- DocBook and TEI-based systems
- Maintenance of technical manuals and large catalogues
- Adding value to existing content with structured tagging
- Schema and taxonomy development
- XML for content exchange
- Migration between schemas
- Content management
- Stylesheet ( CSS / XSLT / XSL-FO) development for templating and transformation
- Single-source multi-channel publishing - Automated web publishing; Systems/database integration
People
The people at the NZETC come from a wide range of backgrounds - computing science, publishing, information management, literary scholarship, library science. This mix of skills and interests fosters a dynamic and creative environment.
NZETC staff have experience and internationally recognised skills in online publication of digital heritage materials using XML, semantic web technologies, and open source systems.
Alison Stevenson, Director (currently on maternity leave)
Jason Darwin, Project Manager and acting Director
Jamie Norrish, Analyst Programmer, Jamie.Norrish@vuw.ac.nz
Stuart Yeates, Lead Architect
Samantha Callaghan, Research Assistant
Edmund King, Research Assistant
Max Sullivan, Research Assistant
Jane Hornibrook, Research Assistant
Louise Grenside, Research Assistant
Copyright and Conditions of Use
The material made available through this website was processed by the New Zealand Electronic Text Centre of Victoria University.
Different materials in the collection are made available under different conditions.
All text and images are copyright to the original authors and/or publishers where a work remains in copyright. In such cases texts and images are made available for non-commercial use only and all forms of electronic or print re-sale or re-distribution are forbidden without written permission.
Where the original text is out of copyright it is our policy to provide the digitised version under a New Zealand Creative Commons Attribution Share-Alike License (CC BY-SA). This licence allows the sharing and transformation of the digitised text even for commercial reasons, as long as the NZETC is credited and users licence their new creations under the identical terms. This licence is often compared to open source software licences. More information on Creative Commons is available here. NZETC Creative Commons licensed texts are clearly labelled as such in the sidebar. Material not labelled as Creative Commons should be assumed to be in copyright and not available for sharing or re-use without written permission.
Copyright queries should be directed in the first instance to the New Zealand Electronic Text Centre.
Contact Information
Alison Stevenson, Director NZETC
Email: director@nzetc.org
Phone: +64 4 463 6847
Postal Address: New Zealand Electronic Text Centre, Victoria University of Wellington,
P O Box 3438, Wellington, New Zealand
NZETC Digitisation Selection Policy
Introduction
“One of the most important services performed by archives, libraries, and museums is selection, choosing from the many products of the living those few items which will best tell their stories. Digitization means that cultural caretakers will find themselves conducting yet another series of selections among collections that have been winnowed time and again”
North Carolina ECHO, 2005
“Considering the bourgeoning volume and heterogeneity of information on the web, selection and appraisal of resources for digitization is one of the most difficult tasks in the digital resources management life cycle”
Hartman et al., 2005
There are many aspects to the creation of a “content-rich New Zealand”. This paper focuses on the digitisation of heritage material1. New Zealand has significant stores of formal content held in local, regional and national institutions, ranging from manuscripts and printed material to film, video and sound recording. Much of this material is not in digital form. Unlocking this content through digitisation is important because it enables New Zealanders to access information about our histories, cultures, languages and identities – and tells our stories to the world.
New Zealand’s efforts to date in putting such content online have been sporadic and lacking in national oversight or coordination. The New Zealand government has now proposed through the Draft Digital Content Strategy that action be taken to “significantly increase the store of New Zealand digital content on-line through a nationwide digitisation programme of key local, regional and national content”2.
It is expensive to select, create, and maintain digital resources. There are limits to financial resources and to technical capabilities. It is not currently feasible to digitise everything, and intellectual property rights and cultural preferences mean that not everything should be digitised and made available online. A process of selection and prioritisation is required which takes account of these factors along with the value of the materials and the interest in their content. This process takes place to some degree in every institution or community embarking on digitisation work but it should also take place at a national level. A 2004 report on the piecemeal and uncoordinated approach to digitisation in the UK highlighted resulting issues including risk of duplication, use of diverse standards, lost opportunities for collaboration, lack of user awareness of existing resources and poor gap analysis3.
This paper articulates ideas about how the New Zealand Electronic Text Centre can select and prioritise material for digitisation and what criteria should be taken into account when doing so4. It is intended to be a resource both for NZETC staff and the NZETC Text Selection Advisory Group. This paper does not attempt to address other aspects of the work of the NZETC such as criteria for taking on commercial digitisation work or the decision making framework around selection of digital humanities research projects.
Statement of Principles
- The primary purpose of digitisation is to facilitate access. The aim is to enable people, regardless of location, to directly access content relating to New Zealand’s documentary and cultural heritage. A secondary purpose may be to preserve rare and fragile items, by providing digital surrogates of the items for use.
- The highest priority for digitisation is material relating to New Zealand and New Zealanders.
- As part of Victoria University of Wellington, the New Zealand Electronic Text Centre has a responsibility to develop an online collection which supports the University’s strategic objectives around teaching, learning and research. Selection of material for digitisation should therefore fit within the overarching principles of the VUW Library Collection Development and Management Policy.
- Special consideration needs to be given to the digitisation and online delivery of resources which are considered to be Mātauranga Māori.
Māori share with other indigenous peoples a legitimate concern and apprehension when uninitiated enter their cultural world. Not only is there a need for respect, but also for caution about the dangers inherent in ‘getting on the bandwagon but starting at the top’ without having first served an appropriate apprenticeship in learning about the culture, its history, cosmogony, customs and language. Too often, the lack of these attributes has led to subsequent misuse and even abuse of superficially acquired knowledge, thus reinforcing the reluctance of many Māori to share their knowledge with the uninitiated.5
These concerns must be addressed. There is a clear risk that if they are not, and if the majority of resources detailing aspects of Māori history, culture and language are therefore excluded from a nationwide digitisation programme, then part of the essence of New Zealand will be invisible to us and to the world.
The NZETC has developed a policy to cover the display of images of tupuna especially in relation to mokamokai.
- Digitisation has to take account of the provisions of the 1994 Copyright Act.
Selection Criteria6
Value
The value of the materials’ content and the benefits derived from access to digital versions justify the expenditure of time and effort of carrying out a digitization project. The content should have sufficient intrinsic value to ensure ongoing use by a defined constituency for a significant period of time.
Many factors contribute, but they include:
- intellectual content;
- historical significance7;
- rarity;
- importance for the understanding of the relevant subject area;
- broad or deep coverage of the relevant subject area;
- useful and accurate content;
- information on subjects or groups that are otherwise poorly documented;
- access to the material currently restricted due to its condition, value, vulnerability or location.
Demand
To justify the effort and expense, there should be a reasonable expectation that the product will have immediate utility for New Zealanders and/or other appropriate audiences. Thus factors to be considered might include:
- an active, current audience for the materials;
- advocacy for the project from part of the community;
- realistic expectation of attracting new users even if current use is low;
- requests from potential partners in collaborative or consortial efforts.
Note, however, that a 2005 paper looking at 21 digitisation projects for historical photograph collections cautions against using existing demand as the sole justification for digitisation:
“Criteria for selection are often made on the perceived needs of the targeted viewer. Hence there is a danger of producing a ‘turn-of-the-century view’ shaped, as one archivist interviewee put it, by ‘today’s trends for nostalgia’ rather than by online resources that will have sustainability over time. …The question here .. is one of authenticity and representation of historical material being accessed by the public”8
Non-Duplication
There is no identical or similar digital resource that can reasonably meet the expressed needs.
Collaborative Potential
The following factors could be considered:
- part of a collection split among a number of institutions that could be united online as a virtual collection;
- contribution to development of a "critical mass" of digital materials in a subject area;
- flexible integration and synthesis of a variety of formats, or of related materials scattered among many locations.
Enhancement of intellectual access
The following factors could be considered:
- enhancement of intellectual control through creation of new finding aids, links to bibliographic records, and development of indices and other tools;
- ability to search widely, manipulate images and text, and study disparate images in new contexts;
- widespread dissemination of local or unique collections.
Enhancement of resource quality
Improved quality of access to resource content, e.g., through improved legibility of faded or stained documents, enhanced images or restored sound quality through digitisation processes.
Preservation
While digitization does not in itself constitute preservation, there are preservation aspects to be considered, as the creation of digital surrogates will allow:
- significant reduction in handling of fragile materials;
- access to materials that cannot otherwise be easily used;
- protection of materials at high risk of theft or mutilation.
Technical Feasibility
Potential projects should be evaluated as to whether it is technically possible with current equipment and software to capture, present, and store digital resources in ways that meet user needs.
Considerations include:
- degree to which a digital version can represent the full content of the original;
- understanding of how people will use the digital versions and the level of quality that that implies;
- whether the materials will display well digitally;
- anticipation of future users with better equipment, to avoid a need to rescan in a few years;
- staff and resources to support programming, user interface design, and search engine development to assure that the project can fulfil the functions for which digitization is planned;
- long term storage requirements.
Materials that require special consideration include:
- materials that require unusually high resolution;
- materials for which fidelity to original colour is essential;
- oversize items;
- items with poor legibility;
- material with a complex graphic layout intertwined with text.
Intellectual Control Criteria
Potential projects should be evaluated as to whether appropriate intellectual control can be provided for the original materials and the digital versions:
- cataloguing, processing and related organizational work already accomplished or to be accomplished as part of the project;
- staff and resources to support creation of appropriate metadata relating to document identification, technical capture information, provenance, and easy navigation within the information resource;
- accordance with the provisions of the 1994 Copyright Act and any amendments to it.
Consideration of special requirements around traditional knowledge
“Although digitization is ideal for sharing, exchanging, educating and preserving indigenous cultures, it also creates ample opportunities for illicit access to and misuse of traditional knowledge. It is essential that traditional owners be able to define and control the rights and access to their resources, in order to uphold traditional laws; prevent the misuse of indigenous heritage in culturally inappropriate or insensitive ways; and receive proper compensation for their cultural and intellectual property. Finally, it is essential that indigenous communities be able to describe and contextualize their culturally and historically significant collections in their own words and from their own perspectives.”
J. Hunter, B. Koopman, J. Sledge, “Software Tools for Indigenous Knowledge Management”, Museums and the Web 2003
“A cornerstone of an Indigenous Digital Library is that the indigenous communities themselves control the rights management of their cultural intellectual property. Local cultural protocols need to be documented and followed prior to the creation of digital content, and communities must be consulted with regard to the digitization of content already gathered by institutions of social memory.”
Robert Sullivan, “Indigenous Cultural and Intellectual Property Rights”, D-Lib May 2002
Selected Bibliography of Digitisation Selection Policies
(Given in chronological order)
Selecting Research Collections for Digitization by Dan Hazen, Jeffrey Horrell, Jan Merrill-Oldham, 1998
University of Oxford Assessment Criteria for Digitisation, 1999
A Handbook for Digital Projects: A Management Tool for Preservation and Access, edited by Maxine K. Sitts, Northeast Document Conservation Center, Andover, Massachusetts, 2000
Columbia University Selection Criteria for Digital Imaging, 2001
DEF (Denmark’s Electronic Research Library) Final Report. National Digitisation Programme and Policy by Brian Robinson and Simon Tanner, 2001
North Carolina Echo (Exploring Cultural Heritage Online), 2005
National Library of Australia Collection Digitisation Programme 2006
Policy Review
Policy Created: August 2007
Policy Due for Review: August 2009
1 The creation and wide availability of accurate catalogues, indexes and finding aids to enable the discovery of content which is not digitised is another important piece of work required to improve access to heritage content. However this document is focused on digitisation of the content itself.
3 B. Bültmann, R. Hardy, A. Muir, C. Wictor, “Digitisation in UK Research Libraries and Archives: is a national strategy needed?”
4 This paper is a slightly revised version of a selection policy document prepared by the NZETC for the National Digital Forum.
5 M. Roberts, W. Norman, N. Minhinnick, D. Wihongi, C. Kirkwood, Kaitiakitanga: Maori Perspectives on Conservation, University of Auckland, 1995, pp 1–2
6 Based largely on the criteria developed and published by Columbia University Libraries.
7 For an expanded discussion on the idea of significance see D. Dorner, S. Young, “A Regional Approach to Identifying Items of National Significance Held by Small Culture Institutions: A Research Report”, 2004.
NZETC Privacy Policy
This website uses Google Analytics, a web analytics service provided by Google, Inc. ("Google"). Google Analytics uses "cookies", which are text files placed on your computer, to help the website analyze how users use the site. The information generated by the cookie about your use of the website (including your IP address) will be transmitted to and stored by Google on servers in the United States. Google will use this information for the purpose of evaluating your use of the website, compiling reports on website activity for website operators and providing other services relating to website activity and internet usage. Google may also transfer this information to third parties where required to do so by law, or where such third parties process the information on Google's behalf. Google will not associate your IP address with any other data held by Google. You may refuse the use of cookies by selecting the appropriate settings on your browser. By using this website, you consent to the processing of data about you by Google in the manner and for the purposes set out above.
The NZETC makes use of Google Analytics in order to evaluate the usage of our site, and this information is useful in allowing us to:
- determine which resources are heavily used, and so indicate areas that we should consider focusing future digitisation efforts upon;
- determine which resources are lightly used, and so indicate areas where we should consider improving navigation and promotion of these resources;
- measure the usage of particular resources so that we can provide feedback to those parties that are assisting us in making these resources available through financial or other support.
If you wish to opt out of cookies from Google you can do so on the Google site.
Should you like further information about this privacy policy please contact us.


