An ONIX Glossary of Terms


This glossary is from the ONIX for Books Specification + Best Practice Guide + Codelists posted to EDItEUR’s website. It appears as Appendix A.1 in the Best Practice Guide, and we have re-created it here for easy reference and repeat reading.

BookNet Canada has added to this glossary with terminology and contextual information specific to the Canadian market. You will find BookNet Canada entries to this document identified with a BookNet Canada addition or BOOKNET CANADA AMENDMENT tag alongside the heading.




A0, A4 etc

See A series paper sizes. A0 is 1m² in area (1189 × 841mm), A1 is half that, A2 is half again and so on. All sizes have width and height in the ratio 1:√2. A4 is 1⁄16m², and 297 × 210mm.


Author’s alterations, corrections made on proofs by the author or publisher. cf printer’s errors or literals, which are errors made by the typesetter.


Advanced Audio Coding. Improved codec for audio files to reduce file size or download time. Used to compress audio files in the iTunes store, but not unique to Apple. For a given amount of compression, AAC generally sacrifices a little less quality than MP3.


Anglo-American Cataloging Rules, second edition, widely used English language library cataloging rules. cf RDA. See also MARC21, the primary format in which library catalog metadata is transmitted (and often stored).


Abandoned, previously planned publication later canceled without ever having been published. cf NYP.


Content shortened by removal of text and minimal re-writing, occasionally termed ‘condensed’ (the latter implies a greater degree of re-writing).


Short summary of the contents of (for example) an academic paper, article or chapter. Journal publishers often provide free online access to abstracts, while access to the full text remains dependent upon subscription to the journal.

Abstract model

Generalized conceptual model of a real-world system, developed as a guide or aid to understanding the principles of that system. Often expressed as a series of generic entities (‘things’ – books, people, places, dates and so on), the potential relationships between them, and perhaps the events that may change those entities and relationships. See <indecs>, FRBR.


Author’s corrections, see AA.


See diacritical mark.


A book can be accessible to print-impaired people (eg blind, partially sighted or dyslexic readers, or readers with a physical disability). For print books, special accessible editions are often required. For e‑books, accessibility is not a special edition or feature, but a best practice for mainstream editions. Accessibility is also a consideration for the remainder of the supply chain, including retailer websites, library catalogs and e-reading devices. See also WCAG.

Accessible edition

Large print, Braille or specially-formatted audiobook (DTB) that can be used by print-impaired people who cannot use a conventional physical book.

Accession number

A unique identifier added to each item acquired by a library or archive, assigned as part of the acquisition process. The accession number is a proprietary item identifier, and two items with the same ISBN (ie two instances of the same manifestation) would have distinct accession numbers.

Acid-free paper

Higher quality paper with low lignin content, and treated so it is chemically neutral (pH 7.0) or slightly alkaline to avoid yellowing and deteriorating as quickly as normal paper. For the maximum longevity, the highest archival-quality paper contains about 2% calcium or magnesium carbonate as a chemical buffer to guard against future development of acid within the paper as it ages.


Beginning of the publishing process – the publisher agrees the contract with the creator and purchases the rights to publish a work.

The process of selecting, ordering, and receiving books and resources for a library or archive. See also accession number.

The purchase of rights to a product, range of products or (often) an entire imprint from one publisher by another. This usually includes transfer of any existing stock of the affected products. cf divestment of a product, range or imprint by the selling publisher.


Extra agreements or clauses added to the end of an existing contract.

Addition made to a book in a later printing or subsequent edition – for example a list of corrections (corrigenda), appendices or coda added to any section of a book.

Adhesive binding

Typical paperback (‘limp’) book binding using hot-melt adhesive applied to the roughened or notched spine of the book block to hold the pages or signatures and cover together. Also termed ‘Perfect binding’, ‘unsewn binding’.

Adobe RGB

Extended RGB colorspace, allowing a much wider range or gamut of colors on screen than standard sRGB. It is intended to cover – in RGB – almost all of the colors that can be printed using CMYK. See also DCI-P3, color profile.


Decision by a school or college, or by a consortium or educational authority, that a specific textbook will be included on a reading list or used to teach a course of study.


Sum paid by a publisher to an author (or other contributor) prior to publication. Often paid in parts, upon acquisition (agreement of the publishing contract), upon delivery of manuscript, and upon publication. The advance is paid against future royalty earnings, so the author does not receive any further royalty payments from the publisher until the advance has been ‘earned out’ (or ‘recouped’) at the agreed per-copy-sold royalty rate.

See advance copies.

Advance copies

(pl often just Advances), sometimes also known as review copies: early finished copies of a book, usually arriving before publication and used for publicity purposes, reviews, and occasionally for evaluation (eg for potential adoption – see approval copies) etc. Book proofs (which are bound but usually editorially unfinished) are also sometimes used in the same way.

A format

UK term for a paperback around 178 × 111mm in size, roughly equivalent to a US rack-sized mass-market paperback. See also B format, pocket book.


See literary agent.

Agency model

Business model based on the idea that a publisher sells to the consumer and is wholly responsible for setting the price. The retailer acts as an intermediary or agent to facilitate the sale and takes a fixed commission from the publisher; cf the more common wholesale or reseller model. Under the agency model, the publisher can directly control the ‘street price’ or Actual selling price of a book, whereas under the reseller model, street prices are usually at the discretion of the retailer – the retailer can even choose to sell the book at a loss to attract footfall – unless the legal framework provides for Fixed rather than Recommended retail prices. Reseller models where retail prices are not fixed thus encourage retailers to compete on price (possibly to the exclusion of other areas such as customer service) and put great pressure on margins. However, agency models are more complex for the publisher, and may be viewed as anti-competitive.


In metadata, an organization that collects metadata from many sources (mostly publishers), and re-distributes a combined feed to other metadata recipients. This service may enable or be provided alongside other services such as identifier registration, maintenance of a national bibliography or Books-in-Print database, retail sales reporting and so on.


Artificial intelligence, machine intelligence – broad term for the application of software to analyse data and make decisions based on that data aimed at achieving some predefined goal. Encompasses natural language processing, knowledge representation, reasoning, machine learning etc. In publishing and metadata, AI techniques might for example be applied to automated entity extraction (recognising names, places, concepts and so on) from the text of a book to create keyword lists or links to other books about the same entities, to sales prediction, or to abstracting and summarization.

See AIS.


Abstracting and indexing. cf AIS, AI.

Airside edition

Book only for sale in bookshops in the duty-free (or ‘airside’) area of an airport (cf ‘groundside’). Occasionally, special editions are produced specifically for airside retail outlets (though they are not editions in the proper sense used for P.9 Edition). At other times, airside retail outlets may sell products normally limited to export.


Advance Information Sheet, colloquially just ‘AI’, also called a Title sheet, a printed page of metadata about a book produced in advance of publication for sales and marketing purposes, including details of the title, author, ISBN, pub date, format, price, a description of the contents and marketing information. In effect, an ONIX Product record can be the digital equivalent of an AIS or title sheet.


The & character, meaning ‘and’ when used in text (typographically, it is related to the Latin ‘et’). In XML data, it must be replaced by ‘&amp;’.
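
As a minimal sketch of that replacement, Python's standard library can escape the character before the text is placed in XML. The publisher name below is made up for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical publisher name containing a raw ampersand
raw = "Smith & Jones Publishing"

# escape() replaces &, < and > with their XML-safe forms
safe = escape(raw)
print(safe)
```

Escaping should happen exactly once: running already-escaped text through the function again would also escape the ampersand that starts the replacement sequence.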

Answer code

Distributor or wholesaler’s brief response to an order or stock enquiry, for example indicating the product is NYP or OP. In ONIX, codes in List 65 (used in the <ProductAvailability> data element) are equivalent to answer codes.


Advertising and promotion, often the major concern of the marketing department. Not to be confused with P&A, price and availability.


Article processing charge, a fee charged to the author by an open access publisher to cover the cost of editorial work, production and distribution of a work. More common for articles published in open access academic journals, but a similar business model is also used by some publishers for monographs.


Application programming interface: a set of protocols, functions or services that one piece of software offers to another, used to share data in ‘real time’ – more or less instantly – between applications on a computer or between computer systems on a network. There are two common styles of network API protocol – SOAP and REST – which differ in the manner they encapsulate data in the request and response messages, and the messages are often passed over HTTP. See also web service.

Approval copy

Finished copies of a book sent to an educational institution or library for evaluation purposes, with a view to purchase or adoption. Approval copies may be sent speculatively or on request, and may be free of charge, or charged if not returned (ie on SOR terms). Also termed an Inspection copy, Desk copy or Evaluation copy.


Advance Reading Copy, see Advance copies.

AS2, AS3

Applicability Statement 2, 3 etc, IETF specifications covering the secure and reliable communication of business-to-business data on the internet using HTTP and digital signatures for document signing and encryption, with receipt confirmation returned after decryption and signature verification to guarantee successful delivery. Most used for EDI message delivery, but the content could be any standardised structured data (potentially including ONIX messages). AS3 uses the FTP protocol instead of HTTP.


Part of a lower case letter that lies above the top of the lower case x (ie above the x-height of the text), in letters such as b or d. cf descender, part of a lower case letter that lies below the baseline of the text, in letters such as g or y. Note that the relative sizes of the ascenders, x-height and descenders vary between typefaces.


American Standard Code for Information Interchange. Simple character set comprising 0–9, A–Z and a–z, plus a few basic symbols and punctuation characters. An ‘Ascii’ text file is one that contains plain text (‘words and spaces’) using only characters from this set. There’s no control over fonts, no formatting or styling (eg different point sizes, justification, bold or italic), and no accented characters, specialized symbols or fancy punctuation – ASCII does not even allow for proper curly quotation marks “ … ” or currency symbols like £ and €. Plain text. cf Latin‑1, Windows‑1252, Unicode.


A bibliographic collection to which someone other than the publisher, typically a metadata aggregator, assigns a collective identity. (For example, among the novels of Tony Hillerman, there are several that feature the same protagonists Joe Leaphorn and Jim Chee. The publisher does not give them a series identity, but in retailer databases they may carry an ascribed identity Joe Leaphorn and Jim Chee Series).

See collection.

A series

ISO standard cut sheet paper sizes, used almost everywhere except North America. A0 is 1189 × 841 mm – 1 square metre in area, with sides in the ratio of 1:√2. A1 is half that area (but the same shape), A2 is half again, and so on. 2A0 is twice the area of A0. A4 is 297 × 210 mm (1⁄16th of a square metre). RA and SRA raw paper sizes are roughly 5% or 15% larger in area than A series sizes, to allow for bleed and final trimming, so SRA4 is 320 × 225 mm. B series sheets are intermediate between the A series sizes (B1 is between A0 and A1). These ISO 216 A and B series paper sizes are not used in the USA and Canada, where Letter sized paper is more common than A4 for office use. Letter is 11 × 8½ inches (or about 279 × 216 mm), similar in area but ‘squarer’ in shape than A4. US Legal size is 14 × 8½ inches (about 356 × 216 mm), significantly taller than A4.
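
The halving rule can be sketched in a few lines of Python. This is an illustration only: the function name is made up, and it assumes each size is produced by halving the longer side of the previous one and rounding down to a whole millimetre, which reproduces the A0 and A4 figures quoted in this entry.

```python
def a_series(n):
    """Return (long_side_mm, short_side_mm) for size An, starting from A0."""
    long_side, short_side = 1189, 841  # A0 dimensions from this entry
    for _ in range(n):
        # The old short side becomes the new long side;
        # the old long side is halved (rounded down) to give the new short side.
        long_side, short_side = short_side, long_side // 2
    return long_side, short_side


for size in range(5):
    long_side, short_side = a_series(size)
    # The ratio stays close to the square root of 2 at every step
    print(f"A{size}: {long_side} x {short_side} mm, ratio {long_side / short_side:.3f}")
```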


Amazon Standard Identification Number, a proprietary identifier for products used internally within Amazon.


Advance Shipping Notification, message sent by printer, distributor etc, usually via EDI, to confirm imminent dispatch of books to the customer (ie to the distributor, wholesaler or retailer). cf GRN.


See RRP.

Aspect ratio

Ratio of width to height of an image, screen etc, for example 16:9 for a modern TV screen. See also portrait, landscape.


Automatic Stock Replenishment, business method whereby new copies of a physical book are automatically manufactured when warehouse stocks fall below a pre-set trigger level – the publisher does not need to generate an order each time. Often combined with short-run printing in a ‘little and often’ stock maintenance process.

Assistive technology

Software and devices such as text-to-speech (TTS) screen readers or Braille displays that make e‑books more accessible to print-impaired readers.


Typographical mark, a small star (‘ * ’), used in text to indicate the presence of an annotation, as a list item marker (instead of a • symbol), to indicate multiplication (instead of the proper × symbol), etc.


Typographical mark, usually of three stars or asterisks (‘ ⁂ ’) but often approximated by a row of three spaced asterisks, indicating a break in the flow of text.


In XML documents such as an ONIX message, text, numeric or other data contained within an opening markup tag (eg the dateformat attribute within <Date>). XML attributes usually carry information about how to interpret the data content of a data element. More generally, can be synonymous with ‘property’, ‘characteristic’, ‘data element’ or ‘data field’.
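
A short sketch of the difference between an attribute and the data content of an element, using Python's standard XML parser. The fragment is hypothetical, and the attribute value "00" and the date are placeholders chosen for illustration rather than taken from a real ONIX record.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one element with one attribute
fragment = '<Date dateformat="00">20240115</Date>'
element = ET.fromstring(fragment)

# The attribute lives inside the opening tag
print(element.attrib["dateformat"])

# The data content sits between the opening and closing tags
print(element.text)
```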


Verification of the identity of a person (eg via a login), product or process. cf authorization.


Person or corporate body responsible for the intellectual or artistic content of a book. Often specific to the writer of the textual content – a broader and more inclusive term is contributor.

Author’s copies

Free copies of a book given to the author (or other primary contributor) upon publication, as (usually) stipulated in the author’s contract. cf voucher copies.

Authority file

In library cataloging and bibliographic data, a central list of, for example, contributor names. Used to ensure that contributors can be identified unambiguously and to highlight the single preferred form of a name that might have various forms or spellings. Any particular name may appear in different forms on different books, eg with or without Dr., with ü, ue or u, yet the shared contributor number from the authority file would make it clear that the names identify the same contributor. Authority files also help differentiate different contributors who share a name, and optionally can be used to resolve the real people behind pseudonyms. See also ISNI, VIAF. More generally, an authority file forms a type of controlled vocabulary.


Verification of the permissions associated with a person or process, for example to access or change some information. cf authentication, on which authorization depends.


Advanced Video Coding, a format for compressed video data, also termed H.264 or MPEG‑4 part 10. Has superseded earlier and less technically sophisticated video compression schemes such as H.261, H.263, and is likely to be replaced by H.265 (HEVC).



Business-to-business – commercial transactions between businesses, such as between a wholesaler and a retailer, or between a publisher and a wholesaler. cf B2C.


Business-to-consumer – commercial transactions between a business and an end user (usually but not always an individual consumer). Also termed D2C (direct-to-consumer). cf B2B.


Backlist products are those that have been on sale for (typically) a year or more and are still available. In contrast, Frontlist titles are conventionally those less than a year old (and usually including forthcoming publications).

Back matter

Pages following the main content of a book, including appendices, bibliography, index, other notes and – possibly – any sample or ‘teaser’ material from other books, advertising pages and blank pages added to make up a convenient signature size. Back matter is also termed End matter and occasionally ‘postlims’. cf front matter.


See dues.


Two books in one, bound back-to-back, with the text of one upside down with respect to the other, so that it reads from the ‘back’ of the book. The two share a single spine. Sometimes termed a ‘turn-around book’, ‘tête-bêche’ or a flip book (but cf flick book). cf dos-à-dos binding, where two books are bound back to back without turning one upside down, so the foredge of one meets the spine of the other.

Backward compatibility

See compatibility.


In data communications, the amount of data that can be carried over a particular channel, usually measured in bits per second (or megabits per second – Mbit/s or Mbps). Transmitting a 10 megabyte file over a 1Mbit/s link should take around 80–100 seconds.
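
The lower end of that estimate comes from simple arithmetic, sketched here. The function name is illustrative, and the sketch assumes one megabyte is eight megabits and ignores protocol overhead, which is what pushes real transfers towards the upper end of the range.

```python
def transfer_seconds(size_megabytes, link_mbit_per_s):
    """Idealized transfer time: file size in megabits divided by link speed."""
    size_megabits = size_megabytes * 8  # eight bits per byte
    return size_megabits / link_mbit_per_s


print(transfer_seconds(10, 1))    # the example from this entry
print(transfer_seconds(10, 100))  # the same file over a faster link
```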


Machine-readable data printed as a series of black and white stripes on a product or on packaging. A conventional Bookland barcode on a book uses the EAN-13 barcode symbology and has the ISBN printed above the stripes, with the equivalent GTIN-13 at the bottom. The stripes represent the GTIN-13, not the ISBN (though for modern books, the two are the same number). Other types of barcode (different ‘symbologies’, with differing sizes and arrangements of bars) can appear on products, on cartons containing multiple products, on pallets, shipping labels etc, for example GTIN-12 (formerly known as UPC-A barcodes, but obsolete in the book trade since 2005), GTIN-14 and GS1-128 (SSCC barcodes). Barcode readers mostly use red light, so printing barcodes in color requires care to preserve adequate contrast.

Basis weight

See paper weight.

Berne Convention

International copyright agreement concluded between various European countries in 1886, since revised and extended to most countries. It provides for copyright protection of textual works from the moment of creation, and for the life of the author plus at least 50 years (many countries have a longer term), but also allows for some limited copyright exceptions. The 1967 revision of the Convention introduced the ‘three step test’ for judging exceptions – any exception must be limited and narrow in scope, must not conflict with normal exploitation of the work and must not unreasonably prejudice the interests of the author.

B format

UK term for a paperback around 197 × 130 mm in size, roughly equivalent to a US trade paperback. See also A format.


A relatively new abstract model and RDF-based data format for library bibliographic data, intended to replace MARC but at the same time also breaking with FRBR data modeling practice. Where FRBR offers work, expression, manifestation and item entities, BIBFRAME originally contained only work and instance – work conflated FRBR’s work and expression, and instance conflated manifestation and item. BIBFRAME version 2 introduces an item entity, making it much closer to the <indecs> work, manifestation and item structure (though a BIBFRAME work may be more abstract than an <indecs> work). BIBFRAME data is expressed in RDF using linked data principles.

Bibliographic collection BOOKNET CANADA ADDITION

A collection to which an identity is ascribed which is also part of the bibliographic description of each member (eg Penguin Modern Classics).

See collection.


Book Industry Communication, a UK-based trade organization.

In the ONIX and metadata context, the subject categorization scheme developed by BIC and used mostly in the UK, though close variants of the BIC scheme are used in some other European countries, for example CCE (Classificazione commerciale editoriale) in Italy. cf BISAC, CLIL, WGS, see also Thema. Schemes like BIC, BISAC, CLIL, WGS and Thema are intended for use in the book trade, and have little in common with library-focused subject classification or categorization schemes like Dewey Decimal, UDC or LCSH (Library of Congress Subject Headings).

BIC Basic

A bare-bones set of metadata elements enumerated by BIC, and forming part of the requirements for its data quality accreditation scheme. The elements may be communicated using ONIX, a flat file (eg CSV, tab-separated file), or another method.

Binder’s pack

See carton.


See books in print.


Book Industry Systems Advisory Committee. It was later incorporated into BISG, along with SISAC, its serials publishing counterpart. In the past, BISAC and SISAC have also been known as BASIC (Book and Serial Industry Communications).

More frequently, the schemes developed by the BISAC Subjects committee and administered by BISG and used mostly in the North American book trade. The schemes include: subject categorization, merchandising themes, and regional themes. cf BIC, see also Thema.


US-based trade organization Book Industry Study Group.


In computing, a binary digit – a single unit of information, either 0 or 1. Eight bits are usually combined into a byte. A byte of data might represent a single (integer) number between 0 and 255 (for mathematical convenience, a byte representing an integer between −128 and +127 is also common). But equally, a byte of binary data could represent a single text character (eg in the Latin‑1 character set), or a particular color for a pixel in an image (eg brightness of red in a single pixel within an RGB image), or any other type of information – including a programming instruction for the computer itself. This document comprises more than 3.5 megabytes (million bytes) of data.
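
To illustrate how the same eight bits can be read in different ways, the sketch below interprets one arbitrarily chosen byte as an unsigned integer, a signed integer and a Latin‑1 character.

```python
# One byte chosen for illustration: hexadecimal E9
byte = bytes([0xE9])

bit_pattern = format(byte[0], "08b")                  # the eight bits as text
as_unsigned = byte[0]                                 # integer in the range 0 to 255
as_signed = int.from_bytes(byte, "big", signed=True)  # integer in the range -128 to +127
as_text = byte.decode("latin-1")                      # a single Latin-1 character

print(bit_pattern, as_unsigned, as_signed, as_text)
```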


Sample pages of a book, produced in the form of a booklet for promotional purposes.


Print or printable area that extends beyond the trimmed page edge. Headline text or images can extend into the bleed area to avoid an unsightly edge when the book block is slightly mis-trimmed.


In ONIX, a special type of large composite that groups together all the data about a specific aspect of a product – Block 1 is all the main bibliographic data, Block 2 is marketing collateral, Block 3 is chapter-level metadata, and so on. There are currently seven blocks in a Product record, though (in specific circumstances) each is optional.

See also Organization of data delivery.


Metallic foil (often gold or silver colored) often used to ‘print’ the title, logo or decorative pattern on the spine or boards of a hardback book, or added for visual impact on a cover. It is applied with a heated stamping die.

Block update

See organization of data delivery.

Blu-ray disc

Optical disc developed to supersede the DVD, holding up to 25GB of compressed and DRM-protected high-definition video or other data. Dual and multi-layer variations can hold much more data, including ultra-high definition or 3D video.


Short descriptive text usually written by the publisher and used to promote the book. The blurb may be used in a catalog or on the back cover or jacket flaps of the book, and may include short quotations from favorable reviews or endorsements.


Stiff card (paperboard, fibreboard) used for the rigid covers of a hardback, or for the leaves of a ‘board book’. Generally more than 400gsm and 500µm (or 20 mils) in bulk.


Buy One, Get One Free, promotional offer where a retail customer receives one product free of charge when paying full price for a second (usually the cheaper of the two is free, if they are priced differently, and usually it’s any two selected from a range of books on offer). BOGOF is equivalent to a maximum 50% price reduction where both products have the same RRP, a little less if the products vary in price. 3-for-2 (equivalent to a maximum 33.3% price reduction) or buy one, get one half price (maximum 25% reduction) are more common.
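
The maximum reductions quoted above can be checked with a little arithmetic. This sketch assumes every book in the offer shares the same RRP; the value of 10 is hypothetical, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def discount(amount_paid, full_price_total):
    """Fraction of the full price that the customer saves."""
    return 1 - Fraction(amount_paid) / Fraction(full_price_total)


rrp = 10  # hypothetical RRP shared by every book in the offer

bogof = discount(1 * rrp, 2 * rrp)                      # pay for one, take two
three_for_two = discount(2 * rrp, 3 * rrp)              # pay for two, take three
half_price = discount(rrp + Fraction(rrp, 2), 2 * rrp)  # second book at half price

print(bogof, three_for_two, half_price)
```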


Template of standard clauses used to create contracts, for example between authors and publishers.


ONIX does not define what a ‘book’ is. Some legal systems set a minimum number of pages, below which a low-extent publication is a ‘pamphlet’ or similar, but within the ONIX framework this distinction is left to the data provider.

Book block

Part-bound book, with all the signatures gathered, bound and trimmed, but before the cover is added.


GTIN-13s are normally allocated nationally, with the first two or three digits indicating the country. Bookland is the fictional country to which the 978 and 979 prefixes used for ISBNs are assigned. In this way, the range of ISBNs becomes a small subset of the larger GTIN numbering scheme.
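
A minimal sketch of testing whether a GTIN-13 falls in the Bookland range. The function name and the sample numbers are illustrative; the test looks only at length and prefix and does not verify the check digit.

```python
def is_bookland(gtin13):
    """True if the 13-digit number carries one of the Bookland prefixes."""
    digits = gtin13.replace("-", "").replace(" ", "")
    return len(digits) == 13 and digits.isdigit() and digits[:3] in ("978", "979")


print(is_bookland("978-0-306-40615-7"))  # a book-style number
print(is_bookland("5012345678900"))      # a made-up number outside the Bookland range
```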


BookNet Canada is a non-profit organization that develops technology, standards, and education to serve the Canadian book industry. Founded in 2002 to address systemic challenges in the industry, BookNet Canada supports publishing companies, booksellers, wholesalers, distributors, sales agents, industry associations, literary agents, media, and libraries across the country.

Book proof

Paginated and bound proof copy, usually without the final cover and with text that still requires final corrections. Used for marketing and (sometimes) review purposes, as well as final proofreading and correction. See also advance reading copy, page proof.


Reference catalog or service providing aggregated metadata – both bibliographic and commercial – aiming to cover all books available in the market (in print or at least in commerce). Often compiled on a national basis, and used by book retailers, libraries etc. BIP services can usually provide information on OP and out of commerce titles too.


In databases, a data value that can be either True or False. (A third possible value – null or ‘unknown’ – is usually also an option.)

Bound proof

See book proof, advance reading copy.


See Set.

Brackets, Braces, Parentheses

Paired typographic symbols – brackets ‘[ … ]’, braces ‘{ … }’, parentheses ‘( … )’. In text, brackets are often used to surround sections of quotations that are not verbatim, and parentheses for subsidiary phrases or clarification. These are often called ‘square’, ‘curly’ and ‘round brackets’ in the UK.


Thickness of a sheet of paper, usually measured in microns (µm, thousandths of a millimetre). Typical book paper is around 90–120µm. In the US, often termed Caliper, and measured in mils (thousandths of an inch). See also paper weight. Strictly, the relationship between a paper’s weight and bulk is the density (mass per unit volume), and higher quality papers generally have higher density, but confusingly, a paper’s ‘density’ is often used as a synonym of the paper’s weight or grammage (ie mass per unit area) without regard to bulk or caliper.


See bit.

Byte order mark

In the UTF‑16 encoding of Unicode characters, each character is represented by two or more bytes of information. But these bytes might be in either order – something like saying either ‘seventy three’ or ‘three and seventy’. The latter could easily be misinterpreted as 37. A special character, a byte order mark, may be included as the first character in a Unicode file to make it clear which way around the rest of the file is. However, the strong recommendation in ONIX is to omit byte order marks, and to declare either UTF‑16BE (‘big endian’, like seventy three) or UTF‑16LE (‘little endian’, like three and seventy) explicitly in the first line of the XML file. A byte order mark is valid in UTF‑8 too, but it has no real meaning, and again should be omitted.
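
The effect of byte order can be seen by encoding the same short string in several ways. This is an illustrative sketch using Python's built-in codecs; the sample text is arbitrary.

```python
import codecs

text = "ONIX"

with_bom = text.encode("utf-16")          # generic codec: prepends a byte order mark
big_endian = text.encode("utf-16-be")     # explicit byte order, no byte order mark
little_endian = text.encode("utf-16-le")  # explicit byte order, no byte order mark

# The generic encoding starts with one of the two possible marks
print(with_bom.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)))

# The same first character, with its two bytes in opposite orders
print(big_endian[:2], little_endian[:2])
```

Declaring the byte order explicitly, as in the second and third encodings, is what makes the mark unnecessary.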



See bulk.

Canadian contributor BOOKNET CANADA ADDITION

An author, illustrator, translator or editor (in the case of an edited collection of material) who is a Canadian citizen or a permanent resident of Canada.

For Canadian market context, refer to the BookNet Canada documentation on


Abandon plans for publication before a book is published, see AB.

Removal and replacement of a page from a book, or the reprinted sheets for replacing canceled pages.


In XML, data modeling and database design, whether a data element or composite is optional or mandatory, and whether or not it is repeatable, within a particular DTD or schema. In the ONIX documentation, a cardinality of 0…1 means the element is optional, 1 means mandatory, 0…n means optional and repeatable, and 1…n means mandatory and repeatable. Cardinality is often a simplification of the full requirements of a schema or data model, since the requirements can be contextual – they depend on other data values. In ONIX, <ROWSalesRightsType> is 0…1 (ie optional), but in many circumstances is actually mandatory (it is dependent on the data provided in various <SalesRights> composites). Such contextual requirements cannot be expressed in the XML DTD or schema.
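
The four notations can be read as a minimum and a maximum number of occurrences. The helper below is a hypothetical sketch of that reading, not part of any ONIX toolkit, and it covers only the simple count, not the contextual requirements described above.

```python
def satisfies(cardinality, count):
    """Check an occurrence count against a notation such as '0…1', '1' or '1…n'."""
    minimum, _, maximum = cardinality.partition("…")
    low = int(minimum)
    if maximum == "":
        high = low          # a bare '1' means exactly once
    elif maximum == "n":
        high = None         # repeatable, no upper limit
    else:
        high = int(maximum)
    return count >= low and (high is None or count <= high)


print(satisfies("0…1", 0))  # optional element omitted
print(satisfies("1", 0))    # mandatory element omitted
print(satisfies("1…n", 5))  # mandatory, repeatable element given five times
```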


Box made of paperboard or corrugated fibreboard, and used by manufacturers to pack multiple copies of a book ready for distribution. Bulk shipments of books are packed in cartons and then stacked on pallets. A carton might hold anything between four and 100 or more copies, depending on the size of the book and carton. Retailers can order in multiples of this carton quantity (occasionally, case quantity) for convenience, though in general, orders for any number of copies are accepted. Sometimes called a binder’s pack.


Some scripts (eg Latin, Cyrillic, Greek) include distinct upper case and lower case variations of each alphabetic character, with lower case used for the majority of text and upper case used for initial characters in each sentence, on nouns, etc. Upper case (or capital, or majuscule) letters are so called because when moveable type consisted of small cast blocks of metal, the capitals were kept in a wooden box or case on the top shelf, and lower case (minuscule) letters were kept in a case on the lower shelf. Typographically, majuscule letters are all more or less the same height, whereas minuscule letters have variable height with ascenders and descenders. Other alphabetic and logographic scripts (eg Arabic, Devanagari, Hebrew, Hanzi and Kanji) do not maintain distinct cases.

The cover and spine boards of a case-bound book. cf slip-case.

Occasionally, a synonym for carton, as in the phrase ‘case quantity’.


Book bound with rigid board covers – a hardback. Not to be confused with a slip-case, a separate board ‘sleeve’ the book slides into. cf limp-bound, a paperback.

Cast off

Calculation of the likely number of typeset lines or pages from the number of characters or words in text and the line width, page height and type size.

CC, CC-By, CC Zero etc

See Creative Commons.


See BIC.

CD, Compact disc

Optical disc holding digital data – often digital audio data – developed by Philips and Sony, based originally on the CD Digital Audio ‘Red Book’ standard for high-quality audio (44.1KHz sampling rate, 16 bits per sample, two channel stereo), or 1411Kbits per second (cf compressed MP3 or AAC audio files at perhaps 128Kbits/s which sacrifice a little fidelity for much lower file size). Other CD standards allow up to about 700MB of ordinary data files to be stored on a disc. See also DVD.
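
The 1411Kbits per second figure follows directly from the Red Book parameters quoted above. A short worked calculation (the variable names are illustrative only):

```python
# Uncompressed CD audio bit rate from the Red Book parameters quoted above.
SAMPLE_RATE_HZ = 44_100   # 44.1KHz sampling rate
BITS_PER_SAMPLE = 16
CHANNELS = 2              # two channel stereo

cd_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
cd_kbits_per_second = cd_bits_per_second / 1000

# Ratio against a 128Kbits/s MP3 or AAC file (the example rate from the text).
compression_ratio = cd_kbits_per_second / 128

print(cd_bits_per_second)           # 1411200
print(round(compression_ratio, 1))  # 11.0
```

So a 128Kbits/s compressed file is roughly one eleventh the size of the equivalent uncompressed CD audio.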


Consumer direct fulfillment, see Drop shipment.


In dates, Common Era of the Gregorian calendar, secular equivalent to AD (anno domini). cf BCE, Before Common Era.

In product certification, the CE logo on a product is a declaration by the manufacturer that indicates it conforms to European Union legislation and directives, for example on product and materials safety.

In font names, CE usually indicates the font includes a repertoire of characters suitable for Central European languages such as Polish or Czech. These fonts often support the Latin-2 character set and encoding.

C format

Less common UK term for a paperback in a size more typically used for trade hardbacks – sometimes around 216 × 135mm in size (Demy), but equally often 234 × 153mm (Royal) or another size. More typically just termed a trade paperback.


Originally a small book or pamphlet of popular, sensational, juvenile, moral or educational content once sold by street merchants or peddlers known as ‘chapmen’. In modern use, may be almost any short booklet, often a children’s book. Occasionally (and probably wrongly) termed a ‘chapter book’.

Character encoding

See character set.

Character entity

Method of encoding non-ASCII characters in HTML, for example ‘&hellip;’ for an ellipsis, now largely unnecessary with widespread use of Unicode characters. While character entities were used with earlier versions of ONIX, they are not valid in ONIX 3.0.
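
Legacy text containing HTML character entities can usually be converted to plain Unicode characters before it is placed in ONIX 3.0 data. A minimal sketch using Python's standard html module (the sample text is invented for illustration):

```python
import html

# html.unescape converts named HTML character entities to Unicode characters.
text = "To be continued&hellip;"
decoded = html.unescape(text)

print(decoded)                 # To be continued…
print(hex(ord(decoded[-1])))   # 0x2026, the Unicode HORIZONTAL ELLIPSIS
```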

Character set

A defined list or repertoire of characters. A Character encoding then defines how this repertoire is represented by a computer. For example, ASCII lists a repertoire of 95 printable characters including space – plus a selection of non-printable ‘control characters’ including tab, new line and so on – and encodes them using the numbers 0–127 (or 00000000 to 01111111 in binary). Latin‑1 lists 191 characters, and encodes them using the numbers 0–255. Windows‑1252 is a different encoding of around 215 characters also using the numbers 0–255 – and obviously this means that if some text is encoded using Windows‑1252 and then displayed as if it were Latin‑1, some characters will be displayed wrongly or not at all. See also Unicode, a character set of more than 130,000 characters.
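
The Windows‑1252 versus Latin‑1 mismatch described above can be reproduced in a few lines. This is a sketch using Python's built-in codecs, where 'cp1252' and 'latin-1' are Python's names for the two encodings and the sample text is invented:

```python
text = "“Quoted” text costs €5"

# Encode using Windows-1252, then (wrongly) decode as if it were Latin-1.
as_cp1252 = text.encode("cp1252")
misread = as_cp1252.decode("latin-1")

# The curly quotes and euro sign come back as invisible control characters,
# while the plain ASCII letters and digits survive unchanged.
print(repr(misread))

# Decoding with the encoding that was actually used restores the text.
print(as_cp1252.decode("cp1252") == text)  # True
```

The bytes themselves are undamaged. Only the interpretation is wrong, which is why knowing and declaring the encoding matters when exchanging data.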

Check digit

Many identifiers include a numerical check digit, calculated arithmetically from the other digits. For an ISBN, for example, calculating what the check digit should be based on the first twelve digits, then comparing it with the actual check digit indicates whether the ISBN is likely to be correct, or whether an error has been introduced – a mistyped digit, two digits transposed etc. Different identifiers use different mathematical procedures (or ‘algorithms’) for calculating the check digit.
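
For a 13-digit ISBN, the check digit is calculated by weighting the first twelve digits alternately by 1 and 3, summing, and taking the amount needed to reach the next multiple of ten. A minimal sketch (the function names are illustrative, not from any particular library):

```python
def isbn13_check_digit(first_twelve: str) -> int:
    """Calculate the ISBN-13 check digit from the first twelve digits."""
    if len(first_twelve) != 12 or not (first_twelve.isascii() and first_twelve.isdigit()):
        raise ValueError("expected exactly twelve digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the first digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first_twelve))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Compare the calculated check digit with the one actually present."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(isbn13_check_digit("978030640615"))    # 7
print(is_valid_isbn13("978-0-306-40615-7"))  # True
print(is_valid_isbn13("978-0-306-40615-8"))  # False: mistyped final digit
print(is_valid_isbn13("978-0-306-40651-7"))  # False: two digits transposed
```

A matching check digit only shows the ISBN is likely to be well formed. It does not show that the ISBN has been assigned, or that it identifies the intended book.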

Chicago Manual of Style

Widely used guide for spelling, punctuation, grammar and typographic style in American English, derived originally from guidelines set by the University of Chicago Press. cf Hart’s Rules.


Commission Internationale de l’Eclairage, the body responsible for colorimetry standards, against which colorspaces such as Adobe RGB, sRGB or device-specific color profiles are characterized.


Cataloging in Publication, limited bibliographic information produced by national libraries prior to publication of a book, based on information supplied in advance by publishers. The CIP information is often printed within the book itself on the title verso page.


International Cooperation for the Integration of Processes in Prepress, Press, and Postpress Organization, a standards organization focusing on process automation and improved workflow in the print industry. CIP4’s key technical standard is JDF (Job Definition Format), an XML messaging system used in print production.


Chinese Library Classification, library subject classification scheme used in China. See also UDC, DDC, LCC.


Commission de Liaison Interprofessionnelle du Livre, a French book trade organization.

In the context of ONIX and metadata, Classification CLIL is the subject classification developed by the Commission and used widely in the French book trade. See also Thema.


Term for a hardback/hardcover book, generally only used in a bookbinding or specialist publishing context (‘…in cloth’). Also the textile or faux material used to cover the boards of a hardback.

Cloth book

See rag book.


Content management system, system used to manage the creation and editing of material destined for publication (in a book, or on a website).


Cyan, magenta, yellow, key (or black), basic subtractive color model used to simulate (at least in theory) the full range (or gamut) of visible colors in color printing with just four colored inks (and halftoning). In practice the range of printable colors is more limited than the full visible range. See also RGB.


Coder–decoder. Loosely, the compressed file format used to store files containing image, audio or video data. Since such files can often be very large, the data is compressed mathematically. JPEG is a lossy format for image data, whereas TIFF is lossless. AAC and MP3 are two different lossy formats for audio data. These are referred to as ‘different codecs’. More strictly, the codec is the software used to compress or decompress the data in a particular format.


Term used in ONIX documentation for a controlled vocabulary or authority file. In addition, codelists define a language-independent notation – the code – for each term or concept in the vocabulary. Only the code appears in ONIX data. Some codelists have an implied hierarchy (for example list 150, where BA is clearly a broader term than BB or BC), making some lists taxonomies rather than simple vocabularies. An interactive codelist browser is available at See also SKOS, keyword.


A book – printed or manuscript content arranged in discrete pages, bound down one edge (the spine), or occasionally folded accordion-style. cf scroll.


See co-publication.


Process of sorting, the sort order or procedure used, or the process of checking items have been sorted correctly.


Fixed or indefinite number of products that share some collective identity such as a collective title. Members of the collection usually also have other attributes in common, such as product form or a branding or design style. A set or a series is a collection, but a collection could also comprise a less formal selection of products.


Logo of publisher or imprint on the spine or title page of a book.

A statement about production details such as typeface, paper grade and binding printed on the title verso or at the end of the book.

Color gamut

The range of colors available within a particular colorspace (for example, within sRGB) or on a particular display or printed page (see CMYK), often measured against the full range of visible colors (or ‘chromaticities’) as defined by the CIE.

Color profile

Extra metadata embedded inside an image file that specifies what colorspace the image is ‘in’ (JPEG, TIFF, PNG images can carry ICC color profiles – but not all image formats support the inclusion of profiles). The profile relates real-world colors to digital color values, and an ‘input profile’ is a characteristic of the camera, scanner etc used (or of the software used to generate or manipulate the image). The color profile defines a device-specific ‘colorspace’. A second profile – an ‘output profile’ – belonging to the printing system or display device can be combined with the digital image and its embedded input profile to compensate for less than ideal color accuracy in both input and output and thus present the final printed or on-screen image as close as possible to the original real-world color. Profiles can also be used to convert between device-specific colorspaces and idealised standard colorspaces like sRGB, Adobe RGB or CMYK. See also CIE.


In an IT context, the ability of a system to interoperate with other systems. Backward compatibility is the ability of a system to accept and correctly process input intended for older systems. A data format is backward compatible with its predecessor if data that is valid under the old format remains valid and retains its meaning under the new format. Forward compatibility is the reverse, the ability of a system to accept and process input in a format intended primarily for later versions of itself, although of course it may not be able to extract all data from the newer format.


Part of a product – a single volume of a multi-volume set (when sold as a single product), a single CD in a multi-disc audiobook, a book in a book plus toy bundle. cf item, though note that a component of a multi-component trade pack can become a product or item in its own right when sold at retail.


In ONIX, a sequence of XML data elements can be nested inside another pair of start and end tags, forming a ‘composite’ that emphasizes the logical structure of the data. For example, all the data about one contributor is nested between <Contributor> and </Contributor> – inside a <Contributor> composite. In effect, the ‘data’ inside the composite data element consists of other data elements. Many such composites are repeatable, for example if there are multiple contributors.

An example can help. The repeatable <ProductIdentifier> composite contains two mandatory data elements: <ProductIDType>, a code from ONIX Code List 5 that states what kind of identifier is being given, and <IDValue>, the identifier itself. Code “15” means ISBN-13, in which case <IDValue> contains the ISBN. The composite repeats, but each code is used only once per product: you only need give the book’s ISBN-13 once – because it only has one ISBN – but if you include the GTIN-13 as well (and you should), then that takes a second <ProductIdentifier> composite.

ONIX is structured data so, in addition to repeating, composites can appear in multiple positions within an ONIX record. Using the ONIX record’s XML tree to illustrate:
Product / Product Identifier
contains the Product Record’s ISBN, but Product Identifiers can also appear in
Product / Related Materials / Related Product / Product Identifier
where the same composite carries the same codes, but is now contained within a repeating Related Product composite that is defined by its <ProductRelationCode>. The Product Identifier composite uses the same codes to identify the ISBN-13, but the value represents a different book whose relationship to the Product Record is defined by the Product Relation Code.
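
The two positions described above can be seen by reading a small ONIX-like fragment. This is a deliberately simplified sketch: the fragment is not a complete or valid ONIX record (mandatory elements and the ONIX namespace are omitted), the ISBNs are sample values, and the tag names and codes (03 for GTIN-13, 15 for ISBN-13, 06 for an alternative format) are taken from the ONIX 3.0 reference names and codelists as an assumption to be checked against the current documentation. The helper function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative fragment only; not a complete ONIX record.
ONIX_FRAGMENT = """
<Product>
  <ProductIdentifier>
    <ProductIDType>03</ProductIDType>
    <IDValue>9780306406157</IDValue>
  </ProductIdentifier>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780306406157</IDValue>
  </ProductIdentifier>
  <RelatedMaterial>
    <RelatedProduct>
      <ProductRelationCode>06</ProductRelationCode>
      <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9781861972712</IDValue>
      </ProductIdentifier>
    </RelatedProduct>
  </RelatedMaterial>
</Product>
"""


def identifiers(parent):
    """List (type code, value) pairs for composites directly inside parent."""
    return [
        (pid.findtext("ProductIDType"), pid.findtext("IDValue"))
        for pid in parent.findall("ProductIdentifier")
    ]


product = ET.fromstring(ONIX_FRAGMENT)

# Identifiers of the product the record describes (the composite repeats).
own_ids = identifiers(product)

# The same composite again, but identifying a different, related product.
related = [
    (rp.findtext("ProductRelationCode"), identifiers(rp))
    for rp in product.findall("RelatedMaterial/RelatedProduct")
]

print(own_ids)
print(related)
```

The same <ProductIdentifier> composite and the same code “15” appear in both places. What differs is the position in the tree, which determines whose ISBN the value is.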


Data compression is the mathematical process of reducing the size of a file, for example by eliminating repetition and redundancy. Different compression methods are either lossy or lossless. Lossless compressed files can be expanded back to reconstitute the exact original data, but lossy compression – often used with image, video or audio files – discards less important sounds or image detail to make the compressed file even smaller, so the re-expanded file is only approximately the same as the original. In practice, the difference may be invisible or inaudible, but lossy compression is obviously unsuitable for use with text or numerical data. AAC and MP3 are lossy audio codecs, JPEG is a lossy image codec and AVC is a lossy video codec. TIFF is (almost always) a lossless image file format, WAV is lossless audio, and Zip losslessly compresses any file (but worth knowing that zipping a file that is already compressed – like zipping a JPEG, for example – does not usually make it any smaller…). See also codec.
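
Both points about lossless compression (exact reconstruction, and little or no gain from compressing twice) can be demonstrated with Python's standard zlib module, which uses the same family of compression as Zip. The sample data is invented, and the exact sizes printed will depend on the zlib version in use.

```python
import random
import zlib

# Build some repetitive but not trivially regular sample text.
rng = random.Random(0)
words = [b"book", b"author", b"title", b"price",
         b"series", b"imprint", b"edition", b"rights"]
original = b" ".join(rng.choice(words) for _ in range(20_000))

once = zlib.compress(original)    # lossless compression
restored = zlib.decompress(once)  # expands back to the exact original
twice = zlib.compress(once)       # compressing already-compressed data

print(len(original), len(once), len(twice))
print(restored == original)       # True: nothing was lost
print(len(twice) >= len(once))    # True: the second pass gains nothing
```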


See sale or return.


In typography, a typeface design that is tall and narrow, or narrower than usual for a particular family of typefaces, so more characters fit on a single line of text.

See abridged.


Person or organization – more generally, the party – responsible for creating the intellectual or artistic content of the product. ONIX is usually only concerned with contributors named on the product itself, and then only with their outward-facing public identity or persona. The publisher acquires rights to exploit the intellectual or artistic content created by the contributor, in return for fees or a royalty.

Controlled vocabulary

An exhaustive list of terms that can be used in a particular context or data element. Each term in the vocabulary should have an unambiguous definition or explanation, and may include both preferred terms and less-preferred synonyms. Controlled vocabularies may be a ‘flat’ list of terms, or the terms may be arranged hierarchically – in which case the vocabulary is sometimes called a taxonomy. See also codelist, authority file.


Co-operative marketing, arrangement whereby the publisher subsidizes or pays for advertising and promotional activities (A&P) carried out by a retailer.


When two publishers co-operate to publish a book, they may create and sell a single product (a co-publication), or may create a shared ‘base’ from which two different products are derived and sold (these are Co-editions). A co-publication may carry the branding for both publishers, and may even carry two separate ISBNs (one from each publisher), but it is a single product. In contrast, co-editions have separate identities (including separate ISBNs for each publisher’s version), even though they might sometimes differ by no more than the branding and the publisher details. More often, the shared base for a co-edition may consist only of the color images, to which each publisher adds entirely separate text – this type of co-edition is very common in multi-language illustrated books, as it offers significant savings in shared production costs.


A play on conventional copyright: a licensing arrangement whereby a work (most often computer software) may be copied, used, re-distributed, adapted or modified, free of any restrictions, on condition that anything derived from it is also free of the same restrictions and bound by the same condition. Some Creative Commons licenses are copyleft licenses.

The exclusive legal rights to perform, display, reproduce, sell, modify, adapt or otherwise use original work or other intellectual property that is expressed in text, images, sound – a right enshrined in the ‘Berne Convention’, originally agreed in 1886 but revised and updated several times since – most recently by the Marrakesh Treaty. The copyright in a work is held by the author or creator, and can subsequently be passed on (eg to the author’s estate), or licensed or assigned to publishers (and others) in a contract. In most jurisdictions, copyright (which is essentially a commercial right) is accompanied by inalienable Moral rights such as the right to be identified as the author, and protection for the integrity of the work. Unlike rights over other intellectual property such as a patent or a trademark, copyright is automatic – you don’t need to register your work to gain legal protection, though in some countries, registration can be beneficial and in others, the Moral rights must be explicitly asserted (for example, within the work itself). Copyright in a new textual work usually persists for 70 years after the death of the original creator (occasionally slightly more), and limits exploitation of the work by those other than the copyright holder, licensees or assignees (collectively, rightsholders). The term of copyright has varied significantly across different countries during the last century, so 70 years after death is not always correct for older works. In some countries, Moral rights expire alongside the commercial copyright. In others, they are perpetual. After expiry of the commercial rights, the work passes into the public domain. Certain groups, eg print-impaired readers, may hold a legal copyright exception and can make copies for personal use without obtaining permission from the rightsholders. 
Other uses of copyright material may also be allowed without explicit permission (eg for purposes of education, research, parody, for review and criticism, for digital backups etc) under ‘fair use’ or ‘fair dealing’ provisions of national law, but the scope and detail of these exceptions vary from country to country (see also the ‘three step test’).

See copyright.

Also termed the Imprint page. In a printed book, the title verso on which the copyright notice, publisher and imprint details, the ISBN and impression number, CIP and other details about the publication of the book usually appear. In an e-publication, this is often added at the end of the book.


Camera-ready copy, largely obsolete term for typeset or graphical material ready for photographic transfer to a printing plate.

Cyclic redundancy check, a number calculated according to some mathematical algorithm and appended to digital data as it is stored or transmitted, to enable later detection of errors. On receipt or retrieval of the data, the algorithm is recalculated and compared – any difference indicates an error. The concept is similar to a check digit within an identifier.
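
A minimal sketch of the store, recalculate and compare cycle, using the CRC-32 function in Python's standard zlib module (one common CRC algorithm among several). The payload and the helper function are illustrative only.

```python
import zlib

payload = b"<IDValue>9780306406157</IDValue>"

# The sender calculates the CRC and stores or transmits it with the data.
stored_crc = zlib.crc32(payload)


def is_intact(data: bytes, expected_crc: int) -> bool:
    """Recalculate the CRC on receipt and compare it with the stored value."""
    return zlib.crc32(data) == expected_crc


# Simulate an error in storage or transmission: one character changes.
corrupted = payload.replace(b"9780306406157", b"9780306406151")

print(is_intact(payload, stored_crc))    # True
print(is_intact(corrupted, stored_crc))  # False: the error is detected
```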

Creative Commons

Organization that develops copyright licenses that permit and encourage free sharing of creative works. Some CC licenses are copyleft (in particular the SA ‘Sharealike’ variants), whereas others reserve some rights to the creator (eg CC-By reserves the right to be credited as the author). The related CC0 (‘CC Zero’) waiver waives all possible rights (including the right to place – or prevent placement of – licensing restrictions on derived works). However, CC Zero cannot waive certain inalienable moral rights. See also Open Access, Public Domain.

Cross grain

In US, against the grain. See long grain.


Customer relationship management, integrated management of an organization’s customer and product support and other interaction with its customers and potential customers across multiple channels (eg via phone, website, e‑mail, social media, and advertising and marketing activities), and more narrowly, the IT systems to support and analyse those business processes.


Corporate social responsibility, ethical principles, policies and actions of an organization that promote social or environmental good, going beyond legal requirements, via corporate philanthropy, responsible and sustainable business and supply chain practices, cause-related and social marketing, etc.


Cascading Style Sheets, W3C standard for markup used alongside HTML to control the appearance of web pages (where the HTML markup largely defines the structure). CSS is also used in EPUB, which is also HTML-based.


A Comma-Separated Value data file consists of tabular data (rows and columns) with each value stored as ordinary text, columns separated by commas and rows by line breaks. If a value itself includes a comma, then the value is enclosed in quote marks (in some files, all values are enclosed in quotes). CSV files are often a last-resort data exchange format: they are easy to read and write with a spreadsheet application (eg Microsoft’s Excel), but it’s not always clear whether a value might consist of multiple lines of text, how quote marks themselves might be encoded, whether the table includes column headers (field names), and what character set or encoding should be used. See also tab-separated file.
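
The quoting behaviour described above can be seen with Python's standard csv module. This is one common convention (values containing commas or quote marks are enclosed in quotes, and embedded quote marks are doubled), not a rule that every CSV file follows. The sample rows are invented.

```python
import csv
import io

rows = [
    ["isbn", "contributor", "title"],  # header row with field names
    ["9780306406157", "Smith, Jane", 'The "Big" Book'],
]

# Write the rows as CSV text; newline="" lets the csv module control line ends.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back with the same conventions recovers the original values.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed == rows)  # True
```

The round trip only works because writer and reader agree on the conventions, which is exactly what cannot be taken for granted when a CSV file arrives from a third party.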



See B2C.


Digital Asset Distributor, organization that facilitates distribution of digital assets such as e‑books to online retailers and libraries on behalf of the publisher. The service may encompass a managed asset repository, file format conversion services, metadata and e‑book distribution, and aggregation of sales statistics.


Typographic symbol ‘ † ’ sometimes used to indicate footnotes in text. Also comes in double dagger (‘ ‡ ’) flavor.

DAISY Consortium

Digital Accessible Information System, a consortium of organizations working to promote ‘inclusive publishing’ and the availability of accessible editions of books to all, including print-disabled readers. See also DTB, EPUB.


A Digital Asset Management system manages the ingestion and storage of digital assets, their cataloging and metadata, search and retrieval, and sometimes distribution. DAMs can be structured like a library aimed at simplifying the reuse of assets, or as a workflow tool forming part of a production system.


Common typographic dashes come in different lengths, -, – and —. Hyphens (the shortest) are used to connect compound words or split words at the end of a line in justified text. En dashes (the middle size) can be used for parenthetical phrases – usually like this, with spacing – or unspaced to indicate a range such as A–Z or 1–9. Em-dashes are used for parenthetical phrases—usually like this, without spacing—or to indicate an abrupt halt or discontinuity in a sentence. An em dash is about one em long. In some languages and typographic styles, em dashes or a slightly longer ‘quotation dash’ are commonly used for dialogue, in place of opening quotation marks. And in maths, the subtraction sign ‘−’ is about the size of an en dash, but is often matched to the widths of the digits.


An organized collection of data. Modern databases are normally electronic, often with tables of data arranged in columns and rows, and relationships between tables (see relational database). The data is organized to model aspects of the real world, and to support various business processes through manipulating that data.

Data element

In XML documents such as an ONIX message, a single XML tag and its content – text, numeric or other data contained between a pair of markup tags. Sometimes loosely termed a ‘data field’. cf attribute, composite.

Day and date

Movie industry term for simultaneous release of linked media properties, as when releasing a tie-in book carefully timed to publish on the same day as the film opens in cinemas. Such releases are usually embargoed to prevent early sales from bookshops. cf windowing.


Distribution center, a distributor or wholesaler’s warehouse.

In metadata contexts, see Dublin core.


Extended RGB colorspace, allowing a much wider range or gamut of colors on screen than standard sRGB. Allows more detail in red colors, but not as much in green as Adobe RGB. See also color profile.


Demand-driven acquisition, also termed PDA, patron-driven acquisition, where libraries have instant access to a complete catalog of e‑books, without having to purchase them up-front. Purchase or licensing of a particular e‑book is triggered automatically when actual usage of the book exceeds an agreed threshold (eg a library patron reading more than a certain number of pages).


Dewey Decimal Classification, common subject classification system used in libraries, named for its creator Melville Dewey. See also UDC, LCC, CLC.

Deckle edge

Page edges of a book block left rough and untrimmed, or more likely, carefully trimmed to make them look rough and untrimmed. Also termed a Rough front.


Character or character sequence used in data files to separate one discrete data element from the next. CSV files use commas as delimiters. XML uses angle brackets (< and >) as separators between data and markup. Within a few ONIX data elements such as <CountriesIncluded>, a space is used as a delimiter in a list of country codes. There is a clear problem whenever a delimiter character occurs within the data itself – this is why XML data must use &lt; to represent the < symbol within the data.
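
Two of the cases above in a short sketch, using Python's standard library: escaping a delimiter character that occurs within XML data, and splitting a space-delimited list of country codes. The sample values are invented.

```python
from xml.sax.saxutils import escape, unescape

# A '<' (or '&') inside the data must be escaped so it is not read as markup.
raw = "Children < 5 & adults"
escaped = escape(raw)
print(escaped)                   # Children &lt; 5 &amp; adults
print(unescape(escaped) == raw)  # True

# A space-delimited list, as used within <CountriesIncluded>.
countries_included = "CA US MX"
codes = countries_included.split()
print(codes)                     # ['CA', 'US', 'MX']
```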

Delta update

See organization of data delivery.

Demy, Demi

Common hardback book size in the UK, typically around 216 × 135mm. Pronounced as in ‘deny’. See also Royal, trade paperback.


See bulk.


Deprecated data elements or codes are not recommended for use, and in general are strongly discouraged. In most cases, another element or code is recommended instead. Deprecation implies obsolescence, but the element or code does remain technically valid.

Derived work

See work.


See ascender.

Desk copy

See approval copy.


See DDC.

Diacritical mark, diacritic

Small decoration or accent added to a letter in Latin and other scripts, which modifies the pronunciation of the letter. Diacritics can also indicate the presence of unwritten vowels (eg in Arabic script), or indicate tones (eg in Chinese pinyin) or prosody. Common Latin accents include acute, grave, circumflex, háček, ring, tilde, cedilla etc, but there are many other types in other writing systems and languages. The effect of diacritics on alphabetic sorting varies by language – some (eg French) ignore accents for sorting purposes, others (eg Swedish) treat accented characters as whole new letters at the end of the alphabet.


Device for scoring, folding, cutting, stamping or embossing paper or card during manufacture.

Dieresis, diaeresis

Diacritical mark indicating (in English pronunciation) a vowel that is pronounced separately from the adjacent vowel in what would otherwise be a diphthong (two vowel characters indicating a single vowel sound). For example, a dieresis is used in ‘naïve’. A dieresis has roughly the opposite effect from a vowel ligature (which indicates they are pronounced as a single sound), but both have fallen out of common typographical use. The same ‘two dots’ symbol is more commonly used in Germanic languages for an umlaut, which has a different effect on pronunciation, for example ‘schon / schön’.

Digipack, Digipak

Folded card packaging with a distinct spine and one or more plastic trays affixed to hold a CD or DVD, as distinct from card sleeve packaging or plastic jewel or keep cases.

Digital asset distributor

See DAD.

Digital certificate

In cryptography, a digital certificate or digital signature is a data file that securely identifies a particular web server or (less often) a legal entity such as an organization. On the internet, a certificate is used in the first stage of creating a connection via HTTPS, so you can be sure the server you are interacting with really is EDItEUR’s web server and can therefore trust the information it provides. It also allows the communication to be encrypted and private. Certificates can also be used to validate identities in e-mail exchanges, and ensure the integrity of software applications.


Also Retailer discount, Trade discount. In some countries, books have established wholesale or business-to-business prices. In others, business-to-business transactions are based on a discount from the fixed or recommended retail price agreed by the parties. The discount given by a distributor or wholesaler varies from retailer to retailer (bigger retailers sometimes get more discount) and from book to book (discounts are often greater on mass market fiction than on specialist non-fiction). Assuming the books are sold to end customers at their normal retail price, the trade discount represents the retailer’s gross margin (sales revenue minus cost of goods sold). See also reseller model.

More loosely, Discount can also refer to books sold at retail for less than their recommended retail price (which reduces the retailer’s margin).

Discount code, Discount group code

Index into a table of discounts shared in advance between supplier and retailer. Since each retailer may have a unique table, the actual discount percentage can be communicated without revealing commercially-sensitive information.

Distribution rights

Rights to make a product available in a particular market, a commercial right derived from the publisher’s publishing rights and conferred contractually on the publisher’s distributors, wholesalers and retailers. See sales rights.


Organization that holds the primary stock of books and is responsible for fulfillment (of trade orders) in a particular territory or market on behalf of the publisher. Wholesalers and retailers may act as intermediaries between the distributor and the end customer. Many large publishers own or operate their own distributor, and hold stocks themselves. Other publishers appoint a single, exclusive distributor per market or territory (and this exclusive distributor is sometimes termed the Vendor of record). Some publishers prefer to appoint multiple non-exclusive distributors.


See IP address.


XML markup for structured book texts and documentation. The text of the book contains markup that divides it up into parts, chapters, paragraphs, tables, lists, footnotes and so on. The markup is structural and semantic, rather than defining the typographic presentation or page layout, and the DocBook data can be processed automatically to create e‑books, large print, conventional print, synthetic audio versions of the book. See also TEI, JATS.


Digital Object Identifier – a digital identifier for a document or other object, generally one accessible via the Internet. Objects with DOIs can be the target of clickable links in other documents (eg a scholarly article may cite another article via its DOI), or the link may provide information about the object. In contrast to the superficially similar URL link, DOIs are managed (by the owner of the target object) so that third-party links to the object do not break when the target document is moved. In contrast with many other identifiers, DOIs are always associated with some kind of domain-specific service, and are actionable (clickable) to gain access to that service. The most common application of DOIs in publishing is CrossRef’s registration and resolution service for online scholarly material, which provides citation services, but DOIs are also used within the DataCite service for citation of and access to research datasets, and the Entertainment Identifier Registry (EIDR) identifier scheme for movie and TV assets. Like ISBN and ISNI, DOI is an ISO identifier, and is managed by the International DOI Foundation (IDF).


Dots per inch, the number of data points or pixels per inch (PPI) of resolution in an image. Note that the DPI of an image is variable – if the image is displayed at twice the size, the DPI halves. To print an image as a halftone, the DPI of the image should ideally be twice the halftone LPI, so for optimum-quality results with a 175lpi halftone screen, the image needs at least 350dpi at the size it will be printed. More pixels are unnecessary, and fewer means a lower-quality halftone.
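
The ‘twice the LPI’ rule of thumb and the way DPI varies with reproduction size can be worked through in a few lines. The function names and the six-inch example are illustrative assumptions.

```python
def required_pixels(print_inches: float, halftone_lpi: int) -> int:
    """Pixels needed along one dimension, using the 'DPI = 2 x LPI' rule."""
    target_dpi = 2 * halftone_lpi
    return round(print_inches * target_dpi)


def effective_dpi(pixels: int, print_inches: float) -> float:
    """Resolution of an image of fixed pixel size at a given printed size."""
    return pixels / print_inches


# An image to be printed 6 inches wide with a 175lpi halftone screen.
width_px = required_pixels(6, 175)
print(width_px)                     # 2100

print(effective_dpi(width_px, 6))   # 350.0
print(effective_dpi(width_px, 12))  # 175.0: twice the size, half the DPI
```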


Digital Rights Management, usually refers to technical protection measures (eg content encryption systems and other access control technologies, and also watermarking) that are used to enforce or monitor compliance with constraints on usage of digital content within e-publications. DRM can for example limit copying and redistribution of the digital content, printing, cutting and pasting of text, sharing and lending, and can also place a time limit on the use of the content to enable rentals. DRM seeks to protect intellectual property from copyright infringement, but can also (often inadvertently) prevent usages that are specifically allowed by copyright exceptions.

Drop shipment

Also termed Consumer direct fulfillment (CDF). In order to avoid retailers holding stock, while at the same time minimizing fulfillment time, distributors and wholesalers sometimes ‘drop ship’ goods direct to the retail customer. The retailer placing the order must pass the customer delivery address to the wholesaler or distributor. This is most common with POD copies of books.


See DTO.


Digital Talking Book, standard specification for an e‑book format accessible to blind or otherwise print-impaired readers (and also for the associated reading systems). Developed by the DAISY Consortium, and therefore also known as the DAISY standard (versions 2 and 3). Content of a DTB can range from text with XML markup (to be read using text-to-speech software), through combined text plus pre-recorded audio, to audio-only files with additional navigation functionality. DTB has been largely superseded by the EPUB 3 standard, which incorporates many features to make EPUB content more accessible.


Document Type Definition. Specifies the set of markup tags and the structure – in terms of mandatory and optional markup tags, their order and nesting – that may be used in a particular type of XML or SGML document. So the ONIX for Books DTD defines how tags like <Contributor> or <x448> may be used. See also schema.


Download To Own, also termed DST, EST, Digital or Electronic Sell-Through. Business model whereby e‑book files or other digital goods are downloaded with a perpetual license. cf DTR, rental.


Download To Rent. Business model whereby e‑book files are downloaded with a time-limited license. See also rental. cf DTO.

Dublin Core

Set of key metadata elements (including Title, Creator, Publisher, Date and so on) intended to standardize bibliographic description and facilitate access to documents and resources on the internet. There were originally (in 1998) just 15 elements in the Dublin Core Metadata Element Set (DCMES), with a further 40 terms (DCTerms) added later, but the imprecision of the term definitions and lack of a common data exchange format mean DC is used either very loosely, or with application-specific ‘profiles’ that prevent broad interoperability.

Dues, Backorders

Business-to-business orders taken by a publisher, distributor or wholesaler prior to publication or during a temporary period of unavailability (when they are more frequently termed backorders), for delivery when the book becomes available. Subscription orders are recorded as dues. cf pre-orders, which are consumer orders placed with a retailer.


Usually handmade mock-up of a book, folded together or bound from unprinted pages to demonstrate the physical nature of a planned product.


Temporary, free-standing decorative box, usually of cardboard, produced by the publisher to display copies of books in a retail environment. See POS.

DVD, Digital Versatile Disc or Digital Video Disc

Optical disc developed by Philips and Sony, originally for digital video. Similar to a CD, but able to hold much more data – up to 4.7GB – usually compressed and DRM-protected standard-definition digital video, but occasionally digital audio, software (eg video games) or other data. Much rarer double-sided and multi-layer variations of DVD can hold up to 17GB. See also Blu-ray.



Former European Article Number. See GTIN.

Edge decoration

Colored, marbled or gilded edges of the book block.


Electronic Data Interchange: system for highly-automated exchange of strictly-formatted business documents such as invoices, orders or P&A updates. Common EDI document formatting standards include X.12 (used in North America), Tradacoms (in UK) and EDIFACT (common across much of the rest of the world). These are not XML-based standards, but XML equivalents do exist under the EDItX banner.


Electronic Data Interchange for Administration, Commerce and Transport, the ISO 9735 standard developed by the United Nations that underlies much of the world’s e-commerce. Application of the standard to the book trade was coordinated by EDItEUR.


Not-for-profit membership-supported organization that develops and maintains, inter alia, the ONIX and EDItX families of message standards and the Thema subject categorization scheme. EDItEUR also manages the ISNI International Agency and until recently managed the International ISBN Agency.


See P.9 Edition. Historically and in book collecting, ‘edition’ carries the same meaning as impression (eg a limited edition, or a ‘first edition’ of Charles Darwin’s On the origin of species by means of natural selection), but in other contexts, does not imply that all copies are manufactured together, only that they are identical or very nearly so in content (ie there can be several impressions of a Third Edition in paperback, and a separate hardback of the same Third Edition). Significant revisions or changes to the content imply a new edition, and in <indecs> and ISTC terms, a new (derived) work. Other ONIX edition types specified in P.9 – facsimile, large print or Braille, for example – and loose terms such as airside edition used elsewhere are not classed as new works (though they are new manifestations).


Family of XML-based ‘transactional’ e-commerce messages developed by EDItEUR, intended as potential replacements for common EDIFACT, X.12 and Tradacoms EDI trading messages. The family also includes sales/revenue/sales tax report messages intended to simplify retail platform-to-publisher sales reporting of e‑books, and an inventory report covering both physical stockholdings (eg for reporting stock on consignment) and digital inventory (for reporting the on-sale status of e‑books).


Standardized e‑book file format aimed particularly at education material. EDUPUB, now more often termed ‘EPUB for Education’, is a specialized profile of EPUB 3.


European Economic Area, a free-trade area made up of the European Union countries plus three of the four EFTA countries, Iceland, Liechtenstein and Norway. The EEA agreement provides a ‘uniform internal market’ with freedom of movement for people, goods, services and capital, and great uniformity in regulations relating to trade, social policy, consumer protection, the environment and company law across 31 countries. In territorial rights contexts, ‘Europe’ often means the EEA – but at other times Europe implies the continent, which would additionally include Switzerland, the Balkans, Ukraine, Belarus, Russia and (possibly) Turkey and the trans-Caucasus (Georgia etc).


European Free Trade Association, comprising Iceland, Liechtenstein, Norway and Switzerland (though several other countries were previously members and are now members of the European Union). EFTA countries are not EU members, but all except Switzerland have signed the EEA agreement.


Entertainment ID Registry, and the DOI-based identifier used by the registry for the film, TV and video sectors. See also ISAN.


There is no special type of ISBN for e‑books. See ISBN.


See data element.


Typographic mark consisting of three dots (‘ … ’) indicating an omission, elision or continuation.


English language teaching.


Measurement equal to the point size of text – so for 12pt text, one em is 12pts. An ‘en’ is half an em. See also dashes.


Proscription of sales, or sometimes of publication of reviews, of a book prior to a particular date. See sales embargo date.


See Sales embargo date.

For Canadian market context, refer to the BookNet Canada documentation on


Europe, Middle East and Africa.


All text is encoded in some way in order to store and process it digitally: each character is stored as a number, and the relationship between the characters and the numbers is the encoding. ‘A’ might be stored as the number 65, or more likely as the binary equivalent 01000001. B is 66, C is 67 and so on. There are many different standardized encodings, though for historical reasons, most actually do use 65 for A (because ASCII is the basis of many later, more complex encodings). However, encodings vary greatly for characters that are not present on an English typewriter keyboard – é might be stored as 233 (11101001 in binary), or as two numbers 195 then 169 (11000011 and 10101001), or in another way. And most encodings have a very limited repertoire of characters (their character set), so there are many characters that cannot be encoded at all. See Unicode, UTF-8, Latin-1, Windows-1252.
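The numbers quoted in this entry can be reproduced with Python's built-in codecs. This is a minimal sketch: the helper name `byte_values` is hypothetical, and the euro sign is used as an assumed example of a character that lies outside the Latin-1 repertoire.

```python
def byte_values(text: str, encoding: str) -> list[int]:
    """Return the numbers stored for `text` under the named encoding."""
    return list(text.encode(encoding))


print(byte_values("A", "ascii"))          # [65]
print(format(65, "08b"))                  # 01000001

# The same character is stored differently under different encodings.
print(byte_values("\u00e9", "latin-1"))   # [233]       (e with acute accent)
print(byte_values("\u00e9", "utf-8"))     # [195, 169]

# A character outside an encoding's character set cannot be encoded at all.
try:
    byte_values("\u20ac", "latin-1")      # euro sign
except UnicodeEncodeError:
    print("euro sign is not in Latin-1")

# Windows-1252 does include it, as the single number 128.
print(byte_values("\u20ac", "cp1252"))    # [128]
```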


Headband or tailband at top and bottom ends of a book spine – originally protective reinforcement of the binding of the book block, and thus implying a high-quality binding, but these days usually a purely decorative cord or strip of woven material sewn or glued to the spine of the block prior to addition of the cover.

End matter

See back matter.


See footnote.

End paper

Strong paper sheet used to affix the case of a hardback to the book block.


A thing, something that has a distinct and independent existence, and may carry various descriptive properties. Entities may be concrete or abstract, actual or potential, and their properties may often also be considered to be entities. ONIX allows description of various types of entities and the relationships between them – products, contributors, locations etc. See also abstract model, relational database.


EDItEUR Product Information Communication Standard – a data dictionary that inspired many of the data element definitions of ONIX.


Electronic Point Of Sale, a retailer till or checkout system often also used for sales data and stock control.


Standard e‑book file format developed and maintained by the IDPF and more recently by the W3C. The latest EPUB 3 version of the format builds on HTML5 and CSS, incorporates both reflowable and fixed-format variants, and includes features to help publishers optimize the accessibility of the content (obviating the need for specialized accessible editions). Note that ONIX has a number of data elements named <Epub… (for example <EpubTechnicalProtection>) which are intended for use with all types of e-publication. They are not exclusively for publications in the EPUB file format.


Enterprise resource planning, the integrated management of multiple high-level business processes, including inter alia purchasing and production, order-to-cash, customer service or CRM, financials including budgeting and accounting, perhaps even HR. Also commonly, the IT systems that support this integrated approach.


Corrections to the content of a book, sometimes corrigenda printed on a separate sheet and inserted into finished copies as an addendum.


See DTO.


European Union, political and economic union of 28 member European states (Austria, Belgium, Bulgaria, Croatia, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, United Kingdom).


End user license agreement, the license granted by the publisher or rightsholder to the final purchaser or reader of an e‑book.


See EEA.

Evaluation copy

See approval copy.

Exclusive rights

See Sales rights composite in Group P.21, though exclusivity can apply not only to publishing rights but also to other types of right (for example exclusive distribution rights).

Exhaustion (of rights)

See first sale principle.

Export edition

Product intended to be sold outside the country of publication.




Page count, the number of pages in a book (or occasionally, the number of words or the running time). See Group P.11 for different methods of measuring the extent.


Technique for searching a collection of information based on applying multiple filters, successively refining the search results until the item is found. Faceted searching is often used after an initial direct text search, to narrow down a large number of search results.
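Successive refinement can be illustrated with a toy example. The catalogue records, field names and function names below are all invented for illustration; a real retail search system would typically run against an index rather than a Python list.

```python
from collections import Counter

# Hypothetical catalogue records, invented for illustration only.
CATALOGUE = [
    {"title": "Title 1", "format": "paperback", "language": "eng", "subject": "crime"},
    {"title": "Title 2", "format": "hardback", "language": "eng", "subject": "crime"},
    {"title": "Title 3", "format": "paperback", "language": "fre", "subject": "crime"},
    {"title": "Title 4", "format": "paperback", "language": "eng", "subject": "romance"},
    {"title": "Title 5", "format": "e-book", "language": "eng", "subject": "crime"},
]


def facet_counts(results, field):
    """Count how many of the current results fall under each value of a facet."""
    return Counter(record[field] for record in results)


def apply_facet(results, field, value):
    """Keep only the results whose facet `field` has the chosen `value`."""
    return [record for record in results if record[field] == value]


# Show the available subject facets, then narrow the results one filter at a time.
results = CATALOGUE
print(facet_counts(results, "subject"))
results = apply_facet(results, "subject", "crime")
results = apply_facet(results, "language", "eng")
results = apply_facet(results, "format", "paperback")
print([record["title"] for record in results])   # ['Title 1']
```

Faceting of this kind depends on granular, consistently coded metadata: each filter only works if the corresponding field is populated for every record.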

Fair use, Fair dealing

Limited exceptions to copyright, for example allowing usage of short excerpts or quotations from a copyrighted work for particular purposes such as review, criticism or private study without formal permission or payment. In some jurisdictions, the limits on fair use are clearly and legally defined. In others, fair dealing is defined in a more flexible and pragmatic way.


See data element. May also refer to a column in a database table, or a pre-defined data entry area in a form.


Procedure to capture the underlying characteristics of, for example, an image or a chunk of audio or video, irrespective of how it is encoded as data. Fingerprint algorithms are designed to be resistant to change – for example, a cropped or lightly retouched image should have a fingerprint very similar to the uncropped original. In contrast to a mathematical hash function, a song would have the same fingerprint when encoded as MP3 or AAC. Digital fingerprints are used to recognize similarity, where hashes are used to detect differences. See also identifier – though note that fingerprints can only be used to identify a resource by comparing the fingerprint with a large database of fingerprints of known resources.
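The contrast between a hash and a fingerprint can be shown with a deliberately simplified sketch. The `toy_fingerprint` function below is an invented stand-in, not a real audio fingerprinting algorithm: it records only whether each sample rises or falls relative to the previous one. The two sample lists are hypothetical values standing in for the same sound encoded in two slightly different ways.

```python
import hashlib


def content_hash(samples):
    """Cryptographic hash of the exact bytes: any change alters the digest."""
    return hashlib.sha256(bytes(samples)).hexdigest()


def toy_fingerprint(samples):
    """Toy fingerprint: the up/down contour of the signal, ignoring exact values."""
    return "".join("U" if b > a else "D" for a, b in zip(samples, samples[1:]))


original = [10, 50, 30, 80, 20, 60]
reencoded = [11, 49, 31, 79, 21, 59]   # same shape, slightly different numbers

# The hashes differ, which detects that the data is not byte-for-byte identical.
print(content_hash(original) == content_hash(reencoded))        # False

# The fingerprints match, which recognizes the underlying similarity.
print(toy_fingerprint(original), toy_fingerprint(reencoded))    # UDUDU UDUDU
```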

Firm sale

See sale or return.

First sale principle

Under US copyright law, retail book purchasers are allowed to re-sell, trade, rent, lend, or dispose of the previously purchased item without the prior consent of the copyright holder. In other jurisdictions, some of the copyright holder’s rights are ‘exhausted’ at the time of first (retail) sale, to similar effect. The first sale doctrine and exhaustion of IP rights is often now interpreted internationally – that is, the retail sale may happen anywhere, and the retail purchaser may re-sell, trade, rent etc anywhere, including from another country back into the country of origin. However, first sale and exhaustion do not apply to e‑books because they are licensed rather than sold, so libraries and consumers do not usually have the automatic right to lend, rent or re-sell e‑books without the publisher’s permission.

Fixed retail price

Often just FRP, occasionally also Fixed book price or FBP. In certain countries, for example France or Germany, there are legal restrictions that prevent retailers selling books significantly below the list price set by the publisher (or occasionally, the importer). The IPA publishes a report on the use of fixed book pricing across many countries. See also recommended retail price.

Flash card

Small card bearing a letter, word, phrase, number or symbol, displayed quickly to learners to improve recognition skills and recall. Often used in teaching very basic literacy and numeracy.

Flick book

Book containing a sequence of illustrations on the pages, designed to give an animated effect when the pages are flicked through. Not to be confused with ‘flip book’.

Flip book

A two-in-one volume, two books bound together, see back-to-back.


Historical term for a page number, or for a single sheet (leaf) of paper, or occasionally, for a single sheet folded once to form four pages of a book.


Cartographer’s folly, a fictitious feature, ‘trap street’ or ‘paper town’ inserted into a published map to help detect plagiarism and copyright violation. Similarly fictitious entries in dictionaries, encyclopedias etc are occasionally called ‘mountweazels’. See also watermark.


Set of characters of the same typeface and size, for example 18pt Helvetica or 10pt Garamond.


Additional information, explanation or cross-reference printed at the bottom of a page, and referenced in the main text by a symbol such as an asterisk or dagger, or by a superscripted footnote number. cf endnote, which is similar but placed at the end of a chapter, section or the end of the body of the book.


Outer edge of a book page, opposite the bound edge or spine. Occasionally ‘fore-edge’.

Forward compatibility

See compatibility. 


See CMYK process color.


IFLA’s Functional Requirements for Bibliographic Records is a conceptual model, the approximate library-centric equivalent of the <indecs> framework. The most significant contrast in terminology for ONIX implementers is that, while <indecs> and FRBR describe effectively identical manifestation and item or instance concepts, a work in <indecs>, ONIX and ISTC terms is roughly equivalent to an expression within the FRBR model. See also RDA, BIBFRAME.


See backlist.

Front matter

Pages preceding the main content of a book, including title and imprint pages, tables of contents and of illustrations, and any foreword, preface and introduction text. Also termed Prelims (preliminary pages), and usually numbered with Roman numerals. cf back matter.


See fixed retail price.


Forest Stewardship Council, international organization promoting responsible management of the world’s forests. It provides accreditation and certification standards for forest management and forest products such as pulp and paper. See also PEFC, the somewhat similar Programme for the Endorsement of Forest Certification.


File Transfer Protocol, a standard method of transferring files across the internet. cf HTTP.


Binding in which the spine and covers are fully bound in cloth, leather or other material. cf half-bound, quarter-bound.

Full update

See organization of data delivery.


Galley, galley proof

Unpaginated proof copy for checking and correction of the text (the text is in a single long column, without specific page breaks). cf page proof.


Books of a particular style, form or subject matter – for example romance, crime, science fiction.


See GLN.


Globally Harmonized System of Classification and Labelling of Chemicals, standardized system of classification for hazards and hazardous materials, including hazard statements, warning pictograms, safety data sheets etc.


Low-quality raster image file, suitable only for use on the web. See TIFF and JPEG for higher quality image file formats.


Global Location Number, an international standard identifier for a physical trading location or (loosely) for an organization at that location. Well established within e-commerce and physical logistics, and not in any way specific to the publishing industry. cf SAN. Although the GLN system is administered by GS1, there is only limited central coordination of GLN assignment, and a single location may have more than one GLN. Note that GLNs are 13-digit numbers, and must be distinguished from GTIN-13s by context. Details about a particular GLN can be looked up in the Global Electronic Party Information Registry (GEPIR) – though the results are imperfect.


Short explanation, interpretation or annotation of an unfamiliar word or phrase, added between the lines of text (an ‘interlinear’ gloss), in the margin or in a separate glossary. See also ruby.


Alphabetically arranged list of terms and their definitions or explanations, essentially a topic-specific dictionary. See also controlled vocabulary, taxonomy. cf index.

‘Gold’ OA

See open access.


Policy, decision-making and oversight arrangements for a standard such as ONIX, or for an identifier registry like that operated by national ISBN agencies. Well-founded, inclusive and consistent governance engenders trust and builds credibility in the standard, allowing confident adoption and use of the standard by stakeholders.


The extent to which metadata is divided into appropriate data fields – for example the way that a contributor name can be divided into <NamesBeforeKey>, <PrefixToKey>, <KeyNames> etc. Highly-granular metadata makes it easier to re-use the same metadata in different and possibly even unforeseen ways.

‘Green’ OA

See open access.


Goods Received Notification, message sent by wholesaler or retailer to the supplier (eg the distributor), usually via EDI, to confirm receipt of books. cf ASN.


International standards organization responsible for a wide range of supply-chain standards, including the SSCC, GLN and GTIN identifier schemes.

GSM, grammage

Grams per square metre, see paper weight.


See VAT.


Global Trade Item Number, a numbering scheme for tradeable items and consumer products of all types in the supply chain. The GTIN identifier scheme is administered by GS1. Common GTINs have 12, 13 and 14 digits. GTIN-12s were formerly known as UPCs (Universal Product Codes), and are used almost exclusively in North America. Their use on books has been deprecated since 2005. GTIN-13s were formerly known as EANs (European Article Numbers), and they are used globally to identify a wide range of retail items. The barcode symbology used to represent GTIN-13s is still referred to as ‘EAN-13’. Thirteen digit ISBNs are a small subset of the GTIN-13 number scheme (see Bookland). GTIN-14s are used to identify trade packs of items (from cartons to pallets).
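As an illustration of how the different GTIN lengths and the ISBN subset relate, the sketch below validates a code using the standard GS1 mod-10 check digit (which this entry does not itself describe) and then tests for the 978 and 979 prefixes. The function names are hypothetical. The exclusion of 979-0, which is assumed here to be reserved for ISMNs rather than ISBNs, is a detail added for the example.

```python
def gs1_check_digit(body: str) -> int:
    """GS1 mod-10 check digit for the digits that precede it.

    Working from the right, digits are weighted 3, 1, 3, 1 and so on.
    """
    total = 0
    for index, char in enumerate(reversed(body)):
        weight = 3 if index % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """True for a 12-, 13- or 14-digit string with a correct check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13, 14):
        return False
    return gs1_check_digit(code[:-1]) == int(code[-1])


def looks_like_isbn13(code: str) -> bool:
    """True for a valid GTIN-13 in the 978 or 979 range (979-0 assumed to be ISMN)."""
    if len(code) != 13 or not is_valid_gtin(code):
        return False
    return code.startswith("978") or (
        code.startswith("979") and not code.startswith("9790")
    )


print(is_valid_gtin("9780306406157"), looks_like_isbn13("9780306406157"))  # True True
print(is_valid_gtin("4006381333931"), looks_like_isbn13("4006381333931"))  # True False
print(is_valid_gtin("036000291452"))                                       # True (GTIN-12)
print(is_valid_gtin("9780306406158"))                                      # False
```

Note that a correct check digit only shows that a number is well formed. It does not show that the number has been assigned, and a 13-digit GLN passes the same test, so the two must still be distinguished by context.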






Globally Unique Identifier. See UUID.


Arrow-shaped quotation marks ( « and » ) used in French and other languages.


Gap between columns of type on a page.

The blank portion of a bound page closest to the spine, the ‘back’ or ‘inside’ margin (cf the ‘outside’ margin nearest the foredge).


See zip.



See AVC.




High-quality binding in which the spine and corners of the covers (only) are bound in leather (or other fine and durable material). cf quarter-bound, full-bound.


See title page.


Printing technique or the resulting printed image in which shades of grey (or for color images, shades or tints of the CMYK process colors) are simulated by small and regularly spaced ink dots of varying sizes – this is termed AM or ‘amplitude modulation’ screening. The regular spacing of the halftone dots is measured in lines per inch (LPI), or per centimetre. 150 LPI (60 lines per centimetre) is common in books and magazines, and high quality color printing may be halftoned at up to 200 LPI (80 lines per centimetre). Occasionally, halftoning uses very small dots but varies their spacing rather than their size – so-called ‘stochastic’ or FM (‘frequency modulation’) screening. cf line art, which uses only ‘solid’ areas and lines of ink and does not use halftoning to simulate shading.

Handle system

The name resolution system that underlies the DOI. The Handle system turns an identifier or ‘handle’ for a resource – something like 10.4400/zuim – either directly into an IP address, or into a DNS name which can itself be resolved to an IP address. The resulting IP address can then be used to locate the identified resource, or its metadata, on the Internet.

Hart’s Rules

Reference book listing rules of (British) English spelling, punctuation, grammar, usage and typographic style. Originally derived from guidelines for Oxford University Press, it is now the basis of ‘house style’ at many UK publishers.