An ONIX Glossary of Terms


This glossary is from the ONIX for Books Specification + Best Practice Guide + Codelists posted to EDItEUR’s website. It appears as Appendix A.1 in the Best Practice Guide, and we have re-created it here for easy reference and repeat reading.

BookNet Canada has added to this glossary with terminology and contextual information specific to the Canadian market. You will find BookNet Canada entries to this document identified with a BookNet Canada addition or BOOKNET CANADA AMENDMENT tag alongside the heading.



A0, A4 etc

See A series paper sizes. A0 is 1m² in area (1189 × 841mm), A1 is half that, A2 is half again and so on. All sizes have width and height in the ratio 1:√2. A4 is 1⁄16m², and 297 × 210mm.


Author’s alterations, corrections made on proofs by the author or publisher. cf printer’s errors or literals, which are errors made by the typesetter.


Advanced Audio Coding. Improved codec for audio files to reduce file size or download time. Used to compress audio files in the iTunes store, but not unique to Apple. For a given amount of compression, AAC generally sacrifices a little less quality than MP3.


Anglo-American Cataloging Rules, second edition, widely used English language library cataloging rules. cf RDA. See also MARC21, the primary format in which library catalog metadata is transmitted (and often stored).


Abandoned, previously planned publication later canceled without ever having been published. cf NYP.


Content shortened by removal of text and minimal re-writing, occasionally termed ‘condensed’ (the latter implies a greater degree of re-writing).


Short summary of the contents of (for example) an academic paper, article or chapter. Journal publishers often provide free online access to abstracts, while access to the full text remains dependent upon subscription to the journal.

Abstract model

Generalized conceptual model of a real-world system, developed as a guide or aid to understanding the principles of that system. Often expressed as a series of generic entities (‘things’ – books, people, places, dates and so on), the potential relationships between them, and perhaps the events that may change those entities and relationships. See <indecs>, FRBR.


Author’s corrections, see AA.


See diacritical mark.


A book can be accessible by print-impaired people (eg blind, partially sighted or dyslexic readers, or readers with a physical disability). For print books, special accessible editions are often required. For e‑books, accessibility is not a special edition or feature, but a best practice for mainstream editions. Accessibility is also a consideration for the remainder of the supply chain, including retailer websites, library catalogs and e-reading devices. See also WCAG.

Accessible edition

Large print, Braille or specially-formatted audiobook (DTB) that can be used by print-impaired people who cannot use a conventional physical book.

Accession number

A unique identifier added to each item acquired by a library or archive, assigned as part of the acquisition process. The accession number is a proprietary item identifier, and two items with the same ISBN (ie two instances of the same manifestation) would have distinct accession numbers.

Acid-free paper

Higher quality paper with low lignin content, and treated so it is chemically neutral (pH 7.0) or slightly alkaline to avoid yellowing and deteriorating as quickly as normal paper. For the maximum longevity, the highest archival-quality paper contains about 2% calcium or magnesium carbonate as a chemical buffer to guard against future development of acid within the paper as it ages.


Beginning of the publishing process – the publisher agrees the contract with the creator and purchases the rights to publish a work.

The process of selecting, ordering, and receiving books and resources for a library or archive. See also accession number.

The purchase of rights to a product, range of products or (often) an entire imprint from one publisher by another. This usually includes the transfer of any existing stock of the affected products. cf divestment of a product, range or imprint by the selling publisher.


Extra agreements or clauses added to the end of an existing contract.

Addition made to a book in a later printing or subsequent edition – for example a list of corrections (corrigenda), appendices or coda added to any section of a book.

Adhesive binding

Typical paperback (‘limp’) book binding using hot-melt adhesive applied to the roughened or notched spine of the book block to hold the pages or signatures and cover together. Also termed ‘Perfect binding’, ‘unsewn binding’.

Adobe RGB

Extended RGB colorspace, allowing a much wider range or gamut of colors on screen than standard sRGB. It is intended to cover – in RGB – almost all of the colors that can be printed using CMYK. See also DCI-P3, color profile.


Decision by a school or college, or by a consortium or educational authority, that a specific textbook will be included on a reading list or used to teach a course of study.


Sum paid by a publisher to an author (or other contributor) prior to publication. Often paid in parts, upon acquisition (agreement of the publishing contract), upon delivery of manuscript, and upon publication. The advance is paid against future royalty earnings, so the author does not receive any further royalty payments from the publisher until the advance has been ‘earned out’ (or ‘recouped’) at the agreed per-copy-sold royalty rate.

See advance copies.

Advance copies

(pl often just Advances), sometimes also known as review copies: early finished copies of a book, usually arriving before publication and used for publicity purposes, reviews, and occasionally for evaluation (eg for potential adoption – see approval copies) etc. Book proofs (which are bound but usually editorially unfinished) are also sometimes used in the same way.

A format

UK term for a paperback around 178 × 111mm in size, roughly equivalent to a US rack-sized mass-market paperback. See also B format, pocket book.


See literary agent.

Agency model

Business model based on the idea that a publisher sells to the consumer and is wholly responsible for setting the price. The retailer acts as an intermediary or agent to facilitate the sale and takes a fixed commission from the publisher; cf the more common wholesale or reseller model. Under the agency model, the publisher can directly control the ‘street price’ or Actual selling price of a book, whereas under the reseller model, street prices are usually at the discretion of the retailer – the retailer can even choose to sell the book at a loss to attract footfall – unless the legal framework provides for Fixed rather than Recommended retail prices. Reseller models where retail prices are not fixed thus encourage retailers to compete on price (possibly to the exclusion of other areas such as customer service) and put great pressure on margins. However, agency models are more complex for the publisher, and may be viewed as anti-competitive.


In metadata, an organization that collects metadata from many sources (mostly publishers), and re-distributes a combined feed to other metadata recipients. This service may enable or be provided alongside other services such as identifier registration, maintenance of a national bibliography or Books-in-Print database, retail sales reporting and so on.


Artificial intelligence, machine intelligence – broad term for the application of software to analyse data and make decisions based on that data aimed at achieving some predefined goal. Encompasses natural language processing, knowledge representation, reasoning, machine learning etc. In publishing and metadata, AI techniques might for example be applied to automated entity extraction (recognising names, places, concepts and so on) from the text of a book to create keyword lists or links to other books about the same entities, to sales prediction, or to abstracting and summarization.

See AIS.


Abstracting and indexing. cf AIS, AI.

Airside edition

Book only for sale in bookshops in the duty-free (or ‘airside’) area of an airport (cf ‘groundside’). Occasionally, special editions are produced specifically for airside retail outlets (though they are not editions in the proper sense used for P.9 Edition). At other times, airside retail outlets may sell products normally limited to export.


Advance Information Sheet, colloquially just ‘AI’, also called a Title sheet, a printed page of metadata about a book produced in advance of publication for sales and marketing purposes, including details of the title, author, ISBN, pub date, format, price, a description of the contents and marketing information. In effect, an ONIX Product record can be the digital equivalent of an AIS or title sheet.


The & character, meaning ‘and’ when used in text (typographically, it is related to the Latin ‘et’). In XML data, it must be replaced by ‘&amp;’.
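As a minimal sketch of the replacement described above, Python's standard library can escape an ampersand before the text is written into XML. The title string is purely illustrative.

```python
from xml.sax.saxutils import escape, unescape

# Illustrative title containing a raw ampersand.
title = "Pride & Prejudice"

# escape() replaces '&' (and also '<' and '>') with XML-safe references.
escaped = escape(title)
print(escaped)

# unescape() reverses the substitution when reading the data back.
round_trip = unescape(escaped)
print(round_trip)
```

Most XML libraries perform this escaping automatically when serializing, so manual escaping is mainly needed when XML is assembled as plain strings.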

Answer code

Distributor or wholesaler’s brief response to an order or stock enquiry, for example indicating the product is NYP or OP. In ONIX, codes in List 65 (used in the <ProductAvailability> data element) are equivalent to answer codes.


Advertising and promotion, often the major concern of the marketing department. Not to be confused with P&A, price and availability.


Article processing charge, a fee charged to the author by an open access publisher to cover the cost of editorial work, production and distribution of a work. More common for articles published in open access academic journals, but a similar business model is also used by some publishers for monographs.


Application programming interface: a set of protocols, functions or services that one piece of software offers to another, used to share data in ‘real time’ – more or less instantly – between applications on a computer or between computer systems on a network. There are two common styles of network API protocol – SOAP and REST – which differ in the manner they encapsulate data in the request and response messages, and the messages are often passed over HTTP. See also web service.

Approval copy

Finished copies of a book sent to an educational institution or library for evaluation purposes, with a view to purchase or adoption. Approval copies may be sent speculatively or on request, and may be free of charge, or charged if not returned (ie on SOR terms). Also termed an Inspection copy, Desk copy or Evaluation copy.


Advance Reading Copy, see Advance copies.

AS2, AS3

Applicability Statement 2, 3 etc, IETF specifications covering the secure and reliable communication of business-to-business data on the internet using HTTP and digital signatures for document signing and encryption, with receipt confirmation returned after decryption and signature verification to guarantee successful delivery. Most used for EDI message delivery, but the content could be any standardised structured data (potentially including ONIX messages). AS3 uses the FTP protocol instead of HTTP.


Part of a lower case letter that lies above the top of the lower case x (ie above the x-height of the text), in letters such as b or d. cf descender, part of a lower case letter that lies below the baseline of the text, in letters such as g or y. Note that the relative sizes of the ascenders, x-height and descenders vary in different typefaces.


American Standard Code for Information Interchange. Simple character set comprising 0–9, A–Z and a–z, plus a few basic symbols and punctuation characters. An ‘Ascii’ text file is one that contains plain text (‘words and spaces’) using only characters from this set. There’s no control over fonts, no formatting or styling (eg different point sizes, justification, bold or italic), and no accented characters, specialized symbols or fancy punctuation – ASCII does not even allow for proper curly quotation marks “ … ” or currency symbols like £ and €. Plain text. cf Latin‑1, Windows‑1252, Unicode.


A bibliographic collection to which someone other than the publisher, typically a metadata aggregator, assigns a collective identity. (For example, among the novels of Tony Hillerman, there are several that feature the same protagonists Joe Leaphorn and Jim Chee. The publisher does not give them a series identity, but in retailer databases they may carry an ascribed identity Joe Leaphorn and Jim Chee Series).

See collection.

A series

ISO standard cut sheet paper sizes, used almost everywhere except North America. A0 is 1189 × 841 mm – 1 square metre in area, with sides in the ratio of 1:√2. A1 is half that area (but the same shape), A2 is half again, and so on. 2A0 is twice the area of A0. A4 is 297 × 210 mm (1⁄16th of a square metre). RA and SRA raw paper sizes are roughly 5% or 15% larger in area than A series sizes, to allow for bleed and final trimming, so SRA4 is 320 × 225 mm. B series sheets are intermediate between the A series sizes (B1 is between A0 and A1). These ISO 216 A and B series paper sizes are not used in the USA and Canada, where Letter sized paper is more common than A4 for office use. Letter is 11 × 8½ inches (or about 279 × 216 mm), similar in area but ‘squarer’ in shape than A4. US Legal size is 14 × 8½ inches (about 356 × 216 mm), significantly taller than A4.
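The halving rule can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and it assumes each size is derived from A0 by halving the longer side and rounding down to a whole millimetre.

```python
def a_series_size(n):
    """Return (long side, short side) in mm for size A<n>, starting from A0."""
    long_side, short_side = 1189, 841  # A0
    for _ in range(n):
        # The old short side becomes the new long side;
        # the old long side is halved (rounded down) to give the new short side.
        long_side, short_side = short_side, long_side // 2
    return long_side, short_side


for n in range(5):
    long_side, short_side = a_series_size(n)
    print(f"A{n}: {long_side} x {short_side} mm, ratio {long_side / short_side:.3f}")
```

Each printed ratio stays close to √2 (about 1.414), which is why every size in the series has the same shape.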


Amazon Standard Identification Number, a proprietary identifier for products used internally within Amazon.


Advance Shipping Notification, message sent by printer, distributor etc, usually via EDI, to confirm imminent dispatch of books to the customer (ie to the distributor, wholesaler or retailer). cf GRN.


See RRP.

Aspect ratio

Ratio of width to height of an image, screen etc, for example 16:9 for a modern TV screen. See also portrait, landscape.


Automatic Stock Replenishment, business method whereby new copies of a physical book are automatically manufactured when warehouse stocks fall below a pre-set trigger level – the publisher does not need to generate an order each time. Often combined with short-run printing in a ‘little and often’ stock maintenance process.

Assistive technology

Software and devices such as text-to-speech (TTS) screen readers or Braille displays that make e‑books more accessible to print-impaired readers.


Typographical mark, a small star (‘ * ’), used in text to indicate the presence of an annotation, as a list item marker (instead of a • symbol), to indicate multiplication (instead of the proper × symbol), etc.


Typographical mark, usually of three stars or asterisks (‘ ⁂ ’) but often approximated by a row of three spaced asterisks, indicating a break in the flow of text.


In XML documents such as an ONIX message, text, numeric or other data contained within an opening markup tag (eg the dateformat attribute within <Date>). XML attributes usually carry information about how to interpret the data content of a data element. More generally, can be synonymous with ‘property’, ‘characteristic’, ‘data element’ or ‘data field’.
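The difference between an attribute and the data content of an element can be seen by parsing a small fragment. This is a sketch using Python's standard library; the attribute value and the date are illustrative assumptions, not taken from a real ONIX record.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the attribute sits inside the opening tag,
# while the data content sits between the opening and closing tags.
fragment = '<Date dateformat="00">20240315</Date>'

element = ET.fromstring(fragment)
tag_name = element.tag                       # name of the data element
date_format = element.attrib["dateformat"]   # attribute: how to interpret the content
content = element.text                       # the data content itself

print(tag_name, date_format, content)
```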


Verification of the identity of a person (eg via a login), product or process. cf authorization.


Person or corporate body responsible for the intellectual or artistic content of a book. Often specific to the writer of the textual content – a broader and more inclusive term is contributor.

Author’s copies

Free copies of a book given to the author (or other primary contributor) upon publication, as (usually) stipulated in the author’s contract. cf voucher copies.

Authority file

In library cataloging and bibliographic data, a central list of, for example, contributor names. Used to ensure that contributors can be identified unambiguously and to highlight the single preferred form of a name that might have various forms or spellings. Any particular name may appear in different forms on different books, eg with or without Dr., with ü, ue or u, yet the shared contributor number from the authority file would make it clear that the names identify the same contributor. Authority files also help differentiate different contributors who share a name, and optionally can be used to resolve the real people behind pseudonyms. See also ISNI, VIAF. More generally, an authority file forms a type of controlled vocabulary.


Verification of the permissions associated with a person or process, for example to access or change some information. cf authentication, on which authorization depends.


Advanced Video Coding, a format for compressed video data, also termed H.264 or MPEG‑4 part 10. Has superseded earlier and less technically sophisticated video compression schemes such as H.261, H.263, and is likely to be replaced by H.265 (HEVC).



Business-to-business – commercial transactions between businesses, such as between a wholesaler and a retailer, or between a publisher and a wholesaler. cf B2C.


Business-to-consumer – commercial transactions between a business and an end user (usually but not always an individual consumer). Also termed D2C (direct-to-consumer). cf B2B.


Backlist products are those that have been on sale for (typically) a year or more and are still available. In contrast, Frontlist titles are conventionally those less than a year old (and usually including forthcoming publications).

Back matter

Pages following the main content of a book, including appendices, bibliography, index, other notes and – possibly – any sample or ‘teaser’ material from other books, advertising pages and blank pages added to make up a convenient signature size. Back matter is also termed End matter and occasionally ‘postlims’. cf front matter.


See dues.


Two books in one, bound back-to-back, with the text of one upside down with respect to the other, so that it reads from the ‘back’ of the book. The two share a single spine. Sometimes termed a ‘turn-around book’, ‘tête-bêche’ or a flip book (but cf flick book). cf dos-à-dos binding, where two books are bound back to back without turning one upside down, so the foredge of one meets the spine of the other.

Backward compatibility

See compatibility.


In data communications, the amount of data that can be carried over a particular channel, usually measured in bits per second (or megabits per second – Mbit/s or Mbps). Transmitting a 10 megabyte file over a 1Mbit/s link should take around 80–100 seconds.
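The arithmetic behind that estimate can be sketched as follows. The function name is hypothetical, a megabyte is taken as one million bytes, and the 25% protocol overhead used for the upper figure is an assumption chosen only to illustrate the range.

```python
def transfer_seconds(size_megabytes, link_mbit_per_s, overhead=0.0):
    """Estimate transfer time; overhead is an assumed fraction of extra traffic."""
    megabits = size_megabytes * 8  # eight bits per byte
    return megabits * (1 + overhead) / link_mbit_per_s


ideal = transfer_seconds(10, 1)                        # no overhead
with_overhead = transfer_seconds(10, 1, overhead=0.25)  # assumed 25% overhead
print(ideal, with_overhead)
```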


Machine-readable data printed as a series of black and white stripes on a product or on packaging. A conventional Bookland barcode on a book uses the EAN-13 barcode symbology and has the ISBN printed above the stripes, with the equivalent GTIN-13 at the bottom. The stripes represent the GTIN-13, not the ISBN (though for modern books, the two are the same number). Other types of barcode (different ‘symbologies’, with differing sizes and arrangements of bars) can appear on products, on cartons containing multiple products, on pallets, shipping labels etc, for example GTIN-12 (formerly known as UPC-A barcodes, but obsolete in the book trade since 2005), GTIN-14 and GS1-128 (SSCC barcodes). Barcode readers mostly use red light, so printing barcodes in color requires care to preserve adequate contrast.

Basis weight

See paper weight.

Berne Convention

International copyright agreement concluded between various European countries in 1886, since revised and extended to most countries. It provides for copyright protection of textual works from the moment of creation, and for the life of the author plus at least 50 years (many countries have a longer term), but also allows for some limited copyright exceptions. The 1967 revision of the Convention introduced the ‘three step test’ for judging exceptions – any exception must be limited and narrow in scope, must not conflict with normal exploitation of the work and must not unreasonably prejudice the interests of the author.

B format

UK term for a paperback around 197 × 130 mm in size, roughly equivalent to a US trade paperback. See also A format.


A relatively new abstract model and RDF-based data format for library bibliographic data, intended to replace MARC but at the same time also breaking with FRBR data modeling practice. Where FRBR offers work, expression, manifestation and item entities, BIBFRAME originally contained only work and instance – work conflated FRBR’s work and expression, and instance conflated manifestation and item. BIBFRAME version 2 introduces an item entity, making it much closer to the <indecs> work, manifestation and item structure (though a BIBFRAME work may be more abstract than an <indecs> work). BIBFRAME data is expressed in RDF using linked data principles.

Bibliographic collection BOOKNET CANADA ADDITION

A collection to which an identity is ascribed which is also part of the bibliographic description of each member (eg Penguin Modern Classics).

See collection.


Book Industry Communication, a UK-based trade organization.

In the ONIX and metadata context, the subject categorization scheme developed by BIC and used mostly in the UK, though close variants of the BIC scheme are used in some other European countries, for example CCE (Classificazione commerciale editoriale) in Italy. cf BISAC, CLIL, WGS, see also Thema. Schemes like BIC, BISAC, CLIL, WGS and Thema are intended for use in the book trade, and have little in common with library-focused subject classification or categorization schemes like Dewey Decimal, UDC or LCSH (Library of Congress Subject Headings).

BIC Basic

A bare-bones set of metadata elements enumerated by BIC, and forming part of the requirements for its data quality accreditation scheme. The elements may be communicated using ONIX, a flat file (eg CSV, tab-separated file), or another method.

Binder’s pack

See carton.


See books in print.


Book Industry Systems Advisory Committee. It was later incorporated into BISG, along with SISAC, its serials publishing counterpart. In the past, BISAC and SISAC have also been known as BASIC (Book and Serial Industry Communications).

More frequently, the schemes developed by the BISAC Subjects committee and administered by BISG and used mostly in the North American book trade. The schemes include: subject categorization, merchandising themes, and regional themes. cf BIC, see also Thema.


US-based trade organization Book Industry Study Group.


In computing, a binary digit – a single unit of information, either 0 or 1. Eight bits are usually combined into a byte. A byte of data might represent a single (integer) number between 0 and 255 (for mathematical convenience, a byte representing an integer between −128 and +127 is also common). But equally, a byte of binary data could represent a single text character (eg in the Latin‑1 character set), or a particular color for a pixel in an image (eg brightness of red in a single pixel within an RGB image), or any other type of information – including a programming instruction for the computer itself. This document comprises more than 3.5 megabytes (million bytes) of data.
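The point that one byte can carry several different meanings can be illustrated with a short Python sketch. The byte value chosen here is arbitrary.

```python
# One byte: the eight bits 11101001 (hexadecimal E9).
raw = bytes([0b11101001])

# Interpreted as an unsigned integer in the range 0 to 255.
unsigned_value = int.from_bytes(raw, "big", signed=False)

# Interpreted as a signed integer in the range -128 to +127.
signed_value = int.from_bytes(raw, "big", signed=True)

# Interpreted as a single text character in the Latin-1 character set.
character = raw.decode("latin-1")

print(unsigned_value, signed_value, character)
```

The bits are identical in all three cases; only the interpretation applied by the software differs.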


Sample pages of a book, produced in the form of a booklet for promotional purposes.


Print or printable area that extends beyond the trimmed page edge. Headline text or images can extend into the bleed area to avoid an unsightly edge when the book block is slightly mis-trimmed.


In ONIX, a special type of large composite that groups together all the data about a specific aspect of a product – Block 1 is all the main bibliographic data, Block 2 is marketing collateral, Block 3 is chapter-level metadata, and so on. There are currently seven blocks in a Product record, though (in specific circumstances) each is optional.

See also Organization of data delivery.


Metallic foil (often gold or silver colored) often used to ‘print’ the title, logo or decorative pattern on the spine or boards of a hardback book, or added for visual impact on a cover. It is applied with a heated stamping die.

Block update

See organization of data delivery.

Blu-ray disc

Optical disc developed to supersede the DVD, holding up to 25GB of compressed and DRM-protected high-definition video or other data. Dual and multi-layer variations can hold much more data, including ultra-high definition or 3D video.


Short descriptive text usually written by the publisher and used to promote the book. The blurb may be used in a catalog or on the back cover or jacket flaps of the book, and may include short quotations from favorable reviews or endorsements.


Stiff card (paperboard, fibreboard) used for the rigid covers of a hardback, or for the leaves of a ‘board book’. Generally more than 400gsm and 500µm (or 20 mils) in bulk.


Buy One, Get One Free, promotional offer where a retail customer receives one product free of charge when paying full price for a second (usually the cheaper of the two is free, if they are priced differently, and usually it’s any two selected from a range of books on offer). BOGOF is equivalent to a maximum 50% price reduction where both products have the same RRP, a little less if the products vary in price. 3-for-2 (equivalent to a maximum 33.3% price reduction) or buy one, get one half price (maximum 25% reduction) are more common.
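The percentages quoted above follow from simple arithmetic, sketched here in Python. The function name and the prices are illustrative assumptions; the sketch assumes the discount always applies to the cheapest item in the basket.

```python
def effective_discount(prices, fraction_off_cheapest):
    """Saving as a fraction of the full basket price."""
    saving = min(prices) * fraction_off_cheapest
    return saving / sum(prices)


bogof_same_price = effective_discount([10, 10], 1.0)        # cheapest item free
bogof_mixed_price = effective_discount([10, 8], 1.0)        # a little under 50%
three_for_two = effective_discount([10, 10, 10], 1.0)       # one of three free
half_price_second = effective_discount([10, 10], 0.5)       # second item half price

print(bogof_same_price, bogof_mixed_price, three_for_two, half_price_second)
```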


Template of standard clauses used to create contracts, for example between authors and publishers.


ONIX does not define what a ‘book’ is. Some legal systems set a minimum number of pages, below which a low-extent publication is a ‘pamphlet’ or similar, but within the ONIX framework this distinction is left to the data provider.

Book block

Part-bound book, with all the signatures gathered, bound and trimmed, but before the cover is added.


GTIN-13s are normally allocated nationally, with the first two or three digits indicating the country. Bookland is the fictional country to which the 978 and 979 prefixes used for ISBNs are assigned. In this way, the range of ISBNs becomes a small subset of the larger GTIN numbering scheme.


BookNet Canada is a non-profit organization that develops technology, standards, and education to serve the Canadian book industry. Founded in 2002 to address systemic challenges in the industry, BookNet Canada supports publishing companies, booksellers, wholesalers, distributors, sales agents, industry associations, literary agents, media, and libraries across the country.

Book proof

Paginated and bound proof copy, usually without the final cover and with text that still requires final corrections. Used for marketing and (sometimes) review purposes, as well as final proofreading and correction. See also advance reading copy, page proof.


Reference catalog or service providing aggregated metadata – both bibliographic and commercial – aiming to cover all books available in the market (in print or at least in commerce). Often compiled on a national basis, and used by book retailers, libraries etc. BIP services can usually provide information on OP and out of commerce titles too.


In databases, a data value that can be either True or False. (A third possible value – null or ‘unknown’ – is usually also an option.)

Bound proof

See book proof, advance reading copy.


See Set.

Brackets, Braces, Parentheses

Paired typographic symbols – brackets ‘[ … ]’, braces ‘{ … }’, parentheses ‘( … )’. In text, brackets are often used to surround sections of quotations that are not verbatim, and parentheses for subsidiary phrases or clarification. These are often called ‘square’, ‘curly’ and ‘round brackets’ in the UK.


Thickness of a sheet of paper, usually measured in microns (µm, thousandths of a millimetre). Typical book paper is around 90–120µm. In the US, often termed Caliper, and measured in mils (thousandths of an inch). See also paper weight. Strictly, the relationship between a paper’s weight and bulk is the density (mass per unit volume), and higher quality papers generally have higher density, but confusingly, a paper’s ‘density’ is often used as a synonym of the paper’s weight or grammage (ie mass per unit area) without regard to bulk or caliper.


See bit.

Byte order mark

In the UTF‑16 encoding of Unicode characters, each character is represented by two or more bytes of information. But these bytes might be in either order – something like saying either ‘seventy three’ or ‘three and seventy’. The latter could easily be misinterpreted as 37. A special character, a byte order mark, may be included as the first character in a Unicode file to make it clear which way around the rest of the file is. However, the strong recommendation in ONIX is to omit byte order marks, and to declare either UTF‑16BE (‘big endian’, like seventy three) or UTF‑16LE (‘little endian’, like three and seventy) explicitly in the first line of the XML file. A byte order mark is valid in UTF‑8 too, but it has no real meaning, and again should be omitted.
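A short Python sketch shows the two byte orders and the effect of reading them the wrong way around. It uses only the standard library; the single-letter text is illustrative.

```python
import codecs

text = "A"

# The same character encoded in the two possible byte orders.
big_endian = text.encode("utf-16-be")     # high byte first
little_endian = text.encode("utf-16-le")  # low byte first
print(big_endian, little_endian)

# Reading little-endian bytes as big-endian yields a different character.
misread = little_endian.decode("utf-16-be")
print(repr(misread))

# With a byte order mark in front, a generic UTF-16 decoder can detect the order.
with_bom = codecs.BOM_UTF16_LE + little_endian
detected = with_bom.decode("utf-16")
print(detected)
```

Declaring the encoding explicitly, as ONIX recommends, avoids relying on this detection step.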



See bulk.

Canadian contributor BOOKNET CANADA ADDITION

An author, illustrator, translator or editor (in the case of an edited collection of material) who is a Canadian citizen or a permanent resident of Canada.

For Canadian market context, refer to the BookNet Canada documentation on


Abandon plans for publication before a book is published, see AB.

Removal and replacement of a page from a book, or the reprinted sheets for replacing canceled pages.


In XML, data modeling and database design, whether a data element or composite is optional or mandatory, and whether or not it is repeatable, within a particular DTD or schema. In the ONIX documentation, a cardinality of 0…1 means the element is optional, 1 means mandatory, 0…n means optional and repeatable, and 1…n means mandatory and repeatable. Cardinality is often a simplification of the full requirements of a schema or data model, since the requirements can be contextual – they depend on other data values. In ONIX, <ROWSalesRightsType> is 0…1 (ie optional), but in many circumstances is actually mandatory (it is dependent on the data provided in various <SalesRights> composites). Such contextual requirements cannot be expressed in the XML DTD or schema.


Box made of paperboard or corrugated fibreboard, and used by manufacturers to pack multiple copies of a book ready for distribution. Bulk shipments of books are packed in cartons and then stacked on pallets. A carton might hold anything between four and 100 or more copies, depending on the size of the book and carton. Retailers can order in multiples of this carton quantity (occasionally, case quantity) for convenience, though in general, orders for any number of copies are accepted. Sometimes called a binder’s pack.


Some scripts (eg Latin, Cyrillic, Greek) include distinct upper case and lower case variations of each alphabetic character, with lower case used for the majority of text and upper case used for initial characters in each sentence, on nouns, etc. Upper case (or capital, or majuscule) letters are so called because when moveable type consisted of small cast blocks of metal, the capitals were kept in a wooden box or case on the top shelf, and lower case (minuscule) letters were kept in a case on the lower shelf. Typographically, majuscule letters are all more or less the same height, whereas minuscule letters have variable height with ascenders and descenders. Other alphabetic and logographic scripts (eg Arabic, Devanagari, Hebrew, Hanzi and Kanji) do not maintain distinct cases.

The cover and spine boards of a case-bound book. cf slip-case.

Occasionally, a synonym for carton, as in the phrase ‘case quantity’.


Book bound with rigid board covers – a hardback. Not to be confused with a slip-case, a separate board ‘sleeve’ the book slides into. cf limp-bound, a paperback.

Cast off

Calculation of the likely number of typeset lines or pages from the number of characters or words in text and the line width, page height and type size.

CC, CC-By, CC Zero etc

See Creative Commons.


See BIC.

CD, Compact disc

Optical disc holding digital data – often digital audio data – developed by Philips and Sony, based originally on the CD Digital Audio ‘Red Book’ standard for high-quality audio (44.1kHz sampling rate, 16 bits per sample, two channel stereo, giving about 1411kbits per second; cf compressed MP3 or AAC audio files at perhaps 128kbits/s, which sacrifice a little fidelity for much lower file size). Other CD standards allow up to about 700MB of ordinary data files to be stored on a disc. See also DVD.
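The CD audio data rate follows directly from the three figures in the Red Book description, as this small Python calculation shows. The variable names are illustrative, and the 128 kbit/s figure for compressed audio is the example rate quoted in the entry.

```python
sample_rate_hz = 44_100   # samples per second, per channel
bits_per_sample = 16
channels = 2              # stereo

# Uncompressed CD audio data rate in bits per second.
cd_bits_per_second = sample_rate_hz * bits_per_sample * channels
print(cd_bits_per_second)

# Rough size ratio against a compressed file at 128 kbit/s.
compression_ratio = cd_bits_per_second / 128_000
print(round(compression_ratio, 1))
```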


Consumer direct fulfillment, see Drop shipment.


In dates, Common Era of the Gregorian calendar, secular equivalent to AD (anno domini). cf BCE, Before Common Era.

In product certification, the CE logo on a product is a declaration by the manufacturer that indicates it conforms to European Union legislation and directives, for example on product and materials safety.

In font names, CE usually indicates the font includes a repertoire of characters suitable for Central European languages such as Polish or Czech. These fonts often support the Latin-2 character set and encoding.

C format

Less common UK term for a paperback in a size more typically used for trade hardbacks – sometimes around 216 × 135mm in size (Demy), but equally often 234 × 153mm (Royal) or another size. More typically just termed a trade paperback.


Originally a small book or pamphlet of popular, sensational, juvenile, moral or educational content once sold by street merchants or peddlers known as ‘chapmen’. In modern use, may be almost any short booklet, often a children’s book. Occasionally (and probably wrongly) termed a ‘chapter book’.

Character encoding

See character set.

Character entity

Method of encoding non-ASCII characters in HTML, for example ‘&hellip;’ for an ellipsis, now largely unnecessary with widespread use of Unicode characters. While character entities were used with earlier versions of ONIX, they are not valid in ONIX 3.0.
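
As a rough illustration of why named entities cause trouble outside HTML, the sketch below uses only the Python standard library. The <Text> wrapper is just an illustrative element, and the parser is a generic XML parser with no DTD loaded, not an ONIX validator:

```python
import html
import xml.etree.ElementTree as ET

# HTML defines the named entity, so an HTML-aware decoder resolves it
# to the Unicode ellipsis character U+2026.
ellipsis = html.unescape("&hellip;")


def parses_as_xml(fragment: str) -> bool:
    """Return True if a generic XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A plain XML parser does not know the HTML entity names...
named_ok = parses_as_xml("<Text>Wait&hellip;</Text>")
# ...but accepts the literal Unicode character or a numeric reference.
literal_ok = parses_as_xml("<Text>Wait\u2026</Text>")
numeric_ok = parses_as_xml("<Text>Wait&#8230;</Text>")

print(named_ok, literal_ok, numeric_ok)  # False True True
```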

Character set

A defined list or repertoire of characters. A Character encoding then defines how this repertoire is represented by a computer. For example, ASCII lists a repertoire of 95 printable characters including space – plus a selection of non-printable ‘control characters’ including tab, new line and so on – and encodes them using the numbers 0–127 (or 00000000 to 01111111 in binary). Latin‑1 lists 191 characters, and encodes them using the numbers 0–255. Windows‑1252 is a different encoding of around 215 characters also using the numbers 0–255 – and obviously this means that if some text is encoded using Windows‑1252 and then displayed as if it were Latin‑1, some characters will be displayed wrongly or not at all. See also Unicode, a character set of more than 130,000 characters.
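
The mismatch described above is easy to reproduce. This small sketch decodes one and the same byte under three encodings, using Python's built-in codec names (cp1252 for Windows‑1252, latin-1 for Latin‑1):

```python
# The same byte decoded under three different encodings.
raw = bytes([0x93])  # one byte with the value 147

as_cp1252 = raw.decode("cp1252")    # Windows-1252: left double quotation mark
as_latin1 = raw.decode("latin-1")   # Latin-1: an invisible control character

print(repr(as_cp1252))  # '“'
print(repr(as_latin1))  # '\x93'

# ASCII only covers 0-127, so the byte cannot be decoded at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(ascii_ok)  # False
```

Text typed with ‘curly’ quotes in a Windows‑1252 application and then read as Latin‑1 therefore loses its quote marks, which is one of the most common symptoms of an encoding mix-up.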

Check digit

Many identifiers include a numerical check digit, calculated arithmetically from the other digits. For an ISBN, for example, calculating what the check digit should be based on the first twelve digits, then comparing it with the actual check digit indicates whether the ISBN is likely to be correct, or whether an error has been introduced – a mistyped digit, two digits transposed etc. Different identifiers use different mathematical procedures (or ‘algorithms’) for calculating the check digit.
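
For ISBN-13 (and GTIN-13), the calculation weights the first twelve digits alternately by 1 and 3, sums them, and takes the check digit as whatever is needed to reach the next multiple of ten. A minimal sketch, assuming digits-only input with hyphens already removed; the function names are illustrative, and the ISBN is a commonly cited example value:

```python
def isbn13_check_digit(first_twelve: str) -> int:
    """Calculate the ISBN-13 check digit from the first twelve digits."""
    if len(first_twelve) != 12 or not first_twelve.isdigit():
        raise ValueError("expected exactly twelve digits")
    # Weights alternate 1, 3, 1, 3, ... starting with the first digit.
    total = sum(
        int(digit) * (3 if index % 2 else 1)
        for index, digit in enumerate(first_twelve)
    )
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Compare the calculated check digit with the thirteenth digit."""
    if len(isbn) != 13 or not isbn.isdigit():
        return False
    return isbn13_check_digit(isbn[:12]) == int(isbn[12])


print(isbn13_check_digit("978030640615"))  # 7
print(is_valid_isbn13("9780306406157"))    # True
print(is_valid_isbn13("9780306406158"))    # False (mistyped final digit)
print(is_valid_isbn13("9780306046157"))    # False (4 and 0 transposed)
```

A matching check digit only shows that the ISBN is plausible. It does not prove the number was ever assigned, and this particular algorithm cannot detect every possible transposition.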

Chicago Manual of Style

Widely used guide for spelling, punctuation, grammar and typographic style in American English, derived originally from guidelines set by the University of Chicago Press. cf Hart’s Rules.


Commission Internationale de l’Eclairage, the body responsible for colorimetry standards, against which colorspaces such as Adobe RGB, sRGB or device-specific color profiles are characterized.


Cataloging in Publication, limited bibliographic information produced by national libraries prior to publication of a book, based on information supplied in advance by publishers. The CIP information is often printed within the book itself on the title verso page.


International Cooperation for the Integration of Processes in Prepress, Press, and Postpress Organization, a standards organization focusing on process automation and improved workflow in the print industry. CIP4’s key technical standard is JDF (Job Definition Format), an XML messaging system used in print production.


Chinese Library Classification, library subject classification scheme used in China. See also UDC, DDC, LCC.


Commission de Liaison Interprofessionnelle du Livre, a French book trade organization.

In the context of ONIX and metadata, Classification CLIL is the subject classification developed by the Commission and used widely in the French book trade. See also Thema.


Term for a hardback/hardcover book, generally only used in a bookbinding or specialist publishing context (‘…in cloth’). Also the textile or faux material used to cover the boards of a hardback.

Cloth book

See rag book.


Content management system, system used to manage the creation and editing of material destined for publication (in a book, or on a website).


Cyan, magenta, yellow, key (or black), basic subtractive color model used to simulate (at least in theory) the full range (or gamut) of visible colors in color printing with just four colored inks (and halftoning). In practice the range of printable colors is more limited than the full visible range. See also RGB.


Coder–decoder. Loosely, the compressed file format used to store files containing image, audio or video data. Since such files can often be very large, the data is compressed mathematically. JPEG is a lossy format for image data, whereas TIFF is lossless. AAC and MP3 are two different lossy formats for audio data. These are referred to as ‘different codecs’. More strictly, the codec is the software used to compress or decompress the data in a particular format.


Term used in ONIX documentation for a controlled vocabulary or authority file. In addition, codelists define a language-independent notation – the code – for each term or concept in the vocabulary. Only the code appears in ONIX data. Some codelists have an implied hierarchy (for example list 150, where BA is clearly a broader term than BB or BC), making some lists taxonomies rather than simple vocabularies. An interactive codelist browser is available at See also SKOS, keyword.


A book – printed or manuscript content arranged in discrete pages, bound down one edge (the spine), or occasionally folded accordion-style. cf scroll.


See co-publication.


Process of sorting, the sort order or procedure used, or the process of checking items have been sorted correctly.


Fixed or indefinite number of products that share some collective identity such as a collective title. Members of the collection usually also have other attributes in common, such as product form or a branding or design style. A set or a series is a collection, but a collection could also comprise a less formal selection of products.


Logo of publisher or imprint on the spine or title page of a book.

A statement about production details such as typeface, paper grade and binding printed on the title verso or at the end of the book.

Color gamut

The range of colors available within a particular colorspace (for example, within sRGB) or on a particular display or printed page (see CMYK), often measured against the full range of visible colors (or ‘chromaticities’) as defined by the CIE.

Color profile

Extra metadata embedded inside an image file that specifies what colorspace the image is ‘in’ (JPEG, TIFF, PNG images can carry ICC color profiles – but not all image formats support the inclusion of profiles). The profile relates real-world colors to digital color values, and an ‘input profile’ is a characteristic of the camera, scanner etc used (or of the software used to generate or manipulate the image). The color profile defines a device-specific ‘colorspace’. A second profile – an ‘output profile’ – belonging to the printing system or display device can be combined with the digital image and its embedded input profile to compensate for less than ideal color accuracy in both input and output and thus present the final printed or on-screen image as close as possible to the original real-world color. Profiles can also be used to convert between device-specific colorspaces and idealised standard colorspaces like sRGB, Adobe RGB or CMYK. See also CIE.


In an IT context, the ability of a system to interoperate with other systems. Backward compatibility is the ability of a system to accept and correctly process input intended for older systems. A data format is backward compatible with its predecessor if data that is valid under the old format remains valid and retains its meaning under the new format. Forward compatibility is the reverse, the ability of a system to accept and process input in a format intended primarily for later versions of itself, although of course it may not be able to extract all data from the newer format.


Part of a product – a single volume of a multi-volume set (when sold as a single product), a single CD in a multi-disc audiobook, a book in a book plus toy bundle. cf item, though note that a component of a multi-component trade pack can become a product or item in its own right when sold at retail.


In ONIX, a sequence of XML data elements can be nested inside another pair of start and end tags, forming a ‘composite’ that emphasizes the logical structure of the data. For example, all the data about one contributor is nested between <Contributor> and </Contributor> – inside a <Contributor> composite. In effect, the ‘data’ inside the composite data element consists of other data elements. Many such composites are repeatable, for example if there are multiple contributors.

An example can help. The repeatable <ProductIdentifier> composite contains two mandatory data elements: <ProductIDType>, a code from ONIX Code List 5 that defines the type of identifier, and <IDValue>, the identifier itself. Code ‘15’ indicates ISBN-13, so the accompanying <IDValue> contains the ISBN. The composite repeats, but each code is used only once: you give the book’s ISBN-13 once, because it has only one ISBN, but if you also include the GTIN-13 (and you should), that takes a second Product Identifier composite.

ONIX is structured data so, in addition to repeating, composites can appear in multiple positions within an ONIX record. Using the ONIX record’s XML tree to illustrate:
Product / Product Identifier
contains the Product Record’s ISBN, but Product Identifiers can also appear in
Product / Related Material / Related Product / Product Identifier
where the same composite carries the same codes, but is now contained within a repeating Related Product composite that is defined by its <ProductRelationCode>. The Product Identifier composite uses the same code to identify an ISBN-13, but the value represents a different book, whose relationship to the Product Record is defined by the Product Relation Code.
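
The two positions can be seen side by side in a small sketch. The fragment below is hypothetical and heavily trimmed: it omits the namespace and most mandatory elements, the ISBN values are invented, and the relation code is only a placeholder. It is read with Python's standard XML parser purely to show how the same composite is interpreted by its position:

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ONIX-style fragment, for illustration only.
record = """
<Product>
  <ProductIdentifier>
    <ProductIDType>03</ProductIDType>
    <IDValue>9780000000002</IDValue>
  </ProductIdentifier>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780000000002</IDValue>
  </ProductIdentifier>
  <RelatedMaterial>
    <RelatedProduct>
      <ProductRelationCode>06</ProductRelationCode>
      <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9780000000019</IDValue>
      </ProductIdentifier>
    </RelatedProduct>
  </RelatedMaterial>
</Product>
"""

root = ET.fromstring(record)


def identifiers(parent):
    """Return (ProductIDType, IDValue) pairs for composites directly under parent."""
    return [
        (pid.findtext("ProductIDType"), pid.findtext("IDValue"))
        for pid in parent.findall("ProductIdentifier")
    ]


# Identifiers of the product itself: GTIN-13 (03) and ISBN-13 (15).
own_ids = identifiers(root)

# The same composite nested inside a related product identifies a different book.
related = root.find("RelatedMaterial/RelatedProduct")
related_ids = identifiers(related)
relation = related.findtext("ProductRelationCode")

print(own_ids)
print(relation, related_ids)
```

The lookup function is identical in both cases. Only the parent element changes, which is what gives the identifier its meaning.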


Data compression is the mathematical process of reducing the size of a file, for example by eliminating repetition and redundancy. Different compression methods are either lossy or lossless. Lossless compressed files can be expanded back to reconstitute the exact original data, but lossy compression – often used with image, video or audio files – discards less important sounds or image detail to make the compressed file even smaller, so the re-expanded file is only approximately the same as the original. In practice, the difference may be invisible or inaudible, but lossy compression is obviously unsuitable for use with text or numerical data. AAC and MP3 are lossy audio codecs, JPEG is a lossy image codec and AVC is a lossy video codec. TIFF is (almost always) a lossless image file format, WAV is lossless audio, and Zip losslessly compresses any file (but worth knowing that zipping a file that is already compressed – like zipping a JPEG, for example – does not usually make it any smaller…). See also codec.
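
Lossless compression can be tried out with Python's zlib module, which implements the same family of compression used in Zip files. This is only a sketch with an artificially repetitive sample text, so the sizes it prints are not representative of real book files:

```python
import zlib

# Deliberately repetitive sample data, so there is plenty of redundancy.
text = ("An ONIX Glossary of Terms. " * 200).encode("utf-8")

packed = zlib.compress(text)
restored = zlib.decompress(packed)

# Lossless: the expanded data is byte-for-byte identical to the original.
print(restored == text)        # True
print(len(text), len(packed))  # original size, compressed size

# Compressing already-compressed data gains little or nothing.
repacked = zlib.compress(packed)
print(len(packed), len(repacked))
```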


See sale or return.


In typography, a typeface design that is tall and narrow, or narrower than usual for a particular family of typefaces, so more characters fit on a single line of text.

See abridged.


Person or organization – more generally, the party – responsible for creating the intellectual or artistic content of the product. ONIX is usually only concerned with contributors named on the product itself, and then only with their outward-facing public identity or persona. The publisher acquires rights to exploit the intellectual or artistic content created by the contributor, in return for fees or a royalty.

Controlled vocabulary

An exhaustive list of terms that can be used in a particular context or data element. Each term in the vocabulary should have an unambiguous definition or explanation, and may include both preferred terms and less-preferred synonyms. Controlled vocabularies may be a ‘flat’ list of terms, or the terms may be arranged hierarchically – in which case the vocabulary is sometimes called a taxonomy. See also codelist, authority file.


Co-operative marketing, arrangement whereby the publisher subsidizes or pays for advertising and promotional activities (A&P) carried out by a retailer.


When two publishers co-operate to publish a book, they may create and sell a single product (a co-publication), or may create a shared ‘base’ from which two different products are derived and sold (these are Co-editions). A co-publication may carry the branding for both publishers, and may even carry two separate ISBNs (one from each publisher), but it is a single product. In contrast, co-editions have separate identities (including separate ISBNs for each publisher’s version), even though they might sometimes differ by no more than the branding and the publisher details. More often, the shared base for a co-edition may consist only of the color images, to which each publisher adds entirely separate text – this type of co-edition is very common in multi-language illustrated books, as it offers significant savings in shared production costs.


A play on conventional copyright: a licensing arrangement whereby a work (most often computer software) may be copied, used, re-distributed, adapted or modified, free of any restrictions, on condition that anything derived from it is also free of the same restrictions and bound by the same condition. Some Creative Commons licenses are copyleft licenses.

The exclusive legal rights to perform, display, reproduce, sell, modify, adapt or otherwise use original work or other intellectual property that is expressed in text, images, sound – a right enshrined in the ‘Berne Convention’, originally agreed in 1886 but revised and updated several times since – most recently by the Marrakesh Treaty. The copyright in a work is held by the author or creator, and can subsequently be passed on (eg to the author’s estate), or licensed or assigned to publishers (and others) in a contract. In most jurisdictions, copyright (which is essentially a commercial right) is accompanied by inalienable Moral rights such as the right to be identified as the author, and protection for the integrity of the work. Unlike rights over other intellectual property such as a patent or a trademark, copyright is automatic – you don’t need to register your work to gain legal protection, though in some countries, registration can be beneficial and in others, the Moral rights must be explicitly asserted (for example, within the work itself). Copyright in a new textual work usually persists for 70 years after the death of the original creator (occasionally slightly more), and limits exploitation of the work by those other than the copyright holder, licensees or assignees (collectively, rightsholders). The term of copyright has varied significantly across different countries during the last century, so 70 years after death is not always correct for older works. In some countries, Moral rights expire alongside the commercial copyright. In others, they are perpetual. After expiry of the commercial rights, the work passes into the public domain. Certain groups, eg print-impaired readers, may hold a legal copyright exception and can make copies for personal use without obtaining permission from the rightsholders. 
Other uses of copyright material may also be allowed without explicit permission (eg for purposes of education, research, parody, for review and criticism, for digital backups etc) under ‘fair use’ or ‘fair dealing’ provisions of national law, but the scope and detail of these exceptions vary from country to country (see also the ‘three step test’).

See copyright.

Also termed the Imprint page. In a printed book, the title verso on which the copyright notice, publisher and imprint details, the ISBN and impression number, CIP and other details about the publication of the book usually appear. In an e-publication, this is often added at the end of the book.


Camera-ready copy, largely obsolete term for typeset or graphical material ready for photographic transfer to a printing plate.

Cyclic redundancy check, a number calculated according to some mathematical algorithm and appended to digital data as it is stored or transmitted, to enable later detection of errors. On receipt or retrieval of the data, the algorithm is recalculated and compared – any difference indicates an error. The concept is similar to a check digit within an identifier.
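
A minimal sketch of the store-then-verify cycle, using the CRC-32 function in Python's zlib module. The payload is an arbitrary example and the helper name is illustrative:

```python
import zlib

# Data as originally stored or transmitted, plus its CRC-32 value.
payload = b"<IDValue>9780306406157</IDValue>"
stored_crc = zlib.crc32(payload)


def is_intact(data: bytes, expected_crc: int) -> bool:
    """Recalculate the CRC on receipt and compare with the stored value."""
    return zlib.crc32(data) == expected_crc


# Simulate a single corrupted character somewhere along the way.
corrupted = payload.replace(b"6157", b"6158")

print(is_intact(payload, stored_crc))    # True
print(is_intact(corrupted, stored_crc))  # False
```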

Creative Commons

Organization that develops copyright licenses that permit and encourage free sharing of creative works. Some CC licenses are copyleft (in particular the SA ‘Sharealike’ variants), whereas others reserve some rights to the creator (eg CC-By reserves the right to be credited as the author). The related CC0 (‘CC Zero’) waiver waives all possible rights (including the right to place – or prevent placement of – licensing restrictions on derived works). However, CC Zero cannot waive certain inalienable moral rights. See also Open Access, Public Domain.

Cross grain

In US, against the grain. See long grain.


Customer relationship management, integrated management of an organization’s customer and product support and other interaction with its customers and potential customers across multiple channels (eg via phone, website, e‑mail, social media, and advertising and marketing activities), and more narrowly, the IT systems to support and analyse those business processes.


Corporate social responsibility, ethical principles, policies and actions of an organization that promote social or environmental good, going beyond legal requirements, via corporate philanthropy, responsible and sustainable business and supply chain practices, cause-related and social marketing, etc.


Cascading Style Sheets, W3C standard for markup used alongside HTML to control the appearance of web pages (where the HTML markup largely defines the structure). CSS is also used in EPUB, which is also HTML-based.


A Comma-Separated Value data file consists of tabular data (rows and columns) with each value stored as ordinary text, columns separated by commas and rows by line breaks. If a value itself includes a comma, then the value is enclosed in quote marks (in some files, all values are enclosed in quotes). CSV files are often a last-resort data exchange format: they are easy to read and write with a spreadsheet application (eg Microsoft’s Excel), but it’s not always clear whether a value might consist of multiple lines of text, how quote marks themselves might be encoded, whether the table includes column headers (field names), and what character set or encoding should be used. See also tab-separated file.
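
The awkward cases mentioned above (embedded commas, quote marks and line breaks) can be seen in a short sketch using Python's csv module with its default ‘excel’ dialect. This is just one common convention, and other software may write or expect something different:

```python
import csv
import io

rows = [
    ["title", "note"],                                   # column headers
    ["Smith, Jones and Co", 'She said "hello"'],         # comma and quote marks
    ["Two-line value", "line one\nline two"],            # embedded line break
]

# Write the rows to an in-memory text buffer instead of a file.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
encoded = buffer.getvalue()
print(encoded)

# Read the text back: values containing delimiters survive intact
# because the writer enclosed them in quote marks.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))
print(decoded == rows)  # True
```

In this dialect a quote mark inside a value is written as two quote marks, and any value containing a comma, quote mark or line break is enclosed in quotes.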



See B2C.


Digital Asset Distributor, organization that facilitates distribution of digital assets such as e‑books to online retailers and libraries on behalf of the publisher. The service may encompass a managed asset repository, file format conversion services, metadata and e‑book distribution, and aggregation of sales statistics.


Typographic symbol ‘ † ’ sometimes used to indicate footnotes in text. Also comes in double dagger (‘ ‡ ’) flavor.

DAISY Consortium

Digital Accessible Information System, a consortium of organizations working to promote ‘inclusive publishing’ and the availability of accessible editions of books to all, including print-disabled readers. See also DTB, EPUB.


A Digital Asset Management system manages the ingestion and storage of digital assets, their cataloging and metadata, search and retrieval, and sometimes distribution. DAMs can be structured like a library aimed at simplifying the reuse of assets, or as a workflow tool forming part of a production system.


Common typographic dashes come in different lengths, -, – and —. Hyphens (the shortest) are used to connect compound words or split words at the end of a line in justified text. En dashes (the middle size) can be used for parenthetical phrases – usually like this, with spacing – or unspaced to indicate a range such as A–Z or 1–9. Em-dashes are used for parenthetical phrases—usually like this, without spacing—or to indicate an abrupt halt or discontinuity in a sentence. An em dash is about one em long. In some languages and typographic styles, em dashes or a slightly longer ‘quotation dash’ are commonly used for dialogue, in place of opening quotation marks. And in maths, the subtraction sign ‘−’ is about the size of an en dash, but is often matched to the widths of the digits.


An organized collection of data. Modern databases are normally electronic, often with tables of data arranged in columns and rows, and relationships between tables (see relational database). The data is organized to model aspects of the real world, and to support various business processes through manipulating that data.

Data element

In XML documents such as an ONIX message, a single XML tag and its content – text, numeric or other data contained between a pair of markup tags. Sometimes loosely termed a ‘data field’. cf attribute, composite.

Day and date

Movie industry term for simultaneous release of linked media properties, as when releasing a tie-in book carefully timed to publish on the same day as the film opens in cinemas. Such releases are usually embargoed to prevent early sales from bookshops. cf windowing.


Distribution center, a distributor or wholesaler’s warehouse.

In metadata contexts, see Dublin core.


Extended RGB colorspace, allowing a much wider range or gamut of colors on screen than standard sRGB. Allows more detail in red colors, but not as much in green as Adobe RGB. See also color profile.


Demand-driven acquisition, also termed PDA, patron-driven acquisition, where libraries have instant access to a complete catalog of e‑books, without having to purchase them up-front. Purchase or licensing of a particular e‑book is triggered automatically when actual usage of the book exceeds an agreed threshold (eg a library patron reading more than a certain number of pages).


Dewey Decimal Classification, common subject classification system used in libraries, named for its creator Melvil Dewey. See also UDC, LCC, CLC.

Deckle edge

Page edges of a book block left rough and untrimmed, or more likely, carefully trimmed to make them look rough and untrimmed. Also termed a Rough front.


Character or character sequence used in data files to separate one discrete data element from the next. CSV files use commas as delimiters. XML uses angle brackets (< and >) as separators between data and markup. Within a few ONIX data elements such as <CountriesIncluded>, a space is used as a delimiter in a list of country codes. There is a clear problem whenever a delimiter character occurs within the data itself – this is why XML data must use &lt; to represent the < symbol within the data.
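
Both cases can be sketched with the Python standard library: splitting a space-delimited list of country codes, and escaping markup characters that occur within data. The country codes and sample text are arbitrary examples:

```python
from xml.sax.saxutils import escape, unescape

# Space-delimited list, as used within <CountriesIncluded>.
countries_included = "CA US MX"
country_codes = countries_included.split(" ")
print(country_codes)  # ['CA', 'US', 'MX']

# Text containing characters that XML reserves for markup.
raw_text = "Ages < 5 & up"
escaped = escape(raw_text)
print(escaped)  # Ages &lt; 5 &amp; up

# Unescaping recovers the original text exactly.
print(unescape(escaped) == raw_text)  # True
```

Note that the ampersand must be escaped too, since it is itself the delimiter that introduces an escape sequence.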

Delta update

See organization of data delivery.

Demy, Demi

Common hardback book size in the UK, typically around 216 × 135mm. Pronounced as in ‘deny’. See also Royal, trade paperback.


See bulk.


Deprecated data elements or codes are not recommended for use, and in general are strongly discouraged. In most cases, another element or code is recommended instead. Deprecation implies obsolescence, but the element or code does remain technically valid.

Derived work

See work.


See ascender.

Desk copy

See approval copy.


See DDC.

Diacritical mark, diacritic

Small decoration or accent added to a letter in Latin and other scripts, which modifies the pronunciation of the letter. Diacritics can also indicate the presence of unwritten vowels (eg in Arabic script), or indicate tones (eg in Chinese pinyin) or prosody. Common Latin accents include acute, grave, circumflex, háček, ring, tilde, cedilla etc, but there are many other types in other writing systems and languages. The effect of diacritics on alphabetic sorting varies by language – some (eg French) ignore accents for sorting purposes, others (eg Swedish) treat accented characters as whole new letters at the end of the alphabet.


Device for scoring, folding, cutting, stamping or embossing paper or card during manufacture.

Dieresis, diaeresis

Diacritical mark indicating (in English pronunciation) a vowel that is pronounced separately from the adjacent vowel in what would otherwise be a diphthong (two vowel characters indicating a single vowel sound). For example, a dieresis is used in ‘naïve’. A dieresis has roughly the opposite effect from a vowel ligature (which indicates they are pronounced as a single sound), but both have fallen out of common typographical use. The same ‘two dots’ symbol is more commonly used in Germanic languages for an umlaut, which has a different effect on pronunciation, for example ‘schon / schön’.

Digipack, Digipak

Folded card packaging with a distinct spine and one or more plastic trays affixed to hold a CD or DVD, as distinct from card sleeve packaging or plastic jewel or keep cases.

Digital asset distributor

See DAD.

Digital certificate

In cryptography, a digital certificate or digital signature is a data file that securely identifies a particular web server or (less often) a legal entity such as an organization. On the internet, a certificate is used in the first stage of creating a connection via HTTPS, so you can be sure the server you are interacting with really is EDItEUR’s web server and can therefore trust the information it provides. It also allows the communication to be encrypted and private. Certificates can also be used to validate identities in e-mail exchanges, and ensure the integrity of software applications.


Also Retailer discount, Trade discount. In some countries, books have established wholesale or business-to-business prices. In others, business-to-business transactions are based on a discount from the fixed or recommended retail price agreed by the parties. The discount given by a distributor or wholesaler varies from retailer to retailer (bigger retailers sometimes get more discount) and from book to book (discounts are often greater on mass market fiction than on specialist non-fiction). Assuming the books are sold to end customers at their normal retail price, the trade discount represents the retailer’s gross margin (sales revenue minus cost of goods sold). See also reseller model.

More loosely, Discount can also refer to books sold at retail for less than their recommended retail price (which reduces the retailer’s margin).

Discount code, Discount group code

Index into a table of discounts shared in advance between supplier and retailer. Since each retailer may have a unique table, the actual discount percentage can be communicated without revealing commercially-sensitive information.

Distribution rights

Rights to make a product available in a particular market, a commercial right derived from the publisher’s publishing rights and conferred contractually on the publisher’s distributors, wholesalers and retailers. See sales rights.


Organization that holds the primary stock of books and is responsible for fulfillment (of trade orders) in a particular territory or market on behalf of the publisher. Wholesalers and retailers may act as intermediaries between the distributor and the end customer. Many large publishers own or operate their own distributor, and hold stocks themselves. Other publishers appoint a single, exclusive distributor per market or territory (and this exclusive distributor is sometimes termed the Vendor of record). Some publishers prefer to appoint multiple non-exclusive distributors.


See IP address.


XML markup for structured book texts and documentation. The text of the book contains markup that divides it up into parts, chapters, paragraphs, tables, lists, footnotes and so on. The markup is structural and semantic, rather than defining the typographic presentation or page layout, and the DocBook data can be processed automatically to create e‑books, large print, conventional print, synthetic audio versions of the book. See also TEI, JATS.


Digital Object Identifier – a digital identifier for a document or other object, generally one accessible via the Internet. Objects with DOIs can be the target of clickable links in other documents (eg a scholarly article may cite another article via its DOI), or the link may provide information about the object. In contrast to the superficially similar URL link, DOIs are managed (by the owner of the target object) so that third-party links to the object do not break when the target document is moved. In contrast with many other identifiers, DOIs are always associated with some kind of domain-specific service, and are actionable (clickable) to gain access to that service. The most common application of DOIs in publishing is CrossRef’s registration and resolution service for online scholarly material, which provides citation services, but DOIs are also used within the DataCite service for citation of and access to research datasets, and the Entertainment Identifier Registry (EIDR) identifier scheme for movie and TV assets. Like ISBN and ISNI, DOI is an ISO identifier, and is managed by the International DOI Foundation (IDF).


Dots per inch, the number of data points or pixels per inch (PPI) of resolution in an image. Note that the DPI of an image is variable – if the image is displayed at twice the size, the DPI halves. To print an image as a halftone, the DPI of the image should ideally be twice the halftone LPI (so at least 350dpi at the size the image will be required for optimum-quality results with a 175lpi halftone screen. More pixels are unnecessary, fewer means a lower-quality halftone).
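
The relationship between pixel dimensions, printed size and halftone screen can be put into a few lines of arithmetic. A minimal sketch, in which the function names and the 2100-pixel image are assumptions made for the example:

```python
def effective_dpi(pixels_wide: int, printed_width_inches: float) -> float:
    """Resolution of an image when reproduced at a given width."""
    return pixels_wide / printed_width_inches


def minimum_dpi(halftone_lpi: float) -> float:
    """Rule of thumb from the entry above: twice the halftone screen ruling."""
    return 2 * halftone_lpi


image_px = 2100                         # hypothetical image, 2100 pixels wide
at_6in = effective_dpi(image_px, 6)     # 350.0
at_12in = effective_dpi(image_px, 12)   # 175.0 (twice the size, half the DPI)

needed = minimum_dpi(175)               # 350 for a 175lpi screen
print(at_6in >= needed)                 # True: fine at 6 inches wide
print(at_12in >= needed)                # False: too few pixels at 12 inches
```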


Digital Rights Management, usually refers to technical protection measures (eg content encryption systems and other access control technologies, and also watermarking) that are used to enforce or monitor compliance with constraints on usage of digital content within e-publications. DRM can for example limit copying and redistribution of the digital content, printing, cutting and pasting of text, sharing and lending, and can also place a time limit on the use of the content to enable rentals. DRM seeks to protect intellectual property from copyright infringement, but can also (often inadvertently) prevent usages that are specifically allowed by copyright exceptions.

Drop shipment

Also termed Consumer direct fulfillment (CDF). In order to avoid retailers holding stock, while at the same time minimizing fulfillment time, distributors and wholesalers sometimes ‘drop ship’ goods direct to the retail customer. The retailer placing the order must pass the customer delivery address to the wholesaler or distributor. This is most common with POD copies of books.


See DTO.


Digital Talking Book, standard specification for an e‑book format accessible to blind or otherwise print-impaired readers (and also for the associated reading systems). Developed by the DAISY Consortium, and therefore also known as the DAISY standard (versions 2 and 3). Content of a DTB can range from text with XML markup (to be read using text-to-speech software), through combined text plus pre-recorded audio, to audio-only files with additional navigation functionality. DTB has been largely superseded by the EPUB 3 standard, which incorporates many features to make EPUB content more accessible.


Document Type Definition. Specifies the set of markup tags and the structure – in terms of mandatory and optional markup tags, their order and nesting – that may be used in a particular type of XML or SGML document. So the ONIX for Books DTD defines how tags like <Contributor> or <x448> may be used. See also schema.


Download To Own, also termed DST, EST, Digital or Electronic Sell-Through. Business model whereby e‑book files or other digital goods are downloaded with a perpetual license. cf DTR, rental.


Download To Rent. Business model whereby e‑book files are downloaded with a time-limited license. See also rental. cf DTO.

Dublin Core

Set of key metadata elements (including Title, Creator, Publisher, Date and so on) intended to standardize bibliographic description and facilitate access to documents and resources on the internet. There were originally (in 1998) just 15 elements in the Dublin Core Metadata Element Set (DCMES), with a further 40 terms (DCTerms) added later, but the imprecision of the term definitions and lack of a common data exchange format mean DC is used either very loosely, or with application-specific ‘profiles’ that prevent broad interoperability.

Dues, Backorders

Business-to-business orders taken by a publisher, distributor or wholesaler prior to publication or during a temporary period of unavailability (when they are more frequently termed backorders), for delivery when the book becomes available. Subscription orders are recorded as dues. cf pre-orders, which are consumer orders placed with a retailer.


Usually handmade mock-up of a book, folded together or bound from unprinted pages to demonstrate the physical nature of a planned product.


Temporary, free-standing decorative box, usually of cardboard, produced by the publisher to display copies of books in a retail environment. See POS.

DVD, Digital Versatile Disc or Digital Video Disc

Optical disc developed by Philips and Sony, originally for digital video. Similar to a CD, but able to hold much more data – up to 4.7GB – usually compressed and DRM-protected standard-definition digital video, but occasionally digital audio, software (eg video games) or other data. Much rarer double-sided and multi-layer variations of DVD can hold up to 17GB. See also Blu-ray.



Former European Article Number. See GTIN.

Edge decoration

Colored, marbled or gilded edges of the book block.


Electronic Data Interchange: system for highly-automated exchange of strictly-formatted business documents such as invoices, orders or P&A updates. Common EDI document formatting standards include X.12 (used in North America), Tradacoms (in UK) and EDIFACT (common across much of the rest of the world). These are not XML-based standards, but XML equivalents do exist under the EDItX banner.


Electronic Data Interchange for Administration, Commerce and Transport, the ISO 9735 standard developed by the United Nations that underlies much of the world’s e-commerce. Application of the standard to the book trade was coordinated by EDItEUR.


Not-for-profit membership-supported organization that develops and maintains, inter alia, the ONIX and EDItX families of message standards and the Thema subject categorization scheme. EDItEUR also manages the ISNI International Agency and until recently managed the International ISBN Agency.


See P.9 Edition. Historically and in book collecting, ‘edition’ carries the same meaning as impression (eg a limited edition, or a ‘first edition’ of Charles Darwin’s On the origin of species by means of natural selection), but in other contexts, does not imply that all copies are manufactured together, only that they are identical or very nearly so in content (ie there can be several impressions of a Third Edition in paperback, and a separate hardback of the same Third Edition). Significant revisions or changes to the content imply a new edition, and in <indecs> and ISTC terms, a new (derived) work. Other ONIX edition types specified in P.9 – facsimile, large print or Braille, for example – and loose terms such as airside edition used elsewhere are not classed as new works (though they are new manifestations).


Family of XML-based ‘transactional’ e-commerce messages developed by EDItEUR, intended as potential replacements for common EDIFACT, X.12 and Tradacoms EDI trading messages. The family also includes sales/revenue/sales tax report messages intended to simplify retail platform-to-publisher sales reporting of e‑books, and an inventory report covering both physical stockholdings (eg for reporting stock on consignment) and digital inventory (for reporting the on-sale status of e‑books).


Standardized e‑book file format aimed particularly at education material. EDUPUB, now more often termed ‘EPUB for Education’, is a specialized profile of EPUB 3.


European Economic Area, a free-trade area made up of the European Union countries plus three of the four EFTA countries, Iceland, Liechtenstein and Norway. The EEA agreement provides a ‘uniform internal market’ with freedom of movement for people, goods, services and capital, and great uniformity in regulations relating to trade, social policy, consumer protection, the environment and company law across 31 countries. In territorial rights contexts, ‘Europe’ often means the EEA – but at other times Europe implies the continent, which would additionally include Switzerland, the Balkans, Ukraine, Belarus, Russia and (possibly) Turkey and the trans-Caucasus (Georgia etc).


European Free Trade Association, comprising Iceland, Liechtenstein, Norway and Switzerland (though several other countries were previously members and are now members of the European Union). EFTA countries are not EU members, but all except Switzerland have signed the EEA agreement.


Entertainment ID Registry, and the DOI-based identifier used by the registry for the film, TV and video sectors. See also ISAN.


There is no special type of ISBN for e‑books. See ISBN.


See data element.


Typographic mark consisting of three dots (‘ … ’) indicating an omission, elision or continuation.


English language teaching.


Measurement equal to the point size of text – so for 12pt text, one em is 12pts. An ‘en’ is half an em. See also dashes.


Proscription of sales, or sometimes of publication of reviews, of a book prior to a particular date. See sales embargo date.


See Sales embargo date.

For Canadian market context, refer to the BookNet Canada documentation on


Europe, Middle East and Africa.


All text is encoded in some way in order to store and process it digitally: each character is stored as a number, and the relationship between the characters and the numbers is the encoding. ‘A’ might be stored as the number 65, or more likely as the binary equivalent 01000001. B is 66, C is 67 and so on. There are many different standardized encodings, though for historical reasons, most actually do use 65 for A (because ASCII is the basis of many later, more complex encodings). However, encodings vary greatly for characters that are not present on an English typewriter keyboard – é might be stored as 233 (11101001 in binary), or as two numbers 195 then 169 (11000011 and 10101001), or in another way. And most encodings have a very limited repertoire of characters (their character set), so there are many characters that cannot be encoded at all. See Unicode, UTF-8, Latin-1, Windows-1252.
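
The numbers quoted above can be reproduced with a short Python sketch. It uses only Python's built-in codecs; the helper name can_encode is illustrative and not part of any standard.

```python
# 'A' is stored as 65 (binary 01000001) in ASCII and in most later encodings.
a_code = ord("A")
a_binary = format(a_code, "08b")

# The character é (U+00E9) is one number in Latin-1 but two numbers in UTF-8.
e_acute = "\u00e9"
e_latin1 = list(e_acute.encode("latin-1"))   # a single byte
e_utf8 = list(e_acute.encode("utf-8"))       # two bytes
e_utf8_binary = [format(b, "08b") for b in e_utf8]


def can_encode(character: str, encoding: str) -> bool:
    """Report whether a character is in the repertoire of an encoding."""
    try:
        character.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The euro sign is outside the Latin-1 character set but is in Windows-1252.
euro = "\u20ac"
print(a_code, a_binary, e_latin1, e_utf8, e_utf8_binary)
print(can_encode(euro, "latin-1"), can_encode(euro, "cp1252"))
```

Decoding bytes with the wrong encoding is what produces the familiar garbling of accented characters, which is why the encoding of a data file should always be declared.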


Headband or tailband at top and bottom ends of a book spine – originally protective reinforcement of the binding of the book block, and thus implying a high-quality binding, but these days usually a purely decorative cord or strip of woven material sewn or glued to the spine of the block prior to addition of the cover.

End matter

See back matter.


See footnote.

End paper

Strong paper sheet used to affix the case of a hardback to the book block.


A thing, something that has a distinct and independent existence, and may carry various descriptive properties. Entities may be concrete or abstract, actual or potential, and their properties may often also be considered to be entities. ONIX allows description of various types of entities and the relationships between them – products, contributors, locations etc. See also abstract model, relational database.


EDItEUR Product Information Communication Standard – a data dictionary that inspired many of the data element definitions of ONIX.


Electronic Point Of Sale, a retailer till or checkout system often also used for sales data and stock control.


Standard e‑book file format developed and maintained by the IDPF and more recently by the W3C. The latest EPUB 3 version of the format builds on HTML5 and CSS, incorporates both reflowable and fixed-format variants, and includes features to help publishers optimize the accessibility of the content (obviating the need for specialized accessible editions). Note that ONIX has a number of data elements named <Epub… (for example <EpubTechnicalProtection>) which are intended for use with all types of e-publication. They are not exclusively for publications in the EPUB file format.


Enterprise resource planning, the integrated management of multiple high-level business processes, including inter alia purchasing and production, order-to-cash, customer service or CRM, financials including budgeting and accounting, perhaps even HR. Also commonly, the IT systems that support this integrated approach.


Corrections to the content of a book, sometimes corrigenda printed on a separate sheet and inserted into finished copies as an addendum.


See DTO.


European Union, political and economic union of 28 European member states (Austria, Belgium, Bulgaria, Croatia, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, United Kingdom).


End user license agreement, the license granted by the publisher or rightsholder to the final purchaser or reader of an e‑book.


See EEA.

Evaluation copy

See approval copy.

Exclusive rights

See Sales rights composite in Group P.21, though exclusivity can apply not only to publishing rights but also to other types of right (for example exclusive distribution rights).

Exhaustion (of rights)

See first sale principle.

Export edition

Product intended to be sold outside the country of publication.




Page count, the number of pages in a book (or occasionally, the number of words or the running time). See Group P.11 for different methods of measuring the extent.


Technique for searching a collection of information based on applying multiple filters, successively refining the search results until the item is found. Faceted searching is often used after an initial direct text search, to narrow down a large number of search results.
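
As a rough illustration of the idea, the sketch below runs an initial text search and then narrows the results one facet at a time. The catalog records, field names and function names are all hypothetical and exist only for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    subject: str
    product_form: str
    language: str


# Hypothetical catalog records, invented for this example.
CATALOG = [
    Book("Wellington: A Life", "Biography", "Hardback", "eng"),
    Book("Wellington Walks", "Travel", "Paperback", "eng"),
    Book("Wellington at Waterloo", "History", "Paperback", "eng"),
    Book("Wellington \u00e0 Waterloo", "History", "Paperback", "fre"),
    Book("Gardening Basics", "Gardening", "Paperback", "eng"),
]


def text_search(books, query):
    """Initial direct text search on the title field only."""
    needle = query.lower()
    return [b for b in books if needle in b.title.lower()]


def facet_counts(books, field):
    """Count the values of one facet, to offer as filter choices."""
    return Counter(getattr(b, field) for b in books)


def apply_facet(books, field, value):
    """Keep only the books whose facet field matches the chosen value."""
    return [b for b in books if getattr(b, field) == value]


results = text_search(CATALOG, "wellington")
subject_choices = facet_counts(results, "subject")
results = apply_facet(results, "subject", "History")
results = apply_facet(results, "language", "eng")
print([b.title for b in results])
```

Each filter works on the output of the previous step, so the result set shrinks until the wanted item is easy to pick out.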

Fair use, Fair dealing

Limited exceptions to copyright, for example allowing usage of short excerpts or quotations from a copyrighted work for particular purposes such as review, criticism or private study without formal permission or payment. In some jurisdictions, the limits on fair use are clearly and legally defined. In others, fair dealing is defined in a more flexible and pragmatic way.


See data element. May also refer to a column in a database table, or a pre-defined data entry area in a form.


Procedure to capture the underlying characteristics of, for example, an image or a chunk of audio or video, irrespective of how it is encoded as data. Fingerprint algorithms are designed to be resistant to change – for example, a cropped or lightly retouched image should have a fingerprint very similar to the uncropped original. In contrast to a mathematical hash function, a song would have the same fingerprint when encoded as MP3 or AAC. Digital fingerprints are used to recognize similarity, where hashes are used to detect differences. See also identifier – though note that fingerprints can only be used to identify a resource by comparing the fingerprint with a large database of fingerprints of known resources.

Firm sale

See sale or return.

First sale principle

Under US copyright law, retail book purchasers are allowed to re-sell, trade, rent, lend, or dispose of the previously purchased item without the prior consent of the copyright holder. In other jurisdictions, some of the copyright holder’s rights are ‘exhausted’ at the time of first (retail) sale, to similar effect. The first sale doctrine and exhaustion of IP rights is often now interpreted internationally – that is, the retail sale may happen anywhere, and the retail purchaser may re-sell, trade, rent etc anywhere, including from another country back into the country of origin. However, first sale and exhaustion do not apply to e‑books because they are licensed rather than sold, so libraries and consumers do not usually have the automatic right to lend, rent or re-sell e‑books without the publisher’s permission.

Fixed retail price

Often just FRP, occasionally also Fixed book price or FBP. In certain countries, for example France or Germany, there are legal restrictions that prevent retailers selling books significantly below the list price set by the publisher (or occasionally, the importer). The IPA publishes a report on the use of fixed book pricing across many countries. See also recommended retail price.

Flash card

Small card bearing a letter, word, phrase, number or symbol, displayed quickly to learners to improve recognition skills and recall. Often used in teaching very basic literacy and numeracy.

Flick book

Book containing a sequence of illustrations on the pages, designed to give an animated effect when the pages are flicked through. Not to be confused with ‘flip book’.

Flip book

A two-in-one volume, two books bound together, see back-to-back.


Historical term for a page number, or for a single sheet (leaf) of paper, or occasionally, for a single sheet folded once to form four pages of a book.


Cartographer’s folly, a fictitious feature, ‘trap street’ or ‘paper town’ inserted into a published map to help detect plagiarism and copyright violation. Similarly fictitious entries in dictionaries, encyclopedias etc are occasionally called ‘mountweazels’. See also watermark.


Set of characters of the same typeface and size, for example 18pt Helvetica or 10pt Garamond.


Additional information, explanation or cross-reference printed at the bottom of a page, and referenced in the main text by a symbol such as an asterisk or dagger, or by a superscripted footnote number. cf endnote, which is similar but placed at the end of a chapter, section or the end of the body of the book.


Outer edge of a book page, opposite the bound edge or spine. Occasionally ‘fore-edge’.

Forward compatibility

See compatibility. 


See CMYK process color.


IFLA’s Functional Requirements for Bibliographic Records is a conceptual model, the approximate library-centric equivalent of the <indecs> framework. The most significant contrast in terminology for ONIX implementers is that, while <indecs> and FRBR describe effectively identical manifestation and item or instance concepts, a work in <indecs>, ONIX and ISTC terms is roughly equivalent to an expression within the FRBR model. See also RDA, BIBFRAME.


See backlist.

Front matter

Pages preceding the main content of a book, including title and imprint pages, tables of contents and of illustrations, and any foreword, preface and introduction text. Also termed Prelims (preliminary pages), and usually numbered with Roman numerals. cf back matter.


See fixed retail price.


Forest Stewardship Council, international organization promoting responsible management of the world’s forests. It provides accreditation and certification standards for forest management and forest products such as pulp and paper. See also PEFC, the somewhat similar Programme for the Endorsement of Forest Certification.


File Transfer Protocol, a standard method of transferring files across the internet. cf HTTP.


Binding in which the spine and covers are fully bound in cloth, leather or other material. cf half-bound, quarter-bound.

Full update

See organization of data delivery.


Galley, galley proof

Unpaginated proof copy for checking and correction of the text. The text is in a single long column, without specific page breaks. cf page proof.


Books of a particular style, form or subject matter – for example romance, crime, science fiction.


See GLN.


Globally Harmonized System of Classification and Labelling of Chemicals, standardized system of classification for hazards and hazardous materials, including hazard statements, warning pictograms, safety data sheets etc.


Low-quality raster image file, suitable only for use on the web. See TIFF and JPEG for higher quality image file formats.


Global Location Number, an international standard identifier for a physical trading location or (loosely) for an organization at that location. Well established within e-commerce and physical logistics, and not in any way specific to the publishing industry. cf SAN. Although the GLN system is administered by GS1, there is only limited central coordination of GLN assignment, and a single location may have more than one GLN. Note that GLNs are 13-digit numbers, and must be distinguished from GTIN-13s by context. Details about a particular GLN can be looked up in the Global Electronic Party Information Registry (GEPIR) – though the results are imperfect.


Short explanation, interpretation or annotation of an unfamiliar word or phrase, added between the lines of text (an ‘interlinear’ gloss), in the margin or in a separate glossary. See also ruby.


Alphabetically arranged list of terms and their definitions or explanations, essentially a topic-specific dictionary. See also controlled vocabulary, taxonomy. cf index.

‘Gold’ OA

See open access.


Policy, decision-making and oversight arrangements for a standard such as ONIX, or for an identifier registry like that operated by national ISBN agencies. Well-founded, inclusive and consistent governance engenders trust and builds credibility in the standard, allowing confident adoption and use of the standard by stakeholders.


The extent to which metadata is divided into appropriate data fields – for example the way that a contributor name can be divided into <NamesBeforeKey>, <PrefixToKey>, <KeyNames> etc. Highly-granular metadata makes it easier to re-use the same metadata in different and possibly even unforeseen ways.
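
The benefit of granularity can be sketched in a few lines of Python. The class below is a hypothetical illustration whose fields mirror the three ONIX elements named above; it is not an ONIX parser, and the method names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ContributorName:
    # Fields mirror <NamesBeforeKey>, <PrefixToKey> and <KeyNames>.
    names_before_key: str = ""
    prefix_to_key: str = ""
    key_names: str = ""

    def display(self) -> str:
        """Name in natural order, for example on a product page."""
        parts = [self.names_before_key, self.prefix_to_key, self.key_names]
        return " ".join(p for p in parts if p)

    def inverted(self) -> str:
        """Name in inverted order, for example in an alphabetical list."""
        tail = " ".join(
            p for p in [self.names_before_key, self.prefix_to_key] if p
        )
        return f"{self.key_names}, {tail}" if tail else self.key_names

    def sort_key(self) -> str:
        """Case-insensitive key for sorting by the key name."""
        return self.key_names.casefold()


name = ContributorName("Ludwig", "van", "Beethoven")
print(name.display())
print(name.inverted())
```

Had the name been supplied only as a single undivided string, the inverted and sortable forms could not be derived reliably.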

‘Green’ OA

See open access.


Goods Received Notification, message sent by wholesaler or retailer to the supplier (eg the distributor), usually via EDI, to confirm receipt of books. cf ASN.


International standards organization responsible for a wide range of supply-chain standards, including the SSCC, GLN and GTIN identifier schemes.

GSM, grammage

Grams per square metre, see paper weight.


See VAT.


Global Trade Item Number, a numbering scheme for tradeable items and consumer products of all types in the supply chain. The GTIN identifier scheme is administered by GS1. Common GTINs have 12, 13 and 14 digits. GTIN-12s were formerly known as UPCs (Universal Product Codes), and are used almost exclusively in North America. Their use on books has been deprecated since 2005. GTIN-13s were formerly known as EANs (European Article Numbers), and they are used globally to identify a wide range of retail items. The barcode symbology used to represent GTIN-13s is still referred to as ‘EAN-13’. Thirteen digit ISBNs are a small subset of the GTIN-13 number scheme (see Bookland). GTIN-14s are used to identify trade packs of items (from cartons to pallets).






Globally Unique Identifier. See UUID.


Arrow-shaped quotation marks ( « and » ) used in French and other languages.


Gap between columns of type on a page.

The blank portion of a bound page closest to the spine, the ‘back’ or ‘inside’ margin (cf the ‘outside’ margin nearest the foredge).


See zip.



See AVC.




High-quality binding in which the spine and corners of the covers (only) are bound in leather (or other fine and durable material). cf quarter-bound, full-bound.


See title page.


Printing technique or the resulting printed image in which shades of grey (or for color images, shades or tints of the CMYK process colors) are simulated by small and regularly spaced ink dots of varying sizes – this is termed AM or ‘amplitude modulation’ screening. The regular spacing of the halftone dots is measured in lines per inch (LPI), or per centimetre. 150 LPI (60 lines per centimetre) is common in books and magazines, and high quality color printing may be halftoned at up to 200 LPI (80 lines per centimetre). Occasionally, halftoning uses very small dots but varies their spacing rather than their size – so-called ‘stochastic’ or FM (‘frequency modulation’) screening. cf line art, which uses only ‘solid’ areas and lines of ink and does not use halftoning to simulate shading.

Handle system

The name resolution system that underlies the DOI. The Handle system turns an identifier or ‘handle’ for a resource – something like 10.4400/zuim – either directly into an IP address, or into a DNS name which can itself be resolved to an IP address. The resulting IP address can then be used to locate the identified resource, or its metadata, on the Internet.

Hart’s Rules

Reference book listing rules of (British) English spelling, punctuation, grammar, usage and typographic style. Derived originally from guidelines for Oxford University Press, but is now the basis of ‘house style’ at many UK publishers. cf Chicago Manual of Style.


Typographic symbol ‘ # ’ indicating ‘number’ or (in North America) ‘pound avoirdupois’. Not to be confused with the Pound Sterling sign (‘ £ ’) or the musical sharp sign (‘ ♯ ’).

Short, unique numerical pattern based on the digital content of a large block of data. If the underlying data changes in any way, the hash (sometimes loosely but wrongly termed a fingerprint) inevitably changes, so comparison of the expected and actual hash values is a way of detecting changes to the data. There are many different procedures (‘functions’ or ‘algorithms’) for generating hashes, for example MD5 or SHA-256. Hashes are used to detect differences, where digital fingerprints are used to recognize similarity. See also identifier, the key difference is that the hashes and fingerprints are generated from the data via a particular hash or fingerprint function, whereas identifiers are assigned to the data.
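
A minimal sketch of change detection with a hash, using the SHA-256 and MD5 functions from Python's standard hashlib module. The sample byte strings are arbitrary and chosen only for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as 64 hexadecimal digits."""
    return hashlib.sha256(data).hexdigest()


def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of the data as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


original = b"On the origin of species"
altered = b"On the origin of species."   # one extra character

expected = sha256_hex(original)

# Unchanged data gives the same hash; any change gives a different one.
unchanged = sha256_hex(original) == expected
changed = sha256_hex(altered) != expected
print(unchanged, changed, len(expected), len(md5_hex(original)))
```

In practice the expected hash is published or sent alongside a file, and the recipient recomputes it after transfer to confirm that nothing was altered or corrupted.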


High Efficiency Image Coding, a subset of HEVC intended for compression and storage of still images. Not yet in widespread use.


High Efficiency Video Coding, a format for compressed video data, also termed H.265 or MPEG‑H Part 2. Improved compression enables higher-quality video to be stored in smaller files. Not yet in widespread use, but is likely to supersede existing formats such as AVC.


Numbers in base 16. Hexadecimal notation is convenient for numbers that are actually stored by a computer as binary (base 2) numbers. 49 in ‘hex’ is 01001001 in binary (and 73 in ordinary base 10). Hexadecimal uses the digits 0–9 and a–f, so after 49, you get 4a (01001010), 4b–4f (01001011–01001111), then 50 (01010000).
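
The conversions in this entry can be checked with Python's built-in int and format functions. This is only a small sketch, and the variable names are illustrative.

```python
# Read '49' as a base 16 number, then show it in base 10 and base 2.
value = int("49", 16)
as_decimal = str(value)
as_binary = format(value, "08b")

# Count upwards from hex 49 to hex 50, showing each value in hex and binary.
rows = [(format(n, "x"), format(n, "08b")) for n in range(0x49, 0x51)]
for hex_digits, binary_digits in rows:
    print(hex_digits, binary_digits)
```

Each hexadecimal digit corresponds to exactly four binary digits, which is why hex is a compact way of writing out binary values.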


See hyphenation and justification.


Hypertext Markup Language, the markup system used for simple web pages. Sometimes refers specifically to HTML version 4, standardized by the W3C in 1997, but in other contexts encompasses HTML5 too. It uses simple XML-like tags to add structure to plain text, for example by surrounding third-level headings with <h3> tags and by marking paragraphs with <p> tags. But although XML-like, it does not fully conform to XML syntax, as certain HTML end tags are optional (eg </p>, </li> or the / in <br />) and tags may be upper or lower case. See also XHTML. Along with CSS and JavaScript, it forms the core of the ‘open web platform’, the suite of royalty-free standards and technologies that underlie the World Wide Web.
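
The difference between HTML's tolerance and XML's strictness can be seen with two parsers from the Python standard library. This is a sketch only: the TagCollector class and the sample snippet are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the start and end tags seen while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


def is_well_formed_xml(text: str) -> bool:
    """Report whether the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Mixed-case tags and omitted </li> end tags: acceptable HTML, invalid XML.
snippet = "<UL><LI>first item<li>second item</UL>"

collector = TagCollector()
collector.feed(snippet)
collector.close()

print(collector.start_tags, collector.end_tags)
print(is_well_formed_xml(snippet))
print(is_well_formed_xml("<ul><li>first item</li><li>second item</li></ul>"))
```

The HTML parser reports the tags in lower case and does not object to the missing end tags, whereas the XML parser rejects the same text outright.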


The latest version of HTML. Some old HTML tags have been removed or redefined, some new tags have been added, the specification itself (technically a W3C ‘Recommendation’) is more rigorous, but it retains much of the familiarity of HTML version 4. Less formally, ‘HTML5’ may also encompass related standards used in modern web browsers. See also CSS.


HyperText Transfer Protocol, a standard method used for transferring files or data across the internet. HTTP is used to transfer normal web pages (at their most basic, these are just files that use HTML markup) from web server to browser – but HTTP can be used to transfer other types of information too. cf FTP, HTTPS.


Version of HTTP in which the data transferred is securely encrypted while in transit across the network. It uses a digital certificate for network authentication and for establishing the encrypted link to maintain the privacy and integrity of data. The EDItEUR website uses HTTPS.

Hyphenation and justification

Usually just H&J, typesetting procedures to align line endings so that both left and right margins are straight rather than ragged. Both the spacing of letters and spacing between words can be varied so that each line of justified text is the same length. In some scripts (eg Arabic), joining strokes between individual letters can also be elongated. Words can be hyphenated to reduce the need for excessive variations in spacing, and to improve the look and readability of the text. Justified text may still need further adjustment to eliminate widows and orphans.



Internet Assigned Numbers Authority, group that coordinates a central registry for DNS, protocols and MIME types for use on the internet.


Effectively, a persistent ‘name’ or ‘label’ for some entity like a product, a work, a location – or indeed a name – where the label is unique within a given context. Identifiers are often (but not always) in a tightly-controlled alphanumeric format, and sometimes contain a check digit to help error detection. Standardized identifiers (for example the ISBN or ISNI) generally provide global uniqueness, there is often a minimum set of metadata associated with the identifier, and the identifier and metadata are sometimes managed in a centralized registry. Other identifiers use decentralized registries. A well-managed registry engenders trust in the identifier and its likely future persistence. Although many identifiers are constructed using a ‘recipe’ (something like ‘four digits for the year, three for the publisher, then five more digits and a check digit’), it is best to treat them as dumb labels without internal meaning, intelligence or affordance. For standardized identifiers, the nature or scope of the identified entity is well defined and understood, so they enable unambiguous communication between organizations within the supply chain. Proprietary identifiers such as the ASIN may only be unique within a particular organization, and the exact nature of the entity identified is often not understood beyond the organization.


International DOI Foundation, see DOI.


International Digital Publishing Forum, the standards body that devised and maintained the EPUB file format for e‑books, now absorbed into the W3C.


Internet Engineering Task Force, group providing technical standards (documented in ‘RFCs’ – Requests for Comment) and guidance (documented in ‘BCPs’ – Best Current Practices) for many aspects of the internet. cf the W3C, which provides standards and best practices specifically for the World Wide Web.


Inside front cover (cover 2), Inside back cover (cover 3).


See LMS.


The arrangement of a set of pages printed on a single sheet, so that the pages appear in the correct order when the sheet is folded and trimmed to form a signature. Page 2 will be printed on the reverse of the sheet where page 1 is printed. Page 3 will not be imposed side-by-side with page 2 (except in a 4-page imposition with a single fold), but will appear adjacent after folding. Pages 2 and 3 are a ‘reader’s pair’ or spread in the finished book, but pages 2 and 15 could be a ‘printer’s pair’ (they will be adjacent in a 16-page, three fold imposition).
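
The pairing of pages can be sketched in code. This assumes a single folded signature in which the two pages printed side by side across the spine fold always have page numbers that add up to one more than the number of pages in the signature; real imposition schemes vary with the press and folding method, and the function names here are illustrative.

```python
def printers_pairs(pages_in_signature: int):
    """Pairs of pages that sit side by side across the spine fold."""
    n = pages_in_signature
    if n <= 0 or n % 4 != 0:
        raise ValueError("a folded signature has a multiple of 4 pages")
    # Assumption: the page numbers in each pair sum to n + 1.
    return [(page, n + 1 - page) for page in range(1, n // 2 + 1)]


def readers_pairs(pages_in_signature: int):
    """Facing pages (spreads) as seen in the finished book."""
    n = pages_in_signature
    return [(page, page + 1) for page in range(2, n, 2)]


print(printers_pairs(16))
print(readers_pairs(16))
print(printers_pairs(4))
```

For a 16-page signature the printer's pairs include pages 2 and 15, while in the 4-page case pages 2 and 3 are both a printer's pair and a reader's pair, matching the exception noted above.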


A single print run or batch of copies of a book. All copies in an impression are manufactured at the same time and are functionally identical. Subsequent impressions of the same product (reprints) may incorporate minor corrections to the content but may not include significant changes (significant changes to the content would imply it is a new edition, which is considered a new product, albeit a product that is closely linked to the earlier one). The impression number is usually noted on the copyright page.


A brand name or marque used on the product by the publisher. On a book, the imprint is usually named on the title page with further details on the title verso. The imprint name is often (but not always) different from the name of the publisher, and larger publishers often make use of multiple imprints across a variety of products. cf list.

Imprint page

See copyright page.

Inalienable rights

See moral rights.


The <indecs> (Interoperability of Data in E-commerce Systems) project and resulting <indecs> metadata framework provide the conceptual model and many of the principles that underpin the ONIX metadata framework. For example, the concepts of work and manifestation used in the ONIX framework (and the item concept which is largely unused in ONIX) are drawn directly from <indecs>. cf FRBR. <indecs> also underlies DOI, DDEX and EIDR for the recorded music and filmed entertainment sectors.


Storing metadata in a database, in a way that makes it quick to search. Online bookstores index some but not all data fields, so a search for ‘Wellington’ may find appropriate books where the word occurs in the title or name as subject fields, but not those books where the word occurs in an unindexed field like review copy. Different stores index a different selection of data fields.

Alphabetic list of keywords from the content of the book, with references to where they occur, often included at the back of a book. cf glossary.

In print

Active, available, commonly abbreviated to IP (though see also IP). Status implying the product has been published and is orderable from the publisher or publisher’s primary distributor. Note that this does not imply the product is immediately available – it may be temporarily unavailable (cf in stock/out of stock). Conversely, out of print does not mean the product is necessarily unavailable – there may still be stock available within the supply chain. Out of print means only that the publisher or the publisher’s distributor will no longer accept further orders for the product. Out of commerce is sometimes used to indicate ‘out of print and unavailable’.


See plate section. See also tip-in for a single-leaf insert.

Inspection copy

See approval copy.


A single copy of a book, usually synonymous with item – a single retail product. In both the FRBR and <indecs> frameworks, a single ‘instantiation’ of a particular manifestation, more or less interchangeable with all other instances of the same manifestation.


Whole number, a number without any fractional component, −1, 0, 1, 2 etc. cf real number.

Integrated book

Book with text and illustrations combined on the same pages, rather than with any illustrations isolated in a plate section.


‘The much-feared best advertisement for books, and their perfect complement: one being a high-tech place where you can go to be connected and somehow feel alone, the other a low-tech thing where you can go to be alone and somehow feel connected.’ Steve Macone, New York Times blog, 25 Nov 2013. See also the world wide web, one of the systems built on top of the internet. E‑mail, FTP and numerous other services also use the internet.


The ability of multiple systems, using different hardware or software platforms and data structures, to exchange or share data using a common interface.


Intellectual Property Rights are legally-recognized property rights over ‘intellectual property’ – tangible expressions of creations of the mind such as literary, musical and other artistic works, designs, inventions and discoveries. Intellectual property is (or can be) protected by copyright, patent, trademark and design registration. See rights. Only copyright is automatic – patents, trademarks and designs must be registered with national Intellectual Property Offices in order to acquire legal protection.

Internet Protocol, the mechanism for data exchange across the Internet. Each computer attached to the internet has a unique IP address.

For IP, see also In print.


International Publishers Association, the federation of regional, national and specialist publishers' associations which represents publishers worldwide.

IP address

Internet Protocol address, unique numerical address of a computer attached to the Internet. Some IP addresses are also associated with DNS (Domain Name System) names, which are easier to remember – is (currently) linked to Computers use the IP address to communicate with other computers, but whenever necessary, they use a DNS service to convert (or resolve) any names entered by a human into the matching IP address.


Industry returns initiative – in the UK, an agreed set of industry-wide norms and processes for handling returns that enables much of the returns process to be automated. Many (but not all) UK distributors and wholesalers have adopted the IRI methodology for returns.


International Standard Audiovisual Number, a unique identifier for audiovisual material in the film, TV and video sectors. ISAN is an ISO standard, administered by the ISAN International agency (ISAN‑IA). The identifier itself consists of 16 hexadecimal digits, a check character, plus eight further hex digits and another check character. See also EIDR.


International Standard Book Number, unique and internationally-recognized identifier for a ‘monographic publication’ – a book, e‑book, audiobook or book-like product – which according to ISBN rules must be available to the public. See also barcode. ISBNs are a subset of GTIN-13s beginning with 978 or (more rarely) with 9791–9799. The 9790 numbers are ISMNs, and 9791–9799 ISBNs are being introduced as needed, and extend to France, Italy, Korea and USA so far. All ISBNs consist of 13 digits (eg 9780001234567). They are often displayed with extra spaces or hyphens (eg 978‑0‑00‑123456‑7 or 978 0 00 123456 7), but the hyphens or spaces are really just a convenience – they are not a significant part of the identifier. [The positions of the hyphens or spaces within the ISBN highlight the different ‘parts’ of the ISBN – for example separating the ‘registrant element’, sometimes called the ‘publisher prefix’ (00 in the example), from the ‘publication element’ (123456) and the final check digit (7). Note that the middle two of the four spaces or hyphens do not always appear in the same positions – their correct positioning depends on the length of the ‘registration group element’ (0 in the example, but can be up to five digits) and the registrant element (00, but can be up to seven digits). A longer registration group or registrant element implies a shorter publication element. Note also that – even though it is often termed the ‘publisher prefix’ – the registrant element is not a reliable guide to the identity of the publisher.] Also termed ISBN-13 when it is important to differentiate from legacy ISBN-10s. ISBN is an ISO standard. The ISBN system as a whole is administered on behalf of the ISO by the International ISBN Agency – which acts as registration authority – and around 150 mostly national ISBN registration agencies.
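
As an illustration of the points above about hyphens and the check digit, here is a minimal Python sketch that strips the separators and verifies a 13-digit ISBN. It assumes the standard GTIN-13 check digit calculation (alternating weights of 1 and 3); the function names are hypothetical, and the sketch does not check whether a number has actually been assigned.

```python
import re


def isbn13_check_digit(first12: str) -> int:
    """Return the GTIN-13 check digit for the first 12 digits."""
    # Weights alternate 1, 3, 1, 3 ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check the form and check digit of an ISBN, ignoring spaces and hyphens."""
    compact = re.sub(r"[\s-]", "", isbn)  # separators are not significant
    if not re.fullmatch(r"97[89][0-9]{10}", compact):
        return False
    if compact.startswith("9790"):
        return False  # the 9790 range is used for ISMNs, not ISBNs
    return isbn13_check_digit(compact[:12]) == int(compact[12])
```

With this sketch, 9780001234567, 978-0-00-123456-7 and 978 0 00 123456 7 are all treated as the same valid identifier.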


Prior to 2007, ISBNs contained only 10 digits (occasionally including the digit X, used to represent 10 in the last digit, the check digit, which was calculated using base 11 arithmetic). Any former 10-digit ISBN can be converted into the equivalent 13-digit ISBN by prefixing it with ‘978’ and recalculating the check digit. (There are several online converters which will do this. The algorithm for 13-digit ISBNs is different from that used for ISBN-10.) Any 978… ISBN-13 can be converted back into an ISBN-10, but 979… ISBN-13s do not have an ISBN-10 equivalent. Continued use of ISBN-10s is strongly deprecated.
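
The conversion described above can be sketched in Python as follows. This is an illustrative sketch rather than an official implementation: the function names are hypothetical, the ISBN-10 check uses the usual base 11 weighting (10 down to 1), and the ISBN-13 check uses the GTIN-13 weighting (1 and 3 alternately).

```python
import re


def _compact(identifier: str) -> str:
    """Remove spaces and hyphens and upper-case any trailing x."""
    return re.sub(r"[\s-]", "", identifier).upper()


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix a valid ISBN-10 with 978 and recalculate the check digit."""
    compact = _compact(isbn10)
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        raise ValueError("not a 10-character ISBN")
    # Base 11 check: weights 10 down to 1, with X standing for 10.
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(compact))
    if total % 11 != 0:
        raise ValueError("ISBN-10 check digit is wrong")
    first12 = "978" + compact[:9]
    weighted = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return first12 + str((10 - weighted % 10) % 10)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a 978 ISBN-13 back to ISBN-10 (979 has no equivalent)."""
    compact = _compact(isbn13)
    if not re.fullmatch(r"978[0-9]{10}", compact):
        raise ValueError("only 978 ISBN-13s have an ISBN-10 equivalent")
    core = compact[3:12]
    total = sum((10 - i) * int(d) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))
```

For example, the sketch converts 0-00-123456-0 to 9780001234567 and back again. It is shown only to illustrate the arithmetic, since continued use of ISBN-10s is deprecated.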




Actionable ISBN, a special type of DOI that incorporates an ISBN as part of its syntax. The equivalent ISBN-A for the ISBN 978-0-00-123456-7 would be 10.978.000/1234567. Note that registration of an ISBN-A is separate from the registration of the ISBN – it is not an automatic part of the ISBN registration process.
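
A minimal Python sketch of the syntax shown in the example above. The layout is inferred from that single example (prefix, then registration group plus registrant element, then publication element plus check digit); the function name is hypothetical, and the input must already be correctly hyphenated because the element lengths cannot be derived from the digits alone.

```python
def isbn_to_isbn_a(hyphenated_isbn: str) -> str:
    """Build ISBN-A syntax from a correctly hyphenated 13-digit ISBN."""
    parts = hyphenated_isbn.strip().replace(" ", "-").split("-")
    if (len(parts) != 5
            or not all(p.isascii() and p.isdigit() for p in parts)
            or sum(len(p) for p in parts) != 13):
        raise ValueError("expected a 13-digit ISBN split into five elements")
    prefix, group, registrant, publication, check = parts
    # DOI prefix 10.<GS1 prefix>.<group + registrant>, then the remainder.
    return f"10.{prefix}.{group}{registrant}/{publication}{check}"
```

This only builds the string. As the entry notes, the ISBN-A still has to be registered separately before it becomes actionable.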


The ISCC is a universal identifier for multiple generic media-types (text, image, audio, video), a lightweight and similarity-preserving fingerprint designed for digital content, and is free, open-source and transparent. It is designed for cross-sector applicability (journalism, books, music, film, etc.) and to identify content in decentralized and networked environments.


International Standard Link Identifier. An embryonic international standard identifier that can be attached to a relationship between two entities, when it is important to identify and manage the link itself, not just assert the fact that it exists. The ISLI could find potential application in rights management, where in a relationship such as ‘A licenses B’, it would be important to attach an identifier and other metadata to the ‘licenses’ relationship.


International Standard Music Number, an identifier similar to the ISBN, but used for manuscript (notated) music and digital equivalents. Prior to 2008, ISMNs were ten characters (the letter M plus nine digits), but are now 13 digits, with 9790 replacing the ‘M’. A mathematical quirk – or admirable foresight – means the check digit does not need to be recalculated.
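
The check digit 'quirk' can be demonstrated with a short Python sketch. It assumes the GTIN-13 check digit calculation for the 13-digit form; the function names are hypothetical, and the sample number M-2306-7118-7 is used purely as an illustration.

```python
def gtin13_check_digit(first12: str) -> int:
    """Return the GTIN-13 check digit for the first 12 digits."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def ismn10_to_ismn13(ismn10: str) -> str:
    """Replace the leading M of a ten-character ISMN with 9790."""
    compact = ismn10.replace("-", "").replace(" ", "").upper()
    body = compact[1:]
    if (len(compact) != 10 or compact[0] != "M"
            or not (body.isascii() and body.isdigit())):
        raise ValueError("expected M followed by nine digits")
    # The existing check digit is carried over unchanged.
    return "9790" + body


ismn13 = ismn10_to_ismn13("M-2306-7118-7")
# Recalculating the check digit gives the digit that was already there.
check_is_unchanged = gtin13_check_digit(ismn13[:12]) == int(ismn13[12])
```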


International Standard Name Identifier, a standard identifier for public identities or personas of parties (people, organizations) involved in creative activities. It can also be applied to brand names (imprints) and to fictional characters. An ISNI can provide an unambiguous way of identifying contributors, imprints and publishing companies. An ISNI consists of 15 decimal digits plus a final check digit which may be 0–9 or X. ISNI is an ISO standard, and the ISNI system is managed on behalf of the ISO by the ISNI International Agency (ISNI‑IA) registration authority and a range of ISNI registration agencies. See also ORCID, VIAF.
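
A minimal Python sketch of checking the final character of an ISNI. The entry above does not name the algorithm; the sketch assumes the ISO 7064 MOD 11-2 calculation that ISNI is understood to use, and the function names are hypothetical.

```python
import re


def isni_check_character(first15: str) -> str:
    """Return the MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in first15:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """Check the form and check character, ignoring spaces and hyphens."""
    compact = isni.replace(" ", "").replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return isni_check_character(compact[:15]) == compact[15]
```

The ISNI 0000000121479135, which appears in the Linked data entry, passes this check, whether written as one block or as 0000 0001 2147 9135.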


International Organization for Standardization, the non-governmental global standards setting body of which many national standards bodies are members. ISO Technical Committee 46 Subcommittee 9 (ISO/TC46/SC9) is ultimately responsible for many standardized publishing identifiers including the ISBN, ISTC and ISNI, but ISO sets standards for many other aspects of the publishing industry too.


International Standard Recording Code, standard twelve character identifier for audio recordings (often music). The twelve characters comprise a two-letter country code, a three-character code identifying the registrant, two digits for the year the code was assigned, and five digits to specify the particular recording.
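
The four parts described above can be separated with a short Python sketch. The pattern follows the description in this entry only; the names `ISRC` and `parse_isrc` are hypothetical, and the sample code AA-6Q7-20-00047 is illustrative rather than a reference to a real recording.

```python
import re
from typing import NamedTuple


class ISRC(NamedTuple):
    country: str      # two-letter country code
    registrant: str   # three-character registrant code
    year: str         # two digits for the year of assignment
    designation: str  # five digits for the particular recording


_ISRC_PATTERN = re.compile(r"([A-Z]{2})([A-Z0-9]{3})([0-9]{2})([0-9]{5})")


def parse_isrc(text: str) -> ISRC:
    """Split a twelve-character ISRC into its parts, ignoring hyphens."""
    compact = text.replace("-", "").replace(" ", "").upper()
    match = _ISRC_PATTERN.fullmatch(compact)
    if match is None:
        raise ValueError("not a twelve-character ISRC")
    return ISRC(*match.groups())
```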


International Standard Serial Number, a standard eight-digit identifier for serial publications such as academic journals, magazines and periodicals, and ongoing series of books. As with the ISBN-10, the last digit may be ‘X’. Note that the ISSN identifies the journal (eg ISSN 0028-0836 is the print version of Nature), magazine or series, not a particular issue of the journal or magazine, nor a particular book within the series. The ISSN-L (Linking ISSN) is an ISSN shared between versions of a journal available in different formats (eg online and printed). ISSNs can be expanded to GTIN-13 form and expressed in a barcode using the prefix 977 and a recalculated check digit.
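
A minimal Python sketch of the ISSN check character and the expansion to GTIN-13 form. It assumes the usual ISSN weighting (8 down to 2, modulo 11). It also assumes that the GTIN-13 form inserts a two-digit variant field, here defaulting to 00, between the first seven ISSN digits and the recalculated check digit; that detail is not stated in the entry above. The function names are hypothetical.

```python
import re


def issn_check_character(first7: str) -> str:
    """Return the check character for the first seven ISSN digits."""
    total = sum(int(d) * w for d, w in zip(first7, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(issn: str) -> bool:
    """Check the form and check character, ignoring the hyphen."""
    compact = issn.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{7}[0-9X]", compact):
        return False
    return issn_check_character(compact[:7]) == compact[7]


def issn_to_gtin13(issn: str, variant: str = "00") -> str:
    """Expand a valid ISSN to GTIN-13 form with the 977 prefix."""
    if not is_valid_issn(issn):
        raise ValueError("invalid ISSN")
    if not re.fullmatch(r"[0-9]{2}", variant):
        raise ValueError("variant must be two digits")
    compact = issn.replace("-", "").replace(" ", "")
    # The ISSN's own check character is dropped and a new one calculated.
    first12 = "977" + compact[:7] + variant
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return first12 + str((10 - total % 10) % 10)
```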


Former International Standard Text Code, unique identifier for textual works. An ISTC consists of 16 hexadecimal digits, comprising a three-digit code for the registration agency, a year of registration, eight digits to specify the individual work and a final hexadecimal check digit. A work identified by an ISTC may be exploited in several products, which would be identified by ISBNs. ISTCs are cross-publisher, and may also be related to each other where one work is derived from another (via translation, abridgement, compilation and so on). Like ONIX, ISTC draws upon the theoretical model provided by <indecs>. Note that the standard has been withdrawn, and the registration system is no longer active.


International Standard Musical Work Code, unique identifier for musical works, independent of any particular performance or recording (for which an ISRC would be used). An ISWC consists of 11 characters – a T followed by nine digits and a check digit.


A single (copy of a) product. Note that an item can comprise multiple components, which are sold together, and that a single trade-only product may be composed of multiple items which can be resold at retail as products in their own right. See also instance.


Jacket, dust jacket

Loose paper cover wrapped around the boards of a cased book, also known as a dust cover, dust wrapper etc. The imagery printed on the jacket (the ‘cover image’), and any blurb on the back or folded-in flaps, are key marketing tools.


Journal Article Tag Suite, a set of standardized XML markup tags for tagging journal articles during the preparation, publishing and archiving processes. See also TEI, DocBook.


Programming language, used primarily to add functionality to web pages. JavaScript programs are embedded in web pages along with HTML content and CSS styling, and the programs are executed within a web browser.


See library supplier.


Joint Photographic Experts Group. More commonly, an efficiently compressed image file format standardized by that group. JPEG image files are much smaller and nearly as high in quality as the original image (as little as 10% of the size with little perceptible decrease in quality). Greater compression reduces the image quality (JPEG is ‘lossy’), and highly-compressed JPEG images often show characteristic ‘quilted’ patterns (called ‘macroblock artifacts’). cf TIFF.


JavaScript Object Notation, a basic data syntax derived from JavaScript but now used in many contexts to communicate data consisting of a list of name-value pairs. JSON‑LD is JSON used to carry Linked data.
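
As a small illustration of name-value pairs, the Python sketch below parses a JSON text using the standard `json` module. The field names and values are invented for the example and do not come from any ONIX or other schema.

```python
import json

# A JSON object is a list of name-value pairs; values can be strings,
# numbers, lists or further nested objects.
text = '{"title": "Example Book", "pageCount": 320, "contributors": ["A. Author", "B. Editor"]}'

record = json.loads(text)           # JSON text to Python dictionary
page_count = record["pageCount"]    # look up a value by its name
round_trip = json.loads(json.dumps(record))  # serialize and parse again
```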


See hyphenation and justification.



Adjustment of the spacing between particular pairs of letters to improve their fit and look. For example, in the word ‘WAVE’, the A fits nicely between the W and the V, so the three letters can be closed up a little. cf ‘tracking’ or letterspacing, which adjusts the spacing between all letters equally to adjust the length of a line (see justification).

Key title

See lead title.


Word or phrase chosen to describe or associate with the content or theme of a book. Keywords do not conform to a controlled vocabulary, but may be any natural language word or phrase, such as names of characters in a novel, locations, narrative themes, terms of art etc – any relevant word that is likely to be the target of a search.



Laid paper shows a ribbed texture, imparted on the sheet by parallel wires as the paper dries during the manufacturing process. cf wove paper, which shows no ribbed texture (the wires are a much finer mesh which imparts no pattern). Sometimes, a laid finish is added artificially.


Thin plastic film applied to a printed sheet, for protection or to improve the appearance. The film can be glossy or matte, and is most often used on covers and jackets. cf UV-cured varnish.


A page or image in landscape format is wider than it is tall. cf portrait, where the height is greater than the width. See also aspect ratio.


For ONIX and book metadata in general, the language of the book’s text is typically the same as the language of the intended audience. Put another way, the language of the text, the language of the reader and the language that a retailer will market in and to are all the same. Language instruction books are handled by using the language of the intended audience as the metadata’s language of the text, supported by a Subject providing the language being taught. So long as the language of text is the same as the language the retailer sells to, using the intended audience is recommended. For other exceptions, such as genuinely bilingual or multilingual books designed for multiple language markets, please consult the Language section of EDItEUR’s Best Practices for guidance.

ONIX supports language attributes in many tags to support marketing to readers in different languages, but this has no bearing on the Language of Text entry. While one use case is to support a multilingual text with metadata in each language, a common Canadian need might be to provide an English language description for a book otherwise intended for a French audience. This might be offered to facilitate the book’s use by English speakers or its use in French immersion classes. Use of attributes means the metadata translation is a distinct entry running in parallel with the French entry, allowing retailers to display it appropriately.


Also known as ISO 8859‑1. One of a range of standard 8‑bit encoded character sets intended for use with various European languages – Latin‑1 is designed for a wide range of west European languages, and includes the common characters A–Z, a–z, 0–9 and basic punctuation, plus some fancy symbols (eg §, ¶, ¿, × and ÷ symbols) and extra and ‘accented’ characters like ð, ø, ß, and á, ç, ê, ñ, ü. It omits curly quotes, en- and em-dashes and some other fancy punctuation (see Windows‑1252), and many Latin characters and diacritics used only by central and east European languages. It is a superset of ASCII, and a subset of Unicode. Related character sets Latin‑2 to Latin‑10 are optimized for other Latin-script languages: Latin‑2 contains the necessary characters for most central European languages, Latin‑5 is for Turkish, Latin‑8 is for Celtic languages such as Welsh, Gaelic and Breton, and Latin‑9 (also termed ISO 8859‑15) is an improved version of Latin‑1 that additionally includes the Euro sign (€) and a handful of letters (Œ, Š, Ž plus their lower case equivalents, and Ÿ) while omitting common fractions and the international currency symbol ¤. Further ISO 8859 character sets combine basic Latin characters and symbols with Cyrillic, with monotonic Greek, with Hebrew and with basic Arabic characters.


See Latin‑1.


See sales embargo date. In printing, occasionally used to mean imposition.


US Library of Congress.


Library of Congress Classification, book subject classification used in some libraries. See also DDC, UDC, CLC, LCSH.

Linked Content Coalition, a consortium of standards bodies and identifier registries that aims to – over time – increase interoperability between metadata, rights information and identifier standards across multiple media sectors.


Library of Congress Subject Headings, book subject classification used in some libraries, and distinct from LCC.

Lead time

Expected time taken from order to delivery, for example of a POD product.

Lead title, Key title

Publisher’s frontlist titles for any particular month or season that are expected to sell the most copies or become bestsellers, which are given significant advertising and promotion support (A&P). cf midlist.


Single thickness of paper, forming two pages of a book (one verso, one recto).

Learning object

See LOM.

Legal requirement and administrative process whereby publishers lodge a copy – sometimes multiple copies – of every publication with a national library or with other repositories. Exact requirements vary from country to country, and increasingly apply to digital as well as physical publications.


Measurement or assessment of the complexity of text, or the reading ability required for comprehension.


Publication of a defamatory or untrue statement about a person, organization etc that will harm their reputation, or tend to make them the target of ridicule, scorn, dislike or contempt. cf slander, which is the oral equivalent.

Library supplier

Wholesaler which specializes in supplying library and sometimes school customers. Library suppliers (jobbers, in North America) usually provide selection or bundling of suitable products and cataloging services in addition to normal wholesale fulfillment, and even offer services such as rebinding.


Legal permission to make use of some intellectual property. A rightsholder may license another party (a licensee) to do something that would – without the license – be an infringement of copyright. With the license, the licensee is also a rightsholder (though most likely with narrower rights), and may in some cases be able to sub-license rights to a third party.


In typesetting, special precomposed glyph representing two (occasionally, three) letters. In Latin-based typesetting, the combinations æ, ff, fi, fl, ij and a handful of others are common (depending on the fonts available, the difference between ﬀ and ff may not be apparent, but the former is a single Unicode glyph, not a pair of characters). In Arabic typesetting, there are many common ligatures, as characters change shape according to the surrounding letters and their position in a word.


See case-bound.

Line art

See halftone.

Linked data

Approach that expresses structured data as collections of subject–predicate–object ‘triples’. So for example ISNI:0000000121479135—‘is author of’—ISBN:9780007232833 is a triple. Additionally, linked data identifies the subject entities, predicates and many object entities using URIs – so could be used as the subject of a linked data triple. Linked data triples can be expressed and exchanged in RDF JSON-LD or other formats, or stored in a specialized database called a ‘triple store’. Linked open data (LOD) is linked data published under an ‘open’ license, and the Semantic web is a collection of highly interlinked machine-readable Linked open data resources on the internet that allows data to be shared and widely reused.


Informally, the books a publisher or imprint has available in print, or are soon to be published. Within a large publisher, there are usually separate lists for different types of book. See also backlist and frontlist. cf imprint.

List price

The retail price for the product set by the publisher. See RRP, FRP.


A typographical error or ‘typo’ – an error in typeset text.

Literary agent

Person or organization representing creators such as authors, negotiating and licensing their rights to publishers and others in return for a share of the licensing revenue.

Litho, offset litho

Traditional high-volume lithographic printing technology using oil-based inks and oleophobic printing plates with oleophilic patterns to pick up the ink and transfer it to the paper. cf POD.


Library Management System, integrated software application to support the functioning of a library, including acquisition, cataloging and access. Also ILS, Integrated Library System.

Learning Management System, integrated application to support administration, delivery, tracking and reporting of digital educational resources or of complete courses. See also VLE.


The set of location or culturally-specific patterns and conventions used for display of numbers, prices, time and date formats, etc, sometimes also encompassing language and script preferences, and sort order. ONIX generally aims to ensure data (other than textual descriptions) is communicated in a locale-independent way, leaving the details of locale-specific use or display of the data to the eventual recipient. The Common Locale Data Repository (CLDR) is a useful central source of information about specific locales.


Linked Open Data, see Linked data, Semantic web


Learning Object Metadata, a metadata scheme for describing educational resources, used within most LMSs. Learning objects are self-contained collections of instructional or explanatory content, together with practice activities and assessment. The LOM model describes the subject, educational objective, interaction model, any prerequisites and the technology requirements of a learning object.

Long grain

Pages printed and bound where the spine of the book lies parallel to the ‘grain’, the preferred orientation of fibres, in the paper. This means a stronger binding because folds aligned with the grain break fewer fibres in the paper. cf cross grain, where the spine is aligned across the grain of the paper.

Loose leaf

Publication which is supplied unbound, as individual sheets or leaves. These are usually punched to fit in a binder (often the type with openable metal rings) which makes replacing updated (or ‘canceled’) pages simple.

Lossy, lossless

See compression.

Lower case

See case.


See halftone.



The physical or digital embodiment of a particular work. The related hardback, paperback and e‑book products are different manifestations of the same work. All contain essentially identical content. Note that a manifestation encompasses multiple individual copies (or instances) of a book, which are identical (or very nearly so). Manifestations may be identified with ISBNs. See <indecs> and FRBR.


Machine Readable Cataloging, a family of metadata formats used in library cataloging. MARC21, the latest version in use in North America and the UK, is closely tied to the AACR2 cataloging rules. UNIMARC and many minor national variations are common elsewhere. MARC has been in use for nearly 50 years, and has been updated many times to cope with developments in cataloging practices (see AACR2 and RDA), but it is likely to be replaced by more modern data formats over the next decade.


A geographical area within which commercial arrangements for distributing and selling a product are consistent – usually with a single exclusive distributor and single availability date across the market.


Labels, delimiters or tags within a document that define its structure or meaning. In XML and HTML, markup tags are placed between < and > symbols. Tags are often paired to indicate the beginning and end of a particular data element within the document, so top-level heading text in HTML is contained between <H1> and </H1> tags. Markup can be described as semantic or presentational, but in practice is usually a mixture of both.

Marrakesh Treaty

International agreement signed in 2013 providing for an exception to copyright to facilitate the creation of accessible versions of books and other copyrighted works for print-impaired readers. It also allows for cross-border supply of these accessible versions via trusted intermediaries, independent of the territorial rights arrangements for the works.

Mass market

General non-specialist (adult) consumer market. Also, a paperback product aimed at this broad-based market, often a rack-size or A-format size.


In typesetting, the width of a column of type.

Media type

Formerly MIME type, the descriptor for file formats used in many internet applications. For example, an HTML file may have the media type ‘text/html’ and a zipped ONIX file should be ‘application/xml+zip’.


See ONIX message.


Strictly, data about other data. More usefully in the context of the book and e‑book supply chain, metadata can be thought of as all the data used to describe and trade products through the supply chain. This encompasses both simple, structured and factual information like titles, author names, distribution arrangements and prices, and richer, more complex descriptive data, classifications of various types and even parts of the book itself (a table of contents can be seen also as valuable descriptive metadata). ONIX messages are a method of communicating this highly structured and standardized metadata from one party to another within the supply chain. Some organizations might also consider internal workflow information to be part of the product’s metadata.


Frontlist titles not expected to become bestsellers, and thus not attracting the marketing and promotional effort that publishers afford to lead titles.

MIME type

Multipurpose Internet Mail Extension type. See media type.


Markup language, see HTML, SGML, MathML.

Machine learning, computer algorithms that build and use mathematical or statistical models based on pre-existing ‘training’ data to classify or make decisions about new data, often seen as one facet of AI.

Mono, monochrome

Printed using a single ink (usually black). Note this can include halftone images, not just text and line illustrations.


Publication that’s complete in one part or volume (or occasionally, in a small number of separate volumes). Individual stand-alone books – or occasionally, sets of books – are monographs. Sometimes also implies a detailed scholarly work. cf a serial publication such as an academic journal or magazine.

Moral rights

See copyright.


MPEG Audio Layer-III, standard for compressed audio files. cf AAC. See also codec.


Manuscript, the author’s original version of the text. Occasionally ‘TS’ for Typescript.

Multi-component, multi-item

A single product may contain multiple parts – components – that are intended to be sold together, for example a book bundled with a toy or a slipcase containing three volumes. In contrast, a multi-item product contains several other products and is intended to be split before resale. Multi-item products such as a box containing a dozen copies of an individual book are often available only to the trade.



A naming system, or collection of all the possible unique names or identifiers for a particular type of entity. For example, the ISBN namespace consists of all possible 13-digit numbers between 9780000000000 and 9799999999999 (minus those numbers that begin 9790, and those which the check digit indicates are invalid). The ONIX namespace is the complete list of XML tags that could be used in an ONIX message. In XML, namespaces are themselves often given URI-style names, so is the name of the ONIX 3.0 reference namespace. Note that there is no web page at that address – it is merely an unambiguous way of naming the namespace.


National Bibliography Number, an identifier assigned to a book or other document by a national library. Unlike the ISBN, there is no international standard – every library uses its own proprietary format, and NBNs are rarely used outside the library context.

Neighboring rights

See publishing rights.

Net Book Agreement

Former agreement administered by the UK Publishers Association which prevented sale of books at below the publisher’s list price. The NBA was abandoned in 1995, allowing retailers to compete by selling below the list or recommended retail price. But there are laws and similar agreements in other countries that fix the retail price or limit retailers’ ability to sell at a discount, eg the Lang Law in France. See FRP.

Net price

Generally, a price after tax, trade discounts and other adjustments are made – in effect, a business-to-business or wholesale price. See RRP.


Near-field communication, a set of communication protocols used to communicate over very short distances (a few centimetres) with mobile phones, ‘contactless’ payment cards, biometric passports etc, and closely related to older RFID standards.

In Unicode, see Normalization.


New in Paperback – a paperback version of a book previously published in hardback.

In binding, to catch paper between rollers to sharpen a fold, or to expel air between sheets.

Non-exclusive rights

See Sales rights composite in Group P.21, though non-exclusivity could apply to other types of rights, for example distribution rights.


In a relational database, the strategy of designing data tables and relationships between tables to ensure each chunk of data is stored only once, so that it can be managed efficiently and consistently. However, database designs can sometimes be judiciously denormalized to improve performance, if data management is not such a priority.

In Unicode, the normalization form (NFC, NFD, NFKC or NFKD) concerns whether composite characters like é or the ﬀ ligature are decomposed into two separate characters, the e and ´ (acute accent) or f plus f, or kept as a single precomposed character.
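
The difference between the forms can be seen with Python's standard `unicodedata` module. This is a small illustrative sketch; the variable names are arbitrary. Note that the decomposed accent is the combining acute accent (U+0301), and that the ligature is only split by the 'compatibility' forms NFKC and NFKD.

```python
import unicodedata

precomposed = "\u00e9"   # é as a single precomposed character
ligature = "\ufb00"      # the ff ligature as a single character

# NFD decomposes é into e plus a combining acute accent (U+0301).
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC recombines the pair into the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

# NFC leaves the ligature alone; NFKC replaces it with two plain letters.
ligature_nfc = unicodedata.normalize("NFC", ligature)
ligature_nfkc = unicodedata.normalize("NFKC", ligature)
```

Comparing text from different sources is usually more reliable if both sides are first converted to the same normalization form.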

In XML and HTML, ‘whitespace normalization’ means that combinations of multiple space, tab, return and newline characters are treated as if they were a single space character.
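
A minimal Python sketch of this collapsing behaviour, written as a regular expression over the four characters named above. The function name is hypothetical, and real XML and HTML processors apply further context-dependent rules that the sketch ignores.

```python
import re

# Runs of space, tab, carriage return and newline characters.
_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace characters with a single space."""
    return _WHITESPACE_RUN.sub(" ", text)
```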


Not yet published: publication is forthcoming. cf AB.



See open access.


Online Computer Library Center, US-based global library cooperative offering services such as cooperative cataloging, technology services and research.


Outside front cover (cover 1), Outside back cover (cover 4).

Offset fee

Fee paid by one publisher to another for use of typesetting, film or printing plates and occasionally of data files, in the manufacture of a book, entirely separate from any licensing or rights to the intellectual property content.

ONIX for Books

Online Information eXchange, a standardized framework for communication of rich bibliographic and product metadata between computer systems within the book and e‑book supply chains. Originated under the aegis of the Association of American Publishers in 1999 and first published in January 2000, the standard is now managed, developed and supported by EDItEUR. ONIX comprises a specific set of XML tags that are designed to contain particular types of data about a book (ie book metadata). That set of tags, the meaning of the data they contain, whether each is mandatory or optional, the order they have to be listed in, and so on, are all defined in the ONIX Specification.

e‑mail mailing list and support forum for general questions about ONIX. Subscribe by sending a blank e-mail to

ONIX feed

A pre-arranged sequence of ONIX messages exchanged between sender and recipient, maybe daily, weekly or ‘as needed’. Once a data feed has been established with an initial message containing the full set of Product records, subsequent messages in the feed normally contain only Product records for new products and updates (more accurately, replacement data) for existing products.

ONIX message

A complete ONIX data file, generally one in a series of messages passed between a data provider and a data recipient. A single message may contain one or many Product records.

On-sale date BOOKNET CANADA Amendment

See Sales embargo date.

For Canadian market context, refer to the BookNet Canada documentation on


In information science, a formally-defined set of entities and their properties and relationships needed to model a domain or area of interest. See taxonomy.


See out of print.


Online Public Access Catalog, bibliographic database of a library’s holdings, accessible either online or via terminals within the library.


Open Publication Distribution System, a simple data format for distributing catalogs of electronic publications, and usually containing only basic Dublin Core metadata.

Open access

Licensed on ‘open’ terms, for example under a Creative Commons license, which typically mean that a work or published product can be read, used and re-distributed freely – free of charge, and free of at least some of the usual copyright restrictions. The particular license chosen may still require attribution, prevent the distribution of derivative works, or impose other restrictions. More often associated with serials such as academic journals, but some academic publishers produce open access monographs (ie books). Open access material is usually made available via a digital archive maintained by the publisher (so-called ‘Gold’ OA), or an archive maintained by the author, the research institution or an independent ‘open access repository’ (collectively, ‘Green’ OA). Gold OA material is almost always made available under a highly permissive license, whereas Green OA may be free of charge but remain subject to many or even all the usual copyright restrictions.

Open access publisher

Publisher which specializes in making its products available on open access terms. The publisher charges the author a fee – often termed an APC (article processing charge) or BPC (book processing charge) – to cover the cost of editorial work, peer review, production and distribution, and these fees are usually paid from the author’s research grants. Note that although the product is free of charge to the reader, reputable open access publishers impose the same editorial controls, academic review processes and quality standards as conventional publishers.

Open source (software)

Computer software whose source code is available under an ‘open’ license allowing anyone to use, modify and distribute the code without most of the usual copyright restrictions. See for example Apache v2 licenses.

Open standard

Broadly, standards that are publicly available and often implementable without charge, and which are developed and maintained through a transparent decision-making process open to a broad range of stakeholders.


Out of print, Substitute. An answer code meaning the product is OP, but the publisher suggests another equivalent or updated product (eg a second edition may be marked as OPS when the third edition becomes available).


Open Researcher and Contributor ID, a unique identifier used for academic and scholarly contributors somewhat similar to ISNI, but requiring only self-registration instead of assignment through an ISNI registration agency.


High-level business process, or the IT system (often an ERP) responsible for the processes of accepting orders, dispatching goods, generating invoices, and receiving payments. Sometimes OTC, O2C, Purchase-to-pay, PTP or P2P.

Organization of data delivery BOOKNET CANADA ADDITION

ONIX data files are sent from a ‘sender’ to a ‘recipient’ – for example from a publisher to a retailer. One data file, or ‘message’, may contain information about many products, and may form part of a sequence or ‘feed’ of ONIX messages.

There is a separate – and entirely optional – Acknowledgement message which may be returned from recipient to sender, to confirm receipt and processing of the data. Use of the Acknowledgement message should be agreed between parties involved in a message exchange.

A single ONIX data file or ‘message’ includes a snapshot of the sender’s data about a range of products at the moment the message was created – and that snapshot can be transferred into the recipient’s system. But within the sender’s in-house systems, that data is subject to change. For example, as a forthcoming product approaches publication, more comprehensive bibliographic data becomes available, and previously collected data is corrected or refined. Thus exchanges of ONIX data are better thought of as ‘data feeds’ consisting of an ongoing sequence of messages – either a series of complete snapshots of the entire set of data, or (more likely) a series of changes between one snapshot and the next.

Refer to the ONIX for Books Best Practice Guide for more information.


In typesetting, a single word or excessively short line that appears at the end of a paragraph. Alternatively, a section heading or the first line of a paragraph, or a partial or short line of text (eg the last line of a paragraph), which occurs at the bottom of a column or page. Typesetters and designers try to avoid all three types of orphan. See also widow.

Orphan work

A work protected by copyright or other rights, but where the presumed rightsholders are unknown or cannot be contacted (perhaps because the work is out of commerce). Such orphans are problematic, because uncertainty about the rightsholders and the expiry of their rights means they cannot be exploited.

Out of commerce

Out of print and no new stock is available in the supply chain (though used copies may be available). May apply to a particular product (which implies that other manifestations of the same work may still be in commerce), or may be applied to a work (which implies no manifestations of the work are in commerce).

Work on which copyright (and other IP rights) have expired. After expiry, the work passes into the public domain.

Out of print

A publisher may declare a product out of print (or OP) to indicate it (or its primary distributor) will no longer accept orders. This usually also means copies sold to retailers on a sale or return basis are no longer returnable (or may be returnable only for a short period after the product is declared OP). However, out of print does not mean ‘unavailable’, as there may still be many copies in the supply chain. See also in print, out of commerce. Out of print can also be applied to a work rather than an individual product, which implies all manifestations of a work are out of print.


Open Web Platform, term used to describe the combination of HTML, CSS and JavaScript used to create web pages and web-based applications, and EPUB e‑books.



Agency contracted by the publisher to produce a book, usually including text creation, editing, design and illustration but not manufacturing of the final product.


Price and availability. Not to be confused with A&P, advertising and promotion.

Page count

See extent.

Page proof

Paginated but unbound proof copy for final checking, indexing etc prior to printing and publication. cf galley proof, bound proof.


The process or result of dividing the text into individual pages. See also extent.


Wooden packing base on which books (or more typically, cartons of books) or other goods are stacked, stored and transported in bulk. A pallet or skid is typically around 1 × 1.2m (the most common size in the UK), is designed to be handled using a pallet-jack or forklift truck, and can carry up to about 1000 typical hardback novels or 2500 paperbacks (depending on their extent), or a tonne or more of paper sheets. Common Euro and US ‘standard’ pallets differ in size – 1 × 0.8m and 48 × 40in respectively. In many Asian countries, 1.1m square is the most common size. Euro pallets are convenient because they fit through doors even when laden, but they waste more space than US and UK sizes when packed in a standard shipping container.


Small booklet with few pages, often self-covered (no separate cover), and usually bound with a simple wire stitch.


The Pantone Matching System (PMS) is a proprietary colorspace used to specify colors of print and ink, paint, plastic, dye and fabric. Only a subset of named Pantone colors can be printed accurately using CMYK four-color printing. Other Pantone colors are printed using special inks.

Paper weight

The weight (or more strictly, the mass) of a particular grade of paper, usually measured in gsm (grams per square metre of a single sheet). Handily, an A0 sheet is 1 square metre, and so are 16 sheets of A4. Typical book paper lies in the range of 60–100gsm. In North America, paper weight may instead be expressed in terms of the Basis weight, the weight in pounds of a Ream (500 sheets) sized 25 × 38 inches. Typical book paper lies in the range of 40–70lb basis weight (approximately equivalent to 60–100gsm). For some types of paper and board, a different basis size is used – eg 20 × 26 inches for cover board, so 230gsm cover board is 85lb basis weight (and this is sometimes written 85#). See for basis weight to gsm conversion tables. See also bulk.
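The relationship between basis weight and gsm described above can be sketched as a short calculation. This is a minimal illustration only: the function names are hypothetical, and the default basis size of 25 × 38 inches is the book-paper size mentioned in this entry.

```python
# Conversion constants (exact by definition).
GRAMS_PER_POUND = 453.59237
SQ_METRES_PER_SQ_INCH = 0.0254 ** 2
SHEETS_PER_REAM = 500


def basis_weight_to_gsm(pounds, basis_width_in=25, basis_height_in=38):
    """Convert a basis weight (lb per 500-sheet ream) to grams per square metre."""
    ream_area_m2 = (
        SHEETS_PER_REAM * basis_width_in * basis_height_in * SQ_METRES_PER_SQ_INCH
    )
    return pounds * GRAMS_PER_POUND / ream_area_m2


def gsm_to_basis_weight(gsm, basis_width_in=25, basis_height_in=38):
    """Convert grams per square metre to a basis weight in pounds."""
    ream_area_m2 = (
        SHEETS_PER_REAM * basis_width_in * basis_height_in * SQ_METRES_PER_SQ_INCH
    )
    return gsm * ream_area_m2 / GRAMS_PER_POUND


# Book paper uses the 25 x 38 inch basis size; cover board uses 20 x 26 inches.
book_paper_gsm = basis_weight_to_gsm(40)
cover_board_lb = gsm_to_basis_weight(230, basis_width_in=20, basis_height_in=26)
```

With these assumptions, 40lb book paper works out at roughly 59gsm, and 230gsm cover board at roughly 85lb, in line with the figures quoted in this entry.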


In XML, a parser is software that can read an XML document (such as an ONIX message), check that it matches the required XML syntax and grammar (by comparing it with a DTD or schema file), and make each individual data element available to other software (via an API).
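As a rough illustration of the syntax-checking step, the sketch below uses the parser in Python's standard library to test whether a document is well-formed. Note the assumptions: the sample fragments are simplified and hypothetical, and this particular parser checks XML syntax only. It does not validate against a DTD or schema, which would need a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical, heavily simplified fragments (not complete ONIX messages).
GOOD = (
    "<ONIXMessage><Product>"
    "<RecordReference>com.example.0001</RecordReference>"
    "</Product></ONIXMessage>"
)
# The RecordReference element is never closed, so the syntax check fails.
BAD = (
    "<ONIXMessage><Product>"
    "<RecordReference>com.example.0001"
    "</Product></ONIXMessage>"
)
```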


Publication released in several weekly or monthly installments, which may be bound together to form a complete book.


Generalized term for a person or organization involved in a creative activity. See also persona.


See DDA.


Portable Document Format, standard file format for electronic documents (including e‑books), originally devised by Adobe as a simplification of PostScript, but now an ISO standard. A PDF file can encapsulate almost every aspect of the document – the text, fonts, all vector and raster graphics and even limited interactivity. In general, a PDF fixes the visual aspects of the document exactly and PDF is often used to transfer publisher files to the printer for manufacturing, but reflowing PDFs are possible. Unfortunately, PDFs do not retain structural or semantic markup, which reduces their accessibility. So-called ‘web PDFs’ are normal PDFs (generally not reflowable) but which may have lower quality images (fewer pixels or more loosely, lower resolution, than PDFs intended for printing) to limit the file size.


See FSC.

Perfect binding

See adhesive binding.


Punctuation mark (‘ . ’) also called a full stop, used at the end of a sentence, as a decimal point in a real number, or after an abbreviation. Not to be confused with a mid-dot (‘ · ’) or comma (‘ , ’) which can also be used as decimal points in different locales. XML data uses only the period as a decimal point within real numbers, even if that data is then displayed using the appropriate separator for the locale.


Authorization from the copyright or other rightsholder to incorporate one work, or more often a part of a work, within another – for example the permission to include a copyright image or to quote a passage of text from one book in another. See also rights.


The public-facing outward identity of a party – usually of a person, though a persona may be pseudonymic, or otherwise distinct from the private identity of a real-world person. See also ISNI, contributor.


Typographic symbol ‘ ¶ ’ usually indicating ‘paragraph’ (or sometimes in technical contexts, the Return character at the end of a paragraph).


The first recorded bibliographic database, developed by Callimachus to catalog the papyri holdings of the Library of Alexandria around 245 BCE.


See raster image.

Plate section

Pages bound into a book containing primarily halftoned illustrations (often photographs), usually printed on higher quality paper than the remainder of the book block. Plate sections – sometimes also termed Inserts – usually lack page numbers. cf integrated book.


Printed laminated case, a decorative variation of a plain paper over boards hardback. The printed design on the cover boards is usually varnished or laminated, as PLCs often lack a separate jacket.


Public lending right, the right of an author to be compensated for the loan of books from libraries – and in the UK, the system used to provide the related payments.


Portable network graphics, a losslessly-compressed raster image file format. While mostly intended as an improvement over GIF (a relatively low-quality image format) for online use, PNG can carry high-quality RGB images, including transparency, color profiles etc – but it does not support CMYK. As a result, its use in publishing workflows is limited. See also TIFF, JPEG.

Pocket book

Small book, usually a paperback. Just how small varies from publisher to publisher and country to country. (In the UK, a pocketbook is a small blank notebook, often spiral bound, and the term ‘pocket book’ is not used – it is roughly synonymous with A-format paperback.)


Print on demand, the manufacture of a single copy – often using xerographic (dry, toner-based) or ink-jet printing – in response to a customer order. POD copies may be drop-shipped to the retail customer, or fulfilled via a retailer. POD has a largely undeserved reputation for inferior print quality and binding: current POD technology rivals the quality of conventional litho manufacturing. cf short-run printing. See also ASR.


Traditional primarily American and British typographic unit of size, abbreviated as pt. Historically, slightly less than 1⁄72nd of an inch (or about 0.3515mm). The use of points in PostScript has made exactly 1⁄72 (0.3527mm) the de facto standard, and spread the use of points to other typographic traditions. The main text in a printed document is usually around 9–12pt. See also em. cf didot, similar measurement unit widespread in European typography at least until the 1980s, around 0.375mm (ie about 7% larger than a traditional point).
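For reference, the PostScript definition of the point (exactly 72 to the inch) makes conversion to millimetres a one-line calculation. The sketch below is illustrative only, and the function name is hypothetical.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # PostScript point, as described in this entry


def points_to_mm(points):
    """Convert a size in PostScript points to millimetres."""
    return points * MM_PER_INCH / POINTS_PER_INCH


one_point_mm = points_to_mm(1)    # a little over 0.35mm
body_text_mm = points_to_mm(10)   # a typical main-text size
```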


See landscape.


Point of Sale, or Point of Service – ultimately, the retailer’s till or checkout. In the US, Point of Purchase.

Promotional material such as posters, dumpbins, bookmarks placed at or for sale at the retailer’s till or checkout.


See pre-coordination.


Computer programming language devised by Adobe, used by early laser printers and by typesetting machines to describe page images, now mostly superseded by PDF.


Pixels per inch, an unsatisfactory measure of resolution of a digital image. It relates the number of pixels in a digital image to the physical size of the image when printed or displayed.


Method of classification where multiple concepts are combined to form single subject headings or codes carrying a complex meaning. All allowed combinations are enumerated in advance. The opposite is Post-coordination, where multiple simple subject headings are used, and the complex combined meaning emerges from the combination of subject headings or codes. Post-coordination allows creative combination of the subject headings. As an example, BISAC is pre-coordinated: a children’s sports story featuring baseball would be classified as JUV032010. In contrast, Thema is at least in part post-coordinated: the same book would be classified as YFR (Children’s sporting stories) plus YNWD3 (baseball), and the full meaning emerges from the combination of the two codes.


See front matter.


Consumer order placed with a retailer in advance of actual availability of the product, for delivery on (or at least close to) the publication date.

Presentational markup

See semantic markup.

Readers with visual, physical or cognitive impairments who cannot use conventional printed or on-screen books, for example blind, partially-sighted or dyslexic readers, or readers with a physical disability. Print-impairment is a range of issues rather than a single problem, but print-impaired readers can often make use of various types of assistive technology and accessible editions. In many countries – in particular, those that have ratified the Marrakesh Treaty – print-impaired readers have a copyright exception that allows copying and modification of copyright material to make it accessible for personal use.

Number of copies printed in a single impression. Historically, this was an edition, and this sense is still used in book collecting (‘a valuable first edition’).

Process color

Color printing in which CMYK inks and halftoning are used to simulate the full range of color. Note that cyan, magenta, yellow and black is not the only possible combination of process colors – though rare, a six-color process with additional orange and green inks can widen the range of colors that can be simulated faithfully (the color gamut). cf mono, spot color.


A particular manifestation of a work that is available for sale to the public (or to an organization, on a business-to-business basis). Although ONIX is often characterized as describing products, it can be used to describe parts of a product that are not individually available, or components that are used during creation of products. However, such parts and components should not be identified using an ISBN, unless they are also products in their own right.

Product record

All the ONIX data for a single commercial product, grouped in several Blocks between one <Product> tag and the following </Product> end tag – in effect, one giant <Product> composite. In book publishing, each commercial product is identified by its ISBN (though ONIX allows for other identifiers too), and the metadata about that product is identified by the <RecordReference> data element within the Product record. There can be many Product records in an ONIX message.
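To make the distinction between the record identifier and the product identifier concrete, the sketch below walks the Product records in a small sample message and pulls out each <RecordReference> alongside the ISBN-13. It is a simplified illustration under several assumptions: the sample message is hypothetical and far from complete, it omits the XML namespace a real ONIX 3.0 message would normally declare, the identifier values are placeholders, and the function name is invented for this example. <ProductIDType> 15 is taken to be the code for ISBN-13.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated message containing two Product records.
SAMPLE_MESSAGE = """<ONIXMessage release="3.0">
  <Product>
    <RecordReference>com.example.0001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000002</IDValue>
    </ProductIdentifier>
  </Product>
  <Product>
    <RecordReference>com.example.0002</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000019</IDValue>
    </ProductIdentifier>
  </Product>
</ONIXMessage>"""


def list_product_records(xml_text):
    """Return (record reference, ISBN-13 or None) for each Product record."""
    root = ET.fromstring(xml_text)
    records = []
    for product in root.iter("Product"):
        reference = product.findtext("RecordReference")
        isbn13 = None
        for identifier in product.findall("ProductIdentifier"):
            # Assumption: code 15 marks an ISBN-13 identifier.
            if identifier.findtext("ProductIDType") == "15":
                isbn13 = identifier.findtext("IDValue")
                break
        records.append((reference, isbn13))
    return records
```

Calling `list_product_records(SAMPLE_MESSAGE)` yields one pair per Product record, which shows that the record reference identifies the metadata record while the ISBN identifies the product itself.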

Proprietary code and value BOOKNET CANADA ADDITION

A composite carrying an ONIX code list entry defined as “proprietary” so that it can carry a value defined outside of ONIX. An IDType or description value MUST be supplied so the receiver can determine what the value is intended to mean. It should therefore be simple to use as a code and maintained consistently. A typical example is a discount code or a proprietary value such as an Amazon ASIN supplied as an identifier.


In metadata, the origin of an assertion or metadata property – for example, who says that the book is about motorcycle maintenance? The provenance is important when metadata sources conflict.



The date on which the product is nominally ‘published’, though not necessarily the date on which it first becomes available at retail. See Publishing date composite in Group P.20.

For Canadian market context, refer to the BookNet Canada documentation on

Public domain

Works in which all copyright, neighboring and related rights have expired, or those where such rights never pertained. Note that works covered by Creative Commons and similar open access licenses are not public domain, though for particular ‘licenses’ such as the CC0 (‘CC Zero’) waiver, the net effect may be somewhat similar.


The organization responsible for making the product available, and generally the one taking the financial risk. cf distributor, contributor and imprint. The publisher is a legal entity, whereas the imprint is merely a brand name.

Publisher collection BOOKNET CANADA ADDITION

A bibliographic collection to which the publisher assigns a collective identity, either on the products themselves, or in product information for which it is responsible (again, Penguin Modern Classics is a clear example).

See collection.

Publishing rights

The rights to publish or commercially exploit a work in various ways, obtained by the publisher from the author, creator or other rightsholder, and ultimately derived from the creator’s copyright. The publishing rights obtained by a publisher may be exclusive, global and perpetual, or non-exclusive, limited geographically or for a limited period, and may cover all languages or a specific language. In the English-language publishing market, it is common for a British-based publisher to obtain exclusive world (or world English language) rights from an author, but then sublicense exclusive North American rights to a US-based publisher (or vice versa). Such a sublicense may include additional non-exclusive rights to countries where both UK and US publishers compete to sell their own separate versions of the work. Alternatively, the author may license the North American and remaining rights separately to two different publishers, or a single publisher may obtain the rights to publish globally. A publisher may also sublicense subsidiary and neighboring rights (translation, abridgement, serialization, audio etc) it does not wish to exploit directly. cf sales rights.


Destruction of unsold copies of a book to remove them from the supply chain. This ultimately involves shredding and recycling of the paper mass to make new paper, but there may be stages prior to recycling that involve defacing or otherwise making copies of the book unsaleable.


QR code

Quick Response code, a type of 2D barcode comprising a grid of black and white squares. When scanned (eg with a smartphone), QR codes can trigger actions (go to a URL, send an SMS text message, add contact details to an address book, display text), and some actions have attendant security risks. There is no standard use within the book trade, but they are often used in consumer advertising to avoid the need to type long URIs.


Subject code used within schemes such as BIC and Thema that can only be used to refine the meaning of another code.


High-quality binding in which the spine (only) is bound in leather (or other fine and durable material). cf half-bound, full-bound.

Quotation marks

Typographic symbols (eg ‘ ’ or “ ”) used in pairs around quotations, reported speech or to highlight a word or phrase. Different languages use different quotation marks, even within the same writing system, and conventions for single or double quotes vary too (eg English uses ‘…’, German uses „…“, French uses « … », Japanese uses 「…」).


Rack size

Size of mass-market paperback common in the US, 6¾ × 4¼ inches (or 171mm × 108mm), occasionally slightly taller. A ‘tall rack size’ is around 7½ × 4¼ ins. See also A-format.

Rag book

Children’s book in which the cover and every leaf are fabric rather than paper, card or board. Also termed a ‘Cloth book’.


Random Access Memory, the working memory of a computer, usually in the range of 1 gigabyte (roughly a billion bytes, in a smartphone) to 64 gigabytes (in a server). The contents of RAM are usually lost when the computer is switched off. Contrast with the more or less permanent storage attached to a computer, which is also measured in gigabytes.

Raster image

A digital image represented as rows and columns of colored dots (pixels – picture elements). TIFFs and JPEGs are raster image formats. Raster images cannot be scaled without losing visual clarity – if displayed too large, curves become jagged and the image pixels become individually visible. cf vector image.


Resource Description and Access. Modern set of library cataloging rules – increasingly widely used and gradually supplanting AACR2 in most English language library applications. RDA rules are also applicable to cataloguing in other languages.


See relational database.


Resource Description Framework, a set of W3C specifications (‘Recommendations’) for representation of metadata about resources (typically web resources) as sets of ‘triples’, three-part statements expressing relationships between entities. RDF triple data can be expressed as XML, or in a variety of other data formats (eg Turtle or JSON‑LD), and is the foundation of the ‘semantic web’. RDFa is a data format for embedding RDF-style structured data within the markup of HTML web pages, and is one of the data formats used by

Reading age

Measure of a child’s reading proficiency or the proficiency required to read a text, expressed as the age of a child of average reading ability.

Real number

Number that contains both a whole number part and a fractional part, representing a quantity that can vary continuously rather than in discrete increments. Can be expressed as a vulgar or mixed fraction (eg 5⁄4 or 1¼) or as a decimal (1.25), though only the latter can be used in an XML data element. (More strictly, in mathematics, these are rational numbers. Real numbers also include irrational numbers, which cannot be represented exactly by fractions.) cf integer.

Often just RRP, or occasionally termed a Suggested retail price, or SRP. Price chosen and recommended by the publisher for sales to the consumer. The retailer does not have to use this price, and may choose to sell the product for a lower (or higher) price – the Actual selling price or ASP. But royalties paid to the contributors of the book are often based on a percentage of the RRP, even though the Actual selling price is different. In some countries, retail prices set by publishers are fixed: by law, retailers may not reduce (or increase) the Fixed retail price, or may do so only within a fairly narrow band or only after a certain time has elapsed since publication. In countries where RRPs are the norm, the Agency model may also allow publishers to exert direct control over consumer prices.


The side of a single leaf in a book that is read first – usually the right hand page in a book, which is given an odd page number. (Conceptually at least, a recto page is a left page in Arabic, Hebrew, or in traditional Chinese and Japanese, where page progression is right-to-left.) The opposite is a Verso page, the second side of a leaf, or left hand page (in most languages and writing systems), with an even page number.

Red Book

The standard covering the physical, optical and electronic nature of all Compact Discs, and for recording high-quality (44.1kHz, stereo, 16‑bit linear PCM) digital audio on such discs. cf Yellow Book.


Internationally recognized codes, made up of letters and/or numbers, assigned to countries and their subdivisions, based on ISO 3166. See ONIX code list 49.

Registration agency

See registration authority. 

Registration authority

Organization ultimately responsible for managing an identifier standard, its governance, and registrations or assignments, for example the International ISBN Agency which is responsible for the ISBN system on a global basis. A registration authority may delegate the process of registration or assignment to multiple Registration agencies, perhaps on a geographical or sectoral basis. For example, Nielsen in the UK and Bowker in the US operate national ISBN registration agencies. (This terminology mostly applies to ISO identifier standards.)


As for reprint, but with refreshed marketing collateral, a new cover etc. Reissue sometimes implies renewed availability after a period where the product has been unavailable. Reissues continue to carry the same ISBN as earlier impressions – and in contrast, using a new ISBN means it is a new product, not a reissue of the original, even if it carries identical content and has an identical product form etc.

Relational database

Metadata is often managed using a relational database, in which the data is stored in one or more tables. Each table has rows representing entities such as products or contributors, and the table columns are the properties of each entity. Tables are related to each other, usually through sharing a single column in common, and the relationships usually mirror the real-world relationships between the entities. The tables are usually normalized (designed to minimize redundancy) – so an author who has written a dozen books has their name stored once, forming a row in a Contributor table, and data in that row can be linked to the many rows in the Books table as required. Most relational database management systems (RDBMSs) or applications have library software for importing and exporting XML. See also SQL.
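The normalized Contributor and Books arrangement described above can be sketched with an in-memory SQLite database. This is a minimal illustration, not a recommended schema: the table names, column names and sample data are all hypothetical, and a real system would likely need a separate linking table, since one book can have several contributors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two related tables: the author's name is stored once, in contributor,
# and each book row refers to it through the shared contributor_id column.
conn.executescript("""
    CREATE TABLE contributor (
        contributor_id INTEGER PRIMARY KEY,
        name           TEXT NOT NULL
    );
    CREATE TABLE book (
        book_id        INTEGER PRIMARY KEY,
        title          TEXT NOT NULL,
        contributor_id INTEGER NOT NULL REFERENCES contributor (contributor_id)
    );
""")

conn.execute("INSERT INTO contributor VALUES (1, 'Example Author')")
conn.executemany(
    "INSERT INTO book (title, contributor_id) VALUES (?, ?)",
    [("First Title", 1), ("Second Title", 1)],
)

# Joining on the shared column reunites each book with its author's name.
rows = conn.execute("""
    SELECT book.title, contributor.name
    FROM book
    JOIN contributor ON book.contributor_id = contributor.contributor_id
    ORDER BY book.book_id
""").fetchall()
```

Correcting the author's name then means updating a single row in the contributor table, and every linked book picks up the change.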


See Expected availability date in Group P.20.

For Canadian market context, refer to the BookNet Canada documentation on


Excess stock of books disposed of by the publisher or distributor at a low price to a specialist bookseller (who cannot return them). An alternative to destruction of excess unsold copies, for example by pulping.


In e‑books, a limited-term (temporary) license, where a ‘purchase’ is a perpetual license. cf subscription.


In text encoding, the range of characters used in a document, or present in a particular character set.

In rights, may refer to the works of a particular creator or rightsholder, or those offered by a rightsholder for licensing under particular terms.


[Print] a new impression, usually manufactured to replenish stock. Copies are essentially identical to the previous batch or impression, though may incorporate minor changes to correct errors, and they carry the same ISBN as previous impressions. Note that if the changes represent significant alterations to the content, then the new copies are a distinct edition – in ONIX terms a new product (and indeed a new work) which would also have a new ISBN (and a new ISTC).

Reseller model

Business model based on the idea that the publisher sells to an intermediary (typically a distributor, wholesaler or retailer) based on an established retail price (RRP or FRP) minus a trade discount (the discount may vary from intermediary to intermediary), or via an established business-to-business or wholesale price. The intermediary can subsequently sell on the book to the consumer at whatever price they choose. Alternatively, in some legal frameworks, the retailer must sell to the consumer at the fixed retail price. In either case, it is the wholesaler or retailer that is the publisher’s direct customer, and not the consumer. This is the most common business model in the book trade; cf the alternative agency model.


The process of resolving an identifier to retrieve its underlying metadata, or to find the entity itself. For some ‘actionable’ identifiers such as the ISBN-A, this resolution process can be via the internet, in the same way a DNS name may be resolved to an IP address.

The degree of detail captured in an image (or more generally, degree of precision in a measurement). Image resolution is usually expressed in dots per inch (DPI) or per centimetre (occasionally it’s pixels per inch, PPI). Note that resolution applies to an image in the physical world, not to the underlying pixel data which can be reproduced at any size (see notes on image resolution in Group P.16).
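The point that resolution belongs to the reproduced image rather than to the pixel data can be shown with simple arithmetic. The sketch below is illustrative only, with hypothetical function names and pixel dimensions: the same pixel data gives different resolutions depending on the physical size at which it is reproduced.

```python
def printed_size_inches(width_px, height_px, ppi):
    """Physical size (width, height) in inches when reproduced at a given PPI."""
    return (width_px / ppi, height_px / ppi)


def effective_ppi(width_px, printed_width_in):
    """Resolution that results from reproducing the pixels at a given width."""
    return width_px / printed_width_in


# A hypothetical 1800 x 2700 pixel cover image.
size_at_300 = printed_size_inches(1800, 2700, 300)  # 6 x 9 inches
ppi_at_12in = effective_ppi(1800, 12)               # 150 pixels per inch
```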


See returns.

Carriage return, character typed to indicate a new paragraph or confirm entry of data.


Books sold to retailers on ‘sale or return’ terms, and which fail to sell at retail, can be returned to the wholesaler or distributor for credit. Returned copies can be re-sold to another retailer, or may be pulped. Some publishers allow low-value books to be destroyed by the retailer rather than physically returned (eg by stripping off the cover, which is then returned as proof of destruction). Publishers clearly aim to discourage over-ordering and minimise the consequent returns from retailers, but the trade as a whole benefits from ensuring bookstores are full of potentially-saleable books rather than full of books known to be unsaleable. Thus, a certain level of returns is anticipated, and the credit on returns frees retailers to order new books that are more likely to sell.


Return to the creator of rights previously licensed by the creator to a publisher, for example at the end of a contractually-set term or under other agreed circumstances. Reversions are often requested if a publisher fails to keep a work in print. Similarly, sublicensed rights can revert to the licensor.

Review copy

Copy sent (free of charge) to the press or other media for the purposes of review. See ARC.

See approval copy.


Radio frequency identification, technology used to communicate with tiny electronic tags that can be attached to physical items such as books. The tags contain identification and other information for the item, which can be read at a short distance to track the item and automate many logistics processes. For example RFID tags can be used to automate library lending processes.


Red, Green, Blue – basic additive color model used (eg) on a computer screen to generate (or at least simulate) the full range (or gamut) of visible colors. See sRGB, Adobe RGB, CMYK, color profile.


General term covering copyright, moral rights and other intellectual property rights, plus contractual rights such as the right to distribute or sell products. So-called ‘Volume rights’ give the publisher the right to publish and sell products based on (manifestations of) a copyright work, and are sometimes divided by language and geographical territory (eg ‘the publisher holds Commonwealth English language rights’). Subsidiary rights or ‘sub-rights’ – usually attached to the volume rights, but often sublicensed by the volume rights holder to another publisher – include the right to create and publish serializations for newspapers or magazines, abridgements, adaptations as a play or movie, and (assuming the underlying volume rights are not limited by language) translations. Distribution rights are contractual rights conferred by the publisher on a distributor, enabling the distributor to trade a product in a particular market. See also permissions, publishing rights, sales rights.


Party which holds copyright or some related rights in a work.


RELAX NG schema language. See schema.

Rough front

See deckle edge.


Common UK hardback book size typically around 234 × 153mm. See also demy, trade paperback.


Payments made by the publisher to authors or other contributors, in return for the right to publish and sell the book. Royalties are usually calculated based on the number of copies sold and a percentage of the list price of the book – though sums are often paid in advance, and minor contributors may only receive a fixed fee.


See recommended retail price.


Small gloss or annotation added to characters in non-alphabetic scripts such as Japanese Kanji or Chinese Hànzì, used to guide pronunciation of unfamiliar characters or provide phonetic information for sorting purposes. Ruby glosses are written using phonetic characters such as Pinyin for Chinese or Hiragana for Japanese. See Person name in Group P.7.9, though ruby glosses can be incorporated into any textual data element in ONIX. In HTML, the <ruby> tag can be used to specify glosses. Ruby glosses are so named in English because they were supposedly typeset in Ruby size (5½pt) type.


Saddle stitched

Binding by stapling (wire stitching) folded sheets together in the seam of the fold. The clinch of the staple lies in the centre of the book. cf side stitched, where a wire stitch is driven through the folded sheets from front to back (the clinch of the staple lies at the back of the book).

Sale or return

Transaction terms where goods are sold (eg from a distributor or wholesaler to a retailer) and the purchaser retains the right to return them for full credit if not sold at retail. Sometimes called ‘see-safe’ terms. The purchaser’s right to return unsold copies may expire when – or a fixed period after – a book is declared out of print by the publisher. In some cases, goods need not be physically returned to claim a credit – the cover can be torn off and returned instead (such books are ‘strippable’ and carry a triangular indicator containing a letter S adjacent to the barcode), or the products can be defaced or destroyed in some trusted process to prevent resale. In contrast, ‘Firm sale’ terms mean the product is sold and is not returnable for credit at all, and ‘Consignment’ terms mean the product is still owned by the upstream distributor or wholesaler even though it is physically stocked by the retailer (the retailer ‘buys’ it only after selling it on to a consumer).

Occasionally, sale or return terms (more strictly, consignment terms) are applied to products supplied to end-purchasers, and the purchaser pays only after deciding to keep the goods. See approval copy.


See Publishing date composite in Group P.20.

See On-sale date.

For Canadian market context, refer to the BookNet Canada documentation on

Sales rights

The commercial rights derived from the publisher’s publishing rights that a publisher confers on its distributors, wholesalers and retailers, allowing them to trade in and make the product available to customers. Note the contrast with publishing rights: publishing rights concern where a publisher has the right to publish and sell a product. The sales rights are where the publisher chooses to make the product available (ie chooses to exercise those publishing rights). Clearly the sales rights must be a subset of (or the same as) the publishing rights. In ONIX, only the sales rights are described in detail, and the publishing rights themselves are not made explicit (at least in part because the publishing rights pertain to the work, not the product). In ONIX, the <SalesRights> composites list the set of countries and regions where the publisher is exercising its rights to make a product available. A <ProductSupply> composite and its enclosed <Market> composites can detail the subset of countries and regions where those rights are conferred upon a particular set of distributors, wholesalers and retailers. Another market might have a different subset of the sales rights conferred upon a different group of suppliers. When broken down by market, these are often termed Distribution rights.
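
A minimal sketch of how a data recipient might read the <SalesRights> composites follows. The XML fragment is hypothetical and omits the ONIX namespace; the nested element names (<SalesRightsType>, <Territory>, <CountriesIncluded>) and the codes 01 and 03 (used here to stand for 'for sale' and 'not for sale') are assumptions that should be checked against the ONIX 3.0 specification and codelists.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only, not a complete ONIX record
FRAGMENT = """
<PublishingDetail>
  <SalesRights>
    <SalesRightsType>01</SalesRightsType>
    <Territory>
      <CountriesIncluded>CA US</CountriesIncluded>
    </Territory>
  </SalesRights>
  <SalesRights>
    <SalesRightsType>03</SalesRightsType>
    <Territory>
      <CountriesIncluded>GB IE</CountriesIncluded>
    </Territory>
  </SalesRights>
</PublishingDetail>
"""

def countries_by_rights_type(xml_text):
    """Map each SalesRightsType code to the set of country codes listed for it."""
    result = {}
    root = ET.fromstring(xml_text)
    for rights in root.iter("SalesRights"):
        code = rights.findtext("SalesRightsType")
        countries = rights.findtext("Territory/CountriesIncluded", default="")
        result.setdefault(code, set()).update(countries.split())
    return result

rights = countries_by_rights_type(FRAGMENT)
```

This handles only lists of countries; a fuller reader would also need to deal with regions and exclusions.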

Sales tax

Tax levied as a percentage of retail sales to the consumer or end user. cf VAT, which is levied incrementally at all points in the supply chain. Sales taxes are levied by most US states and some Canadian provinces, with rates varying typically in the 0–7.5% range, and additional sales taxes may be levied by city and local government. Since the total retail price – inclusive of tax – varies according to the exact location of the retail sale, advertised prices in those countries do not include the sales tax element, and tax is added at the checkout.
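
The way tax is added at the checkout can be sketched as below. The rates (6% state plus 1% city) and the function name are hypothetical, and the sketch assumes the simplest case in which all rates apply to the same advertised price; actual rules vary by jurisdiction.

```python
from decimal import Decimal, ROUND_HALF_UP

def price_at_checkout(advertised_price, tax_rates):
    """Add the combined sales tax rates to a tax-exclusive advertised price."""
    total_rate = sum(tax_rates, Decimal("0"))
    total = advertised_price * (Decimal("1") + total_rate)
    # Round to whole cents
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Hypothetical rates: 6% state sales tax plus 1% city sales tax
total = price_at_checkout(Decimal("20.00"), [Decimal("0.06"), Decimal("0.01")])
```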


Standard Address Number, American national standard identifier for a trading location within the supply chain. The SAN registry is administered by Bowker. In Germany, the Börsenverein administers the similar Verkehrsnummer. In contrast to the GLN, the SAN is unique to the publishing industry, but is well established in book-related e-commerce in North America and parts of Europe to identify distribution locations, customer delivery addresses etc.


Like a DTD, an XML schema formally defines the set of markup tags that may be used in a particular type of XML document, whether each tag is mandatory or optional, and their order and nesting. But unlike a DTD, a schema also constrains the data types and values that may be used within the data elements in the document. It can require that a particular tag contains an integer, or a date, or set a limit on the length of text. And an XML schema can define lists of allowed values (‘enumerations’, controlled vocabularies, or in ONIX terminology, codelists) that can be used in a particular data element. Two primary ‘flavors’ of ONIX for Books schemas are available, using the XSD and RNG schema languages, both of which are themselves XML documents. (Technically, the DTD is a very simple kind of schema too, though DTDs are not themselves XML documents and they cannot define required data types or enumerations.) The normal or ‘classic’ ONIX XSD is based around XSD 1.0. A ‘strict’ XSD 1.1 is also available, which checks a further range of data types, business rules and other requirements, although it is not compatible with all XML validation scenarios. A further schema language flavor, Schematron, can be used independently or in conjunction with an XSD schema. A range of Schematron-based rules are embedded within the ONIX ‘strict’ XSD to provide optional warnings covering the use of deprecated data elements and codes.

A database schema is a formal definition of the structure of a database, specifying the nature of the columns, tables, relationships and so on in the database.


See Schema.

See Validation.

Initiative by the major search engines – specifically by Google, Bing, Yahoo and Yandex – that encourages addition of structured metadata into HTML web page markup using a JSON‑LD, Microdata or RDFa vocabulary, largely for SEO purposes.


An alternative type of XML schema language. Schematron is rule-based, and is able to test conformance with a wider range of document constraints and business rules than XSD or RNG schemas. Over and above the pass/fail capability of validation with an XSD 1.1, Schematron validation can also deliver warnings about the data.


A writing system of conventional symbols representing the elements of language. A script can be alphabetic, like Latin or Cyrillic, or logographic like Kanji or Hanzi. Scripts can be linked to particular languages (eg Hangul to Korean), or used for a variety of languages (eg Latin to a host of European languages, and also to Chinese via phonetic Hanyu pinyin). A handful of languages are commonly written in more than one script (eg Serbian in either Cyrillic or Latin), or have changed their script at some point in recent history (eg Turkish and Vietnamese are now written using Latin script).


Printed or manuscript content arranged as a continuous document, not divided into discrete pages. cf codex.

Section mark

Typographic symbol ‘ § ’ usually indicating a section or clause in a document.


Person combining the roles of contributor and publisher.


Sales to retailers – the number of copies entering the sales channel. In the book trade, most of these copies are typically sold on sale or return terms, so sell-in is not a final sales total. cf sell-through.


Sales to consumers. In principle, over the long term, sell-through equals sell-in minus returns – though over a shorter period, this can be masked by changes in the number of copies held in stock by retailers. Sell-through is sometimes expressed as a percentage of sell-in (eg a sell-through of 85% implies a 15% return rate). See also EST.
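
The relationship between sell-in, returns and sell-through can be written out as a small worked example. The function names and figures below are illustrative assumptions, and the sketch ignores changes in retailer stock levels, so it reflects only the long-term view described above.

```python
def sell_through(sell_in: int, returns: int) -> int:
    """Copies sold to consumers over the long term: sell-in minus returns."""
    return sell_in - returns

def sell_through_rate(sell_in: int, returns: int) -> float:
    """Sell-through expressed as a percentage of sell-in."""
    if sell_in == 0:
        raise ValueError("sell-in must be greater than zero")
    return 100 * sell_through(sell_in, returns) / sell_in

# Hypothetical title: 1,000 copies shipped to retailers, 150 later returned
copies = sell_through(1000, 150)
rate = sell_through_rate(1000, 150)
return_rate = 100 - rate
```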

Semantic markup

Markup tags that define or at least highlight the meaning or nature of the tagged text, rather than specifying how it should be presented. In contrast, presentational markup defines only how the text should be displayed on page or screen. As a simple example, HTML includes <i>, a purely presentational tag that indicates text should be displayed in italics. In contrast, the <em> and <cite> tags have the same typographic effect on the appearance, but have extra semantic value – they indicate why the text should be italicized (for emphasis, or because it is a title citation).

Semantic web

See linked data.


Search engine optimization, the process of enhancing the visibility of your web pages in organic search results, by adding cross-links, keywords and structured metadata about the web page within the web page itself (see and RDFa), or using terms in the content that are frequently searched for.


A distinct data feed.

See Organization of data delivery.


Ongoing publication issued under the same title in a succession of discrete parts, often at regular intervals, such as a magazine, academic journal, newspaper or regularly-issued directory or annual report. Issues are usually dated or numbered sequentially, and are usually purchased via a subscription rather than by purchase of individual issues. Just as monographic publications are identified with an ISBN, serial publications are identified with an ISSN.


Continuing and indefinite sequence of monographic products published separately over a period of time, with a shared identity such as a ‘series title’. The products are usually of similar product form, and share a distinctive branding or design style. A series is not available for purchase as a single product. In ONIX, a series is a type of collection. cf set.


Finite number of products published simultaneously or over a definite period of time, with a shared identity such as a ‘set title’. The products are usually of similar product form, and share a distinctive branding or design style. The products in the set may be available individually, or the set may be a single product, or both. In ONIX, a set is a type of collection. cf series.


Standard Generalized Markup Language. Highly complex technical standard for markup. Effectively the predecessor of XML, but rarely used because of its complexity.

Sheet-fed press

See web press.

Short-run printing

Uses similar technology to POD to manufacture small numbers of copies (a print run of perhaps 10 up to 200 copies) in response to a publisher order. These copies are then warehoused and distributed in a conventional manner (though warehousing for such small numbers of copies may be at the printer, rather than at a dedicated distributor’s or wholesaler’s warehouse). See also ASR.


Thin plastic film used for packaging, which contracts tight around the packed goods when exposed to heat. Single copies of high-value books can be shrink-wrapped to protect them until they are sold, or entire pallets of cartons can be wrapped to maintain their stability and integrity during distribution.

Side stitched

See saddle stitched.


A number of pages (usually a multiple of eight) printed on a single sheet and then folded and trimmed to form a section of a book. Signatures are gathered in the correct order and bound to form the book block. See also imposition.


See pallet.


Simple Knowledge Organization System, a way of structuring and representing controlled vocabularies such as ONIX codelists, subject classification and categorization schemes, taxonomies, etc. Using SKOS, each concept (such as a single entry in an ONIX codelist or a single Thema subject) has an optional (language-independent) notation, a preferred label (per language), alternative labels (per language) and a variety of notes (per language), and can be related in various ways to other concepts (eg semantically broader than, narrower than, related to).
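
The structure of a SKOS concept described above can be sketched as a simple data class. This is an illustration of the shape of the data only, not an implementation of SKOS itself (which is normally expressed in RDF); the class, field names, notations and labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass(eq=False)
class Concept:
    pref_label: Dict[str, str]                 # one preferred label per language
    notation: Optional[str] = None             # optional, language-independent
    alt_labels: Dict[str, List[str]] = field(default_factory=dict)
    notes: Dict[str, List[str]] = field(default_factory=dict)
    broader: List["Concept"] = field(default_factory=list)
    narrower: List["Concept"] = field(default_factory=list)
    related: List["Concept"] = field(default_factory=list)

def link_broader(child: Concept, parent: Concept) -> None:
    """Record that 'parent' is semantically broader than 'child', in both directions."""
    child.broader.append(parent)
    parent.narrower.append(child)

# Made-up notations and labels, not entries from any real scheme
pets = Concept(pref_label={"en": "Pets", "fr": "Animaux de compagnie"}, notation="X1")
cats = Concept(pref_label={"en": "Cats", "fr": "Chats"}, notation="X1A",
               alt_labels={"en": ["Domestic cats"]})
link_broader(cats, pets)
```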


Stock Keeping Unit. In logistics, a unique (and often proprietary) identifier for each product available. In the book trade, the ISBN is sometimes used as an SKU. But often – for example where a single ISBN is reprinted or reissued – an internal stock control process needs to use more granular identification than is provided by the external product identifier (the ISBN). A book distributor might supplement the ISBN with an impression, lot or batch number to ensure older stock is sold before newer.


Service Level Agreement, the agreed quality of service (often quoted in terms of time to react, time to complete a task, acceptable technical standards etc) to be provided by a service department or an external partner.


Sleeve constructed of rigid board into which the book slides, leaving the spine exposed.

Social DRM

See watermarking, DRM.


Alone, apart from others of its type. So a ‘solus review’ which concentrates on a single book, or a solus advertisement, placed away from other adverts (or at least away from others offering similar products, cf classified advertising).


See sale or return.


Arrange into alphabetical or numerical order.

In typesetting, a special symbol (as opposed to a normal alphabetic or numeric character), a ‘dingbat’.

Special sale

Business-to-business sale where the terms and conditions of sale differ from the norm – for example, for a product where sale-or-return terms are normal, a firm sale (non-returnable) is a special sale. The buyer usually pays a lower price, of course.


Bound edge or ‘back’ of a bound book, squared off or slightly rounded.

Spot color

Specific colored ink used for printing, in contrast to simulating that color using process color inks and halftoning.


Pair of facing pages in a book.


Structured Query Language, a programming language for querying a relational database.


Standard RGB, the colorspace that Windows uses by default for RGB images. See also Adobe RGB, DCI-P3.


See RRP.


Serial Shipping Container Code, an 18-digit number used to identify logistics units such as containers, pallets and shipping cartons (parcels) in the supply chain. The SSCC is often printed as a GS1-128 barcode.


Science, Technology, Engineering and Mathematics curriculum topics and publishing sectors.


Reducing inflected or derived words to their word stem, base or root form. Search engines use stemming to improve the search results, ensuring that searches for ‘fishes’, ‘fishing’, ‘fished’, ‘fisher’, and possibly ‘fisherman’, all match ‘fish’. Occasionally termed ‘lemmatization’.
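
A deliberately naive suffix-stripping sketch shows the idea using the example words above. The suffix list is an ad hoc assumption chosen to fit those examples; production search engines use far more careful algorithms and language-specific rules.

```python
# Longest suffixes first, so that 'fisherman' is not cut at a shorter suffix
SUFFIXES = ("erman", "ing", "es", "ed", "er", "s")

def naive_stem(word: str, min_stem: int = 3) -> str:
    """Strip the first matching suffix, provided at least min_stem letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

stems = {w: naive_stem(w) for w in
         ["fish", "fishes", "fishing", "fished", "fisher", "fisherman"]}
```

With this list, every word in the example reduces to 'fish', so a search for any one of them could match the others.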


Scientific, Technical, Medical (and Legal) publishing sectors.


Number of copies of books in a warehouse, or held at a retailer, available for distribution or for sale (eg ‘free stock’).

The grade or type of paper or card used for printing (eg ‘cover stock’).


Also known as a sales embargo date (this is the preferred term within ONIX), ‘on-sale date’ or ‘laydown date’ – see the <PublishingDate> composite in Group P.20, and note that ONIX does not distinguish between a strict on-sale date backed by legal force (eg an affidavit) and one that is not (eg backed only by an industry code of conduct). Where the publisher wishes to exercise close control over the earliest retail availability of a product, this is the earliest date that a consumer may obtain a copy of a product – though advance orders (pre-orders) may be placed prior to the embargo date, and advance orders fulfilled by mail-order may be dispatched one day prior to expiry of the embargo. cf publication date.

For Canadian market context, refer to the BookNet Canada documentation on

Stream, streaming

Download and play or display audio, video or e‑book data in real time, without the recipient storing the data permanently as a file.


See sale or return.


A collection that is retailed as a single product. This definition includes what are traditionally considered to be sets, but also covers multi-packs and other multiple-item retail products, since in ONIX 3.0 they are all handled in the same way. Trade packs, designed to be broken up so that the contents can be retailed singly, are not multiple-item retail products. For ONIX purposes, the following are all multiple-item products: a complete set of Proust’s A la Recherche du Temps Perdu; all the Harry Potter novels packaged together with items of ‘memorabilia’ in a box; a classroom set of 25 copies of a coursebook together with a teacher text and DVD; a two volume dictionary; a book and toy. In Spain and Latin America, and possibly elsewhere, however, it is common practice for collections of a fixed number of items that are not retailed as a single product to be identified and described as multiple-item products.

A group of any two or more items within a bibliographic collection to which an additional, subsidiary, identity is ascribed which is also part of the bibliographic description of each member (eg in A History of Western Europe, Part II: The Dark Ages, Volume I: After Rome, the complete History of Western Europe is a bibliographic collection, and the volumes in Part II: The Dark Ages are a sub-collection).

See collection.


Subsidiary rights, or a sublicensed fraction of the volume rights.


See inferior.


Prepayment against future delivery of multiple issues of a serial publication such as an academic journal over a fixed period of time (eg an annual subscription for twelve monthly issues). Journal issues already received are retained by the subscriber even if the subscription is canceled.

Prepayment of a regular (eg monthly or annual) fee for access to an online resource such as a journal, journal collection or an online library of e‑books for a specific period. Access to the journal or e‑book library ends if the subscription is canceled (in principle at least – some e-journal subscriptions can include limited ‘post-cancellation access’). Note that some so-called B2C ‘subscription’ models are structured more like ordinary B2B sales between publisher or distributor and the subscription library operator, but operate as subscriptions between the subscription library operator and the consumer. For books, subscription has features in common with rental, but subscription usually applies to a library of many e‑books, whereas rental usually applies to a single book, and subscription is open-ended whereas rental tends to be for a fixed period.

Subscription orders

Retailer orders placed with a wholesaler or distributor several weeks prior to publication date – see also dues.

Subsidiary rights

See rights.

Superior, Inferior

Also termed superscript, subscript. Characters printed smaller and higher or lower than normal characters on a line.


See superior.

Supplier  BOOKNET CANADA Addition

The company that supplies the book to retailers (not consumers) – that is, the company that has been assigned a market area by the publisher (usually on an exclusive basis) and that supplies retailers within that market. While that makes the most sense in a print world supporting warehouses, EDI and distribution, it's no less true for digital books. The difference in the digital world is that it's common that the publisher = supplier and the market area is the "WORLD".


See supplier.

Supply chain

A network – not necessarily a linear chain – of organizations and processes involved in creating and delivering a product or service from the initial producer to the end user. In principle, the supply chain begins with the author and ends with the reader, via publishers, typesetters, printers, distributors, wholesalers and retailers etc. The supply chain encompasses the flow of both goods such as physical or digital books, and information such as metadata, orders and invoices, between the supply chain partner organizations. This is an operational management concept that encourages focus on process optimization, logistics and customer satisfaction, cf value chain, a closely-related business management term.


Scalable Vector Graphics, a vector image format incorporated within HTML5. The diagrams in this document use SVG.


Tab-separated value file, TSV

Tab-separated value data file. Tabular data file like a CSV file, but with tab characters used as separators between values instead of commas, removing the need to use quote marks when the data contains commas.
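
A short sketch using Python's standard csv module shows why tab separators avoid the need for quote marks. The column names, the placeholder ISBN and the title are made up for illustration.

```python
import csv
import io

# Hypothetical data: the title contains commas, but no quote marks are needed
TSV_TEXT = "isbn\ttitle\tprice\n9780000000002\tCats, Dogs, and Other Pets\t19.99\n"

# Reading: tell the csv module to split on tabs instead of commas
rows = list(csv.DictReader(io.StringIO(TSV_TEXT), delimiter="\t"))

# Writing: the comma-bearing title is written as-is, without quoting
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["9780000000002", "Cats, Dogs, and Other Pets", "19.99"])
written = buffer.getvalue()
```

Values that themselves contain tab characters would still need quoting or escaping.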

Tag, tagging

Can refer to either the markup elements of HTML, XML or ONIX (eg ‘the <ProductForm> tag’, or ‘using <ruby> tags in XHTML’), or to keywords associated with some content that are used to classify the content (eg ‘blog posts tagged with “ONIX”’, or ‘a tag cloud’). It is common to confuse these two very distinct meanings, particularly as classification tags can sometimes be embedded within markup tags….

EDItEUR defines Tag as: XML markup element that begins with < and ends with >. There are three types, start tags, end tags and empty tags. End tags begin </ and empty tags end with />. Start tags do not contain /. A single chunk of XML data sits between a start tag and an end tag – for example <CorporateName>EDItEUR</CorporateName>. The pair of tags and the enclosed data might also be called a data element or a data field.
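
The distinction between a start/end tag pair and an empty tag can be seen by parsing two small fragments with Python's standard XML library. The <CorporateName> example is taken from the definition above; <NoContributor/> is used here as an example of an empty tag, on the assumption that it is written this way in ONIX.

```python
import xml.etree.ElementTree as ET

# A start tag, the enclosed data, and an end tag together form one data element
element = ET.fromstring("<CorporateName>EDItEUR</CorporateName>")
name = element.tag        # the tag name, without the angle brackets
value = element.text      # the data between the start tag and the end tag

# An empty tag ends with /> and encloses no data at all
empty = ET.fromstring("<NoContributor/>")
empty_value = empty.text  # None, because there is nothing between tags
```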


A classification scheme, where controlled vocabulary terms or concepts are arranged in a hierarchy of classes and sub-classes. Strictly, a particular entity can only be attached to a single class or term within the taxonomy, though this is not always rigorously applied (for example with many ‘subject classification’ schemes – these are really subject category schemes, where entities can be assigned to multiple categories). Where the vocabulary terms are related not just hierarchically (ie broader and narrower terms) but also by non-hierarchical links of association (‘related to’) and equivalence (‘same as’), and terms are accompanied by a richer range of usage notes, the scheme is often called a thesaurus. A formal representation of the concepts and the relationships is an ontology. Taxonomic hierarchies can be simple or ‘polyhierarchical’, where a sub-class has multiple parent classes – though polyhierarchies can be confusing to use. See also SKOS.


See ISO.

Technical protection

See digital rights management.


Teaching (of) English as a foreign language. Also TESL, Teaching English as a second language.


Text Encoding Initiative, organization that develops the TEI standard and guidelines for XML semantic markup of structured text. TEI standard markup is used most often in XML workflows for academic work in the humanities, social sciences and linguistics, and particularly for text preservation.


E-commerce system handling electronic orders for books, routing EDI messages between booksellers and their suppliers.

Territorial rights

Loose term referring to rights that vary by location. See rights, publishing rights, distribution rights, sales rights.


A new(ish) subject categorization scheme developed in part from the legacy BIC scheme, but updated, internationalized and significantly extended to create a multi-lingual scheme with global applicability. It differs from BIC in that it makes greater use of post-coordination, and has a mechanism for national extensions to qualifiers within the scheme. An interactive category browser is available at Like ONIX, Thema is managed by EDItEUR. Thema was introduced in late 2013, and is currently in the early stages of implementation in many countries. It was initially intended to be used in parallel with existing nationally-focused schemes, but with the potential to supplant them as a single global scheme. It has already been adopted widely across the book trade in a number of countries, including Germany, Spain, the UK, Norway and Sweden, and has become a key part of Amazon’s ‘browse by subject’ scheme in its European stores.


In physical logistics, the number of items per layer (tier) and number of layers high that goods are stacked on a pallet. In this context, the ‘items’ are usually cartons rather than individual books. Unless the cartons are cubic, there is usually a prescribed stacking pattern so that successive tiers are laid differently.
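
As a small worked example, the number of books on a full pallet follows from the two figures described above. The function name and all the quantities are assumptions for illustration.

```python
def books_per_pallet(cartons_per_tier: int, tiers_high: int,
                     books_per_carton: int) -> int:
    """Total books on a full pallet: cartons per tier x tiers high x books per carton."""
    return cartons_per_tier * tiers_high * books_per_carton

# Hypothetical pallet: 8 cartons per tier, stacked 5 tiers high, 24 books per carton
cartons = 8 * 5
books = books_per_pallet(8, 5, 24)
```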


Tagged Image File Format. Typically, TIFF is a high-quality (lossless) file format for raster images. TIFF files are usually compressed using an LZW or Deflate compression scheme, but are still usually much larger than the same image stored as a JPEG. (In fact, TIFF can contain lossy image data that is compressed using JPEG, or uncompressed images, but this is rare).


Extra leaf glued into a book after binding, often used for separately-printed illustrations or plates, errata sheets, gatefolds and other pages that are not the same size, or sometimes for pages intended to be removed.

Title page

Recto page at the beginning of a book that bears the title of the book, and also the names of contributors and the publisher and city of publication. This is often the third page of the book block. The Title verso (the reverse of the title page, often also called the Imprint page, often the fourth page of the book block) usually contains copyright details and any CIP information and colophon. The Half-title page is the recto page before the title page – and thus usually the first page of the book block – that carries just the title of the book (without other details).

Title sheet

See AIS.

Title verso

See title page.


Trimmed leaf size, more ordinarily TPS (trimmed page size) or just trim size. Dimensions of a book page or leaf after binding and trimming. For a typical paperback where the cover is cut flush to the pages, the TLS is also the size of the cover. For hardbacks where the cover boards extend beyond the trimmed pages, the cover is usually around 6–8mm larger in each dimension.

Transport layer security, a cryptographic security protocol used to ensure privacy and integrity of communication between two computers, for example within HTTPS.


Table of Contents.


In logistics, a reusable plastic crate used to collect and convey items within a distribution center.


See trade paperback.


See TLS.


An early EDI standard used in the UK retail sector, an implementation of the UN/GTDI standard that pre-dates the very similar EDIFACT, and originally managed by GS1 UK. The standard was superseded by EDIFACT in the 1990s – at least in principle – but is still in common use in UK retail (including the UK book trade).


The book trade – the business and commerce of book publishing, manufacture, distribution and selling, in phrases like ‘trade-only’, ‘trade-pack’, ‘trade association’ etc.

A trade book – fiction and general interest non-fiction books primarily for adult consumers, sold through ordinary retail bookshops. See trade publishing, trade paperback.

Trade discount

See discount.


Trade packs are designed to be broken up so that the contents can be retailed singly, and so are not multiple-item retail products, though they are described in a similar way, using Product Part elements (refer to page 36 here).

See multi-component, multi-item.

Trade paperback

In the UK, a paperback produced in a size more typical of hardbacks (see demy, royal); in the US, a paperback that’s usually larger than rack-size mass-market paperbacks (but not as large as a typical hardcover book).

Trade publishing

Publishing of fiction and general interest non-fiction (‘trade’) books primarily for adult consumers, which are sold through ordinary retail bookshops. cf specialist sectors, including children’s publishing, academic and educational publishing, STM and legal publishing. Note that trade books are only a small part of the book trade.


Expressing the words of one language in an alphabet or writing system normally associated with a different language – so for example Russian language text can be transliterated into the Latin alphabet. cf translation, where Russian can be translated into the English language.

Trim size

See TLS.


Agreement on Trade-Related Aspects of Intellectual Property Rights, international agreement administered by the World Trade Organization that defines minimum standards for legal protection and enforcement of intellectual property rights (IPR).


See tab-separated value file.