What is the “Language” composite in ONIX? What is it asking for?

The "Language of the text" is not a problem for most books, so let's deal straightaway with the common case:

If the language of the book's text and the language spoken by the intended audience are the same (the book's written in English for an English-speaking audience, or French for French, or Spanish for Spanish), then the "Language of the Text" is just English (eng), or just French (fre), or just Spanish (spa).

It's handled exactly the same in ONIX 2.1 or 3.0:

<Language>
    <LanguageRole>01</LanguageRole>  <!-- code for language of text -->
    <LanguageCode>eng</LanguageCode> <!-- ONIX Code List 74 from ISO 639-2/B -->
</Language>

That covers most books nicely – but before we move on you should realize how important that is to an on-line retailer. You're probably only submitting your data to "same language" retailers, but they will still be mixing your data with other companies' data and may be selecting from that combined data set by language.

Get this wrong and it can mean on-line listings offering Spanish books to English speakers. No one will be happy.

90% of publishers can stop here – it is obvious what language your book is written in. For those cases where there are other considerations (it's for language instruction, it's in dialect or written as a "country specific" version such as US vs UK English, or the audience just isn't that simple), you should go read the P.10 Language section in the ONIX for Books Implementation and Best Practice Guide.

It's short, clear and has examples – read it please – but here are some points of emphasis:

Mostly you should think of the "Language of Text" as a proxy for the language used by the audience. 

  • The "Language of Text" code is used to tell a retailer of books in that language that this is a product they can sell to their customers – it's the simple case above. If the book does include multiple languages then there MAY be reasons to list them all, but when in doubt make it easy for the retailer. There really aren't many genuinely multilingual titles, which I'm defining as books that can be SOLD by retailers to readers in each of the different languages. One way to tell that you have one is if you can create a full bibliographical record in all the languages listed as Language of Text. For example: if the book title is only in one of the languages, do you seriously think a retailer can sell it in another language?
  • EDItEUR's guide has an excellent explanation of when to use multiple Language of Text codes.

Language instruction books should be sold to the language audience who will use the book.

  • I know! Duh. But it's the same point again: the fact that the book might contain text in two languages doesn't change anything. Language of Text is still a proxy for the language used by the audience, and only one of the languages is the "Language of the Text."
    • The language being learned is a Subject of the book and should be fully listed in that section of the metadata. THEMA has an excellent system for handling most any language of instruction. Again, just think of the retailer selling the book – you want retailers selling in the audience's language to stock and list the title – in the right section: Language Instruction filed by that Subject language.
  • A special case is allowed for – and it really is a special case, so don't expect to use it unless the book matches this:
    • What if the book is in Russian – the language of the whole text is <LanguageCode>rus</LanguageCode> – but it's designed specifically to be used by English language speakers? So, it's a book in Russian but it's for English speakers. HA! Solve that, Mr. ONIX! You can't fudge the language of the text there.

    • There's a limit to how much Language of Text can be a proxy for audience, but this isn't just whimsy, it's a real use case, so Mother ONIX provides for it through the Audience Composite: in ONIX Code List 29, code "27" is for the Language of the Intended Audience. It allows you to name the audience language for a book written in a different one. The book should be in one language and specifically designed for readers in another – otherwise don't use this.
    • To be clear: in day-to-day life retailers will look to the Language Composite for the Language of Text to know what audience they should sell to. In this very specialized case a data sender can highlight the special audience consideration of this foreign language text to retailers in another language. But if you fit the normal, simple use case and "back up" your Language of Text value by duplicating it as the Language of the Intended Audience, all you do is cloud the data. You force retailers to clean their data by looking for matches between the values and removing your unnecessary one. They won't thank you for that, so don't do it – good granular metadata means letting the special case metadata do its work and not adding values because it looks like you can.
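
As a rough sketch of that special case, here is how the two composites might look in ONIX 3.0 for the hypothetical Russian-for-English-speakers book described above. The rest of the record is omitted, and the two composites are not adjacent in a real record; they are shown together only for illustration:

<Language>
    <LanguageRole>01</LanguageRole> <!-- language of text -->
    <LanguageCode>rus</LanguageCode> <!-- Russian -->
</Language>
<!-- ... other elements of the record come between these two composites ... -->
<Audience>
    <AudienceCodeType>27</AudienceCodeType> <!-- Code List 29: intended audience language -->
    <AudienceCodeValue>eng</AudienceCodeValue> <!-- a Code List 74 value: English -->
</Audience>

Note that the Audience value differs from the Language of Text value. If the two would be identical, the Audience entry shouldn't be there at all.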

Information about Translations

  • The Language Composite offers other Language Role codes for other purposes, including "02" for the Original language of a translated text, with the explanatory note "Where the text in the original language is NOT part of the current product." So you can supply the original language for a translation easily.
  • One use for more than one "Language of Text" code would be a book where a translation is on facing pages with the original language. There, both languages should be listed with code "01". Does this contradict the "retailer" advice above? Yes, so use discretion here (when in doubt make it easy for the retailer selling the book) – but facing translations have value in both languages and it's hard to cover everything in the metadata perfectly. I would recommend making sure the primary audience language (the language of the introduction, comments, etc.) is listed first. Many retailers pay attention to the order.
  • There is another place to list information on translations: the Contributor Composite. The block where you give the translator's information can also list the language that the translator worked from.
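
A minimal sketch of those two places, assuming an English translation of a German original in ONIX 3.0. The translator's name and sequence number are placeholders, and <FromLanguage> is the ONIX 3.0 element (check the specification if you are still sending ONIX 2.1):

<Contributor>
    <SequenceNumber>2</SequenceNumber> <!-- placeholder: the translator is the second contributor -->
    <ContributorRole>B06</ContributorRole> <!-- translated by -->
    <FromLanguage>ger</FromLanguage> <!-- the language the translator worked from -->
    <PersonName>Translator Name</PersonName> <!-- placeholder -->
</Contributor>
<!-- ... other elements of the record ... -->
<Language>
    <LanguageRole>01</LanguageRole> <!-- language of text -->
    <LanguageCode>eng</LanguageCode>
</Language>
<Language>
    <LanguageRole>02</LanguageRole> <!-- original language of a translated text -->
    <LanguageCode>ger</LanguageCode>
</Language>

The two values are recorded independently, so nothing forces the <FromLanguage> value to match the role "02" language.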

Trick question: If the translator to English worked from a Spanish translation of an original German book, how should you handle it? Publishers work hard to create scenarios that defeat all metadata systems – but you could handle it in ONIX by using the two alternatives above. Would you still do it the same way if there was a marketing reason that the reader should know the Spanish translation was chosen as the base? I really have no idea, but that's what Descriptions are for – that's where important marketing messages go. Talk to your trading partners, ask questions, but mostly do the obvious in a straightforward way.

Metadata Language vs. Language of Text

ONIX 3.0 allows for multi-language metadata. India is a good use case as there are 3 major languages and an audience who are multilingual but might be more comfortable in one language than another. In such an environment wouldn't it be great to provide the metadata in the language the reader is most comfortable in? ONIX 3.0 provides for it by allowing many composites to repeat if language attributes are provided. So rather than awkwardly repeating the title three times in one entry, you provide the Distinctive Title three times in separate composites, one set up for each language and identified by its attribute. A retailer could work with such information and provide display options that could help sell books.

BookNet Canada is unaware of any North American retailer ready to use such a facility, but there's no harm in having this information ready if you think it can help sell books. Before retailers implement, they test and experiment – and think kindly of publishers who can support their efforts.
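
The repeat-with-a-language-attribute pattern can be sketched with the descriptive text, where ONIX 3.0 lets <Text> repeat inside one <TextContent> composite as long as each repeat carries a language attribute. This is an illustration only: the English and Hindi pairing and the placeholder text are assumptions, and exactly how titles should be repeated is best checked against the specification for the ONIX release you're using.

<TextContent>
    <TextType>03</TextType> <!-- description -->
    <ContentAudience>00</ContentAudience> <!-- unrestricted -->
    <Text language="eng">English-language description goes here.</Text>
    <Text language="hin">Hindi-language description goes here.</Text>
</TextContent>

A retailer that can't use the extra language can simply pick the one it wants by the attribute value.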

Coding for a Multilingual Text

A multilingual text is a book with more than 2 languages and intended for readers of any of them. This is a separate issue from providing the metadata in more than one language, discussed in Metadata Language vs. Language of Text. While this would be illustrated in the Best Practice Guide, Graham Bell posted this excellent example of handling a multilingual text to the ONIX implementers group on 2022-03-15, offering both the right way and pragmatic advice:

Say the text actually originated in a third language, like Portuguese or whatever, and this third language is also in the book. That could look something like this in ONIX:

<Language>
    <LanguageRole>07</LanguageRole> <!-- translated language of a multilingual edition -->
    <LanguageCode>eng</LanguageCode> <!-- English -->
</Language>
<Language>
    <LanguageRole>07</LanguageRole> <!-- translated language of a multilingual edition -->
    <LanguageCode>cmn</LanguageCode> <!-- Mandarin -->
</Language>
<Language>
    <LanguageRole>06</LanguageRole> <!-- original language of a multilingual edition -->
    <LanguageCode>por</LanguageCode> <!-- Brazilian Portuguese -->
    <CountryCode>BR</CountryCode> <!-- note use of a country code or region code to qualify the language -->
</Language>

However, some recipients might not be able to handle these codes – although 06 and 07 have been a part of this codelist for 15 years. If you do run into problems, then you could probably do this:

<Language>
    <LanguageRole>01</LanguageRole> <!-- language of the text -->
    <LanguageCode>eng</LanguageCode> <!-- English -->
</Language>
<Language>
    <LanguageRole>01</LanguageRole> <!-- language of the text -->
    <LanguageCode>cmn</LanguageCode> <!-- Mandarin -->
</Language>
<Language>
    <LanguageRole>01</LanguageRole> <!-- language of the text -->
    <LanguageCode>por</LanguageCode> <!-- Brazilian Portuguese -->
    <CountryCode>BR</CountryCode>
</Language>

Clearly the first example is much better, as it indicates the relationship between the languages and highlights Portuguese as the ‘original’. The second just says ‘there are three languages’.


Summary

Don't rely on this page for all you need to know. You should go read the P.10 Language section in the ONIX for Books Implementation and Best Practice Guide. It's short, practical and comprehensive, providing language knowledge without the sophomoronic humour.