Imprint and publisher name: Opportunities in ONIX

This is the third part of a series. The first part can be found at https://booknetcanada.atlassian.net/wiki/spaces/UserDocs/pages/1259733139/How+do+you+best+represent+the+names+of+your+publisher+and+imprint+in+your+metadata. The second part can be found at https://booknetcanada.atlassian.net/wiki/spaces/UserDocs/pages/1259962551/Imprint+and+publisher+name+some+case+studies.

We agree that what's on the book and what's in the metadata should generally agree. Next, let's allow for the special case where a publisher rebrands their company or standardizes their name. Then consistency in naming means all their metadata should match the new name even though what's printed on the backlist edition might be different. We're adults — we can deal with that scenario even if we agree that matching the book is best.

Opportunities in ONIX

ISNI

ISNI, the International Standard Name Identifier, can be used as an identifier for corporate entities as well as individuals. Any content producer that can carry a name (including personas where an individual might want to be identified in more than one way) can carry an ISNI identifier. It just has to be useful enough to find user support. Publishers can use GLN (GS1's Global Location Number) or SAN (Bowker's Standard Address Number) as identifiers, but since those are location-specific, they're only powerful identifiers until you have to move. Distributors and warehouses are better served by them, and neither GLNs nor SANs are intended to track name variations the way an ISNI entry can. There's a lot to be said for an identifier whose only concern is tracking the content production of a named person or corporate entity. And an ISNI entry might contain a whole history of corporate naming variations. We find that if you mention that ISNI can be used as a naming identifier for publishers, many businesses can see immediate value in it.

Could an imprint have an ISNI? There are probably some cases where that's appropriate, but if you ask yourself "Who would use this and why?" I think you'd find it harder to answer. There should be a need to track and make associations on a name with content production and if the "brand" carries that association then, yes, it makes sense, but I'd tend to think that the legal entity's a better fit.

ONIX code list 44, which is associated with Contributor, Imprint, and Publisher (among others), supports ISNI codes. Absolutely use ISNI if it's available for your authors, but think about applying for one for your own company, too.

We understand that soon ISNI will be releasing guidelines for use by publishers, so keep an eye out for those.

Solving other problems

We have all this other stuff — groupings, hierarchies, and information that doesn't appear on the book — information that may need to be traded. If the first principle of metadata is: "What appears on the book should be matched in the metadata," you have to exclude using text fields like Publisher and Imprint Name to carry that "stuff."

I'm assuming that what needs to be communicated doesn't already exist as a coded option in ONIX and I will show how to solve it below (but if it is already a coded value in ONIX, the answer is largely the same anyway).

Let's imagine a distributor with a contract to distribute another distributor's list in their market. It happens a lot and if the local supply chain already knows the list, all the new distributor has to say is "We have so-and-so's books now!" and everybody's happy because the original distributor was probably pretty crappy about foreign markets.

Sweet, business as usual. But it has nothing to do with the books themselves. Both ONIX 2.1 and 3.0 can use proprietary coding options to represent the new groupings. Both Imprint and Publisher composites support this option using ONIX code list 44. As well, code list 92, supporting the Supplier Identifier composite, carries it. Any company can declare a proprietary code, identify its purpose using IDTypeName (description), and provide a code (IDValue) that represents a grouping they deem useful to the supply chain.
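
To make that concrete, here's a minimal sketch in Python of a sender building an Imprint composite with a proprietary code. It assumes ONIX 3.0 reference names (ImprintIdentifier, ImprintIDType, IDTypeName, IDValue, ImprintName) and leaves out the ONIX namespace for brevity. The function name build_imprint, the imprint name, the description, and the account code are all made up for illustration.

```python
import xml.etree.ElementTree as ET

PROPRIETARY = "01"  # ONIX code list 44: proprietary scheme


def build_imprint(imprint_name, id_type_name, id_value):
    """Return an ONIX 3.0 style Imprint element with one proprietary identifier."""
    imprint = ET.Element("Imprint")
    identifier = ET.SubElement(imprint, "ImprintIdentifier")
    ET.SubElement(identifier, "ImprintIDType").text = PROPRIETARY
    # The description is what tells a receiver what this proprietary value means.
    ET.SubElement(identifier, "IDTypeName").text = id_type_name
    ET.SubElement(identifier, "IDValue").text = id_value
    # The name itself stays what's printed on the book.
    ET.SubElement(imprint, "ImprintName").text = imprint_name
    return imprint


# Hypothetical values, for illustration only.
imprint = build_imprint(
    "Example Imprint",
    "ExampleDistributor source account",
    "ACCT-001",
)
xml_text = ET.tostring(imprint, encoding="unicode")
print(xml_text)
```

The grouping travels in the identifier, so the Imprint Name can keep matching the book.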

A Canadian distributor did most of that when they took on an account representing almost 30,000 ISBNs with hundreds of individual publishers. They provided accurate Publisher and Imprint Names (avoiding the "distributor choices" complication referenced in part two of this series) and backed it up with a proprietary code in the Imprint composite, providing a unifying code grouping their source account. Retailers who care about these books know that this account can continue to group these books. This is an incredibly useful backend value. Success!!

And yet...

Complicating factors in ONIX use

I'm going to get technical now, so get out yer ONIX specs! Any ONIX composite that supports a proprietary code option also provides a "description" field to go with it. I'll use the typical element names for identifiers associated with publishing names (supplier identifier uses similar names):

- NameCodeType is the ONIX code value, in this case from ONIX code list 44.
- IDTypeName is a free-text description that defines the NameCodeValue when a proprietary code is provided as the ONIX code.
- NameCodeValue is the value: the code provided by the sender.

The actual name (imprint, publisher, or supplier) is supplied separately from the "code".
- ONIX 3.0 is simpler: It provides a repeatable identifier composite within all three options so any name can have easily associated codes, both proprietary and standard identifiers like ISNI. It can also support composites where only an identifier without a name is provided.
- In ONIX 2.1:
  - Publisher and Imprint are repeatable composites that can support either a name, an identifier, or both. These two composites don't contain a repeatable identifier sub-composite, so it's hard to support two identifiers that both have a one-to-one relationship to a name, but the standalone composite with just an identifier makes it perfectly possible to support multiple identifiers.
  - Supply Detail is identical in 2.1 and 3.0 as both support a repeatable identifier composite. And you can support your Supply Detail with an identifier without a Supplier Name if you like.

As a general statement for most composites in the ONIX standard, only the code and value are required. Nothing else is needed because one explicitly supports the other: The code defines the value provided. Or if you look at it from the other end: An end user loads the value by programming to select a specific code. The description doesn't play a part but there's no harm in supplying it if it helps make readable ONIX.

The exception comes when a proprietary code is used. If the sender chooses a proprietary code, how can an end user "know" what it is in order to select it accurately? What if two composites exist and both are labeled as proprietary? If a proprietary code is provided then the value is provided for a purpose chosen by the sender. They're the only ones who can tell you why it's there and that's why a description, the IDTypeName element, exists. It allows the sender to provide a secondary definition that defines what the value they provide means. They need to do that because the ONIX code list 44 "01" Proprietary code can't.

Good ONIX programming requires any composite to be treated as a unit that is understood by first selecting the ONIX code to identify the composite as containing the data you need, and then loading the value within the composite.

Good ONIX programming for proprietary codes requires three steps: recognizing that the composite carries proprietary information (which is all that the code can tell you), selecting that specific composite based on the information in IDTypeName, and then loading the value that composite provides.

In any fully implemented ONIX exchange, the sender provides the means to select the right information, and the receiver loads a value only after selecting it by its code.
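
As a rough sketch of that receiver-side logic, the Python below selects an identifier by its code and, for a proprietary code, by its IDTypeName as well. It assumes ONIX 3.0 reference names without the namespace. The sample composite, the placeholder ISNI of sixteen zeros, and the function name select_id_value are all invented for illustration.

```python
import xml.etree.ElementTree as ET

PROPRIETARY = "01"  # ONIX code list 44: proprietary
ISNI = "16"         # ONIX code list 44: ISNI

# Made-up composite: one standard identifier and one proprietary identifier.
SAMPLE = """
<Publisher>
  <PublishingRole>01</PublishingRole>
  <PublisherIdentifier>
    <PublisherIDType>16</PublisherIDType>
    <IDValue>0000000000000000</IDValue>
  </PublisherIdentifier>
  <PublisherIdentifier>
    <PublisherIDType>01</PublisherIDType>
    <IDTypeName>ExampleDistributor source account</IDTypeName>
    <IDValue>ACCT-001</IDValue>
  </PublisherIdentifier>
  <PublisherName>Example Press</PublisherName>
</Publisher>
"""


def select_id_value(publisher, id_type, type_name=None):
    """Return the IDValue of the first identifier matching the code.

    For a proprietary code the description must match too, because the
    code alone can't say what the value means. Returns None if nothing fits.
    """
    for ident in publisher.findall("PublisherIdentifier"):
        if ident.findtext("PublisherIDType") != id_type:
            continue
        if id_type == PROPRIETARY and ident.findtext("IDTypeName") != type_name:
            continue
        return ident.findtext("IDValue")
    return None


publisher = ET.fromstring(SAMPLE)
isni = select_id_value(publisher, ISNI)
account = select_id_value(publisher, PROPRIETARY, "ExampleDistributor source account")
unknown = select_id_value(publisher, PROPRIETARY, "Some other scheme")
print(isni, account, unknown)
```

Note what happens when a proprietary composite arrives with no IDTypeName: the only way to select it is to ask for "proprietary, no description," which works only as long as there's a single one.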

Really? Is that the way it really works between trading partners?

Proprietary code use

BookNet may not have eyes looking into your soul to know for sure (only Facebook and Google have those), but BiblioShare data allows us to know how many data senders provide proprietary codes without a description. The following counts show that use of the imprint proprietary code is very high (two-thirds of all records; after all, it's a standard Amazon request) and use of the publisher proprietary code is also high (approaching half of all records; after all, it's an alternate Amazon request), but only a couple of distributors use a proprietary code within Supplier Identifier. For each composite, we've broken out the percentage of records that carry a single composite and the percentage where the composite is supplied without a description.

| Composites with proprietary code | Total count, January 2019 (out of 3 million ONIX 2.1 records) | % single composite | % missing description |
| --- | --- | --- | --- |
| Imprint | 2,207,625 | 100% | 61% |
| Publisher | 1,358,548 | 96% | 63% |
| Supplier Identifier | 10,713 | 100% | 0% |

Over 60% without a description! (We'll get back to the 4% of records with multiple Publisher composites in a minute.) So while it's impossible to know an end user's programming choices, this implies that no one's asking data senders for a description, because it would be trivial to supply one. It also implies that senders aren't telling anyone why they supply this value. That's okay because the standard Amazon request is well known and it's for a code to identify imprint (or publisher) name, so the description is arguably not needed.

Except that our example is a distributor using it for a different purpose. (And if they chose to also support Amazon's request, they'd have to repeat a proprietary code.) They only support one code and maybe that can be allowed since it's useful. Who knows? Amazon may have asked! And BookNet now has programming that's using it, so, like everyone else, we're invested in them not changing it!

That 4% of records with multiple Publisher composites with proprietary codes confirms most (but not all) are supplied as a unique composite. But it could also mean that a data sender is sending a proprietary code for more than one reason. Honestly, my best guess is that they provided multiple composites for multiple publishers and a detailed analysis might show that. But, when it really comes down to it, 4% isn't too meaningful for our purposes.
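
For the curious, counts like the ones above could be produced along these lines. This is only a sketch of one plausible way to count, not BookNet's actual method: each record is reduced to a list of (code, description, value) tuples, the function name proprietary_stats is hypothetical, and the sample data is made up.

```python
PROPRIETARY = "01"  # ONIX code list 44: proprietary


def proprietary_stats(records):
    """Summarize proprietary-coded composites across records.

    Each record is a list of (code, description, value) tuples, a simplified
    stand-in for the identifier composites found in one ONIX record.
    """
    with_code = 0     # records carrying at least one proprietary composite
    single = 0        # of those, records with exactly one such composite
    missing_desc = 0  # of those, records where a description is absent
    for composites in records:
        proprietary = [c for c in composites if c[0] == PROPRIETARY]
        if not proprietary:
            continue
        with_code += 1
        if len(proprietary) == 1:
            single += 1
        if any(not description for _, description, _ in proprietary):
            missing_desc += 1

    def pct(count):
        return round(100 * count / with_code) if with_code else 0

    return {
        "records": with_code,
        "pct_single": pct(single),
        "pct_missing_description": pct(missing_desc),
    }


# Made-up sample data, not BiblioShare figures.
sample = [
    [("01", None, "IMP-1")],
    [("01", "Sender imprint code", "IMP-2")],
    [("01", "", "IMP-3"), ("01", "Source account", "ACCT-9")],
    [("16", None, "0000000000000000")],
]
stats = proprietary_stats(sample)
print(stats)
```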

Here's the problem

Data practices change, data develops. The media file composite used to only support covers with 4% duplication. Now 18% of records carry additional choices (more than one in many cases).

And let me put it this way: If you wanted to use proprietary codes to carry new information, then you'd need to expect to use the description field. And you need to believe that recipients are prepared. Part two of this series looked at a number of use cases where there's background detail that might be useful to know about Publisher or Imprint Names. Further, if publishers start using ISNI for their own names, you should expect to see multiple identifier composites. The 60% of entries that supply proprietary data without full support says very loudly that no one is prepared.

ONIX 3.0 emphasizes use of code lists specifically because:

It minimizes the cost of implementing new data.

If you're already using a composite (properly), then accepting a new piece of data using a different code shouldn't cost much extra. You copy the current programming, change it to use a different ONIX code and map that composite's value to your appropriate data point. And if you don't need the new data, you don't need to do anything as your current programming has used the right code to get the value you need. The only other extra expense would be if a sender chose to stop supporting the value you expect and provided some kind of alternative. ONIX code lists are designed to support unique values, so that should be hard to do, but there's ample evidence of senders who need to supply more data points than any single end user needs. The fact that a given data recipient doesn't absorb ISNI should be irrelevant to whether or not your feed includes it.

Repeating composites are there to save you time and money. Improper use of composites is a future cost. It leads to a data receiver expecting a specific value to be delivered in a specific composite because they don't actually use the ONIX codes and have only programmed to pull the value from a composite. Improper use means any change becomes an expense because a receiver's data programming has to be rewritten to add a code selection so they can choose the right composite with the correct value.

Or the receiver asks for the file they receive to be modified to match their expectations.

Data senders complain they have to provide ONIX metadata in multiple unique files.

From which we infer that data users pass their development costs back to the data senders if they can force them to support a special file.
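
Here's a small sketch in Python of the difference between pulling a value by position and selecting it by code. Identifier composites are reduced to (code, value) pairs, the function names are hypothetical, and the values are placeholders; 06 and 16 are the code list 44 values for GLN and ISNI.

```python
GLN = "06"   # ONIX code list 44: GLN
ISNI = "16"  # ONIX code list 44: ISNI


def first_value(identifiers):
    """Fragile: assumes the wanted value is always in the first composite."""
    return identifiers[0][1] if identifiers else None


def value_for_code(identifiers, wanted_code):
    """Robust: checks the code before loading the value."""
    for code, value in identifiers:
        if code == wanted_code:
            return value
    return None


# Placeholder values, for illustration only.
old_feed = [(GLN, "0000000000000")]
new_feed = [(ISNI, "0000000000000000"), (GLN, "0000000000000")]  # sender adds ISNI

print(first_value(old_feed), first_value(new_feed))
print(value_for_code(old_feed, GLN), value_for_code(new_feed, GLN))
```

Once the sender adds an ISNI, the positional reader quietly starts loading the ISNI into the slot meant for the GLN, while the code-based reader keeps returning the same GLN without any changes.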

The circle of life

You can see the pattern: Part one of this series started from need and use causing the re-purposing of a commonly supplied value. Your database only supports one slot, so if someone wants an alternate you stick the data there even if it doesn't represent the same thing. And at the other end of the supply chain, recipients of your metadata base their programming on the data as it was, hoping it'll never change. Part two was a list of reasonable complexity — some of which might need to be communicated, some not.

Therefore, if you want to save money, start by fully implementing, not the whole ONIX standard, but any part of the standard that you use. And you should be prepared to both add support and tweak your programming for improvements.

ONIX code lists are a lingua franca, a shared resource. The definitions and the notes that come with them are clear and easy to use. Having determined the right code, use it. If you need to support two dates, repeat the composite and supply two dates (at least in 3.0; as always, 2.1 is a little more complex date-wise). Need to support two markets? Repeat the Product Supply composite to support two markets. Need to support two publishers (a single 01 Publisher and an 02 Co-publisher, of course)? Repeat the composite. Need an identifier? Supply it. Need a second? In ONIX 3.0, just repeat the identifier composite, though ONIX 2.1 can be a little trickier.

Data senders minimize future costs by always supplying the means to select the right composite by matching value to the right code. You should expect other companies to ask what a proprietary code means — or you should tell them if they'll find it useful — but be aware that conventions develop (like using a proprietary code to represent Imprint Name).

Data receivers minimize costs by always checking a composite's code to confirm that it contains the value they want. By doing that, they can ignore the other options, which gives senders the flexibility to supply other data points to other clients.

Everyone should expect change and work with the standards community to identify priorities and set best practices.

This originally appeared on the BNC Blog at https://www.booknetcanada.ca/blog/2019/1/17/imprint-and-publisher-name-opportunities-in-onix. Subscribe to the blog RSS at https://www.booknetcanada.ca/blog?format=rss.