On e-Book Reading Systems

by Alberto Pettarin, Ph.D.

In this page I collect some notes about Reading Systems for e-Books, and their (current) lack of functions to properly support advanced usages of e-books.

The opinions contained in this page are personal. Feel free to disagree, to make them yours, even to exploit them in commercial products. In the latter case, I would like to be kept posted on your progress. Comments are welcome, especially if you point out mistakes or have useful suggestions; just let me know.
http://www.albertopettarin.it
http://twitter.com/@acutebit

On e-Book Reading Systems
table of contents
Definitions
My View on e-Books and Reading Systems
Please Give Us Intelligent RS
Some Experiments with Android
Extra Ideas
Interesting Links

bowerbird's comments

this is a remix, demonstrating a side-by-side display.

the original version of this file is located here:

http://www.albertopettarin.it/rs.html

note that the orange boxes on the right are contenteditable,
meaning that you can enter any notes or annotations you like.
you can click a button at the bottom to collect all your notes...

alberto, my notes are appended. jump down.

Definitions

I wrote this page with the least possible technical language, so that it can be read by real-world people, not only software developers and e-book producers. (Kidding, of course.) To avoid misunderstandings, let me define the precise meaning I will give to some frequently occurring words:

book: a self-contained, packaged cognitive unit, possibly composed of different resources, like text, illustrations, audio, etc. (used here mainly to refer to the content)

e-book: the digital instantiation of a book (used here mainly to refer to content+markup+metadata)

ecosystem: an integrated platform (like Amazon Kindle or Kobo) where a user can buy/store e-books, read them from multiple devices/apps, synchronize notes and bookmarks, etc.

p-book: the physical (usually, paper) instantiation of a book

Reading System (RS): software used to get a rendition (visual, aural, both, etc.) of an e-book. It might be a stand-alone PC application, an app, the reading software of an eReader, etc.

My View on e-Books and Reading Systems

Let me start with a bold statement: reading e-books today is quite a disappointing experience, compared with their (abstract) potential.

Do not get me wrong: if you simply want to read novels or comics in your spare time, you might find decently formatted e-books, and decent Reading Systems (RS) to enjoy them. (Even in this case, though, few RS allow the reader to apply really deep custom settings to the rendition of the e-book, and they almost all focus on typographical aspects, like changing the font, its size, line density, margins, justification, and so on.)

But when you start using books for more complex activities, like studying, learning a foreign language, consulting professional manuals, and the like, current RS prove to be very poor tools. Have you ever seen extremely different renditions of the same e-book in two different RS? Have you ever tried annotating an e-book, expecting your annotations to be embedded into the e-book and, at the same time, to be usable outside the RS/ecosystem where you created them? Have you ever gone through the tiring, frustrating experience of using footnotes, dictionaries, translations, or glossaries on today's RS? Have you tried referencing a text fragment of an e-book in another book (e-book or p-book)?

I think that the main reasons for all this mess are:

e-books are reaching critical mass (in terms of serious users) only now;

non-recreational e-books are a marginal fraction of the e-book market, but they have an enormous potential (just think about the educational segment);

these high commercial stakes make big publishers/vendors cultivate their own ecosystems, for commercial gain, instead of fostering open, interoperable standards and tools, sometimes even with the complicity of governments/regulatory agencies;

on the other hand, open initiatives from public institutions seem to be geared toward favoring content production/digitization, rather than improving standards and tools;

very few e-books offer a real advantage, beyond de-materialization, over their p-book equivalents (i.e., a lot of e-books tend to be just the digitization of a p-book, instead of being an augmentation of the corresponding content).

I do not have a magical recipe to improve the digital publishing ecosystem, especially in its financial dynamics: I leave this task to those who can pull the right triggers. But I can certainly contribute to the technical side of the discussion about e-books and RS. The only prediction I have the guts to make is that the future belongs to open formats and tools, and that building walled gardens full of amazing proprietary tools will not pay off long term. Hence, in what follows, I will speak about open formats, RS and tools only.

Please Give Us Intelligent RS

I think that one of the worst pitfalls in the industry is that current RS are way dumber than they should be, given the level of sophistication that formats like EPUB3 allow.

For example, no RS that I am aware of takes any advantage of the semantic vocabulary defined by EPUB3 (with the exception of footnotes in iBooks). But I do not want to enter the dangerous field of discussing a particular format. So, let's consider the generic theme of how links are handled in e-books: how many RS allow the user to split the viewport to display the target (footnote, other chapter of the same text, an external Web page) concurrently with, say, the text fragment referencing it? (AFAIK, only ASTRI Bee can split the viewport, but it does not manage links automatically.) Along this line, the list of examples can be made arbitrarily long.

The effect is that e-book creators tend to spend insane amounts of time embedding code (CSS, JavaScript) in their e-books to make up for these missing functions. I say insane because these attempts usually do not work well on all RS and, more importantly, they are the digital publishing equivalent of reinventing the wheel. Any programmer knows that she should avoid this as much as possible.

Besides support/standardization issues, I feel that e-books are not fully recognized as different cognitive objects from the corresponding p-books. For example, in principle, e-books allow the reader to actively query the data contained in the e-book, unlike p-books, where the reader is forced into a passive role regarding the book-to-brain transfer process, being unable to dynamically alter the content being displayed. Have you heard about any RS supporting XQuery lately? Of course, you haven't.
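To make the idea of querying an e-book a bit more concrete, here is a minimal Python sketch that lists every element carrying a given EPUB3 epub:type value in one XHTML chapter. The sample markup and the function name find_by_epub_type are illustrative assumptions, not part of any existing RS; a real RS would run something like this over all the content documents of a publication, and XQuery would allow far richer questions.

```python
import xml.etree.ElementTree as ET

# Namespace of the epub:type attribute in EPUB3 content documents.
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical chapter, used only for illustration.
SAMPLE_CHAPTER = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <p>Call me Ishmael.<a epub:type="noteref" href="#fn1">1</a></p>
    <aside epub:type="footnote" id="fn1">A biblical name.</aside>
    <aside epub:type="footnote" id="fn2">Another note.</aside>
  </body>
</html>
"""


def find_by_epub_type(xhtml, wanted):
    """Return (id, text) pairs for elements whose epub:type includes `wanted`."""
    root = ET.fromstring(xhtml)
    hits = []
    for elem in root.iter():
        # epub:type may hold several space-separated values.
        types = elem.get("{%s}type" % EPUB_NS, "").split()
        if wanted in types:
            text = "".join(elem.itertext()).strip()
            hits.append((elem.get("id"), text))
    return hits


footnotes = find_by_epub_type(SAMPLE_CHAPTER, "footnote")
```

With the semantic markup already in place, the query itself takes a handful of lines, which is the point: the e-book carries the information, and the RS only has to use it.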

To sum up, I think that making e-books should be as simple as using any semantic-aware language: the producer should be sure that the e-book will be rendered uniformly by different RS, and special functions should be provided by the RS, instead of requiring the e-book producer to code them from scratch every time! If someone wants to include her exotic JS stuff, that is fine with me, but, at least for common functions, there should be no need to code your own JS library and embed it into every single e-book. Finally, a good e-book should contain high-quality metadata and marked-up content, two things that are necessary to make complex usages possible.

Some Experiments with Android

So far, I have just spewed generic ideas (thanks bowerbird!), so let me go a little deeper, technically, with some examples.

Currently, I am supervising a team of three students at my (former) Department of Information Engineering of the University of Padova, Padova, Italy. They are coding a demo Android app for reading EPUB2/EPUB3 e-books, with a focus on complex books.

The goal of this project is to show that, by putting some intelligence in the RS, the reading experience can be greatly enhanced. We focus on managing the following features:

Internal links

External links

Interaction with the dictionary

Parallel text

A first focus of the project is allowing the user to efficiently explore linked resources, by splitting the viewport to give her simultaneous access to both linking and linked elements. For example, if you are reading an e-book full-screen and you click on a link, the RS splits the viewport into two panels and renders both the linking passage and the linked resource, which might be another part of the same e-book or an external Web page. Moreover, the same interaction can be applied to consulting a dictionary or a glossary.
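As a rough illustration of the first decision the RS has to take when a link is clicked, the following Python sketch classifies a link target and describes what the two panels would show. It is not the code of the demo app: the function names classify_link and panels_for, the returned labels, and the literal comparison of file paths are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def classify_link(href, current_doc):
    """Classify a link target relative to the document being read.

    `current_doc` is the path of the current XHTML file inside the e-book.
    Returns "external", "same-document", or "other-document".
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        # Absolute URL (http, https, mailto, ...): not inside the e-book.
        return "external"
    if parts.path in ("", current_doc):
        # Only a fragment, or an explicit reference to the current file.
        # Assumption: paths are compared literally, without normalization.
        return "same-document"
    return "other-document"


def panels_for(href, current_doc):
    """Describe what the two half-screen panels should show after a click."""
    return {
        "linking": current_doc,  # panel that keeps the passage with the link
        "linked": href,          # panel that renders the link target
        "kind": classify_link(href, current_doc),
    }
```

For instance, panels_for("#fn1", "ch01.xhtml") keeps ch01.xhtml in one panel and shows the footnote target in the other, while an http link would be handed to a Web view in the second panel.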

A second focus of the project is about allowing a natural reading of parallel texts, for example, a book in its original language (say, EN) along with its translation in another language (say, IT). It should not be that difficult to achieve with e-books, right? Wrong. So far, the only ways of doing it are:

display text in EN, then display text in IT and add links to go back and forth between corresponding paragraphs (very tedious, not natural)

interleave text in EN and in IT or use a table structure (bad from a visual point of view)

use JS to make a popup with the translation appear (works only in few EPUB3 RS)

use an FXL (fixed-layout) e-book (works only in few EPUB3 RS)

Plus, the last two solutions require a lot of coding just to make the rendition work.

Our approach to the problem is different: first of all, the e-book should only contain the textual materials, marked up like any other EPUB book (so it is still compatible with other RS). But then, with the same split-the-viewport technique, we will offer the user the choice of whether she wants to read only the EN text, only the IT text, or both, simultaneously, in two half-screen panels. To keep things simple, we assume that the XHTML pages inside the EPUB container are named according to a naming convention that lets our app recognize the correspondence between the original text and the translation (say, p001.en.xhtml and p001.it.xhtml), but also allows for shared, non-translated stuff (cover, introduction, etc.). The same mechanism might be achieved by other means (e.g., multi-OPF, etc.), but this is not the focus of the project.
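The naming convention can be sketched in a few lines of Python. This is only an illustration, not the app's code: it assumes file names of the form stem.xx.xhtml with a two-letter language code, and the function name pair_parallel_pages is hypothetical.

```python
import re

# Assumed convention: <stem>.<two-letter language code>.xhtml
LOCALIZED = re.compile(r"^(?P<stem>.+)\.(?P<lang>[a-z]{2})\.xhtml$")


def pair_parallel_pages(filenames, original="en", translation="it"):
    """Split file names into (original, translation) pairs and shared files."""
    by_stem = {}
    shared = []
    for name in filenames:
        match = LOCALIZED.match(name)
        if match and match.group("lang") in (original, translation):
            by_stem.setdefault(match.group("stem"), {})[match.group("lang")] = name
        else:
            # Not part of the chosen language pair: cover, introduction,
            # and other shared, non-translated material.
            shared.append(name)
    # A missing side of a pair is reported as None.
    pairs = [
        (langs.get(original), langs.get(translation))
        for _, langs in sorted(by_stem.items())
    ]
    return pairs, shared


pairs, shared = pair_parallel_pages(
    ["cover.xhtml", "p001.en.xhtml", "p001.it.xhtml", "p002.en.xhtml", "p002.it.xhtml"]
)
```

Here pairs holds the EN/IT couples to be shown side by side, and shared holds the files to be rendered once, whatever language the user picks.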

We will release this demo app by summer 2013, under a free software license. If the project receives enough interest (and, possibly, funding) we will try to develop it further, including regular functions (typography management, bookshelf, etc.) and some other, more interesting features, including some of those listed below.

Extra Ideas

In what follows, I list, in no particular order, some ideas for functions that current RS lack, partially or entirely, and that I would like to see in RS.

Export annotations, bookmarks, highlights to an exchange format (say, XML or CSV), and their remote backup/synchronization

Bundle annotations, bookmarks, highlights into the e-book container, so that they can be stored together

If an Audio-e-book contains an M3U playlist, let the user choose between an MP3-like player and normal (text+audio) rendition

Media Overlay in reflowable e-books, providing good playback controls (volume, speed, delay, highlight style) and tap-to-play mechanism

Media Overlay with multi text fragment association

Ability to swap text/audio/video language association in Media Overlays

Configurable multi-layout multi-panel views with auto-tiling

If in single-viewport mode, after coming back from an internal link, highlight the original anchor point (like the red dot in Marvin)

Support book status (e.g., via local storage), which is greatly needed by interactive fiction and game-e-books

Automatic citation recognition (e.g., via DOI)

Automatic lexicon generation from user dictionary usage

Automatic generation of factual context by relating e-book content to (Web?) resources

Accessing online databases to get the right pronunciation of foreign words and names (with automatic entity resolution)

Resolution of external entities

Self-updating function: an e-book can pull updates from the publisher/store

Typo-reporting function

Custom, user-defined CSS, with clear control over its place in the cascade

Support for custom, user-generated dictionaries (e.g., StarDict)

Anonymous reading statistics

RegEx search, support for XQuery-like interrogation

Clearly this list is not exhaustive; feel free to email me your own suggestions.
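As an example of how little is needed for the first idea in the list (exporting annotations to an exchange format), here is a minimal Python sketch of a CSV round trip. The record fields (document, position, kind, quote, note) and the function names are assumptions chosen for illustration; no existing RS or standard is implied.

```python
import csv
import io

# Hypothetical annotation fields; a real exchange format would need agreement.
FIELDS = ["document", "position", "kind", "quote", "note"]


def export_annotations_csv(annotations):
    """Serialize annotation dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in annotations:
        writer.writerow(item)
    return buffer.getvalue()


def import_annotations_csv(text):
    """Read the CSV text back into a list of dicts (all values are strings)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


sample = [
    {"document": "p001.en.xhtml", "position": "120", "kind": "highlight",
     "quote": "Call me Ishmael.", "note": ""},
    {"document": "p001.en.xhtml", "position": "348", "kind": "note",
     "quote": "whale", "note": "check the glossary, then re-read"},
]
exported = export_annotations_csv(sample)
```

Because the exported text is plain CSV, the same notes can be backed up, synchronized, or opened in a spreadsheet, outside the RS where they were created.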

Interesting Links

Cinque cose che vorrei -- da editore -- per progettare e-book: Fabrizio Venerandi's blog post (in Italian) on five missing features of e-books:
http://salvoesaurimentoscorte.wordpress.com/2012/10/05/cinque-cose-che-vorrei-da-editore-per-progettare-ebook/

Department of Information Engineering, University of Padova, Padova, Italy:
http://www.dei.unipd.it

Smuuks: my own company, mainly producing EPUB3 Audio-e-books:
http://www.smuuks.it

il Narratore audiolibri: audiobooks and EPUB3 Audio-e-books:
http://www.ilnarratore.com

Readium: EPUB3 reader by the IDPF:
http://www.readium.org

Marvin: EPUB2 RS with smart functions:
http://www.marvinapp.com

Blio: promising EPUB2/EPUB3 RS (now stalled?)
http://www.blio.com

Digital Education Content 0.101: post by Richard Pipe on e-books and education:
http://www.infogridpacific.com/blog/igp-blog-20130501-digital-education-content-101.html

e0: rethinking the (EPUB) e-book format:
http://epubzero.blogspot.it

bowerbird's comments

alberto-

here are my notes, just quick reactions...

first, just so you know my "definitions"...
an e-book consists of the files in a folder.
so adding content to the book is as easy as
adding the appropriate files to the folder.
the "master" text-file in a folder is the one
which has the same name as the folder itself.
the text-files are composed in z.m.l. format.

sometimes i've just summarized your notes,
and other times i've reacted against them.

problem areas:
 * consistency in rendering
 * studying
 * learning a foreign language
 * consulting professional manuals
 * annotations, creating and remixing
 * footnotes
 * dictionaries
 * translations
 * glossaries
 * referencing a text fragment of an e-book in another book

the main reasons for our current mess are:
 * the corporate publishing trying to protect their legacy business model
 * the new titans trying to create and lock in their future business model
 * they're salivating for the "captive market" of the educational segment

 * there's nothing wrong with being "just a de-materialization" of a p-book
 * but why not take advantage of some of the opportunities e-books create?

i _do_ have _several_ magic recipes to fix the digital publishing ecosystem.

the future belongs to formats and tools which are open -- _truly_ open and
not just f.u.d. trojan horses -- and controlled by people, not by corporations.

nobody needs to "give us" anything. we need to create it ourselves.

authoring-tools are certainly not less important than reading-systems.

formats don't give any "sophistication" at all; _working_apps_rule._
features like "splitting the viewport" need to be instantiated in apps.

split-view:
 * links
 * footnote
 * other chapter of the same text
 * an external web page
 * parallel text (translations, annotations, etc.)

 * table of contents
 * results of search
 * associated information

e-book software should contain all effects commonly needed by books,
even ones needed by only a minority of books.  uncommon ones too...

we don't need to create multiple ecosystems.  we don't even _want_ to.
we want _one_ system, open and reliable and dependable for everyone.

making e-books should be so easy that elementary-school kids can do it.

metadata should be embedded _in_ a book, in which case it is not "meta".

"markup" is a construct that should be relegated to the dust-bin of history.
it is possible, and desirable, that the structure of a book be made obvious,
such that any half-way intelligent app (or person) can determine it easily.

coding a demo android app for epub2/epub3, with focus on complex books.

the goal of this project is to show that, 
by putting some intelligence in the app,
reading experience is greatly enhanced.

none of these things are difficult, not in the slightest:
 * links, internal and external
 * dictionary, glossary, etc.
 * parallel text, annotations, etc.

annotations should naturally be in a format that can be remixed, moved, etc.

the user should be in full control of all play-back and rendering mechanisms.

the user should be able to remix -- add, delete, edit -- any book at any time.

users can come up with more good ideas than one developer could imagine.

a creative developer will have a clever idea users would've never imagined.

e-books should be always-available in the cloud, and also auto-downloaded
to the user's machine, via the philosophy that lots of copies keeps stuff safe.

local storage (in the browser) is too insecure and flaky for us to _depend_ on,
although it can surely be used in a supplemental capacity as a back-up means.

anything that _can_ be automated in e-book creation _should_ be automated.
it is stupid for multiple people across the world to be typing in the same stuff.

we can always pull in supplemental material from the web to help the reader,
and the range of the additional material will be as wide as the world wide web.

annotations should be easily shared with the author, a community, or the world.
this includes everything from typo-reporting to commenting to who-knows-what.

normal people shouldn't have to learn c.s.s. just to make their books look good,
where "good" is defined by the specific individual who's actually reading the book.

let people read in private -- but offer to collect/share any data that _they_ desire.

people shouldn't have to learn reg-ex in order to do the kind of searches they want.
