ENTRIES TAGGED "html"
Newgen's Silk Evolve is a powerful automation platform
How many times have you opened an ebook and noticed awkward hyphenations or other conversion errors? I still see this in the majority of the ebooks I buy and it’s clear these are the result of someone not paying attention during the conversion process. They may be minor annoyances but they reflect poorly on the publishers who produce them.
I recently had a chance to talk about this problem with Patrick Martinent, the CTO at Newgen KnowledgeWorks. They have a terrific platform called Silk Evolve that helps automate the conversion from PDF to EPUB and reduce the errors it introduces. The following Q&A is a preview of what you can expect to hear in Patrick’s session at next month’s TOC NY conference.
WYSI editors enable a whole new level of interaction
Since HTML is the new paper, and the new path to paper, online editing environments are becoming much more important for publishing. Dominant until now has been the WYSIWYG editor we all know and…err…love? However, the current WYSIWYG paradigm has been inadequate for a long time, and we need to update and replace it. Producing text with a WYSIWYG editor feels like trying to write a letter while it’s still in the envelope. Let’s face it…these kinds of online text editors are not an extension of yourself; they are a cumbersome hindrance to getting a job done.
Why are we leaving such an important issue to under-resourced volunteers and small organisations?
Typesetting math in HTML was for a long time one of those ‘I can’t believe that hasn’t been solved by now!’ issues. It seemed a bit wrong – wasn’t the Internet more or less invented by math geeks? Did they give up using the web back in 1996 because it didn’t support math? (That would explain the aesthetic of many ‘home pages’ for math professors.)
The explosion in web typesetting has been largely unnoticed by everyone except the typography geeks. One of the first posts that raised my awareness of this phenomenon was From Print to Web: Creating Print-Quality Typography in the Browser by Joshua Gross.
A user experience plea for more consistency across platforms
Ebook publishing is full of problem areas, most of which cannot be addressed through standardisation and can only be resolved via a sea change in the behaviour and nature of the various participants in the ebook industry.
There are, however, several issues that could be addressed, at least partially, via standardisation, that would make everybody’s life easier if implemented.
One of the major issues facing publishers today is the spiralling complexity of dealing with vendor rendering overrides.
Each vendor applies different CSS overrides with differing behaviours, sometimes even only enabling features through server-side manipulation, which means that proper testing of an ebook is not only difficult, but impossible.
If vendors cannot be talked out of requiring these overrides then they need to be standardised and normalised. Any reading system that implements a CSS override is in violation of how the CSS standard defines the cascade and so is in violation of the EPUB 3 standard.
CSS overrides come in four broad types:
- Vendor styles only – The publisher’s styles are completely ignored in favour of the vendor’s.
- Aggressive vendor styles, but publisher styles enabled – Very little is seen of the publisher styles in this scenario. They mainly surface in edge cases that weren’t accounted for in the vendor’s stylesheet.
- Minimal overrides – The vendor only really enforces control over margins, backgrounds, and possibly font styles.
- Publisher styles – The mode that the reading app goes into when the reader deliberately selects ‘publisher styles’. Under ordinary circumstances this would simply disable the overrides but in most reading apps this mode has a unique behaviour.
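To make the override types above concrete, here is a minimal sketch of how an aggressive override might defeat a publisher rule. It assumes the reading system injects its own rules with `!important`; the selectors, values, and font name are hypothetical, and real vendors may use other mechanisms (including server-side rewriting of the files).

```css
/* Publisher stylesheet shipped inside the EPUB (hypothetical rules). */
p {
  margin: 0;
  text-indent: 1.5em;
  font-family: "Example Serif", serif;
}

/* Hypothetical vendor override injected by the reading system.
   Declarations marked !important beat the publisher's normal
   declarations, whichever stylesheet is loaded first. */
p {
  margin: 0 0 1em 0 !important;
  text-indent: 0 !important;
  font-family: inherit !important;
}
```

With both stylesheets applied, paragraphs render with the vendor's margins and no indent. The publisher's rules only show through for properties or elements the vendor stylesheet does not mention, which matches the edge-case behaviour described in the second bullet.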
Replacing the book production ecosystem with webpage production tools
Browser as typesetting machine
The change of the book’s basic carrier medium from paper to HTML (the stuff webpages are made of) has meant many changes to what we might still call typesetting. Kindle and other e-ink devices actually move ink on a display to form words, sentences and paragraphs. The moveable type of Gutenberg’s time has become realtime; in a very real sense, each book is typeset as we read it. Content is dynamically re-flowed for each device depending on display dimensions and individualised settings to aid readability. Moving type in ‘read time’ marks a significant paradigm shift from moveable type systems, including digital moveable type manipulated by Desktop Publishing software, to flowable typesetting. We are leaving behind moveable type for flowable type.
The engine for reflowing a page in realtime is something we have seen before. It is the job of the browser. And, since ebooks are webpages, browsers have come to play a central role in digital ereaders. In the case of the iPad, the iBooks reader is actually a fully featured browser engine: WebKit, the very same technology behind the Chrome and Safari browsers. Browsers are the typesetting machines for ebooks.
CSS is the set of rules used by the browser to know where to place type, images and other elements on a webpage and how to style those elements. Typical rules define where an image is placed in relation to text, which fonts are used, the font size, the background color of the page, the maximum width of an image, and so on. Although CSS was originally designed exclusively for webpages, the CSS Working Group, responsible for overseeing the development and direction of CSS, anticipated the intersection of the book and the web some time ago. In the latest drafts of the CSS standards, new additions are almost entirely focused on typography and page control. As a consequence this area is starting to blossom. In particular, the CSS Generated Content for Paged Media Module specification is astonishing for its reframing of flowable text into a fixed page. Cross-reference and footnote controls, not needed on the web, are among the many book-like structural controls being addressed by CSS. Table of contents creation, figure annotations, page references, page numbers, margin controls, page size, and more are all included. The definition of these rules precedes their adoption in browsers; however, they are being included in browser engines, notably WebKit, at a very fast pace.
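As a rough illustration of the page controls mentioned above, the following sketch uses constructs from the CSS Paged Media and Generated Content for Paged Media drafts. The class names and dimensions are hypothetical, the draft syntax has changed between revisions, and support varies widely between rendering engines, so treat it as a sketch rather than something that will work everywhere.

```css
/* Page size, margins, and a page number in the bottom margin. */
@page {
  size: 6in 9in;
  margin: 0.75in;

  @bottom-center {
    content: counter(page);
  }
}

/* Start each chapter title on a new page. */
h1 {
  page-break-before: always;
}

/* Move marked-up notes to the foot of the page (draft feature). */
.footnote {
  float: footnote;
}

/* Append the target's page number to a cross-reference link (draft feature). */
a.pageref::after {
  content: " (page " target-counter(attr(href), page) ")";
}
```

None of these rules has any visible effect in a continuous, scrolling web view; they only matter once the engine lays content out on discrete pages.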
Ease and efficiencies
The implications of this are enormous and possibly not yet fully realised. At publishing industry conferences and other book-focused forums, the attention has largely been on the ebook’s effect on distribution, ereaders and the demise of the so-called brick-and-mortar book stores. The biggest effects, however, are elsewhere, ‘bubbling under’ in the recasting of the browser as a typesetting engine, and with it the slow realisation that the technical ecosystem surrounding book production can be replaced by tools for producing webpages. We are beginning to turn our attention to the tools for making webpages in order to make books, and this, it turns out, is much easier than with Desktop Word Processing and Publishing software. Additionally, due to recent developments, all of this can also be used to design print (more on in-browser print production in a future post). Book production is once again becoming faster and cheaper, and is on its way to achieving another leap of magical efficiency.
How to mimic flowing text in a non-reflowable format
Q: In a traditional printed book, if a paragraph has not finished when the end of the page is reached, the entire paragraph will be justified. However the [CSS] command ‘text align last’ does not seem to be honoured in the last paragraph of the page in fixed layout for the iPad…What seems to happen is that in [InDesign CS6] it ‘looks’ justified but it doesn’t make it through to the epub version and there is a small gap at the end of the line. If you add text it goes on to a new line. I tried adding whitespace but that didn’t seem to be accepted…Is the problem with ibooks? Is there any workaround?
A: When you load a standard EPUB file into iBooks, the application automatically paginates the HTML content based on screen size and settings set by the user (font and font size). Content flows from page to page, and if a paragraph spans a page break, text alignment will be consistent on both pages.
Fixed-layout EPUBs differ from standard EPUBs in that it is the ebook designer who sets the pagination of the book, not the iBooks application. Each XHTML document in a fixed-layout EPUB file corresponds to a distinct page in the book, and no content is flowed from one page to the next.
If you want to mimic a text flow from page to page in a fixed-layout EPUB, you’ll need to split the text between two separate HTML documents. This poses a challenge if you want your text to be justified, because the text-align: justify CSS declaration does not stretch the final line of a paragraph to the full text-column width.
The good news is that CSS3 offers a solution to this very problem: the text-align-last property, which allows you to indicate how the final line of a text block is aligned. text-align-last: justify specifies that the final line should be fully justified, and span the full text column width.
The bad news about this good news is that text-align-last is not yet fully honored across all major Web browsers. It is supported in Mozilla-based browsers (Firefox), but is not supported in the WebKit engine, which powers Safari, Chrome, and, sadly, the iBooks ereader. Neither text-align-last, nor the WebKit-specific -webkit-text-align-last, nor the EPUB3-specific -epub-text-align-last will produce the desired effect in the iBooks reader.
But there is some more good news for the intrepid and patient: there’s a hack-y HTML/CSS workaround that can achieve the effect of text-align-last: justify in iBooks (your mileage may vary on other ereader platforms).
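For reference, the declarations discussed above would look roughly like this in a stylesheet. The class name is hypothetical, and per the answer above none of these declarations takes effect in iBooks, so this only helps in engines that honor the property.

```css
/* Hypothetical class for a paragraph that continues on the next page. */
p.continues {
  text-align: justify;

  /* Prefixed variants first, then the standard CSS3 property. */
  -webkit-text-align-last: justify;
  -epub-text-align-last: justify;
  text-align-last: justify;
}
```

Engines that do not recognize a declaration simply ignore it, so listing all three is harmless even where it has no effect.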
Tweak word spacing using CSS
The old-school (dating all the way back to CSS1) word-spacing property allows you to designate a specific amount of extra space to place between words, on top of the normal space. The following example uses word-spacing: 7px to specify that the last seven words on the page should have seven pixels of additional whitespace between them:
<p>Everywhere there are mysteries. And the most ancient man-made wonders of all are the stone monuments erected by our Neolithic and Early bronze Age ancestors between 4000 and 1500BC - or, if it is less difficult to visualize in this way, between 140 and 240 generations ago. Little England (and smaller Scotland and Wales) are rich in these megalithic structures. Archaeologists tell us that more than a thousand chambered tombs and some 700 stone circles have resisted the smoothing iron of wind and rain, the teeth of the plough, the <span style="word-spacing: 7px">grasping hands of wave upon wave of</span></p>
And here’s a screenshot illustrating how this text renders in iBooks.
The main benefit of this approach is that it gives you fine-grained control over the whitespace in a paragraph. The downside is that it can require a fair amount of trial and error to determine the proper word-spacing values to achieve the desired justification effect. If you do decide to use this method, and have a paid iTunes Connect ebooks account, I highly recommend using Apple’s Book Proofer tool, as it eliminates much of the hassle involved in syncing EPUB files between your computer and your iPad/iPhone/iPod.
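Because the right value usually takes a few rounds of trial and error, it may be easier to keep it in the stylesheet rather than in an inline style attribute. This is a minimal sketch of that variation; the class name is hypothetical, and 7px is simply the value from the example above, not a recommended default.

```css
/* Hypothetical class applied to the words on the final line of the page.
   Adjust the value per page until the line reaches the right margin.

   Usage in the XHTML page:
   <span class="stretch-last-line">grasping hands of wave upon wave of</span>
*/
span.stretch-last-line {
  word-spacing: 7px;
}
```

If different pages need different amounts of stretch, a separate class or rule per page keeps each adjustment in one place.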
Hugh McGuire on his new PressBooks publishing platform.
In this TOC podcast, PressBooks founder Hugh McGuire talks about the current state of, and future plans for, this new book production platform.
This short article outlines some ideas about an open source, online platform for making books, based on WordPress.
Taking a page from the Baen playbook, Tor.com, a division of Macmillan, is giving away 24 science fiction ebook titles through July 27. The ebooks are available in PDF, HTML and Mobi formats. (Via News.com)
Cory Doctorow: "Science Fiction is the Only Literature People Care Enough About to Steal on the Internet."
Free Ebooks with Embedded Ads Via Scribd-Lulu…