Friday, November 06, 2009

On Publishers

Over the years I've been reading and reviewing books, I've come to appreciate just how much a good publisher can do for a book.  Having the right author and the right material is essential, but a good publisher can make things really work.  

There are lots of good publishers in the field of computing.  Notable among these are Morgan Kaufmann, Addison-Wesley and O'Reilly.  These publishers seem to me to produce consistently good books.  Not that others do not put out good books, but one of the things you notice is that some publishers are just a little bit less reliably good.  And a few are even reliably poor.

How do these differences show up?  The good publishers ensure that the material is fundamentally good: interesting, timely and on point.  But even good material can be poorly presented and one of the jobs of a good publisher is to make sure that the material is presented well - this involves everything from checking the quality of the writing to good proofreading to making the typography and layout look good and to a good index.  Even the good publishers sometimes goof a bit here, and mistakes are ok as long as the whole book works.  A good publisher makes a good author look very good indeed, and can make even a middling author look good.

Other publishers (and I'm trying carefully to name no names) don't seem to care.  At the worst there are a couple that seem interested in pushing almost anything with pages and a cover out their door. Sometimes the books are just silly, vanity press offerings but dressed up as something more (perhaps to get the author(s) publication credits). Sometimes the underlying material is good, but the publisher let the author down - not providing adequate proofreading, not working to ensure that the book is consistently laid out, not getting adequate technical review.

And then there are the money-grubbers.  There are a couple of bottom-feeder publishing houses who go that one step further down.  I suspect they have academic and other large libraries who just gobble up whatever they print (I can't imagine why).  These serve everyone poorly - the libraries use up their budget in buying junk, the authors look like idiots and the readers waste their time.  Sure, one book in every dozen might be halfway okay, but the others are worse than a waste of money.  I've read and/or reviewed more than a few of these.  In one of the worst, the authors started with material that was (at best) thin, wrote poorly (and clearly the publisher invested not one cent in proofreading), and wrapped it up in a $100 plus package that left me wishing I'd never even seen the book, and I can say that of only a very small number of the books I've reviewed.  I'll sometimes give books away on BookMooch - but I'd feel sorry for the person who got this one, even free.  A responsible publisher would have either just left this book to rot, or would have found a way to make the material more informative and interesting.  I fault the authors for allowing the book to go to press, but I fault the publishers all the more.

Monday, November 02, 2009

Reviewing Books

I have been writing reviews for Computing Reviews for a while now and it is an interesting thing to do.   I tend to focus more on books than articles, in large part because I like books, but also because I have found that as I pursue my journey to curmudgeon, it's not 90% of articles that are crap, but more like 99.9%, due in large part to the fact that making tenure in a university requires publication of quantities of the aforementioned substance.

Not that even 10% of the books are good, but a bit of experience and care in selecting the books I review has meant that I've been fairly lucky in picking interesting stuff. 

Almost all of the books I review are technical - either computer-oriented math or programming-related.

In this post (and perhaps a followup)  I'm going to say a few things about programming books - things that bug me.  

Reading code is hard.   It is hard when you're using an IDE that does syntax highlighting, and it is harder when it all gets printed up.   And yet, far too often, programming books consist of piles and piles of code, separated by text.    The text should say the important things - and it should tell us about them in words.   Authors need to think carefully about what they're saying and how, and keep the code to a minimum.    If it is absolutely necessary to add more code (often boilerplate of one sort or another),  that should be done in the web copy of the code (there is a web accessible copy of the code, naturally).   A couple of lines is usually all that is needed to show how things are working (more in some of the more verbose languages).   A good guideline: if the code needs to be split across multiple pages, it is too long.

In some of the poorer books I've seen, there are great hunks of code, a rather shorter section of explanations and then more great hunks of code.   In the worst of these, the great hunks of code may be many pages long.   In one case (this was a while back), the book was organized around a single program which started at perhaps 20 pages of code that did something simple.   Then over the course of several iterations, slight changes were made to the code and the whole program was repeated (with typography indicating the changes).   It was a while ago, but I remember that the final iteration was huge.  Now this was before the internet made it as easy as it is now to put your code online (I think the author or publisher did put the code online, but I'm not certain), and I doubt that anyone read much more of the programs than I did (typically only a quarter of a page or so).    The idea of picking a single program and adding to it was a good one (so the reader doesn't have to switch problem domains repeatedly), but the way it was done was simply awful.

In another book, the same kind of thing was done in the sense that most of the text worked around a single problem, offering different approaches along the way.   This book didn't just do massive code dumps onto pages, but did repeat code text over and over with a line added or changed here or there.    Some of this is probably necessary, but repeating whole functions with no changes multiple times is unnecessary.    In this case the author did have all the code available online, but some of the listings were incomplete and didn't compile on my system.  

Perhaps the code is really, really necessary to have on paper.   Fair enough.   But comments?   There's a place for printing out very short comments in the code.   This might be a quick "notice this on this line" thing, or perhaps an identifier for the file on the website.   For the most part though,  the text should be doing the explanations - not the comments.   I've seen programs in books where the size of the comments is roughly the size of the code.    Why oh why didn't the author explain all this in the text?     In more than one case, there have been programs that didn't fit on a single page, and the reason was that the comments made it too long.  
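As a rough illustration of the kind of comment that earns its place in print, here is a made-up Python listing (the function name and the file path are hypothetical, invented for this example).  One comment identifies the file on the book's website, one says "notice this on this line", and everything else is left for the text to explain.

```python
# File: ch03/running_mean.py (hypothetical path in the online code)

def running_mean(values):
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)  # notice: count, not len(values)
    return means
```

The listing stays well under a page precisely because the explanation of why a running mean is wanted would live in the surrounding paragraphs, not in the code.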

Program typography is another problem, but I'm not sure there's a good solution.   Monospaced typewriter style fonts seem to be the favorite for program text, but even a little help can make the code easier to read.   Boldfacing reserved words and doing comments in italics can go a long way to improve readability.   Line numbers where needed (if they're needed, the code may already be too long) can be useful, but it is quite possible to do them in a smaller font so they're not so intrusive.   

Finally, the swoopy "this is a newline" thingy.  If your lines are long enough to require a second line, format them in your editor to fix this.    Most programming languages can cope with this quite nicely - while there might be a reason to do this in a whitespace sensitive language such as Python,  there is no reasonable justification for doing this in a language like Java or C.
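For what it's worth, even Python can often be wrapped by hand, since it joins lines that are broken inside parentheses, brackets or braces.  A small sketch (the function and its parameters are invented for illustration, not taken from any book I've reviewed):

```python
def describe_book(title, publisher, pages, price):
    # The expression is wrapped inside parentheses, so Python
    # joins the pieces itself; no continuation mark is needed.
    summary = (
        f"{title} ({publisher}), "
        f"{pages} pages, "
        f"${price:.2f}"
    )
    return summary
```

Each printed line here is short enough to fit a narrow book column, and the code is exactly what the reader would type.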