Friday, November 06, 2009

On Publishers

Over the years I've been reading and reviewing books, I've come to appreciate just how much a good publisher can do for a book.  Having the right author and the right material is essential, but a good publisher can make things really work.  

There are lots of good publishers in the field of computing. Notable among these are Morgan Kaufmann, Addison Wesley and O'Reilly. These publishers seem to me to produce consistently good books. Not that others do not put out good books, but one of the things you notice is that some publishers are just a little bit less reliably good. And a few are even reliably poor.

How do these differences show up?  The good publishers ensure that the material is fundamentally good: interesting, timely and on point.  But even good material can be poorly presented and one of the jobs of a good publisher is to make sure that the material is presented well - this involves everything from checking the quality of the writing to good proofreading to making the typography and layout look good and to a good index.  Even the good publishers sometimes goof a bit here, and mistakes are ok as long as the whole book works.  A good publisher makes a good author look very good indeed, and can make even a middling author look good.

Other publishers (and I'm trying carefully to name no names) don't seem to care.  At the worst there are a couple that seem interested in pushing almost anything with pages and a cover out their door. Sometimes the books are just silly, vanity press offerings but dressed up as something more (perhaps to get the author(s) publication credits). Sometimes the underlying material is good, but the publisher let the author down - not providing adequate proofreading, not working to ensure that the book is consistently laid out, not getting adequate technical review.

And then there are the money-grubbers. There are a couple of bottom feeder publishing houses who go that one step further down. I suspect they have academic and other large libraries who just gobble up whatever they print (I can't imagine why). These serve everyone poorly - the libraries use up their budget in buying junk, the authors look like idiots and the readers waste their time. Sure, one book in every dozen might be halfway okay, but the others are worse than a waste of money. I've read and/or reviewed more than a few of these. In one of the worst of these the authors started with material that was (at best) thin, wrote poorly (and clearly the publisher invested not one cent in proofreading), and wrapped it up in a $100 plus package that, and I can say this only of a very small number of the books I've reviewed, left me wishing I'd never even seen the book. I'll sometimes give books away on BookMooch, but I'd feel sorry for the person who got this one - even free. A responsible publisher would have either just left this book to rot, or would have found a way to make the material more informative and interesting. I fault the authors for allowing the book to go to press, but I fault the publishers all the more.

Monday, November 02, 2009

Reviewing Books

I have been writing reviews for Computing Reviews for a while now and it is an interesting thing to do. I tend to focus more on books than articles, in large part because I like books, but also because I have found that as I pursue my journey to curmudgeon, it's not 90% of articles that are crap, but more like 99.9%, due in large part to the fact that making tenure in a university requires publication of quantities of the aforementioned substance.

Not that even 10% of the books are good, but a bit of experience and care in selecting the books I review has meant that I've been fairly lucky in picking interesting stuff. 

Almost all of the books I review are technical - either computer oriented math or programming-related.  

In this post (and perhaps a followup)  I'm going to say a few things about programming books - things that bug me.  

Reading code is hard. It is hard when you're using an IDE that does syntax highlighting, and it is harder when it all gets printed up. And yet, far too often, programming books consist of piles and piles of code, separated by text. The text should say the important things - and it should tell us about them in words. Authors need to think carefully about what they're saying and how, and keep the code to a minimum. If it is absolutely necessary to add more code (often boilerplate of one sort or another), that should be done in the web copy of the code (there is a web accessible copy of the code, naturally). A couple of lines is usually all that is needed to show how things are working (more in some of the more verbose languages). A good guideline: if the code needs to be split across multiple pages, it is too long.

In some of the poorer books I've seen, there are great hunks of code, a rather shorter section of explanations and then more great hunks of code.   In the worst of these, the great hunks of code may be many pages long.   In one case (this was a while back), the book was organized around a single program which started at perhaps 20 pages of code that did something simple.   Then over the course of several iterations, slight changes were made to the code and the whole program was repeated (with typography indicating the changes).   It was a while ago, but I remember that the final iteration was huge.  Now this was before the internet made it as easy as it is now to put your code online (the author/publisher did put the code online?  I thought so), but I doubt that anyone read much more than I did of the programs (typically only a quarter of a page or so).    The idea of picking a single program and adding to it was a good one (so the reader doesn't have to switch problem domains repeatedly), but the way it was done was simply awful.

In another book, the same kind of thing was done in the sense that most of the text worked around a single problem, offering different approaches along the way.   This book didn't just do massive code dumps onto pages, but did repeat code text over and over with a line added or changed here or there.    Some of this is probably necessary, but repeating whole functions with no changes multiple times is unnecessary.    In this case the author did have all the code available online, but some of the listings were incomplete and didn't compile on my system.  

Perhaps the code is really, really necessary to have on paper.   Fair enough.   But comments?   There's a place for printing out very short comments in the code.   This might be a quick "notice this on this line" thing, or perhaps an identifier for the file on the website.   For the most part though,  the text should be doing the explanations - not the comments.   I've seen programs in books where the size of the comments is roughly the size of the code.    Why oh why didn't the author explain all this in the text?     In more than one case, there have been programs that didn't fit on a single page, and the reason was that the comments made it too long.  

Program typography is another problem, but I'm not sure there's a good solution.   Monospaced typewriter style fonts seem to be the favorite for program text, but even a little help can make the code easier to read.   Boldfacing reserved words and doing comments in italics can go a long way to improve readability.   Line numbers where needed (if they're needed, the code may already be too long) can be useful, but it is quite possible to do them in a smaller font so they're not so intrusive.   

Finally, the swoopy "this is a newline" thingy. If your lines are long enough to require a second line, format them in your editor to fix this. Most programming languages can cope with this quite nicely - while there might be a reason to do this in a whitespace sensitive language such as Python, there is no reasonable justification for doing this in a language like Java or C.

Tuesday, September 29, 2009

Dun and Bradstreet spam

I recently got mail from Dun and Bradstreet, a Wall Street firm that (as far as I can tell) sells financial information on people and businesses. I've never done any business with them, so their mail was (as far as I was concerned) spam. And, typically of spam, it was not sent by them, but by another firm that they've hired to do this for them. There were a number of links in the mail that purported to go to Dun and Bradstreet, but the URLs went to exct.net (and not to Dun and Bradstreet) and looked like http://cl.exct.net/?qs=long-hex-string-here. They even had "unsubscribe" links. I'm always reluctant to click on such links as who knows what is on the other end, and, of course, because clicking on unsubscribe links usually just confirms the email as valid to the spammers. Eventually, I did click on one from a sandboxed browser and it took me to Dun and Bradstreet.
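
The mismatch between where a link claims to go and where it actually points is the sort of thing that can be checked mechanically. Here is a rough sketch in Haskell of such a check. The function names are made up for illustration, the URL parsing is deliberately naive (plain string handling, http and https only), and the domain you compare against (for example "dnb.com") is an assumption you would have to supply yourself.

```haskell
import Data.Char (toLower)
import Data.List (isSuffixOf, stripPrefix)

-- Pull the host out of an http or https URL using plain string handling.
-- Deliberately naive: no support for "user@host" or IPv6 literals.
urlHost :: String -> Maybe String
urlHost url = do
  rest <- dropScheme (map toLower url)
  let host = takeWhile (`notElem` "/?#:") rest
  if null host then Nothing else Just host
  where
    dropScheme u = case stripPrefix "https://" u of
      Nothing -> stripPrefix "http://" u
      found   -> found

-- True when the host is the expected domain or one of its subdomains.
hostWithin :: String -> String -> Bool
hostWithin domain host = host == domain || ('.' : domain) `isSuffixOf` host

-- Does the link really go to the domain the mail claims to come from?
linkGoesTo :: String -> String -> Bool
linkGoesTo domain url = maybe False (hostWithin domain) (urlHost url)
```

With the link shape quoted above, linkGoesTo "dnb.com" "http://cl.exct.net/?qs=long-hex-string-here" is False, which is exactly the complaint: the mail says one sender and the links say another.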

I was still a bit suspicious though and sent Dun and Bradstreet an email pointing out that their mail was spam (at least unsolicited commercial email), that it looked like a phishing attack at the mail level, and that this was not considered good behavior.   I usually try to let companies know when they're being phished or are (perhaps inadvertently) being used in spam.    I got a response back from someone who had clearly not understood my points and fired off another email to them trying to explain just why their mail was spam, why it was suspicious and why I was bothered.  

A while later I got another response, and some of that is worth quoting.    I'll leave out the part where they say I'm misguided (which may be the case, honestly enough).       I'd also note that I now seem to be on their email list (I've received two emails from them asking me to do a survey in the last day) and will be marking their mail as spam in the future.    This response confirms that they are harvesting emails with a view to spamming ("sent to millions of recipients") and the part about "html to most and text to others" is also nonsense as I got the HTML version and the text version in the same email and it had the same links (that is, not to Dun and Bradstreet).  

The rest of this post is excerpted from their email:

As an FYI, the campaign was sent to millions
of recipients whose e-mail addresses we've collected through our Jigsaw
partnership.  Due to the large number of recipients involved, we're
bound to get a certain number of complaints from people who don't
understand the purpose of the campaign (though we tried to explain in
the message) and others who simply like complaining.

The only other comment I'd make is that the message was transmitted in
html to most and in text to others.  Our e-mail vendor informed us that
this is standard, as it depends on the formatting of the various ISP's
through which the messages are transmitted.  It looks like the recipient
below received the message in text format, which is why the links look
weird and unofficial.  I believe that only the html version shows the
graphics, D&B brand, etc.  Unfortunately, neither we nor our vendor have
any control over the format through which the given ISP's transmit the
message.

Thursday, September 24, 2009

Some think new name.

A while back in the New York Times, I ran across a printed black rectangle with the words "Some think new name." printed (in white) in the top left hand corner. I had no idea what it meant, but posted it on my door because, well, it was odd and intriguing. Today I was looking at it and decided to do a Google search on the phrase. Not a hit to be found, nor was there one on Bing, nor on the NY Times web site.

Anyone have a clue as to what this is/was?

Tuesday, September 22, 2009

Cracker

BBC America today showed reruns of "Cracker" featuring Robbie Coltrane (probably better known as Hagrid in the Harry Potter movies). This is one of the best TV dramas I've seen, funny, tragic, and oddly haunting. Even if you don't like the usual TV crime drama stuff, this one is worth a watch and a second watch. Coltrane handles his part beautifully and almost effortlessly and the writing (especially in the episodes shown today) is almost perfect.

Monday, September 21, 2009

Pictures

I was in the Peace Corps from 1973-1977 - Zaire - now Congo (again), Shaba province (now Katanga again, I believe), in the towns of Luabo, Chibambo (on the Luapula river) and Kalemie (on Lake Tanganyika). I carried a camera around and took pictures pretty often while I was there. This resulted in a box of fading photographs and another of negatives that I didn't really look at often. I've tried to scan the negatives a couple of times.

Once was in the local lab at EWU. They had a negative scanner attached to a nice big Mac, but they'd disabled Terminal on the Mac so the only way to get images off was email or to burn a CD - instead of being able to scp them to my local desktop machine (sigh).

I also tried a negative scanner which scanned one image at a time, slowly, and required manual positioning on the negative. It was essentially unusable for my boxload of negatives. I returned it.

Then recently I was reminded that I wanted to do this and found a recommended scanner on Amazon - an Epson V300. And it worked very nicely, nicely enough that just two weeks later I now have 1500+ scanned negatives. I've put them all, pretty much without looking at them, up on my work server (this will be going away at some point in the next year or so, I'll try to find another place to put them). I'll be going through these and removing the ones that are out of focus or don't show anything, duplicates and the like, and I'll also generate some thumbnails and labels. If you happen to be in one of these pictures and want it to be un-posted, let me know and I'll do that.

I may put some of them on Panoramio or something so I can add google map links to them.

Thursday, September 03, 2009

Fractran in Haskell

One of the blogs I tend to follow is Good Math, Bad Math, which had a post on one of those oddball things that can be lots of fun to play with, in this case Fractran, a Turing-complete language by John Horton Conway. I implemented this myself in Haskell back in (hmmm, not at all sure) 1997(?). The quality of the Haskell novice I was then (and remain) shows, though my Haskell has improved since then. Read that post for better information on what Fractran is and how it works.
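
For the impatient, the whole rule fits in a few lines: a Fractran program is a list of fractions, the state is a single positive integer, and each step multiplies the state by the first fraction that gives an integer result. When no fraction does, the program halts. A minimal sketch of just that rule (the names step, states and adder are mine, picked for illustration, and this is not the 1997 code):

```haskell
import Data.Ratio ((%), numerator, denominator)

-- One Fractran step: multiply n by the first fraction that yields an integer.
-- Nothing means no fraction applies, so the program halts.
step :: [Rational] -> Integer -> Maybe Integer
step prog n =
  case [ numerator q | f <- prog, let q = fromInteger n * f, denominator q == 1 ] of
    (m:_) -> Just m
    []    -> Nothing

-- Every value the program passes through, starting from n.
states :: [Rational] -> Integer -> [Integer]
states prog n = n : maybe [] (states prog) (step prog n)

-- The one-fraction adder: it turns 2^a * 3^b into 2^(a+b).
adder :: [Rational]
adder = [2 % 3]
```

Starting the adder at 1944 = 2^3 * 3^5, states adder 1944 gives [1944,1296,864,576,384,256]: each step trades one factor of 3 for a factor of 2 until no 3s are left, and it stops at 256 = 2^8.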

This implementation factors the numbers used and just keeps track of the factors in the numerator and denominator of the fractions involved. I probably reinvented the wheel to get the primes and factors and all. I tested it on the primes program (named primegame) and the addition program (named addergame), but not on much else, as actually writing Fractran programs was not something I tried very hard to master.
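
To make the "keep track of the factors" idea concrete, here is a small sketch that runs a Fractran program entirely on lists of (prime, exponent) pairs, so a step becomes a comparison and an addition of exponents instead of a big multiplication. All of the names (Factored, factorise, times, stepF, runF, adderF) are hypothetical and chosen for this example. It is a simplified illustration of the representation, not a cleaned-up version of the listing that follows.

```haskell
import Data.Maybe (fromMaybe)

-- (prime, exponent) pairs, primes in ascending order.
type Factored = [(Integer, Int)]

-- Trial-division factorisation; fine for the small numbers used here.
factorise :: Integer -> Factored
factorise = go 2
  where
    go p n
      | n < 2     = []
      | p * p > n = [(n, 1)]
      | e > 0     = (p, e) : go (p + 1) rest
      | otherwise = go (p + 1) n
      where (e, rest) = strip p n 0
    strip p n e
      | n `mod` p == 0 = strip p (n `div` p) (e + 1)
      | otherwise      = (e, n)

exponentOf :: Integer -> Factored -> Int
exponentOf p fs = fromMaybe 0 (lookup p fs)

-- Multiply two factored numbers: add exponents, dropping any that reach 0.
times :: Factored -> Factored -> Factored
times a [] = a
times [] b = b
times a@((p, e) : as) b@((q, f) : bs)
  | p < q      = (p, e) : times as b
  | p > q      = (q, f) : times a bs
  | e + f == 0 = times as bs
  | otherwise  = (p, e + f) : times as bs

-- Reciprocal: negate every exponent.
invert :: Factored -> Factored
invert fs = [ (p, negate e) | (p, e) <- fs ]

-- A fraction applies when the state has at least the denominator's exponents.
stepF :: [(Factored, Factored)] -> Factored -> Maybe Factored
stepF prog n =
  case [ times num (times n (invert den))
       | (num, den) <- prog
       , all (\(p, e) -> exponentOf p n >= e) den ] of
    (m:_) -> Just m
    []    -> Nothing

-- Run to completion and return the final state (loops if the program does).
runF :: [(Factored, Factored)] -> Factored -> Factored
runF prog n = maybe n (runF prog) (stepF prog n)

-- The adder 2/3 as (numerator factors, denominator factors).
adderF :: [(Factored, Factored)]
adderF = [([(2, 1)], [(3, 1)])]
```

Here factorise 1944 is [(2,3),(3,5)], and runF adderF (factorise 1944) ends at [(2,8)], that is 2^8 = 256.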


import Data.Ratio
--
-- run takes an integer i
-- and a list of fractions (the program)
-- it returns a result list of the integers generated
--

run :: Integer -> [Rational] -> [Integer]
run i p = takeWhile (>0) res
  where
    res = i:(map runstep res)
    runstep j = runl j p

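-- runl applies the first fraction that gives an integer result;
-- it returns 0 if no fraction applies, which ends the run
runl :: Integer -> [Rational] -> Integer
runl j [] = 0
runl j (f:fs) =
  let ifv = (fromInteger j)*f
  in if isInteger1 ifv
       then (numerator ifv) -- essentially a toInteger
       else runl j fs

isInteger i = (i == fromInteger (truncate i)) -- not used below
isInteger1 i = ((denominator i) == 1) -- works in this context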

--
-- the numbers themselves don't show much, so write them as products
-- of powers of primes
--
primes :: Integral a => [a]
primes = map head (iterate sieve [2..])
sieve (p:xs) = [ x | x<-xs, x `rem` p /= 0 ]

powersOfTwo :: [Integer]
powersOfTwo = 2:(map (2*) powersOfTwo)
--
-- returns 0 if not a power of two
-- else returns the power
--
whichPowerOfTwo x = l
  where
    (a,b) = span (< x) powersOfTwo
    l = if x == (head b) then 1 + (length a) else 0

--
-- brute force factorization
-- note: each pair in the result is (power, prime)
--

factor x = factor1 x primes
factor1 1 _ = []
factor1 x (p:ps) = let (c,q) = fp x p
                   in if c > 0
                        then (c,p):(factor1 q ps)
                        else factor1 q ps

--
-- multiply two lists of prime, power pairs
--

mult [] l = l
mult l [] = l
mult l@((p,pow):ps) l'@((p',pow'):ps')
  | p == p' = (p,pow+pow'):(mult ps ps')
  | p < p' = (p,pow):(mult ps l')
  | p > p' = (p',pow'):(mult l ps')

--
-- divide one list of prime,power pairs by another
--

divvy [] l = [ (p, negate pow) | (p,pow) <- l ] -- 1/l: every power goes negative
divvy l [] = l
divvy l@((p,pow):ps) l'@((p',pow'):ps')
  | p == p' && pow == pow' = (divvy ps ps')
  | p == p' = (p, pow - pow'):(divvy ps ps')
  | p < p' = (p,pow):(divvy ps l')
  | p > p' = (p', -pow'):(divvy l ps')

--
-- isIntegerL returns true if the list represents an integer,
-- that is, when all the powers are >= 0
--

isIntegerL l = and (map ((>= 0).snd) l)

--
-- fp takes two integers and returns the largest
-- power of the second that evenly divides the first
-- and the quotient thus determined
--

fp :: Integer -> Integer -> (Integer, Integer)
fp x p = if x `mod` p == 0
           then let (c,q) = fp (x `div` p) p
                in (c+1,q)
           else (0, x)

primegame = [17%91, 78%85, 19%51, 23%38, 29%33, 77%29, 95%23, 77%19, 1%17, 11%13, 13%11, 15%2, 1%7, 55%1]

--
-- start with 2^a * 3^b
-- ends with 2^(a+b)
-- so starting with (8 = 2^3) * (243 = 3^5) = 1944 should result in 256 = 2^8
-- that is, last (run 1944 addergame) => 256
--
addergame = [2%3]

p1 = filter (/= 0) (map whichPowerOfTwo (run 2 primegame))

-- main = putStr (show (take 20 p1))
main = putStr (unlines (map show (map factor (take 10000 (run 2 primegame)))))