Indie-pub wunderkind and author advocate Hugh Howey released this:

Author Earnings dot com.

It aims to provide (admittedly self-selecting) data about author earnings — a subject that has made a lot of hay in the last month or two — and further seems to want to shine a stronger, more data-driven light on the earnings of self-published authors in particular.

I am all for more data, and in this, I respect the effort mightily.

Every piece of data an author has is better than having no data at all.

That said, it’s also important to have some scrutiny of that data.

Data — er, “data” — after all, is easy to come by on the Internet.

Less easy is data that is true, and meaningful, and supports conclusions.

So: is this data all that?

Answer unclear, ask again later.

I’ll note a few things here, and then turn it over to you folks (translation: someone smarter than me please take a look at it and offer your thoughts, willya?).

a) This data is entirely about Amazon, which remains the apparent leader in e-books but is by no means the entire picture in terms of bookselling in general. That skews this as being useful data regarding e-books and e-publishing (and author-publishing in particular), but maybe less so as a big picture than hoped? Point is: Amazon is a big fish, but not the only fish.

b) This data is extrapolated — meaning, no actual numbers, right? It’s taking data from (do I have this right?) a single day’s worth of rankings and from that deducing sales numbers for that day and then, by proxy, a whole year? (Graph here. Text: “The next thing we wanted to do was estimate yearly e-book earnings for all of these authors based on their daily Amazon sales.”) In my experience, those websites that attempt to extrapolate sales data using Amazon ranking numbers have been faulty. Hell, even Bookscan numbers are kinda fucked up (which is, itself, fucked up, BUT HEY WELCOME TO PUBLISHING WHERE NOBODY HAS VITAL INFORMATION).

c) It’s only pulling from bestseller lists on Amazon. This means we don’t have data on everything that isn’t… bestselling, right? Still useful to know and see what it means for those books up there at the top in terms of how many are indie and how many are not — it’s more indie than you think.

d) Because of limited scope, fails to capture ways that authors can make other money with a single book — foreign rights, film/TV rights, etc. That’s true on both sides of publishing, though likely moreso in traditional. If you looked at a book like my own Blackbirds and used a single day’s worth of sales at Amazon, you’d have almost none of the picture of a) how it really sells and b) the money I’ve made from the book beyond just the book.

This is interesting, so far. Be curious to see where it goes from here, and if it starts to include more robust data. At present it seems like an interesting start, though one offering a limited timespan of data (a single day) that captures not so much actual data as it does an extrapolation of data. (Though again, maybe I’m misreading, here. Smart people: jump in.) Either way, good for Howey for getting this out there. One assumes over time the data here will start to sharpen and present something cutting. In the meantime, worth poking through this with a few sticks and seeing if we can get other folks to verify the data and conclusions from the data.

Your thoughts?

  • When the report has a more comprehensive data set, it will become far more useful.

    Something that seems frequently absent from these discussions is the unit revenue differences between SP and trad-pub. It’s understandable that many SP titles out-sell trad pub titles by a big margin, but when unit-revenue is included, the picture is likely to appear very different.

    Overall, I hope this site lives up to the ideals it espouses.

    • You’re right. Speaking strictly of sales and revenue per unit, the data would become very skewed towards self-pub. Royalties are either 35% or 70%, both of which beat the industry standard 25% on ebooks.

  • I had the same blinking reaction to the charts. But Howey’s very upfront about this report being a snapshot, and he’s encouraging others to take a microscope to the data. I give him huge credit for this.

    And hell, who can argue with his takeaway points? He’s telling publishers to lower their e-book prices and nurture their authors more. I’m a legacy-published novel guy and I’m saying halle-fricking-lujah.

  • It’s very interesting, and while Chuck brings up some good points, it’s a start. I’m glad that Hugh has done it and can only see it becoming a more useful resource as time goes on. We are going to chat to Hugh live on the Self-Publishing Roundtable tomorrow night at 8:30 pm EST to find out more about it.

  • I’d need a wider range of data, and some comparison data vs. print sales to be able to say that All Publishers Must lower their ebook prices and that doing so would be only good for authors and publishers. The lower the ebook price, the less attractive the print edition looks by comparison, and the less books are valued overall. I don’t want a universe where the cultural equilibrium price of a book is the same as the HD version of a single TV episode or of two MP3s.

    • Whoa, whoa, whoa. I was with you all the way to “the less books are valued overall.” And then that last sentence jumped the shark-pit wall.

      Attempts to define the “real” value of something are… well, I’ll just say it. They’re elitist patronizing bunk. The worth of a thing is what that thing will bring in an open market between a ready buyer and a willing seller. Attempts to run economies and societies according to some other principle usually end in tears. (And often in screams and blood.)

      I love books. And I buy way more books now than I used to. Not because I couldn’t afford them before – I am fortunate to have a day job that pays ludicrously well. But because I thought that they were overpriced and I wouldn’t often pay what was being asked by traditional publishers for paper books. I still do, and I don’t buy paper books at all anymore, except for at HPB. And I won’t buy what I consider subjectively overpriced ebooks, either.

      That being said, the question of margin is a very, very valid one. But there are two possible meanings when you talk about the “value” of a book. You could mean, “What pricing structure will get the most people to read the most books and generate the most profit for the authors* thereof,” or you could mean, “What pricing structure reinforces the appropriate cultural value of books?”

      I think that there was a solid argument that lower pricing, especially on ebooks, might not be the best answer to the first one. I also think that that argument is getting weaker literally as we watch the data come in. It’s not settled yet, but I think that the burden of proof has shifted.

      I can’t answer the second one, and I really don’t care about it. If you do, I am not saying you are wrong, I’m saying that you are making a subjective value judgment and there is no way to refute you, so I won’t insult you by saying that I think your values are stupid, which is what people who try to refute value statements are really doing. I don’t think your values are stupid. I just don’t happen to share them.

      I acknowledge that the analogy to equivalence is not perfect, because books do have cultural value over and above their entertainment value. (Then again, so does music. What’s a song but a short story you play music along with?) But I’m not writing Littrachaw. I’m writing fun stories that I hope people will enjoy (and if I make them think a little bit, bonus.) So I see my competition as equivalent entertainment experiences. If I write a novel that takes two hours to read, I’m perfectly willing to price it equivalently to, say, renting a two-hour movie from Amazon. (An awesome premium new-release movie, of course, because I am just that awesome.) And the fact that you can read my book again and again is me providing extra value (and might justify a small premium.) However, if a two-hour movie costs five bucks to rent, and a two-hour novel costs twenty bucks to read, one of us has unrealistic pricing expectations. I’m a pretty sharp fellow, but I’m not sure I’m sharper than Jeff Bezos. More importantly, I don’t think I have the same pricing information he does.

      *Authors. Not publishers. I don’t give two shits about maximizing publisher profits, unless of course the author is publishing their own work, in which case the problem collapses to the simpler case.

      • You’ll notice I said cultural equilibrium price – not market equilibrium price. Market price can be controlled by a lot of factors – diamonds are expensive due to cartels that have spent a decade creating demand for their product through an extended marketing campaign to create a cultural demand. Therefore the cultural equilibrium price (the perception of ‘real’ value) drives the market price, making room for it when normal supply and demand might not come to the same end. That’s just marketing.

        Over the last few years, I’ve heard more than a few people who have the same opinion that books should be cheaper. But when we do compare books to their direct competitors, books are, on average, one of the cheapest entertainment media.

        Going to your paragraph about relative cost to a two-hour movie – A two-hour movie is generally about the same wordcount as a novella – novellas are frequently priced at around $2.99 in current ebook systems, sometimes more like $1.99. My understanding is that average reading speed is such that an 80K novel is going to take more than two hours to read, so in terms of unit cost to hours of entertainment on a single use, a $6.99 novel justifies its higher relative price vs. a film rental at $2.99 or $3.99. Of course, film and books offer different types of entertainment, and there’s more diversity within each medium than perhaps even the average difference between a book and a movie, so direct equivalence is difficult and super-subjective.

        What I resist is the notion that books *should not and do not* command their full price, that books are really ‘worth’ only $2 or $3. If a book costs $14.95 from an independent bookstore and $10 from an online retailer, that doesn’t make it a $10 book that the indie is over-charging for. Many many readers are happy to pay the MSRP at $15, and many choose the ecommerce discounted price of $10. There are a lot of back-end logistics and intangibles that inform these relative costs, and I’m seeing a cultural slide toward the deep-discounted ecommerce price as being regarded as the ‘real’ $ value of a work, which means that authors are not able to command as high a price for their work, and must rely entirely on price elasticity of demand, praying that the math on lower price, higher sales #s adds up in their favor.

        Twenty-dollar books, hardcovers, are the deluxe edition of high-priority works. The publisher is making a bet that the demand for those editions will be sufficient to generate both sales velocity and adequate revenue. Not every work comes out in hardcover, and sometimes, hardcover is the wrong format, either because the ideal reader for the work is used to paying a lower price, or because there isn’t enough demand for that deluxe edition package. Seeing a blockbuster movie in a deluxe setting can cost you around $15 in the US, $10 or so outside the biggest cities. Why not wait until the home video release? Some people choose to pay more, and others don’t. As long as the deluxe version can generate enough demand to be fiscally viable, publishers and film companies will make deluxe editions.

        • Excellent points, and one of the big weaknesses in my argument on that score is that I am a terrible judge of how long it takes to read something. I can read a 60K word novel in an hour, give or take. I usually just double my estimate and even then I am never very close. 🙂 Bragging aside, your comparisons are also fair and certainly argue for a higher price point even when valued at comparative-entertainment levels.

          But I really must protest this phrasing:

          “What I resist is the notion that books *should not and do not* command their full price…”

          You are using “full,” I believe, to mean “publisher’s list,” or some such thing, but your implication, to me, is that that is not only the “full” price but the “real” price and that selling it for less is some kind of subversion and culturally devalues the book. To which I reply:

          1) Anybody who gets a B&N membership doesn’t pay full price for anything. I have yet to hear B&N in general or its membership card in particular denounced as culturally devaluing the printed word.

          2) More generally, again, there is no such thing as a “real” price. The worth of a thing is what that thing will bring. The publisher does not know what the “real” price is, or a “fair” price. They know how much the book costs to print. They know how much profit they would like to make. They add them up and that’s what the book costs. It is entirely possible that few to no people will agree with them that that is a reasonable price, and therefore, nobody will buy the book. Happens all the time. Are you saying that the only reason traditionally published books don’t sell is because people don’t understand that they are being offered the book at the proper price? Or that lowering the price of a traditionally published book wouldn’t increase its sales? If so, I am at a loss to understand the sales/discounting programs that publishers often offer through Amazon, since they had already figured out the right price.

          If it helps, there is obviously a lower bound to all this for works of true cultural significance: I can get all of Mark Twain’s books for free on Project Gutenberg, but I checked and the B&N in my local mall had a variety of editions for sale when I went to lunch.

          • My greatest worry is that customers will get so used to deep discounting that they will view the discounted price from someone like Amazon or WalMart as the reasonable price, and blanch at list price or the slight discounts offered by other retailers. Accounts that can afford to operate on tiny tiny margins have the capacity to squeeze out, and have been squeezing out, retailers that cannot operate on that slim a margin due to differences of scale.

            If enough consumers choose deep-discounting over time, always prioritizing price over the intangible benefits of other retailers and/or ignoring the effect on the local economy, those behaviors could lead to a single deep discounter getting too much of the market share, leading to monopsony or near-monopsony, which has every likelihood of biting book publishers (indie or trad) in the rear due to that monopsony’s ability to dictate terms.

          • I think that’s a valid concern, logically speaking, but I note that you can still buy both Hyundais and Ferraris, and can still choose to dine at McDonalds or Cheesecake Factory. If the market wants higher-priced premium product, the market will get it. If it doesn’t, it won’t, and trying to make it take it will not work in the long run.

            Also, the ability to sell direct means that if a monopsony or monopoly *were* to come into existence, writers would still have the ability to reach their readers even if it offered unconscionable terms.

  • I agree with a lot of Hugh’s points, but I think there’s something odd about the spider they’re using to crawl Amazon’s site, especially when the report claims that 92% of the 100 best-selling genre books are e-books (Kindle edition) and not hardcover/paperback. If you look at the report’s chosen genres on Amazon’s best-selling list, it’s almost exclusively Kindle editions. But when you don’t filter at all and just bring up Amazon’s Best-Selling Books in general, there’s not a single Kindle edition or ebook in the top 100. They’re all hardcover and paperback, and they’ve got like 9,000 or so reviews. Huge numbers. The Divergent series, for example, is in the top Overall 100 as hardcover and paperback, but doesn’t show up at all on its respective genre-filtered list, which is odd, to say the least (if it’s a bestseller overall, it should automatically be the bestseller in its genre as well). That says to me there’s something funky about Amazon’s numbers, or about the ability of a spider to get a real sense of what’s going on.

    One other important point: I’m not sure it makes sense to only examine ebooks and use ebook revenue alone as a way to judge between traditional and indie publishing. The Big Five get a huge portion of their revenue from hardcover and paperback sales, so it’s like saying, “If you discount the 80% of the revenue that authors get from hardcover sales, indie ebooks easily outpace traditionally published ebooks.” That might be true (again, the data is suspect at this point) but it doesn’t present the true revenue picture. If you’re traditionally published, you’re probably going to get the majority of your money from physical book sales (judging by Amazon’s entirely physical top 100), so it doesn’t make sense to compare the tinier portion you’d get from ebook sales to the best-selling indie ebooks that by definition don’t have a physical counterpart. It’s like telling someone who just got paid $100k in stock that he’s not doing as well because his base salary is only $10,000.

    • “If you look at the report’s chosen genres on Amazon’s best-selling list, it’s almost exclusively Kindle editions. But when you don’t filter at all and just bring up Amazon’s Best-Selling Books in general, there’s not a single Kindle edition or ebook in the top 100. They’re all hardcover and paperback, and they’ve got like 9,000 or so reviews. Huge numbers. The Divergent series, for example, is in the top Overall 100 as hardcover and paperback, but doesn’t show up at all on its respective genre-filtered list, which is odd, to say the least (if it’s a bestseller overall, it should automatically be the bestseller in its genre as well). That says to me there’s something funky about Amazon’s numbers, or about the ability of a spider to get a real sense of what’s going on.”

      This is definitely worth a second look, yeah. Amazon’s rankings and lists are always a little puzzling. Would love more information on how all that stuff goes together — and, frankly, what an Amazon rank even means (and my understanding is that an Amazon rank represents more than just sales, or used to, at least).

    • “But when you don’t filter at all and just bring up Amazon’s Best-Selling Books in general, there’s not a single Kindle edition or ebook in the top 100.”

      That’s because ebooks can’t appear on that Top 100 list. It’s solely for print books.

      This is from May 11, 2011, in the New York Times:

      “[Amazon] said that it sold 105 e-books for every 100 printed books, a milestone that its chief executive, Jeff Bezos, said had occurred sooner…”

      That’s almost 3 years ago. And after the DOJ ended the price-fixing, the prices of ebook best sellers dropped by a very large percentage. Lower price = higher volume.

      As for Amazon UK, this was announced on August 5 2012

      “The retailer says for every 100 print books sold, 114 ebooks are purchased. These figures are specific to”

  • I’m not addressing the sales issue at all, but I do have a comment about the report’s take on reviews and price. Howey notes that the average list price and average review score “seem to be inversely correlated.” He then says, “Think about two meals you might have: one is a steak dinner for $10; the other is a steak dinner that costs four times as much. An average experience from both meals could result in a 4-star for the $10 steak but a 1-star for the $40 steak. That’s because overall customer satisfaction is a ratio between value received and amount spent.”

    Okay, fine. But.

    Per his chart, the average rating for a $7 Big Five book is somewhere around 4.1 stars, and for the $3 to $4 books from indie publishers, small/medium publishers, Amazon, and “uncategorized single-author publishers,” it’s between 4.25 and 4.4 or so. That’s a difference of between .15 and .3 stars, which isn’t a lot. (Note the way the chart only goes from 3.0 to 4.5, instead of from 0 to 5; that’s an old Evil Statistician’s trick for making a difference seem larger than it really is.) Given the lack of objectivity inherent in Amazon’s rating system, it’s probably not a very meaningful difference. In other words, the $7 books from the Big Five publishers and the $3 self-published books are being judged as offering pretty much equal value for the money, so his tale of the two steaks doesn’t hold up.

    More importantly, “value for the money” has nothing to do with absolute quality.

    If I compare a $3 Quarter Pounder and a $6 Burger from the higher-priced fast food joint up the block, I might give them equal ratings based on value for the money. Maybe I would even give the $3 burger a slightly higher rating on that basis, but that doesn’t mean I thought it tasted better – it just means I held the $6 Burger to a higher standard. I might buy more Quarter Pounders over the course of a month because they’re cheaper, but when I have the extra $3 to spend, I’m going to get the $6 Burger because I know I’ll enjoy it more. Higher price can certainly be a barrier to sales, but it can also be a sign that whatever’s inside the burger is more likely to taste like meat, not cardboard.

    The same thing applies to books. If I buy a book for $2.99, I’m not going to expect a lot. I’ll probably enjoy it – I’m easy – but if I don’t, I’m only out a Quarter Pounder. If it’s a book costing twice as much, I’m going to know more about it before shelling out; I will have read some real reviews (not on Amazon), or I might have heard about it from a friend. I’ll have higher expectations, and they’ll probably be met.

    (I’m not likely to even consider a new book for $.99 unless I know the author’s work, because I’ll see the price as the author’s way of telling me that even he or she thinks it’s not worth very much. Given too many books and too little time, at that point, I’ll look for something else to read.)

    Rock-bottom pricing is great if you’re selling burgers that taste like cardboard to people who just want something quick and cheap for lunch. I’m not sure it’s a one-size-fits-all solution to selling and promoting books.

    • I’ll grant you there are a lot of indie books out there that are not worth the effort, but in my experience, there are just as many trad-pub turds floating in the bowl. In other words, the ratio of crap to gold seems the same to me (though I admittedly don’t have science to back it up).
      We may be measuring trad-pub books by a higher standard, but do they really live up to it? The last three out of the four trad-pub releases I’ve read were “just OK.” Passable. There were a lot I found unacceptable which I DIDN’T read.
      And I’ve found about the same ratio in indies.

    • It would be nice to plot error bars in some of those figures, which would make it easier to see for example whether a difference of between .15 and .3 stars is relatively minor.

  • My two cents – using romance as one of the genres in this makes the data a little less reliable than perhaps it might be. Romance is a huge seller in both traditional and self-published, but I would say that you find way more books out there in romance as e-book only…much of that has to do with many authors putting up their backlists on e-book as well as a number of other things. Romance (and erotica, which is lumped into the same category) have seen a greater swing toward e-books. There are lots of reasons that we could chat about and would make for an interesting discussion, but I’d be curious what the numbers are without that genre in it….not because the genre isn’t valid. Hello! It makes up something like 55% of all books bought. But I think there might be a different picture without it. Not less valid or more correct…just different. And I would love to take that farther and break each genre into its own chart and graph because each market is VERY different. I think there might be a better shot at getting something useful out of the data. But that’s just me. I like numbers and looking at each genre and the unique challenges reaching readers in that genre presents. And I personally believe that when we lump lots of things together the picture gets very cloudy and hard to interpret.

    • There is definitely a discussion to be had about which genres do well in e- and self-publishing, and which genres do not. The decision to go indie is an admirable one (when it’s done well), but is *sometimes* made without a nuanced look at the landscape. And one of those nuances is, you know, romance will rock in self-publishing. Space opera and sci-fi, too. Crime, maybe not as much. Literary, almost certainly not. Etc., etc.

    • This was the first thing that popped into my head as well. One of the writer communities I’m in at Google Plus was discussing short stories as a means of income, and an author who makes high-five/low-six figures a year came up. The genre: erotic romance. The medium: ebook.

      So yes, one of these things is not like the other. The romance market is so big and its units moved so high that it skews the numbers for any set it’s a part of.

  • I see this as a shot across the bow of traditional data reporting. If you remember, the original post that spawned this report was comparing the revenue of traditionally published authors with self-publishers. That dataset included those who identified as self-publishers who hadn’t yet published anything. Howey’s point was this was comparing apples to oranges, primarily because those authors seeking traditional publishing who did not make it past the “gatekeepers” weren’t counted, but those self-publishers who couldn’t sell were.

    In this context, I think the report Howey compiled is purposefully meant to do the opposite: He’s trying to show self-publishing as a viable and lucrative career, but no one’s supplying accurate data. So he came up with a believable broadside. The next step is for the Big 5 or Amazon to refute this data, and we get a better, more accurate picture.

  • Unit revenue is less important than total revenue. They are, of course, related, but it’s not a linear relationship. If an SP author puts out a book at $7.00 ($6.99 but who’s counting?), he makes a unit revenue of $4.90. If he sells 100 books, his total revenue is $490.00. Say this author drops his price to $4.00. His unit revenue is $2.80. Sounds like a bad idea, right? But it’s axiomatic that lower prices result in increased volume of sales. To make $490.00 that author would have to sell 175 books at the lower price. Would he? Maybe. Probably, especially if he’s not well known. He might even sell a lot more. The point here is that the only number that really matters to the author is total revenue. The SP paradigm allows him to experiment with unit revenue to maximize his total revenue, something that is impossible under Legacy Publishing. IMHO, Howey was correct in focusing on totals.

  • To clarify, unit/total revenue is not a linear relationship in terms of author satisfaction. An author will be more satisfied with higher total revenue from lower unit revenue. Mathematically, of course, the relationship is linear. Again, the rock-bottom line is how much money the author gets into his hot little hands, NOT how much the legacy publisher makes.
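
For anyone who wants to poke at the break-even arithmetic in the comment above, here is a rough Python sketch. The 70% royalty rate is an assumption inferred from the $4.90-on-$7.00 figure, and the function names are made up for illustration; nothing here comes from the report itself.

```python
import math
from fractions import Fraction

# Assumption: a flat 70% royalty, inferred from $4.90 earned on a $7.00 book.
ROYALTY_RATE = Fraction(70, 100)


def unit_revenue(price):
    """Author's revenue per copy at a given list price (price as a string like "7.00")."""
    return Fraction(price) * ROYALTY_RATE


def break_even_units(old_price, old_units, new_price):
    """Copies needed at new_price to match the total revenue earned at old_price."""
    target_total = unit_revenue(old_price) * old_units
    return math.ceil(target_total / unit_revenue(new_price))


if __name__ == "__main__":
    print(float(unit_revenue("7.00")))             # 4.9
    print(float(unit_revenue("4.00")))             # 2.8
    print(break_even_units("7.00", 100, "4.00"))   # 175
```

Exact fractions are used instead of floats so the cents come out clean. Note that the royalty rate cancels out of the break-even count: at a flat rate, dropping from $7.00 to $4.00 needs 75% more copies sold just to stand still, whatever the rate is.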

  • Seems to me like Howey and Konrath are just trying to get more PR for their own books. Guess they’re not having enough fun making those deals with trade publishers on the side.

    Publishers and businesses generally have been trying to do this for years. DECADES. Everyone would love to know the magic number to make a book successful. What genre is guaranteed a success. What publicity machine gets the bestseller moving.

    This is all a bunch of numbers that can and will be manipulated to say whatever they want it to say. And, as usual, it’s a huge ad to insult anyone going with a publisher and praising anyone who goes self-pub. Even when the two biggest fans have “sold out” to Amazon and are their biggest fans.

    I’d rather spend my time writing than trying to figure this crap out. You can’t extrapolate on what books are going to be big and which ones aren’t – if it could be done it’d HAVE been done by now by the big publishers.

    Spend less time looking at self-promoting sites like this and more time working on your craft and making your book the best it can be.

    As with all things IMO.

    • This data is an effort to find out what Amazon knows. Amazon could, of course, completely fuck with it by tweaking their ranking algos. But they have no motivation to do that. As for the argument that if it could be done, the publishers would have done it, that doesn’t pass the laugh test. (“If personal computers were something the world needed, IBM would have provided them, Steve. Don’t be dumb.”)

      They have been trying, but as has been demonstrated, they weren’t doing a very good job. The way they calculate sales figures is an insult to anyone who’s ever studied statistics. I’ve been in the licensing biz for decades. I thought I’d seen a lot of rights-grabs and fuzzy math. Then I read my first Big Five publishing contract and saw my first Big Five royalty report. I hadn’t seen anything yet.

      In any event, Amazon knows, but they ain’t saying. The publishers say, but they don’t know. I have to assume that they don’t know, because the only other explanation is that they are lying thieving bastards who purposefully underreport sales to reduce royalties payable. While I’m willing to entertain it, I find the explanation that they are just incompetent much more likely. (On the other hand, purposeful incompetence isn’t much better.)

    • Score one for our small camp, Watcher. Anyone who has actually worked in big publishing (I have) or small publishing (I do) is cringing at the barrage of half-baked stats and blatant misinformation being hurled by Hugh and others. I and my wicked publisher kin have nothing against self-publishing and so far as I can tell don’t wish self-pubs any ill-will. But the business model for legacy publishing cannot support the same system as a single author with no overhead selling his/her books for $1.99. I provide my authors with liability insurance, copyediting, cover art, marketing, accounting, foreign agents, film agents, and more. And yes, a few self-pubs are able to get that on their own and many of my hybrid author friends can handle the business matters just fine and God bless them. But the rest, anecdotal evidence speaking, are babes in the woods. And thus this massive industry has grown up around them, selling them on the goldmine of self-publishing, and it is desperate to prove that self-publishing IS a goldmine and that traditional publishing should no longer be an author’s holy grail. I agree that self-publishing is a fine alternative for authors who can handle that route. But this bizarre and ignorant vitriol against trad publishers is ludicrous. Except that it sells self-help programs to self-pubs. Ah hah. Rally the innocent. Sell them the idea that they’re helpless without your expert opinion. They are, actually, in need of help. But not because traditional publishers prey on them. They’re helpless because the business and creative art of publishing is and always will be brutal, regardless of how much access they’re given to the portals of a readership. Many are called; few are chosen. But many will spend big bucks to learn that hard lesson. And a few will profit from it.

  • He did say they were working on getting data from all genre books, and obviously a larger sample size over time will give us a much clearer picture. I think the point he is trying to make is that even with this small data set it seems likely that author-pub and small pub are doing way better than the legacy pub would like to admit. I would love to see this put pressure on the Big 5 to limit contract life and give a larger cut to authors.

    • Agreed, though it’s “better” in the sense of “At Amazon, and in e-book format.”

      Which is useful information! But still limited enough where I don’t think this is the bombshell some people are claiming that it is.

      If this pressures the Big Publishers to offer better contracts, I’m all for that. Because if they don’t, they’re going to one day soon start hemorrhaging authors and they will, as noted, need to begin to actually compete with one another not just with the books they offer but with the offers they make to authors.

  • I think in this case incomplete (or wrong) data might be worse than no data. The data is being used comparatively, judging one path of writing (indie) versus another (traditional). But it only takes a tiny slice of data that’s favorable to indie publishing (ebooks on Amazon, an ebook publisher that controls the way the data is scraped and is otherwise opaque). This would be like someone saying:

    “This is a very close game, if you don’t include all the points the other team scored in the first half.”

    But we don’t even know what the points are yet for traditional or indie. Extrapolating daily revenue from a figure shrouded in mist like “daily sales rank” and then extrapolating further to annual tax brackets, as the report does, is not helpful. Nor is limiting it to genre. I’d love to get a peek at the actual breakdown of genres; are the figures dominated by a single category? That’s something I’d like to know if I were considering indie, because no one writes “genre” in general; they write in specific genres.

    My thoughts are that incomplete data should never be published because it’s going to polarize the issue without shedding any light upon it. And I’d love to see the real data, because this is an important choice writers have to make.

    • I don’t think you did it maliciously, but you are using “real” and “complete” interchangeably. They are not, in this context, interchangeable. And in a sense, neither of them applies here. The real data is available, in raw form, from the people who did the survey. It is, in the sense that it is the data they collected, complete. You may feel that it is inadequate data to base conclusions upon, but it is not unreal, and it is not incomplete.

      And that’s a mighty slippery slope you just laid down in that last paragraph. Does the phrase “There is insufficient data for a meaningful answer” mean anything to you? 🙂

  • Howey’s numbers sound like a new Sci-fi genre to me! Amazon is not the be all and end all of sales. Speaking as a self-published author (I’ve been selling my non-erotic historical romances online since the end of 2006), near the end of 2011 Amazon made some big changes to their software (as well as bringing in the library thing – their attempts to corral self-published authors). Before the end of 2011 my sales on Amazon had been increasing exponentially year on year. Almost overnight my sales on Amazon dried up. I presume my books, due to some technical change, no longer made it on to their also bought/read lists. I still have sales every month (some months better than others), but nothing like before.

    Has anyone else had a similar experience with Amazon? I may just be a paranoid wench, but I suspect they’ve rigged it so that only those SP authors who signed up to be Amazon-story-slaves were given the also read/bought privilege. For all we know they have it rigged so only certain traditional publishers’ books get the royal (also bought) treatment (which would totally skew sales). These days most of my e-book sales come from iTunes and Barnes and Noble (though I also sell directly to readers – which is how I started). For me, Amazon sales have always been a piece of the revenue pie…not the pie!!!

    • There’s an easy way to see if Amazon’s alsobots are treating you right – go and look.

      I have only one book on Select (and I only added it last month.) I publish on lots of other platforms and Amazon knows that (they price match my books.) The alsobots have always been quite fair to me. I look at other books in my genre, my books show up in the alsobot list.

      As far as what happened to your sales, there are multiple potential explanations, which would require considerably more information to evaluate. I will say, though, that that sucks. 🙁

  • Self-published has a 25% ebook market share on Nook, judging from Barnes and Noble press release (April 2013)

    “Customer demand for great independent content continues to dramatically increase as 30% of NOOK customers purchase self-published content each month, representing 25% of NOOK Book™ sales every month.”

    And according to the Wall Street Journal article “Fast-Paced Best Seller: Author Russell Blake Thrives on Volumes”

    “In 2013, self-published books accounted for 32% of the 100 top selling e-books on Amazon each week, on average”

    Looking at the current Top 100 Paid in Kindle Store, that 32% figure seems about right.

  • A couple of points to clarify the discussion here. “Amazon’s Best-Selling Books” list is print-only. That’s why you only get print books.

    A lot of people were surprised by the dominance of ebooks over print at Amazon. I wasn’t because I’ve compared their print bestsellers with the Nielsen BookScan numbers before and it is really obvious that the sales patterns for fiction at Amazon are very different from the rest of the market. People who shop at Amazon for fiction overwhelmingly prefer the ebook format unless it just doesn’t work well (ex: “Diary of a Wimpy Kid”). Just as an example, Amazon sold a lot more print copies of the official SAT Prep guide than they did Divergent. Think about that.

    • A quick note on Bookscan. In the last 18 months that I’ve been working at Angry Robot, I’ve noticed that Bookscan captures an incomplete and quite varied percentage of overall print sales, comparing them to my internal numbers (which capture a much wider range of sales).

      Bookscan captures sometimes as little as 20%, sometimes well more than 50% of sales. Unfortunately, we don’t have a better system yet, so we use what we have. I find Bookscan figures useful when comparing to other Bookscan #s, and to give a very vague sense of sales history.

  • Saying it only included “bestselling” books is a bit disingenuous. It includes the top 7,000 bestsellers on those two particular days. That’s a pretty large sample of books.

    And let’s face it, this is probably a more accurate depiction of the marketplace than all the studies that ignore Amazon altogether and tell us e-book sales are flat.

    • Exactly. Some of these books rank 50,000+ overall. And these books aren’t ranked here as the top 7,000 bestselling titles of all time. Just for a day. This is deep into the midlist of authors who work hard enough to separate themselves from their peers. (Not that all who do everything right have success. Luck is a component as well).

  • I thought this might be of interest to readers of this blog post. The PassiveGuy (from the PassiveVoice blog) wrote the following observations:

    1. Yes, the report is based solely on Amazon and doesn’t include other outlets. But the report is primarily about indie authors vs. traditionally-published books in the world’s largest bookstore. Since it’s based on Amazon US data, it’s about authors and publishers who sell there (which includes a lot of indie authors who live outside the US).

    Most (not all) US indie authors make the majority of their money from Amazon. As such, Amazon data are of the greatest financial significance for looking at the performance of US indie authors as a group.

    2. Yes, the report doesn’t include income from sales in physical bookstores. Again irrelevant for understanding how indies as a whole are doing. However, again, Amazon is the largest bookstore in both the US and the world and still growing in both absolute terms and in share of market.

    With the continuing decline in the numbers of physical bookstores in the US, at least for a few years, what happens on Amazon will be a larger and larger component of what happens to traditional publishers. If indies are selling more genre ebooks than traditional publishers in early 2014, unless traditionally-published ebooks somehow make a massive turnaround in ways they haven’t before, indie dominance is likely to grow.

    3. No, Author Earnings doesn’t include anything about film and TV rights income, but only a tiny fraction of traditionally published authors ever receive a cent from sales of film and TV rights. For most indies, ebook sales are the main royalties source. For most publishers, Amazon sales of ebooks and physical books represent the single largest source of income for most titles.

    4. Yes, the report is based on Amazon bestsellers, but it’s way, way more than a study of top-10 lists. One set of data includes the top 7,000 bestselling genre books. Another looks at the top 50,000 bestselling books.

    PG won’t say that the 50,000 book list includes all serious indie authors, but it does include a lot of them.

    ———————————credit to PassiveGuy from thePassiveVoice for the above observations————-

    • Those are fine points, sure.

      That said, the report aims to do more than speak to indie authors — it aims to provide information to and about all authors. As such, a rounder and more complete picture would be — for me! — ultimately desirable.

      As I said, my own numbers through this model would look wonky, and I’m one of those confusing “hybrid” author types.

      Also — “but only a tiny fraction of traditionally published authors ever receive a cent from sales of film and TV rights” — I’d love to see that backed up with data. My experience (as in, not just mine but those seen among others) doesn’t reflect this. Maybe not meaningful for indies, but meaningful in the traditional space.

      Point being, if you’re going to put out a report that aims to inform all writers about their choices, it would be useful to have a report that provides as much data from all corners. Right now, it’s a slice from a space biased toward indies that suggests how great it is to be an indie. Which is not inaccurate, exactly, but it’s also incomplete.

      — c.

      • “The people I know” is not a representative sample. In your case, it’s wildly biased. What percentage of all traditionally published authors (since the 1920’s) do you know? Are the ones you know a representative sample? I doubt it. If you think PG is wrong, you can come up with a better estimate or hope someone else does, but saying “my experience doesn’t reflect this” adds zero value to the discussion.

        The data that @authorearnings gathered and analyzed clearly added positive value to our knowledge. The fact that it is incomplete is inevitable. The fact that it opens new opportunities for gaining additional knowledge is invaluable. We have a general idea of how big Amazon is in the specific genres the report covers. We can even estimate how much of the print market belongs to the top 50 print books. Any author with a basic grasp of statistics should be able to make a much better-informed decision today than they could have last week. I will grant you that there are relatively few authors with a basic grasp of statistics (which is, sadly, true of the general public as well). To that end, I hope to use this weekend to work on an analysis of the data to incorporate some external data that will put this data in the broader context of the market for stories in the covered genres.

        • Gosh, you’re fun.

          Yes, I realize that what I said was anecdotal. And no, I don’t know a high percentage of All Authors Since The 1920s, which is relevant in what way, I have no idea (does the Author Earnings survey go back that far? no.).

          Finally, “but only a tiny fraction of traditionally published authors ever receive a cent from sales of film and TV rights” is also equally without meaning, because it means, mmm, nothing. But I don’t see you going after PG for that, which suggests bias on your end, as well.

          — c.

          • February 12, 2014 at 4:00 PM //

            The 1920s matters because of U.S. copyright law. I like to look at things from a systemic perspective, but I take your point.

            I’m not going after PG because his assertion doesn’t contradict the data available to me. I am not an author and I don’t have any direct knowledge of authors’ income sources. The data that I have access to doesn’t support the contention that the inclusion of non-royalty income would substantially alter my understanding of the @authorearnings report.

            You made a specific claim about the limitations of the report based on a faulty premise. The nature of authors’ income is not a normal distribution. That means that any individual author’s experience contributes very little to our understanding of the data as a whole. You realize that what you said was anecdotal, but you don’t seem to realize that the statement was counterproductive to a reasonable discussion. There are lots of things to debate about this report, but how well your personal experience coincides with it isn’t one.

            You keep saying the report is incomplete and you keep trying to refute it with anecdotes. That doesn’t make sense to me.

          • My personal experience is what I have. It doesn’t have to be emblematic of the data or suggest a flaw in the data, but it allows me — an author — to say, “I don’t fit this mold, and I suspect others don’t, either.”

            I think that’s a perfectly fair addition to the discussion. Particularly since I was addressing PG’s own essentially made-up, data-free point about how many authors earn money from film/TV options/deals.

            At this point, we’re arguing about arguing. If you’d like to keep on doing that instead of talking about the report, it’s your dime, pal.

            — c.

        • William, what’s your external data, and would you care to share it, or at least your sources? My SPSS software is clawing at its cage, but I don’t have enough raw meat to throw at it…

          • February 12, 2014 at 4:11 PM //

            Various sources including some from BookScan (via Publishers Weekly) and the raw data from Pew Research surveys of book readership. I’ve been working on a model of the story market for some time. Pew gives me data about the demand side and sales data gives me the supply side. Pew makes their data available in SPSS format as I recall. But you might not be interested in that.

            Chuck is right about one thing. All the sales data has to be treated with a jaundiced eye. It is sometimes useful in the aggregate, but almost never useful at the level of a particular title. My goal is to build a Bayesian multilevel model that can provide useful information for specific writers. And readers too. I’m not really close yet, but I think I’m making progress.

      • From boxofficemojo for 2013 yearly chart:

        Movies that grossed more than $25 million at the USA boxoffice in 2013: 99
        Movies that grossed more than $10 million at the USA boxoffice in 2013: 131
        Movies that grossed more than $1 million at the USA boxoffice in 2013: 212

        These are movies released in theaters.

        It would be interesting to see the stats for “how many movie/tv rights are purchased each year and the author is guaranteed at least $5,000 from it?”

        Why $5,000? No idea.

  • I think this is a good start on outlining what Konrath calls the “shadow industry” world of author-publishers. While I do know writers who are making big bucks on B&N or Apple, for most of us who self-publish Amazon is the big market where we ply our wares. This snapshot shows that the shadow is actually a pretty big market all on its own. That it’s not the whole market really doesn’t matter to that part of the scene.

    I’m encouraged that Hugh and his data wizard have put their data online for others to poke and prod. I look forward to reading what Ed Robertson or Phoenix Sullivan or William Ockham find once they have time to really dig in. There are a bunch of obsessive number-watchers among self-publishers, and I’m sure they’ll check in when they get time.

    As someone who has done some statistical analysis professionally (though in hydrology not publishing markets), I want to say that it’s not entirely true that you can prove anything you want with numbers. I mean you can, but only if you hide your sources and your analysis.

    The corrective to that is to make your data and your analyses available for others to critique. That’s what academics do, and what Hugh and Co. have done here. They are encouraging everyone to push back, and that’s great. I would only ask that if you’re going to push back you do it with analysis of the data and try to let emotions stand aside a bit. We all seem to let our passions sometimes overwhelm our thoughtfulness on these issues.

    There are lots of little tidbits that are interesting to me personally. Audiobooks as a big emerging market; Amazon’s ability to leverage their marketplace to goose sales of their own titles; the big market for genre fiction that was clearly underserved by publishing as it worked before. And more, but I’ve taken up enough of Chuck’s space here.

    I’m really looking forward to Hugh’s ongoing efforts to expand and deepen the available info. Thanks for keeping an open (though Terrible, of course) mind.

    • Yeah, my self-pub numbers are, with some variance month to month:

      70% Amazon

      20-25% direct sales

      Remainder to B&N.

      I suspected the Nook was in trouble long before anybody said anything publicly because of the absolute dogshit numbers coming out of there.

      I’m a little surprised more indie-publishers don’t attempt to sell directly, though. A not-surprising amount of readers have little interest in supporting Amazon directly.

      — c.

      • Well it’s one more thing to deal with. And if you’re working as a publisher as well as writing, even if you’re farming some stuff out, you already have a pretty full plate.

        • Actually, it’s a lot easier now that I found Payhip.

          Payhip automates the direct purchase and sending of the book.

          It’s worth the time to do and do right, because the royalty on direct is even *better* than the royalty from Amazon. (Er, “royalty.”)

          — c.

          • Thanks for the tip. True about the royalty (can’t beat 100%!), and also it’s good to keep your own income stream because who knows when any given online retailer is going to start dicking us around.

      • Keep in mind that direct sales have no impact on a book’s ranking on bestseller lists, which can be crucial for fledgling writers. At our level, it’s a good service to our customers. Concentrating on direct sales early on could really harm a writer’s chances of working in his underwear one day.

        • That works if you wanna be all in with Amazon, but I think there’s good reason to be wary of that and to want to diversify up front. People who want to read the book and have it on their Kindle are going to grab it from Amazon anyway. But like I said, you’ll find some readers do not want to support Amazon and/or want to give more money to the author directly. Hence, direct sales. Oh — and direct sales can be entirely DRM-free.

          THAT, to me, is good service to our customers — making sure they can buy our books at the location they choose in the way they want in the *format* they want.

          Concentrating on direct sales early carves out a smaller but not insignificant revenue stream. THAT is how an author achieves the Pantsless Dream.

          Final thought: just as indie publishers help to balance and compete with the corporate influence of the Big Six (Five, Four, Three?) publishers, indie publishers can help countermand some of Amazon’s heavy corporate influence by carving out other channels of distribution.

          — c.

  • 1). It’s taking an incomplete view of the data and drawing conclusions. I’m not saying it’s not good to put that data together, but no conclusions about indie vs. traditional can be drawn from it.
    2). The fault of the report isn’t that it doesn’t include sales from physical bookstores; it’s that it doesn’t include physical BOOKS. It explicitly limits its data set to ebooks sold through Amazon, not physical books of the same titles. And then it draws conclusions about indie vs. traditional when the majority of traditional revenue could come from physical books. Again, it’s great that this data is being collected, but it can’t be used the way it’s being used.
    3). I’d also suspect that film and TV rights for mid-list authors are in general small. But it should also be pointed out that this report doesn’t actually get at author earnings at all. There’s no real money in the report. It talks about Amazon’s “sales per ranking,” estimates what that hard number might be, and then multiplies it by the list price. But “sales per ranking” is not something you can use to pay your landlord. Howey links to three sources for it, but none of them are official, and the only one that cites any evidence cites “looking at my own sales data real closely for a year.” That’s not the stuff data is made of. I’d also like to know if the spider was scraping free books, because for Amazon, “sales” = “downloads.” Free downloads don’t feed struggling authors, so it’d be good to get more clarification about this point.
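To make that chain of estimates concrete, here is a minimal sketch of the arithmetic as described in point 3: a sales rank goes in, a guessed unit count comes out, and that guess is multiplied by the list price. The breakpoints in the table are invented purely for illustration and are not the report's figures, and the function names are hypothetical too.

```python
# Hypothetical (max_rank, daily_units) brackets. These numbers are made up
# for illustration; they are NOT taken from the Author Earnings report.
RANK_TO_DAILY_SALES = [
    (1, 7000),
    (100, 1000),
    (1000, 120),
    (10000, 15),
    (100000, 1),
]


def estimated_daily_sales(rank):
    """Return the illustrative daily units for the first bracket covering `rank`."""
    for max_rank, units in RANK_TO_DAILY_SALES:
        if rank <= max_rank:
            return units
    return 0  # ranks past the last bracket are treated as zero sales


def estimated_daily_revenue(rank, list_price):
    """Guessed units times list price: an estimate of gross sales, not author income."""
    return estimated_daily_sales(rank) * list_price
```

Every number that comes out of `estimated_daily_revenue` is only as good as the bracket table behind it, which is exactly the part that has no official source.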

    • I think you’re correct that the connection between rankings and sales needs more backup and explanation. That’s a connection that self-publishers have been documenting on their own for at least a couple years now. The data is around but not pulled together in one place, and in some cases has only been shared privately. I hope Hugh and Co. will make that info clearer for everyone.

      I don’t believe that free downloads would be included in this analysis because the free bestsellers lists are completely separate from the paid lists. Conflating the two would indeed be a big data error. Again, I hope Hugh or his Data Guy will confirm that they did this part right.

    • 1) The data explicitly includes physical books sold by Amazon.

      2) Free giveaways do not count toward Amazon rankings, and therefore the data, when it uses rankings which it does almost all the time, would not be misleadingly affected by free giveaways.

  • No, it’s not malicious! But you’re right. I should be more careful in my terms, but I’m basically worried that it’s such a slippery report, and that it’s more extrapolation than data that actually points to something. So when I say it’s not “real” data, what I mean is that the majority of the report consists of calculations that are run together from different sources.
    1). Amazon sales rank (a proprietary, basically secret, and easily manipulatable number that conceals more than it reveals).
    2). The list price. Was the book actually downloaded at the list price? (It could very well be, but it’d be nice to see verification of this, and specifically how they verified it).
    3). Extrapolations from a snapshot in time using a not-quite-understood Amazon sales rank (which is not real money) to tax brackets. Just because something is in an Excel chart and is assigned a number doesn’t mean it’s data. It’s a conclusion based on data, which is a different thing.

    So I guess I’m not sure the data is real OR complete! By “real,” I simply mean, “It points back to something that actually happened.” The numbers (not data) as presented now don’t give me any confidence in that.

    • It’s very real. “This is what the book was ranked at Time T.” Real. “This is how many books I sold when my book was ranked R at time T.” Real.

      Again, you obviously feel it is inadequate, and possibly to the point of being misleading. As to that, I cannot gainsay you. Might be, at that. It’s certainly not all that precise. (“God invented decimal points so economists could prove they had a sense of humor.”) But saying that data isn’t “real” implies falsification and/or fabrication. I don’t think you’re doing that on purpose, but that is why I am protesting your usage.

    • @marc Amazon Sales Rank is Amazon Sales Rank and sales are sales. The Amazon Sales Rank is real and irrelevant (the sales rank is relative to other books, for one thing). A conclusion about sales based on it is a fictional number, not data, because it obscures its own questionable origins. You might as well have a column in Excel that randomly multiplies the values of the adjacent column. The numbers have no real world referent so I think it’s legit to say they’re not real until the methodological problems are solved. What would help is not making the data public, but the methodology so it can be tested to see if it’s repeatable or to determine if there are inherent data gathering problems associated with prying info out of Amazon.

    • I think I can set your mind at ease about a few things. Amazon sales rank, while proprietary and secret, is not “easily manipulatable”. It serves a purpose for Amazon and it can’t serve that purpose if it doesn’t do what it says it does. To claim that it conceals more than it reveals is a fundamental misunderstanding of what Amazon sales rank is. There is lots of information available about it and I would implore you to become acquainted with it. I completely agree that we need to get a better sense of how well the sales rank maps to unit sales. I’ve proposed a method to the folks at @authorearnings of doing exactly that.

      The price reported in the data was the price the book was listed at on Amazon on the date the data was retrieved. You can verify this yourself. Just pick a random set of numbers between 1 and 1000. For those numbers, pull the list price from the corresponding book in the top 1000 ebook bestsellers at Amazon every day. You will have to note the ASIN of each book as the rankings change frequently. When their next report comes out, you can check your numbers against theirs for the date in question. There’s some room for error, but it shouldn’t be too hard to figure out. Or you could just assume that the list price is a ridiculously stupid thing to screw up. That’s my position.

      On the issue of extrapolations, I agree we should exercise caution with this. The way they are presented, the earnings figures are generalized. Nobody thinks that a title stays at the same rank for an entire year. If their methodology is accurate, you should be able to take the daily ranking of a particular title, look up the projected unit sales for each day’s rank, add up the projected sales over the course of the year and get a number reasonably close to the actual sales. For some values of reasonable and close. I would love to know what reasonably close means in this context. Interestingly, if the errors are normally distributed, the estimate for a particular title doesn’t have to be close for the methodology to work. The aggregate figures might still be correct. However, we don’t know if the errors are normally distributed and I have suspicions that they aren’t. Which would suggest that the individual title sales projections need to be close.

      With all that said, these numbers are so much better than anything else available that they have substantial value when their limits are understood.
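The two ideas in the extrapolation paragraph above can be sketched in a few lines: summing a per-day projection over a title's rank history, and checking how noisy per-title estimates behave once you add them up. This is a toy under loud assumptions: the rank-to-units mapping is whatever function you pass in, and the simulated errors are independent with zero mean, which is precisely the part nobody has verified for real Amazon data. All names here are hypothetical.

```python
import random


def projected_annual_units(daily_ranks, units_for_rank):
    """Sum the projected unit sales for each day's rank over the period."""
    return sum(units_for_rank(rank) for rank in daily_ranks)


def simulate_errors(n_titles=1000, true_units=100.0, noise_sd=50.0, seed=0):
    """Compare the average per-title error with the error of the grand total.

    Assumes every title truly sold `true_units` and every estimate is off by
    independent, zero-mean Gaussian noise (an assumption, not a finding).
    Returns (mean per-title relative error, aggregate relative error).
    """
    rng = random.Random(seed)
    estimates = [true_units + rng.gauss(0.0, noise_sd) for _ in range(n_titles)]
    total_true = n_titles * true_units
    per_title = sum(abs(e - true_units) for e in estimates) / total_true
    aggregate = abs(sum(estimates) - total_true) / total_true
    return per_title, aggregate
```

With these settings the individual estimates are badly off while the total lands much closer, because the errors cancel. If the real errors are biased in one direction, that cancellation does not happen, which is why the shape of the error distribution matters.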

  • @William I’m sure you’re right about the list price, but I think before anyone quits their day jobs to go indie full time, it’d be nice to hear that the spider wasn’t accidentally scraping the Amazon Best Seller Free List as well as the paid list! Amazon has obfuscated things like this in the past (equating sales with downloads in some contexts, but not others).

    I’m still not with you on the Amazon Sales Rank, though. As you say, it serves a purpose FOR AMAZON. That doesn’t mean it serves a purpose for the writer. The main point of agreement I’ve seen regarding it is that it’s not a report of actual sales, but a sort of popularity contest that judges one book relative to others. Again, not on the basis of absolute sales, but on a relative measure of how a given book stacks up to other books at that particular point in time. A slow Amazon-wide day of sales might leave a book at exactly the same rank even though it sold half as many copies as it did the day before. We don’t know, and Amazon’s not telling. Any sort of variability will be increased as you extrapolate out from one hour’s worth of sales, to a day’s worth, to an annual tax bracket.
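The slow-day point is easy to show with a toy ranking. The sketch below ranks titles purely by units sold, which is an assumption made for illustration; Amazon's actual ranking formula is not public, and the titles and figures are invented.

```python
def ranks_by_sales(sales):
    """Rank titles 1..n by units sold, highest first (toy model, no ties handled)."""
    ordered = sorted(sales, key=sales.get, reverse=True)
    return {title: position for position, title in enumerate(ordered, start=1)}


# Invented figures: a busy day, then a day where every title sells half as much.
busy_day = {"Title A": 400, "Title B": 200, "Title C": 50}
slow_day = {title: units // 2 for title, units in busy_day.items()}

busy_ranks = ranks_by_sales(busy_day)
slow_ranks = ranks_by_sales(slow_day)
```

Here `busy_ranks` and `slow_ranks` are identical even though every title sold half as many copies, so a fixed rank-to-sales table would report the same earnings for both days.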

    However, if we can do what you suggest in your point about extrapolations, we might get a much better picture of what someone is actually earning, but it’d be nice to have an external check on that. Howey’s also gathering user-submitted data, so it’d be interesting to see if we could run the same Amazon Sales Rank methodology for a single book for an extended length of time (i.e., not for a single snapshot) and then bump that up against what the author actually reports getting for the period in question. I’m most excited about Howey’s user-submitted data collection, because there’s too much (deliberate?) confusion about sales coming from the publishing world (and not just from Amazon, but from publishers as well). An external check on their data would be wonderful.

  • Along a parallel line, I was researching social engineering and ran across the archive of DefCon 15, which has an article about self-publishing versus vanity press. I haven’t finished it yet, but one of the things it asserts is that the average self-published book sells about 200 copies. I’ll keep looking and see if there are sources for this or follow-ups.

  • I think the whole point of this was to provide information like never before. Hopefully, the industry as a whole will take this as a sign to provide more in depth data.

  • The thing about Hugh’s data is it proves what many have been saying for a long time…that self-published authors are running toe-to-toe with traditionally published authors in terms of sales on the largest bookseller in the world. With higher royalties, even if these authors ONLY get income from Amazon, they are making a really nice living wage. There are thousands of authors, without household names, that are making five and six figure incomes. It proves once and for all that self-publishing can be a viable option for those with the entrepreneurial spirit, and even those seeking traditional publishing should rejoice, since it shows publishers that they are not in competition only with each other, but with the concept of “going it alone.” In such an environment, publishers will have to adjust their contracts and “industry standards” to attract and retain authors.

