Meta-Post: New Mascot, New Language, New Database

It has been quiet here at the Use The Index, Luke blog lately. But that’s not because I’ve run out of topics to write about — in fact, my blog backlog seems to be ever growing — the recent silence is just because there are some more demanding projects happening at the moment.

First of all, Use the Index, Luke got a new mascot—not exactly breaking news. However, I’m currently preparing give-aways and merchandise products featuring the new mascot. Stay tuned.

Next, Use The Index, Luke gets translated to Japanese! The first two chapters have just been published. Insiders will remember that chapter 1 and 2 make up half of the book. The translation is done by Hayato Matsuura, Takuto Matsuu has a second look over it. As far as I can tell both are making a great job and I’d like to be the first to thank them! Please help spreading the word about this in the Japanese community.

Finally, I’m just adding DB2 as a first-class citizen to Use The Index, Luke because a client wanted to get my SQL performance training based on DB2 LUW Express-C (which is free, by the way). Like the Japanese translation, this work is not yet finished. However, the appendix on execution plans is already there. Again, please help spreading the word about this in the DB2 community.

That’s it. I just wanted to give you a short update.


MongoDB is to NoSQL like MySQL to SQL — in the most harmful way

Yesterday evening I tweeted: “MongoDB seems to be as bad for NoSQL as MySQL is for SQL.” Unfortunately, I tweeted without context. But I guess I couldn’t have given all the required context in a single tweet anyway, so I’m dedicating this post to it. I hope this answers some of the questions I’ve got in response to the tweet.

First of all, I think everybody should know that I’m not a NoSQL fanboy, yet I’m open to the polyglot persistence idea. This distinction doesn’t seem to make sense if you read NoSQL as "not only SQL" (as you are supposed to do). However, I believe there are NoSQL systems out there that greatly benefit from the idea that SQL is bad and not using SQL is good. On other words, they offer “not using SQL” as their main advantage. MongoDB seems to be one of them. Just my perception.

But if I don’t like NoSQL, then I should like MySQL? Not exactly. In my eyes, MySQL has done great harm to SQL because many of the problems people associate with SQL are in fact just MySQL problems. One of the more important examples is that MySQL is rather poor at joining because is only supports nested loops joins. Most other SQL database implement the hash join and sort/merge join algorithms too—both deliver better performance for non-tiny data sets. Considering the wide adoption of MySQL (“The most popular open source database”) and the observation that many people move away from SQL because “joins are slow,” it isn’t far-fetched to say that an implementation limitation of MySQL pushes people towards NoSQL.

Now let’s look at MongoDB. I think the direct competition between MongoDB and MySQL became most obvious in the epic video “MongoDB is Web Scale.” In the meanwhile, MongoDB even claims to be “the leading NoSQL database” — does that sound like “the most popular open source database”? Nevertheless, MongoDB has disappointed many people because it couldn’t live up to it’s promise of “web scale” (example: global write lock up to release 2.2).

The next piece in the puzzle that eventually caused me to tweet was a funny tweet by Gwen (Chen) Shapira (she’s an Oracle DB consultant):

#mongoDB : the big data platform that is challenging to scale over 100GB. http://blog.mongodb.org/post/62899600960/scaling-advice-from-mongohq

Note that the link was broken for a while (the post originally appeared on Sep 30, then disappeared, but is online since Oct 2 again at a different URL). The article is about handling MongoDB if it grows above 100GB. It gives me the impression that scaling MongoDB to that size is a serious issue. Even though there is no exact definition of “web scale” I guess most people would assume that it should be easy to scale MongoDB to 100GB. 100GB is not big data nowadays. 100GBs can be easily managed with most SQL DBs (joining in MySQL could be a problem). It was really funny to see this post on the MongoDB blog. Chen’s tweet nailed it.

At this point, I was once more thinking about the “misspent half-decade” mentioned by Jack Clark in his article “Google goes back to the future with SQL F1 database.” But as mentioned before, I like the idea of polyglot persistence. I’m not saying NoSQL is bullshit—not just because a single implementation fails to deliver. That would be like saying SQL is bullshit because MySQL is bad at joining. On the contrary, it reminded how Alex Popescu lost his temper in his post “The premature return to SQL” last Friday. His response to the “misspent half-decade” was:

Just take a second a think what we got during this misspent half-decade: Redis, Cassandra, Riak, a multi-parallel fully programmatic way to process data, Cascading, Pig, Cypher, ReQL and many more tools, languages, and APIs for processing data.

Well, I don’t know all of these but I do realize that some of them are interesting tools to have in the tool box. Further, I’m following Alex Popescu long enough to know that he is rather reflective on NoSQL—the title of his post being an exception. That’s why I came back to his post to see if he mentioned “the leading NoSQL database“ in his list. He didn’t. I don’t think it’s a coincidence.

At this point it was inevitable to see MongoDB as a popular, yet poor representative of it’s species—just like MySQL is.

If you like my way to explain things, you’ll love SQL Performance Explained.


Try it online!

Today marks the third anniversary of Use The Index, Luke! And I have to fulfill a promise I gave one year ago: You can now test many of the example from this site online at SQLFiddle.com.

Here is a trivial example how it looks like. Just click on the SQL Fiddle logo on the right top corner of the execution plan.

SELECT first_name, last_name
  FROM employees
 WHERE employee_id   = 123
   AND subsidiary_id = 30
Try online at SQL Fiddle+----+-----------+-------+---------+---------+------+-------+
| id | table     | type  | key     | key_len | rows | Extra |
|  1 | employees | const | PRIMARY | 10      |    1 |       |

As before, MySQL is able to use access type const because the query cannot match more than one row. Note that the key lengths (key_len) has become bigger because it now uses two columns of the index. See ??? for more details.

Try online at SQL Fiddle---------------------------------------------------------------
|Id |Operation                   | Name         | Rows | Cost |
| 0 |SELECT STATEMENT            |              |    1 |    2 |
|*2 |  INDEX UNIQUE SCAN         | EMPLOYEES_PK |    1 |    1 |

Predicate Information (identified by operation id):
   2 - access("EMPLOYEE_ID"=123 AND "SUBSIDIARY_ID"=30)
Try online at SQL Fiddle                QUERY PLAN
 Index Scan using employees_pk on employees 
   (cost=0.00..8.27 rows=1 width=14)
   Index Cond: ((employee_id   = 123::numeric)
            AND (subsidiary_id = 30::numeric))
SQL Server
Try online at SQL Fiddle
|--Nested Loops(Inner Join)
   |--Index Seek(OBJECT:employees_pk,
   |               SEEK:employee_id=@ AND subsidiary_id=@2
   |            ORDERED FORWARD)
   |--RID Lookup(OBJECT:employees,
                 LOOKUP ORDERED FORWARD)

As I said—a trivial example borrowed from chapter 2.

I have to admit that not all examples are available at SQL Fiddle yet. At the moment I’m finishing the examples of chapter 2. However, if you have read SQL Performance Explained you know that chapter 2 makes up half of the book. In other words, half of the book is already available at SQL Fiddle.

I hope this online experience makes Use The Index, Luke an even more awesome learning resource. A large part of this additional awesomeness is owed to Jake Feasel who built SQL Fiddle. Please note that you can flattr SQL Fiddle and donate via PayPal (on the right top of the page).

If you have not yet read the book, please have a look at the table of contents now and remember that you are just one click away from actually running the examples shown in the book. Learning about SQL performance has never been that easy ;)

Original title and author: “Try it online!” by Markus Winand.


I need your help!


Would you please spare me a few minutes and help me a little bit?

As you might know—or maybe not—I’m making my living as an independent trainer and consultant. Up till now I’ve only delivered on-site training at the clients’ site, but I though it makes sense to offer open trainings as well so that singe participants can also join. For that I’d need to know how many people would like to join such a training, where they are physically located, and which database they are using. So, I’ve set up a short survey:


It’s just one form and won’t take much time. Naturally, you are under no obligation if you fill out the form, and as a way of saying thank you for your time, I’ll be drawing three participants to receive a DON’T PANIC towel—the perfect preparation for the upcoming Towel Day on 25th May. So take a minute to complete the survey now because the deadline is 26th of April!



Original title and author: “I need your help!” by Markus Winand.


Customs and, well, errata…

I’ve finally managed to put all the errors that were reported by readers so far online. I’m not exactly proud of this, but there are already more than 30 known errata for the German edition. There are still some German copies of "SQL Performance Explained" on stock, but it seems like I have to arrange a second printing soon. I will surly take up all these errors for the second printing. The English edition has just two errata at the moment, but I doubt it will stay that low. You are more than welcome to report any mistakes you find: I do follow them up :)

And now, something different: customs. I’ve received some inquires if I can indicate how much customs you might have to pay if you order the book from outside the EU. Well, I cannot tell you the exact amount, but help you Google it ;) The important information you need to find the rate in your country is the so-called HS number and the origin of the goods. I’ve to put this information on the CN22 customs declaration on the outside of the package, so authorities know what’s inside. The HS number for books is 490199 and this particular book is printed in Austria. That’s what I write on the customs declaration form. I’ve also been told that some countries might charge other fees besides customs (e.g., VAT). Sorry, don’t know any more details.

However, I’m asking all international buyers to report if and how much fees they had to pay. Until now, I’ve not received any feedback—but I’ve done only one international shipping so far. Next one is due on Tuesday.

Original title and author: “Customs and, well, errata…” by Markus Winand.


Finally: Almost World-Wide Free Shipping

The English edition of “SQL Performance Explained” is finally available. It has its own website http://sql-performance-explained.com/ where you can find a PDF preview and order information. The cover price is EUR 29.95 or GBP 26.99.

If you order it from outside the EU, you must also consider that you might be required to pay import duties in the destination country.

Knowing that I have readers all over the world, I’ve spent a fair amount of time to make world-wide free shipping possible. I almost succeeded. If you use the direct order form I’ll be able to offer free shipping to 150+ countries. However, this is only possible when using a volume discount offered by the Austrian postal service. Therefore, I’ll dispatch international orders only once a week.

Fine print: There are some countries that do not accept dutiable letter post. These are: Afghanistan, Albania, Azerbaijan, Bangladesh, Belarus, Benin, Burkina Faso, Cambodia, Chile, Colombia, Côte d’Ivoire (Rep.), Cuba, Dem. People’s Rep. of Korea, Djibouti, El Salvador, Kazakhstan, Latvia, Mali, Mauritania, Moldova, Nepal, Peru, Russian Federation, Turkmenistan, Ukraine, Uzbekistan and Venezuela (see Article X on page 79 of the Universal Postal Convention).

If you live in one of those countries, please feel free to use the direct order form anyway. There is no obligation in filling in the form. If I know that you live in one of these countries and would like to get the print edition, I can check how I could send it to you anyway.


Update Performance

Todays installment covers update performance. As slow as insert and delete together, but sometimes still faster.


Window-Functions for Pagination

Todays installment finalizes the Chapter on partial results and shows how to use SQL:2003 window-Functions for an offset-based pagination query in SQL Server and Oracle.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com