Saturday, November 05, 2005

Creating a Master File

Okay enough procrastination on my part...

This is more along the lines of a how-to manual instead of my regular drivel. So I guess this should be filed under “How-to Drivel” or something equally impressive.

If someone happens to have a suggestion, shortcut, alternate route, or much-needed improvement, please don’t hesitate to post it here. The Knights are about feedback and shared ideas. I promise I won’t take it personally.

Creating an opening book is really easy if you know what kind of positions you like to play. The hard part is doing enough research to understand which openings lead to said positions. But then again, the best way to find out what you like is to play through the openings. You also have to decide whether you plan on playing the mainlines or playing systems in an attempt to reduce study time. There are pros and cons to both methods and I’m not even going to set foot in that territory.

Ok, as with any cookbook, you need a list of ingredients:

(I’m using the following, but there are a variety of programs that could be substituted to achieve roughly the same results.)

Chessbase 9 (CB9)
Bookup 2000 Pro [Build 25] (BU)
Fritz 8 (F8)

Reference Material
Encyclopedia of Chess Openings A-E (ECO)

Reconnaissance and “Rough Draft”
As White I play 1. e4. Of course, after pushing my King’s pawn two squares forward, my opponent gets his/her say in the matter, so I want to be as prepared as I can.
Since I don’t know who my opponent is or what they might play, I start with the first of a seemingly long list of possibilities.
For this example I’m only worried about 1...e5.
I meet 1...e5 with 2. Nf3. Using the ECO index I know that 1.e4 e5 falls under the C grouping. I’m not worried about possible transpositions at this point; I’m after information. So I fire up CB9, open a new board, enter the previous moves, and hit the reference tab. This is when the fun begins: CB9 scans the database that I have chosen as my main source of information. (I’m using Megabase 2005, since it contains the most games and strongest players.) After a brief moment or two (depending on your computer speed) CB9 reports what I can expect to see as Black’s second move, in order of frequency played.
The reason I use the “reference tab” as opposed to an “opening report” is because the “reference tab” will find transpositions and is faster for the needs of my rough draft.
If I want more detail I can do an “opening report”.
My database reports 286,513 games with this position. Notice I didn’t say move order, because the position could in theory be reached by several move orders:
(1.Nf3 e5 2.e4), (1.Nf3 e6 2.e3 e5 3.e4), etc. [I just threw this tidbit in now because it is easier to explain transpositions with a simple example than with something 6 moves deep. Now forget that I mentioned transpositions, and remember that we are looking for a list of Black’s second-move possibilities. :)]
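
To make the transposition idea concrete, here is a minimal Python sketch (not anything CB9 does internally, as far as I know). It plays the three move orders above on a bare-bones board and compares the resulting positions. The helper names `start_position` and `play` are made up for this illustration, moves are entered as plain from-to squares, and there is no legality checking, castling, or en passant handling.

```python
def start_position():
    """Return the initial piece placement as {square: piece}."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, file in enumerate("abcdefgh"):
        board[file + "1"] = back_rank[i]           # White pieces (upper case)
        board[file + "2"] = "P"
        board[file + "7"] = "p"
        board[file + "8"] = back_rank[i].lower()   # Black pieces (lower case)
    return board


def play(moves):
    """Apply from-to moves like 'g1f3'; return (placement, side to move)."""
    board = start_position()
    for move in moves:
        src, dst = move[:2], move[2:4]
        board[dst] = board.pop(src)                # move the piece, no legality check
    side = "white" if len(moves) % 2 == 0 else "black"
    return frozenset(board.items()), side


main_line = play(["e2e4", "e7e5", "g1f3"])                 # 1.e4 e5 2.Nf3
via_nf3 = play(["g1f3", "e7e5", "e2e4"])                   # 1.Nf3 e5 2.e4
slow_way = play(["g1f3", "e7e6", "e2e3", "e6e5", "e3e4"])  # 1.Nf3 e6 2.e3 e5 3.e4

print(main_line == via_nf3)
print(main_line == slow_way)
```

All three orders leave the same pieces on the same squares with Black to move, which is why a position search finds games that a move-order search would miss.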

The report tells me that Black has played the following:
2...Nc6 (237,048) 83%*
2...Nf6 (31,153) 11%
2...d6 (14,147) 5%
2...f5 (1,904) 0.7%
2...d5 (938) 0.3%
2...Qe7 (533) 0.2%
2...Bc5 (318) 0.1%

{* These are percentages that I have added to help throw some perspective on what to expect at this point.}
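
The percentages are just each move’s game count divided by the total for the position. A short Python sketch of that arithmetic, using the counts from the list above (the `share` helper is a name invented for this example):

```python
total_games = 286_513

counts = {
    "2...Nc6": 237_048,
    "2...Nf6": 31_153,
    "2...d6": 14_147,
    "2...f5": 1_904,
    "2...d5": 938,
    "2...Qe7": 533,
    "2...Bc5": 318,
}


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


for move, n in counts.items():
    print(f"{move:<8} {n:>8,} {share(n, total_games):6.2f}%")
```

For instance, 2...f5 works out to roughly 0.66 percent of the games, so it shows up about once in every 150 games that reach this position.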

There are even more moves than I have shown, but the number of times those moves have been seen in tournament play drops off significantly the farther we get from the top of the list. Does 2...Qg5?? really need to be prepped?
Since I want to get through with this sometime before the turn of the next century, I need to establish some guidelines as to how far I’m willing to prepare.
Where do I draw the line? That’s a tough one to answer, because the variations will continue to fluctuate along with the frequency. So it is at this point that I reach for my ECO for a little additional guidance. (I would probably skip the first step and just head straight to the ECO if it weren’t for the fact that some of the data in the ECO assumes you have the previous editions.)

Since the first five have the highest occurrence percentages I will start with them.
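
One way to sanity-check where the line gets drawn is to add up how much of the database the top few replies cover. The sketch below is only an illustration: I picked the first five by eye, and the 99.5 percent target and the `moves_to_cover` helper are assumptions made up for this example.

```python
def moves_to_cover(counts, total, target_percent):
    """Pick the most frequent moves until they cover target_percent of games."""
    picked, covered = [], 0
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for move, n in ranked:
        if 100.0 * covered / total >= target_percent:
            break                      # target already reached, stop adding moves
        picked.append(move)
        covered += n
    return picked, 100.0 * covered / total


counts = {
    "2...Nc6": 237_048, "2...Nf6": 31_153, "2...d6": 14_147,
    "2...f5": 1_904, "2...d5": 938, "2...Qe7": 533, "2...Bc5": 318,
}

picked, coverage = moves_to_cover(counts, 286_513, target_percent=99.5)
print(picked)
print(f"{coverage:.2f}% of games covered")
```

With these counts the first five replies account for a little over 99.5 percent of the games, so everything below them is a rounding error by comparison.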
1.e4 e5 2.Nf3 Nc6 is shown as group C4. Turning to the C4 chapter gives me a list of 10 diagrams that are numbered in succession C40-C49.
Here is where knowing what openings you want to play helps, and if you don’t know, this is a great place to gain some exposure.

C40 1.e4 e5 2.Nf3
C41 1.e4 e5 2.Nf3 d6
C42 1.e4 e5 2.Nf3 Nf6
C43 1.e4 e5 2.Nf3 Nf6 3.d4
C44 1.e4 e5 2.Nf3 Nc6
C45 1.e4 e5 2.Nf3 Nc6 3.d4 ed4 4.Nd4
C46 1.e4 e5 2.Nf3 Nc6 3.Nc3
C47 1.e4 e5 2.Nf3 Nc6 3.Nc3 Nf6
C48 1.e4 e5 2.Nf3 Nc6 3.Nc3 Nf6 4.Bb5
C49 1.e4 e5 2.Nf3 Nc6 3.Nc3 Nf6 4.Bb5 Bb4

The only codes that apply to me at the moment are C40, C41, and C42:
C40 covers all of the offbeat responses such as the Latvian and Elephant Gambits.
C41 covers the Philidor lines.
C42 covers the Russian/Petroff Classical lines.
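
The lookup I do by hand in the book can be sketched as a longest-prefix match against the C4 table above. This is a simplification for illustration only: the `classify` helper is invented here, the table holds just the ten C4 lines in the same shorthand as the list, and real ECO classification works on positions rather than raw move order.

```python
# The C4 table from above, with move numbers stripped.
ECO_C4 = {
    "C40": "e4 e5 Nf3",
    "C41": "e4 e5 Nf3 d6",
    "C42": "e4 e5 Nf3 Nf6",
    "C43": "e4 e5 Nf3 Nf6 d4",
    "C44": "e4 e5 Nf3 Nc6",
    "C45": "e4 e5 Nf3 Nc6 d4 ed4 Nd4",
    "C46": "e4 e5 Nf3 Nc6 Nc3",
    "C47": "e4 e5 Nf3 Nc6 Nc3 Nf6",
    "C48": "e4 e5 Nf3 Nc6 Nc3 Nf6 Bb5",
    "C49": "e4 e5 Nf3 Nc6 Nc3 Nf6 Bb5 Bb4",
}


def classify(moves):
    """Return the code whose line is the longest prefix of the moves played."""
    played = moves.split()
    best_code, best_len = None, -1
    for code, line in ECO_C4.items():
        prefix = line.split()
        if played[:len(prefix)] == prefix and len(prefix) > best_len:
            best_code, best_len = code, len(prefix)
    return best_code


print(classify("e4 e5 Nf3 f5"))      # Latvian Gambit
print(classify("e4 e5 Nf3 d6 d4"))   # Philidor
print(classify("e4 e5 Nf3 Nf6 d4"))  # Petroff with 3.d4
```

Because the table stops at C49, anything after 2...Nc6 that belongs to a later group simply falls back to C44 here, so a fuller table would be needed for the rest of the book.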

The rest head into lines that I don’t happen to play at this moment, such as the Three and Four Knights games and the Scotch.
All five of Black’s responses have been addressed, except that I don’t see my response to 2...Nc6 in this group.
I meet 2...Nc6 with 3.Bc4. It must be in another set of ECO codes (C5 Group), so it’s back to the book to repeat the above process. Fortunately, since I am trying to head into Giuoco Piano waters, I get to eliminate ten tons of theory by avoiding the vast expanse known as the land of the Ruy Lopez. [Thank You Predrag! :)]
(I could also find each of the ECO codes using CB9 by selecting Tools/Opening Classification. I would have to scroll through each of the move orders, which would take some additional time. The advantage would be that it gives me the name of the general defense. There are plenty of on-line resources that list the moves with the related names in great detail; just do a search on “Openings classified by ECO code”. I usually just grab the code first and eventually the name.)
Now I know what ECO codes I can use to find or filter games.
Whether I create the files myself with CB9 or just download the games it really doesn’t matter. The main thing is that I want them in .pgn format because they are ultimately headed into a soon to be created Bookup file/book.
So now I gather all of the .pgn games I can find for each of the necessary ECO codes.
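
If the games come in one big .pgn file, pulling out the ones with the right ECO codes can be sketched in a few lines of Python. This assumes each game starts with an `[Event ...]` tag and carries the optional `[ECO "..."]` tag; the helper names and the two sample games are made up for the example.

```python
import re


def split_games(pgn_text):
    """Split PGN text into games; each game is assumed to start with [Event."""
    chunks = re.split(r"(?m)^(?=\[Event )", pgn_text)
    return [c.strip() for c in chunks if c.strip()]


def eco_of(game):
    """Return the game's ECO code from its header, or None if it has none."""
    match = re.search(r'^\[ECO "([A-E]\d\d)', game, re.MULTILINE)
    return match.group(1) if match else None


def filter_by_eco(pgn_text, wanted):
    """Keep only the games whose ECO tag is in the wanted set."""
    return [g for g in split_games(pgn_text) if eco_of(g) in wanted]


# Two invented games, just to exercise the filter.
SAMPLE_PGN = """[Event "Example 1"]
[ECO "C41"]

1. e4 e5 2. Nf3 d6 *

[Event "Example 2"]
[ECO "B20"]

1. e4 c5 *
"""

kept = filter_by_eco(SAMPLE_PGN, {"C40", "C41", "C42"})
print(len(kept), "game(s) kept")
```

Games without an ECO tag are dropped by this sketch, so a file exported without classification would need to be classified first.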

It’s not really important to have the most recent games or the strongest players for this part of the book building. All I’m really after is a lot of variations to create a master file for 1.e4 e5 2.Nf3 ... It is faster to prune lines out of an opening book than it is to add them. I learned this the hard way.

Now I fire up Bookup 2000 Pro and create a new Book called “E4-E5 Master”. You could call it anything you wish; it doesn’t really matter as long as you can keep track of it.

Select PGN/Import Games->select the .pgn file->Reduce the number of plies to import down to 24 (12 moves deep is plenty for my level of play)->De-select Highlight novelties->Click Ok. Repeat the process for each of the needed .pgn files.
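
For anyone curious what the import amounts to, here is a rough Python sketch of cutting each game off at 24 plies and merging the games into one tree of variations. It is not how Bookup stores anything (I don’t know its internals); the `plies`, `add_game`, and `build_tree` names are invented, and the parser only copes with plain movetext, not nested variations or NAG symbols.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def plies(movetext, max_plies=24):
    """Return the first max_plies half-moves from simple PGN movetext."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)   # drop {comments}
    moves = []
    for token in text.split():
        token = re.sub(r"^\d+\.+", "", token)    # strip "12." or "12..."
        if token and token not in RESULTS:
            moves.append(token)
    return moves[:max_plies]


def add_game(tree, moves):
    """Walk down the tree, creating a branch for each move not seen before."""
    node = tree
    for move in moves:
        node = node.setdefault(move, {})


def build_tree(games, max_plies=24):
    """Merge many games into one nested dict of variations."""
    tree = {}
    for movetext in games:
        add_game(tree, plies(movetext, max_plies))
    return tree


games = [
    "1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 1-0",
    "1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 *",
]
tree = build_tree(games)
print(sorted(tree["e4"]["e5"]["Nf3"]))   # Black's second moves seen so far
```

Two games that share their first three plies end up sharing one branch, which is why dumping in thousands of games does not produce thousands of separate lines.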

Once this is accomplished there are just a few steps remaining before I can say that I’m finished with the Master file. Commands/Select “Clear Assessments”->Ok. This strips any numerical assessments from all of the positions in the book. (This may or may not be necessary, but I do it anyway.)

Now back to Commands/Select “Clear Rate Symbols”->Ok. This is necessary to clear the stray evaluations, and give you a clean slate. While doing the previous two steps you will have plenty of time to grab a beverage or a snack between commands.

You will want to back up your newly created “Master” file. I just create a sub-folder for each master and copy the Bookup files into it. Just be sure that you get all of the files for each “Book”, because Bookup uses a multitude of files.
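
The copy step can be scripted so no file gets left behind. The Python sketch below rests on a big assumption that I have not verified: that every file belonging to a Book shares the Book’s base name and differs only in its extension. The `back_up_book` function and its arguments are hypothetical.

```python
import shutil
from pathlib import Path


def back_up_book(book_dir, book_name, backup_root):
    """Copy every file named book_name.<anything> into backup_root/book_name.

    Assumption: all files of a Book share its base name (not verified).
    """
    source = Path(book_dir)
    target = Path(backup_root) / book_name
    target.mkdir(parents=True, exist_ok=True)    # one sub-folder per master
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file() and path.stem == book_name:
            shutil.copy2(path, target / path.name)   # copy2 keeps timestamps
            copied.append(path.name)
    return copied
```

Calling `back_up_book(folder_with_books, "E4-E5 Master", backup_folder)` returns the list of file names it copied, which makes it easy to eyeball whether anything is missing.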

After making a back-up Book of the “E4-E5 Master”, as a safety check I would close out of all of the new “Books” and then reopen them one at a time to make sure they function properly. After they check out, close the “Back-Up Master” and rename the Book something like “E4-E5 Work”.

Finally... if you have managed to make it this far give yourself a pat on the back because that was a lot of work, and we have only reached the foot of the mountain.

We will start pruning lines in our next installment.


Blue Devil Knight said...

Excellent, helpful post. I am just starting to realize that my opening needs a "little" work. Don't ever delete your blog, please, as it will be like a library burning down.

Pawnsensei said...

I agree with BD. That is one detailed infopost. It's like one of those Chessbase articles.


Pale Morning Dun - Errant Knight de la Maza said...

Agree with PS, you're giving ole Steve Lopez a run for his money. I have Bookup express, but not professional.

One thing I can't figure here is, and correct me if I'm wrong, whether you are importing all the games from CB9 to bookup with the lines you are concerned about. That sounds like what is going on. Are you pulling games that only white wins? And I'm guessing with the Professional Build of Bookup, you are then limiting the depth of all those games to the first 12 moves. Is that about right?

Sounds like a great way to develop a powerful opening book.

Sancho Pawnza said...

Hi pmd,
Heh, thanks for the compliment but I’m pretty sure Steve isn’t quaking in his boots just yet. He’s only ahead in the how-to department by about 4 or 5 hundred published articles, and has probably forgotten more about using databases than I will ever know.
If our paths ever cross I will be sure to buy him as many beers as he cares to drink, just a small token of my appreciation for all of his excellent articles.

Now to your question(s)
I'm pulling all of the games I can find related to each of the lines.
I include wins from both sides. All I'm really after is just the possible moves that can arise out of each of the openings. If I had a raw base file that contained a complete log of every possible move that could arise after every possible position, I would just use that as my starting point, reusing the master file by doing a "save-as" and naming it after whatever opening I'm trying to study.
There are 20 possible moves available for White at move 1, and then Black has 20 possible replies for each of White's possibilities. You can see how quickly it gets out of hand if a person were to try and create such a book.
Size wise and speed wise it's easier just to dump games into one file then start pruning.
Once I cover the “pruning” phase you will see that who played and who won really won’t matter. All we are trying to establish is a path through the minefield. By creating such a book it allows us to reach a playable middle game, save time on our clocks, and it gives us something that we can review, study, and update just like the tactical puzzles.
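
The 20-by-20 count in the reply above is easy to check, and a crude projection shows why a "every possible move" book is hopeless. In this Python sketch the first two figures are exact; the later ones rest on the loudly labeled assumption that every ply keeps offering about 20 choices, which real positions do not.

```python
pawn_moves = 8 * 2       # each of the 8 pawns can advance one or two squares
knight_moves = 2 * 2     # each knight has two squares to jump to
first_moves = pawn_moves + knight_moves

print(first_moves)                  # White's options at move 1
print(first_moves * first_moves)    # move sequences after one move each

# Crude assumption: pretend every later ply also offers about 20 choices.
for ply_count in (2, 4, 6):
    print(ply_count, "plies:", first_moves ** ply_count)
```

Even under that rough assumption, three moves by each side already gives tens of millions of sequences, so starting from real games and pruning is the only practical route.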

King of the Spill said...

Great post, very comprehensive and well thought out. I think it shows how involved a realistic opening study program is.

I was wondering, have you changed your style of play because of the extra opening knowledge? Also, one opening in particular I was curious if you had an opinion on was the Nimzo-Indian. Whenever I face this as White against somebody who has done their homework, I invariably struggle after move 5. It seems like that defense gives multiple ways that Black can positionally counterattack to equalize or even gain the advantage.

Pawnsensei said...

Hey Sancho. I'm going to need some help setting up CB9. I am lost.....lost!


Sancho Pawnza said...


"I was wondering, have you have changed your style of play because of the extra opening knowledge?"

Hmmm... That's kind of a difficult question to answer. I guess the main difference is that I'm no longer afraid of playing openings that offer long-term strategic possibilities from equal positions as opposed to trying to play the sharpest lines I can find.

“Also, one opening in particular I was curious if you had an opinion on was the Nimzo-Indian.”
It is one of the openings I play against d4 (if allowed) as Black.

“Whenever I face this as White against somebody who has done their homework, I invariably struggle after move 5.”
Well at move 5 you are really only 2 moves into the Nimzo. The real fun hasn't even started. :)

“It seems like that defense gives multiple ways that Black can positionally counterattack to equalize or even gain the advantage.”
There are lots of plans that Black can follow to try and gain the advantage; it really depends on how White handles the threat after 3...Bb4. You have to keep in mind why Black even bothers to try and give up Bishop for Knight.
I once read that it is more important for a player to try and grasp the basic plans and strategies in an opening than it is to learn a bunch of lines. Because once you understand the plans behind the given formations it makes it easier to find the good moves.

Even though I managed to dump a ton and a half of games into the creation of each opening master file, you will be surprised at how few lines I will actually keep for review.

Hey PS,
Just let me know when you want to get together and I will be more than happy to help you get CB9 set up, and start showing you some of the basics that I know and use.