Chess Database Basics

by Steve Lopez

Originally published as a series of articles by ChessBase GmbH in 2004; reprinted and revised here by permission of the author.

Technical writing isn't like any other kind of writing. It's not (necessarily) telling a story. It's not like creating a poem in iambic pentameter or a haiku in which strict rules of form must be followed. It's not like writing a newspaper article in which the "five W's" should be provided in the opening paragraph. And writing a technical column for a general and varied audience is a somewhat dicey proposition. You're walking a high wire, in that you don't want to make things so complex that it shoots over the heads of beginners while, at the same time, you need to provide enough "meat" for the more advanced members of the audience.

I was reminded of this the other night as I was browsing a used book store's History section. I was all set to buy a "general overview" of the American Civil War when I suddenly realized that I don't need it. I've read a dozen such books and I derived absolutely nothing new from the last four or five of them. So I saved my hard-earned ducats by putting the book back on the shelf. "Maybe somebody new to the ACW will benefit from it," I thought. And that's when a new thought hit me: I'd "graduated" from the proverbial "Civil War 101 class" ages ago. Consequently, there are many bits of knowledge I take for granted that a reader new to that historical period wouldn't know.

Tonight as I was tearing up the highway between the ditches on I-70 I made yet another related connection by reversing the thought. There are a lot of things I take for granted as a chess writer that might not be "common knowledge" for new readers of my columns. It's not that I'm terribly smart (I'm not) but rather that I have over a decade of experience with chess computer software and there are things I learned in the early going that aren't at all obvious to someone who's just bought his first PC chess program. Over the years in which I've been writing chess software instructional materials I've tried to strike a happy medium between the beginning and advanced software users. I've tried to make the articles as simple as a _______ for Dummies book while simultaneously providing enough meat for the folks who've been around the block a few times.

But I've been recently reminded that some readers have become lost in the shuffle. Sure, almost anyone can install a chess program and be playing a game against the computer within a few minutes. However there are a lot of extra features that baffle new users, features that I take completely for granted.

One of these is the chess database. I'm discovering that there are a lot of recent converts to electronic chess tools who have no idea what a database is or what one is supposed to accomplish by using one. So in this lengthy article we're going to look at chess databases -- and we're going to do it from scratch with no assumptions being made on my end. We'll start with generalities and then move on to the specifics. We'll use Fritz as our "example" program, but a fair little bit of the ground we'll cover will be applicable to any chess software program that contains database functions (especially ChessBase, as database searching is the primary function of that software, as well as Fritz' sister programs: Hiarcs, Junior, Shredder, and Rybka).

I'm going to try a different approach with this piece. There will be places where extra exposition will be useful but not required. So I'm going to add this extra explanation in the form of footnotes. I can already hear some of you groaning but, trust me, it'll be pretty painless. The extra footnoted material will appear in red lettering and will be interspersed throughout the article instead of appearing at the end of the piece. If you don't want to read them you can easily skip down to the next paragraph that appears in standard black type. If the approach works (and I think it will), great. If not, it was an effort worth the attempt.

The first thing we'll need to do is define a "database". A database is any collection of related information. When new PC users think of a database, they think of some monstrous master collection of information containing all of the world's knowledge (like that computer in the original version of the movie Rollerball or the Enterprise's computer on Star Trek). Sure, that description works but it's not very accurate. A database doesn't have to be "monstrous" -- heck, it doesn't even have to be what most people think of when they hear the word "information". I run an online game league and keep a database of the league's players: names, nicknames ("handles"), e-mail addresses, etc. It's not a huge group; maybe a couple of dozen players participate. But this collection of useful (to me, at least) information by definition constitutes a database.

Back in the pre-electronic age, one would have needed to use a somewhat modified definition for the term "database": any organized collection of related information. Let's look at a familiar example from everyday life: your local telephone directory. Unless you live someplace like Elk Hills, Wyoming (population 45), the phone book would be totally unusable (and therefore worthless) if it wasn't organized. If a resident of Elk Hills needs to look up a phone number, it's not crucial for that list to be an organized one since it'll fit comfortably on a single sheet of paper; he can just visually scan the list until he finds the number he needs. But a denizen of a major metropolitan area needs organization in his list or else he's gonna find bupkis when he looks for a particular number.

The standard organizational method for a printed telephone book is to list people alphabetically by last name. This has worked wonderfully well for decades. If you need to find Ralph Callahan's phone number you just flip to the "C's", then to "Ca", and so on until you find "Callahan". Then you check the first names/initials (also conveniently alphabetized) until you see the "R's". And you should quickly find your old pal Ralphie's number.

There are other ways to organize telephone directories. Way back in the day you used to be able to purchase a printed phone book whose organizational method was numerical order by phone number. But early telemarketers found this to be a useful tool for locating people to annoy at dinnertime, so you generally can't get these books anymore. Another approach that's still in alternative use in some places is a "city directory" which groups listings by neighborhoods/street names. But the tried and true method is still the alphabetical phone book which has worked very well.

The approach does have occasional (sometimes embittering) limitations. Say you meet a really hot gal named Evelyn down at the nightclub and she winds up inviting you home. You gladly accompany her and spend a wonderful evening at her place over on Market Street. As you're groggily leaving the next morning, she breathlessly whispers, "Call me?" as you depart. It's not until after you get home that you realize that she never gave you her number, you never found out her last name, and that you were so hungover when you left her place that you didn't take note of which building she lives in -- and Market's a pretty long street.

So what do you do? If all you have is an alphabetical phone book, you're sunk; enjoy the memory and learn to live with the loss. If you have a "city directory" style of book available you have at least a shot at finding her number: just look up all the people on Market whose first names start with "E". Of course if she gave you a fake first name this won't help you either -- but at least there's a ray of hope.

This is exactly why electronic databases (as opposed to paper ones) are becoming so popular. With a decent electronic database you at least have a shot at pulling up the info you need in a fast easy manner. In the above case, you'd load the software, do a search with "Market" as the "Street" parameter and "E" as the first initial, and maybe score a few hits; with luck one of them will be the cutie from down at the club.

That's why we have to take the word "organized" out of our definition of the word "database" when we're dealing with the electronic medium. An electronic database doesn't even have to be organized. The master list of information can be thrown together in any old haphazard manner; in fact if you were to look at a printout of the contents of a lot of electronic databases you'd see that there's really no rhyme or reason to the way they're organized. The entries don't need to be in alphabetical order, numerical order, or any other kind of "order". They can be thrown together any old way because the search tools for an electronic database (even an unorganized one) allow you to pull up the information you need, and you can oftentimes do it much more quickly than you could if you were using an organized print database.
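To make the idea concrete, here's a minimal Python sketch of searching an unorganized list. Everything in it is invented for illustration (the entries, the field names, and the `search` function); it isn't modeled on any real directory software. The point is simply that the entries are scanned one at a time, so the order in which they're stored makes no difference to the result.

```python
# A deliberately unsorted "phone book": each entry is just a dictionary.
# All names and numbers here are made up for illustration.
directory = [
    {"last": "Zimmer", "first": "Paul", "street": "Oak", "phone": "555-0142"},
    {"last": "Callahan", "first": "Ralph", "street": "Market", "phone": "555-0117"},
    {"last": "Moore", "first": "Evelyn", "street": "Market", "phone": "555-0163"},
    {"last": "Abbott", "first": "Ed", "street": "Pine", "phone": "555-0109"},
    {"last": "Rossi", "first": "Elena", "street": "Market", "phone": "555-0188"},
]


def search(entries, street=None, first_initial=None):
    """Return every entry that matches all of the parameters supplied.

    The entries are scanned one by one, so their order does not matter.
    A parameter left as None is simply not checked.
    """
    hits = []
    for entry in entries:
        if street is not None and entry["street"] != street:
            continue
        if first_initial is not None and not entry["first"].startswith(first_initial):
            continue
        hits.append(entry)
    return hits


# "Market" as the street parameter and "E" as the first initial.
for hit in search(directory, street="Market", first_initial="E"):
    print(hit["first"], hit["last"], hit["phone"])
```

Shuffling the list changes nothing except the order in which the hits are printed.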

In the words of Cliff Stoll, this is some hot damn stuff. Nowadays there's no need for people to spend countless hours organizing the raw material and for other people to spend countless more hours hunting up specific bits of information. The organizational part is gone and the search times have been cut to seconds.

Here's a personal example. I'm a Civil War historian and one of the legendary primary sources in my field is a 130 volume set of books called The Official Records of the War of the Rebellion. You can purchase printed versions of the books, though it'll cost you dang near three large to buy them all and you might have to add a room to your house in order to store them. Alternatively, you can do what I did: buy the complete collection on CD. Believe me, this has saved me an incalculable amount of time and, considering what my time as a writer/researcher is worth, the CD paid for itself in the first week I owned it. I once tried looking up all the references to a small West Virginia battle in the paper version; I hunted for hours and still didn't find all of the material (The Official Records is a truly abominably badly organized work). I did the same thing using the battle's name as a search parameter in the CD's search window and got all of the references in a couple of seconds.

You probably do this all the time without even thinking about it. Every time you use Google for an Internet search, you're searching a database; the Internet is the world's largest database and is absolutely the worst organized. You can try it right now if you like. Do a Google search for "Britney Spears" and "topless" and you'll instantly have links to hundreds of websites. Of course, this doesn't mean that you're necessarily going to find what you're looking for. You'll get a bazillion hits but none of 'em will have ol' Brit with her duds off -- but they will contain countless offers to sell you such a pic. Heh -- P.T. Barnum was right all along.

OK already -- I can hear you asking what all of this has to do with chess. A chess database is a searchable collection of chess games. You can use a program's search tools to find the information you desire by entering parameters -- you tell the program what to look for and it'll find the games that meet your requirements. And a chess database doesn't even have to be organized in any sort of chronological or alphabetical manner. The games can be stored in any old haphazard order and the search mask will still pull up what you need. [1]

[1] Of course, there are reasons why you might want to have a database that's compiled in an organized manner. If you want to create a tournament crosstable using ChessBase/Fritz software, all of the games of that tournament should be "blocked" together in the database (although you can also do a search for that tournament's games and then create a crosstable from the list of hits). And it's much easier to visually scan down an organized list as a means of "browsing" the database's contents without necessarily doing a search.

Now why would you find a chess database useful? The answers are as varied as the number of chessplayers using databases. Historical researchers and writers often use databases in their work; I used to write a series of articles on great chessplayers from the "Golden Age" and frequently used a database to find and review their games, often as a means of locating games to include in the articles. Correspondence players are always using databases to look up opening variations and positions as a means of evaluating strong moves to use in their postal games. Some folks just like to play through great games of the past -- I know a lot of database users who no longer actively play chess themselves but enjoy replaying the games of classic chess contests. And dang near everyone who owns a chess program with database capabilities will sooner or later create a database of his or her own games as a record of their own chess exploits; the first thing I did when I got my copy of Knightstalker (Fritz1) back in 1992 was to create a database of my USCF over-the-board games (and it's a database I still use all these years later).

By far the most compelling use for a chess database is to use it to improve our own chessplaying skills. That's why most chess books [2] contain example games; the author explains a concept and then illustrates it with one or more actual games in which that idea appeared.

[2] The reason I say "most" is that there are a few notable chess books which contain no games, just an explanation of terms or concepts. Bruce Pandolfini's Weapons of Chess jumps immediately to mind here.

While it's certainly most beneficial to have some sort of "guiding hand" showing you the way, it's not always a requirement. Many chessplayers learn a lot through a form of "osmosis": playing over a large number of games and gradually seeing commonalities and patterns emerging in them. [3] This is a major reason why using a database is so beneficial. You as a player will see common patterns from game to game and while it's not advisable (or even possible) to blindly ape the moves of strong players, seeing and understanding their techniques in these common circumstances and being able to adapt them to the unique (but similar) circumstances in your own games will certainly improve your chess results.

[3] Studies have shown that pattern recognition is a very important component of an individual player's overall chess skill. Chess is a game in which similar general patterns are frequently seen in dissimilar specific positions. The ability to recognize these patterns and adapt a "standard" general set of mental tools and procedures to specific and unique positions is unarguably a major part of a strong chessplayer's skill set.

We've discussed the "what" and the "why". Next we'll look at the "how".

We've discussed what a chess database is and why you'd want to use one. This time around we'll talk about some basic header searches.

What's a "header"? When you open a database and look at the game list you're looking at the basic header information: players, tournament, year, etc. Header searches are the easiest searches to perform because they involve simple typing and/or selection of checkboxes for other parameters.

Start Fritz and hit F12 on your keyboard to go to your database's game list. Then go to the Edit menu and select "Filter games" to bring up the program's "Search mask":

The search mask is the tool that allows you to tell the program what information to look for. There are four tabs at the top of the search mask; we're about to look at the search parameters provided under the "Game data" tab. Most of the parameters you can specify under "Game data" deal with information found in the game headers.

Arguably the most common header search is by player name. The "White" and "Black" fields let you specify names of players for whom you want to search. These fields work just like a telephone directory in that you'll need to type in the last name first, followed by the player's first name. For example, let's say you want to find all of Garry Kasparov's games in which he played the White pieces. In the first box after "White:" you'd type "Kasparov" (without the quotation marks). If you want to be more specific, put a "G" in the second box after "White:". You'd also want to make sure that the box next to "Ignore colors" is unchecked.

Keep in mind a few tips for doing player searches:

The little "dot" between the two boxes is a comma, which indicates that you should type the player's last name in the lefthand box and his first initial or first name in the righthand box.
You can use partial names and "wildcards" (asterisks to replace specific letters) in your searches.
The "Ignore colors" box is very important. If you uncheck this box and type a player's name in the White box, it'll bring up all of that player's games in which he played the White pieces. If you uncheck it and type his name in the Black box, the program will display his games in which he played as Black. If you check this box and type in one player's name, it'll bring up all of that player's games regardless of which color pieces he played.
It's best to leave the first name field blank whenever possible. However if you're finding that more than one player with the same last name is being found, try using the desired player's first initial. Using the complete first name should be a last resort for several reasons. First of all, a player's first name may appear in the database under several different spellings ("Garry", "Gary", "Garik", etc.). Also a player's games might appear with no first name or initial at all -- typing anything in the first name field will cause the program to ignore the games in which no first name or initial is provided. [4]

[4] This latter point is why many players prefer to purchase databases such as Big Database or Mega Database rather than try to assemble databases from myriad Internet downloads, which will likely contain variances in the manner in which player and tournament names are provided. ChessBase's commercial databases contain standardized spellings of player names as well as tournament locations, eliminating the worry of a search "missing" games because of varied nonstandard header info.
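The tips about the two name boxes can be sketched in a few lines of Python. This is a hypothetical model, not the program's actual matching code: the function name `name_matches` is invented, and I'm assuming that matching ignores upper/lower case, that whatever you type is matched against the start of the stored name, and that an asterisk stands for any run of letters. The one behavior taken directly from the text is that typing anything in the first-name box excludes headers that carry no first name at all.

```python
from fnmatch import fnmatchcase


def name_matches(stored_last, stored_first, last_query, first_query=""):
    """Hypothetical model of the last-name and first-name boxes.

    Assumptions (not confirmed by the article): case is ignored, a typed
    entry matches the start of the stored name, and "*" is a wildcard.
    """
    if not fnmatchcase(stored_last.lower(), last_query.lower() + "*"):
        return False
    if first_query:
        # Anything typed in the first-name box excludes games whose
        # header has no first name or initial at all.
        if not stored_first:
            return False
        return fnmatchcase(stored_first.lower(), first_query.lower() + "*")
    return True


# Invented header entries showing the spelling variations described above.
headers = [
    ("Kasparov", "Garry"),
    ("Kasparov", "Gary"),
    ("Kasparov", ""),
    ("Kasparov", "Sergey"),
    ("Karpov", "Anatoly"),
]

last_only = [h for h in headers if name_matches(h[0], h[1], "Kasparov")]
with_initial = [h for h in headers if name_matches(h[0], h[1], "Kasparov", "G")]
full_first = [h for h in headers if name_matches(h[0], h[1], "Kasparov", "Garry")]

print(len(last_only), len(with_initial), len(full_first))
```

Each step toward a more specific first name shrinks the list: the initial drops the nameless entry and the other Kasparov, and the full first name also drops the "Gary" spelling.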

Let's say that you want to find all the games two players contested against each other. You'd just type one player's name in the White field and the other player's name in the Black field. If you uncheck "Ignore colors", you'll get all of the games with the color assignments exactly as you specified them. If you check "Ignore colors" you'll get all the games the two opponents played against each other regardless of which player had which color.
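Here's a rough Python sketch of how the "Ignore colors" box changes a player search. The `players_match` helper and the sample games are invented for illustration, and the blank-box and start-of-name matching rules are my assumptions rather than a description of the program's internals.

```python
def players_match(game, white_query="", black_query="", ignore_colors=False):
    """Sketch of the White/Black boxes plus the "Ignore colors" box.

    Hypothetical helper: `game` is a dict holding "white" and "black"
    last names, and a blank query matches any player.
    """
    def fits(white, black):
        return white.startswith(white_query) and black.startswith(black_query)

    if fits(game["white"], game["black"]):
        return True
    # With "Ignore colors" checked, also try the players the other way round.
    return ignore_colors and fits(game["black"], game["white"])


# Invented pairings, used only to show which games each setting returns.
games = [
    {"white": "Kasparov", "black": "Karpov"},
    {"white": "Karpov", "black": "Kasparov"},
    {"white": "Kasparov", "black": "Short"},
    {"white": "Anand", "black": "Kramnik"},
]

as_white = [g for g in games if players_match(g, white_query="Kasparov")]
any_color = [g for g in games
             if players_match(g, white_query="Kasparov", ignore_colors=True)]
exact_pairing = [g for g in games
                 if players_match(g, "Kasparov", "Karpov")]
head_to_head = [g for g in games
                if players_match(g, "Kasparov", "Karpov", ignore_colors=True)]

print(len(as_white), len(any_color), len(exact_pairing), len(head_to_head))
```

With the box unchecked the names are tied to the colors you typed them under; with it checked, the same two names are also tried in the swapped positions.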

The "Tournament" field allows you to search for specific events; for example, typing "Linares" (without the quotes) in this field will cause the program to pull up all the games in the database which were from events in Linares. The Tournament field works best when combined with a "Year" search; more on this later.

"Annotator" will pull up all the games annotated by a specific writer (assuming that any of the games in your database are annotated). As with the Player fields, it's best to use just the writer's last name without a first name or initial; most databases from ChessBase use just the last name of the author in the Annotator field.

The "Elo" fields allow you to specify a range of Elo ratings and further modify them with radio buttons located beneath the numerical dialogues:

"None" is the default value and should be used if you want a lot of "hits" from the database. Many games (particularly ones downloaded from the Internet) don't include player ratings in the game headers; selecting "None" ensures that no games will be ignored due to the absence of Elo data in the header.[5]
"One" means that at least one of the players must have an Elo rating within the range specified.
"Both" means that both players need to have an Elo rating within the specified range of values.
"Av" means that the average of the two players' ratings must fall within the range of values you provide.

[5] Also be aware that nobody had an Elo rating prior to the early 1970's, so selecting an Elo range and clicking a radio button other than "None" guarantees that you'll get no games from the period prior to the introduction of the Elo rating system. I can't tell you how many e-mails and phone calls I've received from disgruntled users who want to find all of Capablanca's games and are torqued because the search found nothing: "But there has to be some games! I even told it to find games with a rating of 2200 and up -- and Capablanca's rating had to be higher than that!" Uh, right.
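The four radio buttons boil down to four small tests, sketched below in Python. The function `elo_matches` is hypothetical, and a rating of `None` stands for a header with no Elo data. One case is my own assumption: the article doesn't say how "Av" treats a game in which only one player has a rating, so here such a game simply fails the test.

```python
def elo_matches(white_elo, black_elo, low, high, mode="None"):
    """Sketch of the Elo range boxes and their four radio buttons.

    A rating of None means the game header carries no Elo value.
    Assumption: under "Av", a game missing either rating does not match.
    """
    def in_range(rating):
        return rating is not None and low <= rating <= high

    if mode == "None":
        return True  # Elo data is not consulted at all
    if mode == "One":
        return in_range(white_elo) or in_range(black_elo)
    if mode == "Both":
        return in_range(white_elo) and in_range(black_elo)
    if mode == "Av":
        if white_elo is None or black_elo is None:
            return False
        return low <= (white_elo + black_elo) / 2 <= high
    raise ValueError(f"unknown mode: {mode}")


# A pre-Elo game has no ratings in its header, so only "None" lets it through.
print(elo_matches(None, None, 2200, 3000, mode="One"))
print(elo_matches(None, None, 2200, 3000, mode="None"))

# Illustrative ratings of 2700 and 2500 against a 2600-2800 range.
print(elo_matches(2700, 2500, 2600, 2800, mode="One"))
print(elo_matches(2700, 2500, 2600, 2800, mode="Both"))
print(elo_matches(2700, 2500, 2600, 2800, mode="Av"))
```

The first two calls are the Capablanca complaint in miniature: a game without ratings can never satisfy a rating test, no matter how strong the players actually were.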

"Year" lets you specify a single year (if you type the same year in both boxes) or a range of years. This is very useful when coupled with the "Tournament" field; for example, providing the name "Linares" in the "Tournament" field and "1992" in both "Year" boxes will bring up all the games from the Linares 1992 event that are in your database.

Another commonly-used field is the "ECO" field which lets you type in a single Encyclopedia of Chess Openings code (in both boxes) or a range of these codes. For example, typing "B12" in both boxes will get you all of the games in the Caro-Kann Advance (as well as a few extras, like the Fantasy Variation), while typing "E60" in the lefthand box and "E99" in the righthand box will bring up all the games of the King's Indian Defense.

"Moves" lets you type in a range of moves. For example, putting "1" in the lefthand box and "20" in the righthand box will pull up all the games which lasted twenty moves or less.

"Text" will pull up all of the special database texts within the database. These are typically instructional texts (on ChessBase training CDs) or tournament reports (most often found in ChessBase Magazine).

"Result" allows you to find a variety of ending results for games:

1-0 designates games won by White.
0-1 means games won by Black.
½-½ refers to draw games.
Mate indicates games ending in checkmate.
Stalem. will bring up drawn games which end in a stalemate position.
Check provides all of the games which end with a check (though not mate).

You can combine more than one of these "Result" parameters. For example checking "1-0" and "0-1" will give all the games that ended in a decisive result (i.e., it would exclude all drawn games). However, choosing "Stalem." in conjunction with one of these would just be dumb -- stalemates are draws by definition. So use a little common sense if you combine these parameters.

You can combine any of the parameters in this screen of the search mask but remember that the more information you enter into the search mask, the fewer games you'll get as a result since all of the stated conditions must apply in order for a game to qualify as a "hit". The program combines all of the parameters you've set. For example, if you designate "Kasparov" as the player, "E60" through "E99" as the ECO codes, and "1-0" as a result, the program will return a list of E60 to E99 games in which Kasparov was a player and White was the winner. This is not the same as a combined list of all of Kasparov's games along with all White wins along with all E60-E99 games.
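The difference between "all conditions must apply" and "a combined list" is the difference between AND and OR, which a short Python sketch makes plain. The game records below are invented purely for illustration (they are not real game data), and the three helper functions are hypothetical stand-ins for the Player, ECO, and Result fields. ECO codes are compared as plain strings, which works because every code is one letter followed by two digits.

```python
# Invented header records for illustration only; not real game data.
games = [
    {"white": "Kasparov", "black": "Karpov", "eco": "E86", "result": "1-0"},
    {"white": "Karpov", "black": "Kasparov", "eco": "E92", "result": "0-1"},
    {"white": "Kasparov", "black": "Anand", "eco": "B85", "result": "1-0"},
    {"white": "Short", "black": "Timman", "eco": "E97", "result": "1-0"},
    {"white": "Kramnik", "black": "Kasparov", "eco": "E97", "result": "1-0"},
]


def is_kasparov(game):
    """Player field with "Ignore colors" checked."""
    return "Kasparov" in (game["white"], game["black"])


def is_kings_indian(game):
    """ECO range E60 through E99."""
    return "E60" <= game["eco"] <= "E99"


def white_won(game):
    """Result box "1-0"."""
    return game["result"] == "1-0"


# What the search mask does: every condition must hold (AND).
hits = [g for g in games
        if is_kasparov(g) and is_kings_indian(g) and white_won(g)]

# What it does NOT do: pool the three separate lists together (OR).
pooled = [g for g in games
          if is_kasparov(g) or is_kings_indian(g) or white_won(g)]

print(len(hits), len(pooled))
```

Every extra condition can only remove games from the AND list, which is why a heavily filled-in search mask tends to return so few hits.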

After you've set your parameters, click the "OK" button. The program will then search the database and pull up a list of all the games that qualify under the parameters you've set. Just double-click on a listed game to load it on the main chessboard screen; you can now use the cursor keys or VCR buttons to replay the game.

If you've finished with your list of the games found by your search and want to return to the full list of games in Fritz, just go to the "Edit" menu, select "Filter games" from the menu, click the "Reset" button (the one next to the "Cancel" button), and then click "OK". You'll again see the complete list of games from your database.

Pretty easy, right? You now should be able to do simple header searches on your database. Next we'll look at ways to find things that are "hidden away" within the games themselves.

Now that we've learned how to perform game header searches on a chess database using the search mask in Fritz it's time to look a little deeper (literally) at some other search types.

The header searches we've already discussed involved searching for games by criteria given in the game list information (player names, ECO code, etc.). This type of search can be performed comparatively quickly, even on very large databases, because the program has only to look at the header information, not at the contents of the games themselves. Another kind of database search involves the annotations within the games; this kind of search usually takes a bit longer than a header search because the program has to look at information contained inside the game. [6]

[6] For ChessBase "old timers", increased search speed was the motivating factor behind the change in the ChessBase data format between ChessBase for Windows and ChessBase 6. In the old (pre-1996) format, all of the game information (header info, moves, annotations, etc.) was concentrated in the .cbf file. When you did even a simple search by player name, the program had to search the entire .cbf file for games played by that player -- and this included a whole lot of data which had nothing to do with the game headers. Starting with ChessBase 6, game data was split across multiple files: the game headers were contained within one file, the game moves within another, annotations within still another, etc. When you do a header search, for example, the program now has to search through just one file which contains only header information. This is much faster and more efficient than the old days when the program was required to search through everything regardless of whether or not it related to the game headers. This also leads directly to a point which will be covered in a later article in this series: a hierarchy by which the program performs combined header/data searches.

In Fritz, open a database and bring up the search mask (as described previously). This time around, though, click the "Annotations" tab at the top of the search mask. You'll see the following dialogue:

This dialogue gives you the opportunity to search for games containing specific annotations. Now this next bit might seem obvious to many readers but it deserves some emphasis for folks new to database use: for an "Annotation" search to work, at least some of the games in the database must be annotated. It sounds like a silly thing to point out but there are a lot of commercial databases which contain no annotated games (ChessBase's Big Database is an example). And the vast majority of databases you can download from the Internet also contain no annotations (The Week in Chess is an example of this; you'll seldom find an annotated game in TWIC). So for this kind of search to work, you must be using a database in which some of the games have annotations. It might be a ChessBase Training CD or the Mega Database; it might even be a database containing games you've annotated yourself. Either way, a database must contain annotations, otherwise an annotation search will always come up "empty".

We'll start in the middle of this dialogue to discuss the "check boxes" within the framed border. Clicking on one of these boxes and then clicking "OK" will cause the program to find all the games in which that particular type of annotation appeared.

Many of these annotation types aren't ones you can create using Fritz (these require ChessBase to be able to add them to games), but Fritz will certainly find these annotation types if they're used in a database. We won't be discussing these annotation types in-depth; instead we'll provide a brief description of them.

Colors -- This refers to the use of colored arrows and squares which emphasize moves of interest or key squares on the board.
Training -- Games which contain timed training questions (either generated automatically by Fritz' post-game analysis features or manually by a human annotator) will be discovered by using this search.
Multimedia -- Games which contain multimedia elements (pictures, sounds, or videos) will be found if this box is selected.
Pawn structure -- This will find games in which the human annotator has chosen to include one or more pawn structure popups.
Piece path -- Selecting this option will list the games in which a human annotator has included popups to illustrate the cumulative moves of a particular piece or pawn.
Variations -- One of the most useful selections you can make in this dialogue, checking this box will provide a list of all games containing replayable alternative variations.
Any text -- Selecting this option will bring up a list of all games containing text commentary. This, too, is one of the more useful toggles in this dialogue.
Any symbols -- This will provide a list of all games which use Informant-style evaluation and commentary symbols.
Critical Opening/Middlegame/Endgame position -- a special feature of ChessBase allows the commentator to mark positions as being of significant interest; these appear highlighted in a different color within the gamescore. Fritz will sometimes also highlight moves using these special designations. You can use this toggle to find all games in which a specific kind of critical position (Opening, Middlegame, or Endgame) appears.

This stuff is all pretty self-explanatory. If you want to find all the games in the database which contain replayable variations, you'd just put a check in the box next to "Variations" and click "OK". Within a few moments the program will provide a list of all games in which such variations appear.

It's possible (and often preferable) to use more than one of these toggles at a time. Clicking only "Variations" will provide a list of all games with replayable variations, regardless of whether or not they contain any text or symbols. But if you also include "Any text" and "Any symbols" as search parameters the search will turn up the games which use all three annotation types within the same game: variations, symbols, and text commentary.

On the other hand, you might want to see only games with text commentary regardless of the use of any other annotation types. In this case you'd select only "Any text"; all games containing text will be identified and listed for you. [7]

[7] There's sometimes a slight hitch here which involves languages. For example, if you're searching a ChessBase Magazine database in which a commentator has annotated a game in French and you've not selected "French" as one of your language options the game will still appear in the game list after the search is complete but the game will appear to have no commentary. This isn't a "bug". The game does qualify as being text-annotated, but the fact that you don't have "French" selected as a visible language choice will mask out the game's text.

Clicking "Any text" means just that: any game which contains something that was typed within an annotation/commentary window. This might be something as simple as the word "Time" added at the end of the gamescore (meaning that one player lost because his flag fell) as the game's only annotation. This is why the search mask also lets you search for specific words or phrases and why you might want to use the "Text 1" and "Text 2" boxes instead of "Any text".

There are several reasons why you might want to search for a specific word or phrase. One reason might be to use the search as a kind of "filter" to weed out games with fairly insignificant text usage (such as the word "Time" at the end of the game as described above). Another use might be to search for a specific game which you know uses a particular word in the commentary; for example, you might vaguely remember an old game annotated by Nigel Short in which he said he felt like a "real patootie" after a certain move. You could do a search for "patootie" to find that game (as well as any others in which that colorful expression was invoked).

You can search for multiple expressions by typing them in the "Text 1:" and "Text 2:" boxes. For example, you could type "professional" (without the quotes of course) after "Text 1:" and "computer" after "Text 2:" -- this will list all the games in which either "professional" or "computer" was used (not necessarily both words in the same game, though). Note that this differs substantially from the results obtained when using the check boxes. If you check both "Training" and "Critical Position Opening", you'll get all games in which both annotation forms are used in the same game. But typing words or phrases in both of the text search boxes will yield all games in which the "Text 1:" expression was used plus all games in which the "Text 2:" expression was used -- but this doesn't mean that all the games found will use both expressions in the same game; most (or all) will contain one or the other.
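The difference between the two behaviors is easy to see in a few lines of Python. This is only a sketch of the logic described above, not Fritz's actual code: the `Game` class, its fields, and both function names are invented for the example, and case handling is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Game:
    """Stand-in for a database game; these fields are invented for the example."""
    commentary: str = ""
    annotation_types: set = field(default_factory=set)


def text_boxes_match(game, text1="", text2=""):
    """Text boxes: a game qualifies if EITHER expression appears (OR)."""
    terms = [t for t in (text1, text2) if t]
    return any(t in game.commentary for t in terms)


def check_boxes_match(game, ticked):
    """Check boxes: a game qualifies only if EVERY ticked type is present (AND)."""
    return set(ticked) <= game.annotation_types


games = [
    Game("A professional would never play this.", {"Training"}),
    Game("The computer prefers 14.Nd5.", {"Training", "Critical Position Opening"}),
    Game("Time", {"Critical Position Opening"}),
]

# Two text expressions: the first two games both qualify.
by_text = [g for g in games if text_boxes_match(g, "professional", "computer")]

# Two ticked boxes: only the game carrying both annotation types qualifies.
by_boxes = [g for g in games if check_boxes_match(g, ["Training", "Critical Position Opening"])]
```

Here `by_text` holds two games (one per expression) while `by_boxes` holds just the one game that carries both annotation types.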

There's an additional tweak available here: the "Whole word" box. If you leave this box unchecked and type in the word "it" after "Text 1:", the search will turn up all games in which those two letters appear together, even if they're part of a longer word. So you'll end up with all games containing the word "it", plus the words "commit", "committed", "fit", "itself", "its", etc. If you want to find all games containing the word standing alone (not as a part of a longer word), you'll want to be sure to check this box. A better example is the word "mate". With the "Whole word" box checked, games containing the word "mate" will be found. If you uncheck this box, the search will include games that have the words "mate", "mated", "checkmate", and "material"(!) in the search result.
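If you think in code, the "Whole word" box behaves roughly like the difference between a plain substring test and a word-boundary match. The function below is a hypothetical illustration in Python (the name `text_hit` is made up, and Fritz's own rules for what counts as a word boundary may differ):

```python
import re


def text_hit(comment, term, whole_word=False):
    """Return True if `term` occurs in `comment`.

    With whole_word=True the term must stand alone, which is modeled here
    with regular-expression word boundaries (an assumption).
    """
    if whole_word:
        pattern = r"\b" + re.escape(term) + r"\b"
        return re.search(pattern, comment) is not None
    return term in comment


comments = [
    "and now it is mate in three",
    "White wins material",
    "checkmate follows",
]

loose = [c for c in comments if text_hit(c, "mate")]
strict = [c for c in comments if text_hit(c, "mate", whole_word=True)]
```

The loose search catches all three comments; the strict one keeps only the first.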

"Symbols" lets you hunt for games that contain particular symbols attached to moves. For example, typing "!!" (again without the quotes) lets you find brilliancies, while typing "??" will turn up games containing moves that the annotator thought were blunders. Note that Fritz will occasionally toss in these diacritical marks as part of its analysis, so this is a way to find games that Fritz has so marked.

Note, too, that you can use this dialogue to search for games containing specific Informant-style symbolic notation; Fritz' Help file has a list of the keyboard shortcuts for these symbols. You can use a shortcut to enter a particular symbol into this dialogue box and the search will turn up all games in which that symbol was used by the annotator. For example, you could use CTRL-3 to enter the "unclear" symbol (better known to non-chessplayers as the "infinity" symbol), click "OK", and the search result will be a list of games in which the annotator evaluated a position as "unclear".

The "Deleted" checkbox will bring up a list of games that have been marked for deletion. [8] This is pretty handy for those cases in which you're afraid you might have incorrectly marked a game to be deleted but it's buried somewhere deep in the bowels of your database. Using this toggle will bring up all games that have been "struck out" in the game list and are ready to be deleted. You can pick out the game you marked incorrectly, highlight it, and hit the "Delete" key to unmark it and remove the "ready to be deleted" status.

[8] Deleting games is purposely a multi-step process. There's no "Undo" function when it comes to deleting games from a database; once you've deleted a game, it's gone forever with no chance of parole. So the process involves first marking games for deletion, followed by a second step which performs the actual physical deletion. I occasionally encounter users who accidentally delete games and want to "blame" the software. Sorry, but if you've deleted a pile of games in Fritz that you didn't mean to delete, you've been very unlucky (or, more accurately, very careless).

Finally there's the "Position" box. This has nothing to do with finding specific board positions (we'll cover that feature next). Checking this box will instead bring up all games in the database which start with a position other than the normal opening position for a game of chess. This will include 19th century "odds" games as well as any middlegame tactics problems, endgame studies, or "fragmentary" games (from which the opening moves are missing or lost) that may reside in that database.

Here again you can combine criteria to perform more elaborate searches. You might recall seeing an annotated odds game in your database, a game which also included the word "lightning" as part of the text commentary. To find this game you'd type "lightning" in the "Text 1:" box and check the "Position" box -- the resulting search will turn up your desired game.

We've now covered half the tabs in the Search mask. Next we'll hit the one that seems to give users the highest level of confusion: the Position tab. Trust me, it's easy. We'll take it slowly and you'll soon be an expert in its use. It'll require a fair little bit of explanation though...

We've now seen a basic description of the "Game data" and "Annotations" dialogues; now we're going to start an examination of the "Position" tab. This is arguably the most complex of the search tabs; consequently we're going to spend some extra time with it. Although it's a bit more involved than the other search dialogues it's also pretty simple to use once you get the hang of it. Instead of providing a complete rundown of all features of this dialogue (as we've done with the previous two search tabs), we're going to start with some simple examples just so you can get the hang of how it works. We'll save the list of this dialogue's features for later.

Fire up Fritz (or one of its sister playing programs), open a database and bring up the Search mask (as described earlier). Click the "Position" tab at the top of the Search mask and you'll see the following dialogue:

This dialogue lets you search for board positions that occur within games in the database. As with other search tabs you'll use this dialogue to tell the program what to look for, click "OK", and the program will provide a list of games that qualify.

In using this dialogue, think of the double row of piece buttons to the right of the chessboard as your "box" of chessmen. Click on a button for the piece you want (such as a White King) and then click on the chessboard square upon which you want to place the piece. Let's look at a simple example. Click the "White King" button and place the King on g1. Then click the Black King button and place the piece on c8. [9] Make sure you've selected the radio button to the left of "'Look for' board" (we'll explain all of these radio buttons later):

[9] A shortcut for switching colors while placing the same type of piece is to right-click with the mouse. For example, if you've selected and placed the White King as in the above example and you now want to place a Black King, just right-click on c8 (instead of going back to the piece buttons and clicking on the Black King button). This also works in reverse: if you place, say, a Black Queen first (after clicking on the Black Queen button), you can place a White Queen by right-clicking on a square. This shortcut can save you a world of time when setting up board positions.

If you've followed the directions properly, your dialogue should look like the above illustration. Now just click the "OK" button to get a list of all games containing a position in which a White King is on g1 while a Black King is on c8. Why did we choose these particular squares? It's likely that most of the games (though not necessarily all of them) will involve the players castling on opposite sides of the board.

That's pretty simple. Let's try something a little more involved. Let's say that we want to find all games in which White has fianchettoed a Bishop to g2. This time we'll place a White Bishop on g2 and White pawns on f2, g3, and h2; this will give us the classic Kingside Bishop fianchetto formation:

Clicking "OK" here will bring up a list of all games in the database in which White has a Bishop and pawns on the squares indicated in the illustration.

Note that you do not need to set up a complete legal position in this dialogue; the Position portion of the Search mask allows you to set up position fragments -- that is, partial positions. In the example above, it doesn't matter what other pieces are on what other squares; the program will always bring up games in which the Bishop and pawns are on the indicated squares.

This is exactly what the "'Look for' board" radio button does. By clicking it, you're telling the program, "I want to see a list of games in which these pieces are on these squares."

The "'Exclude' board" radio button does just the opposite: if you were to click that radio button and set up the same Bishop fianchetto position on the board, you'd get a (very long) list of all games in which White never fianchettoed his Kingside Bishop while pawns were simultaneously on f2, g3, and h2.

While you may sometimes find the "'Exclude' board" button useful on its own, it's usually going to be used in conjunction with a "'Look for' board" search. Here again an example will prove useful. Our previous search showed all games in which White fianchettoed his Kingside Bishop. But what if we want games in which the Bishop controls (at least most of) the long diagonal without being blocked by its own pawns? This is where the "'Exclude' board" feature becomes super-useful. Click the "'Look for' board" radio button and set up the Bishop and pawns as in our previous example. When you're done, click the "'Exclude' board" radio button and you'll see the chessboard go blank. Don't worry -- your fianchetto position is still there (just click the "'Look for' board" radio button to double-check this if you like). You've just reset the board to tell the program what the position can't contain.

Click the "'Exclude' board" button to get a blank board. Then click the White pawn button and place White pawns on f3, e4, and d5. What you've now "told" the program is that you want all positions in which White has fianchettoed his Kingside Bishop with pawns on f2, g3, and h2, but in which White does not have pawns on f3, e4, and d5:

Click "OK" and you'll get games in which those conditions apply: the Bishop has fianchettoed and isn't blocked by its own pawns (at least not up to d5 -- the square c6 might be another story).

When you look at the two rows of pieces to the right of the chessboard, you might be wondering what the two "circle" buttons are used for. These are "wild cards"; these circles represent any chess piece or pawn of that color. Let's go back to our White Bishop fianchetto example to see how we can use these wildcards. Our last search turned up games in which the Bishop wasn't blocked by White pawns on f3, e4, and d5. But let's say that we want to see games in which no piece or pawn of White's is blocking the Bishop's control of the diagonal. Instead of placing pawns on those three squares on the "Exclude" board, we'll place a white circle on these three squares instead:

Of course, we still have the Bishop fianchetto fragment set up on the "'Look for' board". Clicking "OK" will provide us with all games containing positions in which White has fianchettoed the Kingside Bishop but in which no White piece or pawn is sitting on f3, e4, or d5 to block the Bishop's path.

Now let's look for an ultra-powerful Bishop mastering the long diagonal with no pieces or pawns of either color on any square between f3 and b7. Leaving the fianchetto position on the "'Look for' board", set up the "Exclude" board to look like this:

Yes, Virginia, you can put more than one piece or wildcard on the same square! Click "OK" and get a list of all games in which a White g2-Bishop dominates the long diagonal with nothing on the squares b7, c6, d5, e4, and f3.

Please note that the side to move in a position doesn't matter in this dialogue. The Search mask is looking for positions without regard to which side is to move next.
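For readers who like to see the logic spelled out, here is a minimal Python sketch of how a "Look for" board and an "Exclude" board could be checked against a single position. Everything in it is an assumption made for illustration: the piece codes ("wB" for a White Bishop, "w*" for the White wildcard circle), the dictionary layout, and the function names. It also assumes a position is rejected as soon as any one "Exclude" square holds one of the pieces listed for it, which is how the worked examples above read.

```python
def piece_fits(pattern, piece):
    """True if `piece` (e.g. 'wP', or None for an empty square) satisfies `pattern`.

    A pattern is either an exact piece code such as 'wB' or a colour
    wildcard, 'w*' or 'b*', standing for any piece or pawn of that colour.
    """
    if piece is None:
        return False
    if pattern.endswith("*"):
        return piece[0] == pattern[0]
    return piece == pattern


def fragment_matches(position, look_for, exclude=None):
    """Check one position against a 'Look for' board and an optional 'Exclude' board."""
    exclude = exclude or {}
    # Every square on the 'Look for' board must hold the requested piece.
    for square, pattern in look_for.items():
        if not piece_fits(pattern, position.get(square)):
            return False
    # No 'Exclude' square may hold any of the pieces listed for it.
    for square, patterns in exclude.items():
        if any(piece_fits(p, position.get(square)) for p in patterns):
            return False
    return True


FIANCHETTO = {"g2": "wB", "f2": "wP", "g3": "wP", "h2": "wP"}

# A partial position: the fianchetto plus a White pawn on e4 and a Black pawn on d5.
position = {"g2": "wB", "f2": "wP", "g3": "wP", "h2": "wP", "e4": "wP", "d5": "bP"}

no_white_pawns = {sq: {"wP"} for sq in ("f3", "e4", "d5")}
nothing_at_all = {sq: {"w*", "b*"} for sq in ("f3", "e4", "d5", "c6", "b7")}
```

Note that more than one pattern can sit on the same "Exclude" square, just as in the dialogue. The position above passes the plain fianchetto test but fails both "Exclude" versions because of the pawn on e4.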

You've doubtless noted that we haven't discussed the third radio button in the upper right of this dialogue: the "'Or' board". We'll remedy this omission now. The "'Or' board" is used when a particular piece or pieces can be on any square in a set of selected squares. Here's an easy example. Click the "Reset" button to get rid of our previous example. Now click the "'Or' board" button and place Black Kings on squares a8, b8, and c8 as shown below:

Clicking "OK" here will bring up all games in which the Black King is on a8 or b8 or c8 (this is why it's called the "Or" board). Why this particular search? All games in which Black castled Queenside will be part of the list, as well as games in which the King ran toward the a8 corner (even if Queenside castling wasn't part of the deal).

The "Or" board is a pretty handy tweak to know about when you want to find games with a common theme. For example, you could do an "Or" board search with White Kings placed on e4, e5, d4, and d5 to get games in which White centralized his King.
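In code terms the "Or" board amounts to an "any of these" test. The short Python sketch below uses made-up piece codes ("bK" for a Black King) and a made-up function name. It models only the "Or" board on its own, since how the program combines it with the other two boards isn't covered here.

```python
def or_board_matches(position, or_board):
    """True if at least one 'Or' square holds the piece placed on it."""
    return any(position.get(square) == piece for square, piece in or_board.items())


QUEENSIDE_KING = {"a8": "bK", "b8": "bK", "c8": "bK"}
CENTRALIZED_KING = {"e4": "wK", "e5": "wK", "d4": "wK", "d5": "wK"}

castled_long = {"c8": "bK", "d8": "bR", "g1": "wK"}
castled_short = {"g8": "bK", "f8": "bR", "g1": "wK"}
```

Calling `or_board_matches(castled_long, QUEENSIDE_KING)` succeeds because the King sits on one of the three squares; the same call with `castled_short` fails.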

Here's one more very important option. Normally the Search mask only looks for positions within the main lines of games -- that is, the moves that were actually played. If you want the search to include positions from variations in any annotated games, click the "Include lines in search" box (located directly above the "OK" button) and the program will search for the position fragment in the replayable variations that might be included in the games in addition to finding qualifying positions in the main moves of the database's games.

We've started our investigation of the Position tab in the Search mask, but there's a lot more yet to come...

We've covered the basics of performing position searches using Fritz's Search mask. Now let's look at some advanced features.

You'll of course recall how to open a database and bring up the Search mask. Click on the "Position" tab to again get the following display:

Now let's recreate our Bishop fianchetto position we used as an example in the last section of this article:

As we've previously seen, performing this search will bring up all games in which White has established the classic fianchetto position as shown in the illustration. An interesting twist on this search involves the use of the "Mirror" boxes (located to the right of the piece buttons). Checking one of these boxes allows you to "mirror" the position fragment.

An example would be to set up the Bishop fianchetto position as above but also click the box next to "Horizontal":

Note what's happened here: the program has placed a red line across the board between the fourth and fifth ranks. Think of this red line as a "mirror" -- if you click "OK" here to perform the search, the program will bring up a list of games in which either White has fianchettoed Kingside or Black has fianchettoed Kingside (but not necessarily with both conditions applying, although that's certainly possible). The program has "mirrored" White's position fragment on Black's side of the board, so you get games in which either player has fianchettoed the Kingside Bishop.

Now let's try using the "Vertical" box instead:

Here again it's easiest to think of the red line as a mirror laid between the d- and e-files. If you do the search now, the program will provide a list of games in which White has fianchettoed on either the Kingside or the Queenside (though again not necessarily on both sides simultaneously, though this too is possible).

Now, just for chuckles, click both the "Horizontal" and the "Vertical" boxes:

Now a search will turn up all games in which White has fianchettoed on either side or in which Black has fianchettoed Kingside. Note, though, that this search will not include games in which Black fianchettoed Queenside (though that may have happened in some games where one of the other three fianchetto positions has occurred).

This brings up an important point: there's a big difference between doing, say, the position search above with the "Horizontal" box checked and a separate search in which you've placed White pawns on f2, g3, and h2 with a White Bishop on g2 plus Black pawns on f7, g6, and h7 with a Black Bishop on g7:

In our earlier example, games in which either player (though not necessarily both) fianchettoed Kingside will be discovered by the search. But in the example immediately above, a game must contain a position in which both players played the Kingside fianchetto.[10]

[10] Furthermore, the fianchettoes must occur on the board at the same time. So games in which White fianchettoes at move eight and subsequently moves the Bishop to, say, f3 before Black has fianchettoed his Kingside Bishop at move twelve will not be found by the search. Note, then, that this isn't a search for all games in which both players fianchettoed Kingside -- it's instead a search for all games in which Bishops have been fianchettoed and are simultaneously placed at g2 and g7 respectively. It's a subtle, but significant, difference.

Thus we see that the "Mirror" boxes can be a handy shortcut for finding somewhat similar "mirrored" position fragments, but only if the feature is understood and used properly.
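One way to picture the "Mirror" boxes is as a pair of square transformations that generate extra fragments to search for. The Python sketch below is an illustration only: the piece codes and function names are invented, and the rule that checking both boxes yields three fragments (not four) simply follows the behavior described above.

```python
FILES = "abcdefgh"


def mirror_horizontal(fragment):
    """Reflect across the line between ranks 4 and 5, swapping piece colours."""
    swap = {"w": "b", "b": "w"}
    return {
        square[0] + str(9 - int(square[1])): swap[piece[0]] + piece[1:]
        for square, piece in fragment.items()
    }


def mirror_vertical(fragment):
    """Reflect across the line between the d- and e-files; colours stay put."""
    return {
        FILES[7 - FILES.index(square[0])] + square[1]: piece
        for square, piece in fragment.items()
    }


def fragments_to_search(fragment, horizontal=False, vertical=False):
    """A game qualifies if ANY fragment in the returned list occurs in it."""
    result = [fragment]
    if horizontal:
        result.append(mirror_horizontal(fragment))
    if vertical:
        result.append(mirror_vertical(fragment))
    return result


FIANCHETTO = {"g2": "wB", "f2": "wP", "g3": "wP", "h2": "wP"}
black_kingside = mirror_horizontal(FIANCHETTO)
white_queenside = mirror_vertical(FIANCHETTO)
```

Placing both fianchettoes on the "Look for" board by hand is a different animal: that is one fragment with eight pieces, all of which must be on the board at once, whereas the mirrored search is satisfied by any single fragment in the list.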

Earlier we mentioned that header searches are quicker than other types of searches. When I typed that statement I was primarily thinking of the fact that position searches have to examine what's "inside" a game; i.e. the moves, as opposed to header searches which just need to check the game list info. Returning to our Bishop fianchetto example, performing such a search requires that the program look at every move of every game [11] to see if White has fianchettoed his Kingside Bishop.

[11] If the search is conducted exactly as shown in the illustration, the program will actually search for fianchettoes only between moves one and forty, but this will be made more clear over the next couple of paragraphs.

One way to cut down on the amount of work the program has to do is to limit the amount of material it needs to sift through. An excellent means of doing this is provided by the "First" and "Last" boxes. These fields let you specify a range of moves within which the position or position fragment must occur.

Most Bishop fianchettoes as depicted in our example position are going to occur in the opening of a game, right? And since the moves g2-g3 and Bf1-g2 must be played to accomplish the fianchetto, we can rule out the idea of a fianchetto occurring on the game's first move. So for "First" we can enter the value "2" and for "Last" we could enter the number "10". This means that the position must appear on the chessboard somewhere between Move Two and Move Ten. If you started the search using these parameters the program would look at moves two through ten of every game in the database, disregarding all game moves that occur outside that range (i.e. the first move of games and all moves after move ten). This can significantly reduce the amount of time the program takes to complete a search.

The "Length" button lets you tell the program how long the position or position fragment must be on the board. As an example, setting it for "5" means that the position must remain on the board for five moves; in our Bishop fianchetto example, the pawns and Bishop must remain on the specified squares for five or more consecutive moves before any of them moves to a new square. Most often you'll want to have "Length" set to "1" (meaning that the position has to occur for just a single move); this will ensure that all instances in which a position or position fragment occurs will be found by the program.
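Here's a rough Python sketch of how "First", "Last", and "Length" could work together. It's a simplification built on assumptions: each game is reduced to a list with one True/False entry per move number saying whether the fragment is on the board, the consecutive run is required to fall entirely inside the move window, and the default of forty for "Last" is taken from Footnote 11. The function name is made up.

```python
def found_in_range(hits, first=1, last=40, length=1):
    """hits[i] is True when the fragment is on the board at move i + 1.

    Returns True if the fragment stays on the board for at least `length`
    consecutive moves somewhere between move `first` and move `last`.
    """
    run = 0
    for move_no, hit in enumerate(hits, start=1):
        if move_no < first:
            continue  # moves before the window are never examined
        if move_no > last:
            break  # nothing after the window matters either
        run = run + 1 if hit else 0
        if run >= length:
            return True
    return False


# The fianchetto is complete at move 3 and broken up again at move 6.
hits = [False, False, True, True, True, False]
```

With these hits, a search from move 2 to move 10 succeeds at "Length" 1 or 3 but fails at "Length" 4, since the fragment only survives for three moves.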

Although we've looked primarily at searches for position fragments (that is, parts of positions), you can also look for complete positions using the "Position" dialogue. The most common scenario is one in which you're playing through the opening of a game (either a database game or a situation in which you're entering moves in the main chessboard screen) and want to see additional games (if any) in which a given position occurs. For example, you might be playing through a database game, come to move five, and find yourself wondering if that position has occurred in any other games.

There's a really handy shortcut available to you in such an instance. Clicking the "Copy board" button automatically transfers the position from the main chessboard screen in Fritz over to the "Position" dialogue, filling in the dialogue's blank chessboard with whatever position was on the board the last time you were in the program's main chessboard screen.

You could also use this as a shortcut even if you're looking for position fragments instead of complete positions. You could transfer a position to the "Position" dialogue by using "Copy board" and then remove pieces until you've created a position fragment you'd like to search for. An example might be to load an endgame position and then eliminate the Kings and pieces to leave only the pawns; this would let you perform a search for all games containing the same "pawn skeleton" as the endgame you'd been viewing.

The final element of the "Position" dialogue is the "Sacrifice" box. Putting a check in this box while filling out no other information in the Search mask will cause the program to bring up a list of all games in which some sort of material sacrifice occurred. I use this search a fair little bit to locate interesting sacrifices in a database's games.

But you can also use this box to further refine other searches. Returning yet again to our Bishop fianchetto, checking the "Sacrifice" box will cause the program to find all positions in which the fianchetto position is on the board while a sacrifice occurs somewhere on the chessboard. I did a search for the White Kingside fianchetto position occurring between moves 2 and 15 in a database of games from Chess Informant volumes 1 through 75 and the search revealed 16,258 games. Performing the search a second time with the same parameters but with the "Sacrifice" box also checked brought up 153 games in which the fianchetto position was on the board and a sacrifice was offered (though not necessarily accepted) someplace on the chessboard at the same time.

As previously stated, there are a lot of search parameters available to you in the "Position" dialogue but the dialogue itself is not terribly difficult to use once you understand what the various fields mean. This dialogue alone lets you perform a nearly infinite variety of searches on a database's games, even without using any of the fields in the other search dialogues.

But we're still not finished yet: there's one more dialogue to cover.

This last dialogue isn't terribly complex; consequently this section will be a short one.

Bring up Fritz's Search mask (as described above) and click the "Medals" tab. You should see the following dialogue:

Medals are a special annotation form that ChessBase owners [12] can use to mark games of special interest. Medals appear in a database's game list as a set of multicolored bars. You can search for games that contain a particular medal by selecting it in this dialogue.

[12] You might be able to infer from this comment that ChessBase is indeed required to be able to add or delete medals from a database game; it can't be done in Fritz.

Just click on the box next to a particular type of medal ("Best game" for example) to select it. You'll see the box at the top of the dialogue change color to reflect the color of the medal you've just selected:

Click "OK" and the program will provide a list of all games in the database in which this particular medal was used.

Note that you can select more than one type of medal. The box at the top of the dialogue will change to show all of the medal types you've selected:

However, you also need to be aware that if you select, say, two medal types only games in which both types of medals are used (in the same game) will be displayed. For example, I searched a large database for games containing the "Best game" medal and the search provided 182 games. But when I added the "Tactical blunder" medal to the search criteria (to search for both "Best game" and "Tactical blunder"), only eleven games were found -- all of which used both medal types.
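The "both medals in the same game" rule can be pictured as a set of flags that must all be present. The Python sketch below is hypothetical: it names only the two medals mentioned above, and nothing here reflects how the program actually stores medals.

```python
from enum import Flag, auto


class Medal(Flag):
    """Only the two medal types named in the text; a real list would be longer."""
    BEST_GAME = auto()
    TACTICAL_BLUNDER = auto()


def has_all_medals(game_medals, wanted):
    """True only if every selected medal is attached to the game."""
    return (game_medals & wanted) == wanted


both = Medal.BEST_GAME | Medal.TACTICAL_BLUNDER
```

A game carrying only `Medal.BEST_GAME` passes a search for that one medal but fails a search for `both`, which is why adding a second medal type shrinks the list.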

Note also that there is one exception to Footnote 12 above. A user can't manually assign medals to games using Fritz, but the program will occasionally add the "Tactical Blunder" medal to a game as part of the overnight analysis features.

People frequently ask what the various medal colors mean. I could provide a list (and have done so in previous articles) but some users want to quibble about the names used to describe the colors [13]. Your best bet is simply to click on a box in the Medals dialogue of the Search mask -- the box at the top of the dialogue will change and display the color that corresponds to that medal type.

[13] Various graphics cards can (and do) display colors differently from each other; what appears as "Cyan" while using one card may appear somewhat differently when another card is employed. Throw in the fact that some users' definition of "Cyan" may differ from the definition used by the programmers, and that other users might not even know what "Cyan" is, and you can see where the confusion easily occurs.

That's all for medals. See? I told you that this would be a short section! Next we'll put together some searches which use multiple dialogues of the Search mask.

Many years ago, while I was working in auto parts, a customer came to the counter to inquire about a part. He gave me the part description and his vehicle's specs. I looked up the part, pulled it from inventory, and brought it out to him.

"That's not what I want!" he cried indignantly. I dutifully reviewed the description of what he'd asked for and the year, model, and engine of his car. Double-checking everything, I discovered that I'd pulled the exact part he'd requested. "But this is what you asked for," I told him.

"Damn it!" he cried. "Don't get me what I asked for -- get me what I want!"

I see a lot of parallels between that story and the experiences of some chess database users. Back when I did telephone support for chess software I would sometimes have to diagnose the reason why a user's database search wasn't locating what he needed. Unless there was some weird technical glitch (like a corrupted database file), the problem wasn't with the program -- the root cause was located between the keyboard and the chair.

See, a piece of computer software is stupid; it can't think for itself. It can't make intuitive leaps -- it can only search for and find what you ask it to find. It will always get you what you asked for, not necessarily what you wanted.

That's why you need to be specific about what you ask Fritz' (or any other program's) database Search mask to locate for you. This is exactly the reason why we've described in detail how to use the various Search mask dialogues: to help you learn to ask for the material you want with a minimum of confusion.

Now we're going to start looking at "putting it all together" by using the different dialogues to locate what you need and help you avoid a few pitfalls along the way.

I've learned that the single biggest mistake that users make is to ask the program for too much information. Just because a dialogue item exists doesn't mean that you have to use it in every single search. We'll start with a simple example. Let's look for all of the King's Indian Defense games in a particular large database. The easiest way to do this is by searching for ECO codes E60 to E99. [14] After performing the search, we come up with 156,443 games.

[14] If you're going to do a lot of database searches, I heartily recommend that you learn the ECO codes for the openings that you regularly play. You can just type the alphanumeric code into the "ECO" field of the Search mask and very quickly be presented with a list of all games which qualify. If you want to see a "translation table" of ECO codes and their equivalent English names, you can find one here. (And please, please, please bookmark it as one of your "Favorites" now. You would not believe the number of e-mails I get from people who ask me for some link or other that I presented in an article from years gone by. Trust me -- after writing nearly 500 chess software articles on various websites, if you can't remember the article in which I presented a link, neither can I.)

Now let's look for K.I.D. games again, but this time we'll add an isolated White pawn on d4 to the search parameters.[15] With this extra parameter added, the search turns up considerably fewer games -- this time the program finds 3,706 games.

[15] The technical details for how to do this are as follows. Click the "Game data" tab and type "E60" (without the quotes) in the lefthand box to the right of "ECO". In the righthand box, type "E99". You've just told the program that you want all games from ECO codes E60 to E99. Now click the "Position" tab. With the "'Look for' board" radio button selected, place a White pawn on d4. Now click the "'Exclude' board" radio button and place White pawns as follows: c2 through c7 inclusive, e2 through e7 inclusive, and d2, d3, d5, d6, and d7. Click the "OK" button and the program will search for all King's Indian Defense games in which White has an isolated d4 pawn.

Now let's toss one more parameter into the mix: we'll do the same search but limit it to the years 1990 through 1999. Upon doing this search we come up with a total of 2,060 games. So we can see that each time we add an extra parameter to the search we get fewer hits. It appears to be a paradox, but it's true: the more information you supply in the Search mask, the less information you receive in return.
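The "more in, less out" effect falls straight out of the logic: every parameter is one more test a game has to pass. The Python sketch below shows the idea with three invented games. The dictionary fields and function names are assumptions for the example, and the helper simply rebuilds the list of "Exclude" squares from Footnote 15.

```python
def search(games, *filters):
    """Keep only the games that pass every filter supplied."""
    return [g for g in games if all(f(g) for f in filters)]


def isolated_pawn_exclusions(file, rank):
    """Squares given White pawns on the 'Exclude' board in Footnote 15:
    ranks 2-7 of the neighbouring files, plus the rest of the pawn's own file."""
    files = "abcdefgh"
    i = files.index(file)
    neighbours = [files[j] for j in (i - 1, i + 1) if 0 <= j < 8]
    squares = [f"{f}{r}" for f in neighbours for r in range(2, 8)]
    squares += [f"{file}{r}" for r in range(2, 8) if r != rank]
    return squares


def is_kid(game):
    # ECO codes are a letter plus two digits, so plain string comparison works.
    return "E60" <= game["eco"] <= "E99"


def in_nineties(game):
    return 1990 <= game["year"] <= 1999


# Invented sample data, not taken from any real database.
games = [
    {"eco": "E97", "year": 1994},
    {"eco": "E62", "year": 1978},
    {"eco": "B33", "year": 1994},
]

kid_games = search(games, is_kid)
kid_nineties = search(games, is_kid, in_nineties)
d4_exclusions = isolated_pawn_exclusions("d", 4)
```

One filter leaves two of the three games; adding the second leaves one. A result list can only stay the same size or shrink as filters are added.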

Now we'll look at another interesting phenomenon that I like to call "Garbage in: Garbage out".[16] If you do a player search you need to spell the player's name correctly if you want any hits. I wish I had a dollar for every phone call or e-mail I've received from users who say, "Your program is crap! I did a search for all of Bobby Fischer's games and got nothing back! Your software doesn't work!"

[16] "Garbage in: Garbage out" (or GIGO for short) is an old computer geek term for exactly the same phenomenon I'm describing. If you input junk, you get junk back.

Of course, upon further investigation it's discovered that the problem again occurred somewhere between the keyboard and the chair. The first thing I do is ask how the user spells "Fischer". And, of course, 99% of the time they've left out the "c" and spelled it "Fisher". Chess computer software can't make "fuzzy" assumptions about what you're really looking for; when you type in "Fisher", the program looks for games played by people with that exact name. That old rant "Don't get me what I asked for -- get me what I want!" just doesn't cut it here -- all a program can do is find exactly what you tell it to find. And if you tell it the wrong thing, you get something other than what you wanted. Garbage in, garbage out.

Most of the other 1% of "Fischer errors" are caused by the user typing "Bobby" in the field for the player's first name. In professional quality databases, the man's name is given as "Robert", not "Bobby". We'll come back to the remainder of that last 1% in a moment, after we first hit on another important point.

Player name searches can be tricky and much depends on the quality of the database you're using. Let's use Bobby Fischer as an example again. If you do a player search for "Fischer" (no first name or initial), you'll get Bobby's games -- but you'll also get other players whose last names are also Fischer. You'll need to cut down the search by adding a parameter: the first initial "R". This will get you closer, but you'll still get some other players mixed in. So you spell out the entire first name: "Robert".

And this leads right to another rant I once heard. "Your database is crap! I did a search for "Robert Fischer" and got games played between 1973 and 1991, and after 1992! Everybody knows that Bobby Fischer was inactive during those years! What are you guys trying to pull??"

Nothing. The database search is correct. The other Robert Fischer is a USCF Master and is a frequent player in the DC/Maryland/Virginia area.[17] Bob's games will turn up in a search of any of the larger databases when you use the player name parameters of "Robert Fischer".

[17] I know Bob and he's a really good guy; if you bump into him at a tournament, please give him my absolute best regards. One of my favorite memories from Virginia Chess Federation tournaments was when some young kid would see Bob's name on a wall chart and freak out: "Bobby Fischer's here! He's playing on Board Three!" Man, I don't care how many times I witnessed that; it never got old.

So you might try adding yet another parameter: specifying an Elo rating of 2600+. The problem is that this will eliminate most of Bobby Fischer's games: the majority of his career took place before the introduction of the Elo system, so the bulk of his games won't carry a rating attached to his name.
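Here's a rough sketch of how each added parameter narrows the results, and how the rating filter throws out unrated games along with the unwanted ones. Everything in it is an assumption made for illustration: the records, the ratings, the extra players, and the `search` function itself.

```python
# Invented records for illustration; "elo" is None where no rating is attached.
games = [
    {"name": "Fischer, Robert", "year": 1958, "elo": None},
    {"name": "Fischer, Robert", "year": 1963, "elo": None},
    {"name": "Fischer, Robert", "year": 1972, "elo": 2700},  # made-up rating
    {"name": "Fischer, Robert", "year": 1985, "elo": 2250},  # a different Robert Fischer
    {"name": "Fischer, Rudolf", "year": 1970, "elo": None},  # hypothetical player
    {"name": "Fischer, Hans", "year": 1990, "elo": 2300},    # hypothetical player
]


def search(games, surname, first="", min_elo=None):
    """Filter by exact surname, first-name prefix, and optional minimum Elo."""
    hits = []
    for g in games:
        last, _, given = g["name"].partition(", ")
        if last != surname:
            continue
        if not given.startswith(first):
            continue
        # A game with no rating can never satisfy a minimum-rating criterion.
        if min_elo is not None and (g["elo"] is None or g["elo"] < min_elo):
            continue
        hits.append(g)
    return hits


by_surname = search(games, "Fischer")                      # 6 games
by_initial = search(games, "Fischer", first="R")           # 5 games
by_first_name = search(games, "Fischer", first="Robert")   # 4 games
rated_2600 = search(games, "Fischer", first="Robert", min_elo=2600)  # 1 game
```

The last search gets rid of the 1985 game, but it also discards the two unrated games from the 1950s and 1960s that we actually wanted.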

The point? Sometimes you have to pare down the search results manually because life ain't perfect. In a perfect world (at least as far as databasing is concerned), Bobby Fischer's entire career would have been Elo rated, or he'd never have left chess, or never have made a brief 1992 comeback, or the US Master named Robert Fischer would have been named "Fred Fernwinkle" instead (though Bob would doubtless take issue with that last point). Sometimes you're just going to have to live with the fact that a perfectly-specified search is going to turn up some unwanted results due to the fact that the world isn't perfect. Murphy's Law is the underlying rule of the universe.

Another example is "A. Karpov". You can do a search for "Karpov" and get all of the former World Champion's games -- along with lots of other Karpovs and even a "Karpova" or two. If you limit the search by using "Karpov, A." you get closer, but you also get Al Karpov's games (no joke -- try it and see). You can try "Karpov, Anatoly" and get dangerously close, but then you miss games listed as "Karpov, Anatoli" as well as games in which no first name or initial is provided.

And that drags us kicking and screaming to two more points. The first is that a database search is only as good as the source data. If you're working with a database in which the games of some players appear under multiple name spellings, or in which some use first initials/names as part of the header info while others don't, you're going to get incomplete results. Even professional, commercial databases contain mistakes here and there -- not a happy thought but understandable when you realize that most commercial databases contain several million games these days.[18]

[18] It's interesting (and kind of humorous) to note here that there has been a huge explosion of available chess data, accelerated by the rise in popularity of the Internet. I started in the chess software business in 1992 and back then a 100,000 game database was considered to be "da bomb". Software programs typically shipped with databases of 2,000 games or less. Keep in mind, though, that the most you could fit on a high-density 3.5" floppy disk back then was 5,000 unannotated games. You can fit a lot more on a CD or DVD today with plenty of room to spare. These days if a chess program doesn't ship with at least a half-million game database the consumer feels ripped off. Ah, the march of progress...

So the quality of the source data definitely has an impact on your searches. Another related problem is an insurmountable one: translation problems between alphabets.

It's tough to translate names from one alphabet to another, especially when certain characters have no single English equivalent. Spelling names phonetically isn't a foolproof solution, either, when some languages contain sounds which have no single equivalent in another tongue. Don't believe me? Try this experiment (if you have the necessary library resources). Pull out a present-day atlas and look up the capital of China; you'll likely see it printed as "Beijing". Now find an atlas from as recently as thirty years ago and find the same location -- it'll be printed as "Peking". Go back farther, to the early 20th century, and you'll see it as "Peiping". But the city's name hasn't changed. The Chinese have been pronouncing it the same way since antiquity. The varied spellings represent the ongoing struggle of Westerners to approximate the Chinese pronunciation in print, exacerbated by the fact that the phonetic sounds don't "translate" well into English characters.

The same thing happens with translations between alphabets. It's a noted fact that many strong chessplayers over the last three-quarters of a century have been from the former Soviet Union.[19] This creates problems for folks compiling chess databases because some characters in the Cyrillic alphabet have no single corresponding character in the English alphabet. Viktor Korchnoi's name is the leading example of this problem. In addition to his first name being variously spelled with a "c" or a "k", his last name has been spelled "Kortchnoi", "Korchnoi", "Kortschnoy", "Korchnoy", "Kortchnoj", etc. etc. etc. ad infinitum.

[19] Here's the "Cliffs Notes" version. Early in the life of the Soviet Union, it was decided that the alleged superiority of the Communist system over Capitalism would be proven on many battlegrounds: sports, science, militarily, intellectually. The arena the Soviets chose for the intellectual battle was chess. It was a natural choice, since chess was an ingrained part of the Eastern European culture anyway. Government sponsored programs were established in the USSR to develop the populace's chess skills, particularly in the identification and training of child prodigies. This worked wonderfully well -- that's why we've seen many, many more strong Soviet chessplayers than we've seen emerge from the Western bloc. In fact, the "average, man on the street" citizen of the USSR (back in the day) was a much better chess player than his or her Western counterpart: I've read estimates that an "average" Soviet chessplayer's skill corresponded to a USCF Class A rating. I once knew a Russian emigré who became something of a regional "superstar" in the DC/Baltimore area back in the early 1990's, as he was a very strong USCF Master (once he got rated on these shores). He was baffled by the attention -- "back home" he was considered to be a bit better than average, but nothing special.

So how do we deal with this? If you have a database you've built yourself from a variety of sources, you'll need to manually edit the names to try to achieve some type of uniformity (or else live with the fact that you'll need to do multiple searches to find all of the requisite games). If you have a commercial database all you need to do is try a search; if you get no hits, manually scan down the database list to pick out the player's name and make a note of the spelling.
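If you're comfortable with a little scripting, the "uniformity" idea amounts to a lookup table of known spellings. The sketch below is just one possible approach, not a feature of any chess program: the `ALIASES` table uses the spellings quoted above, the choice of "Korchnoi" as the canonical form is arbitrary, and `normalize` is a made-up helper.

```python
# Variant spellings quoted in the text, mapped to one (arbitrarily chosen) form.
ALIASES = {
    "Kortchnoi": "Korchnoi",
    "Kortschnoy": "Korchnoi",
    "Korchnoy": "Korchnoi",
    "Kortchnoj": "Korchnoi",
}


def normalize(name):
    """Replace a known variant surname in a 'Surname, First' header."""
    surname, sep, rest = name.partition(",")
    surname = surname.strip()
    canonical = ALIASES.get(surname, surname)
    return canonical + sep + rest


headers = ["Kortchnoi, Viktor", "Korchnoy, V", "Karpov, Anatoly"]
cleaned = [normalize(h) for h in headers]
```

After a pass like this, a single search for "Korchnoi" finds games that were originally filed under any of the listed spellings. Names that aren't in the table pass through untouched.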

OK, now we need to backtrack to Fischer again, but it's to make an important point. I once had a baffling phone call from a user who tried a search for Fischer's games as White and got no hits, but he'd spelled Bobby's last name correctly. This puzzled me for a moment or two, until I had a sudden epiphany.[20] I asked the guy to start clicking on the other Search mask tabs, and I found the problem. He'd previously been doing a search for Grob games (1.g4) and hadn't reset the Search mask. The board in the "Position" dialogue still had a pawn on g4 when he'd typed Fischer's name into the Player field. So the program was looking for all games with Fischer as White that started with 1.g4. There are none -- with only a couple of exceptions, Bobby was strictly an e4 player as White.

[20] I've been asked many times through the years how I became a help desk/technical support person and what are the skills required. It takes a really weird type of person to be a "tech head" (or "propellerhead", as I prefer to refer to myself), and I'm no exception. About 90% of it requires an intimate familiarity with the software -- you need to use it extensively and know the features like the back of your hand. Another 7% or 8% of the job requires logical thinking skills and experimentation -- reconstructing the steps that the user was taking when the problem occurred and a willingness to risk your machine and data (and occasionally your sanity) to experiment and uncover the problem and solution. The other 2% or 3% is the ability to "think outside the box" and hit these "sudden epiphanies" which lead you to the solution (usually doing the aforementioned experimentation for confirmation). I'm not blowing my own horn here, far from it. It's often tough to be "different". In fact, most of us propellerheads are genuine mutants -- and Professor X isn't taking any more applications for new students.

That happy little accident leads us right to another point: the Search mask doesn't reset between searches unless you click the "Reset" button. (The one exception: the Search mask resets when you exit the program.) This is a huge point -- it's mondo important. I've had dozens of similar calls over the years and, in almost all cases, the problem was that the user hadn't reset the Search mask between two unrelated searches.

So the "Reset" button is your friend. Don't be afraid to use it.
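For the programmers in the audience, the leftover-criteria problem looks something like the toy model below. It's a sketch only: the `SearchMask` class, its fields, and the sample games are hypothetical, and a "first move" field stands in for the real Position board.

```python
class SearchMask:
    """Toy search mask whose criteria persist until reset() is called."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.white = None        # player name criterion
        self.first_move = None   # stand-in for the Position tab

    def search(self, games):
        hits = []
        for g in games:
            if self.white is not None and g["white"] != self.white:
                continue
            if self.first_move is not None and g["first_move"] != self.first_move:
                continue
            hits.append(g)
        return hits


# Invented sample data.
games = [
    {"white": "Fischer", "first_move": "e4"},
    {"white": "Fischer", "first_move": "e4"},
    {"white": "Smith", "first_move": "g4"},
]

mask = SearchMask()
mask.first_move = "g4"
grob_hits = mask.search(games)    # the one 1.g4 game

mask.white = "Fischer"            # forgot to reset: 1.g4 is still in the mask
stale_hits = mask.search(games)   # Fischer AND 1.g4: nothing

mask.reset()
mask.white = "Fischer"
fresh_hits = mask.search(games)   # both Fischer games
```

Nothing is broken in the second search; the mask is faithfully combining the new criterion with the old one.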

We're pretty close to the end of the line with this long article on database basics; I've saved the toughest stuff for last. We might get a wee bit technical with this closing section, but I promise to try to make it as painless as possible.

I want to talk a bit about the speed of database searches and give you a few tips on how to rev them up a bit. There's no "magic tweak" involved, no esoteric computer jargon to master -- it's just a matter of using some common sense when setting up your search criteria.

We've already discussed the "less is more" phenomenon: the less information you provide in the Search mask, the more data you get back when the search is complete. There are a few practical applications of this phenomenon that we can use to our advantage.

The first application involves the knowledge that there's a hierarchy to how your chess program (ChessBase or one of the Fritz family of playing programs) performs a search. If you create a search that involves both header information (the stuff under the "Game data" tab of the Search mask) and either the "Annotations" or the "Position" tab, the program will look for the header info first before looking "inside" the game itself for annotations or a position. So, for example, if you perform a search for a particular board position but limit the search to, say, a particular ECO code, the program will start scanning the database and look at just the ECO codes in the game headers. It will completely disregard any game that doesn't match the ECO code you entered, but when it locates a game of that ECO designation, it will then look "inside" the game for the position you entered in the Search mask.

You might be thinking, "Big deal. So why is this important?" You can check this out for yourself. Play a variation of the Ruy Lopez, let's say eight moves deep, on a board and then use the "Get board" feature (previously discussed in this series) to transfer it to the "Position" tab of the Search mask. Start your search. If you have a database of more than two million games, go make yourself a cup of coffee and come back in a few minutes -- it's going to take a while for your program to complete the search. Why? Because it's searching every game in the database for that board position.

So how do we speed this up? You'll remember that you entered an eight move variation of the Ruy Lopez; while it's possible that the position might appear in a non-Ruy game by some bizarre transposition, it's really not very likely. So you can speed up the search by using the "ECO" field under the "Game data" tab to limit the program's search to just Ruy Lopez games. In this case you'd enter "C60" in the first ECO field and "C99" in the second ECO field. Now the program will totally ignore any game that's not a C60-C99 game and look for your desired position only in games that fall within that range of codes.

You can refine this even further. Let's say that your variation and position were from the Ruy Lopez Exchange. You could use the codes C68 and C69 to speed the search up even more.

There's another way to crank up the speed still further. You'll note under the "Position" tab that there are fields for "First" and "Last" move numbers. Since you entered an eight-move variation, you'd want to set the "First" field to "7" and the "Last" field to "9". [21] Now the program will look at just C68 and C69 games, and examine only moves 7 through 9 for your desired position (instead of moves 1 through 40 as it did when you used the default values). If you try this, you'll notice that the search time is reduced drastically compared to when you did just a position search without any change in the range of ECO codes or move numbers.

[21] I like to "buffer" the values by a move or two either side of the actual number of moves in the variation. This allows the program to catch the all too frequent transpositions which can occur a move or two sooner or later than the move number of a desired variation.

So if you find that your Position or Annotation searches are taking too long, you can speed them up a bit by adding something in the "Game data" fields to limit the search. This allows you to use the reverse of the "less is more" principle to make that concept work to your advantage.
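To see why the header check and the move-number range save so much work, here's a small simulation that counts how many positions get compared. It's a sketch under loud assumptions: positions are just labeled strings, the three sample games are invented, and `position_search` only imitates the "headers first, then look inside the game" order described above rather than reproducing any program's actual code.

```python
def position_search(games, target, eco_range=None, first=1, last=40):
    """Return (ids of matching games, number of position comparisons made)."""
    hits, comparisons = [], 0
    for g in games:
        # Cheap header test first: skip games outside the ECO range.
        if eco_range is not None and not (eco_range[0] <= g["eco"] <= eco_range[1]):
            continue
        # Expensive part: look "inside" the game, move by move.
        for move_no in range(first, last + 1):
            if move_no > len(g["positions"]):
                break
            comparisons += 1
            if g["positions"][move_no - 1] == target:
                hits.append(g["id"])
                break
    return hits, comparisons


def fake_game(game_id, eco, label, length=40):
    """Build an invented game whose position after move n is the string 'label-n'."""
    return {
        "id": game_id,
        "eco": eco,
        "positions": [f"{label}-{n}" for n in range(1, length + 1)],
    }


games = [
    fake_game(1, "C68", "ruy-exchange"),
    fake_game(2, "B90", "najdorf"),
    fake_game(3, "C65", "berlin"),
]
target = "ruy-exchange-8"

slow_hits, slow_work = position_search(games, target)
fast_hits, fast_work = position_search(
    games, target, eco_range=("C68", "C69"), first=7, last=9
)
```

Both searches find the same game, but in this toy data the unrestricted search makes 88 comparisons while the restricted one makes 2. Scale that ratio up to a couple of million games and you have the difference between a coffee break and a near-instant result.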

There's almost an art to setting up successful searches. You might remember a game from your database annotated by, say, Kasparov, in which he included a text note about a "Tal-like sacrifice". Instead of just doing an annotation search for that phrase, you can cut down the search time by including "Kasparov" in the "Annotator" field. This way the program will use only games annotated by Kasparov as its "starting point" before looking "inside" the games for the phrase you want.

This technique works well for cutting down the amount of material you'll need to wade through after a successful search. For example, if you do a search for an isolated d-pawn (as we did in an earlier article in this series) you'll likely be confronted by tens of thousands of "hits" if you're searching a database of more than two million games. A good way to limit the results to something more manageable would be to include just a single ECO code for one of your favorite openings (or possibly a range of codes if necessary). If you find that you're still getting too many hits, you might consider selecting the box next to "Variations" under the "Annotation" tab (assuming, of course, that you're working with a database that contains some annotated games). This would limit the search to annotated games of a particular opening which contain an isolated d-pawn position. Here again, we're twisting the "less is more" phenomenon around and standing it on its head -- we're deliberately including "too much information" in the search criteria to purposely limit the number of hits. We're turning a disadvantage into an advantage, a principle that should be familiar to most chessplayers.

Even if you're doing a straight Position search with no applicable "Game data" criteria, you can always use the "First" and "Last" fields to limit the search to specific parts of the game. If you're looking for a specific opening position (or position fragment), it's silly to have the program search through every position from moves 1 through 40 when moves 4 through 10 would do nicely. The same thing applies to endgames -- if it's a Rook and pawn ending, searches that include moves before move 30 or 40 are really just a needless waste of time (unless you really want to find "demolition derbies" in which the players hoovered the pieces off of the board early in a hell-for-leather rush to get to the endgame).

This brings us to another important point: searching for complete middlegame and endgame positions is generally a waste of time. While the game of chess has a finite number of possible positions, that number is still astronomical. I can tell you from experience that it's highly unlikely that a particular middlegame or endgame position from one of your games has appeared in grandmaster or master play. This has nothing to do with the "quality" of their games versus that of games played by those of us down here in the fishpond, but is instead a result of the near-infinite possible board positions in chess.

While I certainly don't want to discourage you from doing such a search (on the outside chance that your game position has occurred in some past game in your database), it's much more productive to look for position fragments (partial positions) in the hope that one of them will be close to the position you're researching. For example, I was involved in a Rook and pawn endgame in a correspondence game a decade ago and I was considering offering a draw -- I couldn't see a way for either of us to gain an advantage in that particular position. I did a database search for the exact position and came up with nothing. But when I limited the search by creating a position fragment (of just the Kings and Rooks on the same particular squares) I came up with a game that was just one move off -- one of the pawns was one square away from the position in my own game. I played through the moves and saw that the game was indeed drawn -- so I offered a draw with reasonable confidence that I was doing the right thing.
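The difference between an exact-position search and a fragment search can be sketched in a few lines. The positions below are invented (they are not the ones from my correspondence game), boards are modeled as simple square-to-piece dictionaries with uppercase for White and lowercase for Black, and `matches_fragment` is a hypothetical helper.

```python
def matches_fragment(position, fragment):
    """True if every piece in `fragment` stands on the same square in `position`."""
    return all(position.get(square) == piece for square, piece in fragment.items())


# Invented Rook and pawn ending: the position being researched.
my_position = {"g1": "K", "a7": "R", "g2": "P", "g8": "k", "b2": "r", "h7": "p"}

# A database position that differs only in one pawn (g3 instead of g2).
db_position = {"g1": "K", "a7": "R", "g3": "P", "g8": "k", "b2": "r", "h7": "p"}

# Fragment: keep only the Kings and Rooks, ignore the pawns.
fragment = {sq: p for sq, p in my_position.items() if p.upper() in ("K", "R")}

exact_match = my_position == db_position                    # False
fragment_match = matches_fragment(db_position, fragment)    # True
```

The exact comparison fails because of a single pawn, while the fragment comparison ignores everything it wasn't asked about and turns up the near miss.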

When you do any kind of general search (such as a search for games of a particular ECO code) on a huge database and get hits, you'll likely get a lot of them. There's often too much material to look at -- even two hundred games is a whopping great amount of data. So how do you know what to look at?

I usually start by looking at annotated games. You'll spot these in the game list by looking for the letters "V" (for Variations) and "C" (for commentary) in the far righthand column. In fact, I'll often do the search a second time with "Variations" checked under the "Annotations" tab. [22] I can then play through these games and get the benefit of the expert commentary included with these games.

[22] I use "Variations" because it's not often that a game will be annotated without them. While it's possible for a game to contain text commentary without the inclusion of replayable variations, it's a really rare occurrence.

If I still see a lot of games with commentary in the list, I'll then use the presence of medals as a further criterion. I'll scan down the game list of annotated games and if I see one with a medal I'll play through it. After all, the whole purpose of medals is as a device used to call your attention to particular aspects of significant games.

I don't often include medals in my initial search. Medals weren't included in ChessBase format games until relatively recently (within the last few years), so you miss out on a lot of material if you include a particular medal in your search. One exception, though, is the "Model game (opening plan)" medal. When I enter an opening variation by hand and save it into a database, I'll often insert this particular medal. Later when I'm searching for these commented opening variations I can just do a medal search for "Model game" and the program will pull these games up straight away.

Finally, to bring this article full circle, we need to look again at why database games are important. Some players, particularly novices, don't understand why they should even bother with database searches. To understand their importance, we need first to realize that chess is about more than just playing. It's also about studying and improving. It's also about more than just calculation -- it's also about pattern recognition and memory.

This is why chess books are so popular (and useful). A stronger player or teacher presents important principles in a lesson in a chess book. In 99.9% of these cases, he or she will also include examples from actual games. These examples are provided to illustrate the lesson and to reinforce the lesson in the mind of the reader.

But there are limits to printed books; to keep them at a reasonable length, the author often provides just two or three examples of the principle in action. Many times, though, these principles (Rooks on open files, fianchettoed Bishops, isolated pawns, particular pawn skeletons, etc.) are searchable within a database simply by using the Search mask as we've learned over the past several articles in this series. Let's say you're reading a chess book and come across a lesson on the value of a Rook on the seventh rank. The author has provided a couple of examples, but you'd like to see more. Just fire up your chess program and do a search for a White Rook on any square from a7 through h7 -- you'll find more examples than you can shake a stick at. Play through a dozen or two of these games and the importance of a Rook on the seventh should become firmly fixed in your mind.
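In programming terms, "a White Rook on any square from a7 through h7" is an either/or test across eight squares. Here's a minimal sketch; the three positions are invented, boards are plain square-to-piece dictionaries (uppercase for White), and `white_rook_on_seventh` is a made-up function rather than anything built into a chess program.

```python
def white_rook_on_seventh(position):
    """True if a White Rook ('R') stands on any square from a7 through h7."""
    return any(position.get(file + "7") == "R" for file in "abcdefgh")


# Invented sample positions keyed by a game number.
positions = {
    1: {"g1": "K", "d7": "R", "g8": "k"},   # White Rook on d7
    2: {"g1": "K", "d1": "R", "g8": "k"},   # Rook still on the first rank
    3: {"g1": "K", "c7": "r", "g8": "k"},   # Black Rook on c7 doesn't count
}

seventh_rank_games = [n for n, pos in positions.items() if white_rook_on_seventh(pos)]
```

Only the first position qualifies. The Search mask does the same kind of test for you when you place the Rook on all eight squares of the rank.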

Another reason for utilizing database searches is just sheer enjoyment. I have several favorite players and I sometimes like to kick back with a cold one, do a search for the games of, say, Adolf Anderssen, and play through them, just to admire the man's genius. And playing through these games can have a subtle, but important, side effect: inspiration. Playing through Anderssen's, Tal's, or Shirov's games often inspires me to look for sacrificial opportunities in my own games. I actually won a "lost" correspondence game when, inspired by Alexei Shirov, I offered an unsound sacrifice that totally threw my opponent off of his game.

We play through database games for a lot of reasons: knowledge, reinforcement, entertainment, inspiration. But there's a common thread in all of these cases -- our own chessplaying improves. Sometimes it's because we learned something new. Sometimes it's because we spotted a common theme or pattern that we later come across in our own games. Sometimes it's because we're inspired by other players to take chances or jump on opportunities that we'd otherwise have missed. But in all cases we're better players because we reviewed what other, stronger chessplayers have done in their games. And that's the reason why database searches are important.

Hopefully this article has made the process of creating and executing these searches a whole lot easier and more understandable for you.

Steve Lopez is a professional chess writer from Maryland who has been writing about and supporting chess software for more than a decade. He's also written and edited several chess books and training CDs, some of which are available or are forthcoming from Chess Central.

© 2004, 2009, Steven A. Lopez. All rights reserved.
