A short history of computer chess
The first chess machine
In 1769 the Hungarian engineer Baron Wolfgang von Kempelen built
a chess playing machine for the amusement of the Austrian Empress
Maria Theresia. It was a purely mechanical device, shaped like
a Turk. Naturally its outstanding playing strength was supplied
by a chess master cleverly hidden inside the device. The machine
was a fake.
Turing's "paper machine"
It is an amazing fact that the first chess program was written
before computers were actually invented. It was written by a visionary
man who knew that programmable computers were coming and that,
once they were invented, they would be able to play chess.
The man was Alan Turing, one of the greatest mathematicians
who ever lived. Turing led the project group that broke the German
"Enigma" code and so helped decide the outcome of the
Second World War. He was very interested in chess, but in spite
of having a brilliant intellect and putting a lot of effort into
learning the game he remained a fairly weak player. Soon after
the war he wrote the instructions that would enable a machine
to play chess. Since there was as yet no machine that could execute
the instructions he did so himself, acting as a human CPU and
requiring more than half an hour per move. One game is recorded,
which Turing's "paper machine" lost to one of his colleagues.
Here's the historic game: Turing's paper machine – Alick
Glennie, Manchester 1952: 1.e4 e5 2.Nc3 Nf6 3.d4 Bb4 4.Nf3 d6
5.Bd2 Nc6 6.d5 Nd4 7.h4 Bg4 8.a4 Nxf3+ 9.gxf3 Bh5 10.Bb5+ c6 11.dxc6
0-0 12.cxb7 Rb8 13.Ba6 Qa5 14.Qe2 Nd7 15.Rg1 Nc5 16.Rg5 Bg6 17.Bb5
Nxb7 18.0-0-0 Nc5 19.Bc6 Rfc8 20.Bd5 Bxc3 21.Bxc3 Qxa4 22.Kd2?
[22.h5 would have trapped the bishop] 22...Ne6 23.Rg4 Nd4? [23...Rxb2!
24.Bxb2 Rxc2+] 24.Qd3 Nb5 25.Bb3 Qa6 26.Bc4 Bh5 27.Rg3 Qa4 28.Bxb5
Qxb5 29.Qxd6 Rd8 0-1.
Shannon's strategies
Around the same time as Turing, another great mathematician,
Claude Shannon of the Bell Laboratories, was thinking about teaching
a computer to play chess. He realized that the problem would be
the very large number of continuations, so he differentiated between
an "A-Strategy" which looks at all continuations and
a "B-Strategy" which cuts off certain lines. Today we
differentiate between "brute force" and "selective"
programs, although all strong programs belong more or less to
the former category.
Chess instead of atomic bombs
During the war the United States built a giant laboratory in
Los Alamos in the deserts of New Mexico. Its purpose was the development
of atomic weapons. Working out the correct shape of the implosion
charges that trigger the chain reaction required a very large
number of calculations.
In 1946 the Hungarian/American mathematician John von Neumann
was given the task of designing a powerful calculation machine
to speed up the task. In 1950 a giant machine called MANIAC I
was delivered. It was filled with thousands of vacuum tubes and
switches and could execute 10,000 instructions per second. It
was also programmable.
Instead of immediately getting to work on the bombs the scientists
started experimenting with the machine. And one of the first things
they did was to write a chess program. It was for a reduced 6
x 6 board without bishops. In spite of this the program needed
twelve minutes to search to a depth of four ply (with the bishops
it would have required three hours).
The program played three games in the mid fifties. The first
was against itself (White won), the second against a strong player
who spotted it a queen. The game lasted ten hours and the master
won. Finally it played against a young lady who had learnt the
game a week earlier. The program won the game in 23 moves. It
was the first time a human had lost to a computer in a game of
intellectual skill.
Here's the second historic game (6 x 6 board, no bishops, no
double-step for pawns, no castling)
MANIAC 1 – Human, Los Alamos 1956: 1.d3 b4 2.Nf3 d4 3.b3
e4 4.Ne1 a4 5.bxa4? [5.Nd2 and 6.Nd2-c4+ Nb6xc4 7.b3xc4 with a
good game] 5...Nxa4 6.Kd2? Nc3 7.Nxc3 bxc3+ 8.Kd1 f4 9.a3 Rb6
10.a4 Ra6 11.a5 Kd5 12.Qa3 Qb5 13.Qa2+ Ke5 14.Rb1 Rxa5 15.Rxb5
Rxa2 16.Rb1 [to prevent 16...Ra1 mate!] 16...Ra5 17.f3 Ra4 18.fxe4
c4 19.Nf3+ Kd6 20.e5+ Kd5 21.exf6=Q Nc5 22.Qf6xd4+ Kc6 23.Nf3-e5
mate.
Chess and mathematics
The main problem of chess programming is the very large number
of continuations involved. In an average position there are about
40 legal moves. If you consider every reply to each move you have
40 x 40 = 1600 positions. This means that after two ply (half-moves),
which together make up a single move in chess, 1600 different positions
can arise. After two moves it is 2.5 million positions, after
three moves 4.1 billion. The average game lasts 40 moves. The
number of potential positions is in the order of 10^128 (10 to
the power of 128), which is vastly larger than the number of atoms
in the known universe (a pitiful 10^80).
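The arithmetic can be checked in a few lines of Python (the branching factor of 40 is of course only an average, so these are rough orders of magnitude):

```python
# Growth of the chess game tree, assuming an average of 40 legal
# moves per position (the article's figure; real positions vary).
BRANCHING = 40

for plies in (2, 4, 6):
    print(f"{plies} ply: {BRANCHING ** plies:,} positions")
# 2 ply: 1,600 positions
# 4 ply: 2,560,000 positions
# 6 ply: 4,096,000,000 positions

# After 40 full moves (80 ply) the count passes 10^128,
# dwarfing the roughly 10^80 atoms in the known universe:
print(BRANCHING ** 80 > 10 ** 128)  # → True
```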
It is clear that no computer or any other machine will solve
the game by looking at all possible continuations. But human beings
are also imperfect players. It is only a question of what depth
of search is required for a machine to match human strategic skill.
Early computers were able to generate and evaluate about 500 positions
per second, or 90,000 in the three minutes that are available per
move in a tournament game. This means they could search only three
ply (one and a half moves) deep. That led to extremely weak play
– the level of a chess novice. To go one ply deeper required
about 15,000 positions per second, a thirty-fold increase. But
even four ply is very shallow. So it seemed unlikely that computers
would ever play master-level chess.
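Under these assumptions (a full-width search of all 40 moves at every ply, and three minutes per move), the relationship between speed and reachable depth is simple logarithm arithmetic; a sketch:

```python
import math

# A sketch, assuming full-width search with no pruning:
# 40 legal moves per position, three minutes of thinking per move.
BRANCHING = 40
SECONDS_PER_MOVE = 180

def reachable_depth(positions_per_second):
    """Deepest ply count whose full tree fits into the time budget."""
    budget = positions_per_second * SECONDS_PER_MOVE
    return int(math.log(budget, BRANCHING))

print(reachable_depth(500))     # early computers → 3 ply
print(reachable_depth(15_000))  # thirty times faster → 4 ply
```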
Alpha-beta
A first breakthrough came in 1958 when three scientists of the
Carnegie-Mellon University in Pittsburgh (Newell, Shaw and Simon)
made an important discovery. You can chop off large parts of the
search tree without affecting the final results. They called this
the alpha-beta algorithm. It is important to note that this is a purely
mathematical technique and works without the use of any additional
chess knowledge.
This is, very roughly, how alpha-beta works: say the computer
has finished evaluating a move and started working on a second
move. As soon as a single line shows that it will return a lower
value than the first move we can immediately terminate the search.
We do not need to know exactly how much worse the second move
is. The program will definitely prefer the first move.
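The cutoff idea can be sketched in a few lines of Python; the nested-list tree and its leaf values are invented for illustration and stand in for real positions and evaluations:

```python
# Toy alpha-beta search over a game tree given as nested lists;
# leaves are integer evaluations. Illustrative only.
def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):          # leaf: static evaluation
        return node
    if maximizing:
        value = -float("inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:          # cutoff: this line is already refuted
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

tree = [[3, 5], [6, 9], [1, 2]]
# Same value as a full minimax search, but the last leaf (2) is
# never examined: the third move is refuted by its first reply.
print(alphabeta(tree, -float("inf"), float("inf"), True))  # → 6
```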
Alpha-beta produces exactly the same result as a full search,
while looking at only about the square root of the number of positions
otherwise required. Suddenly the early computers were able to
look five and six ply ahead. In the seventies the world's fastest
computers (e.g. the CDC Cyber series) were able to search seven
ply deep and had achieved a respectable playing strength. But
even with alpha-beta you need a five-fold increase in speed to
go one ply deeper. The exponential explosion of numbers once again
caught up with the programmers.
The hardware machine Belle
Ken Thompson was a computer scientist who couldn't wait for
the million-dollar super-computers to become five or twenty-five
times faster in order to get stronger at chess. He and a colleague
at the Bell Laboratories decided to build a special purpose machine,
using many hundreds of chips worth about 20,000 dollars.
They called the machine "Belle", and it could only
play chess. But it was able to search at about 180,000 positions
per second (the super-computers of the time managed about 5,000 positions per second).
Belle could go down eight to nine ply in tournament games, which
enabled it to play in the master category. It won the world computer
chess championship and all other computer tournaments from 1980
to 1983, until it was superseded by giant Cray X-MPs costing a
thousand times more.
Chess chips
In the middle of the eighties Prof. Hans Berliner, a computer
scientist at the Carnegie-Mellon University, picked up where Ken
Thompson had left off. Berliner, who had also been world correspondence
chess champion, built a hardware-driven chess machine called HiTech.
He and his graduate student Carl Ebeling developed a hardware
move generator chip. With 64 chips in parallel HiTech narrowly
missed winning the world computer chess championship in 1986 (it
was won by a Cray).
Soon after that Berliner's students Feng-hsiung Hsu, Murray Campbell
and others developed their own machine called ChipTest and later
Deep Thought. It cost about 5000 dollars and ran at 500,000 positions
per second. Hsu and Campbell subsequently broke with their teacher
and joined IBM. Together with Joe Hoane they built IBM's current
Deep Blue.
Deep Blue
The machine Garry Kasparov played against in Philadelphia and
New York consisted of an IBM SP/2 server equipped with a large
number of special-purpose chips which do the fast calculations.
Each chip is capable of processing two to three million positions
per second. By using over 200 of these chips the overall speed
of the program could be raised to 200 million positions per second.
Depth vs playing strength
What does a speed of 200 million positions per second imply
in a chess machine? Ken Thompson, the father of Belle (as well
as Unix and the computer language C) conducted some very interesting
experiments in the 80s to correlate depth of search with increase
in playing strength.
Thompson played Belle against itself with one side computing
progressively deeper. On average a single ply of search depth
translated to around 200 Elo points – at four ply Belle
was playing around 1230, at nine ply it had reached 2328 Elo points.
By extending the curve, which flattens at the top end, one could
conclude that a search depth of 14 ply is required to achieve
world championship strength (2800).
The conclusion of experts: you need to build a computer that
runs at one billion nodes per second (and searches 14 ply deep)
if you wish to challenge the human world champion for his title.
Deep Blue comes close, but isn't there yet.
The micros
Naturally the quality of programming also plays an important
role. Today's top PC programs like Fritz and Junior run at 5,000,000
positions per second and more. They realistically have a playing
strength of well over 2700 and are a match for even the top 100
players in the world. In rapid chess only the top dozen or so
can compete, in blitz chess probably only two or three players
could survive.
Assault on both fronts
An important role in the strength of computers is played by
extensive openings books. The collective knowledge and experience
of generations of human masters can easily be stored on hard disk
and accessed during the opening phase of the game. Even the PC
programs know tens of millions of opening positions and have
access to full statistics on each of them (which moves were played,
with what success, by what calibre of players, etc.). Very often
a program will play fifteen or twenty moves of a game before it
begins to compute for the first time. Without the benefit of this
human knowledge in the openings the programs would be considerably
weaker.
While computers are gaining a substantial advantage from the
vast amount of openings knowledge that has accumulated in the
history of chess, they also profit from research at the other
end of the game.
Endgame databases
Once again it was the ubiquitous Ken Thompson who pioneered
this development. In the 80s he began to generate and store all
legal endgame positions with four and five pieces on the board.
A typical five-piece ending, like king and two bishops vs king
and knight, contains 121 million positions. With a pawn, which
is asymmetric in its movements, the number rises to 335 million.
Thompson wrote programs that generated all legal positions and
worked out every forcing line that is possible in each endgame.
He also compressed the resulting data in a way that allowed one
to store about 20 endgames on a standard CD-ROM.
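The underlying technique is retrograde analysis: label the terminal positions and work backwards. A toy sketch on an abstract game graph (the positions and moves here are invented; this shows the idea, not Thompson's actual code):

```python
from collections import deque

# moves[p] lists the positions reachable from p; a position with no
# moves counts as lost for the side to move (a stand-in for "mated").
moves = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],          # terminal: the side to move has lost
    "d": ["c"],
}

def solve(moves):
    """Label each position ('win' or 'loss', plies to the end) by
    working backwards from the terminal positions."""
    preds = {p: [] for p in moves}
    for p, succs in moves.items():
        for s in succs:
            preds[s].append(p)
    remaining = {p: len(s) for p, s in moves.items()}
    result = {p: ("loss", 0) for p, s in moves.items() if not s}
    queue = deque(result)
    while queue:
        p = queue.popleft()
        kind, n = result[p]
        for q in preds[p]:
            if q in result:
                continue
            if kind == "loss":          # q can move into a lost position
                result[q] = ("win", n + 1)
                queue.append(q)
            else:
                remaining[q] -= 1       # one more of q's moves refuted
                if remaining[q] == 0:   # every move loses: q is lost
                    result[q] = ("loss", n + 1)
                    queue.append(q)
    return result                       # unlabelled positions are draws

print(solve(moves))
```

Real tablebases apply the same backward induction to the hundreds of millions of legal positions of an actual material configuration, which is why the lookup then plays the ending perfectly.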
Using these databases a computer will play each of the endgames
absolutely perfectly ("like God"). In any position with
the given material on the board it knows instantly whether it
is a win, draw or loss and in how many moves. Often it will announce
a win or mate in over fifty moves. On the losing side it will
defend optimally. Deep Blue uses Thompson's endgame databases,
and even the PC program Fritz is now implementing them in its
search tree. How this will affect its playing strength remains
to be seen.
Some of the five-piece endings are notoriously difficult or even
impossible for human beings to master. A prime example is queen
and pawn vs queen, in which no human has the slightest chance
against a computer. But these five-piece endgames are tic-tac-toe
compared to the six-piece endings which Thompson is currently
generating. In some six-piece positions you need to play over
200 accurate moves to win. Often it is impossible for the strongest
players in the world to even tell what progress has been made
after 100 moves that the computer tells us are absolutely essential.
Naturally the development of hardware is working in favour of
the computers. Thompson's six-piece endings, which contain 8 to
20 billion positions each, can be compressed to fit nicely on
a DVD.
Luckily seven-piece endings, which contain half a trillion positions
each, are still a long way off. And even more luckily the two
ends – openings research and endgame databases – will
never meet. It is highly unlikely that anyone will ever see a
computer which plays 1.e4 and announces mate in 40. But it is
probably only a matter of time, of a few years or a decade, before
a computer will consistently beat the human world champion in
the game of chess.
Frederic Friedel