DLX from Python

Table of Contents

DLX1 is a solver for exact cover problems written by Donald Knuth.

Fetch Sources

Source at https://cs.stanford.edu/~knuth/programs/dlx1.w

We also need the "gb_flip" random number generator included in http://ftp.cs.stanford.edu/pub/sgb/gb_flip.w , which includes a third file http://ftp.cs.stanford.edu/pub/sgb/boilerplate.w .

wget http://ftp.cs.stanford.edu/pub/sgb/boilerplate.w
wget http://ftp.cs.stanford.edu/pub/sgb/gb_flip.w
wget https://cs.stanford.edu/~knuth/programs/dlx1.w

C Source File

Tangle using cweb , then compile.

I used gcc 14.3.0 which needs an extra -Wno-implicit-int because main (int argc, char *argv[]) { is defined in the source with out an int return type.

nix-shell -p gcc
ctangle gb_flip.w
ctangle dlx1.w
gcc dlx1.c gb_flip.c -o dlx1 -O3 -Wno-implicit-int

Create PDF Documentation

cweave dlx1.w
pdftex dlx1.tex

Python Wrapper

To make sure every row is only used once, individual rows must be stored as frozenset so they are hashable and can be set members themselves.

from collections.abc import Iterable
from dataclasses import dataclass, field
import subprocess


@dataclass
class Problem:
    columns: set[str]
    rows: set[frozenset[str]] = field(default_factory=set)

    def add_row(self, row: Iterable[str]):
        self.rows.add(frozenset(row))


@dataclass
class Solution:
    rows: list[set[str]]

The solver encodes a Problem into a string description for DLX, then parses the string output into a Solution .

While this implementation will be good enough for simple experiments with exact cover problems, it should used with trusted input.

<<types>>

@dataclass
class Solver:
    every_nth: int = 1
    limit: int | None = None
    seed: int | None = None
    timeout: int | None = 10

    def _run_dlx(self, description: str) -> str | None:
        args = ["./dlx1", f"m{self.every_nth}"]
        if self.limit is not None:
            args.append(f"t{self.limit}")
        if self.seed is not None:
            args.append(f"s{self.seed}")

        try:
            res = subprocess.check_output(args, input=description.encode("utf8"), timeout=self.timeout)
        except subprocess.TimeoutExpired:
            print(f"Solver did not finish after {self.timeout} seconds")
            return None

        return str(res, encoding="utf8")


    def _encode_problem(self, problem: Problem) -> str:
        lines = []

        lines.append(" ".join(problem.columns))
        for row in problem.rows:
            lines.append(" ".join(row))

        return "\n".join(lines) + "\n"


    def _decode_output(self, output: str) -> dict[int, Solution]:
        result = {}
        key = None
        row_acc = []

        for line in output.split("\n"):
            leading_space = line.startswith(" ")
            line = line.strip()
            if line == "":
                continue

            toks = line.split(" ")
            # The index of the solution "i:"
            if not leading_space and len(toks) == 1:
                if key is not None:
                    result[key] = row_acc
                    row_acc = []
                key = toks[0]
            # A row of the solution, ending with the three tokens "(m", "of", "n)"
            elif leading_space and len(toks) > 3 and toks[-1].endswith(")"):
                row_acc.append(set(toks[:-3]))
            else:
                raise ValueError(f"Invalid output line '{line}'")

        if key is not None:
            result[key] = row_acc

        return result


    def solve(self, problem: Problem) -> str | None:
        description = self._encode_problem(problem)
        if output := self._run_dlx(description):
            return self._decode_output(output)

Since add_row accepts any Iterable[str] , we have some flexibility when defining problems:

<<init_problem>>

s = Solver()
p = Problem({"a", "b", "c", "d"})
p.add_row(["a", "b"])
p.add_row({"b", "c"})
p.add_row("ad")

print(s.solve(p))

If you have an idea how this page could be improved or a comment send me a mail.