Simplifying boolean expressions (part 2)

So where do go about simplifying a boolean expression? Well, we can go about it through a series of pretty simple steps. EE and CS curriculums cover these algorithms depth and i’m going to use a pretty basic one.

First let’s start with what a boolean expression is. We can start with a pretty simple type definition (which we can extend later). Specifically:

type 'a boolean_formula =

    Constant of bool

  | Variable of 'a

  | Not of 'a boolean_formula

  | And of 'a boolean_formula * 'a boolean_formula

  | Or of 'a boolean_formula * 'a boolean_formula

  | Implies of 'a boolean_formula * 'a boolean_formula

  | Equivalence of 'a boolean_formula * 'a boolean_formula

The type is fairly self explanatory. “Constant” is used for representing the built in bool types like true and false. “Variable” will be used for code expressions and can be something as simple as a parse tree node (in the aggressive optimization case), to something as complex as an object which encapsulates all the semantic meaning behind a snippet of C# code. Why would we need more than the parse tree? Well, say you have an expression in it with both “foo” and “this.foo”, well if they mean the same thing then it helps to know that when trying to simplify an expression.

The types “And” and “Or” correspond to the equivalent C# operators “&&” and “||”. The “Equivalence” type corresponds to the “==” in C#. Implies is rather interesting since it doesn’t really have a direct C# equivalent, but it’s there to help with later portions of the algorithm. For those who haven’t seen it before, the truth table for it looks like:

P

Q

R

T

T

T

T

F

F

F

T

T

F

F

T

If you think the definition is a little odd, it helps to consider the Equivalence relation and to realize that:

(P <-> Q) is the same as ((P -> Q) && (Q -> P))

So, the equivalent form of “!((a && !b) || (!a && b))” would be “Not(Or(And(Variable(“a”), Not(Variable(“b”))),And(Not(Variable(“a”)),Variable(“b”))))”

Ok, so now that we have a our boolean_formula type we need to get it into a special form that our simplification algorithm will work on. Based on work done by Church and Rosser we’ll get it into a “normalized” form where any boolean formula will look like an “if-then-else” statement with a recursive structure (with simple atoms at the leaves). To make things a little more interesting, the expression being tested in the “if-then-else” will always be an atomic form that is either a “Constant” or a “Variable”.

The initial type we’ll use to get that is called the “semi_normal_form”. It’s “semi” because it only encapsulates the concept of the “if-then-else” structure and not that the test expression needs to be an atom.

type 'a semi_normal_form =

    SemiNormalVariable of 'a

  | SemiNormalConstant of 'a

  | SemiNormalIf of 'a semi_normal_form * 'a semi_normal_form * 'a semi_normal_form;;

This now allows us to easily convert our boolean formulas into semi-normal-form using very simple and intuitive conversion rules. Rather than spelling out the rules I’m just going to show the code for how to convert them and show how a single one of them works:

let rec boolean_formula_to_semi_normal_form be =

    let bf_to_snf = boolean_formula_to_semi_normal_form (* use a simpler name internally *)

    in

  match be with

        Constant(_) -> SemiNormalConstant be

      | Variable(_) -> SemiNormalVariable be

      | Not(a) -> SemiNormalIf((bf_to_snf a), SemiNormalConstant (Constant false), SemiNormalConstant(Constant true))

      | And(a,b) -> SemiNormalIf((bf_to_snf a), (bf_to_snf b), SemiNormalConstant(Constant false))

      | Or(a,b) -> SemiNormalIf((bf_to_snf a), SemiNormalConstant(Constant true), (bf_to_snf b))

      | Implies(a,b) -> SemiNormalIf((bf_to_snf a), (bf_to_snf b), SemiNormalConstant(Constant true))

      | Equivalence(a,b) -> bf_to_snf(And (Implies (a, b), Implies (b, a)));;

i.e. if you have an expression of the form “!a” then we can convert that into the form:

if (a) { return false; } else { return true; }

and do similar changes to rest of the formulas.

Ok. Now that we have the expression in semi-normal-form it’s time to get them into normal form. I was looking for a type that could express this but was unable to. So instead we’re still going to stick with the semi_normal_form type and just transform any such expression into one that is guaranteed to be in normal form. We won’t have the type system ensuring this for us, but if we’re careful then it’ll be ok.

In order to get an expression into normal form we’ll exploit a handy boolean expression isomorphism. Specifically:

if (a ? b : c) {

    return d;

} else {

    return e;

}

Can be converted into the normal form:

if (a) {

    if (b) {

        return d;

    } else {

        return e;

    }

} else {

    if (c) {

        return d;

    } else {

        return e;

    }

}

The code for this is just:

let rec semi_normal_form_to_normal_form snform =

    let snf_to_nf = semi_normal_form_to_normal_form

    in

    let rec normalize_head head =

  let nh = normalize_head

        in

        match head with

            SemiNormalConstant(_)

          | SemiNormalVariable(_) -> head

          | SemiNormalIf(a1,b1,c1) ->

              (match a1 with

                   SemiNormalConstant(_)

     | SemiNormalVariable(_) -> head

                 | SemiNormalIf(a2,b2,c2) -> SemiNormalIf(a2, nh(SemiNormalIf(b2, b1, c1)), nh(SemiNormalIf(c2, b1, c1))))

    in

    match snform with

        SemiNormalVariable(_)

      | SemiNormalConstant(_) -> snform

      | SemiNormalIf(a, b, c) -> normalize_head (SemiNormalIf ((snf_to_nf a), (snf_to_nf b), (snf_to_nf c)));;

(the helper function “normalize_head” isn’t actually necessary, but it helps change the function from running in exponential time to linear time).

Now, since we don’t have the type system ensuring that the results of this function are actually in normal form, it helps to have a few function along that will verify it for you:

let rec is_normal_form snform =

    let is_atomic x =

      match x with

            SemiNormalVariable(_)

          | SemiNormalConstant(_) -> true

          | _ -> false

    in

    match snform with

        SemiNormalVariable(_)

      | SemiNormalConstant(_) -> true

      | SemiNormalIf(a, b, c) -> is_atomic a && is_normal_form b && is_normal_form c;;

Ok, so now we’re a pretty good state. Our original boolean formula has been converted into this very handy format that will allow us to simplify it easily.

Like many algorithms this one works by breaking up a complex problem into simple ones and providing rules to handle the simple cases. The rules we have for this are pretty intuitive and extensible for the simple cases, while the complex case gets a little more interesting and needs a little consideration.

The base case rules that we’ll consider by default are the following:

1) If you’re a constant, then you’re already simplified.

2) If you’re a variable then you’re already simplified.

3) If you’re an “if” with a test that reduces to “true” then the result is the simplified “then” clause

4) If you’re an “if” with a test that reduces to “false” then the result is the simplified “else” clause

5) If you’re an “if” and both your “then” and “else” clauses reduce to the same thing, then the result is that thing. (i.e not much point in the test if the result will be the same either way)

6) If you’re an “if” and your “then” reduces to “true” and your “else” reduces to “false” then the result is just your test expression. This is same as converting the expression “if (size() > 0) { return true; } else { return false; }” into “return size() > 0;”

Now the final rule works a little bit interesting. It relies on the following traits:

if (a) {

    //Inside here I know ‘a’ is true and I can replace any a’s with “true”

    if (b) {

  //Inside here I know ‘a’ is true and I can replace any a’s with “true”

        //Inside here I know ‘b’ is true and I can replace any b’s with “true”

        return d;

    } else {

        //Inside here I know ‘a’ is true and I can replace any a’s with “true”

        //Inside here I know ‘b’ is false and I can replace any b’s with “false”

        return e;

    }

} else {

    //Inside here I know ‘a’ is false and I can replace any a’s with “false”

    if (c) {

        //Inside here I know ‘a’ is false and I can replace any a’s with “false”

        //Inside here I know ‘c’ is true and I can replace any c’s with “true”

        return d;

    } else {

        //Inside here I know ‘a’ is false and I can replace any a’s with “false”

        //Inside here I know ‘c’ is false and I can replace any c’s with “false”

        return e;

    }

}

 

In other words, when trying to reduce the “then” and “else” clauses we can use what we know must be true of the test expression in order to help us simplify things. By using this ability to push down this information into our subparts as well as the simple base case rules above we can reach a final simplified form. Here’s where we also see why it was useful to have this normal form where the test-condition had to be a constant or a variable. By doing that we ensure that we can push down information into the subclauses.

So now we wrap all these rules up into a function that will do this for us. Note: there’s an interesting interplay between all these rules. We don’t want to apply the last one (the complex one) before applying the simple cases. However, once we apply the last one we might be able to now apply one of the simple ones (like rule 5 or 6). So we need recursion in the function (but not full recursion as that would result in the last rule being applied over and over again).

But for today that will have to be it. Time for bed. I’ll finish this up tomorrow J