Simplifying boolean expressions (part 3)

or "Where we wrap together all that prototype code and get something actually useful"

 

So now we get to the function that does all the work for us:

 

let rec reduce_normal_form nform1 =

    let rnf = reduce_normal_form

    in

    let rec replace snform var const =

        match snform with

            SemiNormalConstant(_) -> snform

          | SemiNormalVariable(_) ->

              if snform = var then

                  const

              else

                  snform

          | SemiNormalIf(a,b,c) -> SemiNormalIf(replace a var const, replace b var const, replace c var const)

    in

    let simple_cases nform2 =

        match nform2 with

            SemiNormalConstant(_) -> nform2

          | SemiNormalVariable(_) -> nform2

          | SemiNormalIf(SemiNormalConstant(Constant(true)), x, _) -> reduce_normal_form x

          | SemiNormalIf(SemiNormalConstant(Constant(false)), _, y) -> reduce_normal_form y

          | SemiNormalIf(SemiNormalVariable(_), x, y) when x = y -> reduce_normal_form x

          | SemiNormalIf(SemiNormalVariable(_) as v, SemiNormalConstant(Constant(true)), SemiNormalConstant(Constant(false))) -> v

          | _ -> nform2

    in

    let simplified = simple_cases nform1 in

        match simplified with

          | SemiNormalIf(SemiNormalVariable(_) as v, y, z) ->

              let reduced_y = rnf (replace y v (SemiNormalConstant(Constant(true)))) in

              let reduced_z = rnf (replace z v (SemiNormalConstant(Constant(false)))) in

              simple_cases (SemiNormalIf(v, reduced_y, reduced_z))

          | _ -> simplified;;

 

It’s a little complicated so I’ll break it down into constituent parts to explain.  The first part is a helper function called “replace.”  The idea behind it is to take a formula and replaces all instances of a variable in it with the specified constant.  In the previous post I showed how normal-form allowed us to know the values of variables in the “then” and “else” portions of an if-clause.  This function allows for that functionality.  As you can see, if it ever sees a variable in an expression it tests if that variable is equal to the variable passed in.  If so, it replaces it with constant passed in.  If it sees an if-expression then it just recursively calls itself on each constituent part of the expression.  This way the knowledge we have about variables will now get passed down throughout the entire structure.  As it stands, this is actually quite inefficient as we are now traversing the entire tree structure on every replace (which will lead to exponential running time).  In practice we would actually use a better system for doing this that would defer this traversal, but this suffices for the sake of a simple prototype.

 

After that helper function is defined we see another helper function “simple_cases”.  This function handles the 6 cases I specified in the first post.  It deals with the basic simplifications like transforming “if (true) { return x; } else { return y; }” into “return x;”

 

Finally these two functions get pulled into the final bit of work we have.  First we take the expression we have and see if it can be trivially reduced.  After that we see if we now have an expression which we know cannot be trivially reduced anymore.  After this we go into the “then” and “else” portions of the if-expression and we replaces the test expression in the “if” with either “true” or “false”.  After that replacement we then try to reduce the subclauses and after that we might now be in a position where the original rules might again apply and so we run “simple_cases” again.

 

So now we’ve gone through and have removed extraneous boolean checks and we’ve reduced the equation to the smallest possible form.  All that remains is a little function that can take this simplified form and reconstitute it into a boolean_formula that can then be dumped into the code.  This is fairly trivial to write as it us just decomposition and the rules of boolean algebrea that we used before in reverse:

 

let rec semi_normal_form_to_bool_expr snform =

    let snf_to_be = semi_normal_form_to_bool_expr

    in

    match snform with

        SemiNormalConstant(a) -> a

      | SemiNormalVariable(a) -> a

      | SemiNormalIf(a, SemiNormalConstant(Constant(true)), SemiNormalConstant(Constant(false))) -> snf_to_be a

      | SemiNormalIf(a, SemiNormalConstant(Constant(false)), SemiNormalConstant(Constant(true))) -> Not (snf_to_be a)

      | SemiNormalIf(a, b, SemiNormalConstant(Constant(false))) -> And (snf_to_be a, snf_to_be b)

      | SemiNormalIf(a, SemiNormalConstant(Constant(false)), b) -> And (Not (snf_to_be a), snf_to_be b)

      | SemiNormalIf(a, SemiNormalConstant(Constant(true)), b) -> Or (snf_to_be a, snf_to_be b)

      | SemiNormalIf(a, b, SemiNormalConstant(Constant(true))) -> Or (Not (snf_to_be a), snf_to_be b);;

 

Smarter rewrite rules can get us back Equivalence operators, and can also easily be extended to handle relational.

 

And there we have it.  Problem easily handled through a series of simple and understandable steps.  End benefit to the user is that we can now examine complicated expressions and offer simpler forms to help you get clearer code.  As long as your code is obeying the semantics of the boolean algebra operators, then this will be totally safe.  If your code doesn’t obey those semantics… well be pretty assured that people who use your types will end up getting very confused at some point in time.

 

A later post will discuss a type in the .Net framework which doesn’t obey these expected relational semantics.  2 points if you can guess what it is J