Jan Jakubův & Josef Urban @ PIW 2016

AI4REASON @ ARG @ CIIRC @ CTU Prague

Given a benchmark problem set, find a collection of complementary E heuristics maximizing the number of solved problems.

- `E Prover Brief Intro`
- BliStr: Evolving E Heuristics for Benchmark Problems
- Conjecture-related weights for E
- BliStrTune: Evolving E Heuristics Reloaded

- automated theorem prover for FOL with equality
- predefined `--auto-schedule` mode
- command line arguments to guide proof search

- term ordering (`KBO`, `LPO`, ...)
- literal selection (to perform superpositions on)
- clause selection (to select a given clause)
- axiom relevancy pruning (SInE)

- assign integer to each clause
- the smaller the better
- user can use predefined priority functions: `ConstPrio`, `PreferUnits`, `PreferGround`, ...

- assign real to each clause
- the smaller the better
- user can use predefined weight functions
- user can specify parameters
- e.g. `Clauseweight`: basic symbol counting weight
  - `fweight`: symbol weight
  - `vweight`: variable weight
  - `pos_mult`: positive literal multiplier
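
To make the symbol counting concrete, here is a minimal Python sketch of a `Clauseweight`-style function. The term and clause representations are hypothetical choices made for illustration, and the sketch assumes that the equality symbol itself is not counted; it is not E's actual implementation.

```python
# Hypothetical term representation (an assumption for this sketch):
#   - a variable is a string starting with an uppercase letter, e.g. "X"
#   - a constant is a lowercase string, e.g. "a"
#   - a function application is a tuple (symbol, arg1, arg2, ...)

def term_weight(term, fweight, vweight):
    """Count fweight per function/constant symbol and vweight per variable."""
    if isinstance(term, str):
        return vweight if term[0].isupper() else fweight
    _symbol, *args = term
    return fweight + sum(term_weight(a, fweight, vweight) for a in args)

def clause_weight(clause, fweight, vweight, pos_mult):
    """clause: list of (is_positive, lhs, rhs) equational literals."""
    total = 0.0
    for is_positive, lhs, rhs in clause:
        w = term_weight(lhs, fweight, vweight) + term_weight(rhs, fweight, vweight)
        # Positive literals are scaled by pos_mult, negative ones are not.
        total += pos_mult * w if is_positive else w
    return total

# f(X) = a  (positive)  and  g(a, Y) != b  (negative)
example_clause = [
    (True, ("f", "X"), "a"),
    (False, ("g", "a", "Y"), "b"),
]
example_weight = clause_weight(example_clause, fweight=10, vweight=1, pos_mult=1.5)
```

With `fweight=10`, `vweight=1`, `pos_mult=1.5`, the positive literal contributes 1.5 * 21 and the negative literal contributes 31.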

- a clause evaluation function (CEF) is defined by
  - weight function
  - priority function
  - weight function parameters

- syntax: `Clauseweight(PreferUnits,10,1,1.5)`
- assign pair `(prio,weight)` to each clause
- select the clause with the smallest pair
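
Selecting "the smallest pair" amounts to a lexicographic comparison: priority first, weight as the tie-breaker. The following Python sketch illustrates this with toy stand-ins; `prefer_units` and `toy_weight` are hypothetical simplifications, not E's real functions.

```python
def prefer_units(clause):
    """Toy stand-in for a priority function: unit clauses get the best priority."""
    return 0 if len(clause) == 1 else 1

def toy_weight(clause):
    """Toy weight: total number of characters over all literals."""
    return sum(len(literal) for literal in clause)

def make_cef(priority_fn, weight_fn):
    """A CEF maps a clause to the pair (prio, weight)."""
    return lambda clause: (priority_fn(clause), weight_fn(clause))

def select_given(unprocessed, cef):
    """Pick the clause with the lexicographically smallest (prio, weight)."""
    return min(unprocessed, key=cef)

cef = make_cef(prefer_units, toy_weight)
unprocessed = [
    ("p(a)", "q(b)"),   # non-unit, weight 8  -> (1, 8)
    ("r(f(f(a)))",),    # unit,     weight 10 -> (0, 10)
    ("s(a)",),          # unit,     weight 4  -> (0, 4)
]
given = select_given(unprocessed, cef)
```

Note that the heavier unit clause still beats the lighter non-unit clause, because the priority is compared before the weight.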

- combines several clause evaluation functions (CEFs)
- command line syntax:

```
-H'(3*ConjectureTermPrefixWeight(SimulateSOS,1,3,0,1,1,4,1.5,3), \
3*RelevanceLevelWeight2(DeferSOS,1,1,2,1,400,10,18,200,5,4,2), \
5*ConjectureTermPrefixWeight(PreferGroundGoals,1,3,5,10,1,1,1.5,4))'
```
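
One way to read the integer in front of each CEF is as a selection frequency: the prover takes that many given clauses according to one CEF before moving on to the next, in a repeating cycle. The sketch below illustrates this weighted round-robin under that assumption, with toy clauses and toy CEFs; it does not reproduce E's internal queues.

```python
import itertools

def run_selection(clauses, weighted_cefs, steps):
    """Select up to `steps` given clauses using a weighted round-robin of CEFs.

    weighted_cefs: list of (frequency, cef) pairs, mirroring `3*A, 5*B`.
    """
    # Expand `freq*cef` into a repeating schedule: cef, cef, ..., next cef, ...
    schedule = itertools.cycle(
        [cef for freq, cef in weighted_cefs for _ in range(freq)]
    )
    unprocessed = list(clauses)
    picked = []
    for cef in itertools.islice(schedule, steps):
        if not unprocessed:
            break
        given = min(unprocessed, key=cef)  # best clause under the current CEF
        unprocessed.remove(given)
        picked.append(given)
    return picked

# Toy clauses are (age, size) pairs; both CEFs are hypothetical.
def by_size(clause):
    return clause[1]

def by_age(clause):
    return clause[0]

toy_clauses = [(0, 9), (1, 2), (2, 5), (3, 1), (4, 7)]
picked = run_selection(toy_clauses, [(2, by_size), (1, by_age)], steps=4)
```

Here the schedule is size, size, age, size: two small clauses are picked first, then the oldest one, then the next smallest.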

- priority functions
- weight functions
- clause evaluation functions
- heuristics
- `protocol`: proof search control arguments

- E Prover Brief Intro
- `BliStr: Evolving E Heuristics for Benchmark Problems`
- Conjecture-related weights for E
- BliStrTune: Evolving E Heuristics Reloaded

- ParamILS: a method for parameter tuning and algorithm configuration
- by Hutter, Hoos, Stützle, Leyton-Brown, Fawcett
- from University of British Columbia (UBC)
- implementation available for download

- describe configuration parameters and their domains
- write a wrapper to run with a specific configuration
- provide test problems
- run & hope

```
tord {Auto,LPO4,KBO,KBO6} [Auto]
sel {SelectMaxLComplexAvoidPosPred,SelectNewComplexAHP,...} [SelectComplexG]
prord {arity,invfreq,invfreqconstmin} [invfreqconstmin]
simparamod {none,normal,oriented} [normal]
srd {0,1} [1]
forwardcntxtsr {0,1} [1]
splaggr {0,1} [0]
...
```
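
As a rough illustration of the first two steps, the sketch below parses parameter lines of the form shown above and turns a configuration into E command line arguments. Only the `-t` and `-W` prefixes are taken from the examples elsewhere in these slides; the mapping table, the function names, and the decision to ignore all other parameters are assumptions of this sketch.

```python
import re

# name {value1,value2,...} [default]
PARAM_LINE = re.compile(r"^(\w+)\s*\{([^}]*)\}\s*\[([^\]]*)\]$")

def parse_param_line(line):
    """Return (name, domain, default) for one parameter description line."""
    match = PARAM_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a parameter line: {line!r}")
    name, domain, default = match.groups()
    return name, [value.strip() for value in domain.split(",")], default

# Hypothetical mapping from parameter names to E option prefixes.
# Only these two are covered; the remaining parameters are skipped here.
FLAG_PREFIX = {"tord": "-t", "sel": "-W"}

def build_args(config):
    """Translate a configuration dict into a list of E arguments."""
    return [
        FLAG_PREFIX[name] + value
        for name, value in config.items()
        if name in FLAG_PREFIX
    ]

parsed = parse_param_line("tord {Auto,LPO4,KBO,KBO6} [Auto]")
args = build_args({"tord": "KBO6", "sel": "SelectComplexG", "srd": "1"})
```

A real wrapper would additionally run the prover on a problem with these arguments and report the outcome back to the tuner.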

- the protocols are like giraffes, the problems are their food
- the better the giraffe specializes for eating problems unsolvable by others, the more it gets fed and further evolved

1. start with initial protocols
2. evaluate current protocols on all problems
3. for each protocol, collect best cheap problems
4. improve each strategy on its best cheap problems (using iterated local search)
5. evaluate new strategies
6. re-collect best cheap problems (goto 3)
7. end when there is no improvement
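
The loop above can be sketched in Python as follows. This is a toy model under stated assumptions: a problem counts as "best cheap" for a protocol when that protocol solves it fastest and within `cheap_limit`, "no improvement" means the number of solved problems did not grow, and `toy_evaluate` and `toy_improve` are hypothetical stand-ins for running E and for the iterated local search.

```python
def solved_problems(results):
    """Set of problems solved by at least one protocol."""
    return {q for runs in results.values() for q, t in runs.items() if t is not None}

def best_cheap_problems(results, problems, cheap_limit):
    """For each protocol, the problems it solves fastest within cheap_limit."""
    best = {p: set() for p in results}
    for q in problems:
        times = {p: runs[q] for p, runs in results.items()
                 if runs[q] is not None and runs[q] <= cheap_limit}
        if times:
            best[min(times, key=times.get)].add(q)
    return best

def blistr_loop(protocols, problems, evaluate, improve, cheap_limit, max_rounds=20):
    # Evaluate the initial protocols on all problems.
    results = {p: {q: evaluate(p, q) for q in problems} for p in protocols}
    for _ in range(max_rounds):
        before = len(solved_problems(results))
        best = best_cheap_problems(results, problems, cheap_limit)
        for protocol, training in best.items():
            if not training:
                continue
            new = improve(protocol, training)  # specialize on its own problems
            if new not in results:
                results[new] = {q: evaluate(new, q) for q in problems}
        if len(solved_problems(results)) == before:
            break  # no improvement: stop
    return results

# Toy setting: protocols and problems are integers; a protocol solves
# problems at distance <= 1, faster when closer.
def toy_evaluate(protocol, problem):
    gap = abs(protocol - problem)
    return 1 + gap if gap <= 1 else None

def toy_improve(protocol, training):
    return max(training)

results = blistr_loop([0], range(6), toy_evaluate, toy_improve, cheap_limit=2)
```

In this toy run a single initial protocol gradually spawns specialized variants until every problem is covered.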

- E Prover Brief Intro
- BliStr: Evolving E Heuristics for Benchmark Problems
- `Conjecture-related weights for E`
- BliStrTune: Evolving E Heuristics Reloaded

- Weight `ConjectureRelativeSymbolWeight` counts symbols, with smaller weights for conjecture symbols.
- Question: Does it make sense to consider term structure as well, not just symbols?
- To answer: We have implemented several new weight functions which measure a clause's relatedness to the conjecture using different metrics.

- `TermWeight`: shared subterms with conjecture
- `PrefixWeight`: common prefix with conjecture terms
- `LevDistanceWeight`: Levenshtein distance
- `TreeDistanceWeight`: Tree Edit Distance
- `TermTfIdfWeight`: TF/IDF
- `StrucDistanceWeight`: structural distance
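
To give a feel for two of these metrics, the sketch below computes a common prefix length and a Levenshtein distance between terms. It assumes terms are flattened into preorder lists of symbols, which is a choice made for this illustration; how E represents terms and how it turns a distance into a clause weight is not shown here.

```python
def common_prefix_len(a, b):
    """Number of leading symbols shared by two symbol sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute or keep
            ))
        prev = cur
    return prev[-1]

# f(g(a), b) from the conjecture vs. f(g(c), b) from a clause,
# both written as preorder symbol lists.
conjecture_term = ["f", "g", "a", "b"]
clause_term = ["f", "g", "c", "b"]
prefix = common_prefix_len(conjecture_term, clause_term)
distance = levenshtein(conjecture_term, clause_term)
```

A clause term that differs from a conjecture term in one symbol gets a small distance, so a weight based on it would favor such clauses.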

- Are these new weights helpful?
- How complementary are they to the previous E weights?
- What are the best parameters for these weights?
- Can we use them with BliStr?

- E Prover Brief Intro
- BliStr: Evolving E Heuristics for Benchmark Problems
- Conjecture-related weights for E
- `BliStrTune: Evolving E Heuristics Reloaded`

- BliStr does not change weight parameters
- it has only 12 (or so) hardcoded CEFs
- Idea:
- Extend BliStr to change weight parameters

- Problem: The parameter space is too big
- ParamILS does not perform well

- Solution: Use two phases.
- tune global parameters
- tune weight function arguments
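
The two-phase idea can be sketched as searching over one group of parameters while the other group stays fixed. In the Python sketch below, a simple greedy one-exchange search stands in for ParamILS, and the parameter names, domains, and scoring function are hypothetical; repeating the two phases for several rounds is also an assumption, not something stated above.

```python
def local_search(config, domains, keys, score):
    """Greedy one-exchange search that only changes the parameters in `keys`."""
    best, best_score = dict(config), score(config)
    improved = True
    while improved:
        improved = False
        for key in keys:
            for value in domains[key]:
                candidate = {**best, key: value}
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score, improved = candidate, candidate_score, True
    return best

def two_phase_tune(config, domains, global_keys, weight_keys, score, rounds=1):
    for _ in range(rounds):
        # Phase 1: tune global parameters, weight arguments frozen.
        config = local_search(config, domains, global_keys, score)
        # Phase 2: tune weight function arguments, global parameters frozen.
        config = local_search(config, domains, weight_keys, score)
    return config

# Toy search space and toy score (stand-in for "problems solved").
domains = {"tord": ["Auto", "KBO6"], "fweight": [1, 2, 5], "vweight": [1, 2]}

def toy_score(config):
    return ((config["tord"] == "KBO6")
            + (config["fweight"] == 2)
            + (config["vweight"] == 1))

start = {"tord": "Auto", "fweight": 1, "vweight": 2}
tuned = two_phase_tune(start, domains, ["tord"], ["fweight", "vweight"], toy_score)
```

Each phase explores a much smaller space than a joint search over all parameters would.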

`-tKBO6 -WSelectComplexG ... 3*ConjectureRelativeTermWeight(ConstPrio,0,1,0.1,18,400,50,300,1,4,0.8,1), 34*ConjectureRelativeTermWeight(PreferUnits,1,1,0.1,100,9999,100,5,1,9999.9,2,0.7), 8*ConjectureRelativeSymbolWeight(PreferGround,0.2,50,100,5,10,0.5,2,0.2)`

`-tKBO6 -WSelectComplexG 3*ConjectureRelativeTermWeight(ConstPrio,0,1,0.1,18,400,50,300,1,4,0.8,1), 34*ConjectureRelativeTermWeight(PreferUnits,1,1,0.1,100,9999,100,5,1,9999,2,0.7), 8*ConjectureRelativeSymbolWeight(PreferGround,0.2,50,100,5,10,0.5,2,0.2)`

- use data from the MZR@Turing division at CASC'12
- 1000 training problems were provided beforehand
- 400 new problems were used in the competition
- all problems exported from Mizar by Josef
- Plan: train BliStr and BliStrTune, then compare
- (in progress, only first training run finished)

CEF | # used |
---|---|
FIFOWeight(DeferSOS) | 30 |
FIFOWeight(PreferNonGoals) | 27 |
FIFOWeight(PreferProcessed) | 22 |
StaggeredWeight(DeferSOS,1) | 17 |
StaggeredWeight(DeferSOS,2) | 15 |

weight | # used |
---|---|
ConjectureRelativeTermWeight | 370 |
ConjectureRelativeSymbolWeight | 354 |
ConjectureTermPrefixWeight | 346 |
ConjectureGeneralSymbolWeight | 281 |
RelevanceLevelWeight2 | 276 |

prio | # used |
---|---|
PreferNonGoals | 426 |
PreferProcessed | 283 |
PreferWatchlist | 282 |
PreferUnitGroundGoals | 247 |
ConstPrio | 235 |

weight | # used |
---|---|
ConjectureRelativeTermWeight | 370 |
ConjectureTermPrefixWeight | 346 |
ConjectureStrucDistanceWeight | 77 |
ConjectureLevDistanceWeight | 40 |
ConjectureTermTfIdfWeight | 29 |
ConjectureTreeDistanceWeight | 18 |

- `ConjectureRelativeSymbolWeight(PreferGroundGoals,0.1,100,100,100,20,1.5,1.5,1.5)`
- `ConjectureTermPrefixWeight(PreferNonGoals,1,3,100,9999.9,0,9999.9,3,5)`
- `ConjectureTermPrefixWeight(DeferSOS,1,3,0.1,10,0,0.1,4,4)`
- `ConjectureRelativeTermWeight(PreferProcessed,1,1,0.1,10,100,50,50,1,3,2,2)`
- `ConjectureRelativeSymbolWeight(PreferNonGoals,0.1,100,50,20,18,0.1,1.5,1.5)`