Safety and reliability concerns are among the largest obstacles hindering the adoption of AI in many areas. Formal verification methods have been tremendously successful in addressing such concerns by providing safety/correctness guarantees in the realm software and hardware. The core idea is that a concrete implementation of a system is verified against an abstract (logical) specification of how the system is supposed to behave. The success has been largely driven by modern SAT-solvers (and their more powerful extension SMT-solvers) which rest on the CDCL algorithm. In recent years, researchers have demonstrated that properties of neural networks can also be verified using formal verification methods. A huge advantage of this approach is the intrinsic capability of generating counter examples in case of a violation.
While important properties such as adversarial robustness (see our post on Certified error rates for neural networks) and fairness are verifiable using formal specifications, the applicability of these specifications is restricted to low-level input-output properties. However, most safety relevant properties of complex systems such as self-driving cars cannot be properly expressed through such low-level properties. A good example is the requirement that a car has to stop if it approaches a stop sign.
The IJCAI 2022 paper [Xie22N] proposes the use of reference networks which can be addressed through predicates in a logical specification. The reference networks should be independently trained and capture high level concepts like „car is approaching a stop sign “. The verification is then performed by using the reference network as proxy for the concept. Adding the ability to speak about such high-level concepts to the specification language allows to formally express requirements like stopping at red lights. Neuro-symbolic approaches are an emerging class of hybrid methods that combine formal methods with deep learning. There already exists a large community, several publications in that direction were accepted at IJCAI.
The authors develop a specification language for neural networks based on Hoare logic which is popular in software verification. They implement their system on top of existing verification software and conduct a series of experiments based on the MNIST and the German traffic sign benchmark data set to demonstrate multiple possible use cases. In particular, they train a general classifier, a specialized classifier for a particular task and an autoencoder. Using these networks, they verify (or find counterexamples) for the following properties:
- If the specialized classifier recognizes the input as belonging to the special class, then the general classifier will also output the class.
- If the autoencoder has low reconstruction error, then the classifier has high prediction confidence.
- The infinity norm of the distance between two functions represented by neural networks is bounded by a given constant.
While the theory behind formal verification is well understood and many general-purpose implementations exist, the largest challenge is currently the poor scalability of the methods. The approach described above essentially reduces verification of neural networks to a satisfiability test on an SMT formula that grows (linearly) with the size of the network. Therefore, it is currently only possible to verify rather small networks. Moreover, defining, training, and verifying reference networks is a challenge on its own. Nevertheless, the use of reference networks in verification is an interesting direction, and I believe we will see more of it in the future.
Assessing the quality of a generative model with another network is also a key ingredient of generative adversarial networks. There, the referee network is trained jointly with the generative model. I wonder if reference networks in a verification setup can be trained jointly with the network under verification in a similar fashion.