Machine learning and hammers for Coq

Ah, that makes sense. What I did was simply have the user specify a root directory for the project (with a --prelude flag), and then parse the _CoqProject or Make file in that directory to get the load and ML paths. This is sufficient to run the coq scripts without error (at least not those kinds of errors :stuck_out_tongue: ). This was a little simpler in my original context, working with CompCert alone, because there was a single toplevel directory, and the _CoqProject was present there.

1 Like

Once I started running on the CoqGym projects, things were a little more complicated, since the directory in which the build is run is not always the top level directory in a project, and in a few projects there is no _CoqProject or Make file, but the flags are hardcoded into a Makefile. So in the newest version of Proverbot there is a json file specifying metadata about each project to allow running on it, and I’ve manually created a _CoqProject file based on the hardcoded Makefiles of a few projects.

1 Like

@HazardousPeach Thank you, Alex! That was super helpful :slight_smile:

Do you have a link to your script for creating your more diverse data set (your version of CoqGym data set) (and the script for evaluating your models too e.g. passport, proverbot9001 + proverbots variants with astactic/TacTok really any prover available online)?

That would be amazing to see! Thanks again.

  1. Passport doesn’t actually use Coq-Serapi, it’s actually CoqGym/ASTactic based. In Passport we were exploring the impact of adding identifiers, which are completely erased in CoqGym, so it made the comparison easier. Several of my newer collaborators were also more familiar with that infrastructure, so we started there, but are now developing more things on top of Proverbot.
  2. PyCoq looks great! I haven’t tried building and using it yet, so I’m not sure how easy it is to integrate into projects. If it had been around when I started developing Proverbot, I surely would have used it. It’s lower-level than my coq-serapy package, but it would probably serve as a good backend to that if I were to rewrite it. Currently though, the readme says that it doesn’t ā€œhandle errors correctlyā€, so I’m not sure if it’s ready for being used at Proverbot-scale. And it currently only supports a couple of pretty new versions of coq (8.13 and 8.14), and hasn’t been touched in almost six months.
  3. Yes, I generate data on CoqGym using my coq_serapy package (note the ā€œyā€ at the end to distinguish it from Emilio’s work which it uses as a backend). In the original evaluation of CoqGym, I did a lot of stuff manually, but I’m currently working on cleaning that up and making it more automated; tthe json/config file that CoqGym originally provides just gives the test/train split, not the file lists, so I made my own config file for that, including some more metadata that I needed; it can be found here.
  4. If you want to play around with Passport or Proverbot, they both have pretty good instructions on their README’s to get things set up. For Proverbot, you can start with running using pre-trained weights on CompCert, and then moving on to training your own weights and running on the CoqGym projects. I have a bunch of folks currently getting things setup at a couple of universities right now, so I think we’ve run into most of the setup issues, you can email or file an issue if you run into anything and we should be able to get back to you promptly.

Yes, I have those scripts, they are on the develop branch of the Proverbot repo. Unfortunately some of my posts in this thread are getting automatically marked as spam, including my response to your previous message, maybe because of the number of links? So I’ll keep my messages brief until that gets resolved.

Thats a pitty Alex! @HazardousPeach Maybe we can migrate the conversation to somewhere that we have more control? Like a discussion question on git discussion on the proverbot repo or a gitissue in the prover bot gitissues? I made one here in case you prefer to the conversation there. People can refer to it by going there if they are interested: Scripts for generating Proverbot's CoqGym data set and evaluation interactions Ā· Issue #27 Ā· UCSD-PL/proverbot9001 Ā· GitHub