Ah, that makes sense. What I did was simply have the user specify a root directory for the project (with a --prelude
flag), and then parse the _CoqProject
or Make
file in that directory to get the load and ML paths. This is sufficient to run the coq scripts without error (at least not those kinds of errors ). This was a little simpler in my original context, working with CompCert alone, because there was a single toplevel directory, and the _CoqProject was present there.
Once I started running on the CoqGym projects, things were a little more complicated, since the directory in which the build is run is not always the top level directory in a project, and in a few projects there is no _CoqProject
or Make
file, but the flags are hardcoded into a Makefile. So in the newest version of Proverbot there is a json file specifying metadata about each project to allow running on it, and Iāve manually created a _CoqProject
file based on the hardcoded Makefiles of a few projects.
@HazardousPeach Thank you, Alex! That was super helpful
Do you have a link to your script for creating your more diverse data set (your version of CoqGym data set) (and the script for evaluating your models too e.g. passport, proverbot9001 + proverbots variants with astactic/TacTok really any prover available online)?
That would be amazing to see! Thanks again.
- Passport doesnāt actually use Coq-Serapi, itās actually CoqGym/ASTactic based. In Passport we were exploring the impact of adding identifiers, which are completely erased in CoqGym, so it made the comparison easier. Several of my newer collaborators were also more familiar with that infrastructure, so we started there, but are now developing more things on top of Proverbot.
- PyCoq looks great! I havenāt tried building and using it yet, so Iām not sure how easy it is to integrate into projects. If it had been around when I started developing Proverbot, I surely would have used it. Itās lower-level than my coq-serapy package, but it would probably serve as a good backend to that if I were to rewrite it. Currently though, the readme says that it doesnāt āhandle errors correctlyā, so Iām not sure if itās ready for being used at Proverbot-scale. And it currently only supports a couple of pretty new versions of coq (8.13 and 8.14), and hasnāt been touched in almost six months.
- Yes, I generate data on CoqGym using my coq_serapy package (note the āyā at the end to distinguish it from Emilioās work which it uses as a backend). In the original evaluation of CoqGym, I did a lot of stuff manually, but Iām currently working on cleaning that up and making it more automated; tthe json/config file that CoqGym originally provides just gives the test/train split, not the file lists, so I made my own config file for that, including some more metadata that I needed; it can be found here.
- If you want to play around with Passport or Proverbot, they both have pretty good instructions on their READMEās to get things set up. For Proverbot, you can start with running using pre-trained weights on CompCert, and then moving on to training your own weights and running on the CoqGym projects. I have a bunch of folks currently getting things setup at a couple of universities right now, so I think weāve run into most of the setup issues, you can email or file an issue if you run into anything and we should be able to get back to you promptly.
Yes, I have those scripts, they are on the develop branch of the Proverbot repo. Unfortunately some of my posts in this thread are getting automatically marked as spam, including my response to your previous message, maybe because of the number of links? So Iāll keep my messages brief until that gets resolved.
Thats a pitty Alex! @HazardousPeach Maybe we can migrate the conversation to somewhere that we have more control? Like a discussion question on git discussion on the proverbot repo or a gitissue in the prover bot gitissues? I made one here in case you prefer to the conversation there. People can refer to it by going there if they are interested: Scripts for generating Proverbot's CoqGym data set and evaluation interactions Ā· Issue #27 Ā· UCSD-PL/proverbot9001 Ā· GitHub