I'm happy to announce that our paper "Intrinsic Verification of
Parsers and Formal Grammar Theory in Dependent Lambek Calculus" has
been accepted at PLDI 2025. Authors are Steven Schaefer (my PhD
student, @stschaef), Nathan Varner (an undergrad here at UM) and
Pedro Amorim (@pamorim). The extended version of the paper is already
up on arXiv (https://arxiv.org/abs/2504.03995) and the code is
available on GitHub
(https://github.com/maxsnew/grammars-and-semantic-actions).

The main idea of the paper is to define formal grammars as types in an
ordered linear logic (aka Lambek calculus). The terms are then a kind
of intrinsically sound parse transformer, with intrinsically verified
parsers as a special case: you write a parser as a function from
strings to parse trees, and the syntactic discipline of Lambek
calculus ensures that the output tree is a parse of the input string.
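
To make the soundness property concrete, here is a minimal Python
sketch. It is not taken from the paper, and every name in it (Nil, Bal,
tree_yield, parse) is hypothetical. Python has no linear types, so the
property that Lambek calculus guarantees by typing is only checked at
runtime here: the yield of the returned tree must equal the input
string.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Parse trees for the balanced-parentheses grammar  D ::= ε | "(" D ")" D
@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Bal:
    inner: "Tree"
    rest: "Tree"

Tree = Union[Nil, Bal]

def tree_yield(t: Tree) -> str:
    """Return the string that the tree t is a parse of."""
    if isinstance(t, Nil):
        return ""
    return "(" + tree_yield(t.inner) + ")" + tree_yield(t.rest)

def parse_prefix(s: str, i: int) -> Tuple[Tree, int]:
    """Parse a D starting at index i; return the tree and the next index."""
    if i < len(s) and s[i] == "(":
        inner, j = parse_prefix(s, i + 1)
        if j < len(s) and s[j] == ")":
            rest, k = parse_prefix(s, j + 1)
            return Bal(inner, rest), k
        raise ValueError("unclosed parenthesis at index %d" % i)
    return Nil(), i

def parse(s: str) -> Optional[Tree]:
    """Return a parse tree of s, or None if s is not balanced."""
    try:
        tree, end = parse_prefix(s, 0)
    except ValueError:
        return None
    if end != len(s):
        return None
    # Soundness, checked extrinsically: the tree must be a parse of s itself.
    assert tree_yield(tree) == s
    return tree
```

In the intrinsically verified setting the assertion in parse is
unnecessary, because a function that returned a tree for some other
string would not type check.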

Then formalisms like regular expressions and context-free grammars are
just inductive data types in Lambek calculus, and with the addition of
type dependency on non-linear types we are also able to express
automata formalisms. So far, we have an intrinsically verified parser
for regular expressions (following the classic Regex->NFA->DFA
pipeline, with the powerset construction for determinization) as well
as some hand-written parsers for context-free grammars.
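
For readers who have not seen the NFA->DFA step, the powerset
construction can be sketched in Python as follows. This is an
unverified illustration, not the Agda development: it only recognizes
strings rather than producing parse trees, it assumes an NFA without
epsilon moves, and the example automaton and all function names are
made up for this sketch.

```python
from collections import deque

def determinize(delta, start, accepting, alphabet):
    """Powerset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset({start})
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            # Union of the NFA moves on ch from every state in the current set.
            target = frozenset(
                q2 for q in current for q2 in delta.get((q, ch), ())
            )
            dfa[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting

def accepts(dfa, start_set, dfa_accepting, word):
    """Run the DFA; characters outside the alphabet lead to the empty set."""
    state = start_set
    for ch in word:
        state = dfa.get((state, ch), frozenset())
    return state in dfa_accepting

# Hypothetical NFA for the regex (a|b)*ab: state 0 loops, 1 has seen "a", 2 accepts.
nfa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa, dfa_start, dfa_accepting = determinize(nfa_delta, 0, {2}, "ab")
```

Only the subsets reachable from the start state are built, so the
resulting DFA for this example has three states instead of the full
eight subsets.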

We give a simple denotational model where a grammar is defined as a
family of sets indexed by strings, Σ* -> Set, a proof-relevant version
of the definition of a formal language as a predicate Σ* -> Prop. The
category of such grammars is monoidal closed and bicomplete, so it
supports all the constructions of dependent Lambek calculus. The
denotational semantics
is used as the basis for a shallow embedding of the system in Agda,
where we have formally verified all of our examples.
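
A rough way to picture this model, sketched in Python rather than Agda:
a grammar is a function that maps each string to the set of its parses,
and the corresponding language only records whether that set is
non-empty. The encoding below is an assumption made for illustration
(finite sets of tagged tuples, with concatenation summing over all
splittings of the string and disjunction as a tagged union), and the
names lit, eps, tensor, alt and language are hypothetical.

```python
def lit(c):
    """Grammar of the single character c: exactly one parse of the string c."""
    return lambda w: frozenset({("lit", c)}) if w == c else frozenset()

def eps(w):
    """Grammar of the empty string."""
    return frozenset({("eps",)}) if w == "" else frozenset()

def tensor(g, h):
    """Concatenation: a parse of w is a split point plus a parse of each half."""
    return lambda w: frozenset(
        ("pair", i, p, q)
        for i in range(len(w) + 1)
        for p in g(w[:i])
        for q in h(w[i:])
    )

def alt(g, h):
    """Disjunction: a parse of w is a tagged parse from either side."""
    return lambda w: (
        frozenset(("inl", p) for p in g(w))
        | frozenset(("inr", q) for q in h(w))
    )

def language(g):
    """Forget the parses and keep only the predicate on strings."""
    return lambda w: len(g(w)) > 0

# An ambiguous grammar: "a" has two distinct parses, but the language
# only remembers that "a" is accepted.
ambiguous = alt(lit("a"), lit("a"))
optional_a = alt(eps, lit("a"))
two_optional = tensor(optional_a, optional_a)
```

Here ambiguous("a") contains both an inl and an inr parse, while
language(ambiguous)("a") collapses them to a single truth value, which
is the difference between the proof-relevant and the predicate view.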
