Parsing with Rust - Part 3: completing the Tree-sitter grammar

The third post in a series on writing programming-language parsers in Rust. In Part 1, we covered general parsing concepts and looked at some of the Rust crates available for generating a parser. In Part 2, we started writing a Tree-sitter grammar for WDL and used it from Rust. In this post, we’ll implement some of the more interesting parts of the grammar and learn about the rest of the Tree-sitter DSL in the process. The complete WDL grammar is available in the GitHub repository.

Read more →

Parsing with Rust - Part 1: a crash-course

The first post in a series on writing programming-language parsers in Rust. This post gives a high-level introduction to parsing, discusses different types of grammars and parsers, and gives an overview of the most popular Rust crates for generating a parser from a grammar. In future posts, we’ll dive deep into implementing parsers for WDL, a domain-specific language for describing computational workflows.

Read more →

What should my first Wordle guess be?

Like just about everyone else, I’ve gotten sucked into playing Wordle. I typically don’t go in for games with such straightforward mechanics, but for a couple of reasons it hits a sweet spot for me. First, it is time-boxed: you can only play one round a day and each round is relatively short, so there’s no risk of it sucking up big chunks of time. Second, on its surface it is a word game (and I’m a bit of a word nerd), but underneath it is a game about probability.
Read more →

Multiprocessing architectures in Python

I am currently working on contributing code to create a multi-threaded version of a piece of bioinformatics software I use heavily, Cutadapt. Cutadapt is a Python program that reads through records in a FASTQ file (or pair of FASTQ files) and performs adapter and quality trimming, so the architecture of the program is “read sequentially from one or more files, modify the data, and write the results to one or more output files.”
Read more →
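
To make the “read, modify, write” shape concrete, here is a minimal single-process sketch. It is not Cutadapt’s code: the function names (`read_fastq`, `trim_low_quality_tail`, `run`), the Phred+33 quality encoding, and the simple tail-trimming rule are all assumptions chosen for illustration.

```python
import io
from typing import Iterator, TextIO, Tuple

# Hypothetical record type for this sketch: (name, sequence, qualities).
Record = Tuple[str, str, str]


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual  # drop the leading '@'


def trim_low_quality_tail(record: Record, min_qual: int = 20) -> Record:
    """Drop trailing bases whose quality is below min_qual.

    Assumes Phred+33 encoding. This is a stand-in for the "modify" step,
    not Cutadapt's actual trimming algorithm.
    """
    name, seq, qual = record
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - 33 < min_qual:
        end -= 1
    return name, seq[:end], qual[:end]


def run(reader: TextIO, writer: TextIO, min_qual: int = 20) -> int:
    """Read sequentially, modify each record, write the result; return the count."""
    count = 0
    for record in read_fastq(reader):
        name, seq, qual = trim_low_quality_tail(record, min_qual)
        writer.write(f"@{name}\n{seq}\n+\n{qual}\n")
        count += 1
    return count


if __name__ == "__main__":
    source = io.StringIO("@r1\nACGT\n+\nII##\n")
    sink = io.StringIO()
    run(source, sink)
    print(sink.getvalue(), end="")
```

Because each record is handled independently in the middle step, that step is the natural one to hand off to worker processes, while reading and writing stay sequential.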