In my spare time, I enjoy making tools that are useful for applied economists. A complete listing of all the software I've written is available on my GitHub. A handful of my favorites are described below.
Binscatter2 is an extension of binscatter. It scales much better to large datasets, typically running 2 to 6 times faster than binscatter on datasets with 1 million or more observations. In addition, binscatter2 can characterize other aspects of a conditional distribution nonparametrically (i.e., by overlaying quantile intervals or plotting arbitrary quantiles), allows for more flexible fit lines and data export commands, and supports both multi-way fixed effects and the alternative method of residualizing described by Cattaneo et al. (2020).
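For readers unfamiliar with binned scatterplots, the core idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not binscatter2's implementation: the function name `binned_means` and its `n_bins` parameter are hypothetical, and the sketch skips controls, fixed effects, quantiles, and fit lines entirely.

```python
from statistics import mean


def binned_means(x, y, n_bins=20):
    """Return one (mean of x, mean of y) point per equal-count bin of x.

    Hypothetical helper for illustration only; it does not residualize
    on controls or absorb fixed effects.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Sort observations by x so that consecutive slices form quantile bins.
    pairs = sorted(zip(x, y))
    n = len(pairs)
    n_bins = min(n_bins, n)  # never ask for more bins than observations

    points = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        points.append((mean(p[0] for p in chunk), mean(p[1] for p in chunk)))
    return points


# Toy data: y is an exact linear function of x.
x = list(range(100))
y = [2 * v + 1 for v in x]
points = binned_means(x, y, n_bins=4)
```

Plotting the returned points in place of the raw observations gives the familiar binned scatter: each point summarizes one equal-sized slice of the x distribution.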
Pylearn is a suite of programs that implements several types of supervised learning algorithms in Stata. Specifically, Pylearn allows users to implement decision trees, random forests, feed-forward neural networks, gradient boosting, and adaptive boosting. These programs pass data and estimation results between Stata and Python (through Stata 16's built-in Python integration), relying on Python implementations provided by scikit-learn.
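To give a sense of the Python side of this workflow, here is a minimal sketch of fitting one of the listed model types, a random forest, with scikit-learn. It is not Pylearn's code: the toy data, variable names, and parameter values are assumptions chosen for illustration, and the data are plain Python lists rather than data passed in from Stata.

```python
from sklearn.ensemble import RandomForestClassifier

# Toy training data (assumed for illustration): two features, binary label.
X_train = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y_train = [0, 0, 1, 1]

# Fit a random forest; fixing random_state makes the fit reproducible.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Predict labels for new observations. In a Stata workflow, predictions
# like these would be handed back to Stata as a new variable.
X_new = [[0.1, 0.1], [0.95, 0.9]]
predictions = model.predict(X_new)
```

Inside Stata 16, the training data would come from the dataset in memory rather than from hard-coded lists, with the Python integration handling the transfer in both directions.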
Better graphics in Stata.