In this post, let’s visualize the internals of a transformer model. These visualizations reveal some interesting patterns that can help us understand how well the training is going.
The dis module is a great tool for understanding how code runs. While I mainly use it out of curiosity, it can also be valuable for optimization and debugging. The module lets you disassemble your Python code into bytecode: the low-level, intermediate representation that the interpreter actually executes. By examining bytecode, you can glimpse the Python interpreter's view of your code, shedding light on performance characteristics and operational behaviors that aren't apparent at the source code level.
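As a minimal sketch of what this looks like, `dis.Bytecode` lets you walk the instructions of a simple function programmatically (the exact opcodes vary by Python version):

```python
import dis

def add(a, b):
    return a + b

# Iterate over the bytecode instructions the interpreter will execute.
# Each instruction has an opname, argument, and source-line mapping.
for instr in dis.Bytecode(add):
    print(instr.opname, instr.argrepr)
```

`dis.dis(add)` prints the same information in a formatted listing, which is usually what you want when exploring interactively.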
YData Profiling used to be known as pandas-profiling, but it has moved to a new name and a new home. I talked about it in my post on cleaning DNA splice junction data, but since it was kind of buried in that post and the name has changed, I thought I would do a quick tutorial that only covers YData Profiling. There isn't much to demo here because it does so much of the work for you, but I'll still go over it.
This tutorial shows how to plot geospatial data on a map of the US. There are lots of libraries that do all the hard work for you, so the key is just knowing that they exist and how to use them.
In geographic information systems (GIS), it's important to know how to manipulate geometric data. The best tool for this in Python is Shapely, which provides an extensive set of operations for sophisticated analysis of spatial data. In this post, I give an introduction to working with Shapely. PostGIS data usually arrives as well-known binary (WKB) or well-known text (WKT), so we'll also talk about how to work with those formats.
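A quick sketch of the WKT/WKB round trip with Shapely looks like this:

```python
from shapely import wkb, wkt
from shapely.geometry import Point

p = Point(1.0, 2.0)

# Serialize to the text and binary formats PostGIS commonly returns.
as_text = p.wkt  # 'POINT (1 2)'
as_blob = p.wkb  # raw bytes

# Round-trip both representations back into Shapely geometries.
from_text = wkt.loads(as_text)
from_blob = wkb.loads(as_blob)
print(from_text.equals(p), from_blob.equals(p))  # True True
```

Once parsed, the geometries support the usual Shapely operations (buffers, intersections, containment tests, and so on).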
Over the past few decades, government regulations around overhead and satellite imagery have seen significant evolution. Initially strict and limiting, these regulations have gradually been adjusted to reflect technological advancements and market realities. This post delves into the history and recent changes in these regulations, particularly focusing on the U.S. market and suppliers.
Data visualization is an essential tool for data scientists, enabling them to both explore and explain data. Despite its significance, I frequently encounter exploratory data analysis (EDA) visualizations that mask important aspects of the data, which leads people to misunderstand their data and make bad decisions based on it. In my experience, the worst offenders are violin and box plots. In this post, I will demonstrate some of the drawbacks of violin plots and box plots, and I will suggest some alternative visualizations that represent the data more clearly.