All Posts

Number of blog posts: 12

How intelligent must AI be?

21 April 2025

Alex Tabarrok has a post on Marginal Revolution, AI on Tariffs, where he argues that there’s no problem with using AI to help inform policy, but that the White House should also have asked the following:

“Suppose the US imposed tariffs on other countries in an effort to reduce bilateral trade deficits to zero using the formula for the tariff of (exports-imports)/imports. What do you estimate would be the consequences of such a policy?”

You can read O1 Pro’s full answer in the original post.
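The formula in the prompt is simple arithmetic; a minimal sketch in R with made-up trade figures (the numbers are illustrative, not actual US data):

```r
# Formula from the prompt: (exports - imports) / imports
exports <- 143.5  # made-up US exports to a partner, billions USD
imports <- 438.9  # made-up US imports from that partner, billions USD

tariff <- (exports - imports) / imports
round(tariff, 3)  # -0.673: negative whenever there is a bilateral deficit
```

Setting this expression to zero for every partner is exactly the "reduce bilateral deficits to zero" policy the question asks the model to evaluate.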

I wondered whether you need a frontier model like O1 Pro, and an expensive one at that, to get a similar answer. So I asked the same question to Gemma 3 12B, a small open-weights model that can be used for free on Google AI Studio and can even run locally. Here’s the answer:

A tale of file sizes

15 November 2017

I have recently helped a PhD student read and merge about 150 CSV files. I used R, but the student wanted to use Stata later, so I used the haven package to export to the Stata 14 native file format.
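The read-merge-export step takes only a few lines; a minimal sketch with tiny made-up CSV files (the directory and file names are hypothetical, not the student's actual data):

```r
library(haven)  # write_dta() exports to Stata's native format

# Hypothetical layout: one directory of CSV files with identical columns
dir.create("csv_dir", showWarnings = FALSE)
write.csv(data.frame(id = 1:3, x = rnorm(3)), "csv_dir/part1.csv", row.names = FALSE)
write.csv(data.frame(id = 4:6, x = rnorm(3)), "csv_dir/part2.csv", row.names = FALSE)

files  <- list.files("csv_dir", pattern = "\\.csv$", full.names = TRUE)
merged <- do.call(rbind, lapply(files, read.csv))  # read each file, stack the rows

write_dta(merged, "merged.dta")  # defaults to the Stata 14 file format
nrow(merged)
```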


There is nothing to report about the process: everything was quick and easy and worked as expected. But I noticed that the .dta (Stata) file was substantially larger than the original data. The original CSV files were a little over 8GB, and the consolidated R file was about 1.3GB (no surprise, since R saves files in a compressed format), but the Stata file was 33.7GB.
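The size gap is easy to reproduce, since .dta files are written uncompressed at fixed width while R's native format is gzip-compressed by default. A minimal sketch with made-up, deliberately repetitive data (the file names are hypothetical; the exact ratio depends on how compressible the data is):

```r
library(haven)  # write_dta() exports to Stata's native format

# Made-up data: highly repetitive columns like these compress very well
df <- data.frame(id  = 1:1e6,
                 grp = rep(c("a", "b", "c", "d"), length.out = 1e6),
                 val = rep(1:100,                 length.out = 1e6))

saveRDS(df, "data.rds")    # R's native format, gzip-compressed by default
write_dta(df, "data.dta")  # the .dta file is written uncompressed

file.size("data.dta") / file.size("data.rds")  # typically a large ratio
```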

Best Practices for Scientific Computing

25 September 2017

Scientific computation is nowadays an integral part of research in most scientific fields. But the majority of researchers never had any formal training in how to structure, maintain, and collaborate on such projects. I had to learn the hard way, and there are several resources available today that I wish I had had access to when I was starting out.

A good start is to read the papers Good Enough Practices in Scientific Computing and Best Practices for Scientific Computing. They provide apt recommendations on everything from data management, collaboration, project organisation, and revision control systems to the writing of manuscripts.

I also recommend The Plain Person’s Guide to Plain Text Social Science by Kieran Healy.

R's JIT compiler

11 September 2017

A couple of days ago I was giving a course on R, and I used the following example (the function calculates the value of an American call option using a binomial tree):

library(fOptions)
system.time(
  CRRBinomialTreeOption(TypeFlag = "ca", S = 50, X = 50, Time = 5/12,
                        r = 0.1, b = 0.1, sigma = 0.4, n = 2000)@price
)
##    user  system elapsed 
##   7.807   0.034   8.002

I was puzzled that it took about 8 seconds to compute on my laptop, while on the lab computers it took less than 2 seconds. Since my laptop has a relatively recent CPU, similar to the ones in the lab computers, I could not explain the difference in performance.

Only later did it occur to me that I was still using R 3.3.3, while the lab computers had R 3.4 installed. Starting with R 3.4.0, the JIT byte-code compiler is enabled by default, so plain R functions like this one get compiled on first use instead of being interpreted.
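The effect of byte compilation on loop-heavy pure-R code can be seen without fOptions; a minimal sketch using explicit compilation, which is what the JIT does lazily at level 3 (timings vary by machine, and on R >= 3.4 the two calls perform about the same because the function is JIT-compiled on first use anyway):

```r
library(compiler)  # ships with base R

f <- function(n) {            # a deliberately loop-heavy pure-R function
  s <- 0
  for (i in seq_len(n)) s <- s + i %% 7
  s
}
fc <- cmpfun(f)               # explicit byte compilation of the closure

system.time(f(1e6))
system.time(fc(1e6))          # noticeably faster on R < 3.4
```

`enableJIT(3)` on an older R achieves the same thing globally, compiling every closure before its first use.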

Speeding up an OLS regression in R

3 June 2017

Yesterday a colleague told me that he was surprised by the speed differences between R, Julia and Stata when estimating a simple linear model on a large number of observations (10 million). Julia and Stata had very similar results, although Stata was faster. But both Julia and Stata were substantially faster than R. He is an expert and long-time Stata user and was sceptical about the performance difference. I was also surprised. My experience is that R is usually fast for an interpreted language, and most computationally intensive procedures are implemented in a compiled language like C, Fortran or C++, reducing any performance penalty.
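One way to see how much of lm()'s time goes to bookkeeping (model frame, formula parsing, stored residuals) rather than the actual least-squares solve is to compare it with the bare QR fitter in base R. A minimal sketch on simulated data (1 million rows here rather than the 10 million in the anecdote, and made-up coefficients):

```r
set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)     # true intercept 2, true slope 3 (made up)
X <- cbind(1, x)              # design matrix with an intercept column

system.time(fit1 <- lm(y ~ x))      # formula interface: lots of bookkeeping
system.time(fit2 <- .lm.fit(X, y))  # raw QR solve in C, no model frame

coef(fit1)
fit2$coefficients                   # same estimates, much less overhead
```

Both calls produce identical coefficient estimates; the difference is purely in the surrounding machinery, which is part of why comparisons of lm() against minimal fitting routines in other languages can be misleading.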