It does seem a bit of a strange challenge - a bit reminiscent of high school mat...

HarHarVeryFunny · 2026-01-21T15:00:34 1769007634

I just threw this prompt at Gemini, and it seems to be able to extract a clear understanding of the problem and a specification for the kernel (I haven't analyzed the problem to see if it is correct).

"Can you "reverse engineer" what the kernel in this optimization exercise is actually doing - write a specification for it?

https://github.com/anthropics/original_performance_takehome"

Gemini says it's doing inference on a random forest - taking a batch of inputs, running each one through each decision tree, and for each input outputting the sum of these decision tree outputs - the accumulated evidence.

HarHarVeryFunny · 2026-01-21T19:19:08 1769023148

So looking at the actual code (reference_kernel() in problem.py), this "random forest inference" is completely wrong!

It's doing some sort of binary tree traversal, but the hashing and wrap around looks weird - maybe just a made up task rather than any useful algorithm?
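
For anyone who wants to see the shape of what is being described, here is a rough sketch of a "hash, branch, wrap around" tree walk. It is written from the description above, not copied from problem.py: `toy_hash`, `traverse`, and the parameter names are made up for illustration, and the hash is a stand-in mixer rather than whatever the real kernel uses.

```python
def toy_hash(x: int) -> int:
    """Stand-in 32-bit integer mixer. NOT the hash from problem.py."""
    x = (x ^ (x >> 16)) * 0x45D9F3B & 0xFFFFFFFF
    x = (x ^ (x >> 16)) * 0x45D9F3B & 0xFFFFFFFF
    return x ^ (x >> 16)


def traverse(tree_values, indices, values, rounds):
    """Walk an implicit binary tree stored as a flat array (hypothetical layout).

    Each round, every batch item mixes its value with the node it sits on,
    steps to the left or right child based on the low bit of the result,
    and jumps back to the root if it walks off the end of the array.
    """
    indices = list(indices)
    values = list(values)
    n_nodes = len(tree_values)
    for _ in range(rounds):
        for i in range(len(indices)):
            idx = indices[i]
            val = toy_hash(values[i] ^ tree_values[idx])
            # Children of node idx live at 2*idx + 1 and 2*idx + 2.
            idx = 2 * idx + (1 if val % 2 == 0 else 2)
            # Wraparound: past the last node, restart at the root.
            if idx >= n_nodes:
                idx = 0
            indices[i] = idx
            values[i] = val
    return indices, values
```

With a 7-node tree (three levels), every item that starts at the root is back at index 0 after three rounds, which is the wraparound behaviour that makes the task look synthetic.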

saagarjha · 2026-01-21T20:00:07 1769025607

Yes, it’s made up.

fc417fc802 · 2026-01-21T17:17:55 1769015875

This isn't "reverse engineering"; it's merely "being able to read fairly simple code you didn't write". A much simpler version of the kernel is provided at the end of problem.py as reference_kernel2.

If you can't make sense of such a small codebase or don't immediately recognize the algorithm that's being used (I'm guilty of the latter) then you presumably aren't someone that they want to hire.

HarHarVeryFunny · 2026-01-21T17:31:30 1769016690

Fair enough, and there are clues in the comments too, but why not just provide the specification of the kernel (inputs and outputs) as part of the problem?

fc417fc802 · 2026-01-21T17:40:21 1769017221

They do. They provide reference_kernel which shows the algorithm itself, build_mem_image which shows the data format you will be working with, and finally reference_kernel2 which implements said algorithm on said data format.

They then provide you with a very naive implementation that runs on their (very simple) VLIW architecture that you are to optimize.

If at the end of that someone is still lost, I think it is safe to say that their goal was for that person to fail.

HarHarVeryFunny · 2026-01-21T18:18:44 1769019524

Well, yes, they have a reference implementation as documentation, just as they have the simulator as documentation for the ISA ...

The problem is about pipelining memory loads and ALU operations, so why not just give clear documentation and state the task rather than "here's a kernel - optimize it"? ¯\_(ツ)_/¯
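
To make "pipelining memory loads and ALU operations" concrete, here is a toy model, assuming a machine that can issue one load and one ALU op in the same instruction bundle. The function names and the one-slot-per-engine limit are assumptions for illustration only; the real simulator's slot limits and latencies are not modelled here.

```python
def naive_cycles(n_items: int) -> int:
    """Naive schedule: each item gets one bundle for its load, then one for its ALU op."""
    return 2 * n_items


def pipelined_schedule(n_items: int) -> list:
    """Software-pipelined schedule for the same work.

    Bundle c issues the load for item c alongside the ALU op for item c - 1,
    so the two engines overlap except in the first and last bundle.
    """
    bundles = []
    for cycle in range(n_items + 1):
        bundle = {}
        if cycle < n_items:
            bundle["load"] = cycle      # prologue and steady state: fetch the next item
        if cycle >= 1:
            bundle["alu"] = cycle - 1   # steady state and epilogue: compute on the previous item
        bundles.append(bundle)
    return bundles
```

In this model the naive version takes 2N bundles and the pipelined one takes N + 1, because the ALU op for an item is never issued before that item's load has completed in the previous bundle.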

fc417fc802 · 2026-01-21T18:47:42 1769021262

Presumably that is only one of two purposes, with the other being to test your ability to efficiently read, understand, and edit low level code that you didn't write. I imagine you'd regularly run into raw PTX if you worked for them in the relevant capacity.

And perhaps a third purpose is to use the simulator to test your ability to reason about hardware that you are only just getting familiar with.

HarHarVeryFunny · 2026-01-21T19:40:18 1769024418

I would assume that anyone optimizing kernels at Anthropic has full documentation and specs for what they are working on, as well as a personal butler attending to their every need. This is big money work - every 1% performance improvement must translate to millions of cost savings.

Maybe they specified the challenge in this half-assed way to deliberately test those sorts of skills (even if irrelevant to the job), or maybe it was just lazily put together.

The other thing to note is that if you look at what the reference_kernel() is actually doing, it really looks like a somewhat arbitrary synthetic task (hashes, wraparound), so any accurate task specification would really need to be a "line by line" description of the steps, at which point you may as well just say "here's some code - do this".

menaerus · 2026-01-22T07:01:04 1769065264

In a fast-paced domain such as this one, and especially given the (global) competition, the development/leadership process is most likely chaotic, and the "best" practices that we would normally find at slower-paced companies cannot be followed here. I think that by underspecifying the assignment they wanted to test a candidate's ability to fit into such an environment, apart from the obvious reason, which is to filter out insufficiently motivated candidates.

saagarjha · 2026-01-21T19:59:36 1769025576

They do, but documentation is not always complete or correct.