The last layer of feature engineering
Every major ML breakthrough follows the same pattern: more compute and data beat human engineering. Computer vision went from hand-crafted SIFT descriptors to learned end-to-end features. Natural language processing abandoned parse trees and part-of-speech tags for transformer representations. As Rich Sutton observed in "The Bitter Lesson," methods that leverage computation ultimately win over methods that leverage human knowledge.
I think computational abstractions are the next target for this pattern. Programming languages abstract away machine code with human-readable syntax. Databases abstract away file systems with declarative queries. These abstractions emerged from humans solving informal optimization problems: programming languages minimize coding time and errors, databases minimize data retrieval complexity. With enough data and compute, AI could learn better computational abstractions than humans designed.
I see two ways this can unfold:
- Raw computation: Train AI systems on primitive computational operations and bit sequences. The AI gets access to basic operations like `read`, `write`, `add`, `multiply`, `and`, `or`, `equals`, plus control flow operations. Everything else becomes bit patterns the AI manipulates with these primitives. The model could learn computational patterns humans never designed, but this would require massive execution-trace datasets, and the resulting system would be completely opaque.
- Abstraction discovery: AI systems search the space of possible computational tools and invent new abstractions optimized for their tasks. For example, instead of making dozens of API calls to gather user data, validate permissions, transform formats, and store results, a system might invent a single `user_analysis_pipeline` operation that performs the whole workflow atomically. More radically, it could discover computational shortcuts that reduce complex problems to simple operations, much as the Fast Fourier Transform made signal processing efficient. Such discoveries could make tractable problems that are currently out of reach due to computational constraints.
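To make the raw-computation framing concrete, here is a toy sketch of what "access to basic operations" could mean: a tiny register machine whose only instructions are the primitives named above plus one conditional jump. The instruction format and names are my own illustration, not a proposal for any particular system.

```python
def run(program, memory):
    """Execute a list of (op, *args) instructions against a flat memory list.

    Ops mirror the primitives in the text: read, write, add, multiply,
    and, or, equals, plus jump_if for control flow. Values are plain ints
    standing in for bit patterns.
    """
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "read":          # copy memory[src] into memory[dst]
            dst, src = args
            memory[dst] = memory[src]
        elif op == "write":       # store an immediate value at dst
            dst, val = args
            memory[dst] = val
        elif op == "add":
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
        elif op == "multiply":
            dst, a, b = args
            memory[dst] = memory[a] * memory[b]
        elif op == "and":         # bitwise and
            dst, a, b = args
            memory[dst] = memory[a] & memory[b]
        elif op == "or":          # bitwise or
            dst, a, b = args
            memory[dst] = memory[a] | memory[b]
        elif op == "equals":      # 1 if equal, else 0
            dst, a, b = args
            memory[dst] = int(memory[a] == memory[b])
        elif op == "jump_if":     # jump to target when memory[cond] != 0
            cond, target = args
            if memory[cond]:
                pc = target
                continue
        pc += 1
    return memory
```

A program like `[("write", 0, 3), ("write", 1, 4), ("multiply", 2, 0, 1), ("write", 3, 5), ("add", 4, 2, 3)]` computes 3 * 4 + 5 = 17 into cell 4. Everything a model learns at this level lives in such opaque instruction streams, which is exactly the interpretability cost the text describes.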
Abstraction discovery feels more promising near-term because we can build on existing optimization techniques. The core challenge is search: how do you efficiently explore the space of possible computational abstractions? This feels like the most important research question for next-generation AI systems. The first systems that master abstraction discovery could have fundamental capability advantages over those constrained by human-designed computational interfaces.
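One naive baseline for the search problem above is brute-force program enumeration: compose primitives, and promote any composition that fits a task's input/output examples into a named, reusable abstraction. The primitives here (`inc`, `double`, `square`) are made up for illustration; real systems would need far smarter search than this exhaustive sketch.

```python
from itertools import product

# Illustrative unary primitives the search composes.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def discover_abstraction(examples, max_depth=3):
    """Search compositions of primitives, shortest first, for one that
    matches every (input, output) example. Returns the op sequence,
    i.e. a candidate new abstraction, or None if nothing fits."""
    for depth in range(1, max_depth + 1):
        for seq in product(PRIMITIVES, repeat=depth):
            def apply_seq(x, seq=seq):
                for name in seq:
                    x = PRIMITIVES[name](x)
                return x
            if all(apply_seq(i) == o for i, o in examples):
                return seq
    return None
```

Given the examples `[(2, 25), (3, 49)]`, the search recovers `("double", "inc", "square")`, i.e. the shortcut f(x) = (2x + 1)^2, which can then be called as a single operation instead of three. The hard research question is doing this at scale, where exhaustive enumeration is hopeless and the abstraction space is vastly larger.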