174. An Ocean of Models

In the classic Sherlock Holmes short story “The Adventure of the Bruce-Partington Plans” (originally published in 1908), Sherlock explains to Dr. Watson what a great mind his brother Mycroft is.

“We will suppose that a minister needs information as to a point which involves the Navy, India, Canada and the bimetallic question; he could get his separate advices from various departments upon each, but only Mycroft can focus them all, and say offhand how each factor would affect the other. They began by using him as a short-cut, a convenience; now he has made himself an essential.”

Mycroft takes inputs from various departments, weighs the pros and cons of each, and then comes up with an overall, higher-level recommendation. The story does not lay this out, but it is plausible that each department in turn gathered inputs from multiple teams and weighed them before sending its own recommendation to Mycroft.

An ocean of models

AI models have been getting better and better every year.

For example, we have all seen data on how each new generation of generative AI models scores higher on aptitude tests. (It is not clear whether this is the right way to test the strength of a model.) Many generative AI models have now added multimodal capabilities, which include images, video, and some audio.

AI models have evolved by leaps and bounds, especially on the computer vision side (relevant to agriculture), with segmentation and object detection models. These models are key to solving many of the computer vision challenges in agriculture, such as autonomy, weed detection, and sense-and-act.

(For reference, segmentation divides an image into regions and assigns a label to each pixel, which provides detailed information about object boundaries and regions. Object detection identifies specific objects in an image or video and classifies them, with a focus on finding each object's position and bounding box.)
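To make the difference concrete, here is a minimal sketch in plain Python. It is not how any real vision model works internally: the tiny grid, the class labels (soil, crop, weed), and the helper names `bounding_boxes` and `pixel_counts` are all made up for illustration. It only shows the shape of the two outputs: segmentation yields a label for every pixel, while detection-style output is a class plus a box.

```python
# Toy 5x6 "image" after segmentation: every pixel carries a class label.
# Labels are hypothetical: 0 = soil (background), 1 = crop, 2 = weed.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def bounding_boxes(mask, background=0):
    """Collapse a per-pixel mask into one box per class.

    Each box is (min_row, min_col, max_row, max_col), inclusive.
    Simplification: a real detector returns one box per object,
    not one per class.
    """
    boxes = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label == background:
                continue
            if label not in boxes:
                boxes[label] = (r, c, r, c)
            else:
                r0, c0, r1, c1 = boxes[label]
                boxes[label] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return boxes


def pixel_counts(mask, background=0):
    """Count labeled pixels per class (detail only the mask can give)."""
    counts = {}
    for row in mask:
        for label in row:
            if label != background:
                counts[label] = counts.get(label, 0) + 1
    return counts


print(bounding_boxes(mask))  # detection-style: where is each class?
print(pixel_counts(mask))    # segmentation-style: exactly which pixels?
```

In this toy grid, the weed's box covers four pixels, but only three of them are actually weed. That gap is the extra boundary detail segmentation provides, which matters when a sprayer has to decide exactly where to act.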

We have also seen improvements in what robots can do physically compared to a few years ago. For example, Boston Dynamics’ humanoid robot could do push-ups in April, and now it can do work in a demo space, moving engine parts between bins.

Side note: Boston Dynamics, when can we get these humanoids to clean up our houses??? In the future, parents will complain about their kids not assigning the room-cleaning task to their humanoid robot, instead of complaining about them not making their bed or cleaning their room.

One might think that general-purpose models from the large tech companies will keep getting better, and that businesses can simply use them to solve some of their pressing technology challenges.

Is it possible to do so?

If we draw an analogy with an NFL team (or any sports team), players put in a lot of effort to excel along a handful of dimensions that are important to winning a game. Because they specialize along those dimensions, they do not excel along others. For humans this makes sense, as there are limits on the time, effort, and physical attributes needed to excel along different dimensions. For example, a center in football will find it challenging to play wide receiver.

AI models have fewer limitations. If a model is given enough data along multiple dimensions, it could in theory be good along all of them. The extreme examples are the large language and multimodal models, like Gemini from Google and ChatGPT from OpenAI.
