Welcome to another edition of “Software is Feeding the World.” This week’s edition takes a first look at the GenAI Benchmark announcement from the University of Illinois and provides an update on the SFTW Convo series.
Announcements
- Most enterprise GenAI projects get stuck in proof-of-concept land. SFTW released a white paper on Wednesday that provides a practical guide to breaking through the POC wall, with case studies from four different organizations. You can get the free white paper here: “POC to OMG! The Realities of Deploying GenAI at the Farm Gate.” The paper has been downloaded 538 times. Don’t miss out: get your free copy today!
- Rhishi Pethe will be chairing the opening fireside chat at the pre-summit of the “AI in Agriculture Forum” on March 10th.
- Rhishi Pethe will be hosting the inaugural AgTech Alchemy Summit on March 10th along with the other AgTech Alchemy co-founders (Sachi Desai and Walt Duflock).
- Rhishi Pethe will be chairing the “AI / GenAI: Transforming Legacy Industries to Improve Customer Outcomes” breakout session on March 11th at the World Agritech Summit with Pratik Desai (Kissan AI), Sachi Desai (Bayer), and Stewart Collis (Gates Foundation).
GenAI Benchmark consortium for agriculture announcement
I am quite skeptical of standards and benchmarks, especially if they are enforced by fiat. When I spent a significant amount of time in logistics and transportation, everyone talked about the EDI X12 “standards,” but as soon as you dug into them, there was very little that was standard about them. Truly universal standards and protocols, like HTTP or TCP/IP, are rare.
Many people bemoan the lack of data interoperability standards within agriculture. The issue is usually not that people don’t want standards; it is whether a standard creates enough value through the exchange of data and information to be worth adopting. The talk of standards always reminds me of this classic XKCD comic:

Source: https://xkcd.com/927/
I am less skeptical about benchmarks, though I become wary when they do not reflect real-world use cases, are not domain specific, or fail to measure a model’s limitations.
G. Bailey Stockdale has been publishing scores on how different foundation LLMs perform on the standard Certified Crop Adviser (CCA) test here in the US. He uses the CCA test to track how different models perform and improve over time.
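A benchmark like this boils down to running each model over a fixed set of multiple-choice questions and reporting accuracy. The sketch below shows that shape; the `model_answer` function and the sample questions are hypothetical placeholders (not Stockdale’s actual harness or real CCA content), and a real setup would call a model API and parse its reply.

```python
# Minimal sketch of scoring an LLM on a multiple-choice exam such as the CCA test.
# model_answer and the sample questions are invented placeholders for illustration.

def model_answer(question: str, options: dict[str, str]) -> str:
    """Placeholder: return the model's chosen option key (e.g. 'B').

    A real harness would prompt a foundation model with the question
    and options, then parse the option letter out of its response.
    """
    return "B"

def score(questions: list[dict]) -> float:
    """Fraction of questions the model answers correctly."""
    correct = sum(
        1 for q in questions
        if model_answer(q["question"], q["options"]) == q["answer"]
    )
    return correct / len(questions)

# Tiny illustrative question set (invented, not real CCA material).
sample = [
    {
        "question": "Which nutrient deficiency is most associated with leaf yellowing?",
        "options": {"A": "Phosphorus", "B": "Nitrogen", "C": "Boron"},
        "answer": "B",
    },
    {
        "question": "Which soil property does liming primarily adjust?",
        "options": {"A": "pH", "B": "Texture", "C": "Color"},
        "answer": "A",
    },
]

print(f"Accuracy: {score(sample):.0%}")  # stub answers "B" to both -> 50%
```

Re-running the same fixed question set against each new model release is what makes the scores comparable over time, which is the point of Stockdale’s tracking.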
