Traive Finance: Banking for the Unbanked

Deep dive into use of AI by Traive Finance

Everyone deserves access to insurance, but not everyone deserves access to credit - anon

I recently had a conversation with Aline Pezente (Co-Founder and Chief Product Officer of TrAIve) and Mohammad Ghassemi (VP of Data Science at TrAIve) about their two recently published papers, one on using artificial intelligence for credit risk assessment and one on using generative AI to generate credit risk reports, for farmers in Brazil.

Aline and her co-founder have a strong background of working in banking and finance, and uniquely understand some of the challenges faced by different entities.

TrAIve is a B2B provider of Artificial Intelligence tools to financial institutions, and other businesses.

“Traive: bridging the gap between agribusiness and credit. The comprehensive platform for managing the agricultural credit process, with AI-powered scores.”

This edition of S3 (SFTW Startup Spotlight) is based on my own research on financing for agriculture in Brazil, application of Bayesian models, and LLMs, and my conversation with Aline and Mohammad.

Before I go into the technical details on new research from TrAIve, I want to provide some context on credit access for farmers in Brazil. There are significant challenges for financial institutions to provide credit to farmers in Brazil (and for farmers to access credit).

TrAIve is trying to address some of these challenges.

Credit access for farmers in Brazil

Measured in U.S. dollars, the value of Brazil’s agriculture, including cultivation of crops and livestock production, grew an average of 8% annually over the past two decades (2000–20), with agricultural output doubling and livestock production increasing threefold.

Brazil is now a top-5 producer of 34 commodities and is the largest net exporter of agricultural products in the world. However, increases in fuel and fertilizer costs, credit and storage limitations, an overburdened port and transport system, and pressure to preserve the environment are challenging the long-term growth of Brazilian agriculture.

While there are many headwinds for continued agricultural growth in Brazil, access to credit is a serious challenge for farmers.

In Brazil, financing for agriculture comes from three sources:

  1. Government agricultural credit disbursed through the National System of Rural Credit (33%)
  2. Agricultural processors (20%)
  3. Commercial banks or other Government agencies (47%)

If we dig one level deeper, two factors are expected to limit access to credit for Brazil’s producers:

  1. Higher credit costs (such as from interest rate increases)
  2. The current high rate of indebtedness of crop and livestock producers, increasing risk on loans.

Agricultural industry and commercial banks are categorizing agricultural loans as high risk because of an already high level of farm indebtedness, accrued for operational and investment credit and estimated by Brazil’s Central Bank at nearly $580 million in 2021.

Even though Brazil’s agricultural output has seen a massive increase, only about 15% of the more than 5 million farmers and farms in Brazil (family farms and corporate entities) access credit, with significant regional differences. (For reference, the % of farmers who can and do access credit is in the 70-80% range.)

Brazilian President Lula has recently made a move to allot R$ 440 billion ($ 91.8 billion) for domestic agribusiness financing in 2023, under the annual crop plan known as Plano Safra.

But there has been major funding inequity in Brazil. For example, in 2017 the Brazilian Ministry of Agriculture invested 6 times more public funds in industrial agribusiness than it did in family farming.

Challenges for private lenders and institutions

The long preamble paints a clear picture: there is tremendous room for lenders to deploy capital in a more informed way, and for the 85% of farmers who do not access credit to get credit commensurate with the real (not perceived) risk of their farming operation.

Private institutions face many challenges to provide credit to farmers in Brazil.

Farmers are widely dispersed, which drives up transaction costs. Information asymmetry is acute because lenders lack data about the farmer and the farming operation. The regulatory environment in Brazil is onerous and challenging, and there is a lack of financial literacy among farmers. Given the tremendous growth potential for agriculture, it is an area of significant opportunity for lenders, agribusiness, and farmers. (Highlights by me)

Agriculture supply chains are complicated. They are made more complicated by government subsidies, regulations, and other policies. In the US, Wall Street is able to deploy a significant amount of capital to the agriculture industry because it has a significant amount of information available about farmers and their farming operations. This helps lenders manage their risk efficiently, provide liquidity, and provide it quickly.

This is not always the case in Brazil. Significant friction and systemic data issues prevent a seamless credit workflow that is fair and explainable.

This makes financial institutions leery of deploying capital within agriculture.

This is where TrAIve comes in.

TrAIve 

TrAIve is part of a new breed of companies in the fintech space, trying to address some of the problems highlighted above. They are not relying on basic analytics, but are built from the ground up on the latest technology stack, and they are native to artificial intelligence and, lately, generative AI (for example, ChatGPT).

TrAIve is derived from the word “thrive”, but it has AI in its name - AI = artificial intelligence

Traive is an AI native, business to business, multi-sided financial platform with services available for different participants in the credit ecosystem.

What does this mean?

AI native = Traive’s technology stack uses artificial intelligence tools and expertise as its base to build all its solutions on top of.

Multi-sided platform = Most activities in life are multi-party and have multiple sides. Farm credit is definitely multi-sided with the producer, the originator, capital markets etc.

Multi-sided platforms are extremely powerful, as they have inbuilt network effects, low system latencies, lower transaction costs, and can create powerful discovery mechanisms for participants.

(I am speaking based on experience, having built a multi-sided logistics platform for grocery retailers, CPG companies, and more than 3000 logistics service providers in my previous life, and having worked at Amazon Kindle, which sits between millions of readers on one side, and publishers, authors, designers etc. on the other side.)

My first big question to Aline and Mohammad was,

“Why do we need companies like TrAIve, when there are specialized global financial institutions like RaboBank and a host of regional lenders?”

According to Aline, large institutions dedicated to agriculture alone cannot solve this problem. Large institutions are mostly dedicated to large farmers, for whom they have a lot of data. Large producers can provide collateral and farming history.

Small to medium sized farmers are data poor. Financial institutions do not have a good way to judge their creditworthiness, and so those farmers either end up getting unfavorable terms or no credit at all. Even large institutions like Goldman Sachs, JPMorgan, etc. have stayed mostly in the consumer market.

Traive’s artificial intelligence models and their credit reporting using generative AI (for example, ChatGPT) are able to operate in a data-sparse environment, establish credit risk in a variety of scenarios, and present the credit analysis report back to the lender sliced along many different dimensions.

Image source: TrAIve website

Using artificial intelligence for credit management

When you train an artificial intelligence model based on data from a given context, it can typically perform well in said context. If you try to run the same model in a different context, which it has not seen before, most AI models struggle to perform well.

For example, let us say you have trained a smartphone based weed identification model based on data collected in the afternoon in Iowa. If you take the same model, and run it in the evening in Kansas, the model might struggle to have the same performance due to different lighting conditions, different soil backgrounds, the same weed species might look a bit different because they are in a different growth stage etc. This is called domain shift or out-of-distribution.
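The effect can be shown with a toy sketch. It does not use real weed images: the single "brightness" feature, the sample values, and the 50-unit darkening are all invented for illustration. A simple threshold classifier is fitted on "afternoon" readings and then applied to the same plants in darker "evening" light.

```python
# Toy illustration of domain shift. All numbers are invented for this sketch.
# Feature: leaf brightness (0-255). Label: 1 = weed, 0 = crop.

def fit_threshold(samples):
    """Pick the midpoint between the two class means as the decision threshold."""
    weeds = [x for x, y in samples if y == 1]
    crops = [x for x, y in samples if y == 0]
    return (sum(weeds) / len(weeds) + sum(crops) / len(crops)) / 2

def accuracy(threshold, samples):
    """Share of samples where 'brightness above threshold' matches the weed label."""
    correct = sum((x > threshold) == (y == 1) for x, y in samples)
    return correct / len(samples)

# Training domain: afternoon light (hypothetical values).
afternoon = [(180, 1), (190, 1), (200, 1), (120, 0), (130, 0), (140, 0)]
# New domain: the same plants in evening light, every reading 50 units darker.
evening = [(x - 50, y) for x, y in afternoon]

threshold = fit_threshold(afternoon)
in_domain = accuracy(threshold, afternoon)   # data like the training data
shifted = accuracy(threshold, evening)       # data the model has never seen

print(threshold, in_domain, shifted)
```

The classifier is perfect on afternoon readings, yet on evening readings it calls every weed a crop, because the threshold it learned no longer separates the classes.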

In the context of the credit process, Traive is using a unique approach which combines artificial intelligence models with Bayesian network modeling. Bayesian network models have been extensively used in other domains such as spam filtering and medical diagnosis, but their use is under-represented in agriculture.

Bayesian Networks are suitable for agriculture [1]  because they are able to

According to TrAIve, their researchers,

discovered a more effective agricultural loan default prediction method, overcoming the limitations of traditional models that falter under unpredictable conditions (like pest infestations and weather shifts) and data scarcity, a breakthrough that could transform banking industry practices in lending to farmers with limited data. [2]

Data is key to building AI models, and TrAIve took a systematic approach to understand the characteristics of available data, and how they could apply different techniques to work within the constraints of data availability.

Data challenges

TrAIve created a dataset with 9 different variables to understand credit risk scoring.

Model variables in TrAIve model

TrAIve researchers collected data from about 97K loans sourced from nine of the largest financial and supply chain entities in Brazil. As you can see from the image, the delinquency performance across these institutions varies wildly with delinquency rates varying from 1.05% to 22.27%! Also, the sample sizes are fairly small as we go down the list of institutions in table 1 below.

A majority of the data is missing across different loan applications, with only the main crop and the credit history score being available for all applications.

The TrAIve team found the data attributes present in different applications vary quite a bit, and which institution received the application was one of the biggest factors. (This is known as concept drift and is related to data distribution).

This could be because of many factors - different lending standards across institutions, different customer profiles across institutions, type of farmers and crops, etc. Heterogeneity is expected, and helps reduce the risk of the overall portfolio, but financial institutions need to get to a credit performance evaluation, in spite of the heterogeneous data. 

Given how different data elements are missing across institutions (and across state, crop, and crop year), one can see how challenging it is for a traditional financial institution to assess credit and delinquency risk for a given farmer.

(See Table 3 below. You can ignore the technical terms. The key takeaway: data distribution is most affected by the lending institution, state (which could be correlated to the lending institution), crop type, and crop year.)

How does TrAIve try to address this data distribution and data availability problem, which is especially acute for midsize farmers, and come up with a model which works across different situations?

Industry aware models

TrAIve is using a specific technique called Bayesian Network models in combination with artificial intelligence models. The term “Bayesian Network” models might sound daunting. In layman’s terms, it means the following.

Imagine a complex system with many interconnected components, each influencing the others in various ways. This system could be anything from a medical diagnosis to a financial market to a loan delinquency model. It can be difficult to understand how all these pieces interact and what the overall outcome will be.

Think of a Bayesian network as a map of this complex system. It uses two key elements:

  1. Variables: These represent the different parts of the system, like symptoms in a medical diagnosis or economic factors in a market.
  2. Arrows: These represent the relationships between the variables. For example, an arrow from "fever" to "flu" indicates having a fever makes you more likely to have the flu.

This map has a specific structure

  1. A directed graph with no loops: every arrow has a direction, and following the arrows never leads back to where you started. This ensures there's a clear flow of information and no circular dependencies.
  2. Probabilities: Each variable has a probability attached to it, indicating how likely each of its values is. These probabilities are conditional on the values of its parents (the variables connected to it by incoming arrows).

If you want to see a simple example of a Bayesian network with symptoms of flu, please check references at the bottom of the article. [3]
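To make the two ingredients concrete, here is a minimal sketch of a three-variable network, drought -> low_yield -> default. The variables, the arrows, and every probability are invented for illustration and are not taken from TrAIve's model. The sketch answers questions by summing over all possible worlds, which is only practical for tiny networks.

```python
from itertools import product

# Hypothetical three-node network: drought -> low_yield -> default.
# Every probability below is invented for illustration.
P_DROUGHT = 0.2                          # P(drought = True)
P_LOW_YIELD = {True: 0.7, False: 0.1}    # P(low_yield = True | drought)
P_DEFAULT = {True: 0.3, False: 0.05}     # P(default = True | low_yield)

def bernoulli(p_true, value):
    """Probability of a True/False value given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint(drought, low_yield, default):
    """Probability of one complete world: each node conditioned on its parent."""
    return (bernoulli(P_DROUGHT, drought)
            * bernoulli(P_LOW_YIELD[drought], low_yield)
            * bernoulli(P_DEFAULT[low_yield], default))

def prob_default(evidence=None):
    """P(default = True | evidence), summing over worlds that match the evidence.

    Variables that are not in `evidence` are simply summed out, which is how
    a network like this copes with fields that are missing for a given case.
    """
    evidence = evidence or {}
    numerator = denominator = 0.0
    for drought, low_yield, default in product([True, False], repeat=3):
        world = {"drought": drought, "low_yield": low_yield, "default": default}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(drought, low_yield, default)
        denominator += p
        if default:
            numerator += p
    return numerator / denominator

print(round(prob_default(), 3))                   # nothing observed: 0.105
print(round(prob_default({"drought": True}), 3))  # drought observed: 0.225
```

Observing a drought moves the estimated default probability from about 10% to about 22%, even though nothing was observed about the yield itself.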

Based on this, TrAIve codified hundreds of years of knowledge and experience from many financial experts and created their own network of variables, how these variables affect each other, and with what magnitude.

Figure 1 shows the specific network built by TrAIve. If you look closely, there are 8 arrows going into the final credit performance circle, indicating all of those 8 variables have an impact on the credit performance.

TrAIve is able to pull data from publicly available sources, and rely on experts for domain expertise. They can also understand which experts are the “real experts”  and lean into their priors. Experts are particularly good at sub-domains.

It helps TrAIve stitch together a big library of models, which take into account various data elements like weather, satellite, supply chain infrastructure, supply & demand, etc. It helps TrAIve create their own underlying data dictionary and ontology, with their domain expertise.

The main crop impacts the final credit performance indirectly through agronomic score, market score, and financial score. The model says a farmer’s behavior score, which is sourced from the consumer credit bureau, is not impacted by the planted crop, which makes sense.
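The relationships described in this paragraph can be written down as a small directed graph. The sketch below includes only the arrows named in the text (the full network in the paper has more variables and arrows), and the snake_case variable names are my own labels.

```python
# Partial sketch of the structure described in the text. The full network in
# the paper has more variables and arrows than are listed here.
EDGES = {
    "main_crop": ["agronomic_score", "market_score", "financial_score"],
    "agronomic_score": ["credit_performance"],
    "market_score": ["credit_performance"],
    "financial_score": ["credit_performance"],
    "behavior_score": ["credit_performance"],
    "credit_performance": [],
}

def descendants(node, edges=EDGES):
    """All variables reachable by following arrows out of `node`."""
    seen, stack = set(), list(edges[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(edges[current])
    return seen

def is_acyclic(edges=EDGES):
    """A Bayesian network must have no loops: no node may reach itself."""
    return all(node not in descendants(node, edges) for node in edges)

# The main crop reaches credit performance, but only through other scores.
print("credit_performance" in descendants("main_crop"))   # True
print("credit_performance" in EDGES["main_crop"])         # False (no direct arrow)
# The behavior score is not downstream of the planted crop.
print("behavior_score" in descendants("main_crop"))       # False
```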

Based on the research results, TrAIve’s models perform better when they go out-of-distribution, as the Bayesian network model helps them fill the gaps in data with a higher degree of confidence. Due to this, TrAIve can potentially provide these models to financial institutions, who want to service the underserved market of mid-sized farmers.

According to Mohammad Ghassemi, VP of Data Science for TrAIve, their models not only provide point estimates of the final credit performance score, but also provide confidence ranges around the point estimate.

This information is important for the financial institution to understand the credit performance and loan delinquency probability of a particular loan application, to come up with the right terms for the credit application, and also understand different tranches within their portfolio.
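To see why a range matters alongside a point estimate, consider a minimal sketch that is not TrAIve's method: a textbook Beta-Binomial estimate of a delinquency rate, with a uniform prior (my assumption) and the posterior evaluated on a grid. The loan counts are invented.

```python
import math

def delinquency_interval(defaults, loans, mass=0.90, grid_size=2000):
    """Posterior mean and central interval for a delinquency rate.

    Assumes a uniform Beta(1, 1) prior and a binomial likelihood, so the
    posterior is Beta(1 + defaults, 1 + loans - defaults). It is evaluated
    on a grid of rates, in log space to avoid numerical underflow.
    """
    a = 1 + defaults
    b = 1 + loans - defaults
    # Midpoints of equal-width cells, so the rate is never exactly 0 or 1.
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    log_w = [(a - 1) * math.log(p) + (b - 1) * math.log(1 - p) for p in grid]
    peak = max(log_w)
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    mean = sum(p * w for p, w in zip(grid, weights)) / total

    # Walk the grid until the cumulative mass passes each tail boundary.
    tail = (1 - mass) / 2
    cumulative, low, high = 0.0, None, None
    for p, w in zip(grid, weights):
        cumulative += w / total
        if low is None and cumulative >= tail:
            low = p
        if high is None and cumulative >= 1 - tail:
            high = p
    return mean, low, high

# Same observed delinquency rate (10%), very different amounts of evidence.
small = delinquency_interval(defaults=2, loans=20)
large = delinquency_interval(defaults=200, loans=2000)
print("20 loans:  ", [round(v, 3) for v in small])
print("2000 loans:", [round(v, 3) for v in large])
```

Both portfolios show the same observed rate, but the range around the small one is far wider, which is the kind of difference a lender would want to see before setting terms.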

At least based on the research paper, the salient takeaway is that a Bayesian network model created by experts, combined with an artificial intelligence model, performs better in out-of-distribution environments than an artificial intelligence model trained with large amounts of data.

Based on this, I asked about applicability of the model in other areas.

According to Aline and Mohammad, what transfers across from one context to the next will depend on what is embedded within the model, and what kind of Bayesian network is most applicable to the new context from an agronomic, economic, market, and social standpoint.

This highlights the importance of domain knowledge to get better performance.

Data quality is a thorny problem for any artificial intelligence model. Mohammad explained certain heuristics used by TrAIve by looking for expected patterns in data (for example, yield), and when they see significant deviations from expected patterns, it is a flag for the data science teams to take a closer look.
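One common way to implement this kind of check is a z-score rule: flag any value that sits far from a reference pattern. The sketch below is a generic illustration under that assumption, not TrAIve's actual heuristic. The yields, the farm ids, and the threshold of 3 are all invented.

```python
from statistics import mean, stdev

def flag_unusual_yields(reported, reference, z_threshold=3.0):
    """Return the farm ids whose reported yield is far from the reference pattern.

    `reported` maps a farm id to its reported yield; `reference` is a list of
    yields considered typical for the same crop and region.
    """
    center, spread = mean(reference), stdev(reference)
    return [farm for farm, value in reported.items()
            if abs(value - center) / spread > z_threshold]

# Hypothetical yields in tonnes per hectare.
typical_yields = [3.2, 3.5, 3.4, 3.6, 3.3, 3.5, 3.4, 3.7]
reported_yields = {"farm_a": 3.4, "farm_b": 6.8, "farm_c": 3.1}

flagged = flag_unusual_yields(reported_yields, typical_yields)
print(flagged)  # ['farm_b']
```

A flag here does not mean the record is wrong. It only marks it for a person to look at more closely.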

TrAIve’s B2B model lets financial institutions embed TrAIve’s models in their workflow.

A better user experience for financial institutions and borrowers

If you have had to deal with a lower credit score, your credit report explains why you have a lower credit score (you missed a few payments, or you defaulted on a loan). The output of an AI-based credit model cannot be purely a score (with a confidence interval). The model should be explainable and let the potential borrower know why and how they got the credit score they got.

In edition 146, “How about that user experience?” I had argued how LLMs are, can, and will be used to improve the user experience of dealing with complex analysis.

Traive researchers have proposed an innovative method that boosts the capability of Large Language Models (LLMs) to produce dependable credit risk reports – assisting capital market and agriculture supply chain players in making wiser lending decisions more efficiently — a groundbreaking development for the agricultural finance sector. [4]

TrAIve is using an innovative approach which layers a large language model (LLM) on top of the Bayesian network model discussed in the previous section to generate credit risk reports. The Bayesian network model helps the LLM generate a credit risk report which explains the underlying model behavior in easy-to-understand language.
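As a rough picture of what "layering" could look like, the sketch below turns structured output from a risk model into plain-text instructions for an LLM. It is not the Labeled Guide Prompting method from the paper [4]. The function, the field names, and the wording of the prompt are hypothetical, and no LLM is actually called.

```python
def build_report_prompt(applicant_id, default_probability, interval, factors):
    """Turn structured model output into a plain-text prompt for an LLM.

    `factors` maps a variable name to a (state, short note) pair.
    All names and values used here are hypothetical.
    """
    low, high = interval
    lines = [
        "You are drafting a credit risk report for a loan officer.",
        "Use only the facts listed below; do not invent numbers.",
        "",
        f"Applicant: {applicant_id}",
        f"Estimated probability of delinquency: {default_probability:.1%} "
        f"(range {low:.1%} to {high:.1%})",
        "Factors from the risk model:",
    ]
    for name, (state, note) in factors.items():
        lines.append(f"- {name}: {state} ({note})")
    lines.append("")
    lines.append("Explain in plain language how each factor affects the risk.")
    return "\n".join(lines)

prompt = build_report_prompt(
    applicant_id="farm-001",
    default_probability=0.12,
    interval=(0.08, 0.17),
    factors={
        "agronomic_score": ("weak", "yield history below regional pattern"),
        "behavior_score": ("strong", "no missed payments on record"),
    },
)
print(prompt)
```

The design point is that the numbers come from the risk model and the LLM is asked only to explain them, so the report stays tied to what the model actually computed.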

Based on the research paper, an LLM-generated report (LLM-G) beats the human-generated report (HG) 90 to 10 in English and 60 to 40 in Portuguese when evaluated against the question, “Which report was more helpful for you to assess the credit risk?”

What will be different for a midsize farmer with & without TrAIve?

Towards the end of my conversation with Aline and Mohammad, I wanted to get to the so-what of all the technical progress made by TrAIve.

My question was simple.

What will be different for a midsize farmer with & without TrAIve?

According to Aline, without TrAIve, a midsize farmer has a difficult job to be understood by lenders in their community. They have to put up significant collateral (which they might not have access to), have to explain the same things about their farming operation, history, and personal situation again and again, fill out onerous data requirements, and this could still result in unfavorable terms or no access to credit.

With TrAIve, a midsize farmer will be able to engage with a lender, based on some minimum set of data. The farmer will have a more representative evaluation of the credit risk score. It will help them get liquidity faster and at a cheaper rate than is possible today.

Financial institutions (which are TrAIve’s primary customers) will have an AI powered assistant, which will help them serve an underserved market, help them better understand the risk of their portfolios, and potentially bring in additional financial institutions to Brazil, and create more liquidity for farmers.

References

[1] Brett Drury, Jorge Valverde-Rebaza, Maria-Fernanda Moura, and Alneu de Andrade Lopes. 2017. A survey of the applications of Bayesian networks in agriculture. Engineering Applications of Artificial Intelligence, Volume 65, Pages 29-42, ISSN 0952-1976. https://doi.org/10.1016/j.engappai.2017.07.003

[2] Ana Clara Teixeira, Hamed Yazdanpanah, Aline Pezente, and Mohammad Ghassemi. 2023. Bayesian Networks Improve Out-of-Distribution Calibration for Agribusiness Delinquency Risk Assessment. In 4th ACM International Conference on AI in Finance (ICAIF '23), November 27--29, 2023, Brooklyn, NY, USA. ACM, New York, NY, USA 9 Pages. https://doi.org/10.1145/3604237.3626897

[3] A simple example of Bayesian Networks

Bayesian networks have the ability to reason about uncertainty

Think of it like this:

[4] Ana Clara Teixeira, Vaishali Marar, Hamed Yazdanpanah, Aline Pezente, and Mohammad Ghassemi. 2023. Enhancing Credit Risk Reports Generation using LLMs: An Integration of Bayesian Networks and Labeled Guide Prompting. In 4th ACM International Conference on AI in Finance (ICAIF '23), November 27--29, 2023, Brooklyn, NY, USA. ACM, New York, NY, USA 9 Pages. https://doi.org/10.1145/3604237.3626902