Brooke, the AI Broker
Problem & Motivation
Buying a home is not only one of the largest financial decisions a person will make but also one of the most overwhelming and stressful. In fact, 40% of Americans said that buying a house is the most stressful event in modern life.
As a result of the increased cost of living and rising housing prices, the median age of first-time home buyers has reached a record high of 38. Furthermore, new real estate rules coming in Summer 2025 will only increase the financial burden on many home buyers by requiring them to pay their agents directly. For a median-priced home in the United States, this comes out to $12.8K-$15.3K in additional costs that buyers may no longer be able to overlook. However, paying an agent is often not the ultimate solution.
Home buyers say the purchasing process is particularly difficult due to:
- Having an unhelpful agent (24%)
- Struggling with negotiations (28%)
- Exceeding their budgets (40%)
Our project aims to empower users to take control of their homebuying experience by putting expert-level guidance and support at their fingertips. Brooke provides the benefits of a traditional broker without the cost and with easier access. By integrating AI into real estate, we can help alleviate some of the financial burden of the homebuying process, allowing more individuals to buy their dream home.
Data Source
To power Brooke and her tools, we made use of three key datasets:
- Training Data: Zillow’s real estate queries
- Housing Data: Redfin’s housing market data
- Financial Data: Tax brackets and tax rates from taxfoundation.org, mortgage rates from Freddie Mac MBS, and homeowners insurance costs from Policy Genius
The data from Zillow consists of over 20,000 synthetically generated questions and legally compliant responses covering a range of real estate-specific topics. The Redfin data consists of historical home listing prices, home sale prices, active listings, number of days on the market, and other relevant KPIs. The financial data includes state and federal tax brackets from taxfoundation.org, up-to-date mortgage rates by state and credit score from Freddie Mac, and average homeowners insurance costs by ZIP code from Policy Genius.
All of our data was quality-checked for nulls, recency (2024 or later), and accuracy. We conducted exploratory data analysis (EDA) to confirm that values for mortgages, insurance rates, and housing prices were as expected. We also confirmed that the training data from Zillow was comprehensive across the 108 distinct topics covered, with no obvious topics missing or lacking adequate coverage.
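The null and recency checks described above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: the function name `quality_report`, the field names, and the sample rows are all hypothetical.

```python
from datetime import date


def quality_report(rows, required_fields, date_field, min_date):
    """Count missing values per field and rows dated before min_date."""
    missing = {field: 0 for field in required_fields}
    stale = 0
    for row in rows:
        for field in required_fields:
            if row.get(field) in (None, ""):
                missing[field] += 1
        row_date = row.get(date_field)
        if row_date is not None and row_date < min_date:
            stale += 1
    return {"rows": len(rows), "missing": missing, "stale": stale}


# Made-up sample rows, for illustration only (not real rate data).
sample = [
    {"state": "CA", "rate": 6.8, "as_of": date(2024, 3, 1)},
    {"state": "TX", "rate": None, "as_of": date(2024, 5, 1)},
    {"state": "NY", "rate": 7.1, "as_of": date(2023, 11, 1)},
]

report = quality_report(sample, ["state", "rate"], "as_of", date(2024, 1, 1))
print(report)
```

A report like this makes it easy to decide whether a source needs cleaning or a fresher extract before it feeds a downstream tool.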
Figure: Zillow Real Estate Query Training Coverage (EDA)
Figure: Mortgage Rate Heat Map (EDA)
Data Science Approach
Brooke is designed to provide a holistic view of the real estate market while offering personal support and relieving the stress of buying a home. It does so through three key features: 1) Chat with Brooke, 2) Budget Calculator, and 3) Market Trends.
Chat with Brooke
To power conversations with Brooke, we chose the open-source model DeepSeek-R1-Distill-Llama-8B for its reasoning skills and its fast, efficient processing. This provides high-quality responses without the wait time associated with a live broker.
To ensure our chatbot generates responses that are both helpful and legally compliant with the Fair Housing Act and the Equal Credit Opportunity Act, we fine-tuned the model using three key datasets:
- Zillow’s dataset of real estate-focused queries and responses generated using GPT-4o
- Our proprietary dataset of compliance-safe responses generated using Claude (Anthropic)
- Redfin property data (integrated through a RAG pipeline, along with the user's specific inputs from an introductory questionnaire)
To optimize performance, we fine-tuned the model using Unsloth, which uses 70% less memory and runs 2x faster while maintaining reliable model performance.
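To illustrate how property data and questionnaire inputs can be combined in a RAG-style prompt, here is a deliberately simplified sketch. It swaps in a plain city-and-budget filter for the retrieval step, which is an assumption; the project's actual retriever is not described here. The function names, field names, prompt wording, and listings are all hypothetical.

```python
def retrieve_listings(listings, profile, top_k=2):
    """Return up to top_k listings in the buyer's city and budget, cheapest first."""
    matches = [
        item
        for item in listings
        if item["city"] == profile["city"] and item["price"] <= profile["max_budget"]
    ]
    return sorted(matches, key=lambda item: item["price"])[:top_k]


def build_prompt(question, profile, context):
    """Assemble the retrieved listings and buyer profile into a single prompt."""
    lines = [
        "You are a real estate assistant. Follow fair housing rules.",
        f"Buyer profile: city={profile['city']}, max budget=${profile['max_budget']:,}",
        "Relevant listings:",
    ]
    lines += [f"- {item['address']}: ${item['price']:,}" for item in context]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Made-up listings and questionnaire answers, for illustration only.
listings = [
    {"address": "12 Oak St", "city": "Austin", "price": 410000},
    {"address": "9 Elm Ave", "city": "Austin", "price": 650000},
    {"address": "3 Pine Rd", "city": "Dallas", "price": 380000},
]
profile = {"city": "Austin", "max_budget": 500000}

context = retrieve_listings(listings, profile)
prompt = build_prompt("Is now a good time to buy?", profile, context)
print(prompt)
```

The point of the pattern is that the model only sees listings relevant to the buyer, so its answer is grounded in data rather than in whatever it memorized during training.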
Budget Calculator
For our Budget Calculator, we loaded and cleaned regional-level insurance and tax data, and aggregated housing cost data by region to calculate appropriate ranges. We also established a classification process based on property type to interact with user inputs in the initial questionnaire.
The tool is built with Gradio and was reviewed independently by a real estate agent and a financial expert.
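As a rough sketch of the arithmetic a budget calculator of this kind relies on, the snippet below combines the standard fixed-rate amortization formula with annual tax and insurance costs. It is an assumption about the general approach, not the tool's actual logic: the function names and input figures are hypothetical, and it leaves out the tax savings from buying that our calculator accounts for.

```python
def monthly_payment(principal, annual_rate, years=30):
    """Standard fixed-rate mortgage payment (principal and interest only)."""
    n = years * 12          # number of monthly payments
    r = annual_rate / 12    # monthly interest rate
    if r == 0:
        return principal / n
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


def monthly_housing_cost(price, down_payment, annual_rate,
                         annual_property_tax, annual_insurance, years=30):
    """Mortgage payment plus monthly share of property tax and insurance."""
    loan = price - down_payment
    escrow = (annual_property_tax + annual_insurance) / 12
    return monthly_payment(loan, annual_rate, years) + escrow


# Made-up inputs, for illustration only.
estimate = monthly_housing_cost(
    price=400_000,
    down_payment=80_000,
    annual_rate=0.06,
    annual_property_tax=6_000,
    annual_insurance=1_800,
)
print(f"Estimated monthly cost: ${estimate:,.2f}")
```

In practice, the tax and insurance inputs would come from the regional data rather than being typed in by hand.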
Market Trends
For Market Trends, we loaded and cleaned city-, state-, and metro-level KPIs and created 3- and 6-month rolling averages to make the visualization more readable. Like the Budget Calculator, the Market Trends product is populated by default from the user's inputs in the initial questionnaire to provide a more seamless user journey.
The visualization is built with Gradio and Altair.
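The 3- and 6-month rolling averages mentioned above smooth out month-to-month noise. A minimal sketch of a trailing rolling mean follows; the function name and the price series are made up for illustration.

```python
def rolling_mean(values, window):
    """Trailing mean over `window` points; None until enough history exists."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


# Made-up monthly median sale prices in $K, for illustration only.
prices = [400, 410, 420, 430, 440, 450]

avg_3m = rolling_mean(prices, 3)
avg_6m = rolling_mean(prices, 6)
print(avg_3m)
print(avg_6m)
```

The longer window reacts more slowly to change, which is why showing both gives a short-term and a medium-term view of the same KPI.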
Evaluation
To ensure that chatting with Brooke delivers accurate and compliant results, we compared Brooke to three open-source models:
- Llama 3
- Mistral Small
- DeepSeek-R1 (8B Distilled)
We benchmarked all of the models against Claude 3 Sonnet, using its responses as the reference. We chose Claude 3 Sonnet because it reasons well while producing safe and compliant answers at high speed. We tested on 2,000 real estate-focused queries, half focused on safety and compliance and half on the usefulness of the response. We measured performance with the BLEURT score because it focuses on the content of a response rather than on exact word matches, and because it is effective for evaluating longer conversations with Brooke.
Our model outperformed all three comparison models on both safety and usefulness, indicating that Brooke provides responses that are both highly accurate and legally compliant.
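Once each response has a similarity score against the reference answer, the comparison reduces to averaging scores per model and query category. The sketch below assumes the scores have already been computed. The model names and score values are placeholders, not our results, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_scores(records):
    """Average score for each (model, category) pair."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["model"], rec["category"])].append(rec["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


def best_model(summary, category):
    """Name of the model with the highest average score in a category."""
    candidates = {
        model: score for (model, cat), score in summary.items() if cat == category
    }
    return max(candidates, key=candidates.get)


# Placeholder scores, for illustration only (not real evaluation results).
records = [
    {"model": "model_a", "category": "safety", "score": 0.5},
    {"model": "model_a", "category": "safety", "score": 0.75},
    {"model": "model_b", "category": "safety", "score": 0.25},
    {"model": "model_b", "category": "safety", "score": 0.5},
    {"model": "model_a", "category": "usefulness", "score": 0.5},
    {"model": "model_b", "category": "usefulness", "score": 0.25},
]

summary = summarize_scores(records)
print(summary)
print(best_model(summary, "safety"))
```

Keeping the safety and usefulness categories separate means a model cannot hide weak compliance behind strong general answers.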
Impact
Ultimately, Brooke is a one-stop-shop real estate product designed to replace the broker. Furthermore, each part of the product delivers value beyond what is currently available in the market:
- Brooke, the AI Chatbot: Brooke serves the public, and is trained to specifically answer real estate questions, as opposed to other existing chatbots.
- Budget Calculator: Our tool takes into account the tax savings from buying, a large oversight by competitors, as identified by SMEs.
- Market Trends: Our tool covers the entire US, with augmented KPIs, in comparison to brokers, who typically specialize in one region.
The real estate industry is changing, and buyers are looking for smarter, more affordable solutions. Our mission was to build a product that gives buyers the power to buy a home without relying on an expensive agent, saving users thousands while making the process easier and less intimidating.
For first-time buyers, or anyone trying to maximize their budget, we believe that Brooke is a game-changer.
Key Learnings
As we reflect on our product, we’re proud to have overcome a number of challenges as we built our one-stop-shop broker. To power Brooke with multiple tools, we needed to invest heavily in data engineering, resource constraint mitigation, and project integration work.
Data Engineering: For data engineering, we combed through numerous data sources, APIs, and models, and pivoted multiple times to ensure our product was thorough and differentiated from competition.
Resource Constraints: We cleaned, adapted, and compressed our data sources to overcome memory (RAM) constraints, and invested money in training our open-source model.
Project Integration: Lastly, to combine the individual tools into one cohesive product for our users, we spent the final sprints integrating them in Git and simplifying and defining complicated user journeys.
While we’re proud of the product we’ve built this semester, we have a few roadmap items in our vision.
- Product Granularity: We have relatively few customer journey customizations, and we limited the data granularity for market trends due to resource and time constraints. Given more time, we would like to include more granular insights and support more distinct customer segments.
- Product Scope: We intentionally focused on buyers, but our product (Brooke, Market Trends, and even the Budget Calculator) could be retargeted to sellers or brokers with a different interface. We would also like to add other broker functions, such as referral identification.
- Live Product Updates: Lastly, we’d like to enhance our data pipeline, and make Brooke a tool that is continuously updated and enhanced, even past the end of this course.
Acknowledgements
We would like to extend our gratitude to Danielle Cummings and Todd Holloway for their guidance and support throughout the past semester as we brought our vision to life. You have provided insights that challenged us to improve our product in ways we on our own would not have considered.
We would also like to thank our domain experts, Sandra Dasaad (real estate agent) and Mallory, Neal, Jeff, and Lisa (current prospective homebuyers), who helped determine the vision and viability of this project by sharing their expertise and background on how Brooke could best be used in the market.
Lastly, we would like to thank Wes Yee, our financial expert, who verified that our budget calculations are sound.
Image and Research Credits:
https://listwithclever.com/research/homebuyer-sentiment/
https://images.app.goo.gl/aXNv9X4djdLRp2Rb6
Logo generated with AI