MIDS Capstone Project Spring 2026

Intentive

Proactive AI companion for goal alignment 

Motivation

Goal pursuit is a central challenge in human behavior. The dominant paradigm in productivity tools treats goal management as a storage and reminder problem — users set goals, tools surface them periodically. This treats goal execution as a deterministic scheduling problem, ignoring the stochastic, context-dependent nature of human motivation, energy, and attention.

 

According to a 2024 Forbes Health survey, 92% of people who set long-term goals never achieve them — that's 9 out of 10 people.

Mission: "To build personal AI that empowers people to stay aligned with their long-term goals."

The problem isn't willpower or laziness. It's the enormous distance between wanting something and doing something about it, day after day, month after month.

There are three main root causes for the failure of long-term goals:

  1. Present Bias: Our brains are wired to trade tomorrow's reward for today's comfort, so long-term commitments are repeatedly sacrificed for short-term ease.
  2. Cognitive Overload: Goals quietly disappear as they fall out of working memory amid the constant noise and competing priorities of daily life.
  3. Static tools: Today's productivity apps are transactional — they record what you did but never understand what you're going through or what works for you.

Our Solution

Intentive formalizes goal pursuit as a longitudinal behavioral inference problem: given a stream of naturalistic check-in observations over time, can a system accurately model the health of a goal pursuit trajectory, predict completion likelihood, detect early stall signals, and intervene with timely, contextually appropriate nudges? Intentive also generates thoughtful reflections that help people course-correct and give them the dopamine hit they need to keep pursuing their chosen goals.

 

Intentive delivers this through four core capabilities:

  • Persistent Memory. Goals within Intentive are not stored as static checkboxes. They are encoded as meaning — capturing your preferences, personality, and underlying motivations to build a dynamic, evolving understanding of what matters to you and why.

  • Daily Check-Ins. A single natural-language message is all that is required each day. Intentive automatically maps your input to every relevant goal, eliminating the friction of manual logging and keeping your progress continuously up to date.

  • Intelligent Reflections. Intentive connects day-to-day actions to your broader long-term arc, creating a continuous thread between your present self and the person you are striving to become — making progress visible, meaningful, and cumulative.

  • Thoughtful Nudges. When life disrupts momentum — as it inevitably does — Intentive proactively re-engages, gently and without judgment, to re-anchor you to the goals you identified as most important.

The Intentive Engine

Intentive is architected as a closed-loop adaptive system — functionally analogous to a reinforcement learning engine for human behavior. Users define long-term goals, which are encoded semantically as the reward state. Periodic natural-language check-ins serve as observation inputs, which the system uses to model behavioral patterns, update goal states, and generate reflective outputs. These reflections surface personalized insights that inform the user's next actions — closing the loop and beginning the next cycle. With each iteration, the system builds a richer model of the user's preferences, progress, and personality, compounding in value over time. The result is not a static productivity tool, but a self-improving growth engine that learns as you live.

Our Data Science Approach

The system is built on the following core technical components:

1. Long-Term Goal Representation & Semantic Memory: User goals are captured in natural language and stored as evolving semantic objects rather than static entries. Each goal is represented using a vector embedding for semantic retrieval, paired with lightweight metadata (creation time, last engagement timestamp, inferred goal state, and user-confirmed updates). This structure allows goals to evolve without being overwritten and enables retrieval behavior to be directly inspected and evaluated.

2. Check-in to Goal Mapping & Goal State Inference: Lightweight AI agents map text-based check-ins to goals and infer goal states over time: Active, Completed, Paused, Stalled, or Dropped/Not Relevant. Inference draws on sparse signals, including interaction frequency, semantic similarity between recent activity and stored goals, elapsed time since last engagement, and explicit user updates.

3. Agentic Reflections & Nudges:  Rather than optimizing for compliance or engagement, the system generates brief reflective summaries that reconnect recent activity (or inactivity) with long-term intentions — designed to reduce forgetting and ambiguity rather than prescribe actions. It also generates timely nudges to help people re-focus on drifting goals.

4. Context-Aware Agentic Suggestions: A bounded agentic component retrieves external events or activities relevant to user goals by querying structured APIs (e.g., Eventbrite, Meetup, Ticketmaster Discovery) and controlled web search endpoints (e.g., Tavily). Retrieved items are embedded, filtered, and ranked using semantic similarity and contextual constraints such as timing and availability.

5. RAG-Based Goal Tips: Intentive uses a RAG pipeline built on a behavioral science research corpus (PubMed, arXiv; 5 papers, 278 chunks) to surface research-backed goal-achievement tips. This separates Intentive from apps that only provide generic encouragement or rehash popular beliefs.

Other apps say…

“Believe in yourself! Keep trying and have confidence you can do it! Get sleep and eat vegetables. Eye of the Tiger!”

Intentive says…

“Create Specific Implementation Plans. Develop clear, detailed strategies for overcoming potential challenges in training. Specify exact times, locations, and actions.”
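The goal state inference described in component 2 can be illustrated with a small rule-based sketch. This is not Intentive's implementation: the source describes lightweight AI agents doing the inference, whereas the code below swaps in plain thresholds to make the signals concrete. The names `GoalSignals` and `infer_goal_state` and every threshold value are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Illustrative thresholds only; these are assumptions, not tuned values.
STALL_AFTER = timedelta(days=14)
DROP_AFTER = timedelta(days=60)
MIN_SIMILARITY = 0.35


@dataclass
class GoalSignals:
    last_engaged: datetime                # last check-in mapped to this goal
    checkins_last_30d: int                # interaction frequency
    recent_similarity: float              # similarity of recent activity to the goal
    explicit_state: Optional[str] = None  # user-confirmed update, if any


def infer_goal_state(signals: GoalSignals, now: datetime) -> str:
    """Return Active, Stalled, Dropped/Not Relevant, or the user's explicit state.

    In this sketch, Completed and Paused are only reachable through an
    explicit user update.
    """
    # Explicit user updates take precedence over anything inferred.
    if signals.explicit_state is not None:
        return signals.explicit_state
    idle = now - signals.last_engaged
    if idle >= DROP_AFTER:
        return "Dropped/Not Relevant"
    if idle >= STALL_AFTER or signals.checkins_last_30d == 0:
        return "Stalled"
    if signals.recent_similarity < MIN_SIMILARITY:
        # Still checking in, but recent activity no longer resembles the goal.
        return "Stalled"
    return "Active"
```

A rule-based version like this is easy to inspect, which may make it a useful baseline when evaluating an LLM-based inference agent.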

 

 


Data

Because the system is designed to support long-term goals across many domains, it does not rely on a single fixed dataset. Core intelligence comes from how the system represents goals, maintains memory, and reasons about context.

Research Papers for RAG grounding  

External Contextual Data Sources

For context-aware agentic features, the system queries:

  • Tavily (bounded agentic search) — for publicly available opportunities when structured APIs are insufficient

All external sources are used for information retrieval only. Decision-making around relevance, timing, and whether to surface a suggestion is handled by system logic, keeping agentic behavior controlled and interpretable.
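As a rough illustration of keeping the surfacing decision in system logic, the sketch below filters retrieved items by a time window and a similarity threshold, then ranks what remains. It assumes each item already carries a precomputed similarity to the user's goal. The `Candidate` type, the `rank_candidates` function, and the default threshold are hypothetical and not taken from the Intentive codebase.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Candidate:
    title: str
    starts_at: datetime
    similarity: float  # assumed precomputed similarity to the goal embedding


def rank_candidates(
    candidates: List[Candidate],
    window_start: datetime,
    window_end: datetime,
    min_similarity: float = 0.5,  # assumed threshold
    top_k: int = 3,
) -> List[Candidate]:
    """Keep items inside the time window and above the threshold, best first."""
    eligible = [
        c for c in candidates
        if window_start <= c.starts_at <= window_end
        and c.similarity >= min_similarity
    ]
    eligible.sort(key=lambda c: c.similarity, reverse=True)
    return eligible[:top_k]
```

Because the filter is deterministic, the reason a given suggestion was or was not surfaced can be traced back to a specific constraint.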


System Architecture

Intentive employs a microservices architecture with three separate databases to achieve long-term memory for personalized user experiences. The user interacts with the frontend, which is secured with Secure Sockets Layer (SSL) via nginx. The frontend calls the backend API, which contains all the business logic and processes data to be written to the data layer. This data layer is referred to as the Memzero API and wraps the PostgreSQL, Qdrant, and Redis databases. As a side effect, the backend also enqueues tasks on a RabbitMQ job queue; these are picked up by agentic workers running LangGraph, which perform LLM inference on user-provided data. The agents' results are written back to the Memzero API for long-term persistence and also published to the frontend using server-sent events (SSE).
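For readers unfamiliar with SSE, the sketch below shows how a single agent result could be serialized in the `text/event-stream` wire format. The helper name `format_sse`, the event name, and the payload fields are assumptions for illustration; the source does not describe Intentive's actual event schema.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialize one server-sent event in the text/event-stream format."""
    # json.dumps without indent produces a single line, which matters
    # because each "data:" field must not contain a raw newline.
    data = json.dumps(payload, separators=(",", ":"))
    # An event is a group of "field: value" lines ended by a blank line.
    return f"event: {event}\ndata: {data}\n\n"


# Hypothetical usage: push a freshly generated reflection to the browser.
message = format_sse("reflection", {"goal_id": "g1", "text": "Nice streak!"})
```

On the browser side, an `EventSource` listener registered for the same event name would receive the JSON string in the event's `data` property.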

Privacy considerations inspired this architecture. We deliberately created a dedicated API for memory management so that sensitive user data is locked down and isolated from other systems behind its own authentication. Furthermore, the privacy policy promises a right to deletion, which the Memzero API fulfills by removing the record from both the Qdrant vector store and the PostgreSQL database.


Agents

LLM reasoning is provided by an asynchronous agentic workflow. Tasks are published over the AMQP messaging protocol to a RabbitMQ broker, which implements a durable queue that writes messages to disk so that they are not lost if the broker restarts. Quality of service is set to a prefetch of 1, so a consumer is not sent a new message until it has acknowledged the previous one. The agents run in a dedicated container using a thread pool executor, allowing user inputs to be processed in parallel.

There are 16 dedicated agent types: 13 use LLM inference and 3 operate on logical rules. They fall into four general categories: cognitive agents, which track the user's mental state via sentiment and behavioral metrics; goal intelligence agents, which track goal progression; contextual agents, which provide context-aware suggestions to users; and the search agent, which searches the internet for relevant resources to support those suggestions. Using these agents, Intentive can give users metrics on how well they have been doing on their goals, schedule adaptive nudges tailored to their usage preferences, and surface targeted resources, such as local events, that could be related to their goals.

Results from the agents are not returned through the RabbitMQ broker. Instead, they are sent to the Memzero API for long-term persistence, with immediately useful results pushed to the frontend using SSE. Agent performance is tracked using Braintrust, which provides evaluation metrics for software efficiency as well as for LLM and prompt performance.
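The worker side of this workflow can be sketched with the standard library alone. The code below leaves out the broker entirely and treats each task as a JSON byte string, routing it to a handler by agent type and processing a batch on a thread pool. The handler names, the `agent_type` and `payload` message fields, and the streak rule are all hypothetical; real LLM-backed agents would call a model where the placeholder returns a fixed result.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def sentiment_agent(payload: dict) -> dict:
    # Placeholder for an LLM-backed agent; a real one would call a model here.
    return {"agent": "sentiment", "user_id": payload["user_id"], "status": "done"}


def streak_rule_agent(payload: dict) -> dict:
    # Rule-based agent: a simple threshold, no LLM call.
    on_streak = payload.get("days_in_a_row", 0) >= 3
    return {"agent": "streak", "user_id": payload["user_id"], "on_streak": on_streak}


# Routing table from the (assumed) agent_type field to its handler.
HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "sentiment": sentiment_agent,
    "streak": streak_rule_agent,
}


def handle_message(body: bytes) -> dict:
    """Decode one task message and run the matching agent."""
    task = json.loads(body)
    return HANDLERS[task["agent_type"]](task["payload"])


def process_batch(bodies: List[bytes], max_workers: int = 4) -> List[dict]:
    """Run tasks in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_message, bodies))
```

In a broker-backed consumer, the acknowledgement would typically be sent only after `handle_message` returns, so that a task interrupted by a crash is redelivered rather than lost.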


Long-Term Memory

Long-term memory is achieved using a combination of SQL and NoSQL databases, which together improve the function and performance of the system. Memories and textual information are stored in the PostgreSQL database, while their vectorized representations are stored in a dedicated Qdrant vector store with separate collections for different kinds of memories, such as goals, check-ins, and research-based documents. User inputs are vectorized using AWS Bedrock embeddings and mapped to a UUID that is stored with the associated text in our SQL database. When the backend or the agents process the latest messages from the user, they also query the Memzero API to retrieve the most semantically similar memories, identified using cosine similarity. This lets the system build contexts enriched with past events and observations so that user progress is represented. To decrease the load on our SQL database, the Memzero API employs a Redis cache that allows fast retrieval of frequently accessed data.
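The retrieval step can be made concrete with a small in-memory sketch. Here two dictionaries stand in for the vector store and the SQL table, linked by a shared UUID, and search ranks stored vectors by cosine similarity to the query. The `MemoryIndex` class and its methods are hypothetical, the tiny two-dimensional vectors are made up, and a real deployment would delegate the search to the vector database.

```python
import math
import uuid
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryIndex:
    """In-memory stand-in: `vectors` plays the vector store, `texts` the SQL table."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}
        self.texts: Dict[str, str] = {}

    def add(self, text: str, embedding: Sequence[float]) -> str:
        memory_id = str(uuid.uuid4())  # shared key across both stores
        self.vectors[memory_id] = list(embedding)
        self.texts[memory_id] = text
        return memory_id

    def search(self, query: Sequence[float], top_k: int = 3) -> List[Tuple[float, str]]:
        scored = [(cosine_similarity(query, vec), mid) for mid, vec in self.vectors.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Resolve each UUID back to its stored text.
        return [(score, self.texts[mid]) for score, mid in scored[:top_k]]
```

Keeping the UUID as the only link between the two stores also makes deletion straightforward, since removing a memory means dropping the same key from both.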


Evaluation

Intentive uses a two-layer evaluation framework to measure the quality of its AI-generated outputs. All results are tracked in Braintrust to support iterative comparison across prompt and agent changes.

Component-level evaluation

The first layer tests each of our twelve agent prompts in isolation. Every prompt is run against a fixed set of inputs and scored along multiple dimensions. Some scorers are heuristic, such as validating JSON format, enforcing word counts, and checking that numeric outputs fall within expected ranges. Others are LLM-based judges that rate qualitative properties such as empathy, actionability, and specificity using structured rubrics with anchor points at each score level.
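The heuristic scorers mentioned above are simple enough to sketch directly. The three functions below each return 1.0 for a pass and 0.0 for a fail, which is one common convention for eval scorers. The function names and the pass/fail scoring scheme are assumptions, and the sketch does not use the Braintrust SDK.

```python
import json


def valid_json_score(output: str) -> float:
    """1.0 if the output parses as JSON, else 0.0."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return 0.0
    return 1.0


def word_count_score(output: str, max_words: int) -> float:
    """1.0 if the output stays within the word limit, else 0.0."""
    return 1.0 if len(output.split()) <= max_words else 0.0


def in_range_score(value: float, low: float, high: float) -> float:
    """1.0 if a numeric output falls inside [low, high], else 0.0."""
    return 1.0 if low <= value <= high else 0.0
```

Checks like these are cheap and deterministic, so they can run on every prompt change before any LLM-based judge is invoked.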

To support rapid iteration on these prompts, we built a companion tool: a Streamlit-based prompt engineering lab. It allows any contributor to select an agent prompt, fill in test variables, run it against multiple LLM providers, and compare outputs side-by-side. The same interface can trigger the full Braintrust eval suite to track how prompt changes affect quality over time.

 

Journey-level evaluation

Testing prompts in isolation cannot answer whether the system as a whole behaves well over time. To address this, we built a second evaluation layer that simulates complete 30-day user journeys.

This layer is driven by seven synthetic personas, each designed around a distinct behavior pattern: a user who starts strong but loses motivation, one who is consistently exhausted, one who disengages for two weeks before returning, and so on. Each persona's journey is simulated by a LangGraph agent that iterates through 30 days. Each day, an LLM role-plays as the persona and decides whether to check in, what to write, and at what energy level — based partly on their baseline tendency and partly on what the system sent them the day before. Throughout the journey, nine real agent prompts fire in sequence: goal strategies, check-in matching, reflections, nudges, context suggestions (backed by real Tavily search), and monthly momentum summaries.
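The shape of the daily loop can be sketched without any LLM. In the code below, a seeded random draw replaces the LLM role-play: the chance of a check-in is the persona's baseline tendency plus an assumed boost when a nudge was sent the day before, and the system "nudges" only on missed days. The `Persona` fields, the boost mechanism, and the nudge rule are illustrative assumptions rather than the project's simulator.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Persona:
    name: str
    baseline_checkin_prob: float  # tendency to check in on any given day
    nudge_boost: float            # assumed lift after a nudge the day before


def simulate_journey(persona: Persona, days: int = 30, seed: int = 0) -> List[dict]:
    """Simulate one journey; the same seed always yields the same log."""
    rng = random.Random(seed)
    log: List[dict] = []
    nudged_yesterday = False
    for day in range(1, days + 1):
        prob = persona.baseline_checkin_prob
        if nudged_yesterday:
            prob += persona.nudge_boost
        checked_in = rng.random() < min(prob, 1.0)
        # Placeholder system response: send a nudge only after a missed day.
        nudge_sent = not checked_in
        log.append({"day": day, "checked_in": checked_in, "nudge_sent": nudge_sent})
        nudged_yesterday = nudge_sent
    return log
```

Seeding the generator keeps runs reproducible, which helps when comparing two prompt versions against the same simulated behavior.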

 

 

The full transcript is scored across seven dimensions: coherence across days, fit to the specific persona, strategy calibration to user expertise, estimated goal progress and system contribution to that progress, presence of failure modes, projected three-month retention, and simulated nudge reactions. Each run produces both a markdown transcript and an interactive HTML visualization of the full journey.

Limitations

This framework is designed as a development tool to iterate on the system with confidence that quality is moving in a useful direction. It is not a substitute for real user data. Every quality score ultimately reflects one LLM evaluating another, and the personas themselves are based on assumptions about common goal-tracking archetypes rather than empirical user research. The framework can surface whether advice appears personalized and actionable, but cannot verify domain correctness. The next step in validation is a small-scale user pilot designed to measure actual retention and goal progress, and to test whether the simulated eval scores are predictive of real-world impact.


Impact

Difficulty sustaining long-term goals is a near-universal human experience with real-world consequences: abandoned health goals contribute to preventable illness, interrupted learning limits career growth, and repeated failure erodes self-efficacy. From an AI perspective, this problem exposes a fundamental limitation of current systems — they are optimized for immediacy rather than continuity. A successful solution would advance the design of AI systems that remember, adapt, and support long-term human intentions rather than optimize for short-term engagement.

The global productivity market is currently estimated at 60 to 70 billion dollars, with AI-powered personal assistants representing a rapidly growing segment. Our target users are professionals and students.


Acknowledgements

  • Prof Korin Reid and Prof Todd Holloway, for their guidance and support throughout the capstone journey.

  • Prof Vinicio De Sola, for his time and for the idea of using user personas and a RAG system grounded in peer-reviewed research papers to provide goal-related tips.


Future Work

  • Real User Pilot — Deploy to 10–20 users over two weeks to validate whether simulated evaluation scores predict real-world engagement.
  • Nudge Calibration — Refine nudge tone, timing, and content, identified as the highest-priority improvement from internal evaluation.
  • Fine-Tuned Guardrails — Explore lightweight fine-tuned models for content guardrails to reduce operational cost without sacrificing accuracy.
  • Voice Check-Ins — Introduce voice input to make daily logging more natural, particularly for mobile users.
  • Calendar Integration — Automatically surface goal-relevant events and deadlines directly from the user's calendar.
  • Mobile-Native App — Develop a native iOS and Android experience to reduce friction and support consistent daily engagement.


Last updated: April 16, 2026