
Prototyping with AI to help navigate SNAP purchase restrictions: lessons from edge cases


Starting January 1st, 2026, Indiana will implement new SNAP purchase restrictions under a federally approved waiver. Recipients will no longer be able to use their benefits to buy "soft drinks" or "candy" — categories operationalized with specific criteria. With USDA approving six additional states this week, bringing the total to 12, understanding how recipients and retailers will navigate new purchase restrictions becomes increasingly important.

While purchase restrictions in SNAP are a highly debated policy issue, the implementation reality creates practical challenges for both recipients and retailers trying to answer the more fundamental question of what is, and is not, allowed under the new rules.

The difficulty of understanding how the restrictions are likely to apply to food items where the answer is not obvious (“edge cases”) presented a compelling opportunity to test current AI capabilities: could AI models help people quickly determine whether a specific food item falls under the new restrictions or is allowed?

We built both a prototype and an initial evaluation to get a basic sense of how well models do in this area, identify specific difficult “edge case” food items, and develop an AI workflow that could tackle this problem.

Why this problem is well-shaped for AI capabilities

The core problem that SNAP recipients will face starting in January can be decomposed into two pieces that map well to current AI model strengths:

1. Image recognition: Modern vision models can reliably identify food items from photos, meaning users could potentially just take a picture at the store using the app rather than having to type out product names (or even find and scan the bar code).

2. Applying general rules to specific items: The waiver includes specific definitions of "soft drinks" and "candy" (see below). That means figuring out whether a food is allowed involves applying those policy specifics to a given item with general-purpose reasoning. A question like “Does corn syrup count as ‘natural fruit juice’? Yes or No” is one that modern AI models are capable of handling.

Here are the waiver’s specific definitions:

Candy: A preparation of sugar, honey, or other natural or artificial sweeteners in combination with chocolate, fruits, nuts, or other ingredients or flavorings in the form of bars, drops, or pieces. The term does not include any preparation requiring refrigeration.

Soft Drinks: Nonalcoholic beverages that contain natural or artificial sweeteners. The term does not include beverages that contain milk or milk products, soy, rice, or similar milk substitutes, or are exclusively naturally sweetened using natural vegetable and / or fruit juice.
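
To make the structure of these definitions concrete, here is a minimal Python sketch that encodes each one as a check over a few label attributes. It illustrates one reading of the rules and is not the prototype's code or official guidance. The class and field names (`Beverage`, `FoodItem`, `only_juice_sweetened`, and so on) are hypothetical, and deciding the value of each field for a real product is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Beverage:
    # Hypothetical attributes one would read off a product label.
    alcoholic: bool
    has_sweetener: bool                # any natural or artificial sweetener
    contains_milk_or_substitute: bool  # milk, milk products, soy, rice, or similar
    only_juice_sweetened: bool         # sweetened exclusively with fruit/vegetable juice


def is_soft_drink(b: Beverage) -> bool:
    """Apply one reading of the waiver's soft drink definition."""
    if b.alcoholic or not b.has_sweetener:
        return False
    # The two exclusions named in the definition.
    if b.contains_milk_or_substitute or b.only_juice_sweetened:
        return False
    return True


@dataclass
class FoodItem:
    # Hypothetical attributes; "form" is a free-text description of the shape.
    has_sweetener: bool
    form: str
    requires_refrigeration: bool


def is_candy(item: FoodItem) -> bool:
    """Apply one reading of the waiver's candy definition."""
    # Assumption: "other ingredients or flavorings" is open-ended, so any
    # sweetened preparation in a qualifying form is treated as a match.
    if item.requires_refrigeration:
        return False
    return item.has_sweetener and item.form in {"bar", "drop", "piece"}
```

Under this reading, a sweetened, shelf-stable bar counts as candy, and a sweetened drink counts as a soft drink unless it contains milk (or a milk substitute) or is sweetened exclusively with juice.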

We started by building a relatively simple multimodal prototype in our AI sandbox application that took an image and attempted to determine if the item was allowed or disallowed under the waiver rules. This was our initial “zero shot” test to get a very basic sense of how well AI models do out of the box.

It worked reasonably well overall, but we quickly found interesting edge cases.

For example, a Snickers ice cream bar would sometimes be categorized as “allowed” and other times as “not allowed.” In our earliest tests, we even saw language models saying it was both allowed and disallowed within the same response.

This told us we needed to build a more robust text-based evaluation data set (“eval”) before tackling the full multimodal version.
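
One simple way to quantify this kind of inconsistency is to run the same item several times and measure how often the runs agree. The sketch below assumes the labels have already been collected from repeated model calls; the `agreement` helper and the sample `runs` list are hypothetical and are not results from our tests.

```python
from collections import Counter


def agreement(labels: list[str]) -> tuple[str, float]:
    """Return the most common label and the share of runs that produced it."""
    if not labels:
        raise ValueError("need at least one label")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical labels from five repeated runs on the same item.
runs = ["allowed", "not_allowed", "allowed", "allowed", "not_allowed"]
majority, share = agreement(runs)
print(f"majority={majority} share={share:.2f}")
```

A share well below 1.0 flags an item as unstable and a good candidate for the evaluation set.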

Building the evaluation dataset and applying the rules to difficult edge cases

Following our usual approach, we started by building an eval data set with three categories:

  • Clearly allowed items: e.g. bread, eggs, fresh produce
  • Clearly prohibited items: e.g. Coca-Cola, Twix bars — obvious candy and soft drinks
  • Ambiguous cases: items that require careful interpretation of the legal definitions to determine if allowed

For the ambiguous cases, we had to apply the rules ourselves and develop our own interpretations, since implementation is still in progress and there is not yet a canonical reference for every possible product out there.

Edge cases reveal the complexity of implementing a policy like purchase restrictions

Some of the most interesting findings came from testing the “gray area” edge case items:

Protein bars: While not traditionally considered "candy," many protein bars fit the legal definition under the waiver.

Sunny Delight: Evaluating this one actually surprised me. What seems like a primarily fruit juice drink in fact has high fructose corn syrup as its second ingredient, with "less than 2% each" of various fruit juices. Under the waiver's definition, this clearly qualifies as a soft drink because it contains sweeteners that aren't natural fruit or vegetable juice.

Ocean Spray Cranberry Juice Cocktail: This became a hotly contested case in our weekly SNAP trivia in Slack. The original version (“...juice cocktail”) has added sugar, and even describes why: “nutrient-dense cranberries are naturally tart, so we add sugar for taste.” That appears to make it a soft drink under the restriction waiver’s terms. However, Ocean Spray also makes a "100% juice" cranberry version that would be allowed by virtue of being exclusively sweetened with natural fruit juice.

Liquid Death: While primarily known as a water brand, the flavored varieties are labeled as "Soda Flavored Sparkling Water" — so is it soda or sparkling water? Or more specifically, is it a soft drink, as defined? Based on a strict reading of the waiver, the flavored versions would likely be prohibited while the plain sparkling water versions would be allowed.

Not forcing incorrect assumptions by the model: the "needs more info" problem

One lesson that emerged quickly while building the evaluation was that forcing a simple binary classification (allowed vs. not allowed) was not the best approach.

Many items required additional information to make an accurate determination. For example:

  • Ingredient lists (to check for added sweeteners)
  • Whether the item requires refrigeration (relevant to the candy definition)
  • The form factor (bar, drop, or piece—also part of the candy definition)

Taking one example above, if all we gave the model was the text “Liquid Death,” it would have to make an assumption about which sub-type or flavor it was assessing. This also illustrates why using an image as the input can be more helpful: the photo would likely show the flavor.

Rather than forcing the model to guess, we found value in creating a third category of golden answers: “need more info.”

This concept—where models can abstain from making predictions when confidence is low—is known as selective prediction or abstention in AI evaluation.
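
A minimal sketch of scoring against golden answers that include the third category is shown below. The three items and their labels come from examples in this post; the label strings, the `score` function, and the "forced guess" metric are assumptions for illustration and not our actual eval harness.

```python
LABELS = {"allowed", "not_allowed", "need_more_info"}

# Golden answers drawn from examples in this post.
GOLDEN = {
    "bread": "allowed",
    "Coca-Cola": "not_allowed",
    "Liquid Death": "need_more_info",  # the flavor is not specified
}


def score(predictions: dict[str, str]) -> dict[str, float]:
    """Compare model labels to golden labels.

    Returns overall accuracy plus the share of items where the model
    committed to an answer although the golden label was "need_more_info".
    """
    correct = 0
    forced_guesses = 0
    for item, gold in GOLDEN.items():
        pred = predictions[item]
        if pred not in LABELS:
            raise ValueError(f"unexpected label for {item!r}: {pred!r}")
        if pred == gold:
            correct += 1
        elif gold == "need_more_info":
            forced_guesses += 1
    total = len(GOLDEN)
    return {
        "accuracy": correct / total,
        "forced_guess_rate": forced_guesses / total,
    }
```

Tracking forced guesses separately from plain accuracy makes it visible when a model answers confidently on items that cannot be decided from the text alone.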

An emerging workflow pattern for categorizing food items

As we worked through examples and edge cases, the shape of a workflow fit to the problem began to emerge:

  1. Image identification: Can we reliably identify what product the person is asking about from an image they take in a store aisle and convert it into a text description?
  2. Initial classification: Run the text description through the model to get an initial assessment of likely “allowed,” “not allowed,” or “more information needed”
  3. Information gathering: If more information is needed, gather it. Either:
    1. Ask the user (e.g. potentially having them photograph the ingredient list), or
    2. Attempt automated research (e.g. search the web for “Liquid Death Severed Lime ingredients” and gather it that way without additional user burden)
  4. Final determination: Provide the answer, along with an explanation why
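
The four steps above can be sketched as a small function. This is a hypothetical outline under stated assumptions, not the prototype's implementation: `identify`, `classify`, and `gather_info` are placeholder callables standing in for a vision model call, a text model call, and either a user prompt or a web lookup.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Determination:
    label: str        # "allowed", "not_allowed", or "need_more_info"
    explanation: str  # why the item received this label


def determine(
    image: bytes,
    identify: Callable[[bytes], str],
    classify: Callable[[str, Optional[str]], Determination],
    gather_info: Callable[[str], Optional[str]],
) -> Determination:
    # Step 1: turn the photo into a text description of the product.
    description = identify(image)

    # Step 2: initial classification from the description alone.
    result = classify(description, None)
    if result.label != "need_more_info":
        return result

    # Step 3: ask the user or research the product for the missing details.
    extra = gather_info(description)
    if extra is None:
        # Nothing more could be found, so report the item as undetermined.
        return result

    # Step 4: final determination using the extra information.
    return classify(description, extra)
```

Passing the three steps in as callables keeps the control flow testable with simple stubs, and it leaves open whether step 3 is handled by the user or by automated research.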

Step 3 also brings up an interesting design consideration. While we could potentially do a web search for ingredient information, prompting users to take a picture of the ingredient list makes it clearer what makes a particular food or beverage allowed or not under the policy. This could help people build mental models of what is allowed, rather than needing to rely on using the tool every time.

What this reveals about implementation

Testing these edge cases highlighted just how complex even a simple-sounding policy implementation can be.

While eliminating “soft drinks” and “candy” from allowable SNAP purchases sounds straightforward, working through the thousands of products in a typical grocery store and determining what does and does not count is an intensive process.

The definitions in the waiver are necessarily specific to allow determinations to be made, but they create classifications that don't always match people’s intuitive expectations. Creating such definitions is an innately difficult task.

The complexity affects everyone involved: recipients trying to understand what they can buy, retailers trying to configure their systems, and state agencies providing guidance.

Next steps

We're continuing to refine the text-based evaluation and plan to build out the image recognition component more rigorously. The goal is to create a tool we have sufficient confidence in (via evaluation) to pilot, and one that genuinely reduces friction for people navigating these new restrictions.

We are also exploring if we can feed large datasets of potential "edge case" food items (for example, by using grocery APIs) to identify items where policymakers may need to determine whether they are allowed as part of implementation.

If you would like more specific details on our methods and eval, we'd love to hear from you. You can reach us by emailing Dave at dave.guarino@joinpropel.com

This research was led by Leo Mancini and Dave Guarino. This post was written by Dave Guarino with draft editing and feedback by Anthropic’s Claude 4 Sonnet model.