The proliferation of recipe websites and food blogs is digitizing the recording and sharing of recipes. This results in an enormous amount of data about personal preferences, cooking methods, and ingredient combinations, often including demographic information. Many are attempting to use this wealth of data to build taste graphs and to strengthen recommendation algorithms.
A new study by University of Michigan Associate Professor Lada Adamic, Chun-Yuen Teng and Yu-Ru Lin analyzes Allrecipes.com’s 46,337 recipes, 1,976,920 user reviews, and data from approximately 530,609 users to understand the fundamentals of cooking and user preferences. One of the most interesting features of this study is the inclusion of user reviews in the data set. User comments are a largely untapped source of insights about ingredient combinations and optimal substitutions. Can you imagine, for example, if the wealth of insightful recipe and restaurant recommendations contained within Chowhound’s forums could be captured and used in a more structured way?
The following is a summary of their findings, reposted from Adamic’s blog*. Recipe Recommendations Using Network Analysis can be read in full here.
Finding #1
If one examines complementary ingredients, two main communities emerge: one sweet, the other savory (see image above).
There is also a smaller, third community of ingredients for mixed drinks.
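To make the idea of ingredient communities concrete, here is a minimal sketch on invented toy recipes. It scores ingredient pairs with pointwise mutual information (PMI), a common co-occurrence measure, and then groups ingredients by connected components. Both the data and the use of connected components as a stand-in for proper community detection are assumptions made for illustration; the summary above does not spell out the study's exact procedure.

```python
import math
from collections import Counter
from itertools import combinations

# Toy recipes (hypothetical data, not taken from the study).
RECIPES = [
    {"flour", "sugar", "butter", "egg"},
    {"flour", "sugar", "vanilla", "egg"},
    {"sugar", "butter", "vanilla"},
    {"onion", "garlic", "olive oil", "tomato"},
    {"onion", "garlic", "chicken"},
    {"garlic", "olive oil", "chicken", "tomato"},
]


def pmi_edges(recipes):
    """Return {(a, b): PMI} for every ingredient pair that co-occurs at least once."""
    n = len(recipes)
    single = Counter(ing for r in recipes for ing in r)
    pair = Counter(frozenset(p) for r in recipes for p in combinations(sorted(r), 2))
    edges = {}
    for key, count in pair.items():
        a, b = sorted(key)
        # PMI = log( p(a, b) / (p(a) * p(b)) )
        edges[(a, b)] = math.log((count / n) / ((single[a] / n) * (single[b] / n)))
    return edges


def communities(edges, threshold=0.0):
    """Group ingredients into connected components over edges with PMI > threshold."""
    adj = {}
    for (a, b), score in edges.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if score > threshold:
            adj[a].add(b)
            adj[b].add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        groups.append(group)
    return groups


groups = communities(pmi_edges(RECIPES))
```

On this toy data the baking ingredients and the savory ingredients never share a recipe, so they land in separate groups. Real recipe data is far noisier, and a dedicated community-detection algorithm would be needed to separate the clusters.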
Finding #2
Recipe reviews are a goldmine of data. There are ample suggestions for modifications (additions, deletions, increases, decreases, substitutions). These could be used to create “flexible” recipes, suggesting a range for the quantity of an ingredient, and possible substitutes. In fact, a substitute network reveals global communities of interchangeable ingredients.
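As a rough illustration of how substitution suggestions might be pulled out of review text, the sketch below matches a few hand-written phrase patterns and counts directed "original to substitute" pairs. The review snippets, the patterns, and the function name `substitution_edges` are all hypothetical; this is not the study's extraction pipeline, and real reviews would need multi-word ingredient names and much more robust parsing.

```python
import re
from collections import Counter

# Hypothetical review snippets, invented for illustration.
REVIEWS = [
    "I replaced the butter with applesauce and it was great.",
    "Used applesauce instead of butter, much lighter!",
    "I substituted honey for sugar.",
    "Replaced butter with margarine.",
]

# Each pattern captures the original ingredient and its substitute
# (single-word ingredients only, to keep the sketch short).
PATTERNS = [
    re.compile(r"replaced? (?:the )?(?P<orig>\w+) with (?P<sub>\w+)", re.I),
    re.compile(r"(?P<sub>\w+) instead of (?:the )?(?P<orig>\w+)", re.I),
    re.compile(r"substituted? (?P<sub>\w+) for (?:the )?(?P<orig>\w+)", re.I),
]


def substitution_edges(reviews):
    """Count directed (original, substitute) suggestions found in review text."""
    edges = Counter()
    for text in reviews:
        for pattern in PATTERNS:
            for m in pattern.finditer(text):
                edges[(m["orig"].lower(), m["sub"].lower())] += 1
    return edges


edges = substitution_edges(REVIEWS)
for (orig, sub), count in edges.most_common():
    print(f"{orig} -> {sub}: {count}")
```

The resulting counts can be read as weighted edges of a small substitute network, where heavier edges mark substitutions that more reviewers suggested.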
Finding #3
Ingredient networks can be used to predict recipe ratings. “These networks encode which ingredients go well together, and which can be substituted to obtain superior results, and permit one to predict, given a pair of related recipes, which one will be more highly rated by users.” It appears that the substitute network in particular encodes nutrition information, e.g. users’ preferences for “healthier” variants of a recipe.
Finding #4
The hypothesis presented in Catching Fire, that humans have evolved to prefer cooking methods that extract more energy value from food, is consistent with recipe ratings. Recipes that call for heating (baking, boiling, grilling) are rated on average more highly than those that call only for mechanical preparation methods (chopping, mixing). Chemical methods (marinating and brining) give a slight additional boost.
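The comparison described here boils down to grouping recipes by preparation method and averaging their ratings. A minimal sketch, assuming each recipe is reduced to a set of method keywords and one average rating; the keyword lists and the ratings below are invented for illustration and say nothing about the study's actual numbers.

```python
from statistics import mean

# Hypothetical keyword lists; the study's actual categories may differ.
METHODS = {
    "heat": {"bake", "boil", "grill", "fry"},
    "mechanical": {"chop", "mix", "stir"},
    "chemical": {"marinate", "brine"},
}

# Invented (preparation steps, average rating) pairs.
RECIPES = [
    ({"chop", "mix"}, 4.0),
    ({"chop", "bake"}, 4.5),
    ({"marinate", "grill"}, 4.75),
    ({"mix"}, 3.5),
]


def uses(steps, category):
    """True if any step belongs to the given method category."""
    return bool(steps & METHODS[category])


def mean_rating(recipes, predicate):
    """Average rating of the recipes whose steps satisfy the predicate."""
    ratings = [rating for steps, rating in recipes if predicate(steps)]
    return mean(ratings) if ratings else None


heated = mean_rating(RECIPES, lambda s: uses(s, "heat"))
mechanical_only = mean_rating(
    RECIPES, lambda s: not uses(s, "heat") and not uses(s, "chemical")
)
print(heated, mechanical_only)
```

With real data, a difference in means like this would also need a significance test before drawing conclusions.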
Finding #5
US regional preferences are easily discernible, e.g. frying is popular in the South, and grilling is popular on the West Coast and in the mountain regions. It would be interesting to study how these are affected by the availability of ingredients and cultural influences.
Also, stay tuned for some fantastic related work by YY Ahn, Sebastian Ahnert, James Bagrow and Laszlo Barabasi, getting to the bottom of recipe preferences by analyzing networks of flavor compounds in food pairings.
*Published with permission from Lada Adamic.