Kunihiro Tada, Mindware Research Institute
In the previous article, we (1) created rich, food-report-style texts describing the tasting experience of 20 fictitious crisps, (2) obtained their embedding vectors, (3) extracted three-dimensional factors from them using a dimensionality reduction technique called UMAP, (4) trained a self-organising map (SOM) on these factors to produce a semantic map of potato chips, and (5) introduced the possibility of using an LLM to interpret the factors and to select arbitrary regions of the semantic map in order to explore new product concepts. This can be said to be similar to Internal Preference Mapping, the first stage of a method called Preference Mapping in sensory evaluation analysis. In Preference Mapping, sensory characteristics are quantified, and complex information that cannot be expressed in this way is lost. In contrast, the proposed method can handle rich information using natural language. Also, Preference Mapping uses only two axes, the first and second principal components of PCA, whereas the SOM allows mapping with less information loss by using more factors.
Clustering consumers
The next step is to cluster the consumers’ rating scores for the same 20 crisps as before. The traditional method uses a general hierarchical cluster analysis to divide consumers into a reasonable number of groups, but we have not gained any insights from this, which is very disappointing. With SOM, we can instead arrange consumers in two dimensions and explore their space by choosing different numbers of clusters.
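To make "arranging consumers in two dimensions" concrete, here is a minimal sketch of a generic online SOM in plain Python. It is not the algorithm used by any particular product, and the toy data, grid size, learning-rate schedule and function names are all assumptions chosen for illustration. It covers only the ordering step; grouping the trained nodes into clusters is not shown.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small online SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    # Initialise every node with a copy of a randomly chosen data vector.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):  # one pass in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
    return weights


# Hypothetical toy data: 6 consumers x 3 crisps, each row scaled to sum to 1.
consumers = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
    [0.1, 0.4, 0.5],
    [0.3, 0.4, 0.3],
    [0.4, 0.3, 0.3],
]
som = train_som(consumers, rows=3, cols=3)
# Map position (row, col) assigned to each consumer.
positions = [best_matching_unit(som, c) for c in consumers]
```

Each consumer ends up at a grid coordinate, and neighbouring coordinates hold similar preference profiles, which is what makes the map explorable by area.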
The Viscovery SOMine data mining system has a ‘profile analysis’ function, which allows you to select any area on the map and perform interactive (and efficient) multiple comparison tests. The findings from this allow you to develop a strategy for deciding which area of consumers to target. The real essence of data mining lies in this area, but it is unlikely to be easy for the general public to reach this level of understanding.
The proposed method uses an LLM to interpret the features of sufficiently subdivided clusters, making it easier to discover target regions.
Here we have fictitious data from 200 consumers rating 20 different crisps on a 5-point scale. These ratings are clustered by SOM. For the ordering of the SOM, however, each consumer’s scores are normalised so that they sum to 1. The original scores are attached as additional attributes that do not contribute to the ordering. An image of the resulting map is shown below:
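As a side note on the normalisation described in this paragraph, the scaling can be sketched in a few lines of Python. The function name and the toy ratings are hypothetical; only the rule itself (each consumer's scores sum to 1) comes from the text.

```python
def normalise_rows(ratings):
    """Scale each consumer's ratings so that they sum to 1."""
    normalised = []
    for row in ratings:
        total = sum(row)
        if total == 0:
            raise ValueError("a consumer with all-zero ratings cannot be normalised")
        normalised.append([score / total for score in row])
    return normalised


# Hypothetical ratings: 3 consumers x 4 crisps on a 5-point scale.
ratings = [
    [5, 3, 1, 1],
    [2, 2, 2, 2],
    [4, 4, 1, 1],
]
profiles = normalise_rows(ratings)
```

One consequence of this scaling is that two consumers whose ratings are proportional to each other, for example all 2s and all 4s, receive the same profile, so the map is ordered by the pattern of preference and not by how generously a person rates overall.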
The clustering quality indicator is highest for four clusters. That helps in understanding the overall significant clusters, but we should look at a more detailed clustering, because in product planning we should focus on local differences between products. Here we have chosen a cluster count of 20. The results were presented to the LLM, which interpreted them as follows:
C1: The Classic and Sweet Flavor Enthusiasts
- Top Snacks:
- Classic Sea Salt, Maple Bacon Dream, Honey Mustard Sweetness
- Interpretation:
This cluster favors classic salty flavors and sweet profiles, particularly bacon-maple combinations and honey-mustard notes. It reflects a preference for familiar yet indulgent tastes.
(Full text here)
end of quote
The ultimate aim is to plan and develop products, so this is where the consumer space is explored and targets defined. Details require our consulting service, but the method uses a colour scale to narrow down the nodes by specifying a range of characteristic values. The following nodes were selected for the experiment:
Integration into product maps
In traditional Preference Mapping, in order to merge consumer preference patterns with the initial Internal Preference Mapping, the first and second principal components of the Internal Preference Mapping are used as explanatory variables, the cluster mean of the preference scores for each product is used as the objective variable, and a model is created for each cluster by linear regression with an interaction term. Although the interaction term lets linear regression produce a curved surface model, the model is artificial because it is a simple quadratic equation, and it is unlikely to be accurate even if there are several hundred products to be compared. This is the main problem with traditional Preference Mapping.
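Written out, the regression described above takes roughly the following form for each consumer cluster. The notation is an assumption introduced here for illustration: j indexes products, k indexes clusters, and the betas are the fitted coefficients.

```latex
\hat{y}_{k,j} = \beta_{k,0} + \beta_{k,1}\,\mathrm{PC1}_j + \beta_{k,2}\,\mathrm{PC2}_j + \beta_{k,3}\,\mathrm{PC1}_j\,\mathrm{PC2}_j
```

Whatever the data look like, the fitted surface is restricted to this one family of shapes, with only four coefficients per cluster to adapt.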
We can model with free-form surfaces using SOM, so we can build more realistic and natural models.
The year before last, when I was invited to a sectional meeting of the Japanese Society for Sensory Evaluation, I suggested that, in order to link expert scores and consumer scores, the ‘response’ of consumers to each sensory characteristic could be quantified by calculating the product of the two matrices, since they share the same number of products. This time, however, a different approach is taken.
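For reference, the matrix-product idea mentioned above (the one not pursued in this article) can be sketched as follows. The data, the attribute names and the helper function are hypothetical; the only point taken from the text is that the two matrices can be multiplied because they share the product dimension.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


# Hypothetical consumer scores: 2 consumers x 3 products.
consumer_scores = [
    [5, 1, 3],
    [2, 4, 4],
]
# Hypothetical expert scores: 3 products x 2 sensory attributes
# (say, saltiness and sweetness).
expert_scores = [
    [4, 1],
    [1, 5],
    [3, 3],
]
# Result: 2 consumers x 2 attributes, one 'response' value per pair.
response = matmul(consumer_scores, expert_scores)
```

In this toy case the first consumer's response is larger for the first attribute and the second consumer's is larger for the second, reflecting which products each of them rated highly.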
Since we use the SOM, we can associate additional attributes with the map besides those that contribute to its ordering. Using this, we can transpose the consumer score matrix and associate it with the original product map, so that we can freely examine how the scores of whatever target we want to look at (an individual consumer, a cluster average, a target area, etc.) are distributed on the original map. We believe that this will ultimately allow for an analysis that is consistent with the objectives of Preference Mapping.
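The data preparation behind this step can be sketched as below: transpose the consumer score matrix so that each row is a product, then average over the consumers of interest to get one extra attribute per product. The product names, scores and selected consumer indices are hypothetical, and how the SOM software then displays such an attribute across the nodes of the map is not shown.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(column) for column in zip(*matrix)]


def target_scores(consumer_scores, product_names, target_consumers):
    """Mean score per product over the selected consumers."""
    by_product = transpose(consumer_scores)  # products x consumers
    return {
        name: sum(row[i] for i in target_consumers) / len(target_consumers)
        for name, row in zip(product_names, by_product)
    }


# Hypothetical consumer scores: 4 consumers x 3 products.
consumer_scores = [
    [5, 1, 3],
    [4, 2, 3],
    [1, 5, 2],
    [2, 4, 1],
]
product_names = ["Crisp A", "Crisp B", "Crisp C"]  # hypothetical names
target_consumers = [0, 1]  # hypothetical indices of consumers in the target area

# One additional attribute value per product, to attach to the product map.
extra_attribute = target_scores(consumer_scores, product_names, target_consumers)
```

The same function covers the other cases mentioned above: a single index gives one consumer's scores, and the indices of a cluster give the cluster average.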
Below is the map obtained by transposing the consumer score matrix and associating it with the original product map, showing the scores in the target areas defined in the previous step. This corresponds to the final map of the Preference Mapping method. In Preference Mapping, the model is an artificial plane or curved surface, whereas the SOM produces a far more varied surface.
The red nodes represent the characteristic space with the highest scores. The product concept is determined by selecting, from among them, those areas where the product could be differentiated from existing products as much as possible. In the experiment, the dark shaded nodes were selected, and the UM1, UM2 and UM3 values for these nodes were again presented to the LLM, which was then asked to propose a product concept. The answers are shown below:
Product Concept: “Spiced Honey Complex”
Flavor Characteristics
This flavor is a fusion of sweetness and complex spices.
- The first taste delivers the gentle sweetness and smooth richness of honey.
- Gradually, a blend of spices such as cumin, garlic, and black pepper unfolds in layers, creating a flavor experience that is far from monotonous.
- The finish offers a subtle, peppery kick, balancing the sweetness with a lingering spiced complexity that keeps you coming back for more.
Appearance
Golden chips with a light shimmer, sprinkled with flecks of black pepper and herbs, create a refined and elegant visual appeal.
Target Audience
- Consumers looking for a sophisticated and adult-oriented flavor
- Those who appreciate complex, layered taste experiences
- Snack lovers who enjoy a balance of sweetness and spiciness in a gourmet-style product
Tagline
“Sweet and spicy – an irresistible moment of indulgence.”
Concept Insights
This product strikes a perfect balance between:
- UM1 (Satisfaction/Richness): Delivering moderate richness and a satisfying depth of flavor
- UM2 (Balance/Layered Depth): Highlighting a multi-dimensional taste where flavors harmonize beautifully
- UM3 (Freshness/Lightness): A slight lean towards richness over lightness, focusing on bold and indulgent flavor complexity.
The result is a snack that feels sophisticated, multi-layered, and irresistibly balanced.
end of quote
Conclusion
Traditional Preference Mapping combines (1) quantification of sensory characteristics, (2) PCA, (3) cluster analysis and (4) linear regression in order to link the sensory evaluation of a product by experts with consumers’ preference for the product. Frankly speaking, however, it seems difficult to create a reliable model in that way, as information is lost at each stage of the process.
We have shown that (1) by using an LLM, it is possible to use text expressed in natural language in addition to regular numerical data and to incorporate all the subtleties of that information into the model, and (2) by using SOM, it is possible to explore the data space intuitively and build natural models using free-form surfaces.
Although we did not use conventional quantified sensory evaluation data in this study, SOM can integrate different types of data, so it is possible to integrate the method shown here and conventional methods on top of SOM. We welcome the opportunity to develop such examples in collaboration with motivated companies.