Self-Organizing Maps as a Concept Generation Tool
Mindware Innovation Maps uses Self-Organizing Maps (SOM) to map academic papers, news articles, and ideas in high-tech fields into semantic space. We use SOM because it is the best tool for representing “Concepts”.
Kohonen published the first SOM algorithm in 1982, and this is the version still taught at universities today. SOM technology has advanced considerably since then. We use Viscovery SOMine, a state-of-the-art SOM implementation with various improvements, including batch training, node initialization, data scaling, topological clustering methods, profile analysis, SOM local regression methods, and the introduction of energy functions.
Much can be said about the benefits of SOM from the perspectives of both statistics and machine learning, but for the sake of brevity, we focus here on the fact that SOM is a technical tool for representing “concepts”.
The SOM can learn and organize vast amounts of information by placing similar pieces of information closer together. The resulting collection of similar information forms a “concept.”
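The idea of "placing similar pieces of information closer together" can be illustrated with a minimal sketch of the classic online Kohonen update. This is not the algorithm used in Viscovery SOMine, which adds batch training and other refinements. The function names, grid size, learning-rate schedule, and toy data below are all assumptions chosen for illustration.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose prototype is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dists), dists.shape))

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM with the classic online (Kohonen-style) update."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.random((rows, cols, dim))  # random initial prototypes
    # Grid coordinates of every node, shape (rows, cols, 2)
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):  # visit records in random order
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                     # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # radius shrinks to 0.5
            bmu = best_matching_unit(weights, x)
            # Gaussian neighborhood measured on the 2D grid, not in data space
            grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the winning node and its grid neighbors toward the record
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

# Toy data (assumption): two well-separated groups of 3-dimensional records
rng = np.random.default_rng(1)
cluster_a = rng.normal(loc=0.0, scale=0.05, size=(20, 3))
cluster_b = rng.normal(loc=1.0, scale=0.05, size=(20, 3))
data = np.vstack([cluster_a, cluster_b])

weights = train_som(data)
bmu_a = best_matching_unit(weights, cluster_a[0])
bmu_b = best_matching_unit(weights, cluster_b[0])
print("record from group A maps to node", bmu_a)
print("record from group B maps to node", bmu_b)
```

Because neighboring nodes are updated together, records from the same group tend to land on the same or adjacent nodes, while records from the other group land elsewhere on the grid. Each such region of the map is a candidate "concept" in the sense used here.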
The “concept” here is the concept as defined in philosophy and logic. A concept consists of an “extension” and an “intension.” “Extension” refers to the range of things to which a symbol refers, and “intension” refers to the common properties shared by the members within that range.
In business, “ideas” and “concepts” are often confused. We first propose to clearly distinguish between them.
For example, a business idea refers to a specific idea about a business. This includes who the customer is, what value to provide, how to provide it, what the price is, and how to collect payment. Of course, anything else that comes to mind is an idea. A business concept, on the other hand, must be a coherent set of business ideas. A concept is a collection of ideas with common characteristics.
What should you pay attention to when evaluating ideas in business? Should innovative ideas that no one else has thought of be appreciated? Should they be evaluated based on expected economic impact? Of course, we do not take such viewpoints lightly.
However, no matter how good the individual ideas are, what happens when you implement several ideas that are inconsistent, contradictory, or even incompatible? You will fall into chaos and your business will not succeed. In business, it is essential to be able to shoot the second and third arrows in an orderly fashion. Making ideas consistent with one another in this way could be called “strategy.” So a “concept” is closer to a “strategy” than to an “idea.”
Thus, each idea must be positioned within a concept. Then we should make a decision on which concept to adopt.
Use of Large-Scale Language Models
SOM has come back into the limelight as a concept generator because of its compatibility with large language models (LLMs). Attempts to visualize text mining results using SOMs have been made for more than 20 years, but they were not very practical with traditional text mining techniques such as counting term occurrences.
A key factor behind the recent development of LLMs is the ability to express words, sentences, and documents as vectors. Vector operations have made it possible to create rich linguistic expressions and to detect similar meanings even when the wording is different.
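As a rough illustration of how vectors let us detect similar meanings behind different wording, the sketch below compares cosine similarities. The three 4-dimensional vectors are hand-made stand-ins, not output from any real embedding model, and the helper name `cosine_similarity` is an assumption.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy "embeddings"; real ones have hundreds or thousands of dimensions
embeddings = {
    "cheap flights": np.array([0.9, 0.1, 0.0, 0.2]),
    "low-cost airfare": np.array([0.8, 0.2, 0.1, 0.3]),
    "protein folding": np.array([0.0, 0.1, 0.9, 0.1]),
}

sim_related = cosine_similarity(
    embeddings["cheap flights"], embeddings["low-cost airfare"]
)
sim_unrelated = cosine_similarity(
    embeddings["cheap flights"], embeddings["protein folding"]
)
print("related phrases:  ", round(sim_related, 3))
print("unrelated phrases:", round(sim_unrelated, 3))
```

The two travel phrases share no words, yet their vectors point in nearly the same direction, so their similarity is much higher than that of the unrelated pair. Term counting alone would not capture this.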
SOM also operates on vectors (arrays). It can learn the embedding vectors of sentences and documents as data records. LLM embedding vectors have several hundred to several thousand dimensions. For example, some of OpenAI’s embedding models produce vectors with 1536 dimensions.
Methods that project thousands of dimensions directly onto just two coordinates necessarily discard much of the information, whereas an SOM maps the data while every node keeps a prototype vector with all of the original dimensions. SOM summarizes a hyper-multidimensional “semantic space” as a set of discretized nodes. Rather than discarding dimensions, the nodes in the SOM are arranged according to their topological order in the hyper-multidimensional space.
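The distinction can be made concrete with a small sketch. Here the codebook is filled with random numbers as a stand-in for a trained map, and the record is a random stand-in for a document embedding; the grid size and variable names are assumptions. The point is only the shapes involved: the map position is a 2D grid index, but the match is computed over all 1536 dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 1536      # dimensionality of the embedding vectors
ROWS, COLS = 10, 15   # assumed size of the 2D node grid

# Stand-in for a trained SOM: one full-dimensional prototype per node
codebook = rng.normal(size=(ROWS, COLS, EMBED_DIM))

# Stand-in for the embedding vector of one document
record = rng.normal(size=EMBED_DIM)

# Distance from the record to every node, using all 1536 dimensions
dists = np.linalg.norm(codebook - record, axis=-1)   # shape (ROWS, COLS)
row, col = np.unravel_index(np.argmin(dists), dists.shape)

# Distance to the winning prototype, often called the quantization error
quantization_error = dists[row, col]

print("grid position of the record:", (int(row), int(col)))
print("prototype dimensions at that node:", codebook[row, col].shape)
```

The record is displayed at a position on the 2D grid, while the node at that position still carries a prototype in the full embedding space that can be inspected or compared with other nodes.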
Demonstrations of Concept Analysis
Concept generation using SOM, as used in Innovation Maps, can be applied in a wide range of areas beyond Innovation Maps. Examples include analyzing customer feedback in a call center or analyzing patent information. In some parts of the world, it is beginning to be adopted for information management and knowledge management in administrative agencies. The Innovation Maps project serves as a demonstration that will lead to the development of such applications.