Mindware Research Institute provides the following solutions based on over 25 years of experience in the fields of statistics and data science.

Concept Generator

It is now common knowledge to use LLM to generate ideas or scraype huge amount of information from the net, but a new problem is how to evaluate them. Mindware Research Institute proposes grouping many ideas based on their similarities. This grouping is called a “concept”. This is because in philosophy and logic, a concept is defined as consisting of extension and intention (connotation). In layman’s terms, extension refers to the scope of a group, and intension refers to common characteristics within a group.

No matter how innovative an idea may be, developing a business based on inconsistent and disparate ideas is a bad move from a strategic perspective. In the past, this could only be determined using a linguistic sense, but a major change in recent years in cutting-edge data science is that it has become possible to calculate ideas and concepts by converting them into vector quantities. Grouping ideas in a semantic space using vector computations provides the following benefits:

  1. Strategic idea evaluation: By clearly defining the conceptual differences between ideas and groups through LLM, you can accurately judge which areas your company should work on from the dimensions of business philosophy and business strategy.
  2. Creation and fine-tuning of additional ideas: Based on the positional relationship of a large number of existing ideas mapped in semantic space, LLM can accurately predict the content of unknown ideas located in the middle or outside of them.

Unlabeled Data Learning

Creating labeled training data is a bottleneck in current AI and machine learning projects. It is difficult to achieve better results when using low-quality training data from outsourced labeling tasks. Therefore, Mindware Research Institute proposes the effective use of unsupervised learning to avoid this problem. Honestly, you can’t replace every project with this, but by carefully redefining “What do we want in the end?” the segmentation models you get from unsupervised learning can often be effective.

Synthetic Data Creation

Today, data utilization such as statistics and data science are indispensable in order to incorporate a scientific approach into business. However, the data that can be realistically used only represents the present or past, and there is very little data that represents the future. Ther is a dilemma that the more you try to utilize data in your business, the more you end up looking at the past instead of the future. Especially when new markets are being created through technological innovation, there is no data on future markets. In order for a company to formulate a strategy in such a situation, it is necessary to create a blueprint for the future that can be extrapolated as much as possible from the currently available information. Data synthesis techniques provide a solution to this problem.