December 19, 2024 In Data Science Semantic data mining that fundamentally changes information analysis 2 Kunihiro Tada, Mindware Research Institute In the previous article, we (1) created rich, food-report-style texts describing the tasting experiences of 20 fictitious crisps, (2) obtained their embedding vectors, (3) extracted…
December 16, 2024 In Data Science Semantic data mining that fundamentally changes information analysis 1 Kunihiro Tada, Mindware Research Institute We have already demonstrated in the Innovation Map project that embedding vectors can be generated from product description texts for IT and data science products,…
July 15, 2024 In Data Science SOM as a platform for ensembles of multi-machine learning models In a typical data science course, self-organizing maps (SOMs) are likely introduced as a method for data visualization, dimensionality reduction, clustering, and exploratory data analysis. Because it can be used…
June 8, 2024 In Data Science UMAP-SOM: A cutting-edge technique for enabling ultra-multidimensional data mining When performing data clustering (cluster analysis), the results change depending on which attributes (variables) are included in the analysis and how much weight is placed on each attribute, but traditional…
October 27, 2023 In Data Science Ensemble model – Bagging If you simply apply a machine learning library's algorithm to a real problem the way a textbook describes, you won't be able to achieve better than 70% to 80% performance. Making…
October 26, 2023 In Data Science Occam’s razor – principle of parsimony Occam's Razor is a principle that has its origins in medieval European scholasticism, and as a Japanese person, I feel a little embarrassed to discuss it. As always, check out…
October 24, 2023 In Data Science How to use K-means Cluster analysis, or data clustering, can be hierarchical or non-hierarchical, and K-means is a non-hierarchical method. The K-means procedure is very straightforward: Initialize K reference vectors with random values. The K…
October 17, 2023 In Data Science No Free Lunch Theorem and Feature Engineering Talking about the No Free Lunch Theorem is a bit risky, because it can be taken as a denial of machine learning algorithms. This is a theorem in the…
2023年10月10日 In Data Science Classification and Clustering Classification and clustering are often confused because they are similar. In machine learning, it is generally explained that classification is supervised learning, and clustering is unsupervised learning. In statistical terms,… Read More
October 7, 2023 In Data Science Concentration on the sphere – The curse of dimensionality Before people set out to do something, it is very important that they know its limits ahead of time. Machine learning methods may not work well with…