October 27, 2023, in Data Science
Ensemble model — Bagging
If you simply apply a machine learning library's algorithm by the textbook to a real problem, you are unlikely to get beyond 70% to 80% performance. Making…
October 26, 2023, in Data Science
Occam’s razor – principle of parsimony
Occam's Razor is a principle that has its origins in medieval European scholasticism, and as a Japanese person, I feel a little embarrassed to discuss it. As always, check out…
October 24, 2023, in Data Science
How to use K-means
Cluster analysis, or data clustering, can be hierarchical or non-hierarchical, and K-means is a non-hierarchical method. The K-means procedure is very straightforward: Initialize K reference vectors with random values. The K…
October 17, 2023, in Data Science
No Free Lunch Theorem and Feature Engineering
Talking about the No Free Lunch Theorem is a bit risky, because it can be taken as a denial of machine learning algorithms. This is a theorem in the…
October 10, 2023, in Data Science
Classification and Clustering
Classification and clustering are often confused because they are similar. In machine learning, classification is generally explained as supervised learning and clustering as unsupervised learning. In statistical terms,…
October 7, 2023, in Data Science
Concentration on the sphere – The curse of dimensionality
Before setting out to do something, it is very important to know its limits ahead of time. Machine learning methods may not work well with…
October 6, 2023, in Data Science
Classification doesn’t exist – Ugly Duckling theorem
Descartes conducted thought experiments that thoroughly questioned things in order to establish the scientific method. Does what I am seeing really exist? Perhaps it does not exist when I am…
October 4, 2023, in Data Science
How to use Clustering Quality Measures
In recent years, clustering quality measures have been calculated in cluster analysis, and many academic societies now require that papers include clustering quality measures as a criterion for adopting a…
September 29, 2023, in Data Science
How to identify SOMs that should not be used
SOM is a well-known technique that can also be found in R and Python libraries. However, to be frank, most of those SOMs are poorly made and are the cause…
September 11, 2023, in Quantitative business strategy management
Competitive analysis map of major data science tools
Using ChatGPT, I investigated the interoperability of key data science tools and created a map with SOM. I believe that one of the requirements for the success of a data science…