Occam’s Razor is a principle that has its origins in medieval European scholasticism, and as a Japanese person, I feel a little embarrassed to discuss it. As always, check out Wikipedia for more information. Reading the Wikipedia article, you will see that this principle was later taken up by Isaac Newton, Ernst Mach, and others, and has contributed to the development of science. It still seems useful in data science today.
My own understanding of Occam’s Razor is that when multiple theories or models can explain a similar number of cases, the simpler one is better. The previously mentioned “Concentration on the sphere” and “Ugly Duckling theorem” have shown us that it is pointless to increase the number of attributes unnecessarily, but I believe that this principle goes further and actively recommends simpler models. Of course, a simple model is not automatically a good one; you should weigh simplicity against performance.
“Everything should be made as simple as possible, but not simpler.” — Albert Einstein
It is clear that the idea that using every available attribute will produce a better model is completely off the mark. To get a little more practical, if adding new features doesn’t improve the quality of your model, you might want to consider that the approach itself, not the lack of features, is what is wrong.
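One way to make the balance between simplicity and performance concrete is sketched below. This is only an illustration under stated assumptions, not an established procedure: the names `Candidate` and `pick_simplest` are hypothetical, the number of features is used as a crude stand-in for model complexity, and the validation scores are made-up numbers. The idea is to treat models whose scores are within a small tolerance of the best as “explaining a similar number of cases”, and then to prefer the simplest among them.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record describing one trained model."""
    name: str
    n_features: int    # crude proxy for complexity (an assumption)
    val_score: float   # validation score, higher is better


def pick_simplest(candidates, tolerance=0.01):
    """Return the least complex candidate whose validation score is
    within `tolerance` of the best score among all candidates."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best = max(c.val_score for c in candidates)
    # Models close enough to the best are treated as comparable.
    good_enough = [c for c in candidates if c.val_score >= best - tolerance]
    # Among comparable models, prefer fewer features;
    # break ties in favor of the higher score.
    return min(good_enough, key=lambda c: (c.n_features, -c.val_score))


# Made-up scores, purely for illustration.
candidates = [
    Candidate("3 features", 3, 0.842),
    Candidate("10 features", 10, 0.861),
    Candidate("50 features", 50, 0.864),
    Candidate("200 features", 200, 0.859),
]

chosen = pick_simplest(candidates, tolerance=0.01)
print(chosen.name)  # prints "10 features"
```

With these numbers the 50-feature model has the best score, but the 10-feature model is within the tolerance and is chosen instead. The tolerance is the place where the judgment lives: set it to zero and the rule simply picks the best score, set it too large and it picks a model that is simpler than it should be.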
Karl Popper’s philosophy of science holds that falsifiability is what makes a theory scientific. If a theory rests on too many assumptions, it may become impossible to disprove: whenever a counterexample appears, the failure can be blamed on one assumption or another. In other words, it becomes like a religious debate with a ready-made excuse for every objection. We can consider that Occam’s Razor is revered in the scientific world because keeping assumptions few makes it possible to clearly distinguish the part of a theory that is right from the part that is wrong.
This brings us to a difficult problem that confronts us today. Until recently, the fields of machine learning and artificial neural networks used only relatively simple models, in order to accumulate careful demonstrations: if you jump straight to a large model and it fails, you won’t know what is wrong. In recent years, however, large-scale models have been built and have worked surprisingly well. The question is how to think about this. I see it as a shift from science to technology. For any technology, the scale in the laboratory and the scale in practice differ by many orders of magnitude.
Historically, philosophy has been the basis of science, and science is the basis of modern technology. However, the world of technology has principles that differ from those of philosophy and science. Technology must be of practical use in the real world. Furthermore, a technology is not always fully understood scientifically before it is put into practical use; in some cases, the two processes happen at the same time. You’ll likely need to make wise decisions about what can be kept simple and what can be made more complex.
The practical world needs engineers, not scientists.