As a scholar, an industry practitioner with over 15 years of experience and, of late, a trainer, I have seen many trends emerge in Data Science since 2003. Unfortunately, throughout this time the field, whether it was called analytics, data mining or now data science, has been driven more by tools than by approach or conceptual perspective.
Earlier, whenever the topic was Data Analytics, people would talk about software such as SPSS, SAS and MATLAB. Now the same trend seems to have caught up with Data Science: people are busy talking only about R, Python and other tools. So, is that the right way to learn or teach Data Science/Analytics? To help both learners and trainers, here are some important points to consider, from my perspective.
Basically, Data Analysis/Data Science is not just code; it is the science of deriving insights from data. In the process we need to try many experiments and approaches, and we need a tool to execute them, so the tool is only a means of execution. Data Science is rich in content and approaches, so focus on concepts. As one of my mentors and ex-bosses always reiterates, conceptual knowledge and contextual familiarity are important. Focus more on context. In Data Science, conceptual knowledge means understanding mathematics, statistics and computing at large.
Another thing I have observed is that data science is broadly “data + science”. Science means algorithms, but people now tend to almost forget the first part, the data, and focus merely on algorithms, which does not work. Without proper knowledge of the data, you will never arrive at a good model to analyze it.
For the learners:
Firstly, Data Science is not just code, so do not worry too much about tools, for example R vs Python.
Learn the concepts clearly, rather than just executing code, as conceptual knowledge is important.
Focus on the basics of mathematics, statistics and computing and understand them, because in the real world most algorithms will fail, and this knowledge will then help you tune or refine them. Here, conceptual knowledge is more important.
Please work on at least a few real projects, even if that just means running projects on standard data or a standardized library.
Please always try to explain Data Science in the simplest manner.
For the trainers:
Do not just deliver content. Nowadays content is available everywhere, but context is limited. Please try to provide context along with the content, from the perspective of the user/learner.
Do not make Data Science complicated by using difficult-to-comprehend language. Always deliver in simple language and make the learners comfortable with the concepts.
Please do not try to give excessive content via websites, blogs, code on GitHub, etc. As a trainer, find some good resources and follow them till the end.
Always understand the learners’ profiles and teach them in their language.