tau-data Indonesia, in collaboration with Traveloka Indonesia and several leading institutions in Indonesia, has initiated a curriculum system that is open, dynamic, and adaptive to the education system and to the evolving needs of industry, academia, and government for graduates, solutions, and technologies related to data processing. This open curriculum is used in the online learning system offered free of charge at tau-data Indonesia. Furthermore, the curriculum is open to continuous revision and modification in order to keep pace with highly dynamic developments in science and technology. Users who wish to use the conventional competency-based curriculum system can access it at the following link.
- Open Curriculum
- Core Modules
- Elective Modules
- Supplementary Modules & Advanced Modules
Data science is paramount in the data era (industry 4.0), and its fundamentals should be mastered by all data-related professionals, such as data analysts, data scientists, and data engineers. However, due to the complexity and dynamics of the field and the vast range of technologies and tools involved, it is very challenging for data enthusiasts to keep up. One remedy for this situation is constant learning and improvement for all related parties in an organization.
There are numerous formal and informal educational institutions dedicated to closing the knowledge gaps in data science. Informal data science education is currently a growing business because formal education has been slow to adapt to industry needs. Nevertheless, two major flaws of most informal education are the lack of a roadmap/curriculum and the lack of the fundamental science behind the skills being taught. The practical skills produced by these systems are superficial and limited to a few basic cases, so participants have trouble solving ever-evolving data challenges.
The tau-data roadmap (curriculum) is one of our answers to the issues mentioned above. We understand that the conventional curriculum system cannot keep up with the sophisticated and rapidly changing needs of industry. The tau-data roadmap differs from a conventional curriculum in the following ways. First, it is highly customizable for different data professions and for different focuses and levels of interest. Moreover, it is open-source and adopts the AI/Big Data philosophy of a crowd-based knowledge system in order to evolve continuously over time. Thus, the curriculum presented on this page is a roadmap in progress, and its improvement will continue indefinitely. Finally, as an informal “curriculum”, tau-data’s roadmap is also very flexible in its content. For example, the modules in the roadmap need not all span the same unit of time, as they would in a formal curriculum.
To optimally utilize the roadmap, we complement it with an assessment tool. The test measures some fundamental skills or competencies that a data professional should have. A participant does not need to score high in all of the components measured. Most successful data professionals are those with substantial, deep expertise in at least one aspect of data science. In the future we will use the output of this test as the basis for the recommended (next) modules to be taken and the depth of topics that need to be discussed (education 4.0 system ~ AI-based curriculum). The results of this assessment can also be used to form a better data science team with varying in-depth skills in industry or government institutions.
We appreciate the input of the experts in the data science field who enriched the roadmap (curriculum) with their vast knowledge and experience:
- Mr. Doan Siscus and Mr. Juan Intan Kanggarawan (Traveloka Indonesia)
- Data Science and Analytics Leaders, Traveloka Indonesia
- Dr. Sarini, M.Stats (Universitas Indonesia)
- Setia Pramana, M.Sc, PhD (Badan Pusat Statistik Republik Indonesia – STIS)
- Dr. Andry Alamsyah, S.Si, M.Sc (Telkom University)
- Dr.Eng. Anto Satriyo Nugroho, M.Eng (BPPT – INAPR)
- Dr.rer.Pol. Dedy Dwi Prastyo, Msc (Institut Teknologi Sepuluh Nopember – ITS)
- Dr. Bagus Sartono, M.Si (Institut Pertanian Bogor – IPB University)
- Dr. Tri Handhika, S.Si, M.Si (Universitas Gunadarma)
- Dr. Syopiansyah Jaya Putra, M.Sis (NICT – UIN Jakarta)
- Dr. Imam Marzuki Shofi, M.T. (ZiShof – UIN Jakarta)
Finally, we would like to express our profound gratitude and deep appreciation to Traveloka Indonesia for the guidance and support that made this project possible, as well as for its support of the continuous improvement of the roadmap system.
We prefer to call the curriculum that we designed a roadmap. Just as a train route map can take a passenger from a starting location to many different destinations, the proposed roadmap is intended to serve the same function. A learner can start from his/her current capability and reach his/her target capability by choosing the corresponding path. Once the destination is reached, another path can be taken without having to revisit the nodes (modules/topics) that have already been completed. We designed a generic data science roadmap; a user can choose a path to become a data analyst, data scientist, or AI/data engineer using the roadmap (as shown in the picture below).
As an example, those who would like to become a Data Analyst can follow a partial graph (subgraph) of the curriculum above, as follows (this path is suitable for bachelor's degree students in data science or related fields):
In our generic roadmap, there are six types of modules (nodes/units), namely core modules, elective/recommended modules for DA and DS, recommended modules for DE, recommended modules for AI engineers, and supplementary modules. In the recommended roadmap for Data Analysts, the AI engineer modules and the DE recommended modules are not present. The modules are connected by directed lines, which indicate an order (prerequisite), and by dashed lines, which indicate that the connected modules can be learned concurrently. Further details on the module types are given in the sections that follow.
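The graph structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not tau-data's implementation: the module codes are taken from the roadmap, but the prerequisite edges, the concurrent pair, and the function names (`required_modules`, `learning_path`, `can_study_together`) are hypothetical. It also assumes that a learner who has completed a module has completed that module's prerequisites.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical directed edges: module -> modules that must be taken first.
# The codes are from the roadmap; the edges are illustrative only.
PREREQS = {
    "DSBD": set(),
    "ADSP": {"DSBD"},
    "MFDS": {"DSBD"},
    "SFDS": {"MFDS"},
    "EDA": {"ADSP", "SFDS"},
    "GLM": {"SFDS"},
    "TSA": {"GLM", "EDA"},
}

# Hypothetical dashed links: pairs of modules that may be learned concurrently.
CONCURRENT = {frozenset({"ADSP", "MFDS"})}


def required_modules(target, prereqs):
    """Collect the target and all of its direct and indirect prerequisites."""
    needed, stack = set(), [target]
    while stack:
        module = stack.pop()
        if module not in needed:
            needed.add(module)
            stack.extend(prereqs.get(module, ()))
    return needed


def learning_path(target, prereqs, completed=()):
    """List the modules still to be taken, ordered so prerequisites come first."""
    needed = required_modules(target, prereqs)
    subgraph = {m: prereqs.get(m, set()) & needed for m in needed}
    order = TopologicalSorter(subgraph).static_order()
    done = set(completed)
    return [m for m in order if m not in done]


def can_study_together(a, b):
    """True if the two modules are joined by a dashed (concurrent) link."""
    return frozenset({a, b}) in CONCURRENT


if __name__ == "__main__":
    print(learning_path("TSA", PREREQS))
    print(learning_path("TSA", PREREQS, completed={"DSBD", "MFDS", "SFDS"}))
```

Several valid orders may exist for the same target, so the exact sequence printed can vary between runs; what is guaranteed is that every module appears after its prerequisites and that completed modules are skipped.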
Core Modules (CM)
CMs are recommended sets of topics that should be taken by all data-related professions: DA, DS, DE, and even AI engineers. Nevertheless, not every topic within a module needs to be taken by every data-related profession. For instance, in the Statistical Foundation for Data Science module, a DE would only need to take several basic topics, while a DS is recommended to pass all of the topics available in that module. Conversely, in the database fundamentals core module, a DA/DS would only need to take some topics, whereas a DE needs to master all of the topics in that module.
We would also like to emphasize that even within these core modules, there are advanced topics listed in the syllabi. These topics are optional and are recommended only once a data professional has completed all of the CMs and some of the recommended modules. More details are given in the information for each advanced topic.
Users learn the core skills as they master the core modules. These core skills make the “why” questions in the EMs much easier to understand. In other words, the fundamental understanding gained in the CMs is a key factor not only in understanding the strengths and weaknesses of advanced models but also in extending those models into novel approaches for solving more complex data challenges in the future. In short, CMs are the basic must-have skills for data professionals.
Topics in the core modules include discussions of the ever-evolving challenges in DA/DS roles in the introduction to data science and big data module, best practices in programming for data science in the foundations of algorithms and data structures module, and how linear models become the foundation of most advanced data science models in the (general) linear models module. The modules in the CM category are as follows:
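The idea that each profession takes a different subset of a module's topics, with advanced topics left optional, can be illustrated with a small lookup. This is only a sketch: the topic names, the role assignments, and the `topics_for` helper are hypothetical placeholders and do not reflect the actual tau-data syllabi.

```python
# Hypothetical syllabus data: topic names and role assignments are placeholders.
SYLLABUS = {
    "SFDS": [
        {"topic": "SFDS basic topic 1", "roles": {"DA", "DS", "DE"}, "advanced": False},
        {"topic": "SFDS basic topic 2", "roles": {"DA", "DS", "DE"}, "advanced": False},
        {"topic": "SFDS further topic", "roles": {"DA", "DS"}, "advanced": False},
        {"topic": "SFDS advanced topic", "roles": {"DS"}, "advanced": True},
    ],
    "BDDS": [
        {"topic": "BDDS basic topic", "roles": {"DA", "DS", "DE"}, "advanced": False},
        {"topic": "BDDS further topic", "roles": {"DE"}, "advanced": False},
    ],
}


def topics_for(module, role, include_advanced=False):
    """Return the topics of a module recommended for a role.

    Advanced topics are optional and are returned only on request.
    """
    return [
        entry["topic"]
        for entry in SYLLABUS[module]
        if role in entry["roles"] and (include_advanced or not entry["advanced"])
    ]


if __name__ == "__main__":
    print(topics_for("SFDS", "DE"))  # a DE takes only the basic statistics topics
    print(topics_for("SFDS", "DS", include_advanced=True))  # a DS can take them all
    print(topics_for("BDDS", "DE"))  # a DE takes every database topic
```

Keeping the role information on each topic, rather than on the module as a whole, lets one shared set of core modules serve several professions.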
- Introduction to Data Science and Big Data (DSBD)
- Foundations of Algorithms, Data Structures, and Programming (ADSP)
- Mathematical Foundation for Data Science (MFDS)
- Statistical Foundation for Data Science (SFDS)
- Introduction to Databases for Data Science (BDDS)
- Exploratory Data Analysis (EDA)
- Introduction to Soft Skills for Data Professionals (SSDP)
- Generalized Linear Model (GLM)
Elective Modules (EM)
EMs are modules that enhance a data professional's skills. In an industrial context, however, a data professional need not be an expert in every module available in EM; the modules are therefore elective, chosen according to the user’s daily tasks and responsibilities. In this section we elaborate on the modules that are recommended for data analysts. In total there are three modules, namely Sampling Techniques and Experimental Design, Time Series Analysis, and Soft Skills for Data Professionals. These modules are also recommended for Data Scientists, since we assume that to become a proficient DS, one first needs to become a sound DA.
- Sampling Techniques and Experimental Design (STED)
- Time Series Analysis (TSA)
- Supervised Learning – Classification Models (SLCM)
- Unsupervised Learning – Interdependence Methods (ULIM)
- Spatial Data Analysis (SDA)
- Applied Data Mining (ADM)
- Natural Language Processing and Text Mining (NLPTM)
- … (to be continued)
Supplementary & Advanced Modules (SM)
SMs are modules that, in our experience, further enrich data professionals and help them distinguish themselves even among data experts. Competencies gained in SM, when combined with the competencies in CM and the recommended modules (RM), produce individuals who are able not only to apply their skills to the fullest but also to innovate (creativity level). Supplementary modules can be modules that are not directly linked to the core competencies of data analysts or data scientists but that enhance their overall value in an industry. They can also be modules on advanced data science topics that might be useful for special cases in an industry or organization. In the recommended roadmap for Data Analysts, based on our initial field study (division meetings and surveys in several companies), we recommend one supplementary module, namely Bayesian Statistics.
- Bayesian Statistics (BYS)
- … (to be continued)
In an era where data-driven policies have become a key factor in winning business competition, the continuous growth of data professionals' skills is no longer optional. However, this growth needs to be grounded in fundamental theory in order to face ever-changing data challenges, and it needs to be pursued systematically. These needs call for a curriculum as guidance. A conventional curriculum system of the kind normally used in formal academic institutions may not fit the needs of the private sector (industry), either because it is too focused on analytics (theory) or because it contains a significant number of topics that are not directly related to daily tasks in industry.
We presented a highly customizable (adaptive) and dynamic curriculum for data professionals using a graph (network) system. The general curriculum (current version 1.0) was trimmed to focus on data analysts’ competencies. The content of this curriculum is the result of comprehensive discussions between tau-data, Traveloka’s council members formed for the Traveloka Data University program, and experts from well-known institutions (as stated in our Acknowledgements). The resulting curriculum is enriched with case studies contributed by tau-data’s partners. Furthermore, a blended system of online and offline learning can be used to make the learning system more flexible and better suited to the fast pace of work in today's industry.