Understanding the needs of a business is the starting point for any project.
By John Williams
In the data science process, it is crucial to understand the problem that we want to solve.
At this stage, interaction with the customer is critical. It is an opportunity to clarify the goals and objectives of the project, and to agree on the metrics and data sources needed to answer the questions that define those objectives.
Definition of Objective
During this step, we ask questions to identify the key variables and the main goal of the project. Each type of question points to a family of machine learning techniques:
Regression (How much or how many?)
Classification (Which category?)
Clustering (Which group?)
Anomaly detection (Is this weird?)
Recommendation (Which option?)
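The pairing of question types and techniques listed above can be written down as a simple lookup, which is sometimes handy when sorting through stakeholder questions early in a project. This is only a minimal sketch: the names `QUESTION_TO_TECHNIQUE` and `suggest_technique` are hypothetical, and real business questions rarely arrive in such a clean form.

```python
# Hypothetical lookup table built from the list above:
# question type -> family of machine learning techniques.
QUESTION_TO_TECHNIQUE = {
    "how much or how many?": "regression",
    "which category?": "classification",
    "which group?": "clustering",
    "is this weird?": "anomaly detection",
    "which option?": "recommendation",
}


def suggest_technique(question: str) -> str:
    """Return the technique family for a question type, or 'unknown'."""
    # Normalize whitespace and case so "Which Category?" still matches.
    key = question.strip().lower()
    return QUESTION_TO_TECHNIQUE.get(key, "unknown")


print(suggest_technique("Which category?"))
```

A question that does not match any entry returns `"unknown"`, which is a signal that the objective needs more discussion with the customer before a technique is chosen.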
It is also essential to define who will be part of the team and what the success metric will be.
Identification of Data Sources
Identify data sources that contain relevant variables to answer the questions.
If the data does not exist locally, it may be appropriate to identify external data sources or to update the existing systems to collect the needed data.
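One way to make this check concrete is to compare the variables the project needs against the columns each known source provides. The sketch below assumes a hypothetical inventory of sources (`crm`, `billing`) and made-up variable names; in practice this inventory would come from the conversation with the customer.

```python
def find_missing_variables(required, sources):
    """Return the required variables that no listed source provides, sorted."""
    available = set()
    for columns in sources.values():
        available.update(columns)
    return sorted(set(required) - available)


# Hypothetical example: variables needed to answer the business questions.
required = ["customer_id", "purchase_date", "amount", "region"]

# Hypothetical inventory of local data sources and the columns they hold.
sources = {
    "crm": ["customer_id", "region"],
    "billing": ["customer_id", "amount"],
}

# Anything listed here needs an external source or a change to existing systems.
missing = find_missing_variables(required, sources)
print(missing)
```

Variables that appear in the result are candidates for an external data source or for new collection in the existing systems.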
The outcome of this stage is a list of business requirements that will guide the project.