Understanding the needs of a business is the starting point for any project.
By John Williams
In the data science process, it is crucial to understand the problem that we want to solve.
At this stage, interaction with the customer is critical. It is an opportunity to clarify the project's goals and objectives, and to agree on the metrics and data sources that will be used to answer the questions defining those objectives.
Definition of Objective
During this step, we ask questions to identify the key variables and the project's primary goal. The type of question being asked often points to a family of machine learning techniques:
Regression (How much or how many?)
Classification (Which category?)
Clustering (Which group?)
Anomaly detection (Is this weird?)
Recommendation (Which option?)
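As a rough illustration, the pairing of question types and technique families listed above can be written down as a small lookup. This is only a sketch: the names `Technique`, `QUESTION_TO_TECHNIQUE`, and `suggest_technique` are hypothetical, and the simple substring match is an assumption made for brevity, not a recommended way to scope a real project.

```python
from enum import Enum


class Technique(Enum):
    REGRESSION = "regression"
    CLASSIFICATION = "classification"
    CLUSTERING = "clustering"
    ANOMALY_DETECTION = "anomaly detection"
    RECOMMENDATION = "recommendation"


# Question types from the list above, stored in lowercase for matching.
QUESTION_TO_TECHNIQUE = {
    "how much": Technique.REGRESSION,
    "how many": Technique.REGRESSION,
    "which category": Technique.CLASSIFICATION,
    "which group": Technique.CLUSTERING,
    "is this weird": Technique.ANOMALY_DETECTION,
    "which option": Technique.RECOMMENDATION,
}


def suggest_technique(question):
    """Return the technique whose question type appears in the text, or None."""
    normalized = question.lower()
    for pattern, technique in QUESTION_TO_TECHNIQUE.items():
        if pattern in normalized:
            return technique
    return None


# Hypothetical business question, invented for illustration.
print(suggest_technique("How many units will we sell next month?"))
```

In practice the mapping comes out of the conversation with the customer; the lookup only makes explicit that the wording of the question narrows down the kind of model to consider.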
It is also essential to define who will be part of the team and which metric will be used to measure success.
Identification of Data Sources
Identify data sources that contain relevant variables to answer the questions.
If the required data does not exist, it may be appropriate to identify external data sources or to update existing systems so that they collect it.
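One way to make this check concrete is to compare the variables the questions require against the columns each candidate source exposes. The sketch below assumes each source can be described as a set of column names. The function `find_variable_coverage` and all variable and source names are hypothetical, invented for illustration.

```python
def find_variable_coverage(required_variables, data_sources):
    """Map each required variable to the sources that contain it.

    Returns (coverage, missing): coverage maps variable -> sorted source names,
    missing is a sorted list of variables that no source contains.
    """
    coverage = {}
    for variable in required_variables:
        coverage[variable] = sorted(
            name for name, columns in data_sources.items() if variable in columns
        )
    missing = sorted(v for v, found_in in coverage.items() if not found_in)
    return coverage, missing


# Hypothetical inventory: these names are assumptions, not real systems.
required = ["customer_id", "purchase_amount", "churn_date"]
sources = {
    "crm": {"customer_id", "signup_date"},
    "billing": {"customer_id", "purchase_amount"},
}

coverage, missing = find_variable_coverage(required, sources)
print(missing)  # ['churn_date']
```

Any variable that ends up in `missing` is a candidate for an external data source or for a change to an existing system so that it starts collecting the data.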
The outcome of this stage is a list of business requirements that will guide the project.