In machine learning (ML), high-quality data is data that can be used to train high-performing models. Poor-quality training data tends to produce models that perform poorly, are biased, or fail to generalize.
Important properties of high-quality data include accuracy, consistency, and a low level of noise. With respect to the task the model is being trained for, the data should also be relevant, complete, timely, representative, and unbiased. High-quality data is essential for building robust, reliable models that generate accurate predictions or perform the desired tasks effectively.
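Some of these properties can be approximated with simple checks before training. The following is a minimal sketch, not a standard tool: the function name `quality_report`, the field names, and the sample records are all hypothetical, and it assumes the dataset is a list of dictionaries with hashable values. It uses the missing-value rate as a rough proxy for completeness, the exact-duplicate rate as a proxy for consistency, and the label distribution as a proxy for representativeness.

```python
from collections import Counter


def quality_report(rows, label_key):
    """Compute a few rough data-quality indicators for a list of dict records.

    Hypothetical helper for illustration; assumes all field values are hashable.
    """
    n = len(rows)
    if n == 0:
        raise ValueError("rows must not be empty")

    # Union of all field names that appear in any record.
    fields = sorted({k for r in rows for k in r})

    # Completeness proxy: fraction of records where each field is absent or None.
    missing = {f: sum(1 for r in rows if r.get(f) is None) / n for f in fields}

    # Consistency proxy: fraction of records that exactly repeat an earlier record.
    seen = set()
    dupes = 0
    for r in rows:
        key = tuple((f, r.get(f)) for f in fields)
        if key in seen:
            dupes += 1
        else:
            seen.add(key)

    # Representativeness proxy: share of each label among labeled records.
    labels = Counter(r[label_key] for r in rows if r.get(label_key) is not None)
    total = sum(labels.values())
    label_share = {lab: c / total for lab, c in labels.items()} if total else {}

    return {
        "missing_rate": missing,
        "duplicate_rate": dupes / n,
        "label_share": label_share,
    }


# Made-up records, for illustration only.
rows = [
    {"age": 34, "income": 52000, "label": "yes"},
    {"age": None, "income": 48000, "label": "no"},
    {"age": 34, "income": 52000, "label": "yes"},
    {"age": 51, "income": 61000, "label": "yes"},
]
report = quality_report(rows, "label")
print(report)
```

On the sample records, the report flags one missing `age` value, one duplicated record, and a label split that favors `"yes"`. Checks like these do not measure accuracy, relevance, or timeliness, which usually require domain knowledge or comparison against a trusted reference.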