ValueError: Input contains NaN, infinity or a value too large for dtype('float64')

╰半夏微凉° 2022-11-07 04:23

Problem

I have just started learning sklearn, and the following code raises an error when I run it:

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.preprocessing import MinMaxScaler, StandardScaler, Normalizer
    from sklearn.impute import SimpleImputer
    import numpy as np
    import jieba


    def im():
        """Handle missing values."""
        # Bug: 'NaN' is a string here, not the float np.nan
        im = SimpleImputer(missing_values='NaN', strategy='mean')
        data = im.fit_transform([[1, 2], [np.nan, 3], [7, 6]])
        print(data)


    if __name__ == "__main__":
        im()

Running it fails with ValueError: Input contains NaN, infinity or a value too large for dtype('float64'). The full traceback:

    Traceback (most recent call last):
      File "E:/pycharm_workspace/matplotlibDemo/feature.py", line 104, in <module>
        im()
      File "E:/pycharm_workspace/matplotlibDemo/feature.py", line 95, in im
        data = im.fit_transform([[1,2],[np.nan,3],[7,6]])
      File "D:\skl3\lib\site-packages\sklearn\base.py", line 699, in fit_transform
        return self.fit(X, **fit_params).transform(X)
      File "D:\skl3\lib\site-packages\sklearn\impute\_base.py", line 288, in fit
        X = self._validate_input(X, in_fit=True)
      File "D:\skl3\lib\site-packages\sklearn\impute\_base.py", line 262, in _validate_input
        raise ve
      File "D:\skl3\lib\site-packages\sklearn\impute\_base.py", line 255, in _validate_input
        copy=self.copy)
      File "D:\skl3\lib\site-packages\sklearn\base.py", line 421, in _validate_data
        X = check_array(X, **check_params)
      File "D:\skl3\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
        return f(*args, **kwargs)
      File "D:\skl3\lib\site-packages\sklearn\utils\validation.py", line 664, in check_array
        allow_nan=force_all_finite == 'allow-nan')
      File "D:\skl3\lib\site-packages\sklearn\utils\validation.py", line 106, in _assert_all_finite
        msg_dtype if msg_dtype is not None else X.dtype)
    ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

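"Input contains NaN, infinity or a value too large for dtype('float64')" means that sklearn's input validation found a NaN, an infinite value, or a value outside the float64 range in the data. Here the culprit is the np.nan in the input combined with missing_values='NaN': that argument is the string 'NaN', not the float np.nan, so SimpleImputer does not treat np.nan as the missing-value marker, and its input check rejects the NaN instead of imputing it.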
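Before handing data to an estimator, it can help to check where the offending values are. The following is a minimal sketch using plain NumPy on the same sample data as above; the variable names (`X`, `nan_mask`, `nan_positions`) are illustrative and not part of the original code.

```python
import numpy as np

# Same sample data as in the failing example
X = np.array([[1, 2], [np.nan, 3], [7, 6]], dtype=float)

# Boolean mask that is True wherever a cell is NaN
nan_mask = np.isnan(X)

has_nan = bool(nan_mask.any())      # does the array contain any NaN?
has_inf = bool(np.isinf(X).any())   # does the array contain +/- infinity?

# (row, column) indices of every NaN cell
nan_positions = np.argwhere(nan_mask).tolist()

print(has_nan, has_inf, nan_positions)
```

If `has_nan` is true, the NaN cells either need to be imputed (with an imputer whose `missing_values` is `np.nan`) or removed before fitting.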

Solution

Pass np.nan (the float) as missing_values instead of the string 'NaN':

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.preprocessing import MinMaxScaler, StandardScaler, Normalizer
    from sklearn.impute import SimpleImputer
    import numpy as np
    import jieba


    def im():
        """Handle missing values."""
        # np.nan marks the cells to impute; each is filled with its column's most frequent value
        im = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
        data = im.fit_transform([[1, 2], [np.nan, 3], [7, 6]])
        print(data)


    if __name__ == "__main__":
        im()

The code now runs without error and prints:

    [[1. 2.]
     [1. 3.]
     [7. 6.]]
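The fix above also switches the strategy from 'mean' to 'most_frequent', but the change that removes the error is most likely missing_values=np.nan on its own. As a minimal sketch, assuming the original intent was mean imputation, the same data can be filled with the column mean (the names `imputer` and `filled` are illustrative):

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = [[1, 2], [np.nan, 3], [7, 6]]

# missing_values is the float np.nan, so the NaN cell is imputed rather than rejected
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imputer.fit_transform(X)

# The missing cell in column 0 becomes the mean of the observed values 1 and 7
print(filled)
```

With 'mean', the missing cell is filled with 4.0, whereas 'most_frequent' fills it with 1.0, as in the output above.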
