finess/autorisations: fixes following the 2025 annual update
As happens every year, the annual FINESS files have been published, and they break the pipeline.
This time, the implementation date column (date_mise_en_oeuvre) contains null values for the autorisations:
tests/finess/test_processing.py:301: in test_validate_autorisations_as_data
schema.validate(aut_df[aut_df["date_export"] >= "2015"])
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/api/pandas/container.py:375: in validate
return self._validate(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/api/pandas/container.py:404: in _validate
return self.get_backend(check_obj).validate(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/container.py:97: in validate
error_handler = self.run_checks_and_handle_errors(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/container.py:172: in run_checks_and_handle_errors
error_handler.collect_error(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/error_handlers.py:38: in collect_error
raise schema_error from original_exc
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/container.py:192: in run_schema_component_checks
result = schema_component.validate(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/api/pandas/components.py:169: in validate
return self.get_backend(check_obj).validate(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/components.py:119: in validate
validate_column(check_obj, column_name)
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/components.py:89: in validate_column
error_handler.collect_error(err.reason_code, err)
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/error_handlers.py:38: in collect_error
raise schema_error from original_exc
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/components.py:68: in validate_column
validated_check_obj = super(ColumnBackend, self).validate(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/array.py:69: in validate
error_handler = self.run_checks_and_handle_errors(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/backends/pandas/array.py:150: in run_checks_and_handle_errors
error_handler.collect_error(
../../.cache/pypoetry/virtualenvs/bqss-p5cbdOMl-py3.11/lib/python3.11/site-packages/pandera/error_handlers.py:38: in collect_error
raise schema_error from original_exc
E pandera.errors.SchemaError: non-nullable series 'date_mise_en_oeuvre' contains null values:
E 601 <NA>
E 602 <NA>
E 603 <NA>
E 604 <NA>
E 1827 <NA>
E ...
E 378009 <NA>
E 428189 <NA>
E 428190 <NA>
E 428191 <NA>
E 428192 <NA>
E Name: date_mise_en_oeuvre, Length: 659, dtype: string
I noticed this while working on #225 (closed).
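To check whether the nulls come only from the newest export, one option is to count missing date_mise_en_oeuvre values per export year. The sketch below is a minimal illustration, not the project's code. The small aut_df is a made-up stand-in for the real frame, and it assumes date_export is a string starting with a four-digit year, which the `>= "2015"` filter in the test suggests but the traceback does not confirm.

```python
import pandas as pd

# Hypothetical sample standing in for aut_df. Only the two column names
# come from the traceback; the values are invented for illustration.
aut_df = pd.DataFrame(
    {
        "date_export": ["2014-12-31", "2024-12-31", "2025-12-31", "2025-12-31"],
        "date_mise_en_oeuvre": ["2010-01-01", "2012-05-03", None, "2020-02-01"],
    },
    dtype="string",
)

# Same filter as the failing test: keep exports from 2015 onwards.
recent = aut_df[aut_df["date_export"] >= "2015"]

# Count missing implementation dates per export year
# (assumes the first four characters of date_export are the year).
null_counts = (
    recent["date_mise_en_oeuvre"]
    .isna()
    .groupby(recent["date_export"].str[:4])
    .sum()
)
print(null_counts)
```

If the counts turn out to be non-zero only for the 2025 export, the fix is probably either to allow nulls on that column in the schema or to clean the values upstream. Which one is right depends on whether a missing implementation date is meaningful in the FINESS data.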