Open
Labels
enhancement (New feature or request)
Description
Downloading from Google Drive URLs sometimes fails with NonMatchingChecksumError: Artifact https://drive.google.com/... has wrong checksum.
Explanation: Drive sometimes rejects the download attempt, so the rejection page is downloaded instead of the data. This can happen:
- If the user is based in China (a VPN may be needed).
- If there are too many downloads of the same file.
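To confirm that a failed download is really a rejection page and not corrupted data, one rough check is to look at the first bytes of the downloaded file. The sketch below is only an illustration: the function name `looks_like_html_page` and the marker list are assumptions, not part of TFDS, and it assumes the rejection page is a plain HTML document.

```python
# Hypothetical helper (not a TFDS API): guess whether a downloaded file
# is an HTML page rather than the expected binary archive.
HTML_MARKERS = (b"<!doctype html", b"<html")


def looks_like_html_page(path, probe_bytes=512):
    """Return True if the first bytes of the file look like an HTML document."""
    with open(path, "rb") as f:
        # Ignore leading whitespace and compare case-insensitively.
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(HTML_MARKERS)
```

If this returns True for the file in the download directory, the checksum mismatch most likely comes from Drive rejecting the request, not from a change in the dataset itself.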
The best solution currently is to download the data manually (https://www.tensorflow.org/datasets/overview#manual_download_if_download_fails) rather than relying on the automated download that Drive rejected.
Otherwise:
- Try the download again later.
- Try on a different computer
- Rather than downloading the file in each colab connection, load the dataset from a GCS bucket. See instructions.
It is not clear that this can be solved on the Google Drive side without weakening its abuse prevention.
On TFDS side, we could make the error message more explicit when we detect a drive URL.
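A minimal sketch of what such a message could look like, assuming detection is done by hostname. The names `is_drive_url` and `checksum_error_message` and the exact wording are hypothetical and do not reflect the actual TFDS implementation.

```python
from urllib.parse import urlparse

# Assumption: Drive downloads are served from this host.
DRIVE_HOSTS = {"drive.google.com"}


def is_drive_url(url):
    """Return True if the URL's hostname is a known Google Drive host."""
    host = urlparse(url).hostname or ""
    return host in DRIVE_HOSTS


def checksum_error_message(url):
    """Build the checksum error text, with extra guidance for Drive URLs."""
    msg = f"Artifact {url} has wrong checksum."
    if is_drive_url(url):
        msg += (
            " This URL points to Google Drive, which sometimes rejects"
            " downloads (for example when a file has been downloaded too"
            " many times). Consider downloading the file manually or"
            " retrying later."
        )
    return msg
```

Matching on the parsed hostname, not on a substring of the URL, avoids false positives when "drive.google.com" only appears in the path or query string.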