WARC file for Parks & Trails New York Homepage, 2017 March 10
- Acquisition information:
- crawl: 271770
Crawl Rules
Limit host facebook.com to 100 documents
Limit host twimg.com to 100 documents
Limit host twitter.com to 100 documents
Crawl Times
start_date: 2017-03-07T20:17:57Z
original_start_date: 2017-03-07T20:17:57Z
last_resumption: None
processing_end_date: 2017-03-10T20:36:52Z
end_date: 2017-03-10T20:24:57Z
elapsed_ms: 259207592
Crawl Types
type: MONTHLY
recurrence_type: MONTHLY
pdfs_only: False
test: False
Crawl Limits
time_limit: 259200
document_limit: None
byte_limit: None
crawl_stop_requested: None
Crawl Results
status: FINISHED_TIME_LIMIT
discovered_count: 20222
novel_count: 3653
duplicate_count: 16100
resumption_count: 0
queued_count: 469
downloaded_count: 19753
download_failures: 104
warc_revisit_count: 16100
warc_url_count: 19742
total_data_in_kbs: 941383
duplicate_bytes: 726469721
warc_compressed_bytes: 21355462
Crawl Technical Details
doc_rate: 0.08
kb_rate: 3.0
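The counts in the crawl report above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it copies a subset of the listed values, parses them as `key: value` pairs, and tests how they relate. The relationships it checks (novel plus duplicate equals downloaded, downloaded plus queued equals discovered) are observations about these particular numbers, not documented definitions of the fields, and reading `time_limit` as seconds is an assumption. The helper name `parse_report` is hypothetical.

```python
# Selected fields copied from the crawl report in this record.
report_text = """\
discovered_count: 20222
novel_count: 3653
duplicate_count: 16100
queued_count: 469
downloaded_count: 19753
time_limit: 259200
elapsed_ms: 259207592
"""


def parse_report(text):
    """Parse 'key: value' lines into a dict of integers (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            fields[key.strip()] = int(value.strip())
    return fields


report = parse_report(report_text)

# Observation: novel and duplicate documents together account for all downloads.
downloaded_matches = (
    report["novel_count"] + report["duplicate_count"] == report["downloaded_count"]
)

# Observation: downloaded plus still-queued documents equals the discovered total.
discovered_matches = (
    report["downloaded_count"] + report["queued_count"] == report["discovered_count"]
)

# Assumption: time_limit is in seconds (259200 s is exactly 3 days).
elapsed_seconds = report["elapsed_ms"] / 1000
hit_time_limit = elapsed_seconds >= report["time_limit"]

print("novel + duplicate == downloaded:", downloaded_matches)
print("downloaded + queued == discovered:", discovered_matches)
print("elapsed time reached the time limit:", hit_time_limit)
```

Under these assumptions, all three checks hold for the listed values, which is consistent with the reported status of FINISHED_TIME_LIMIT and with 469 documents remaining in the queue when the crawl stopped.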
- Physical / technical requirements:
- Researchers interested in data analysis with web archives may request a WARC file. WARC files are very large and difficult to work with. Your request may take time to process, and we may be unable to deliver your request remotely. Please consult an archivist if you are interested in advanced research with web archives.
Using these materials
- Access:
- The archives are open to the public and anyone is welcome to visit and view the collections.
- Collection restrictions:
- Access to this collection is restricted because it is unprocessed. Portions of the collection may contain recent administrative records and/or personally identifiable information. Please contact an archivist for more information.
- Collection terms of access:
- The University Archives are eager to hear from any copyright owners who are not properly identified so that appropriate information may be provided in the future.