WARC file for New York Civil Liberties Union, 2017 February 28

Acquisition information:

crawl: 269970

Crawl Rules

Limit host twitter.com to 1000 documents

Limit host upload.wikimedia.org to 1000 documents

Limit host en.wikipedia.org to 1000 documents

Limit host facebook.com to 500 documents

Ignore Robots.txt for nyclu.org (last updated 2016-08-25)

Crawl Times

start_date: 2017-02-28T19:46:11Z

original_start_date: 2017-02-28T19:46:11Z

last_resumption: None

processing_end_date: 2017-03-03T20:01:34Z

end_date: 2017-03-03T19:47:27Z

elapsed_ms: 259268734
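
The crawl times above can be cross-checked against one another. The following is a minimal Python sketch, not part of the original record. It assumes the timestamps are ISO 8601 values in UTC, that elapsed_ms is in milliseconds, and that the time_limit value listed under Crawl Limits is in seconds; the record states none of these units explicitly. The helper name parse_utc is hypothetical.

```python
from datetime import datetime, timezone

def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as 2017-02-28T19:46:11Z as an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# Values copied from the Crawl Times fields.
start = parse_utc("2017-02-28T19:46:11Z")
end = parse_utc("2017-03-03T19:47:27Z")
processing_end = parse_utc("2017-03-03T20:01:34Z")

wall_clock_seconds = (end - start).total_seconds()
elapsed_seconds = 259268734 / 1000   # assumption: elapsed_ms is milliseconds
time_limit_seconds = 259200          # assumption: time_limit is seconds (72 hours)

print(wall_clock_seconds)                          # 259276.0
print(elapsed_seconds > time_limit_seconds)        # True
print((processing_end - end).total_seconds())      # 847.0
```

Under those assumptions the crawl ran for slightly more than 72 hours, which would be consistent with the FINISHED_TIME_LIMIT status, and post-crawl processing ended about 14 minutes after the crawl itself.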

Crawl Types

type: MONTHLY

recurrence_type: MONTHLY

pdfs_only: False

test: False

Crawl Limits

time_limit: 259200

document_limit: None

byte_limit: None

crawl_stop_requested: None

Crawl Results

status: FINISHED_TIME_LIMIT

discovered_count: 166232

novel_count: 30070

duplicate_count: 22551

resumption_count: 0

queued_count: 113611

downloaded_count: 52621

download_failures: 11

warc_revisit_count: 22524

warc_url_count: 52603

total_data_in_kbs: 3220897

duplicate_bytes: 2003277277

warc_compressed_bytes: 495088912
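
The counts above appear to be related by simple sums. The following Python sketch checks two such relationships. Both are observations about this particular record, not documented rules of the crawler, and the variable names are illustrative only.

```python
# Values copied from the Crawl Results fields above.
crawl = {
    "discovered_count": 166232,
    "novel_count": 30070,
    "duplicate_count": 22551,
    "queued_count": 113611,
    "downloaded_count": 52621,
}

# Assumption: every downloaded document is either novel or a duplicate.
downloaded_matches = (
    crawl["novel_count"] + crawl["duplicate_count"] == crawl["downloaded_count"]
)

# Assumption: every discovered URL was either downloaded or still queued at the end.
discovered_matches = (
    crawl["queued_count"] + crawl["downloaded_count"] == crawl["discovered_count"]
)

# Share of downloaded documents that were new in this crawl.
novel_share = crawl["novel_count"] / crawl["downloaded_count"]

print(downloaded_matches, discovered_matches)  # True True
print(f"{novel_share:.1%} of downloaded documents were novel")
```

If the second relationship is read correctly, roughly two thirds of the discovered URLs were still queued when the crawl stopped, so this capture should not be assumed to be complete.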

Crawl Technical Details

doc_rate: 0.2

kb_rate: 12.0
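
The record does not say how these rates are calculated. One plausible reading, sketched below in Python, is documents per second and kilobytes per second over the elapsed crawl time. This is an assumption, and the kb_rate match holds only if the published figure was rounded or truncated to a whole number.

```python
# Values copied from the record; units are assumed, not stated in the source.
downloaded_count = 52621
total_data_in_kbs = 3220897
elapsed_seconds = 259268734 / 1000  # assumption: elapsed_ms is milliseconds

docs_per_second = downloaded_count / elapsed_seconds
kbs_per_second = total_data_in_kbs / elapsed_seconds

print(round(docs_per_second, 1))  # 0.2, matching doc_rate
print(round(kbs_per_second, 2))   # about 12.42, against a reported kb_rate of 12.0
```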

Physical / technical requirements:
Researchers interested in data analysis with web archives may request a WARC file. WARC files are very large and difficult to work with. Your request may take time to process, and we may be unable to deliver the file remotely. Please consult an archivist if you are interested in advanced research with web archives.

Using these materials

Access:
The archives are open to the public and anyone is welcome to visit and view the collections.
Collection restrictions:
Access to this collection is restricted because it is unprocessed. Portions of the collection may contain recent administrative records and/or personally identifiable information. Please contact an archivist for more information. Certain restrictions may apply.
Collection terms of access:
The University Archives are eager to hear from any copyright owners who are not properly identified so that appropriate information may be provided in the future.

Access options

Ask an Archivist

Ask a question or schedule an individualized meeting to discuss archival materials and potential research needs.

Schedule a Visit

Archival materials can be viewed in person in our reading room. We recommend making an appointment to ensure materials are available when you arrive.

Make a Remote Request

We may also be able to deliver digital scans remotely for a fee.