WARC file for Accounting Office - University at Albany-SUNY, 2017 May 17
- Acquisition information:
- crawl: 301465
Crawl Rules
Ignore Robots.txt for www.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for www.alumni.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for www.ualbanysports.com (last updated 2016-02-11)
Ignore Robots.txt for library.albany.edu (last updated 2017-05-19)
Ignore Robots.txt for alumni.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for asrc.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for atmos.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for bioinformatics.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for cela.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for choose.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for cs.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for csda.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for imls.ctg.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for www.ctg.albany.edu (last updated 2017-05-19)
Ignore Robots.txt for cwig.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for events.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for hr.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for ibl.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for illiad.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for liblogs.albany.edu (last updated 2017-05-19)
Ignore Robots.txt for libguides.library.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for scholarsarchive.library.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for listserv.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for m.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for math.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for omega.math.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for mumford.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for nyjm.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for pdp.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for resnet.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for rit.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for cyberphysics.rit.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for rna.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for slsc.albany.edu (last updated 2017-05-19)
Ignore Robots.txt for uaems.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for uapps.albany.edu (last updated 2016-02-11)
Ignore Robots.txt for wiki.albany.edu (last updated 2016-02-11)
Block host dev.library.albany.edu
Host Rule Type C CONTAINS meg.library.albany.edu:8080/archive/search? (last updated 2017-05-11)
Crawl Times
start_date: 2017-05-17T16:55:17Z
original_start_date: 2017-05-17T16:55:17Z
last_resumption: None
processing_end_date: 2017-05-23T00:29:43Z
end_date: 2017-05-22T21:46:25Z
elapsed_ms: 449440076
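The crawl times above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the crawl report: it assumes the timestamps are UTC (the trailing `Z`) and that `elapsed_ms` is a duration in milliseconds. The helper name `parse_utc` is hypothetical.

```python
from datetime import datetime, timezone

def parse_utc(stamp):
    """Parse a timestamp such as 2017-05-17T16:55:17Z as an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# Values copied from the Crawl Times fields.
start = parse_utc("2017-05-17T16:55:17Z")
end = parse_utc("2017-05-22T21:46:25Z")
processing_end = parse_utc("2017-05-23T00:29:43Z")
elapsed_ms = 449440076

# Wall-clock duration between start_date and end_date, in seconds.
wall_clock_s = (end - start).total_seconds()

# Difference between the wall-clock duration and the reported elapsed time.
gap_s = wall_clock_s - elapsed_ms / 1000

# Time between the end of crawling and the end of post-crawl processing.
processing_lag_s = (processing_end - end).total_seconds()

print(wall_clock_s, round(gap_s, 3), processing_lag_s)
```

Under these assumptions the crawl ran for a little over five days, the reported `elapsed_ms` is within half a minute of the wall-clock duration, and processing finished under three hours after the crawl ended.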
Crawl Types
type: MONTHLY
recurrence_type: MONTHLY
pdfs_only: False
test: False
Crawl Limits
time_limit: 432000
document_limit: None
byte_limit: None
crawl_stop_requested: None
Crawl Results
status: FINISHED_TIME_LIMIT
discovered_count: 2923335
novel_count: 388249
duplicate_count: 1238165
resumption_count: 0
queued_count: 1296921
downloaded_count: 1626414
download_failures: 336
warc_revisit_count: 1238113
warc_url_count: 1626286
total_data_in_kbs: 265327774
duplicate_bytes: 251274786133
warc_compressed_bytes: 3306915146
Crawl Technical Details
doc_rate: 3.62
kb_rate: 590.0
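The reported counts and rates can be checked against one another. The sketch below is illustrative only and rests on assumptions the report does not state: that `time_limit` is in seconds, that `doc_rate` and `kb_rate` are per-second averages over `elapsed_ms`, and that downloaded documents are either novel or duplicate. These relationships hold for the figures in this record, but they are not documented definitions.

```python
# Values copied from the crawl report fields above.
crawl = {
    "discovered_count": 2923335,
    "novel_count": 388249,
    "duplicate_count": 1238165,
    "queued_count": 1296921,
    "downloaded_count": 1626414,
    "total_data_in_kbs": 265327774,
    "elapsed_ms": 449440076,
    "time_limit": 432000,  # assumed to be seconds (5 days)
}

elapsed_s = crawl["elapsed_ms"] / 1000

# Assumed identities between the counts.
downloads_add_up = (
    crawl["novel_count"] + crawl["duplicate_count"] == crawl["downloaded_count"]
)
discovered_add_up = (
    crawl["downloaded_count"] + crawl["queued_count"] == crawl["discovered_count"]
)

# Average rates over the whole crawl, for comparison with doc_rate and kb_rate.
doc_rate = crawl["downloaded_count"] / elapsed_s
kb_rate = crawl["total_data_in_kbs"] / elapsed_s

# Share of downloaded documents that were new since the previous capture.
novel_share = crawl["novel_count"] / crawl["downloaded_count"]

# True when the crawl ran at least as long as its time limit.
hit_time_limit = elapsed_s >= crawl["time_limit"]

print(downloads_add_up, discovered_add_up)
print(round(doc_rate, 2), round(kb_rate), round(novel_share, 3), hit_time_limit)
```

Read this way, the figures are consistent with the FINISHED_TIME_LIMIT status: the crawl stopped on its time limit with documents still queued, and roughly a quarter of what it downloaded was new content.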
- Physical / technical requirements:
- Researchers interested in data analysis with web archives may request a WARC file. WARC files are very large and difficult to work with. Your request may take time to process, and we may be unable to deliver your request remotely. Please consult an archivist if you are interested in advanced research with web archives.
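For researchers who do obtain a WARC file, the following is a minimal sketch of a first pass over it, using only the Python standard library. It assumes a WARC 1.0-style layout (a `WARC/` version line, `Name: value` header fields, a blank line, then `Content-Length` bytes of content) and does not handle folded header lines. The function names `iter_warc_headers` and `count_record_types` are hypothetical. In practice a dedicated WARC library is likely a better choice.

```python
import gzip
import io

def iter_warc_headers(stream):
    """Yield a dict of header fields for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the header section
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()
        # Read past the record content so the next loop starts at the next record.
        stream.read(int(headers.get("Content-Length", 0)))
        yield headers

def count_record_types(path):
    """Count records by WARC-Type in a gzip-compressed WARC file at `path`."""
    counts = {}
    with gzip.open(path, "rb") as stream:
        for headers in iter_warc_headers(stream):
            record_type = headers.get("WARC-Type", "unknown")
            counts[record_type] = counts.get(record_type, 0) + 1
    return counts

# Tiny in-memory example with two made-up records.
sample = (
    b"WARC/1.0\r\nWARC-Type: response\r\nContent-Length: 5\r\n\r\nhello\r\n\r\n"
    b"WARC/1.0\r\nWARC-Type: revisit\r\nContent-Length: 0\r\n\r\n\r\n\r\n"
)
types = [h["WARC-Type"] for h in iter_warc_headers(io.BytesIO(sample))]
print(types)
```

Counting `revisit` records in this way is one route to comparing a delivered file with the warc_revisit_count in the crawl report, since revisit records are how the WARC format notes content unchanged from an earlier capture.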
Using these materials
- Access:
- The archives are open to the public and anyone is welcome to visit and view the collections.
- Collection restrictions:
- Access to this collection is unrestricted.
- Collection terms of access:
- This page may contain links to digital objects. Access to these images and the technical capacity to download them does not imply permission for re-use. Digital objects may be used freely for personal reference use, referred to, or linked to from other web sites. Researchers do not have permission to publish or disseminate material from these collections without permission from an archivist and/or the copyright holder. The researcher assumes full responsibility for conforming to the laws of copyright. Some materials in these collections may be protected by the U.S. Copyright Law (Title 17, U.S.C.) and/or by the copyright or neighboring-rights laws of other nations. More information about U.S. Copyright is provided by the Copyright Office. Additionally, re-use may be restricted by terms of University Libraries gift or purchase agreements, donor restrictions, privacy and publicity rights, licensing and trademarks. The M.E. Grenander Department of Special Collections and Archives is eager to hear from any copyright owners who are not properly identified so that appropriate information may be provided in the future.