A large-scale exploration of terms of service documents on the web

Terms of service documents are a common feature of organizations' websites. Although there is no blanket requirement for organizations to provide these documents, their provision often serves essential legal purposes. Users of a website are expected to agree to the contents of a terms of service document, but users tend to ignore these documents because they are often lengthy and difficult to comprehend. As a step towards understanding the landscape of these documents at a large scale, we present a first-of-its-kind terms of service corpus containing 247,212 English language terms of service documents obtained from company websites sampled from the Free Company Dataset. We examine the URLs and contents of the documents and find that some websites that purport to post terms of service do not actually provide them. We analyze the reasons for unavailability and determine the overall availability of terms of service in a given set of website domains. We also find that some websites provide a single agreement that combines terms of service with a privacy policy, which is often an obligatory separate document. Using topic modeling, we analyze the themes in these combined documents by comparing them with the themes found in separate terms of service and privacy policies. Results suggest that such single-page agreements miss some of the most prevalent topics found in typical privacy policies and terms of service documents, and that many disproportionately cover privacy policy topics as compared to terms of service topics.

© Soundarya Nurani Sundareswara, Mukund Srinath, Shomir Wilson, and C. Lee Giles | ACM (2021). This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in the Proceedings of the 21st ACM Symposium on Document Engineering, https://doi.org/10.1145/3469096.3474940.



Work Title A large-scale exploration of terms of service documents on the web
Open Access
  1. Soundarya Nurani Sundareswara
  2. Mukund Srinath
  3. Shomir Wilson
  4. C. Lee Giles
License In Copyright (Rights Reserved)
Work Type Article
  1. Proceedings of the 21st ACM Symposium on Document Engineering
Publication Date August 16, 2021
Publisher Identifier (DOI)
  1. https://doi.org/10.1145/3469096.3474940
Deposited October 05, 2022




