Google Scholar Data

README

This study attempted to locate engineering dissertation citations in Google Scholar, Compendex, and Scopus.
The researchers are Carmen Cole, Angela R Davis, Vanessa Eyer, and John J Meier.

The researchers used Proquest Digital Dissertations as the source, searching only for 2016 doctoral dissertations from the U.S. and limiting results to records with full text reference lists.
Subject headings for 9 engineering disciplines were used to search. From the results, the URLs of all reference lists were saved.

Robert Olendorf created a script to extract all citations from the reference list webpage URLs.
The data were retrieved using an R script: https://github.com/olendorf/parsed_thesis_citations
The folder "data/web_pages/" contains all HTML pages downloaded from Proquest.
The folder "data/" contains a .csv file of all citations, each with the four-digit numeral (the year) extracted by the R script.
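
The year extraction step could look roughly like the following. This is a minimal Python sketch for illustration only, not the R script from the repository above. The function name `extract_year`, the accepted range (1800 up to a cutoff of 2016), and the choice to return the first matching numeral are all assumptions.

```python
import re

# Four-digit numerals that look like publication years (1800-2099).
# Assumption: the real R script may use a different pattern.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def extract_year(citation, latest=2016):
    """Return the first plausible four-digit year in a citation, or None."""
    for match in YEAR_PATTERN.finditer(citation):
        year = int(match.group(1))
        # Skip numerals later than the dissertation year (assumed cutoff).
        if year <= latest:
            return year
    return None
```

A pattern like this can pick up page ranges or report numbers that happen to look like years, so any value it returns would still need checking by hand.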

The file "GoogleScholarData.csv" contains 20 randomly sampled citations from each of the 9 subjects in each of 7 decades (20 x 9 x 7 = 1260 data records).
Each record in the file is a comma-delimited row containing the following fields:

Unique is a unique identifier for each record
Subject is the engineering field of the dissertation
Citation is the full text of a citation, in double quotes (""), from a randomly sampled dissertation in the subject
Year is the year extracted by a script from the record (not error checked)
Format is the type of reference (Book, Conference, Journal, Other) assigned by the researchers
Google Scholar contains an F if the citation was found in Google Scholar, C for a partial record, and N for not found
Compendex contains an F if the citation was found in Compendex, C for a partial record, and N for not found
Scopus contains an F if the citation was found in Scopus, C for a partial record, and N for not found
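
The field list above can be used to tally results per database. The following is a minimal Python sketch, assuming the file has a header row whose column names match the field names listed above ("Google Scholar", "Compendex", "Scopus") and is UTF-8 encoded. Neither assumption has been checked against the actual file, and the function names `tally_found` and `tally_file` are hypothetical.

```python
import csv
from collections import Counter

# Assumed column names, taken from the field list; adjust to the real header.
DATABASES = ("Google Scholar", "Compendex", "Scopus")


def tally_found(rows, databases=DATABASES):
    """Count F / C / N codes per database from an iterable of row dicts."""
    counts = {db: Counter() for db in databases}
    for row in rows:
        for db in databases:
            # Normalise stray whitespace and case; missing values count as "".
            code = (row.get(db) or "").strip().upper()
            counts[db][code] += 1
    return counts


def tally_file(path):
    """Read the CSV at `path` (header row assumed) and tally each database."""
    with open(path, newline="", encoding="utf-8") as handle:
        return tally_found(csv.DictReader(handle))
```

For example, `tally_file("GoogleScholarData.csv")["Scopus"]["F"]` would give the number of sampled citations marked as found in Scopus, provided the header matches the assumed names.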
